Meta says it's building the 'fastest AI supercomputer in the world'
Meta has slated mid-2022 for the completion of what it is hyping up as the 'fastest AI supercomputer in the world'.
Meta's AI Research SuperCluster (RSC), which is currently operational but still in phase 2 of its development, will be used to develop advanced AI in areas such as computer vision, natural language processing and speech recognition. This could open up new possibilities for technologies such as video processing and real-time voice translation between different languages for large groups of people.
According to a blog post by technical program manager Kevin Lee and software engineer Shubho Sengupta, the RSC will be instrumental in creating Meta's vision of the metaverse, which the company describes as a form of connectivity and social presence that would be impossible in the physical world.
The RSC includes a flash storage array from Pure Storage, as well as 760 NVIDIA DGX A100 systems linked with NVIDIA Quantum 200 Gb/s InfiniBand fabric. Together, these systems can deliver up to 1,896 petaflops of TensorFloat-32 (TF32) performance for AI training.
NVIDIA states, "Early benchmarks on RSC, compared with Meta's legacy production and research infrastructure, have shown that it runs computer vision workflows up to 20 times faster, runs the NVIDIA Collective Communication Library (NCCL) more than nine times faster, and trains large-scale NLP models three times faster. That means a model with tens of billions of parameters can finish training in three weeks, compared with nine weeks before."
Pure Storage CTO Rob Lee says that in order to power the metaverse, technologies need to be able to conduct instant data analysis.
"Meta's RSC is a breakthrough in supercomputing that will lead to new technologies and customer experiences enabled by AI. We are thrilled to be a part of this project and look forward to seeing the progress Meta's AI researchers will make."
Meta also stresses that the RSC has been designed for privacy and security from the ground up. The RSC is isolated from the wider internet and connects only to Meta's own production data centers. All data is also encrypted.
"Before data is imported to RSC, it must go through a privacy review process to confirm it has been correctly anonymised. The data is then encrypted before it can be used to train AI models and decryption keys are deleted regularly to ensure older data is not still accessible. And since the data is only decrypted at one endpoint, in memory, it is safeguarded even in the unlikely event of a physical breach of the facility," the company states.
The company aims to increase the number of GPUs from 6,080 to 16,000 before the RSC's completion.
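The hardware figures quoted above can be cross-checked with some quick arithmetic. The short Python sketch below does so under one assumption that is not stated in the article: that each DGX A100 system houses eight A100 GPUs, which is NVIDIA's standard configuration for that system. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the RSC figures quoted in this article.
DGX_SYSTEMS = 760          # DGX A100 systems in the current phase
GPUS_PER_DGX_A100 = 8      # assumption: standard DGX A100 configuration
TF32_PETAFLOPS = 1_896     # quoted TensorFloat-32 performance
TARGET_GPUS = 16_000       # planned GPU count at completion

# Total GPUs implied by the number of DGX systems.
current_gpus = DGX_SYSTEMS * GPUS_PER_DGX_A100

# Implied TF32 throughput per GPU, in teraflops (1 petaflop = 1,000 teraflops).
per_gpu_teraflops = TF32_PETAFLOPS * 1_000 / current_gpus

# How much larger the finished cluster would be, counted in GPUs.
scale_factor = TARGET_GPUS / current_gpus

# Training speedup implied by "three weeks, compared with nine weeks before".
training_speedup = 9 / 3

print(f"Current GPUs:        {current_gpus:,}")
print(f"TF32 per GPU:        {per_gpu_teraflops:.0f} teraflops")
print(f"Planned growth:      {scale_factor:.2f}x")
print(f"Training speedup:    {training_speedup:.0f}x")
```

Under that assumption, the 760 systems account for exactly the 6,080 GPUs cited, and the planned expansion amounts to roughly 2.6 times as many GPUs. The implied figure of about 312 teraflops per GPU is close to the peak TF32 rate NVIDIA lists for the A100 with structured sparsity, which suggests the 1,896 petaflops figure is a theoretical peak rather than a measured benchmark.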
Lee and Sengupta conclude, "Our long-term investments in self-supervised learning and in building next-generation AI infrastructure with RSC are helping us create the foundational technologies that will power the metaverse and advance the broader AI community as well."