Oracle on Wednesday introduced new types of clusters set to be available for AI training through Oracle Cloud Infrastructure (OCI). The most powerful cluster will be based on Nvidia’s upcoming Blackwell GPUs and will offer up to 2.4 zettaFLOPS of AI performance, making it even more powerful than Elon Musk’s recently announced AI clusters.
Oracle’s new supercomputer clusters can be configured with Nvidia’s Hopper or Blackwell GPUs for AI and HPC, and with different networking gear, including ultra-low-latency RoCEv2 with ConnectX-7 NICs and ConnectX-8 SuperNICs or Nvidia’s Quantum-2 InfiniBand-based networks. Customers also get a choice of HPC storage, depending on performance needs.
OCI’s upcoming supercomputing clusters far exceed the capabilities of current leading systems. The range-topping B200-based OCI Superclusters feature over three times more GPUs than the Frontier supercomputer (which uses 37,888 AMD Instinct MI250X GPUs) and six times more than other hyperscalers, according to Oracle.
“We have one of the broadest AI infrastructure offerings and are supporting customers that are running some of the most demanding AI workloads in the cloud,” said Mahesh Thiagarajan, executive vice president, Oracle Cloud Infrastructure. “With Oracle’s distributed cloud, customers have the flexibility to deploy cloud and AI services wherever they choose while preserving the highest levels of data and AI sovereignty.”
Several companies are already benefiting from this advanced infrastructure. WideLabs and Zoom are leveraging OCI’s high-performance AI infrastructure to accelerate their AI development while maintaining sovereignty controls.
“As businesses, researchers and nations race to innovate using AI, access to powerful computing clusters and AI software is critical,” said Ian Buck, vice president of Hyperscale and High Performance Computing at Nvidia. “Nvidia’s full-stack AI computing platform on Oracle’s broadly distributed cloud will deliver AI compute capabilities at unprecedented scale to advance AI efforts globally and help organizations everywhere accelerate research, development and deployment.”
The upcoming OCI Superclusters will use Nvidia’s GB200 NVL72 liquid-cooled cabinets with 72 GPUs that communicate with each other at an aggregate bandwidth of 129.6 TB/s in a single NVLink domain. Oracle said that Nvidia’s Blackwell GPUs will be available in the first half of 2025 (as availability of Blackwell this year will be limited), though it is unclear when OCI will offer fully loaded Blackwell-powered clusters.
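For readers who want to sanity-check the figures quoted above, here is a minimal Python sketch of the arithmetic. It uses only numbers from the article. The function names are illustrative, and the per-GPU figure assumes the aggregate NVLink bandwidth is split evenly across the 72 GPUs in a cabinet, which the article does not state.

```python
# Figures quoted in the article.
NVL72_GPUS = 72                # GPUs in one GB200 NVL72 cabinet
NVL72_AGGREGATE_TBPS = 129.6   # aggregate NVLink bandwidth, TB/s
FRONTIER_GPUS = 37_888         # AMD Instinct MI250X GPUs in Frontier


def per_gpu_nvlink_tbps(aggregate_tbps: float, gpus: int) -> float:
    """Per-GPU share of NVLink bandwidth, assuming an even split (an assumption)."""
    return aggregate_tbps / gpus


def gpu_count_floor(baseline_gpus: int, multiple: int) -> int:
    """GPU count that a claim of 'over <multiple> times the baseline' must exceed."""
    return baseline_gpus * multiple


if __name__ == "__main__":
    share = per_gpu_nvlink_tbps(NVL72_AGGREGATE_TBPS, NVL72_GPUS)
    print(f"NVLink bandwidth per GPU: {share:.1f} TB/s")
    print(f"'Over 3x Frontier' means more than {gpu_count_floor(FRONTIER_GPUS, 3):,} GPUs")
```

Under these assumptions, each GPU's share comes to roughly 1.8 TB/s, and "over three times" Frontier's GPU count implies more than 113,664 GPUs in the largest configuration.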
Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.