
- Nvidia's Rubin DGX SuperPOD delivers 28.8 exaflops of FP4 compute with only 576 GPUs
- Each NVL72 system combines 36 Vera CPUs, 72 Rubin GPUs, and 18 DPUs
- Aggregate NVLink throughput reaches 260TB/s per DGX rack
At CES 2026, Nvidia unveiled its next-generation DGX SuperPOD powered by the Rubin platform, a system designed to deliver extreme AI compute in dense, integrated racks.
According to the company, the SuperPOD integrates multiple Vera Rubin NVL72 or NVL8 systems into a single coherent AI engine, supporting large-scale workloads with minimal infrastructure complexity.
With liquid-cooled modules, high-speed interconnects, and unified memory, the system targets institutions seeking maximum AI throughput and reduced latency.
Rubin-based compute architecture
Each DGX Vera Rubin NVL72 system includes 36 Vera CPUs, 72 Rubin GPUs, and 18 BlueField-4 DPUs, delivering a combined FP4 performance of 3.6 exaflops per system, or 50 petaflops per GPU.
Aggregate NVLink throughput reaches 260TB/s per rack, allowing the full memory and compute space to operate as a single coherent AI engine.
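For a rough sense of scale, dividing that aggregate figure evenly across the rack implies several terabytes per second of NVLink bandwidth per GPU. Here is a minimal back-of-the-envelope sketch in Python, assuming the 260TB/s is shared uniformly across all 72 GPUs, which simplifies the actual NVLink switch topology:

```python
# Back-of-the-envelope NVLink bandwidth per GPU in a Rubin NVL72 rack.
# Assumes the quoted 260 TB/s aggregate is split evenly across 72 GPUs,
# which simplifies the real NVLink switch topology.

AGGREGATE_NVLINK_TBPS = 260   # TB/s per rack (quoted figure)
GPUS_PER_RACK = 72

per_gpu_tbps = AGGREGATE_NVLINK_TBPS / GPUS_PER_RACK
print(f"~{per_gpu_tbps:.1f} TB/s of NVLink bandwidth per GPU")
# ~3.6 TB/s of NVLink bandwidth per GPU
```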
The Rubin GPU incorporates a third-generation Transformer Engine and hardware-accelerated compression, allowing inference and training workloads to run efficiently at scale.
Connectivity is reinforced by Spectrum-6 Ethernet switches, Quantum-X800 InfiniBand, and ConnectX-9 SuperNICs, which support deterministic, high-speed AI data transfer.
Nvidia’s SuperPOD design emphasizes end-to-end networking performance, ensuring minimal congestion in large AI clusters.
Quantum-X800 InfiniBand delivers low latency and high throughput, while Spectrum-X Ethernet handles east-west AI traffic efficiently.
Each DGX rack incorporates 600TB of fast memory, NVMe storage, and integrated AI context memory to support both training and inference pipelines.
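Dividing the quoted rack-level memory across the GPUs gives a rough per-accelerator figure. A small sketch, assuming the 600TB is pooled evenly and ignoring the CPU/DPU split, which this announcement does not break down:

```python
# Rough fast-memory budget per GPU in a DGX Rubin rack.
# Assumes the quoted 600 TB is pooled evenly across 72 GPUs; the actual
# split between memory and storage tiers is not broken down here.

FAST_MEMORY_TB = 600
GPUS_PER_RACK = 72

print(f"~{FAST_MEMORY_TB / GPUS_PER_RACK:.1f} TB of fast memory per GPU")
# ~8.3 TB of fast memory per GPU
```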
The Rubin platform also integrates advanced software orchestration through Nvidia Mission Control, streamlining cluster operations, automated recovery, and infrastructure management for large AI factories.
A DGX SuperPOD with 576 Rubin GPUs can reach 28.8 exaflops of FP4 compute, while individual NVL8 systems deliver 5.5x the FP4 throughput of the previous Blackwell generation.
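The headline number follows directly from the rack-level figures. A sketch of the scaling arithmetic, assuming 50 petaflops of FP4 per Rubin GPU as above:

```python
# How 576 Rubin GPUs reach 28.8 exaflops of FP4 compute.
# Assumes 50 PFLOPS FP4 per GPU, consistent with the per-system figure above.

FP4_PFLOPS_PER_GPU = 50
GPUS_PER_NVL72_RACK = 72
TOTAL_GPUS = 576

racks = TOTAL_GPUS // GPUS_PER_NVL72_RACK              # 8 NVL72 racks
total_exaflops = TOTAL_GPUS * FP4_PFLOPS_PER_GPU / 1000

print(f"{racks} racks, {total_exaflops:.1f} exaflops FP4")
# 8 racks, 28.8 exaflops FP4
```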
By comparison, Huawei’s Atlas 950 SuperPod claims 16 exaflops of FP4 per SuperPod, meaning Nvidia extracts more compute per accelerator and needs far fewer units to reach extreme compute levels.
Rubin-based DGX clusters also use fewer nodes and cabinets than Huawei’s SuperCluster, which scales into thousands of NPUs and multiple petabytes of memory.
This performance density allows Nvidia to compete directly with Huawei’s projected compute output while limiting space, power, and interconnect overhead.
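To make the density argument concrete, here is a hypothetical per-accelerator comparison. The Huawei NPU count is an assumption based on publicly reported Atlas 950 SuperPod configurations, not a figure from this announcement:

```python
# Hypothetical compute-density comparison, FP4 per accelerator.
# Nvidia figures are from this announcement; the Huawei NPU count of
# 8,192 is an assumption taken from public Atlas 950 reporting.

nvidia_exaflops, nvidia_gpus = 28.8, 576
huawei_exaflops, huawei_npus = 16.0, 8192  # NPU count: assumption

nvidia_pf_per_gpu = nvidia_exaflops * 1000 / nvidia_gpus
huawei_pf_per_npu = huawei_exaflops * 1000 / huawei_npus

print(f"Nvidia: ~{nvidia_pf_per_gpu:.0f} PFLOPS FP4 per GPU")   # ~50
print(f"Huawei: ~{huawei_pf_per_npu:.1f} PFLOPS FP4 per NPU")   # ~2.0
```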
The Rubin platform unifies AI compute, networking, and software into a single stack.
Nvidia AI Enterprise software, NIM microservices, and mission-critical orchestration create a cohesive environment for long-context reasoning, agentic AI, and multimodal model deployment.
While Huawei scales primarily through hardware count, Nvidia emphasizes rack-level efficiency and tightly integrated software controls, which may reduce operational costs for industrial-scale AI workloads.