NVIDIA Unveils Rubin: A Six-Chip Platform Promising 10x Cost Reduction for AI Inference
NVIDIA announces its most ambitious AI platform yet at CES 2026, integrating six new chips designed to dramatically reduce the cost of running AI models.
Overview
NVIDIA has announced its next-generation AI computing platform at CES 2026, introducing Rubin as the successor to the Blackwell architecture. The platform represents what NVIDIA calls its first “extreme-codesigned” system, integrating six purpose-built chips into a unified architecture optimized for the emerging demands of agentic AI and mixture-of-experts models.
What We Know
The Rubin platform comprises six distinct components working in concert:
- Rubin GPU: Delivers 50 petaflops of NVFP4 compute for inference workloads
- Vera CPU: Purpose-built for high-bandwidth data movement
- NVLink 6 Switch: Provides 3.6TB/s bandwidth per GPU, scaling to 260TB/s across a full NVL72 rack
- ConnectX-9 SuperNIC: Next-generation network interface
- BlueField-4 DPU: Data processing unit for infrastructure offload
- Spectrum-6 Ethernet Switch: Delivers what NVIDIA claims is 10x greater reliability and 5x better power efficiency
According to NVIDIA’s official announcement, the platform achieves a 10x reduction in inference token cost compared to Blackwell, while requiring 4x fewer GPUs to train mixture-of-experts models [1]. CEO Jensen Huang characterized the release by stating that “Rubin takes a giant leap toward the next frontier of AI.”
The flagship configuration, Vera Rubin NVL72, combines 72 Rubin GPUs with 36 Vera CPUs in a rack-scale system. NVIDIA also announced the HGX Rubin NVL8, a smaller 8-GPU configuration linked via NVLink for more modest deployments.
Major cloud providers have committed to deploying Rubin-based instances in the second half of 2026. According to NVIDIA, AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be among the first, alongside NVIDIA Cloud Partners including CoreWeave, Lambda, Nebius, and Nscale [1]. Server manufacturers Dell, HPE, Lenovo, Supermicro, and Cisco are also listed as partners.
Notably, leading AI research organizations including OpenAI, Anthropic, Meta, xAI, Mistral AI, and Cohere have been named as partners for the platform [2].
What We Don’t Know
NVIDIA has not disclosed pricing for the Rubin platform or its components. Given the significant performance claims, the cost structure will be a key factor in determining actual adoption timelines.
The company also provided limited details on several technical specifications, including power consumption, thermal requirements, and the precise architectural improvements over Blackwell that enable the claimed efficiency gains.
Whether the 10x cost reduction claim holds across different model architectures and workload types remains to be validated by independent benchmarks once hardware becomes available.
Broader Context
The announcement arrives as the AI industry continues its rapid buildout of inference infrastructure. Huang noted during the CES presentation that “approximately $10 trillion or so of the last decade of computing is now being modernized” through accelerated computing and AI [2].
Alongside Rubin, NVIDIA announced six domain-specific open models trained on its supercomputers: Clara for healthcare, Earth-2 for climate science, Nemotron for reasoning, Cosmos for robotics simulation, GR00T for embodied intelligence, and Alpamayo for autonomous driving. The company also revealed that the Mercedes-Benz CLA will become the first passenger vehicle to deploy Alpamayo-based autonomous driving capabilities [2].
Sources cited in this article are listed in the provenance record.