AI is everywhere at CES 2026, and Nvidia GPUs are at the center of the expanding AI universe. Today, during his CES keynote, CEO Jensen Huang shared his plans for how the company will remain at the forefront of the AI revolution as the technology reaches far beyond chatbots into robotics, autonomous vehicles, and the broader physical world.
First up, Huang officially launched Vera Rubin, Nvidia’s next-gen AI data center rack-scale architecture. Rubin is the result of what the company calls “extreme co-design” across six types of chips: the Vera CPU, the Rubin GPU, the NVLink 6 switch, the ConnectX-9 SuperNIC, the BlueField-4 data processing unit, and the Spectrum-6 Ethernet switch. Those building blocks all come together to create the Vera Rubin NVL72 rack.
Demand for AI compute is insatiable, and each Rubin GPU promises much more of it for this generation: 50 PFLOPS of inference performance with the NVFP4 data type, 5x that of Blackwell GB200, and 35 PFLOPS of NVFP4 training performance, 3.5x that of Blackwell. To feed those compute resources, each Rubin GPU package has eight stacks of HBM4 memory delivering 288GB of capacity and 22 TB/s of bandwidth.
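For context, those speedup ratios also imply the per-GPU baselines of the prior generation. A quick back-of-the-envelope check makes the numbers concrete; note that the Blackwell and per-stack figures below are derived from Nvidia's stated ratios and totals, not independently confirmed specs.

```python
# Back-of-the-envelope check on Nvidia's stated per-GPU figures.
# The Rubin numbers and speedup ratios come from the keynote; the
# Blackwell baselines are *derived* from those ratios.

rubin_inference_pflops = 50.0   # NVFP4 inference per Rubin GPU
rubin_training_pflops = 35.0    # NVFP4 training per Rubin GPU
inference_speedup = 5.0         # claimed vs. Blackwell GB200
training_speedup = 3.5          # claimed vs. Blackwell

implied_blackwell_inference = rubin_inference_pflops / inference_speedup
implied_blackwell_training = rubin_training_pflops / training_speedup
print(f"Implied Blackwell NVFP4 inference: {implied_blackwell_inference:.0f} PFLOPS")  # 10
print(f"Implied Blackwell NVFP4 training:  {implied_blackwell_training:.0f} PFLOPS")   # 10

# Memory: 288GB over eight HBM4 stacks at 22 TB/s aggregate
hbm4_stacks = 8
capacity_per_stack_gb = 288 / hbm4_stacks      # 36 GB per stack
bandwidth_per_stack_tbs = 22 / hbm4_stacks     # 2.75 TB/s per stack
print(f"Per HBM4 stack: {capacity_per_stack_gb:.0f} GB, {bandwidth_per_stack_tbs:.2f} TB/s")
```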
Per-GPU compute is just one building block in the AI data center. Leading large language models have shifted from dense architectures, which activate every parameter to produce a given output token, to mixture-of-experts (MoE) architectures, which activate only a portion of the available parameters per token. That shift makes it possible to scale up models relatively efficiently, but communication among the experts within a model requires vast amounts of inter-node bandwidth.
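To make the MoE dynamic concrete, here is a minimal sketch of top-k expert routing for a single token. It is an illustrative toy under simplifying assumptions (a linear gating network, dense NumPy experts, invented names and shapes), not Nvidia's or any production model's implementation.

```python
import numpy as np

def topk_moe_layer(x, gate_w, experts, k=2):
    """Minimal top-k mixture-of-experts routing for a single token.

    x       : (d,) token hidden state
    gate_w  : (d, n_experts) gating weights
    experts : list of n_experts callables, each mapping (d,) -> (d,)

    Only k of the n_experts run per token, so the active parameter
    count is roughly k/n_experts of the model's total.
    """
    logits = x @ gate_w                 # score every expert
    top = np.argsort(logits)[-k:]       # pick the k highest-scoring
    weights = np.exp(logits[top])
    weights /= weights.sum()            # softmax over the chosen k
    # In a multi-node deployment, each selected expert may live on a
    # different GPU or node; dispatching tokens to experts and gathering
    # results is the all-to-all traffic that stresses the interconnect.
    return sum(w * experts[e](x) for w, e in zip(weights, top))

# Toy usage: 8 experts, only 2 active per token
d, n_experts = 16, 8
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.standard_normal((d, d)) / d: x @ W
           for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
token = rng.standard_normal(d)
out = topk_moe_layer(token, gate_w, experts, k=2)
print(out.shape)  # (16,)
```

Because different tokens route to different experts on different nodes, that dispatch-and-gather step is exactly the kind of traffic the NVLink and Ethernet pieces of the Rubin rack are built to carry.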