Nvidia claims software and hardware upgrades allow Blackwell Ultra GB300 to dominate MLPerf benchmarks — touts 45% DeepSeek R1 inference throughput increase over GB200

Nvidia has broken its own records in MLPerf benchmarks using its latest-generation Blackwell Ultra GB300 NVL72 rack-scale system, delivering what it claims is a 45% increase in inference performance over the Blackwell-based GB200 platform in DeepSeek R1 tests. By combining hardware improvements with software optimizations, Nvidia claims the top spot across a range of models, and suggests this should be a primary consideration for developers building out “AI factories,” since higher inference throughput can translate into greater revenue potential.

Nvidia’s Blackwell architecture is at the heart of its latest-generation RTX 50-series graphics cards, which offer the best performance for gaming, even if AMD’s RX 9000-series arguably offers better bang for buck. But it’s also what’s under the hood of the company’s big AI-focused GPU systems like the GB200 platform, which is being built into data centers all over the world to power next-generation AI applications. Blackwell Ultra, designated GB300, is the enhanced version of that platform with even more performance, and Nvidia has now backed it up with some impressive MLPerf records.

The latest version of the MLPerf benchmark suite includes inference performance testing using the DeepSeek R1, Llama 3.1 405B, Llama 3.1 8B, and Whisper models, and the GB300 NVL72 stole the show in all of them. Nvidia claims a 45% increase in performance over GB200 when running the DeepSeek R1 model, and up to five times the performance of older Hopper GPUs, although Nvidia does note those comparative results came from unverified third parties.
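For context, comparisons like these boil down to measured inference throughput, typically reported as tokens generated per second, and a relative gain like the quoted 45% is simply the ratio between two such measurements. The short Python sketch below illustrates that arithmetic; the baseline and new throughput figures in it are hypothetical placeholders, not Nvidia's published MLPerf numbers.

```python
# Illustrative only: how a relative inference-throughput gain (like the quoted
# 45% figure) is derived from two measured tokens-per-second values.
# The numbers below are hypothetical placeholders, not Nvidia's MLPerf results.

def relative_gain(new_tps: float, baseline_tps: float) -> float:
    """Percentage throughput improvement of a new system over a baseline."""
    return (new_tps / baseline_tps - 1.0) * 100.0

if __name__ == "__main__":
    gb200_tps = 1_000.0   # hypothetical baseline throughput (tokens/s per GPU)
    gb300_tps = 1_450.0   # hypothetical Blackwell Ultra throughput (tokens/s per GPU)
    print(f"Throughput gain: {relative_gain(gb300_tps, gb200_tps):.1f}%")  # -> 45.0%
```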

Part of these performance gains comes from the more capable Tensor Cores in Blackwell Ultra, with Nvidia claiming “2X the attention-layer acceleration and 1.5X more AI compute FLOPS.” However, the results were also made possible by a range of important software improvements and optimizations.
