Rebeca Moen
Mar 16, 2026 19:57
NVIDIA launches the Vera Rubin platform with all seven chips in full production, promising a 10x reduction in inference costs versus Blackwell. Partner shipments begin in H2 2026.
NVIDIA dropped its biggest hardware announcement since Blackwell at GTC 2026, revealing that all seven chips powering the Vera Rubin platform have entered full production. The system targets a 10x reduction in inference token costs compared with its predecessor, a metric that directly shapes the economics of running AI at scale.
The March 16 announcement comes as NVIDIA trades at $180.25 with a market cap of $4.43 trillion, underscoring investor appetite for the company's AI infrastructure dominance.
What's Actually in the Box
Vera Rubin isn't a single chip. It is an integrated supercomputer architecture built around five distinct rack configurations. The NVL72 rack pairs 72 Rubin GPUs with 36 Vera CPUs connected via NVLink 6, delivering what NVIDIA claims is 10x higher inference throughput per watt at one-tenth the cost per token versus Blackwell.
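To see why a 10x cut in cost per token matters at scale, here is a back-of-the-envelope sketch. The baseline price per million tokens is an assumed placeholder for illustration, not NVIDIA pricing; only the 10x reduction factor comes from the announcement.

```python
# Back-of-the-envelope: what NVIDIA's claimed 10x cost-per-token
# reduction would mean for a large inference bill.
# BLACKWELL_COST_PER_M_TOKENS is an assumed placeholder, not a real quote.

BLACKWELL_COST_PER_M_TOKENS = 2.00   # assumed $ per million tokens (illustrative)
RUBIN_REDUCTION_FACTOR = 10          # NVIDIA's claimed reduction vs. Blackwell

rubin_cost = BLACKWELL_COST_PER_M_TOKENS / RUBIN_REDUCTION_FACTOR

# Serving 1 trillion tokens per month (= 1,000,000 million tokens):
monthly_tokens_m = 1_000_000
blackwell_bill = monthly_tokens_m * BLACKWELL_COST_PER_M_TOKENS
rubin_bill = monthly_tokens_m * rubin_cost

print(f"Blackwell: ${blackwell_bill:,.0f}/month")
print(f"Rubin:     ${rubin_bill:,.0f}/month")
print(f"Savings:   ${blackwell_bill - rubin_bill:,.0f}/month")
```

Under these assumed numbers, a $2 million monthly inference bill drops to $200,000, which is why the cost-per-token metric, rather than raw FLOPS, dominates the purchase decision for hyperscalers.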
The real surprise? NVIDIA Groq 3 LPX integration. The newly acquired inference accelerator adds 256 LPU processors per rack with 128GB of on-chip SRAM and 640 TB/s of scale-up bandwidth. Combined with Rubin GPUs, NVIDIA promises 35x higher inference throughput per megawatt for trillion-parameter models.
"Vera Rubin is a generational leap: seven breakthrough chips, five racks, one giant supercomputer," CEO Jensen Huang said. "The agentic AI inflection point has arrived."
Who's Buying
The customer list reads like an AI industry directory. OpenAI's Sam Altman confirmed plans to run "more powerful models and agents at massive scale," while Anthropic CEO Dario Amodei cited the platform's capacity for "advanced reasoning, agentic workflows and mission-critical decisions."
The same day, Meta and Nebius announced a $27 billion deal powered by Vera Rubin infrastructure, signaling the scale of capital flowing into next-gen AI compute.
Cloud availability spans AWS, Google Cloud, Microsoft Azure, and Oracle, plus NVIDIA Cloud Partners including CoreWeave, Crusoe, Lambda, and Nebius. Hardware OEMs Dell, HPE, Lenovo, and Supermicro will ship systems in H2 2026.
The Efficiency Play
NVIDIA's DSX Max-Q platform enables dynamic power provisioning that reportedly allows 30% more AI infrastructure deployment within fixed power budgets. The company also claims its DSX Flex software can unlock "100 gigawatts of stranded grid power" by making AI factories grid-flexible assets.
For the Vera CPU specifically, NVIDIA touts twice the efficiency and 50% faster performance versus conventional rack-scale CPUs for reinforcement learning workloads, the computational backbone of training AI agents.
What to Watch
Partner shipments begin in the second half of 2026. The key question: can NVIDIA actually deliver at scale? Blackwell faced supply constraints that frustrated hyperscalers. With seven chips now in "full production," NVIDIA is betting it can meet demand that shows no signs of slowing.
Image source: Shutterstock

