At its annual GTC conference, Nvidia CEO Jensen Huang declared that the company expects to sell $1 trillion of AI hardware through 2027, nearly double its previous projections. The announcement signals how dramatically the chip giant's ambitions have shifted. This isn't simply about selling faster processors anymore. It's about controlling the entire ecosystem customers buy into.
Nvidia unveiled its new Vera CPU Rack architecture, which packs 256 liquid-cooled CPUs into a single rack for CPU-centric workloads, claiming a 6x gain in CPU throughput and double the performance on agentic AI tasks. But that's only part of the story. The Vera CPU works alongside the Vera Rubin platform, combining new CPU, GPU, networking and storage hardware into a rack-scale design aimed at agentic AI, reinforcement learning and inference.
For years, Nvidia built the GPUs. Customers bought CPUs from Intel or AMD, networking gear from Mellanox (now part of Nvidia), and memory from vendors like SK Hynix. System integrators like Dell and Lenovo assembled the pieces. That fragmented model worked when raw performance mattered most. But as AI workloads grow more complex, especially agentic AI systems that need rapid communication between processors, the bottlenecks shift. Data moving between the CPU and GPU becomes as critical as raw compute speed.
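A back-of-envelope sketch makes the shift concrete. If each training or inference step splits its time between GPU compute and CPU-GPU transfer, speeding up compute alone leaves the link as the limiting factor; all figures below are hypothetical, chosen only to illustrate the shape of the problem.

```python
# Illustrative Amdahl's-law-style estimate of how CPU-GPU transfer
# comes to dominate a step as compute gets faster. All numbers hypothetical.

def step_time(compute_s: float, transfer_s: float) -> float:
    """Total step time, assuming compute and transfer do not overlap."""
    return compute_s + transfer_s

compute_s = 0.80   # hypothetical: 800 ms of GPU compute per step
transfer_s = 0.20  # hypothetical: 200 ms moving data over the CPU-GPU link

for speedup in (1, 2, 4, 8):
    t = step_time(compute_s / speedup, transfer_s)
    share = transfer_s / t
    print(f"{speedup}x faster compute -> step {t * 1000:.0f} ms, "
          f"transfer = {share:.0%} of the step")

# With 8x faster compute, the step only shrinks from 1000 ms to 300 ms,
# and transfer grows from 20% to ~67% of it: the link, not raw FLOPS,
# becomes the bottleneck.
```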
Nvidia's solution: own the whole stack. The Rubin platform encompasses the Rubin GPU, the Vera CPU, a next-generation DPU, advanced NICs, NVLink 6 scale-up networking, and Ethernet switching infrastructure. This integrated approach means Nvidia is not just selling chips; it is selling entire AI factory blueprints. Every component is co-designed to move data faster and waste less power.
The business logic is sound. The NVL72 design combines 72 Rubin GPUs and 36 Vera CPUs, and can train mixture-of-experts models with one-fourth the number of GPUs required by Blackwell, while delivering up to 10x higher inference throughput per watt and one-tenth the cost per token. Cloud providers and AI labs should love this: lower power consumption, faster inference, cheaper operations. Meta became the first to deploy Nvidia's Grace CPUs as standalone chips in its data centres, rather than packaged alongside GPUs in a server.
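Taken at face value, those NVL72 multipliers are easy to turn into concrete numbers. In the minimal sketch below, the Blackwell baseline figures are invented for illustration; only the ratios come from Nvidia's claims.

```python
# The ratios reported above, applied to an invented Blackwell baseline.
# Only the multipliers (1/4 the GPUs, 10x tokens per watt, 1/10 cost per
# token) come from Nvidia's claims; every absolute number is hypothetical.

blackwell_gpus = 1024            # hypothetical GPUs for an MoE training job
blackwell_tokens_per_joule = 50  # hypothetical inference efficiency
blackwell_usd_per_mtok = 2.00    # hypothetical $ per million tokens served

rubin_gpus = blackwell_gpus / 4                           # "one-fourth the GPUs"
rubin_tokens_per_joule = blackwell_tokens_per_joule * 10  # "10x per watt"
rubin_usd_per_mtok = blackwell_usd_per_mtok / 10          # "one-tenth cost/token"

print(f"GPUs for the job: {blackwell_gpus} -> {rubin_gpus:.0f}")
print(f"Tokens per joule: {blackwell_tokens_per_joule} -> {rubin_tokens_per_joule}")
print(f"$ per M tokens:   {blackwell_usd_per_mtok:.2f} -> {rubin_usd_per_mtok:.2f}")
```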
Yet the strategy raises hard questions. Nvidia already controls roughly 80 to 90 per cent of high-end AI accelerator sales. With the Vera CPU maturing into deployable rack-scale systems, Nvidia is now selling CPUs directly, putting it in head-on competition with Intel and AMD in their core market.
Some intellectual honesty is needed here: Nvidia's integrated approach delivers real engineering gains. The Vera chip uses LPDDR5X memory and delivers up to 1.2 TB/s of memory bandwidth. Nvidia says Vera is 50% faster and twice as efficient as traditional rack-scale CPUs, and when paired with Rubin GPUs over NVLink-C2C it can reach 1.8 TB/s of coherent bandwidth, far above PCIe Gen 6. Customers who buy the bundle gain measurable advantages over patchwork alternatives.
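Those bandwidth figures are straightforward to put in context. A PCIe 6.0 x16 link moves roughly 128 GB/s in each direction, so 1.8 TB/s of coherent bandwidth is somewhere between a 7x and 14x advantage, depending on whether the figure is counted per direction or bidirectionally. A rough sketch, using a hypothetical 96 GB working set:

```python
# Transfer-time comparison: NVLink-C2C at the reported 1.8 TB/s versus
# a PCIe 6.0 x16 link at ~128 GB/s per direction. The 96 GB working set
# (e.g. a KV cache or weight shard) is a hypothetical example.

working_set_gb = 96        # hypothetical working set to move CPU <-> GPU
nvlink_c2c_gb_s = 1800     # 1.8 TB/s coherent bandwidth, per Nvidia's claim
pcie6_x16_gb_s = 128       # PCIe 6.0 x16, one direction

nvlink_ms = working_set_gb / nvlink_c2c_gb_s * 1000
pcie_ms = working_set_gb / pcie6_x16_gb_s * 1000

print(f"NVLink-C2C: {nvlink_ms:.0f} ms")          # ~53 ms
print(f"PCIe 6 x16: {pcie_ms:.0f} ms")            # ~750 ms
print(f"Speedup:    {pcie_ms / nvlink_ms:.0f}x")  # ~14x
```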
But there's a darker edge to vertical integration. Once a data centre operator commits to Nvidia's hardware, networking, and software stack, switching costs become immense. Analysts credit Nvidia's strong position to its GPU dominance and close ties with cloud providers, but they also warn of regulatory and market headaches as the company scales up manufacturing, certification, and support. Antitrust regulators in multiple jurisdictions are watching, wary of a "monoculture" in AI hardware; Nvidia's defence remains a relentless pace of innovation that, it argues, benefits the entire ecosystem.
The real constraint, though, may be supply. A big question is whether Nvidia can meet demand for $1 trillion of AI hardware in the coming years while its supplier TSMC expands capacity at a rather conservative pace. Ambitious sales projections matter little if manufacturing can't keep up.
Nvidia's bet is that data centre operators will accept tighter coupling to its ecosystem because the performance gains justify the cost and the risk. Nvidia essentially builds entire data centres, supplying clients with rack-scale solutions for AI computing that span GPUs, CPUs, and networking. That lets the company optimise for performance and power efficiency at the system level rather than the component level, which gives it an important edge. "Nvidia produces the lowest cost per token and data centres running on Nvidia generate the highest revenues," according to Huang.
For the moment, that logic is holding. Nvidia named Amazon Web Services, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, Alibaba, ByteDance, CoreWeave, Lambda, Nebius, OpenAI, Anthropic, Meta and Mistral AI among the companies working with Vera or Vera Rubin systems. When the world's largest technology companies are betting on your platform, sceptics find it hard to gain traction. Still, the tension between performance optimisation and competitive openness is real. The numbers are staggering; the regulatory, competitive, and operational risks are equally substantial. Investors and customers alike should watch carefully whether Nvidia can execute this vision without provoking a backlash that fragments its market position.
The outcome will define not just Nvidia's future, but the shape of global AI infrastructure for years to come.