Nvidia Vera Rubin Enters Production

Nvidia Vera Rubin enters production and will ship to partners in H2 2026; ICMS memory needs could tighten NAND supply and shift component flows.

January 15, 2026

KEY TAKEAWAYS

  • Nvidia Vera Rubin entered production with NVL72 rack systems slated for partner shipments in H2 2026.
  • ICMS needs about 1,152 TB NAND per NVL72, posing potential near-term pressure on global NAND supply.

NVIDIA Corp. said on Jan. 12, 2026, that Nvidia Vera Rubin, its new rack-scale AI platform, has entered production. The company expects availability in the second half of 2026 through cloud and server partners. Analyst projections highlight potential strain on NAND supply due to the platform’s large memory demands.

Vera Rubin Architecture and Performance

NVIDIA CEO Jensen Huang introduced Vera Rubin at CES 2026, positioning it as the system-level successor to Blackwell, designed for sustained, interactive AI inference workloads. The platform centers on the NVL72 rack system, which aggregates 72 GPUs linked by NVLink 6, delivering 3.6 terabytes per second (TB/s) of inter-GPU bandwidth per GPU.

The Rubin GPU contains about 336 billion transistors and features a third-generation Transformer Engine using NVFP4 precision. NVIDIA rates the chip at up to 50 petaFLOPS (PFLOPS) for inference and 35 PFLOPS for training, supported by HBM4 memory with 22 TB/s bandwidth. The Vera CPU complements the GPU with 88 Arm Olympus cores and 176 threads, paired with SOCAMM memory at 1.2 TB/s and NVLink-C2C links to GPUs at 1.8 TB/s.
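The per-GPU ratings above can be scaled to a full NVL72 rack with simple multiplication. The sketch below assumes peak figures add linearly across all 72 GPUs, which is an illustrative simplification rather than a measured result. The constant and function names are hypothetical.

```python
# Per-GPU peak figures as quoted in the article.
GPUS_PER_RACK = 72
INFERENCE_PFLOPS_PER_GPU = 50   # NVFP4 inference, "up to"
TRAINING_PFLOPS_PER_GPU = 35    # training, "up to"
HBM4_TBPS_PER_GPU = 22          # HBM4 memory bandwidth

PFLOPS_PER_EFLOPS = 1_000


def rack_total(per_gpu, gpus=GPUS_PER_RACK):
    """Assumption: peak per-GPU figures sum linearly across the rack."""
    return per_gpu * gpus


inference_eflops = rack_total(INFERENCE_PFLOPS_PER_GPU) / PFLOPS_PER_EFLOPS
training_eflops = rack_total(TRAINING_PFLOPS_PER_GPU) / PFLOPS_PER_EFLOPS
hbm_tbps = rack_total(HBM4_TBPS_PER_GPU)

print(f"Peak inference per rack: {inference_eflops:.2f} EFLOPS")
print(f"Peak training per rack:  {training_eflops:.2f} EFLOPS")
print(f"Aggregate HBM4 bandwidth: {hbm_tbps} TB/s")
```

These are upper bounds implied by the quoted ratings; sustained throughput on real workloads would be lower.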

A new memory tier, Inference Context Memory Storage (ICMS), holds key-value caches for stateful, agentic inference. NVIDIA claims this tier can boost throughput and power efficiency by up to five times on some workloads and increase effective token generation per GPU by up to ten times compared with Blackwell.

The platform’s networking and acceleration stack includes ConnectX-9 SuperNICs, BlueField-4 data processing units capable of 1.6 terabits per second (Tb/s) per GPU, and Spectrum-X Ethernet at 102.4 Tb/s.

Shipments and NAND Supply Implications

Production of the Vera Rubin platform has begun, with shipments expected in the second half of 2026 through server partners and to major cloud providers and AI labs, including AWS, Microsoft, Google, OpenAI, Meta, and xAI. Some analysts project initial shipments may slip to the fourth quarter of 2026, trailing competitors targeting earlier releases.

Analyst models assume each NVL72 rack requires roughly 1,152 terabytes (TB) of NAND flash for ICMS, or 16 TB per GPU. The same models project shipments of about 30,000 units in 2026 and 100,000 in 2027. At those volumes, ICMS alone would account for approximately 2.8% and 9.3% of annual global NAND demand in those years, respectively.
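The arithmetic behind these projections can be checked with a short calculation. The sketch below uses only the figures quoted above and assumes decimal units (1 EB = 1,000,000 TB). The global demand totals it prints are back-calculated from the rounded percentages, not figures the analysts published, and the function names are hypothetical.

```python
# Analyst assumptions quoted in the article (not NVIDIA specifications).
NAND_TB_PER_RACK = 1_152
GPUS_PER_RACK = 72
SHIPMENTS = {2026: 30_000, 2027: 100_000}             # NVL72 units per year
SHARE_OF_GLOBAL_DEMAND = {2026: 0.028, 2027: 0.093}   # 2.8% and 9.3%

TB_PER_EB = 1_000_000  # assumption: decimal units


def icms_nand_eb(units):
    """Total ICMS NAND, in exabytes, for a given number of racks."""
    return units * NAND_TB_PER_RACK / TB_PER_EB


def implied_global_demand_eb(year):
    """Global NAND demand implied by the quoted share (back-calculated)."""
    return icms_nand_eb(SHIPMENTS[year]) / SHARE_OF_GLOBAL_DEMAND[year]


nand_tb_per_gpu = NAND_TB_PER_RACK / GPUS_PER_RACK

print(f"NAND per GPU: {nand_tb_per_gpu:.0f} TB")
for year, units in SHIPMENTS.items():
    print(
        f"{year}: {units:,} racks -> {icms_nand_eb(units):.2f} EB of NAND, "
        f"implied global demand ~{implied_global_demand_eb(year):,.0f} EB"
    )
```

On these inputs, ICMS would consume about 34.6 EB of NAND in 2026 and 115.2 EB in 2027. Both back-calculated totals land near 1,235 EB, which suggests the quoted shares were computed against a similar demand baseline in both years. That is an inference from rounded figures, not something the analysts state.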

The platform’s large per-rack NAND footprint and system-level design point to sustained long-term demand for AI infrastructure components. In the near term, the same footprint is a potential pressure point for NAND suppliers and for large cloud buyers competing for supply.

The Vera Rubin platform’s production and memory architecture highlight a critical supply-chain factor for investors monitoring AI infrastructure growth and component availability.
