Nvidia Vera Rubin Enters Production

Nvidia Vera Rubin enters production and will ship to partners in H2 2026; ICMS memory needs could tighten NAND supply and shift component flows.

January 15, 2026·2 min read

KEY TAKEAWAYS

  • Nvidia Vera Rubin entered production with NVL72 rack systems slated for partner shipments in H2 2026.
  • Analyst models assume about 1,152 TB of NAND per NVL72 for ICMS, a potential source of near-term pressure on global NAND supply.

NVIDIA Corp. said on Jan. 12, 2026, that Vera Rubin, its new rack-scale AI platform, has entered production. The company expects availability in the second half of 2026 through cloud and server partners. Analyst projections point to potential strain on NAND supply because of the platform’s large flash storage requirements.

Vera Rubin Architecture and Performance

NVIDIA CEO Jensen Huang introduced Vera Rubin at CES 2026, positioning it as the system-level successor to Blackwell and as a platform designed for sustained, interactive AI inference workloads. The platform centers on the NVL72 rack system, which aggregates 72 GPUs linked by NVLink 6, delivering 3.6 terabytes per second (TB/s) of inter-GPU bandwidth per GPU.

The Rubin GPU contains about 336 billion transistors and features a third-generation Transformer Engine using NVFP4 precision. NVIDIA rates the chip at up to 50 petaFLOPS (PFLOPS) for inference and 35 PFLOPS for training, supported by HBM4 memory with 22 TB/s bandwidth. The Vera CPU complements the GPU with 88 Arm Olympus cores and 176 threads, paired with SOCAMM memory at 1.2 TB/s and NVLink-C2C links to GPUs at 1.8 TB/s.
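As a rough illustration of what the per-chip ratings imply at rack scale, the sketch below multiplies the figures quoted above by the 72 GPUs in an NVL72. It assumes peak numbers scale linearly across GPUs, which real workloads will not achieve; the constant and function names are illustrative and do not come from NVIDIA.

```python
# Naive peak aggregation for one NVL72, using the per-GPU ratings quoted above.
# Assumption: linear scaling across all 72 GPUs (an upper bound, not a benchmark).

GPUS_PER_NVL72 = 72

PER_GPU_PEAK = {
    "inference_pflops": 50,   # NVFP4 inference rating per GPU
    "training_pflops": 35,    # training rating per GPU
    "hbm4_tb_per_s": 22,      # HBM4 bandwidth per GPU
}


def rack_peak(per_gpu_value, gpus=GPUS_PER_NVL72):
    """Return the simple sum of a per-GPU peak figure across the rack."""
    return per_gpu_value * gpus


rack_totals = {name: rack_peak(value) for name, value in PER_GPU_PEAK.items()}

for name, total in rack_totals.items():
    print(f"{name}: {total:,}")
```

Under these assumptions one rack peaks at 3,600 PFLOPS (3.6 exaFLOPS) for inference and 2,520 PFLOPS for training, with 1,584 TB/s of aggregate HBM4 bandwidth.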

A new memory tier, Inference Context Memory Storage (ICMS), holds key-value caches for stateful, agentic inference. NVIDIA claims this tier can boost throughput and power efficiency by up to five times on some workloads and increase effective token generation per GPU by up to ten times compared with Blackwell.

The platform’s networking and acceleration stack includes ConnectX-9 SuperNICs, BlueField-4 data processing units capable of 1.6 terabits per second (Tb/s) per GPU, and Spectrum-X Ethernet at 102.4 Tb/s.

Shipments and NAND Supply Implications

Production of the Vera Rubin platform has begun, with shipments expected in the second half of 2026 through server partners, major cloud providers, and AI labs such as AWS, Microsoft, Google, OpenAI, Meta, and xAI. Some analysts project initial shipments may slip to the fourth quarter of 2026, trailing competitors targeting earlier releases.

Analyst models assume each NVL72 rack requires roughly 1,152 terabytes of NAND flash for ICMS. The same models project shipments of about 30,000 units in 2026 and 100,000 in 2027. At those volumes, ICMS alone would account for approximately 2.8% and 9.3% of annual global NAND demand in those years.
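The arithmetic behind these percentages can be checked with a short sketch. It takes the analyst figures cited above at face value, uses decimal units (1 EB = 1,000,000 TB), and backs out the global NAND demand that each percentage implies. The function and variable names are illustrative, and the implied-demand figure is derived here rather than stated by the analysts.

```python
# Back-of-the-envelope check of the analyst NAND figures cited above.
# Assumptions: decimal units, and the quoted percentages are shares of
# total annual global NAND demand measured in bytes.

TB_PER_EB = 1_000_000
NAND_TB_PER_NVL72 = 1_152  # analyst assumption for ICMS

# year -> (projected NVL72 shipments, quoted share of global NAND demand)
PROJECTIONS = {
    2026: (30_000, 0.028),
    2027: (100_000, 0.093),
}


def icms_nand_eb(units, tb_per_unit=NAND_TB_PER_NVL72):
    """Total ICMS NAND in exabytes for a given number of NVL72 units."""
    return units * tb_per_unit / TB_PER_EB


def implied_global_demand_eb(units, share):
    """Global NAND demand (EB) implied by the ICMS total and its quoted share."""
    return icms_nand_eb(units) / share


for year, (units, share) in PROJECTIONS.items():
    total = icms_nand_eb(units)
    implied = implied_global_demand_eb(units, share)
    print(f"{year}: ICMS NAND = {total:.2f} EB, implied global demand = {implied:.0f} EB")
```

Run as written, this gives about 34.56 EB of ICMS NAND in 2026 and 115.2 EB in 2027. Both percentages imply global demand of roughly 1,234 to 1,239 EB, which suggests the quoted shares rest on a nearly flat demand baseline or reflect rounding.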

The platform’s large per-rack NAND footprint and system-level design support a long-term demand outlook for AI infrastructure, but they also create a potential near-term pressure point for NAND suppliers and large cloud buyers.

The Vera Rubin platform’s production and memory architecture highlight a critical supply-chain factor for investors monitoring AI infrastructure growth and component availability.
