OpenAI Jalapeño Chip Targets LLM Inference

OpenAI's Jalapeño chip, built with Broadcom, is in lab testing and is positioned to cut inference costs, which could reshape the competitive landscape for cloud providers and hardware suppliers.

June 24, 2026·3 min read

KEY TAKEAWAYS

OpenAI and Broadcom co-developed Jalapeño from design to tape-out in about nine months.
Engineering samples are running LLM workloads at target frequency and power, with a technical report planned.
Initial deployment is targeted by the end of 2026 with planned expansion into gigawatt-scale data centers.

OpenAI said on June 24, 2026, that the Jalapeño chip, co-developed with Broadcom, is its first custom "Intelligence Processor" for large language model (LLM) inference. The chip is now in lab testing and is designed as the foundation of a multi-generation compute platform.

Design and Early Testing

Jalapeño is OpenAI’s first AI accelerator, purpose-built for LLM inference powering ChatGPT, Codex, the OpenAI API, and future agentic products. The chip is a blank-slate application-specific integrated circuit (ASIC) tailored to OpenAI’s inference patterns. Its architecture reduces data movement and balances compute, memory, and networking to push utilization closer to theoretical peak.
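To illustrate why balancing compute against memory bandwidth matters for utilization, the short Python sketch below applies the standard roofline bound, in which attainable throughput is the lesser of peak compute and memory bandwidth multiplied by arithmetic intensity. It is only an illustration: the peak throughput and bandwidth figures, and the names `attainable_flops` and `utilization`, are hypothetical placeholders. They are not Jalapeño specifications, which have not been published.

```python
# Roofline-style sketch. All numbers are hypothetical placeholders,
# NOT published Jalapeño specifications.

def attainable_flops(peak_flops: float, mem_bandwidth: float,
                     arithmetic_intensity: float) -> float:
    """Roofline bound: throughput is capped by compute or by memory traffic.

    peak_flops:            peak compute in FLOP/s
    mem_bandwidth:         memory bandwidth in bytes/s
    arithmetic_intensity:  FLOPs performed per byte moved
    """
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def utilization(peak_flops: float, mem_bandwidth: float,
                arithmetic_intensity: float) -> float:
    """Fraction of theoretical peak that a kernel can reach at best."""
    return attainable_flops(peak_flops, mem_bandwidth,
                            arithmetic_intensity) / peak_flops


# Hypothetical accelerator: 1 PFLOP/s peak compute, 2 TB/s memory bandwidth.
PEAK_FLOPS = 1.0e15
MEM_BANDWIDTH = 2.0e12

if __name__ == "__main__":
    # Kernels below the ridge point (PEAK_FLOPS / MEM_BANDWIDTH FLOPs per
    # byte) are memory-bound and cannot reach peak compute.
    for intensity in (50, 250, 500, 1000):
        u = utilization(PEAK_FLOPS, MEM_BANDWIDTH, intensity)
        print(f"{intensity:>5} FLOP/byte -> {u:.0%} of peak")
```

Under these assumed figures, a kernel that performs fewer FLOPs per byte than the ridge point is limited by memory traffic. Reducing data movement raises a kernel's effective arithmetic intensity, which moves it closer to the compute ceiling.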

Engineering samples are running machine-learning workloads at production target frequency and power, including GPT-5.3-Codex-Spark. Early lab results indicate substantially better performance per watt than current state-of-the-art accelerators. The companies plan to release a detailed technical performance report in the coming months. Richard Ho, OpenAI’s head of hardware, said, “Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.” [source:11]

Partners, Development, and Deployment

OpenAI and Broadcom took the chip from initial design to manufacturing tape-out in about nine months. OpenAI led the architectural design, optimizing kernels, memory movement, networking, and serving patterns, and used its own AI models to accelerate development. Broadcom handled silicon implementation and data-center networking, including its Tomahawk connectivity chips. Celestica will manage board, rack, and system integration for the servers that house the chip. The design was sent to TSMC for production.

Jalapeño is intended for integration into server systems dedicated to OpenAI workloads. The companies describe it as the first step in a multi-generation compute platform, with initial deployment targeted by the end of 2026. Plans include scaling into gigawatt-scale data centers alongside partners such as Microsoft.

OpenAI positions Jalapeño as an extension of its full-stack platform, moving beyond products and models into custom hardware to make advanced AI faster, more reliable, and less dependent on external suppliers. Early internal tests suggest roughly 50% processing-cost savings compared with standard GPU-based systems. As an ASIC, Jalapeño is less flexible than general-purpose GPUs but offers greater cost efficiency for the narrow, high-volume inference kernels that dominate LLM serving.

Within that inference focus, the chip is described as flexible enough to support current and future LLMs across the industry, which suggests potential applicability beyond OpenAI’s own models.

Market Positioning and Outlook

Jalapeño aims to lower inference costs and provide an alternative to incumbent GPUs, particularly Nvidia’s. Broadcom CEO Hock Tan described the initiative as a “fundamental commitment to scaling the physical infrastructure necessary for the next decade of AI,” with deployment in gigawatt-scale data centers starting in 2026.

The companies have not disclosed formal benchmarks or detailed cost metrics, but early internal estimates suggest up to 50% savings in processing costs compared with existing GPU-based systems. Jalapeño’s capabilities are said to be comparable to Nvidia’s Blackwell GPUs and Google’s tensor processing units (TPUs) for relevant inference workloads.
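To put a savings figure of that size in concrete terms, the Python sketch below converts an hourly system cost and a serving throughput into a cost per million tokens, then applies a fractional saving. Every input is a hypothetical placeholder chosen for easy arithmetic, and the function names are illustrative. None of these values comes from OpenAI or Broadcom, which have not published cost metrics.

```python
# Back-of-the-envelope inference cost arithmetic.
# All inputs are hypothetical placeholders, NOT disclosed figures.

def cost_per_million_tokens(hourly_cost_usd: float,
                            tokens_per_second: float) -> float:
    """Serving cost in USD per one million generated tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


def apply_savings(baseline_cost: float, savings_fraction: float) -> float:
    """Cost remaining after a fractional saving (0.5 means 50% cheaper)."""
    if not 0.0 <= savings_fraction <= 1.0:
        raise ValueError("savings_fraction must be between 0 and 1")
    return baseline_cost * (1.0 - savings_fraction)


if __name__ == "__main__":
    # Hypothetical GPU-based baseline: 18 USD per hour, 5,000 tokens/s.
    baseline = cost_per_million_tokens(hourly_cost_usd=18.0,
                                       tokens_per_second=5000.0)
    # Assumed saving of 50%, the upper end of the early internal estimate.
    reduced = apply_savings(baseline, savings_fraction=0.5)
    print(f"baseline: {baseline:.2f} USD per million tokens")
    print(f"with 50% saving: {reduced:.2f} USD per million tokens")
```

The same saving could equally be read as serving twice the tokens for the same spend. Which framing matters depends on whether a deployment is limited by budget or by capacity.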

OpenAI plans to expand the platform along a multi-generation roadmap in the years ahead. A detailed technical report on performance is expected in the coming months. The press release does not specify pricing, commercial availability, or third-party customer access beyond OpenAI and its cloud partners.

If deployment proceeds as planned, the chip would mark a strategic shift toward vertically integrated infrastructure for LLMs, reducing OpenAI’s reliance on external hardware suppliers and improving compute efficiency.
