OpenAI and Broadcom Unveil 'Jalapeño,' a Custom Inference Chip Said to Cut Costs by Half

The accelerator, taken from initial design to tape-out in nine months and slated for initial deployment with Microsoft by the end of 2026, is the ChatGPT maker's first in-house silicon and a direct bid to loosen its reliance on Nvidia.

OpenAI and Broadcom on Wednesday unveiled Jalapeño, OpenAI’s first in-house silicon, an inference-optimized ASIC that Broadcom chief executive Hock Tan told Bloomberg delivers roughly 50 percent cost savings against the A.I. graphics processing units the industry currently rents by the hour. The announcement, staged jointly out of San Francisco and Palo Alto, is the clearest signal yet that the largest buyers of frontier compute now intend to design around Nvidia rather than through it.

The chip was designed by OpenAI, industrialized by Broadcom, and built with Celestica. Microsoft is the named lead partner for data-center deployment, with initial rollout targeted by the end of 2026. The work extends a ten-gigawatt custom-compute agreement that OpenAI and Broadcom disclosed last year, and slots into a now-familiar pattern: hyperscalers absorbing more of the silicon stack as soon as their workloads become legible enough to harden into ASICs.

What’s structurally notable is the pace. OpenAI and Broadcom say Jalapeño went from initial design to tape-out in nine months, which they characterize as among the fastest cycles ever for a high-performance semiconductor. Greg Brockman, OpenAI’s president and co-founder, told CNBC, “The degree to which our models have been able to accelerate it was very surprising to us.” Read literally, that’s a productivity claim. Read structurally, it’s the first public assertion by a frontier lab that its own models are now compressing the design timelines of the hardware those models will run on.

Richard Ho, who leads OpenAI’s hardware program, said the architecture is “optimized” around the kernels, memory movement and serving patterns “that matter most for frontier A.I. models.” The companies jointly describe Jalapeño’s performance per watt as “substantially better than current state-of-the-art,” while conceding that final measurements aren’t complete and a detailed technical report will follow. Engineering samples are already running production-target workloads in OpenAI’s lab, including a model identified as GPT-5.3-Codex-Spark; the chip targets ChatGPT serving and the Codex coding agent, not general-purpose GPU work.

The market has been pricing this trajectory for months. Broadcom shares are up about 10 percent in 2026, and Tan told CNBC that demand from his six largest customers is “simply insatiable,” with elevated orders booked through 2028. Tan and Broadcom’s semiconductor president Charlie Kawwas handed the first samples to Sam Altman and Brockman at OpenAI’s offices, a staged ceremony that reads less like product launch theater than like the 2008 TARP press conferences: a photographed handoff meant to make a structural shift legible to outsiders before its consequences arrive.

Sources