OpenAI and Broadcom Unveil 'Jalapeño,' a Custom Inference Chip to Curb Reliance on Nvidia

The ChatGPT maker's first in-house accelerator, developed with Broadcom in nine months and aimed at initial deployment by year's end, is showing roughly 50 percent cost savings over standard GPU infrastructure, Broadcom's chief executive said.

OpenAI on Wednesday unveiled Jalapeño, its first custom inference accelerator, co-designed with Broadcom and aimed at initial deployment by the end of 2026. At a San Francisco event, Hock Tan and Charlie Kawwas of Broadcom handed the first physical samples to Sam Altman and Greg Brockman. The handoff was staged choreography for a partnership formally announced in October, one that went from initial design to tape-out in nine months and now has working silicon sitting on a table.

The chip is being marketed as an “Intelligence Processor,” an ASIC purpose-built for inference rather than pre-training. Engineering samples are already running GPT-5.3-Codex-Spark at production frequency and power, according to OpenAI and Broadcom, with the company claiming performance-per-watt “substantially better than current state-of-the-art.” Industry analysts speaking to CNBC framed Jalapeño as less flexible than a GPU but cheaper and more efficient for the narrowly defined workloads that now dominate OpenAI’s compute bill: serving ChatGPT, Codex, and the developer API at scale.

That’s the structural argument. Tan told Bloomberg the chip is showing roughly 50 percent cost savings over typical AI graphics processors in early testing. For a company whose marginal cost per query is functionally a tax paid to Nvidia, halving the bill isn’t an optimization. It’s a balance-sheet event.

Broadcom is contributing silicon implementation and its Tomahawk networking; Celestica handles board, rack, and system integration. Richard Ho, who leads OpenAI’s hardware program, said in a statement that the design lets the company “efficiently execute our most important workloads close to the hardware’s theoretical limits.”

Brockman, speaking to CNBC’s David Faber, said OpenAI used its own models to accelerate parts of the chip’s design. “The degree to which our models have been able to accelerate it was very surprising to us,” he said, a quietly recursive admission: the models pay for the chips, and now the models help design the chips that’ll make the models cheaper to run.

Tan laid out the ramp plainly. Late 2026 is “small prototype development.” “We will start seeing it really ramp up in ’27 and really going full tilt in first half ’28,” he said, tying the rollout to “the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”

Broadcom’s stock, up about 10 percent year-to-date and roughly sevenfold since the end of 2022, has already priced in the trade Nvidia hasn’t yet been forced to make: that the largest customers of the AI boom would, eventually, prefer to own the silicon they depend on.

