OpenAI and Broadcom test Jalapeño custom inference chip in the lab

OpenAI and Broadcom have revealed Jalapeño, a custom inference chip designed to speed up large language model workloads and improve energy efficiency. The companies said engineering samples are already running AI tasks in the lab, marking an early milestone in a broader hardware platform they plan to build over several generations.

The chip is OpenAI’s first Intelligence Processor and part of a larger effort to expand the company beyond models and products into the underlying infrastructure that powers them. OpenAI said Jalapeño was built specifically for LLM inference, with architecture shaped by its own experience running systems such as ChatGPT, Codex, and its API services.

Early results point to better efficiency

OpenAI said initial testing indicates Jalapeño should deliver substantially better performance per watt than current leading alternatives, although final benchmarks have not yet been released. The company plans to share a more detailed technical report in the coming months.

According to OpenAI, the design focuses on reducing data movement and balancing compute, memory, and networking resources more effectively. The goal is to bring real-world utilization closer to the chip’s theoretical peak performance. Closing that gap matters for inference systems that need to serve many users quickly and efficiently.
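
As a rough illustration of the two metrics discussed above, the sketch below computes utilization (achieved throughput as a fraction of theoretical peak) and performance per watt (here expressed as tokens per joule). The function names and every number are hypothetical placeholders chosen for the arithmetic. They are not Jalapeño measurements, since OpenAI has not released benchmarks.

```python
def utilization(achieved_tflops: float, peak_tflops: float) -> float:
    """Fraction of the theoretical peak compute that a workload actually achieves."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return achieved_tflops / peak_tflops


def tokens_per_joule(tokens_per_second: float, watts: float) -> float:
    """Performance per watt for inference: (tokens/s) / W equals tokens per joule."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


if __name__ == "__main__":
    # Hypothetical accelerator: 1000 TFLOPS peak, 400 TFLOPS sustained while serving.
    print(f"utilization: {utilization(400.0, 1000.0):.0%}")

    # Hypothetical serving node: 50,000 tokens/s at a 1,000 W power draw.
    print(f"tokens per joule: {tokens_per_joule(50_000.0, 1_000.0):.1f}")
```

Under these made-up figures, raising sustained throughput without raising power draw improves both numbers at once, which is the kind of gain the design goals above are aiming for.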

The hardware is not intended as a narrow accelerator for one model or one product line. OpenAI said it was developed to work with current and future LLMs across the industry, reflecting the company’s view of how AI inference needs may evolve.

Built with Broadcom and Celestica

Broadcom is handling key parts of the silicon implementation and networking stack, including its Tomahawk networking technology. Celestica is contributing board, rack, system integration, and production expertise. OpenAI said the collaboration allowed the chip to move from initial design to tape-out in nine months, which it described as an exceptionally fast development cycle for a custom advanced semiconductor.

The company also said its own models helped accelerate parts of the design and optimization process. That means the same AI systems used by customers are also being used to improve the infrastructure that will support future versions of those systems.

Richard Ho, who leads OpenAI’s hardware program, said the design grew out of close collaboration with OpenAI researchers and was optimized around what matters most when running frontier AI models: kernels, memory movement, networking, and serving patterns.

Part of a larger infrastructure strategy

OpenAI leadership framed Jalapeño as part of a long-term strategy to build more of the AI stack in house. President and co-founder Greg Brockman said the company wants to make compute more abundant so AI can become faster, more reliable, and more affordable for users and businesses.

Broadcom CEO Hock Tan said the partnership is intended to support a multi-generation roadmap and to enable deployment at gigawatt scale with data center partners, including Microsoft and others, beginning in 2026.

OpenAI said Jalapeño is the first step in a compute platform, with initial deployment planned by the end of 2026 and expansion in later generations. The company argues that improving inference infrastructure can directly affect user experience, from quicker ChatGPT responses to cheaper API usage and more dependable service during periods of heavy demand.

For OpenAI, the chip represents a bet that controlling more of the infrastructure beneath its models will help lower costs and increase performance over time. For Broadcom, it deepens a partnership with one of the industry’s most prominent AI companies as demand for specialized hardware continues to grow.