The math behind the OpenAI Jalapeño chip

Published by AIDaily Editorial Team
Original source author: Dashveenjit Kaur

OpenAI’s financial trajectory hinges heavily on infrastructure costs, a reality that drove the development of the new custom OpenAI Jalapeño chip. Developed in collaboration with Broadcom, the application-specific integrated circuit (ASIC) is a direct attempt to reduce the heavy capital expenditure associated with third-party hardware. While Nvidia currently commands an estimated 75% profit margin on its high-end processors, OpenAI operates on tighter margins, keeping roughly 33 cents of profit on each dollar generated after accounting for its massive operational expenses.

The financial burden of running large language models at scale is severe. Last year, keeping ChatGPT servers responsive cost OpenAI US$8.4 billion. With the platform now attracting 900 million weekly users, that operational cost is projected to reach approximately US$14 billion this year. Over the next eight years, OpenAI has committed roughly US$1.4 trillion to computing power, a massive bet for a company currently generating US$25 billion in annual revenue.

Designing Hardware for LLM Inference

The OpenAI Jalapeño chip, which the company calls its first “Intelligence Processor”, is built specifically for large language model (LLM) inference rather than general-purpose AI workloads. OpenAI provided the core architectural design based on its specific model roadmaps and serving systems, while Broadcom managed the silicon engineering and high-performance networking integration. TSMC handles the physical manufacturing in Taiwan, and Celestica is tasked with building the board and rack systems.

According to OpenAI, early lab samples are already running frontier workloads, including an unreleased GPT-5.3-Codex-Spark model, at target production frequency and power. Richard Ho, head of OpenAI’s hardware program, noted that the architecture minimizes data movement to push realized utilization closer to its theoretical peak.
Unlike general-purpose accelerators adapted from legacy AI workloads, this architecture balances compute, memory, and networking resources to address the data-movement bottlenecks inherent in interactive LLM serving. To achieve this at scale, the platform integrates Broadcom’s Tomahawk networking silicon directly into the design, allowing the custom processors to communicate across massive, clustered data center environments.

The vertical integration flywheel

By moving into custom silicon, OpenAI shifts from being a software layer to a vertically integrated infrastructure company. This full-stack strategy spans the entire pipeline: chip architecture, software kernels, memory systems, network scheduling, and the final application layer. Much like Apple’s tight coupling of proprietary hardware and iOS, OpenAI can now optimize its infrastructure around its exact internal model roadmaps.

This integration feeds a continuous operational flywheel. Enhanced infrastructure efficiency lowers the cost of both training and serving models. More affordable serving leads to better, more responsive products, which drive user volume and revenue that can be reinvested in the next generation of custom infrastructure.

Overcoming the late-mover advantage

By introducing its own silicon, OpenAI enters a landscape where its primary competitors have spent nearly a decade developing proprietary hardware. Google began deploying its Tensor Processing Units (TPUs) in 2015 and now controls roughly a quarter of global AI computing capacity outside of Nvidia’s supply chain. Amazon has shipped over one million of its custom chips, while Meta and Microsoft continue to scale their own infrastructure.

“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant,” said Greg Brockman, president and co-founder of OpenAI.
“By designing more of the stack ourselves, we can serve more intelligence with greater efficiency.”

To close this timeline gap, OpenAI accelerated the development phase. The OpenAI Jalapeño chip went from a blank-slate design to manufacturing tape-out (the final step before physical production) in just nine months. The engineering teams achieved this timeline by using OpenAI’s own language models to automate and optimize portions of the hardware design process. This creates a feedback loop in which the models served to users help build the physical infrastructure that will run future iterations.

Initial deployment of the hardware into data centres is scheduled to begin by the end of 2026. Broadcom CEO Hock Tan confirmed that the rollout will scale alongside infrastructure partners, including Microsoft, to prepare for gigawatt-scale data centre integration.

Key takeaways

  • The Jalapeño chip may inspire Brazilian companies to develop their own hardware solutions.
  • Vertical integration in hardware production could be a strategic model for reducing operational costs.
  • The ability to control costs and optimize operations will be crucial for competitiveness in the tech sector.

Editorial analysis

The introduction of the Jalapeño chip by OpenAI represents a strategic move that could have significant repercussions for the technology sector in Brazil and Latin America. With the growing demand for large-scale language models, the ability to develop custom hardware may enable local companies to become more competitive in a market dominated by giants like Nvidia. This innovation could inspire Brazilian startups and companies to invest in their own hardware solutions, creating a more robust and self-sufficient ecosystem.

Moreover, OpenAI's approach to vertical integration in hardware production could serve as a model for Brazilian companies looking to reduce operational costs and increase efficiency. The integration of components from partners, such as Broadcom's networking silicon, highlights the importance of strategic partnerships for optimizing performance and scalability. This may encourage similar collaborations between Brazilian firms and technology suppliers, fostering an innovative environment.

With OpenAI's operational costs projected to reach approximately US$14 billion this year, it is crucial for Brazilian companies operating in similar areas to consider their own infrastructure strategies. The ability to control costs and optimize operations will be a competitive differentiator. The development of specialized chips could be a viable path for companies looking to stand out in an increasingly saturated and demanding market.
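
The cost figures quoted in the article can be put on a common footing with a few lines of arithmetic. The Python sketch below is a back-of-envelope illustration only: the inputs are the numbers reported above, while the derived ratios are not stated in the source. It assumes the US$1.4 trillion commitment is spread evenly over the eight years (the article does not say how it is phased) and treats every weekly user as costing the same to serve. The helper name `derive_figures` is hypothetical.

```python
# Back-of-envelope arithmetic using only figures quoted in the article.
# The derived values are illustrative; the article reports none of them directly.

SERVING_COST_LAST_YEAR = 8.4e9   # US$, cost of keeping ChatGPT running last year
SERVING_COST_THIS_YEAR = 14e9    # US$, projected cost for this year
WEEKLY_USERS = 900e6             # weekly users
COMPUTE_COMMITMENT = 1.4e12      # US$, committed to computing power
COMMITMENT_YEARS = 8             # period covered by the commitment
ANNUAL_REVENUE = 25e9            # US$, current annual revenue


def derive_figures():
    """Return derived ratios as a dict (hypothetical helper for illustration)."""
    # Assumption: the commitment is spread evenly across the eight years.
    annual_commitment = COMPUTE_COMMITMENT / COMMITMENT_YEARS
    return {
        # Relative increase in serving cost from last year to this year.
        "cost_growth": SERVING_COST_THIS_YEAR / SERVING_COST_LAST_YEAR - 1,
        # Assumption: cost is shared equally by all weekly users.
        "cost_per_weekly_user": SERVING_COST_THIS_YEAR / WEEKLY_USERS,
        "annual_commitment": annual_commitment,
        # How many times current revenue the average annual commitment is.
        "commitment_to_revenue": annual_commitment / ANNUAL_REVENUE,
    }


figures = derive_figures()
print(f"Serving cost growth, year over year: {figures['cost_growth']:.1%}")
print(f"Serving cost per weekly user per year: US${figures['cost_per_weekly_user']:.2f}")
print(f"Average annual compute commitment: US${figures['annual_commitment'] / 1e9:.0f} billion")
print(f"Commitment as a multiple of revenue: {figures['commitment_to_revenue']:.1f}x")
```

Under these assumptions, serving costs grow by about 67% year over year and come to roughly US$15.56 per weekly user per year. The compute commitment averages US$175 billion a year, about seven times current annual revenue. These are rough orders of magnitude rather than reported figures.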

Finally, the evolution of the Jalapeño and its impact on OpenAI's ability to serve a growing user base clearly indicates that hardware innovation is essential for long-term sustainability. Brazil, with its potential in technology and innovation, should closely monitor these developments and consider how it can position itself to seize the opportunities arising from the evolution of language models and the infrastructure needed to support them.

Original source:

AI News

About this article

This article was curated and published by AIDaily as part of our editorial coverage of artificial intelligence developments. The content is based on the original source cited below, enriched with editorial context and analysis. Automated tools may assist with translation and initial structuring, but publication decisions, factual review, and contextual framing remain editorial responsibilities.
