The math behind the OpenAI Jalapeño chip

Published by AIDaily Editorial Staff
4 min read
Original source author: Dashveenjit Kaur

OpenAI’s financial trajectory hinges heavily on infrastructure costs, a reality that drove the development of the new custom OpenAI Jalapeño chip. Developed in collaboration with Broadcom, the application-specific integrated circuit (ASIC) represents a direct attempt to mitigate the heavy capital expenditure associated with third-party hardware. While Nvidia currently commands an estimated 75% profit margin on its high-end processors, OpenAI operates on tighter margins, keeping roughly 33 cents of profit on each dollar generated after accounting for its massive operational expenses.

The financial burden of running large language models at scale is severe. Last year, keeping ChatGPT servers responsive cost OpenAI US$8.4 billion. With the platform now attracting 900 million weekly users, that operational cost is projected to reach approximately US$14 billion this year. Over the next eight years, OpenAI has committed roughly US$1.4 trillion to computing power, a massive bet for a company currently generating US$25 billion in annual revenue.

Designing hardware for LLM inference

The OpenAI Jalapeño chip, dubbed the company’s first “Intelligence Processor”, is built specifically for large language model (LLM) inference rather than general-purpose AI workloads. OpenAI provided the core architectural design based on its specific model roadmaps and serving systems, while Broadcom managed the silicon engineering and high-performance networking integration. TSMC handles the physical manufacturing in Taiwan, and Celestica is tasked with building the board and rack systems.

According to OpenAI, early lab samples are already running frontier workloads, including an unreleased GPT-5.3-Codex-Spark model, at target production frequency and power. Richard Ho, head of OpenAI’s hardware program, noted that the architecture minimizes data movement to push realized utilization closer to its theoretical peak performance.
Unlike general-purpose accelerators adapted from legacy AI workloads, this architecture specifically balances compute, memory, and networking resources to solve the data-movement bottlenecks native to interactive LLM serving. To achieve this at scale, the platform integrates Broadcom’s Tomahawk networking silicon directly into the design, allowing the custom processors to communicate across massive, clustered data center environments.

The vertical integration flywheel

By moving into custom silicon, OpenAI shifts from being a mere software layer to a vertically integrated infrastructure company. This full-stack strategy spans the entire pipeline: chip architecture, software kernels, memory systems, network scheduling, and the final application layer. Much like Apple’s tight coupling of proprietary hardware and iOS, OpenAI can now optimize its infrastructure around its exact internal model roadmaps.

This integration feeds a continuous operational flywheel. Enhanced infrastructure efficiency lowers the cost of both training and serving models. More affordable serving leads to better, more responsive products, which drives user volume and revenue to be reinvested back into the next generation of custom infrastructure.

Overcoming the late-mover advantage

By introducing its own silicon, OpenAI enters a landscape where its primary competitors have spent nearly a decade developing proprietary hardware. Google began deploying its Tensor Processing Units (TPUs) in 2015 and now controls roughly a quarter of global AI computing capacity outside of Nvidia’s supply chain. Amazon has shipped over one million of its custom chips, while Meta and Microsoft continue to scale their own infrastructure.

“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant,” said Greg Brockman, president and co-founder of OpenAI.
“By designing more of the stack ourselves, we can serve more intelligence with greater efficiency.”

To close this timeline gap, OpenAI accelerated the development phase. The OpenAI Jalapeño chip went from a blank-slate design to manufacturing tape-out, the final step before physical production, in just nine months. The engineering teams achieved this timeline by using OpenAI’s own language models to automate and optimize portions of the hardware design process. This creates a feedback loop in which the models served to users are also used to build the physical infrastructure that will run future iterations.

Initial deployment of the hardware into data centers is scheduled to begin by the end of 2026. Broadcom CEO Hock Tan confirmed that the rollout will scale alongside infrastructure partners, including Microsoft, to prepare for gigawatt-scale data center integration.
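
The dollar figures quoted in the article can be related to one another with simple arithmetic. The sketch below is an illustration only, not OpenAI’s accounting: it takes the amounts exactly as reported and assumes, hypothetically, that the US$1.4 trillion commitment is spread evenly over the eight years and that the revenue and serving-cost figures cover comparable periods. All constant and function names are invented for this example.

```python
# Figures as quoted in the article, in billions of US dollars unless noted.
REVENUE_B = 25.0              # current annual revenue
SERVING_LAST_YEAR_B = 8.4     # cost of keeping ChatGPT servers responsive last year
SERVING_THIS_YEAR_B = 14.0    # projected cost this year
COMMITMENT_B = 1400.0         # compute commitment (US$1.4 trillion)
COMMITMENT_YEARS = 8          # period over which the commitment runs
WEEKLY_USERS_M = 900.0        # weekly users, in millions


def summarize():
    """Derive simple ratios from the quoted figures.

    Assumption (not from the article): the commitment is spread
    evenly across the eight years.
    """
    avg_commitment_b = COMMITMENT_B / COMMITMENT_YEARS
    return {
        # Year-over-year growth in serving cost, as a fraction.
        "serving_cost_growth": SERVING_THIS_YEAR_B / SERVING_LAST_YEAR_B - 1,
        # Projected serving cost as a share of current annual revenue.
        "serving_share_of_revenue": SERVING_THIS_YEAR_B / REVENUE_B,
        # Average yearly compute commitment under the even-spread assumption.
        "avg_annual_commitment_b": avg_commitment_b,
        # That average expressed as a multiple of current annual revenue.
        "commitment_to_revenue": avg_commitment_b / REVENUE_B,
        # Projected yearly serving cost per weekly user, in US dollars.
        "serving_cost_per_weekly_user_usd": (
            SERVING_THIS_YEAR_B * 1e9 / (WEEKLY_USERS_M * 1e6)
        ),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, projected serving costs rise by about 67% year over year and amount to 56% of current annual revenue, while the average yearly compute commitment (US$175 billion) is seven times that revenue.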

Key points

  • The Jalapeño chip may inspire Brazilian companies to develop their own hardware solutions.
  • Vertically integrating hardware production may be a strategic model for reducing operating costs.
  • The ability to control costs and optimize operations will be crucial to competitiveness in the technology sector.

Editorial analysis

OpenAI’s introduction of the Jalapeño chip is a strategic move that could have significant repercussions for the technology sector in Brazil and Latin America. With growing demand for large language models, the ability to develop custom hardware could allow local companies to become more competitive in a market dominated by giants such as Nvidia. This innovation may inspire Brazilian startups and companies to invest in their own hardware solutions, creating a more robust and self-sufficient ecosystem.

In addition, OpenAI’s move toward vertically integrated hardware production may be a model for Brazilian companies seeking to reduce operating costs and increase efficiency. The integration of different components, such as Broadcom’s networking silicon, shows the importance of strategic partnerships for optimizing performance and scalability. This may encourage similar collaborations between Brazilian companies and technology suppliers, fostering an environment of innovation.

With operating costs projected to reach US$14 billion, it is crucial that Brazilian companies operating in similar areas consider their own infrastructure strategies. The ability to control costs and optimize operations will be a competitive differentiator. Developing specialized chips may be a viable path for companies that want to stand out in an increasingly saturated and demanding market.

Finally, the evolution of Jalapeño and its impact on OpenAI’s ability to serve a growing user base is a clear indication that hardware innovation is essential for long-term sustainability. Brazil, with its potential in technology and innovation, should watch these developments closely and consider how it can position itself to take advantage of the opportunities arising from the evolution of language models and the infrastructure needed to support them.

Original source:

AI News

About this article

This article was curated and published by AIDaily as part of our editorial coverage of developments in artificial intelligence. The content is based on the cited original source, enriched with context and editorial analysis. Automated tools may assist with translation and initial structuring, but the decision to publish, the fact-checking, and the contextual framing remain an editorial responsibility.