OpenAI today introduced GPT-5.3-Codex-Spark, a new model that is interesting not only in its own right but also because of the hardware it runs on. Instead of Nvidia accelerators, OpenAI is serving this model on wafer-scale chips from Cerebras, making it the first major product to emerge from OpenAI’s infrastructure partnership with the AI chip company.
Last month, OpenAI signed a multi-year agreement with Cerebras, valued at over ten billion dollars, under which the chipmaker will supply computing power by deploying 750 megawatts of its wafer-scale systems. Now, a month later, the partnership has produced its first model.
Cerebras is known for creating processors the size of an iPad with characteristics unattainable by conventional chips. For instance, the current WSE-3 chip contains 4 trillion transistors, 900,000 AI-optimized cores, and 44 GB of on-chip memory with a bandwidth of 21 PB/s. This architecture is specifically designed to minimize latency, which is often a bottleneck in interactive AI workloads.
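As a rough back-of-envelope illustration of what those published figures imply (my own arithmetic, not a number from OpenAI or Cerebras), reading the entire 44 GB of on-chip memory at 21 PB/s takes on the order of a couple of microseconds:

```python
# Back-of-envelope arithmetic from the published WSE-3 specs
# (44 GB on-chip memory, 21 PB/s bandwidth); illustrative only.
on_chip_memory_bytes = 44e9       # 44 GB of on-chip SRAM
bandwidth_bytes_per_s = 21e15     # 21 PB/s aggregate memory bandwidth

sweep_time_s = on_chip_memory_bytes / bandwidth_bytes_per_s
print(f"Time to read all on-chip memory once: {sweep_time_s * 1e6:.1f} µs")
# -> roughly 2.1 µs, which hints at why the design targets low-latency inference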
The GPT-5.3-Codex-Spark model is a smaller, streamlined version of GPT-5.3-Codex, optimized specifically for ultra-low-latency coding workflows. It is also OpenAI’s first model designed for real-time programming. The company is providing Codex-Spark on Cerebras as a research preview to ChatGPT Pro users, allowing developers to experiment while OpenAI and Cerebras work on increasing data center capacity and reliability.
OpenAI describes the release this way: “Our latest frontier models have shown particular strengths in their ability to perform long-running tasks, operating autonomously for hours, days, or weeks without intervention. Codex-Spark is our first model designed specifically for real-time work with Codex, enabling developers to make targeted edits, refactor logic, or refine interfaces and see the results immediately. With this model, Codex now supports both long-term, ambitious tasks and getting work done in the moment. We hope to learn how developers use it and incorporate their feedback as we continue to expand access.”
At launch, Codex-Spark supports text-only input with a 128k-token context window. During the research preview, Codex-Spark will have its own rate limits, and its usage will not count toward standard limits. During periods of high demand, however, users may encounter limited access or temporary queues while reliability is maintained for everyone.
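For developers curious what “real-time” interaction might look like in practice, here is a minimal sketch using the streaming interface of the OpenAI Python SDK. The model identifier "gpt-5.3-codex-spark" is a hypothetical placeholder, and OpenAI has not said whether the research preview is reachable through the public API at all, so treat this purely as an illustration of a low-latency, token-streaming workflow rather than documented usage:

```python
# Minimal streaming sketch with the OpenAI Python SDK (openai >= 1.0).
# The model id "gpt-5.3-codex-spark" is a hypothetical placeholder; during the
# research preview the model may only be available inside the Codex product.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-5.3-codex-spark",  # hypothetical identifier, see note above
    messages=[
        {"role": "system", "content": "You make small, targeted code edits."},
        {
            "role": "user",
            "content": "Rename the variable c to retry_count in:\n"
                       "def fetch(url):\n    c = 0\n    while c < 3:\n        c += 1\n",
        },
    ],
    stream=True,  # stream tokens as they arrive instead of waiting for the full reply
)

# Print each token delta immediately, mimicking a real-time editing loop.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

The point of the sketch is the loop at the bottom: when per-token latency drops far enough, that loop starts to feel like live pair programming rather than a request-and-wait cycle, which is exactly the interaction Codex-Spark is aimed at.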
This release is significant as it represents OpenAI’s first production deployment on silicon other than its long-standing core stack with Nvidia. The move is part of a broader strategy by OpenAI to diversify its hardware suppliers and build a resilient portfolio that matches the right systems to the right workloads. The AI industry has been heavily reliant on Nvidia, which holds an estimated 80% or more of the market share for AI chips. Exploring alternatives like Cerebras allows OpenAI to mitigate supply chain risks and leverage specialized hardware that could unlock new capabilities, particularly for low-latency inference.
The introduction of a real-time coding assistant like Codex-Spark could fundamentally change development workflows. While current AI coding assistants from competitors like Google, Amazon, and Anthropic have significantly improved developer productivity, they can still struggle with complex projects requiring large context windows. A model optimized for near-instant feedback allows for a more interactive and iterative coding process. As Sean Lie, CTO and co-founder of Cerebras, stated, the partnership aims to discover “new interaction patterns, new use cases, and a fundamentally different model experience” made possible by fast inference. This development signals a future where AI not only assists with large, autonomous tasks but also becomes a seamless, real-time partner in the creative process of software development.