OpenAI today introduced GPT-5.3-Codex-Spark, a new model that is interesting not only in its own right but also because of the hardware it runs on. Instead of Nvidia accelerators, OpenAI is serving this model on wafer-scale chips from Cerebras, making it the first major product to emerge from OpenAI’s infrastructure partnership with the AI chip company.
Last month, OpenAI signed a multi-year agreement with Cerebras, valued at over ten billion dollars, under which the chipmaker will supply computing power by deploying 750 megawatts of its wafer-scale systems. Now, a month later, the partnership has produced its first model.
Cerebras is known for creating processors the size of an iPad with characteristics unattainable by conventional chips. For instance, the current WSE-3 chip contains 4 trillion transistors, 900,000 AI-optimized cores, and 44 GB of on-chip memory with a bandwidth of 21 PB/s. This architecture is specifically designed to minimize latency, which is often a bottleneck in interactive AI workloads.
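As a rough back-of-envelope illustration of what those published figures imply (my own arithmetic, not a number from OpenAI or Cerebras), reading the entire 44 GB of on-chip memory at 21 PB/s takes on the order of a couple of microseconds:

```python
# Back-of-envelope arithmetic from the published WSE-3 specs
# (44 GB on-chip memory, 21 PB/s bandwidth); illustrative only.
on_chip_memory_bytes = 44e9       # 44 GB of on-chip SRAM
bandwidth_bytes_per_s = 21e15     # 21 PB/s aggregate memory bandwidth

sweep_time_s = on_chip_memory_bytes / bandwidth_bytes_per_s
print(f"Time to read all on-chip memory once: {sweep_time_s * 1e6:.1f} µs")
# -> roughly 2.1 µs, which hints at why the design targets low-latency inference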
The GPT-5.3-Codex-Spark model is a smaller, streamlined version of GPT-5.3-Codex, optimized specifically for ultra-low-latency coding workflows. It is also OpenAI’s first model designed for real-time programming. The company is providing Codex-Spark on Cerebras as a research preview to ChatGPT Pro users, allowing developers to experiment while OpenAI and Cerebras work on increasing data center capacity and reliability.
OpenAI describes the release this way: “Our latest frontier models have shown particular strengths in their ability to perform long-running tasks, operating autonomously for hours, days, or weeks without intervention. Codex-Spark is our first model designed specifically for real-time work with Codex, enabling developers to make targeted edits, refactor logic, or refine interfaces and see the results immediately. With this model, Codex now supports both long-term, ambitious tasks and getting work done in the moment. We hope to learn how developers use it and incorporate their feedback as we continue to expand access.”
At launch, Codex-Spark supports text-only input with a 128k-token context window. During the research preview, Codex-Spark will have its own rate limits, and its usage will not count toward standard limits. During periods of high demand, however, users may encounter limited access or temporary queues while reliability is maintained for everyone.
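For developers curious what “real-time” interaction might look like in practice, here is a minimal sketch using the streaming interface of the OpenAI Python SDK. The model identifier "gpt-5.3-codex-spark" is a hypothetical placeholder, and OpenAI has not said whether the research preview is reachable through the public API at all, so treat this purely as an illustration of a low-latency, token-streaming workflow rather than documented usage:

```python
# Minimal streaming sketch with the OpenAI Python SDK (openai >= 1.0).
# The model id "gpt-5.3-codex-spark" is a hypothetical placeholder; during the
# research preview the model may only be available inside the Codex product.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-5.3-codex-spark",  # hypothetical identifier, see note above
    messages=[
        {"role": "system", "content": "You make small, targeted code edits."},
        {
            "role": "user",
            "content": "Rename the variable c to retry_count in:\n"
                       "def fetch(url):\n    c = 0\n    while c < 3:\n        c += 1\n",
        },
    ],
    stream=True,  # stream tokens as they arrive instead of waiting for the full reply
)

# Print each token delta immediately, mimicking a real-time editing loop.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

The point of the sketch is the loop at the bottom: when per-token latency drops far enough, that loop starts to feel like live pair programming rather than a request-and-wait cycle, which is exactly the interaction Codex-Spark is aimed at.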
This release is significant as it represents OpenAI’s first production deployment on silicon other than its long-standing core stack with Nvidia. The move is part of a broader strategy by OpenAI to diversify its hardware suppliers and build a resilient portfolio that matches the right systems to the right workloads. The AI industry has been heavily reliant on Nvidia, which holds an estimated 80% or more of the market share for AI chips. Exploring alternatives like Cerebras allows OpenAI to mitigate supply chain risks and leverage specialized hardware that could unlock new capabilities, particularly for low-latency inference.
The introduction of a real-time coding assistant like Codex-Spark could fundamentally change development workflows. While current AI coding assistants from competitors like Google, Amazon, and Anthropic have significantly improved developer productivity, they can still struggle with complex projects requiring large context windows. A model optimized for near-instant feedback allows for a more interactive and iterative coding process. As Sean Lie, CTO and co-founder of Cerebras, stated, the partnership aims to discover “new interaction patterns, new use cases, and a fundamentally different model experience” made possible by fast inference. This development signals a future where AI not only assists with large, autonomous tasks but also becomes a seamless, real-time partner in the creative process of software development.