Inside Codex: How the Agent Loop Builds, Calls Tools, and Stays Fast

OpenAI’s Codex CLI runs an agent loop that structures prompts, executes tools, and streams results through the Responses API. It prioritizes statelessness and Zero Data Retention, relying on exact-prefix prompt caching and append-only updates to the conversation to avoid cache misses. For long threads, Codex compacts conversation state via /responses/compact to stay within context limits while preserving the model's understanding of the session.
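The loop described above can be sketched in a few lines. This is a minimal illustration, not Codex's actual implementation: the model call is stubbed out (in Codex it is a Responses API request), and the names `fake_model`, `run_tool`, and `agent_turn` are hypothetical.

```python
# Minimal sketch of an agent loop, assuming Responses-API-style input
# items. All function names here are illustrative, not Codex internals.

def fake_model(items):
    """Stand-in for model inference: request one tool call, then finish."""
    if not any(i["type"] == "function_call_output" for i in items):
        return {"type": "function_call", "call_id": "call_1",
                "name": "read_file", "arguments": '{"path": "README.md"}'}
    return {"type": "message", "role": "assistant", "content": "Done."}

def run_tool(call):
    """Stand-in tool executor (sandboxed shell/file tools in Codex)."""
    return {"type": "function_call_output", "call_id": call["call_id"],
            "output": "# Project readme"}

def agent_turn(user_message):
    # Input items only ever accumulate; each request's item list is an
    # exact prefix of the next one, which is what enables prompt caching.
    items = [{"type": "message", "role": "user", "content": user_message}]
    while True:
        out = fake_model(items)
        items.append(out)
        if out["type"] == "message":   # an assistant message ends the turn
            return items
        items.append(run_tool(out))    # otherwise run the tool and loop

history = agent_turn("Summarize the readme")
```

The key structural point is the termination condition: the loop only exits when the model emits an assistant message rather than another tool call.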
Key Points
- Codex’s agent loop cycles between model inference and tool calls until an assistant message ends the turn; prompts include system/developer instructions, tools, and layered input items.
- Codex pre-inserts sandbox/permission rules, optional developer and user instructions (AGENTS files and skills), and environment context before the user’s message.
- Tool usage is explicit: function_call and function_call_output items are appended to the input, so each prompt is an exact prefix of the next one, enabling prompt caching.
- Codex keeps requests stateless (no previous_response_id) to support ZDR, so it relies heavily on exact-prefix prompt caching and careful change management to avoid cache misses.
- To prevent context overflow, Codex uses the /responses/compact endpoint to shrink conversations while retaining latent understanding via an encrypted compaction item.
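The exact-prefix property the points above rely on can be demonstrated directly: as long as new items are only appended (and earlier items are never edited or reordered), each serialized request is a literal prefix of the following one. The item contents below are invented for illustration, and JSON serialization stands in for tokenization.

```python
import json

def serialize(items):
    # Deterministic serialization as a stand-in for prompt tokenization.
    return json.dumps(items)

# Hypothetical first request: pre-inserted instructions, the user's
# message, then a tool call and its output appended during the turn.
turn1 = [
    {"role": "developer", "content": "sandbox rules + AGENTS instructions"},
    {"role": "user", "content": "fix the failing test"},
    {"type": "function_call", "name": "shell", "arguments": "pytest"},
    {"type": "function_call_output", "output": "1 failed"},
]
# The next request reuses turn1 verbatim and only appends new items.
turn2 = turn1 + [
    {"type": "message", "role": "assistant", "content": "Patched the test."},
    {"role": "user", "content": "now run lint"},
]
# Dropping turn1's closing bracket gives the shared byte prefix.
assert serialize(turn2).startswith(serialize(turn1)[:-1])
```

Mutating any earlier item (say, rewriting the developer instructions mid-session) would break this prefix match and force a cache miss, which is why the article stresses careful change management.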
Sentiment
The discussion is notably divided. While many appreciate the transparency of OpenAI's blog post and the value of open-sourcing Codex, practical experiences with the tool are deeply polarized. Technical users share nuanced knowledge about API internals and context management in constructive debate. A significant undercurrent of frustration targets proprietary coding tools and their slow responsiveness to bug reports. The community broadly agrees that these agent loops are conceptually simple but practically valuable, with the real differentiation coming from UX, features, and model quality.
In Agreement
- Open-sourcing agent loops is valuable for learning, transparency, and community contribution — Codex's approach here is praised
- The prompt caching and stateless operation design described in the article is sound engineering that enables efficient long sessions
- The compaction endpoint is best-in-class for managing long conversations and preserving context
- Writing progress updates to markdown files is an effective strategy for bridging context window limitations across turns
- The Rust rewrite of Codex CLI dramatically improved performance and resource usage compared to the JS version
Opposed
- Some find Codex CLI practically unusable — slow, fails to solve problems, and gets stuck in loops far more than Claude Code
- The agent loop is fundamentally simple (call model, run tool, repeat) and the article dresses up a straightforward concept
- Codex CLI lacks critical features like hooks and diff visualization that competitors have, making it less practical for real workflows
- Current AI pricing is unsustainably subsidized and the true costs of SOTA inference haven't materialized yet
- Claude Code being proprietary is indefensible when the model is the real value — the harness should be open for community bug fixes