OpenAI Launches GPT-5.2-Codex for Advanced Agentic Coding and Cyber Defense

OpenAI launched GPT-5.2-Codex, an agentic coding model with stronger long-horizon performance, better tool use and vision, and state-of-the-art results on key engineering benchmarks. It significantly improves reliability across large codebases and Windows environments, and advances cybersecurity capabilities while remaining below ‘High’ on OpenAI’s risk scale. The rollout begins for paid ChatGPT users, with API access and a vetted trusted-access program for defensive use cases to follow.
Key Points
- GPT-5.2-Codex advances agentic coding with long-horizon reasoning, context compaction, stronger tool use, improved vision, and better performance on large refactors, migrations, and Windows environments.
- It sets state-of-the-art results on SWE-Bench Pro and Terminal-Bench 2.0 and sustains reliable work across large repositories and extended sessions.
- Cyber capabilities see a major jump, though the model remains below ‘High’ under OpenAI’s Preparedness Framework; additional in-model and product safeguards are in place with a published system card.
- A real-world case shows GPT-driven assistance helped discover and responsibly disclose new React vulnerabilities, illustrating both defensive acceleration and dual-use risks.
- Deployment starts for paid ChatGPT users now, API access is coming, and a vetted, invite-only trusted access program will unlock more permissive capabilities for defensive cybersecurity.
Sentiment
The Hacker News discussion is mixed but cautiously optimistic. Commenters are enthusiastic about the model's advanced reasoning, its potential in cybersecurity, and observed performance improvements in specific areas, but they raise significant concerns about privacy, perceived slowness, and reliability in complex, nuanced coding scenarios. Most agree that such models are useful, yet many are skeptical of broad claims of superiority made without practical context or benchmarks, and some still prefer competitor offerings for specific coding needs.
In Agreement
- Codex models (GPT-5.x) excel at careful, methodical reasoning, bug finding, and identifying inconsistencies; some commenters call the results astounding.
- Models have already reached a significant level of utility for security work, particularly for hypothesis identification and automating vulnerability analysis grunt work, freeing human testers for more creative tasks.
- The strategy of providing invite-only trusted access to more permissive models for vetted defensive cybersecurity professionals makes sense for balancing accessibility with safety and managing dual-use risks.
- Some users report that GPT-5.2 (the base model) outperforms Gemini and Claude on certain tasks, such as building UI elements from Figma links, suggesting a competitive edge in specific areas.
- Some users count the Codex CLI as their favorite coding assistant and find the development team responsive to feedback, suggesting a positive user experience despite perceived slowness.
Opposed
- A significant privacy and security concern is that tasks in Codex cannot be deleted, only archived, so code diffs and prompts appear to be stored indefinitely.
- Claims of GPT-5.2's general superiority over competitors like Claude are disputed because they lack specific benchmarks, and because effectiveness in real, large codebases depends heavily on the integrated tooling (e.g., Claude Code, Gemini-CLI).
- Some users found that GPT-5.2 overfits to common implementations: it pattern-matches incorrectly, ignores differences explicitly highlighted in prompts, breaks working code, and refuses to accept corrections, a dangerous form of 'target fixation'.
- The perceived slowness of GPT-5.x models compared with competitors such as Opus 4.5 leaves some users unenthusiastic about this release; they prefer models that rely less on 'thinking'.
- Some users previously found Codex to be significantly worse than Claude Code in both user experience and actual results, expressing a hope that this new release will allow OpenAI to truly compete again.