Coding Agents Don’t Lack IQ—They Lack Context
The author argues that coding agents fail mainly due to missing context, not insufficient intelligence. While models excel on contest problems, real-world development depends on implicit knowledge about architecture, history, practices, and business needs that isn’t centrally documented. Progress requires richer contextual ingestion and synthesis, human guidance for gaps, and agents that know when to ask for help.
Key Points
- Model intelligence is increasingly sufficient; the limiting factor for coding agents is missing real-world context, not raw capability.
- Current agents reliably handle tasks up to roughly the size of a single commit (the article's Level 2) on existing codebases; larger autonomous scopes fail primarily because of context gaps.
- Essential context includes not just code and docs but emergent architecture, historical decisions, unwritten dev/deploy practices, and product/business requirements.
- This context is fragmented, often undocumented, and requires sophisticated preprocessing and synthesis—simple file or tool access is not enough.
- To progress, agents must ingest richer context, keep humans in the loop to fill inevitable gaps, and learn to detect missing context and ask for it (see the sketch after this list).
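As a rough illustration of that last point, the sketch below shows an agent step that detects missing context and asks a human instead of guessing. `Task`, `ContextStore`, and `run_step` are hypothetical names chosen for this example, not anything defined in the article or a specific tool.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    # Context keys the agent believes it needs before acting,
    # e.g. "deploy process" or "why module X is split in two". (Illustrative only.)
    required_context: list[str] = field(default_factory=list)

@dataclass
class ContextStore:
    # Whatever the agent has already ingested: code, docs, past decisions.
    notes: dict[str, str] = field(default_factory=dict)

    def missing(self, keys: list[str]) -> list[str]:
        return [k for k in keys if k not in self.notes]

def run_step(task: Task, ctx: ContextStore, ask_human) -> str:
    """Act only when the required context is present; otherwise ask for it."""
    for key in ctx.missing(task.required_context):
        # Detect the gap and ask, rather than guessing with full confidence.
        ctx.notes[key] = ask_human(f"I need context on: {key}. Can you fill me in?")
    return f"Proceeding with '{task.description}' using {len(ctx.notes)} context items."

# Example usage with a stubbed human response.
if __name__ == "__main__":
    task = Task("rename the billing module",
                required_context=["why billing is split across two services"])
    print(run_step(task, ContextStore(), ask_human=lambda q: "(answer from a teammate)"))
```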
Sentiment
The Hacker News discussion largely agrees with the article's core premise, validating that context is a significant bottleneck for coding agents. Many commenters share anecdotal evidence and propose solutions centered on better context management, signaling broad acceptance of the problem statement. A vocal minority argues that the deeper issue is the intelligence or architecture of LLMs themselves, and some raise concerns about who bears 'responsibility' for agent-written code, but the dominant view aligns with the article's focus on context.
In Agreement
- LLMs are demonstrably bad at managing context, leading to 'context poisoning', where they struggle both to discard irrelevant information and to stay focused on the latest intent in multi-step operations.
- Models often 'forget' earlier parts of large files and lack a native understanding of time or the chronological evolution of a codebase.
- Humans effectively abstract and summarize information; LLMs need similar mechanisms, such as hierarchical summaries and dynamic notes that are updated with every code change (see the sketch after this list).
- Practical workarounds such as starting a new chat session, asking for a summary, or using 'branch to new chat' or `/compact` commands remain necessary human-in-the-loop interventions for context management.
- Refactoring codebases for 'LLM compatibility' (making code more modular, adding extensive documentation that records the rationale behind decisions, and annotating spots prone to future regressions) significantly improves agent performance and benefits human readers too.
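To make the hierarchical-summary and `/compact`-style ideas above concrete, here is a minimal sketch of rolling context compaction. `compact_history`, its parameters, and the stand-in `summarize` function are assumptions made for illustration, not any particular tool's API.

```python
from typing import Callable

def compact_history(turns: list[str],
                    summarize: Callable[[str], str],
                    keep_recent: int = 4) -> list[str]:
    """Collapse all but the most recent turns into a single summary entry.

    `summarize` stands in for an LLM call; any function that maps a long
    string to a shorter one will do for this sketch.
    """
    if len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize("\n".join(older))
    # The summary replaces the older turns, mimicking a compaction step,
    # so the latest intent stays in focus instead of drowning in history.
    return [f"[summary of earlier discussion] {summary}"] + recent

# Example usage with a trivial stand-in for the summarizer.
if __name__ == "__main__":
    history = [f"turn {i}: user and agent messages ..." for i in range(10)]
    print(compact_history(history,
                          summarize=lambda text: f"{len(text.splitlines())} earlier turns condensed"))
```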
Opposed
- LLM 'intelligence' itself is still a bottleneck, with models making 'astoundingly stupid decisions with full confidence' even with complete context, indicating context is not the only or even the primary problem.
- The fundamental architecture of LLMs as 'next token predictors' inherently limits their capability on complex, long-horizon programming tasks, and no amount of context engineering will overcome this.
- Competition-style benchmarks (like ICPC) are not representative of real-world software development, as they comprise small, self-contained problems that LLMs excel at due to knowledge and speed, not deep contextual understanding.
- The issue is not 'context' in a human-centric sense, but rather a limitation of the model's architecture itself, which performs as designed.
- The ultimate bottleneck for AI coding agents will be 'responsibility' for code quality and errors, as the speed of agent development outstrips human capacity for review and accountability.