Small, Business-Value Units Make AI Coding Work

AI-assisted coding succeeds or fails based on how well we manage the unit of work and its context. Because errors compound across turns and real projects are messy, we need small, human-legible tasks with verifiable checkpoints that deliver business value. User stories—augmented with better guidance for agents—offer the best foundation, which the StoryMachine experiment aims to refine.

Key Points

Context engineering is critical: provide just enough, well-structured context to enable one-shot solutions; too little causes hallucinations, too much leads to context rot.
Errors compound across multi-turn agent workflows, so long tasks need small steps with human-legible, verifiable checkpoints to limit propagation.
Benchmark claims about long-horizon competence often assume low messiness; real software work is messy, reducing success rates substantially.
The right unit of work is small and delivers business value; user stories are a strong foundation because they align stakeholders and anchor outcomes.
Agent planning is useful but should operate over small, business-valued units; the StoryMachine experiment explores enhanced story formats to guide agents.

Sentiment

The overall sentiment is cautiously optimistic but leans heavily towards agreement with the article's practical advice while expressing significant skepticism about the revolutionary claims often made for AI in coding. Many practitioners confirm that AI works best for small, well-defined tasks under close human supervision, and its current capabilities are not a 'step change' or '10x' productivity boost for complex, long-horizon tasks. There's a strong undercurrent of frustration with AI's current limitations (hallucinations, not following instructions, requiring intense auditing), though a dedicated minority sees substantial potential with significant investment in learning 'advanced techniques.'

In Agreement

Working in small, concrete chunks is beneficial for managing AI-assisted development, aligning with traditional project management principles.
Explicitly managing context, such as summarizing completed tasks and feeding a new context for subsequent tasks, is effective in reducing issues like hallucinations and improving speed.
AI agents are 'handy' when supervised closely and given 'well-scoped (small) tasks,' as they are not suitable for unsupervised, large-scale code generation.
AI excels at specific, limited tasks like code review, brainstorming, ideation, and generating solutions for small, well-defined problems that can be quickly vetted by a human.
The mental overhead of auditing and correcting AI-generated code, even for small tasks, often negates any perceived time savings, emphasizing the need for manageable units.
AI-generated code requires 'EXTREME supervision' and 'constant handholding' to ensure quality and correctness, making small units of work easier to manage.

Opposed

Some highly experienced users claim significant productivity gains (managing large codebases in production) after dedicating hundreds or thousands of hours to mastering 'advanced techniques' for AI coding, suggesting a potential beyond what most perceive.
Skepticism exists regarding the notion that 'most SWE folks still have no idea' about the true advancements in AI coding, with many arguing that improvements are incremental rather than revolutionary.
The value proposition of AI as pushed by companies is often for doing larger units of work, leading to user exhaustion from constant auditing and fixing edge cases, implicitly challenging the practical feasibility of breaking down all work into truly small units without extreme human overhead.
Claims of 2x, 5x, or 10x productivity boosts from AI are widely dismissed as unrealistic for 'non-trivial software,' suggesting that current tools are not a 'step change.'
AI agents, even with small and tightly crafted prompts, can ignore instructions, generate code with fundamental errors (e.g., C++ features in C), and still not save time.
Unlike human junior developers who learn from mistakes, AI agents are 'stuck at the level of expertise of its model,' requiring new model releases for improvement, which limits their long-term growth as a 'developer'.