Small, Business-Value Units Make AI Coding Work
AI-assisted coding succeeds or fails based on how well we manage the unit of work and its context. Because errors compound across turns and real projects are messy, we need small, human-legible tasks with verifiable checkpoints that deliver business value. User stories, augmented with better guidance for agents, offer the strongest foundation; the StoryMachine experiment aims to refine that approach.
Key Points
- Context engineering is critical: provide just enough, well-structured context to enable one-shot solutions; too little causes hallucinations, too much leads to context rot.
- Errors compound across multi-turn agent workflows, so long tasks need small steps with human-legible, verifiable checkpoints to limit propagation.
- Benchmark claims about long-horizon competence often assume low messiness; real software work is messy, reducing success rates substantially.
- The right unit of work is small and delivers business value; user stories are a strong foundation because they align stakeholders and anchor outcomes.
- Agent planning is useful but should operate over small, business-valued units; the StoryMachine experiment explores enhanced story formats to guide agents.
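The compounding claim above can be made concrete with a toy calculation. Assuming (illustratively; the article gives no specific rates) that each step in an agent workflow succeeds independently with some fixed probability, the chance the whole chain succeeds shrinks geometrically with its length:

```python
# Toy illustration of how per-step errors compound across a multi-turn
# agent workflow. The 95% per-step rate is a hypothetical assumption,
# not a figure from the article.

def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in an n-step chain succeeds,
    assuming steps fail independently."""
    return per_step ** steps

for steps in (1, 5, 10, 20):
    print(f"{steps:2d} steps @ 95% each -> {chain_success(0.95, steps):.1%}")
```

Even a seemingly strong 95% per-step rate drops below a coin flip by ten steps, which is why small units with verification checkpoints between them limit how far an error can propagate.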
Sentiment
The community is broadly skeptical of grand AI coding productivity claims while largely agreeing with the article's specific advice to work in small, verifiable chunks. Most commenters report qualified, modest benefits from AI coding tools rather than the transformative gains promised by advocates. There's particular frustration with the exhaustion of constant code review and the cognitive cost of maintaining suspicion toward AI output. The minority of enthusiastic advocates face sharp pushback and accusations of unverifiable claims.
In Agreement
- Small units of work with clear verification checkpoints are essential for AI coding reliability
- Context management and engineering, not model intelligence, is the real bottleneck
- Error rates compound across multi-turn workflows, making long agentic sessions unreliable
- TDD provides the missing check-and-balance discipline that most AI coding workflows lack
- Summarizing tasks and starting fresh contexts works better than continuing in degraded sessions
Opposed
- User stories are the wrong unit of work entirely — software decomposes into horizontal infrastructure layers, not vertical feature slices
- If AI tools require this much process management overhead, the productivity gains are questionable or illusory
- Even small, precisely scoped tasks routinely fail, producing invalid code that needs manual correction
- The 'you're holding it wrong' defense masks fundamental limitations — a tool requiring thousands of hours of practice is not the revolution being marketed
- Developers serve the tools rather than tools serving developers — a reverse centaur dynamic