Stop Vibes, Start Verifying: Deterministic Guardrails for AI Agents

Added Dec 8, 2025
Article: Positive · Community: Neutral/Mixed

The author critiques LLM-as-a-Judge as an unreliable way to police AI outputs because it stacks probabilistic judgments. They advocate deterministic, code-based verification—assertions, parsing, and real checks—to block errors regardless of model confidence. Steer, an open-source Python library, implements a verification layer and a "Teach" loop to catch and fix failures without redeployments.

Key Points

  • LLM-as-a-Judge creates a circular, probabilistic feedback loop that can rubber-stamp hallucinations; you can’t fix probability with more probability.
  • Treat agents like software: use assertions, unit tests, and deterministic checks to block unsafe or incorrect actions.
  • Replace vibe checks with verifiable code paths: make real HTTP requests, parse SQL ASTs, and query databases for disambiguation.
  • Steer provides a verification layer that enforces hard guardrails around agent functions using simple, composable verifiers.
  • A built-in "Teach" loop converts caught failures into targeted rules that patch behavior without prompt rewrites or redeploys.
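The "simple, composable verifiers" idea can be sketched in plain Python. This is an illustrative wrapper, not Steer's actual API: `guarded` and `verify_sql` are hypothetical names, and the SQL check uses SQLite's `EXPLAIN` to parse and plan a statement deterministically, without executing it.

```python
import sqlite3


def verify_sql(query: str) -> None:
    """Deterministically validate SQL by asking SQLite to plan it.

    Raises sqlite3.Error on invalid SQL -- a hard, binary failure,
    with no probabilistic judgment involved.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # EXPLAIN parses and plans the statement without running it.
        conn.execute("EXPLAIN " + query)
    finally:
        conn.close()


def guarded(verifier):
    """Wrap an agent function so its output must pass a code check."""
    def wrap(fn):
        def inner(*args, **kwargs):
            out = fn(*args, **kwargs)
            verifier(out)  # raises on failure; the result never escapes unchecked
            return out
        return inner
    return wrap


@guarded(verify_sql)
def generate_query(question: str) -> str:
    # Stand-in for an LLM call that emits SQL.
    return "SELECT 1"
```

The point is the failure mode: if the model emits malformed SQL, the wrapper raises instead of letting a second LLM decide whether the output "looks right".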

Sentiment

HN broadly agreed with the problem diagnosis—that LLM-as-a-Judge is flawed and that probabilistic systems need deterministic guardrails—but was skeptical of the specific solution offered. The article's "Teach" loop drew pointed criticism as being no more reliable than what it was meant to fix. Much of the discussion drifted into broader debates about LLM limitations, sycophancy, and the nature of intelligence rather than engaging with the article's actual technical proposal. The article was heavily upvoted, suggesting the problem framing resonated strongly, but the comment thread reflects a more cautious, critical stance toward the implementation.

In Agreement

  • Using LLM-as-a-Judge to validate LLM outputs compounds the hallucination problem rather than solving it—adding probabilistic judgment on top of probabilistic generation.
  • Deterministic acceptance criteria (hard assertions like format checks, type validation, range tests) are the right abstraction for wrapping LLM outputs, converting probabilistic generation into binary pass/fail results.
  • Pre-flight checklist verification—confirming that the agent has correct, factually grounded inputs before it starts reasoning—reduces unforced errors and is both practical and underprovided in current agent frameworks.
  • Code-based verification (actually making network calls to validate URLs, parsing SQL via AST) is more reliable than asking an LLM to evaluate its own output.
  • Treating AI agents like software—with assertions, unit tests, and hard failure modes—is the correct engineering mindset for building reliable production systems.
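A minimal sketch of the "deterministic acceptance criteria" point above: format, type, and range checks that collapse a probabilistic generation into a binary pass/fail. The schema here (`patient_id`, `age`) is hypothetical, chosen to echo the article's medical examples.

```python
import json
import re


def accept(raw: str) -> bool:
    """Binary pass/fail for an LLM output.

    Hypothetical expected shape: {"patient_id": "P-1234", "age": 42}.
    """
    try:
        data = json.loads(raw)  # must be valid JSON at all
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    if not isinstance(data.get("age"), int):  # type check
        return False
    if not 0 <= data["age"] <= 130:  # range check
        return False
    # Format check: IDs must look like "P-" followed by digits.
    return bool(re.fullmatch(r"P-\d+", str(data.get("patient_id", ""))))
```

No confidence scores, no second model: the output either satisfies every criterion or it is rejected.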

Opposed

  • The article's "Teach" loop, which injects learned correction rules back into the LLM's context on subsequent runs, is itself probabilistic: the model may or may not follow those injected rules, so this is still fixing probability with more probability.
  • True reliability requires rules embedded in APIs and code, not in prompt context—any LLM rule injected into a context window is just an advisory the model can ignore.
  • For the pre-flight check examples described (verify patient ID, image modality, date range), conventional deterministic code would be more reliable than an AI-based approach in the first place.
  • The fundamental hallucination problem is architectural: LLMs operate at the word/token level without a fact representation, so no amount of guardrailing can make them reliably accurate on tasks requiring factual truth.
  • The article's thesis about "stopping vibes" is undermined by its own "Teach" feature, which relies on the model's vibes-based interpretation of injected correction rules.
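The Opposed point about pre-flight checks can be made concrete: verifying inputs before the agent ever runs needs no LLM at all. A minimal sketch, with a hypothetical `patients` table and `preflight` helper; the check is a plain database query, not an injected prompt rule the model is free to ignore.

```python
import sqlite3


def preflight(conn: sqlite3.Connection, patient_id: str) -> None:
    """Refuse to start the agent unless the patient record actually exists."""
    row = conn.execute(
        "SELECT 1 FROM patients WHERE id = ?", (patient_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown patient id: {patient_id}")


# Demo against an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO patients VALUES ('P-1234')")
preflight(conn, "P-1234")  # known ID: passes silently
```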