The Agent Specification Gap: Why Your Agents Ignore What You Write

· 12 min read
Tian Pan
Software Engineer

You wrote a careful spec. You described the task, listed the constraints, and gave examples. The agent ran — and did something completely different from what you wanted.

This is the specification gap: the distance between the instructions you write and the task the agent interprets. It's not a model capability problem. It's a specification problem. Research on multi-agent system failures published in 2025 found that specification-related issues account for 41.77% of all failures, and that 79% of production breakdowns trace back to how tasks were specified, not to what models can do.

Most teams writing agent specs make the same category of mistake: writing instructions the way you'd write an email to a competent colleague, then expecting an autonomous system with no shared context to execute them correctly across thousands of runs.

Why "Clear" Instructions Fail in Practice

When engineers write agent specifications, they write for the version of the reader who already knows what they mean. The spec says "clean up the database entries" and the author has a specific mental picture: archive soft-deleted rows older than 90 days, skip anything flagged as pending, leave everything else untouched. The agent reads the same four words and has none of that picture.

Natural language is underspecified by design. Human communication works because we carry enormous amounts of implicit shared context — domain knowledge, institutional memory, conversational norms. Agents don't have that context unless you put it in the spec explicitly. Recent benchmarking of frontier models on agentic instruction-following found that even the best-performing models achieve only 48.3% success on tasks that require bridging literal instructions with contextual reasoning. The other half of tasks fail not because the model can't execute the mechanics but because the spec leaves too much unstated.

The failure compounds in multi-step workflows. An agent with 85% per-step accuracy running a 10-step workflow completes it correctly only 20% of the time. If each step has an underspecified precondition or an ambiguous success criterion, errors don't just accumulate — they cascade. Step 3 misinterprets what step 2 produced. Step 6 executes on stale state. Step 9 defines "done" differently than the spec intended.
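The arithmetic behind that number is worth seeing once. A minimal sketch, treating each step as an independent pass/fail event (real cascades, where one step's error poisons the next, are worse than this):

```python
# Rough illustration: probability that an n-step workflow finishes with no
# step going wrong, assuming each step succeeds independently with probability p.
def workflow_success(p: float, steps: int) -> float:
    return p ** steps

print(round(workflow_success(0.85, 10), 3))  # 0.197 -- roughly 1 run in 5
print(round(workflow_success(0.99, 10), 3))  # 0.904 -- even 99% per step leaves ~10% failures
```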

The Three Anti-Patterns That Break Specs

Most specification failures fall into three categories, and understanding them is the prerequisite to fixing them.

Underspecified preconditions. The spec describes what the agent should do without stating what must be true before it starts. An instruction to "update the user preferences" doesn't tell the agent whether the user record must exist first, whether it should create a record if it doesn't, or what to do if the preferences schema has changed. An agent executing this in a test environment might succeed because the records are always there. The same agent in production encounters a fresh user and either errors out, creates a corrupt record, or silently skips the operation — behavior that was always possible but never specified.

Ambiguous success criteria. The spec doesn't define what "done" looks like. "Analyze the document and extract key insights" sounds like a complete instruction. It isn't. What counts as a key insight? How many should there be? What format should they take? What should the agent do if the document is too short to have meaningful insights, or if it's in a language the agent handles poorly? Without an explicit success condition, the agent invents its own — and its definition diverges from yours in unpredictable ways across different inputs.

Implicit world-state assumptions. The spec was written assuming the environment looks a certain way: specific services are available, particular schemas are in place, prior steps have completed successfully. The agent can't see these assumptions; it can only act on what's in its context window. Research on what gets called "implicit intelligence" (the gap between what users say and what they mean) finds that environmental factors such as the state of external systems, permissions, and resource availability are almost never explicitly stated in agent specs, yet they determine whether the agent's behavior is correct.

The worst specs contain all three. "Remove outdated entries" has an underspecified precondition (which database? which table?), an ambiguous success criterion (what makes an entry outdated?), and an implicit assumption (that the entries are safe to delete and not referenced elsewhere). An agent that successfully deletes everything older than a date it infers from context is technically doing what the spec says. The production incident that follows is entirely predictable.
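To make the gap concrete, here is a hedged sketch of that same instruction with its three hidden decisions surfaced as explicit fields. Every name and value below is illustrative, not taken from any real system:

```python
# Illustrative only: "remove outdated entries" with the unstated decisions written down.
cleanup_spec = {
    "target": {"database": "analytics", "table": "events"},   # which entries? (precondition)
    "outdated_when": "created_at older than 90 days",         # what counts as outdated? (success criterion)
    "safety": {
        "must_not_be_referenced_by": ["reports.event_id"],    # safe to remove? (world-state assumption)
        "action": "archive_not_delete",
    },
}
```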

The Structural Fix: Specs as Behavioral Contracts

The mental shift that makes specifications reliable is treating them like software contracts rather than task descriptions. A task description tells the agent what you want. A behavioral contract tells the agent what must be true before it starts, what must be true when it finishes, and what invariants it cannot violate in between — regardless of what specific operations it uses to get there.

This isn't a new idea. Design-by-Contract (DbC) has been a software engineering principle since the 1980s. It just hasn't been applied systematically to agent specifications, even though agents are exactly the kind of autonomous component where contract enforcement matters most.

A spec structured as a behavioral contract has four required elements:

Preconditions — explicit statements of what must be true before the agent executes. Not "the database should be available" but "the users table must exist and contain records matching the provided ID. If the record does not exist, abort with error code USER_NOT_FOUND." Preconditions give the agent a clear halting condition before it takes any action, which prevents the class of failures where an agent proceeds on incorrect assumptions.
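One way to make that halting condition machine-checkable is to run the precondition as a gate before the agent takes any action. A minimal sketch, with hypothetical names (`fetch_user`, `PreconditionFailure`) standing in for whatever your stack provides:

```python
# Hypothetical sketch: a precondition gate evaluated before the agent acts.
class PreconditionFailure(Exception):
    def __init__(self, code: str, detail: str):
        super().__init__(f"{code}: {detail}")
        self.code = code

def check_preconditions(user_id: str, db) -> None:
    record = db.fetch_user(user_id)  # illustrative accessor, not a real API
    if record is None:
        # Abort with the error code the spec names, before any write happens.
        raise PreconditionFailure("USER_NOT_FOUND", f"no user record for id {user_id}")
```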

Postconditions — explicit statements of what must be true when the task completes. Not "the report should be generated" but "the output must be a JSON object conforming to ReportSchema, with a status field set to complete, containing at least one entry in findings." Postconditions give the agent a testable definition of success. Without them, the agent has to invent its own exit condition — and it will.
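The same idea applies on the way out: the postcondition becomes a check the output must pass before it counts as done. A minimal sketch, with `ReportSchema` reduced to a few manual assertions purely for illustration:

```python
# Hypothetical sketch: postconditions as explicit checks on the agent's output.
def check_postconditions(output: dict) -> None:
    if not isinstance(output, dict):
        raise ValueError("output must be a JSON object")
    if output.get("status") != "complete":
        raise ValueError("status field must be set to 'complete'")
    findings = output.get("findings")
    if not isinstance(findings, list) or len(findings) < 1:
        raise ValueError("findings must contain at least one entry")
```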

Invariants — constraints that must remain true throughout execution, regardless of intermediate steps. "Do not delete records flagged with protected: true." "Do not make API calls to external services not in the approved list." "Do not modify records outside the scope of the current task." Invariants encode the "obviously you wouldn't do that" knowledge the spec author carries but never writes down.
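Because invariants must hold throughout execution, they are most useful as a gate around every action the agent proposes rather than a check run once at the end. A hedged sketch, with the action shape and the approved-host list both made up for illustration:

```python
# Hypothetical sketch: invariants enforced on every proposed action.
APPROVED_HOSTS = {"api.internal.example.com"}  # illustrative allow-list

def enforce_invariants(action: dict) -> None:
    if action.get("type") == "delete" and action.get("record", {}).get("protected") is True:
        raise PermissionError("invariant violated: protected records cannot be deleted")
    if action.get("type") == "http_call" and action.get("host") not in APPROVED_HOSTS:
        raise PermissionError(f"invariant violated: {action.get('host')} is not an approved service")
```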

World-state context — explicit statements about the environment the agent is operating in. Which version of the database schema applies? What permissions does the agent have? Are there other processes that might be modifying the same resources concurrently? World-state context is the hardest part to write because it requires the spec author to make tacit knowledge explicit — but it's where most production failures originate.
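World-state context is easiest to keep honest when it is written down as structured data handed to the agent along with the task, rather than implied. A minimal sketch under that assumption, with every field name illustrative:

```python
# Hypothetical sketch: world-state assumptions made explicit instead of implied.
from dataclasses import dataclass

@dataclass
class WorldState:
    schema_version: str        # which schema version the spec was written against
    permissions: list[str]     # what the agent is actually allowed to do
    concurrent_writers: bool   # whether other processes may modify the same resources
    environment: str = "production"

spec_context = WorldState(
    schema_version="users_v7",
    permissions=["read:users", "update:preferences"],
    concurrent_writers=True,
)
```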

Structuring Specs for Reliable Execution

Beyond the contract elements, the physical structure of a spec affects how reliably an agent follows it. Research on instruction-following in large language models shows non-linear compliance degradation as instruction complexity increases. Models that reliably follow five constraints begin dropping constraints when the count reaches fifteen. The spec that works in your test prompt — clean, focused — degrades as you add edge cases over time.
