
The Agent Degraded-Mode Spec Is the Document You Didn't Write

11 min read
Tian Pan
Software Engineer

When the search index goes stale, the vendor API throttles you, the database read replica falls behind, or a downstream microservice starts returning 503s, your agent has to decide what to do. In most production agent systems, that decision was never made. It was inherited — silently — from whatever the engineer who wrote the tool wrapper happened to type at 4 PM on a Tuesday in week three of the project.

The result is what your customers eventually write for you: a Reddit thread, a support transcript, a quote in a press article. "The assistant told me my balance was $0 when my account was actually fine — turns out their lookup service was down." That paragraph is the degraded-mode spec your team didn't write. It is now public, it is now the customer's, and it is the version your engineering org will spend the next quarter responding to.

The pattern is consistent across teams that ship agents to production. The happy-path spec is meticulous: tool schemas, prompt templates, eval suites for the workflows the team designed. The degraded-mode behavior — what the agent does when one of those tools is down, slow, rate-limited, or returning stale data — is implicit. It lives in the catch blocks of individual tool wrappers, in default timeouts inherited from HTTP libraries, in retry counts copied from a Stack Overflow answer. Five engineers shipped five different fallback behaviors. The agent's overall posture under degradation is the Cartesian product of those individual choices, and nobody has ever read it end-to-end.

The Failure Catalog Your Wrappers Encode by Accident

Walk through the tool wrappers in any agent codebase and you will find at least five different reactions to the same class of failure:

  • Silent fail. The wrapper catches the exception, returns null or an empty string, and the agent reasons over a missing field as if it were a present-but-unknown value. The user gets a confidently wrong answer with no indication that anything went sideways.
  • Retry forever. The wrapper has a retry loop with exponential backoff and no upper bound. A 30-second outage becomes a 4-minute conversation hang. The agent's tool budget is consumed trying to make the same call work, and the user watches a spinner.
  • Partial answer with no warning. The wrapper returns {"data": null, "error": "timeout"} and the prompt template doesn't mention what to do with that shape. The model picks one of the keys, often the wrong one, and continues.
  • Generic refusal. The wrapper bubbles the exception up, the agent layer catches it, and the user gets "I'm sorry, I can't help with that right now" — indistinguishable from a safety refusal, a permission failure, or a model outage.
  • Confidently wrong. The wrapper has a stale-cache fallback that nobody documented. The data is from yesterday. The prompt has no signal that this is fallback data, so the model treats it as fresh, and the answer is wrong by a margin the user only catches three days later.

Each of these is a defensible choice in isolation. Stacked together across a dozen tools, they produce an agent whose degraded behavior is unpredictable, untested, and impossible to communicate to a customer-success team or a regulator. The agent has a specification — it is just one nobody read before it shipped.
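To make the stacking concrete, here is a minimal sketch of three wrappers that each encode a different implicit policy for the same class of network failure. The endpoints and tool names are hypothetical, not drawn from any particular codebase:

```python
import time

import requests

SEARCH_URL = "https://internal.example.com/search"      # hypothetical endpoint
BALANCE_URL = "https://internal.example.com/balance"    # hypothetical endpoint
_inventory_cache: dict[str, dict] = {}


def search_tool(query: str) -> str:
    """Wrapper A: silent fail. A timeout becomes an empty result."""
    try:
        resp = requests.get(SEARCH_URL, params={"q": query}, timeout=5)
        return resp.text
    except requests.RequestException:
        return ""  # the model now reasons over "no results" as if that were true


def balance_tool(account_id: str) -> dict:
    """Wrapper B: retry forever. A 30-second outage becomes a conversation hang."""
    delay = 1.0
    while True:  # no upper bound on attempts
        try:
            resp = requests.get(BALANCE_URL, params={"id": account_id}, timeout=10)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            time.sleep(delay)
            delay *= 2


def inventory_tool(sku: str) -> dict:
    """Wrapper C: undocumented stale-cache fallback, no freshness signal."""
    try:
        resp = requests.get(f"https://internal.example.com/inventory/{sku}", timeout=5)
        resp.raise_for_status()
        _inventory_cache[sku] = resp.json()
        return _inventory_cache[sku]
    except requests.RequestException:
        return _inventory_cache.get(sku, {})  # yesterday's data, indistinguishable from fresh
```

None of these wrappers is wrong on its own; the problem is that an agent calling all three has three different degraded behaviors and no record of any of them.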

Reliability research has started to put numbers on this. Recent eval frameworks introduce metrics like the Graceful Degradation Score, which measures partial-credit completion under simulated dependency failures, and the Meltdown Onset Point, which detects the moment a long-horizon agent starts compounding errors after a single tool failure. Most teams running these evals discover their agent's degradation curve drops off a cliff long before they expected. The implicit spec was worse than the team's optimistic mental model — every time.

A Per-Tool Degraded-Mode Contract

The fix is not to write a single fallback function and call the system robust. It is to name, per tool, what the agent should do under each class of failure. That contract is short — it fits on one page — and the act of writing it forces the conversation that the implicit spec evaded.

A workable contract has three columns, sketched in code after the list: failure class, severity, and degraded action.

  • Failure class. Be specific. "Tool errored" is not a failure class. "Tool returned 429 rate-limited," "tool timed out before first byte," "tool returned 200 with an empty result," "tool returned cached data older than the freshness contract" — those are failure classes the agent's behavior should depend on.
  • Severity. Is this recoverable, recoverable-with-disclosure, or fatal-for-this-task? A stale cache hit on a "show me my dashboard" query is recoverable-with-disclosure. The same stale cache on a "what's my account balance" query is fatal-for-this-task.
  • Degraded action. What does the agent do? Options: retry-with-backoff (bounded), fall through to a cheaper tool, return a partial answer with explicit disclosure, refuse with a specific reason, or escalate to a human.
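
One way to make the contract concrete is a small rule table the wrapper layer can consult at runtime. This is a sketch under assumptions: the tool names (inventory_lookup, account_balance), the failure-class strings, and the action names are illustrative, not a prescribed taxonomy:

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    RECOVERABLE = "recoverable"
    RECOVERABLE_WITH_DISCLOSURE = "recoverable_with_disclosure"
    FATAL_FOR_TASK = "fatal_for_this_task"


class Action(Enum):
    RETRY_BOUNDED = "retry_with_bounded_backoff"
    FALLBACK_TOOL = "fall_through_to_cheaper_tool"
    PARTIAL_WITH_DISCLOSURE = "partial_answer_with_explicit_disclosure"
    REFUSE_WITH_REASON = "refuse_with_specific_reason"
    ESCALATE = "escalate_to_human"


@dataclass(frozen=True)
class DegradedModeRule:
    failure_class: str   # e.g. "http_429", "timeout_before_first_byte"
    severity: Severity
    action: Action


# Per-tool contract: the same failure class can map to different severities
# and actions depending on what the tool's data is used for.
CONTRACTS: dict[str, list[DegradedModeRule]] = {
    "inventory_lookup": [
        DegradedModeRule("http_429", Severity.RECOVERABLE, Action.RETRY_BOUNDED),
        DegradedModeRule("timeout_before_first_byte",
                         Severity.RECOVERABLE_WITH_DISCLOSURE, Action.PARTIAL_WITH_DISCLOSURE),
        DegradedModeRule("cache_older_than_freshness_contract",
                         Severity.RECOVERABLE_WITH_DISCLOSURE, Action.PARTIAL_WITH_DISCLOSURE),
    ],
    "account_balance": [
        DegradedModeRule("http_429", Severity.RECOVERABLE, Action.RETRY_BOUNDED),
        DegradedModeRule("cache_older_than_freshness_contract",
                         Severity.FATAL_FOR_TASK, Action.REFUSE_WITH_REASON),
    ],
}
```

The point of the structure is less the code than the review: each row is a decision someone can read, question, and sign off on.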

The contract is per-tool because the right answer depends on the data the tool fetches. A search index that's two hours stale is fine for a "summarize the documentation on X" query and catastrophic for a "has anything changed in the last hour" query. The same tool, the same staleness, two different correct degraded behaviors. The team that ships a global "if stale, proceed anyway" policy across every tool is shipping a spec they would never sign off on if they wrote it down.

The act of writing the contract surfaces decisions that were previously distributed across catch blocks. It is the same shift that moved infrastructure from "however the engineer who wrote it configured it" to "Terraform manifest reviewed by two people." The artifact existed before, encoded in code; the discipline is making it explicit so the team can govern it.

Disclosure as a First-Class Output

The piece teams skip most often is user-facing degraded-mode UX. The agent has a different mode now — the customer should be able to tell. "Sorry, I can't help" is the wrong message when the truth is "I can answer most of this, but the inventory system is unreachable right now, so I'm working from yesterday's snapshot for the stock levels." Those are different signals that require different user actions.

The teams that get this right treat the degraded disclosure as a structured output, not a string the model invents. The agent's response carries a parallel degraded field — {"missing_tools": ["inventory_lookup"], "data_freshness": "2026-04-27T14:00Z", "user_action": "verify before checkout"} — and the rendering layer turns that into a UX banner, a call-out in the response, or a refusal-with-reason. The model is responsible for emitting the structured signal; the front-end is responsible for displaying it.

This separation matters for two reasons. First, the model is unreliable at remembering to mention degradation when it produces flowing text, especially under context pressure. Wiring degradation as a structured field that the wrapper layer always sets — independent of what the model wrote — guarantees the disclosure happens. Second, the structured field is what makes the SLO measurable. You cannot easily compute "what fraction of degraded responses disclosed the degradation" from natural-language responses; you can compute it trivially from a boolean field.
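A hedged sketch of that separation, assuming a response envelope the wrapper layer controls; the field names mirror the JSON above, and finalize_response and tool_outcomes are hypothetical names for whatever your stack calls them:

```python
from dataclasses import dataclass, field


@dataclass
class DegradedSignal:
    degraded: bool = False
    missing_tools: list[str] = field(default_factory=list)
    data_freshness: str | None = None   # ISO-8601 timestamp of fallback data, if any
    user_action: str | None = None      # e.g. "verify before checkout"


@dataclass
class AgentResponse:
    text: str                  # whatever the model wrote
    signal: DegradedSignal     # set by the wrapper layer, never inferred from the model's prose


def finalize_response(model_text: str, tool_outcomes: dict[str, str]) -> AgentResponse:
    """Stamp the degraded signal from observed tool outcomes, regardless of
    whether the model remembered to mention them in its text."""
    failed = [name for name, status in tool_outcomes.items() if status != "ok"]
    signal = DegradedSignal(degraded=bool(failed), missing_tools=failed)
    return AgentResponse(text=model_text, signal=signal)
```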

The Cox Automotive autonomous customer service deployment is an instructive example here: their agents enforce circuit breakers on cost and conversation length, and when either trips, the system gracefully hands off to a human at the dealership rather than continuing in degraded mode. The handoff is the degraded-mode UX, and it is engineered, not improvised. The user sees a recognizable transition, not a silent quality cliff.

The Eval That Proves It Actually Works

Writing the contract is necessary but not sufficient. The discipline that closes the loop is a degraded-mode eval suite that runs the agent against simulated dependency failures and grades the response.

This is a different shape of eval from your golden-path suite. The inputs are not "queries the agent should handle" — they are "(query, dependency-failure) pairs the agent should handle in a specified degraded way." The graders are not measuring answer quality alone — they are measuring whether the agent disclosed the degradation, whether the disclosure matched the actual failure class, whether the agent stayed within the bounded retry envelope, and whether the response was recoverable rather than confidently wrong.

In practice, this means injecting failures at the tool boundary: returning timeouts, stale data, rate-limit errors, malformed payloads, and empty result sets. The eval grades each scenario against the contract. The grade is binary on disclosure ("did the response carry a degraded flag when the dependency failed?") and graded on quality ("given the failure, was this the best partial answer the agent could give?").
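As a sketch of what the harness could look like, reusing the AgentResponse envelope from the previous section and assuming a hypothetical tool registry the agent dispatches through, failures can be injected by patching the registry at the tool boundary:

```python
from dataclasses import dataclass
from unittest.mock import patch


@dataclass
class DegradedScenario:
    query: str
    tool: str               # tool wrapper to break, e.g. "inventory_lookup"
    failure: Exception      # e.g. requests.Timeout()
    expected_action: str    # from the per-tool contract


def _raiser(exc: Exception):
    """Return a stand-in tool that always raises the injected failure."""
    def _fail(*args, **kwargs):
        raise exc
    return _fail


def run_degraded_suite(scenarios, agent, tool_registry) -> list[dict]:
    """Inject each failure at the tool boundary, run the agent, grade disclosure.

    `agent.run` and `tool_registry` (a dict mapping tool names to wrapper
    callables) are hypothetical interfaces standing in for your agent stack.
    """
    results = []
    for s in scenarios:
        with patch.dict(tool_registry, {s.tool: _raiser(s.failure)}):
            response = agent.run(s.query)   # returns an AgentResponse
        results.append({
            "failure": type(s.failure).__name__,
            # Binary grade: did the structured signal carry the degradation?
            "disclosed": response.signal.degraded and s.tool in response.signal.missing_tools,
            # A quality grade (rubric or LLM judge) against expected_action would go here.
        })
    return results
```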

Teams that run this eval are surprised twice. First, by how often their agent fails to disclose degradation when the contract says it should. Second, by how often the agent's "answer with disclosure" is actually better — more useful, more actionable, more trusted — than the same agent's response on a happy-path version of the same query. Honest impairment beats confident fabrication on user trust metrics. The customers who got the "I can't reach the inventory system right now, here's what I do know" response come back. The customers who got the confidently-wrong answer, then discovered it was wrong, do not.

SLOs Denominated in Honest Disclosure

The metric that operationalizes the contract at runtime is a degraded-mode disclosure rate: out of all agent turns where a tool dependency was degraded, what fraction emitted a degraded flag in the structured response? This is a different SLO from your latency SLO, your availability SLO, or your eval pass-rate SLO. It is the metric that catches the failure mode where everything looks fine on the dashboard — tools are running, the model is responding, latency is green — but the agent is silently producing wrong answers under degradation because the wrapper layer did not signal it.
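Computed over logged turns, the metric is a simple ratio. The field names below are hypothetical and assume the telemetry records both the dependency's observed state and the structured flag the response carried:

```python
def disclosure_rate(turns: list[dict]) -> float:
    """Fraction of degraded turns that carried the degraded flag.

    Each logged turn is assumed to record `dependency_degraded` (from infra
    telemetry) and `disclosed` (the flag in the structured response); both
    field names are hypothetical.
    """
    degraded = [t for t in turns if t["dependency_degraded"]]
    if not degraded:
        return 1.0  # nothing was degraded, so nothing needed disclosing
    return sum(1 for t in degraded if t["disclosed"]) / len(degraded)
```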

A target in the 95–99% range is realistic. The remaining few percent are typically races between dependency state and disclosure logic, or genuine edge cases the contract didn't cover. The SLO is calibrated against the cost of a missed disclosure: in low-stakes contexts (a recommendation feed) you can tolerate more silent degradation than in high-stakes contexts (a billing agent, a medical triage assistant, a financial advice tool).

The org failure mode here is predictable. The team measures availability and latency with great care, ships an agent whose disclosure rate is implicitly 0% under common dependency failures, and discovers the gap at the first incident — typically when a customer complaint surfaces a confidently-wrong answer that the postmortem traces back to a tool that was timing out in a way the agent never disclosed. The fix is not a one-off; it is a metric the team should have been watching from day one, on the same dashboard as the others.

Write It Before the Dependency Fails

The architectural realization is that an agent's degraded-mode behavior is a product spec. It deserves the same treatment as the happy-path spec: written in a document, reviewed by the people who will own the consequences, tested in an eval suite, monitored with an SLO, and updated when the system changes. Treating it as a fallback the on-call invents during the incident is how the customer ends up writing it for you in a support ticket.

The teams that have done this work say the same thing about it: writing the contract took less time than they expected, the conversations it forced were the ones the team had been avoiding, and the eval surfaced specific bugs that the happy-path eval suite had been masking for months. The cost of doing it is small. The cost of the alternative is a spec the company doesn't control, written by a customer with worse information than the engineering team had, and read by everyone except the people who could have prevented it.

If your agent is in production and you cannot point to the document that says what it does when its tools are down, that document still exists. It is just not in your repo yet.
