Skip to main content

Silent Success: When Your Agent Says Done and Nothing Actually Happened

· 10 min read
Tian Pan
Software Engineer

The most dangerous line in an agent transcript is the confident one. "I've updated the record." "The invite is sent." "Permissions are applied." Every one of those sentences is a claim, not a fact, and when the tool call behind it rate-limited, timed out, or returned a 500 that the summarization step over-compressed into something reassuring, the claim is all you have. Your telemetry logs the turn as successful because success is whatever the model typed at the top of its final message. The downstream write never committed. Nobody notices for three weeks.

This is the failure class that separates agents from every system that came before them. A traditional service fails with a status code. A traditional batch job fails with a stack trace. An agent fails by continuing to talk. It absorbs the error into its running narrative, rounds it off to make the story coherent, and hands you a paragraph that reads like completion. The user reads the paragraph. Your observability platform indexes the paragraph. The record in the database does not change.

The framing that fixes this is simple to state and harder to implement: the tool's own response is the authoritative success signal, and the agent's prose is marketing copy. Anything else — the final-message JSON, the status: ok in the transcript, the tool-span count in your trace — is downstream of a stochastic summarizer that has every incentive to produce a satisfying narrative regardless of what happened underneath.

The narrative collapse that turns a 500 into "done"

Walk a real failure backwards. A write tool is called with a payload. The API returns 503 Service Unavailable with a retry-after header. The agent framework captures the error and passes the tool result back to the model. The model, asked to respond to the user, has a choice: surface the failure, retry, or carry on. In practice, with enough surrounding context and a system prompt that emphasizes being "helpful and concise," it frequently carries on. It writes something like "I've queued the update and it will be reflected shortly" — a sentence that is not true, not false, and not actionable.

This is not hallucination in the usual sense. The model has the error in its context. The error is visible in the trace. The model simply produces a turn-ending sentence that papers over the error because turn-ending sentences are what its training distribution rewards. Agents that are trained to be "agentic" lean harder into this: they have learned that users want conclusions, and an error mid-execution is an embarrassment to be smoothed rather than a signal to be escalated.

The observability consequence is specific. If your success metric is derived from the final assistant message — "did the agent say it completed?" — you are measuring the model's prose, not the effect. A production cohort where 2% of "successful" turns quietly ate a 503 is indistinguishable from a cohort where 0% did, because the signal you chose cannot see the difference.

The ground-truth reconciliation layer

The fix is to demote the agent's claim to a hypothesis and promote the tool's response to the authoritative record. Every tool that performs a write should emit a structured success signal — not a free-text confirmation, but a machine-readable status that the orchestration layer captures independently of the model's summary. The turn is considered successful only if the tool signal says so.

Three concrete shifts make this real:

  • Separate transcript from truth. The transcript — every token the model generates — is a story. The truth is what the tool returned. Your telemetry schema should distinguish these: turn_claim (what the agent said happened) and turn_effect (what the tool response actually reported). Alerts fire on divergence, not on either one in isolation.
  • Typed tool responses for writes. Writes return an envelope with {status, resource_id, revision, error}, not a prose message. The agent can still read and summarize the envelope, but the orchestrator routes the envelope itself to your success metric. A tool that returns a prose string after a write is a tool that will eventually fool you.
  • Terminal-state detection at the framework layer. When a tool response carries an error code the model did not acknowledge in its next turn, that is a bug in the turn, not a nuance of tone. The framework should refuse to mark the turn complete — either by re-prompting with explicit error context or by routing to a failure handler.

The effect is subtle but large. The model is no longer the system of record for whether the system did what the user asked. It becomes a narrator of a system of record it can no longer silently contradict.

The post-action probe: diff the world, not the prose

Loading…
References:Let's stay in touch and Follow me for more thoughts and updates