The Agent Undo Button Is a Saga, Not a Stack

· 10 min read
Tian Pan
Software Engineer

A user clicks "undo" on an agent action that fanned out to twelve tool calls. The agent sent two emails, created a calendar invite, updated a CRM record, charged a card, and posted to a Slack channel. Three of those operations are non-reversible by API. Two are reversible only by an inverse operation that fires its own downstream notification. The remaining seven each have their own definition of idempotency that the planner never reconciled. The undo button you shipped looks reassuring. It quietly succeeds about 60% of the time and silently fails the rest.

This is not a UX bug. It is a saga-pattern problem that distributed-systems engineers have been working on for more than thirty years, and ignoring that lineage is the most expensive way to discover it.

The instinct, when product asks for an undo button on agent actions, is to model it like a text editor's undo stack: a list of operations, each with an inverse, popped in reverse order. That mental model maps cleanly to single-process state mutations — typing a character, moving a shape, deleting a row in a local document. It maps poorly to a system where each action is a network call to a different vendor with its own consistency guarantees, where some actions cause humans on the other side to take irreversible action of their own, and where the inverse of one step depends on the result of a step that has not yet been compensated.

What "Reversible" Actually Means When the Tool Crosses a Process Boundary

A useful first move is to stop treating reversibility as a binary property and start treating it as a contract on each registered tool. Three coarse classes are enough to be honest with yourself:

  • Cleanly reversible — the tool exposes an inverse API call whose effect, if executed, fully erases the original action. Creating a draft document and deleting the draft. Updating a record and writing back the prior values. These are the easy cases, and they are rarer than your tool catalog implies.
  • Compensable with residue — the original action cannot be erased, but a forward-going compensating action restores a known-good business state. A payment cannot be unsent; a refund can be issued. A meeting cannot be un-scheduled silently; a cancellation can be sent that fires a notification to every invitee. The compensation is a real action with real side effects, not a database rollback.
  • Non-reversible — once executed, the action's effects cannot be neutralized by any sequence of API calls. An email read by a human. An SMS that triggered a callback. A wire transfer past its reversal window. The only honest "undo" is a follow-up message authored by a human acknowledging the mistake.

Every tool registered with the agent should declare which class it belongs to, and the registration should be the source of truth. If a tool author cannot answer the question, the safe default is non-reversible — the failure mode of treating a non-reversible tool as compensable is silently shipping the lie that the user can take it back.
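One way to make that declaration the source of truth is to put it in the registry itself. The following is a minimal sketch, not a reference to any particular agent framework: `Reversibility`, `ToolSpec`, `ToolRegistry`, and the tool names are all hypothetical, and the only behavior taken from the text above is the three-class taxonomy and the non-reversible default.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Reversibility(Enum):
    CLEANLY_REVERSIBLE = "cleanly_reversible"
    COMPENSABLE_WITH_RESIDUE = "compensable_with_residue"
    NON_REVERSIBLE = "non_reversible"


@dataclass(frozen=True)
class ToolSpec:
    name: str
    run: Callable[..., object]
    reversibility: Reversibility


class ToolRegistry:
    """Hypothetical registry; the declared class lives next to the tool."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(
        self,
        name: str,
        run: Callable[..., object],
        # Safe default: an undeclared tool is treated as non-reversible.
        reversibility: Reversibility = Reversibility.NON_REVERSIBLE,
    ) -> ToolSpec:
        spec = ToolSpec(name, run, reversibility)
        self._tools[name] = spec
        return spec

    def reversibility_of(self, name: str) -> Reversibility:
        return self._tools[name].reversibility

    def can_offer_undo(self, name: str) -> bool:
        # Only offer "undo" when some inverse or compensation exists.
        return self.reversibility_of(name) is not Reversibility.NON_REVERSIBLE


registry = ToolRegistry()
registry.register("create_draft", lambda **kw: "draft", Reversibility.CLEANLY_REVERSIBLE)
registry.register("charge_card", lambda **kw: "charge", Reversibility.COMPENSABLE_WITH_RESIDUE)
registry.register("send_email", lambda **kw: "sent")  # author declared nothing
```

Here `send_email` ends up non-reversible because its author never answered the question, so the UI would not promise an undo it cannot deliver.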

Pre-Compute the Inverse at Execution Time, Not at Undo Time

The pattern teams reach for first also has the wrong shape: store the executed steps in a log and, at undo time, walk the log and synthesize the inverse for each step. This fails for the same reason database rollback fails when state has already escaped the database: the information needed to construct the inverse is no longer recoverable.

If the agent updated a CRM record, the inverse is "set field X back to its prior value Y." That prior value Y must have been captured at the moment of the write, because by undo time the record may have been touched by a human, by another agent, or by a webhook from a downstream system, and the field is now Z. Reconstructing Y from the log of agent actions does not work: Y was never written to the log; only the new value was.
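The CRM case can be sketched in a few lines. This is an illustration under stated assumptions: the CRM is an in-memory dict, `FieldWrite` and `write_field` are hypothetical names, and the read-then-write is not atomic the way a real integration would need it to be (a real system would want a conditional write or a version check from the vendor).

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class FieldWrite:
    record_id: str
    field: str
    new_value: Any
    prior_value: Any  # Y: captured at write time, unrecoverable afterwards


def write_field(
    crm: Dict[str, Dict[str, Any]],
    log: List[FieldWrite],
    record_id: str,
    field: str,
    new_value: Any,
) -> FieldWrite:
    # Read the prior value BEFORE writing, and log it with the write.
    prior = crm[record_id].get(field)
    crm[record_id][field] = new_value
    entry = FieldWrite(record_id, field, new_value, prior)
    log.append(entry)
    return entry


# In-memory stand-in for the CRM (assumption for the sketch).
crm: Dict[str, Dict[str, Any]] = {"acct-1": {"stage": "prospect"}}
log: List[FieldWrite] = []

write_field(crm, log, "acct-1", "stage", "qualified")  # the agent's write

# A concurrent writer (human, other agent, webhook) changes the field to Z.
crm["acct-1"]["stage"] = "closed-lost"

# At undo time the live record no longer holds Y; only the log entry does.
recovered_prior = log[0].prior_value
```

A log that stored only `new_value` would have nothing to restore here, because the current record shows neither the agent's value nor the one that preceded it.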

The discipline that has to land is that every executed step writes a saga log entry containing both the forward operation and the pre-computed inverse, captured against the world state observed at execution time. The inverse is data, not code synthesized later. At undo time, the engine walks the log and dispatches the pre-computed inverse for each step, in reverse order, with each compensation idempotent and retryable.
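A minimal sketch of that engine, assuming a hypothetical `dispatch(tool, args, idempotency_key)` function that calls the vendor and tool names invented for the example. It is not how Temporal or any specific orchestrator implements compensation; it only shows the shape: the inverse stored as data, a reverse walk, a stable idempotency key per step, and bounded retries. Stopping at the first step that still fails is a design choice made here to preserve reverse order.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class SagaEntry:
    step_id: str
    forward: Dict[str, Any]  # what was executed
    inverse: Dict[str, Any]  # pre-computed at execution time, stored as data
    compensated: bool = False


def undo(
    log: List[SagaEntry],
    dispatch: Callable[..., None],
    max_attempts: int = 3,
) -> List[str]:
    """Dispatch stored inverses in reverse order; return uncompensated step ids."""
    for entry in reversed(log):
        if entry.compensated:
            continue  # re-running undo must not compensate a step twice
        for _ in range(max_attempts):
            try:
                dispatch(
                    entry.inverse["tool"],
                    entry.inverse["args"],
                    # Same key on every retry so the vendor can deduplicate.
                    idempotency_key=f"undo:{entry.step_id}",
                )
                entry.compensated = True
                break
            except Exception:
                continue  # treated as transient in this sketch; retry
        if not entry.compensated:
            break  # keep reverse order: do not skip past a failed step
    return [e.step_id for e in log if not e.compensated]


# Fake dispatcher that fails once for the CRM tool, then succeeds.
calls: List[Tuple[str, str]] = []
flaky = {"remaining_failures": 1}


def dispatch(tool: str, args: Dict[str, Any], idempotency_key: str) -> None:
    if tool == "crm.set_field" and flaky["remaining_failures"] > 0:
        flaky["remaining_failures"] -= 1
        raise RuntimeError("transient failure")
    calls.append((tool, idempotency_key))


log = [
    SagaEntry(
        "s1",
        {"tool": "crm.set_field", "args": {"field": "stage", "value": "qualified"}},
        {"tool": "crm.set_field", "args": {"field": "stage", "value": "prospect"}},
    ),
    SagaEntry(
        "s2",
        {"tool": "docs.create_draft", "args": {"title": "Proposal"}},
        {"tool": "docs.delete_draft", "args": {"draft_id": "d-42"}},
    ),
]

pending = undo(log, dispatch)
pending_after_second_run = undo(log, dispatch)  # no-op: everything is compensated
```

Because the `compensated` flag and the idempotency key are both keyed on `step_id`, calling `undo` again after a crash or a partial failure resumes where it stopped instead of replaying compensations that already landed.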

This is the same discipline that orchestrators like Temporal codify, and it is the same discipline a homegrown harness has to invent if it is not using one. Skipping it means your undo button works on the demo flow because the demo flow does not have concurrent writers, and breaks on the first user whose CRM is also being updated by their email integration.

The UX Has to Tell the Truth About Partial Reversal

Even with pre-computed inverses, a real undo will land in one of three states, and a UX that collapses them into a single "undone" toast is lying to the user.
