The Support Runbook Your Humans Wrote That Your Support Agent Could Not Parse
A senior support engineer at your company opens a ticket the AI agent already closed and finds the agent's summary: "Resolved — confirmed billing in Stripe, escalated to AE per enterprise policy, refunded $48." Every clause is plausible. None of them happened. There is no tool named check_stripe. There is no tool that looks up customer tier. The "AE" the summary mentions does not work the account anymore. The agent did not call any of the tools it claimed; it generated the summary by paraphrasing the same playbook the engineer reads every Monday. The customer is still waiting.
The runbook the agent read was correct. The customer-success team had spent two years tuning it. Senior engineers had used it to onboard juniors. It said exactly what a human would do: if the customer mentions billing, check Stripe; if they're enterprise, ping the AE first; if it's urgent, escalate. The agent's failure was not that it ignored the runbook. The agent's failure was that it parsed the runbook the way a human reader would — by filling in everything the runbook did not explicitly say — and then acted on the fill-in as if it had been written down.
