The OOO Auto-Reply Your Agent Did Not Read

· 8 min read
Tian Pan
Software Engineer

Your support agent pages a human at 2 a.m. The human has been out for a week. The OOO message lives in the same inbox the agent is reading. The agent pings the human anyway. The auto-reply lands. The agent thanks it politely and pings again, because the reply did not contain the resolution code it was waiting on. Twelve cycles in, somebody on a different team notices the unread thread is now sixty messages deep and goes to wake up the on-call by hand.

The agent did exactly what the prompt told it to do. The prompt told it to escalate to a person. The person was a string, not a role. The string did not know about PTO.

This is a small bug with a large lesson. Humans have boundaries — sleep, weekends, vacations, conference travel, the random Tuesday they take off for a dentist appointment. Agents inherit none of those boundaries unless someone wires them in. And the natural place to put that information — the calendar, the OOO message, the on-call schedule — is almost always somewhere your agent is not looking.

The escalation target is a role, not a person

The first design fix is the cheapest, and most agent stacks still get it wrong: the agent's escalation target should be a role, not a username.

This is well-trodden ground in incident response. PagerDuty's whole model of service ownership is built around escalation policies that route to schedules, not individuals. The schedule resolves to whoever is on call right now. If the primary is unavailable, the policy escalates to the secondary after a timeout. The agent on the wire never has to know who any of these people are — it routes to a role and the role dereferences to a person at the moment of the page.

Most LLM agents I have seen ship with the opposite design. Somebody hardcodes a Slack handle into a system prompt. Six months later that engineer changes teams. The agent is still pinging them. The pages get muted. Real escalations get lost because everybody learned to ignore the channel where the bot lives.

The rule is small and worth carving in stone: never let your agent hold a reference to a human. Let it hold a reference to a role, and let the role do its own resolution. The same way your service architecture treats users as IDs that resolve to names, your agent architecture should treat humans as roles that resolve to people.
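To make the role-versus-person distinction concrete, here is a minimal sketch of a role that dereferences to a person only at page time. Every name in it (`Shift`, `RoleDirectory`, the `payments-oncall` role, the handles) is hypothetical and stands in for whatever your on-call tool exposes; it is not PagerDuty's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Shift:
    person: str      # hypothetical handle of whoever covers this shift
    start: datetime  # inclusive
    end: datetime    # exclusive


class RoleDirectory:
    """Maps a role name to its shifts. Illustrative only."""

    def __init__(self, schedules: Dict[str, List[Shift]]):
        self._schedules = schedules

    def resolve(self, role: str, now: datetime) -> Optional[str]:
        """Return whoever holds `role` at `now`, or None if the role is empty."""
        for shift in self._schedules.get(role, []):
            if shift.start <= now < shift.end:
                return shift.person
        return None


# The agent's configuration stores only the role name, never a handle.
ESCALATION_ROLE = "payments-oncall"

utc = timezone.utc
directory = RoleDirectory({
    ESCALATION_ROLE: [
        Shift("alice", datetime(2025, 1, 6, tzinfo=utc), datetime(2025, 1, 13, tzinfo=utc)),
        Shift("bob", datetime(2025, 1, 13, tzinfo=utc), datetime(2025, 1, 20, tzinfo=utc)),
    ]
})

# Resolution happens at the moment of the page, not when the prompt was written.
target = directory.resolve(ESCALATION_ROLE, datetime(2025, 1, 8, 2, 0, tzinfo=utc))
```

When an engineer changes teams, the schedule changes and the prompt does not. A `None` result is a signal in its own right: the role is empty and the page needs to go somewhere else.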

Calendar and OOO state are first-class context

The harder problem is that even a role-based escalation can fail when the role is empty or the role-holder is out. Somebody is "on call" on paper, but they are at a wedding with no signal, or they are at a conference where Slack DMs do not surface notifications, or they just took the morning off and forgot to update the schedule.

The human side-channel that conveys "they are not here" — the OOO reply, the Slack status set to a beach emoji, the calendar event called OOO, the away indicator next to their avatar — is the most context-rich signal your organization produces about who can actually do work right now. And almost no agent stack treats it as input. It is treated as conversational output that the agent reads, reasons about briefly, and ignores in favor of whatever its prompt told it to do.
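One way to treat an auto-reply as input rather than conversation is to classify it before the model ever sees the body. The sketch below assumes the agent reads raw email and that the sender's mail system sets the `Auto-Submitted` header defined in RFC 3834; not every system does, so a production check would likely need extra heuristics. The function name `is_auto_reply` is made up for illustration.

```python
from email import message_from_string
from email.message import Message


def is_auto_reply(msg: Message) -> bool:
    """Return True if the message declares itself automatic via Auto-Submitted.

    Under RFC 3834, any value other than "no" (for example "auto-replied" or
    "auto-generated") marks the message as machine-generated. Optional
    parameters after ";" are ignored here.
    """
    raw = str(msg.get("Auto-Submitted") or "no")
    value = raw.split(";")[0].strip().lower()
    return value != "no"


ooo = message_from_string(
    "From: alice@example.com\n"
    "Subject: Out of office\n"
    "Auto-Submitted: auto-replied\n"
    "\n"
    "I am out until Monday.\n"
)
human = message_from_string(
    "From: alice@example.com\n"
    "Subject: Re: ticket\n"
    "\n"
    "Resolution code attached.\n"
)
```

A message flagged this way can update routing state ("this person is out") instead of being handed to the agent as a reply to thank and answer.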

The fix is to lift OOO and calendar state out of the conversation surface and into the routing decision. Before the agent escalates, it should ask the routing layer questions like these:

  • Is the resolved role-holder available right now?
  • Does their calendar show OOO or an all-day busy event?
  • Has their auto-reply been turned on?
  • Is their Slack status one of a small set of unavailable indicators?

If any of those checks says the role-holder is unavailable, the agent should not page them. It should walk the escalation policy to the next person, exactly the way PagerDuty does for SEV-1 incidents. This is not novel. It is the same logic that has run on-call rotations for fifteen years. The only new thing is that your agent has to participate in it instead of routing around it.
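The availability questions above can be sketched as a gate in front of the page. This is a minimal illustration under stated assumptions: `Availability`, `is_available`, `pick_page_target`, and the set of Slack status emoji are all hypothetical, and the `lookup` callable stands in for whatever calendar, mail, and chat integrations you actually have.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Availability:
    calendar_ooo: bool = False   # OOO or all-day busy event on the calendar
    auto_reply_on: bool = False  # mail auto-reply is enabled
    slack_status: str = ""       # current status emoji, if any


# Assumed set of "not here" indicators; tune this to your organization.
UNAVAILABLE_STATUSES = frozenset({":palm_tree:", ":face_with_thermometer:", ":no_entry:"})


def is_available(state: Availability) -> bool:
    """A person is pageable only if no signal says they are out."""
    return not (
        state.calendar_ooo
        or state.auto_reply_on
        or state.slack_status in UNAVAILABLE_STATUSES
    )


def pick_page_target(
    policy: List[str],
    lookup: Callable[[str], Availability],
) -> Optional[str]:
    """Walk the already-resolved escalation policy in order.

    Returns the first available person, or None if everyone is out.
    """
    for person in policy:
        if is_available(lookup(person)):
            return person
    return None


states = {
    "alice": Availability(calendar_ooo=True, auto_reply_on=True),
    "bob": Availability(slack_status=":palm_tree:"),
    "carol": Availability(),
}
target = pick_page_target(["alice", "bob", "carol"], states.__getitem__)
```

Returning `None` when nobody is reachable is deliberate: the caller has to handle the empty case explicitly, for instance by raising it to a wider channel, rather than falling back to paging the primary.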

Why the loop happens

Once you accept that the agent should never be paging a person who is OOO, the next question is why the loops happen at all. The answer is mechanical and worth stating plainly.
