
The Agent Paged Me at 3 AM: Blast-Radius Policy for Tools That Reach Humans

12 min read
Tian Pan
Software Engineer

The first time an agent pages your on-call four times in an hour because it's looping on a malformed alert signal, leadership learns something the security team already knew: "tool access" and "ability to create human work" were the same permission, and you granted it without either a safety review or a product-ownership review. Nobody owned the question of who's allowed to interrupt a human at 3 AM, because nobody framed it as a question. It was framed as a Slack integration.

The 2026 agent stack has made this failure mode cheap to reach. Anthropic's MCP servers, OpenAI's Agents SDK, and the whole class of vendor-shipped action tools have collapsed the distance between "the model decided to do a thing" and "a human got woken up." Most teams ship those integrations the same way they ship a database client: scope a token, drop in the SDK, write a system prompt, ship. The blast radius of a database client is a row count. The blast radius of a PagerDuty client is a person's sleep.

The Permission Nobody Audited

Look at how a human-facing tool actually gets installed. An engineer wires up the PagerDuty MCP server, gives the agent a service token with incidents:create, and tests it on a staging incident. It works. Ship.

What was reviewed: the OAuth scope, the API contract, maybe a rate-limit setting on the PagerDuty side. What was not reviewed: under what conditions the agent should decide to invoke that tool, who bears the consequence when it does, how often it's allowed to, and what counts as "the agent should never do this autonomously." Those questions weren't on the integration checklist because they don't fit in a checklist. They're product-ownership questions wearing the uniform of an infra config.

Most agent frameworks today have two modes for any tool: it's available, or it isn't. There is no middle ground between "the model has full access to call this thing on its own" and "the model can't see it." That binary is fine for a SQL read replica. It is the wrong shape entirely for a tool whose downstream effect is a human being.

The asymmetry shows up clearly when you list the permission grants by their human cost. Read an S3 bucket: zero. Write a database row: low, reversible. Post in #general: hundreds of human-attention-seconds. Page the on-call: a person is now awake. File a JIRA: someone's backlog grew. Send an email to the customer: a relationship moved. The integration token treats them as equivalent. The org chart does not.
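
To make the asymmetry concrete, here's a minimal sketch of what a cost-aware tool registry could look like. The tier names and tool identifiers are invented for illustration; nothing here is a vendor's real API.

```python
from enum import IntEnum

class HumanCost(IntEnum):
    """Illustrative cost tiers; the names are invented, not any vendor's."""
    NONE = 0        # read an S3 bucket
    REVERSIBLE = 1  # write a database row
    PASSIVE = 2     # post in #general: hundreds of human-attention-seconds
    ACTIVE = 3      # file a JIRA, email a customer
    URGENT = 4      # page the on-call: a person is now awake

# The token scope treats all five as one permission;
# a policy layer should treat each tier as a separate budget.
TOOL_COST = {
    "s3.get_object":             HumanCost.NONE,
    "db.write_row":              HumanCost.REVERSIBLE,
    "slack.post_channel":        HumanCost.PASSIVE,
    "jira.create_issue":         HumanCost.ACTIVE,
    "pagerduty.create_incident": HumanCost.URGENT,
}
```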

What "Blast Radius" Means When the Blast Is People

Infrastructure people have been thinking about blast radius for a decade — the unit is "how much of my system breaks." Account isolation, region partitioning, kill-switches, percentage rollouts. The concept maps cleanly to compute and storage because the thing being affected is fungible and re-creatable.

Human-facing tools break that model. The unit isn't a container or a row, it's a person's attention. People are not fungible. A page at 3 AM costs the on-call disproportionately more than a Slack message at 2 PM, and a single false page erodes trust in the alerting system in a way that takes weeks to rebuild. There's no rollback for "we paged you and it was nothing."

So the dimensions you actually need to track for a human-reach tool are different from the infra ones:

  • Reach: how many people, and which ones (one on-call vs. a #general channel of 400)
  • Interruption tier: passive (channel post), active (DM), urgent (page)
  • Reversibility: can the side effect be undone, or has the human already context-switched
  • Cumulative pressure: this action plus the last six the agent took on the same person
  • Trust cost: false-positive rate of this class of agent action, not the overall agent

Cumulative pressure is the one teams miss. A single agent action looks defensible in isolation. Three agent actions in a row, against the same human, in the same hour, look like harassment — even if each one was individually correct. The policy layer has to track the recipient as a stateful resource, not the action as a stateless event.
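
Here's a minimal sketch of what "recipient as a stateful resource" could look like. The names and the one-hour window are assumptions, not anyone's shipping schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class RecipientState:
    """Tracks the recipient as a stateful resource, not each action as a
    stateless event. Names and the one-hour window are illustrative."""
    recipient_id: str
    # (timestamp, interruption tier) for every agent action that reached them
    recent_actions: list[tuple[datetime, str]] = field(default_factory=list)

    def record(self, tier: str) -> None:
        self.recent_actions.append((datetime.now(), tier))

    def pressure(self, window: timedelta = timedelta(hours=1)) -> int:
        """Cumulative pressure: how many times the agent has reached this
        person inside the window, regardless of which tool it used."""
        cutoff = datetime.now() - window
        return sum(1 for ts, _ in self.recent_actions if ts >= cutoff)

# A policy can now gate on the person, not just the action: if
# state.pressure() >= 3, the next call routes to a human approver even
# when it would individually pass every per-tool check.
```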

The Recursive Pager: Three Incident Patterns Worth Naming

Postmortems from the last year show three recurring failure modes. None of them are about model quality. They're about the lack of a control plane between "the model produced a tool call" and "a human was contacted."

The amplified false positive. An upstream alert fires for a flaky integration. The agent is wired to investigate, summarize, and notify. Its summary lands as a P1-shaped Slack message in the incident channel. Three downstream agents subscribed to that channel each interpret the summary as a real signal and each notify their own owners. One alert becomes seventeen, and the noise crowds out the actual problem.

The recursive page. An agent pages the on-call about a degraded tool. The on-call's response (or absence of response) is itself a metric the agent watches. After five minutes of "no acknowledgment," the agent re-pages — except this time it adds the on-call's manager. Twenty minutes later it has paged a small org chart, none of whom can fix what was originally a transient blip in a downstream API.

The wrong-channel blast. The agent's slack_post tool was scoped to "all channels the bot is in" because that was the easy default. A formatting bug causes it to post a debug trace into #all-engineering instead of the dev channel it intended. The post says "ERROR" three times in bold. Four hundred engineers see it before it's deleted. The hit to trust in the agent's outputs is lasting in a way the bug itself, quietly fixed, never would have caused.

What ties them together: in every case the agent did the thing it was technically allowed to do, and the thing was wrong because the policy didn't model human cost.

A Policy Layer That Sits at the Tool Boundary

The fix is not "make the model smarter about when to page." Models are not the right place to enforce a quota, and "the model should know better" is not a defense you can take to the post-incident review. The fix is a policy layer that sits between the agent and any tool that reaches a human, enforced by the runtime, not by the prompt.

Five things that layer needs to do (a combined sketch follows the list):

  • Per-channel rate limits, enforced at the boundary. The agent gets N pages per service per hour, M Slack DMs per recipient per day, K JIRA tickets per project per week. These are quotas on the tool, not on the model. When the quota is exhausted, the tool call returns a structured error and the agent has to route to a human-approved escalation path or wait. The agent should see the rate-limit response in-context so it can plan around it, not retry it into oblivion.
  • Dry-run mode for any tool that produces a notification. A first-class flag that lets the agent "draft" a page or a Slack post and surface it to a human approver instead of sending. Dry-run is also the right mode for the agent's own reflection step: it can inspect the rendered side effect before committing.
  • Human-in-the-loop on first-of-kind escalations. Anything the agent has never done before to this recipient, or never done at this severity, requires a human to approve the first instance. The system learns the pattern over a small number of approvals; the human is in the loop while the trust is being built, and out of it once the pattern is boring.
  • Per-tenant blast quotas. A daily cap on the total number of human-facing actions an agent can take on behalf of a tenant — independent of which tool is being used. This is the circuit breaker that catches a runaway loop even when no individual tool's per-channel limit was hit.
  • Recipient as a stateful resource. The control plane tracks "actions the agent has taken against this human in the last 24 hours" and surfaces it as input to the next decision. The policy can then say things like: no two pages to the same on-call within the same incident without a human approver, or no Slack DM to a recipient who has muted the bot in the last week.
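
Here's the combined sketch promised above: a gate the runtime calls before any human-reach tool executes. Every name in it (the class, the limits, the off-ramp strings) is an illustrative assumption, not any framework's real API, and the recipient-state tracking from the earlier sketch would slot in as one more check.

```python
import time
from collections import defaultdict, deque

class PolicyDenied(Exception):
    """Structured denial the agent sees in-context, so it can plan
    around the limit instead of retrying it into oblivion."""
    def __init__(self, reason: str, retry_after_s: int | None, off_ramp: str):
        super().__init__(reason)
        self.retry_after_s = retry_after_s
        self.off_ramp = off_ramp   # e.g. "queue_for_triage", "draft_for_approval"

class HumanReachGate:
    """Sketch of a runtime gate in front of human-reach tools."""
    def __init__(self, channel_hourly_limit: int = 2, tenant_daily_cap: int = 50):
        self.channel_hourly_limit = channel_hourly_limit
        self.tenant_daily_cap = tenant_daily_cap
        self._channel_log = defaultdict(deque)   # channel -> send timestamps
        self._tenant_log = defaultdict(deque)    # tenant  -> send timestamps
        self._approved_patterns = set()          # (recipient, severity) pairs

    def approve(self, recipient: str, severity: str) -> None:
        """Called by a human approver; after this, the pattern is boring."""
        self._approved_patterns.add((recipient, severity))

    def check(self, tenant: str, channel: str, recipient: str,
              severity: str, dry_run: bool = False) -> None:
        now = time.time()
        self._evict(self._channel_log[channel], now - 3600)
        self._evict(self._tenant_log[tenant], now - 86400)

        # Per-tenant blast quota: the circuit breaker that catches a
        # runaway loop even when no per-channel limit was hit.
        if len(self._tenant_log[tenant]) >= self.tenant_daily_cap:
            raise PolicyDenied("tenant daily human-reach cap exhausted",
                               retry_after_s=None, off_ramp="queue_for_triage")

        # Per-channel rate limit, enforced at the boundary, not in the prompt.
        if len(self._channel_log[channel]) >= self.channel_hourly_limit:
            raise PolicyDenied(f"hourly quota for {channel} exhausted",
                               retry_after_s=3600, off_ramp="draft_for_approval")

        # First-of-kind escalation: a human approves the first instance.
        if (recipient, severity) not in self._approved_patterns:
            raise PolicyDenied("first-of-kind escalation to this recipient",
                               retry_after_s=None, off_ramp="draft_for_approval")

        if dry_run:
            return  # render the side effect for inspection without committing
        self._channel_log[channel].append(now)
        self._tenant_log[tenant].append(now)

    @staticmethod
    def _evict(log: deque, cutoff: float) -> None:
        while log and log[0] < cutoff:
            log.popleft()
```

Note that PolicyDenied carries the off-ramp with it, so the denial isn't a dead end: the agent's next legal move is named in the error, not left to its imagination.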

The trick is that all of these belong in the runtime, not the prompt. Prompts are evidence, not enforcement. Anything you ask the model to "remember not to do" will be forgotten the moment the context window is interesting.

Who Approves Giving an Agent the Ability to Interrupt a Human

This is the conversation nobody wants to have, because the answer changes who pays the bill. Right now, granting an agent PagerDuty access tends to be an infra decision, made under an SRE-shaped budget, with the integration approved by whoever owns the API key.

But the consequences land on a different team. The on-call who got paged is on the product team. The customer who got the wrong-channel email is the GTM team's relationship. The engineer whose JIRA backlog grew is on a roadmap they own. None of those people were in the room when the integration was scoped.

Treat the right to interrupt a human as a product permission, not an infra permission. That means (the grant record is sketched in code after the list):

  • The owner of the human-facing channel (the on-call schedule, the Slack workspace, the customer email domain) has to sign off on which agents can reach into it, with what frequency, at what severity.
  • The approval is tier-based: passive vs. active vs. urgent, with separate budgets for each.
  • There is a documented revocation path. When the agent's judgment turns out to be worse than expected, the owner can pull the permission without needing infra to roll a token. The control plane has to support that revocation in seconds, not in a weekly deploy.
  • The audit trail is human-readable. "Agent X paged on-call Y at time Z because tool Q returned R" should be queryable by the affected human, not buried in a structured-log dashboard only the platform team knows how to read.
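
A sketch of what that grant record could carry, with invented field names; the point is what the record has to hold, not the exact schema:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class HumanReachGrant:
    """What a product permission carries that a token scope does not.
    Field names are illustrative."""
    agent_id: str
    channel: str                   # an on-call schedule, Slack workspace, email domain
    channel_owner: str             # who signed off, and therefore who can revoke
    daily_budgets: dict[str, int]  # e.g. {"passive": 20, "active": 5, "urgent": 1}
    revoked_at: datetime | None = None

    def active(self) -> bool:
        # Revocation is a data write the control plane honors in seconds,
        # not a token roll waiting on a weekly deploy.
        return self.revoked_at is None
```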

If you would not give a brand-new intern the PagerDuty escalation policy on day one — and you wouldn't — then giving an agent the same access without the same scaffolding is a strictly worse decision. The intern at least has a manager who feels the cost.

Designing the Off-Ramp

Every human-facing tool needs an off-ramp built in from day one. The off-ramp is the thing the agent does instead of contacting a human when the situation crosses a policy threshold. Common shapes (the delay window is sketched in code after the list):

  • A queue of pending actions that a human triages on their own time, instead of synchronously paging.
  • A "soft" channel — a low-attention place the agent can write to with full freedom, and a "hard" channel that requires a separate budget. The agent learns to default to soft.
  • A drafted message rendered into a UI for a human to send, instead of being sent directly.
  • A delay window. The agent's notification doesn't fire for ten minutes; if a follow-up action by the agent would have made the notification redundant, the original is suppressed.
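
The delay window is the least obvious shape, so here's a minimal sketch, with send_fn standing in for whatever actually delivers the notification; the ten-minute default is an assumption from the bullet above:

```python
import threading

class DelayedNotifier:
    """Minimal sketch of the delay-window off-ramp. send_fn stands in
    for whatever actually delivers the notification."""
    def __init__(self, send_fn, delay_s: float = 600.0):
        self.send_fn = send_fn
        self.delay_s = delay_s
        self._pending: dict[str, threading.Timer] = {}

    def notify(self, notification_id: str, payload: dict) -> None:
        """Queue the notification; it fires only if nothing suppresses it."""
        timer = threading.Timer(self.delay_s, self._fire,
                                args=(notification_id, payload))
        self._pending[notification_id] = timer
        timer.start()

    def suppress(self, notification_id: str) -> bool:
        """Cancel a pending notification a follow-up action made redundant."""
        timer = self._pending.pop(notification_id, None)
        if timer is not None:
            timer.cancel()
            return True  # the human never saw it
        return False

    def _fire(self, notification_id: str, payload: dict) -> None:
        self._pending.pop(notification_id, None)
        self.send_fn(payload)
```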

The off-ramp is what makes the rate limit recoverable. Without it, hitting the quota means the agent silently fails to communicate something the human might genuinely need to know. With it, the quota just changes the channel — from urgent to non-urgent, from synchronous to asynchronous, from "wakes you up" to "shows up in the morning queue."

The Question to Ask Before Wiring Up the Next Tool

Before installing the next MCP server, integration, or webhook that lets your agent reach a human, walk through five questions, and don't ship until you have an answer to each:

  1. Which humans can this tool contact, and who owns their attention budget?
  2. What's the worst single message the agent could send, and what's its blast radius?
  3. What's the worst hour of repeated actions the agent could produce, and at what point does the runtime intervene without the agent's cooperation?
  4. Who can revoke the permission in under sixty seconds when the agent gets it wrong?
  5. What does the agent do instead when the policy says it can't take this action?

If any of those have an answer of "we'd figure it out," the integration isn't ready. The reason isn't theoretical risk. It's that the postmortem you're going to write is already mostly drafted, and it's going to be addressed to a person who didn't sign up to be on the agent's distribution list.

The most important shift in 2026 is not that agents got more capable — it's that they got reach. Capability without a control plane on reach is the next class of incidents your product is going to ship. The good news is that the control plane is mostly mundane infrastructure: rate limits, quotas, approvals, audit trails, off-ramps. The hard part is admitting that the right to wake up a human is a permission, and putting someone in charge of granting it.
