
Agent Credential Blast Radius: The Principal Class Your IAM Model Never Enumerated

· 11 min read
Tian Pan
Software Engineer

The security org spent a decade killing off the "service account that can do everything." Scoped tokens, short-lived credentials, JIT access, per-action audit — the whole least-privilege playbook landed and stuck. Then the AI team wired up an agent, the prompt asked for a tool catalog, and the engineer requested the broadest OAuth scope the platform would issue. The deprecated pattern is back, wearing new clothes, and this time the principal calling the API is a stochastic loop nobody is sure how to scope.

The agent has read-write on the calendar, the file store, the CRM, and the deploy pipeline because the API surface couldn't be enumerated up front. The token is long-lived because no one wired the refresh path. The audit log records the bearer, not the action. And IAM owns human and service identity, the platform team owns workload identity, the AI team owns the agent's effective permissions, and the union of those three sets is owned by no one.

This is not a hypothetical. Machine identities now outnumber humans roughly 82 to 1 in production environments, and 92% of cloud identities run with privileges they never exercise. Gartner's projection that a quarter of enterprise breaches by 2028 will trace back to AI agent abuse is not a forecast about novel attack classes — it is a forecast about replaying every privileged-account incident the last decade already learned how to prevent, with a new principal class that the IAM model never enumerated.

The Privilege the Tooling Made Easy to Request

The agent ships over-permissioned for a depressingly mundane reason: the tooling makes coarse permissions easy and fine permissions hard. When you spin up a new tool integration, the SDK asks "what scopes do you want?" The set of scopes the platform exposes is the union of what every consumer might possibly need — calendar.events.readwrite rather than calendar.events.create_for_meeting_id_X. The engineer ticks the boxes that cover every plausible action the agent might attempt across every plausible session, because progressive authorization at runtime is a flow nobody has the budget to build.

The model then operates inside that envelope. The prompt says "schedule a meeting"; the credential could just as easily delete the entire calendar. The boundary between "what the agent was asked to do" and "what the agent could do" is a prompt away. And on the day a poisoned email, a malicious search result, or a hostile HTML payload steers the agent off-task, the credential's blast radius is the union of every action the original scope grants — not the action the user actually wanted.

The Salesloft–Drift breach that defined late 2025 was the worst-case version of this story: an OAuth token granting a third-party integration access to hundreds of downstream environments, compromised once and used everywhere. The pattern there was not exotic. It was the same pattern the AI team is shipping today, multiplied by the number of agents in production.

Per-Tool Scoping, Not Per-Agent Scoping

The first discipline to land is the one the legacy IAM model already knows but the agent IAM model is still pretending it can avoid: scope the credential to the action, not to the principal. A "calendar agent" does not need calendar permissions; the create_meeting call needs calendar.events.create, the list_attendees call needs calendar.events.read, and those should be two different tokens minted from two different short-lived grants.

The MCP authorization spec moved this direction in late 2025 — servers can now declare per-scope requirements via WWW-Authenticate headers, and clients are expected to use the Step-Up Authorization Flow to request additional scopes only when a tool actually needs them. That spec change is the wire-format admission that bundling every scope into a single token at session start was the wrong shape from the beginning. The implementation work — clients that actually do step-up rather than requesting the union of every possible scope at the start of the session — is mostly still ahead of the industry.
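The step-up shape can be sketched without the full MCP machinery. This is a minimal, hypothetical client sketch, not the spec's reference flow: `call_tool` and `mint_token` are stand-ins for the transport and the token endpoint, and the only behavior shown is the one the spec change implies — parse the scope the server demanded from the `WWW-Authenticate` challenge and mint a token for exactly that, never the session-start union.

```python
import re

def parse_required_scope(www_authenticate: str) -> list[str]:
    """Extract the scope parameter from a WWW-Authenticate challenge,
    e.g. 'Bearer error="insufficient_scope", scope="calendar.events.create"'."""
    match = re.search(r'scope="([^"]*)"', www_authenticate)
    return match.group(1).split() if match else []

def call_tool_with_step_up(call_tool, mint_token, tool, args, token):
    """Retry a tool call once, after minting a token for exactly the
    scopes the server demanded -- not the union of all possible scopes."""
    status, body, challenge = call_tool(tool, args, token)
    if status == 401 and challenge:
        needed = parse_required_scope(challenge)
        if needed:
            # Step-up: request only what this one call was refused for.
            narrow = mint_token(scopes=needed)
            status, body, challenge = call_tool(tool, args, narrow)
    return status, body
```

The point of the sketch is the negative space: nowhere does the client enumerate every scope it might ever need.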

The practical move: enumerate the tools your agent has, write down the minimum scope each one needs, and treat any tool whose scope is wider than its action as a credential bug. If your tool description says "send email" and the OAuth scope grants mail.read, you have a credential bug regardless of whether the agent ever exercises the read.
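That enumeration is mechanical enough to run as a CI check. A minimal sketch, assuming a hypothetical catalog format where each tool maps to the scopes its action needs and the scopes its credential was actually granted:

```python
def credential_bugs(catalog: dict) -> dict:
    """catalog: tool -> (needed_scopes, granted_scopes).
    Returns the excess grant per tool; any non-empty entry is a
    credential bug, whether or not the agent ever exercises it."""
    return {tool: granted - needed
            for tool, (needed, granted) in catalog.items()
            if granted - needed}

# The article's example: "send email" granted mail.read is a bug.
catalog = {
    "send_email":     ({"mail.send"}, {"mail.send", "mail.read"}),
    "create_meeting": ({"calendar.events.create"}, {"calendar.events.create"}),
}
```

Failing the build on a non-empty result turns "scope wider than action" from a review comment into a gate.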

Just-in-Time Credentials Keyed to the Action

Per-tool scoping shrinks the blast radius across tools. JIT credentialing shrinks it across time. The discipline is the one cloud-native workload identity already enforces for humans and service callers: don't pre-provision long-lived credentials; mint short-lived ones at the moment of use, scoped to the specific action the agent is about to take.

A workable shape: when the agent decides to call a tool, the runtime intercepts the call, presents the action to a credential broker (Vault, an internal STS, or a managed service like AWS IAM Roles Anywhere), and receives back a token that is valid for the next sixty seconds and scoped to the resource ID the agent is about to touch. The token expires before the next tool call and cannot be replayed against a different resource. The agent never holds a credential that outlives a single tool invocation.
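The broker contract above can be sketched in a few lines. This is an illustrative toy, not a production token format — the HMAC stands in for whatever signing material Vault or your STS holds — but it shows the two properties that matter: the token expires in sixty seconds, and it cannot be replayed against a different action or resource ID.

```python
import hashlib, hmac, json, time

SECRET = b"broker-signing-key"  # stands in for Vault/STS signing material

def mint(action: str, resource_id: str, now=None) -> str:
    """Broker side: a token valid for 60s, bound to one action + one resource."""
    now = now or time.time()
    body = json.dumps({"act": action, "res": resource_id, "exp": now + 60},
                      sort_keys=True)
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return body + "." + sig

def verify(token: str, action: str, resource_id: str, now=None) -> bool:
    """Resource side: reject expired tokens and any replay against a
    different action or resource ID."""
    now = now or time.time()
    body, _, sig = token.rpartition(".")
    want = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, want):
        return False
    claims = json.loads(body)
    return (claims["act"] == action and claims["res"] == resource_id
            and now < claims["exp"])
```

A real deployment would use the broker's native lease or STS session mechanism rather than hand-rolled HMAC; the binding of action + resource + expiry into one credential is the part to keep.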

This is a meaningful engineering investment — it is also exactly what HashiCorp Vault, AWS IAM, and Azure managed identities have been building toward for a decade. The agent runtime is a new client of an existing pattern, not a new pattern that needs to be invented. The teams that drag their feet on this are the ones still pasting OpenAI API keys into .env files; the ones who do the work get a credential whose lifetime matches the action's lifetime, and a blast radius that is bounded by what the model could plausibly accomplish in sixty seconds against a single resource ID.

The Audit Log Names the Token, Not the Action

A subtler failure mode: even when the credential is scoped, the audit log usually isn't. The IAM platform records "agent-X-token used to call calendar.events at 14:32:11"; what it should record is "agent-X, on behalf of user-Y, in session-Z, with prompt-P, called cancel_meeting with arguments {meeting_id: 12345}, which was approved by no human and was prompted by the contents of an inbound email." The first record can tell you a token was used. The second record can tell you whether the use was legitimate.

The reason this matters is that agent incidents do not look like classic credential incidents. The credential is not stolen; it is used as designed by an agent that was steered off-task by upstream content. The detection signal is not "an unauthorized party held this token" but "this agent took an action that no human asked for." Without a per-action provenance trail — prompt, source content, tool, arguments, originating user — the security team cannot tell the two apart, and the incident response runbook starts with "we revoked the token" and ends with "we do not know what the agent did with it."

The cost of building this provenance into the audit pipeline is small if it is designed in early and very large if it is bolted on after the first incident. The teams writing this discipline down right now — naming the action, the prompt, the source, and the user in every audit row — are the ones who will not be writing the post-mortem in eighteen months.

Anomalous Tool Calls and the Revocation Path the Agent Cannot Trigger

The fourth discipline is the one that most resembles classic intrusion detection but with a meaningful twist: the agent itself cannot be trusted to detect anomalous use of its own credentials. A compromised agent — one steered by indirect prompt injection, or one whose system prompt was itself the attack — has no incentive and no capability to flag its own behavior. The detection has to live outside the agent.

The pattern is the one workload identity vendors are converging on: establish a behavioral baseline per agent (which tools it uses, in what order, against what resource types, at what cadence), and trip an automated revocation path when the live behavior diverges. A customer-service agent that has called send_email a thousand times against tickets in the support queue and now calls send_email against the entire customer database is exhibiting a deviation a per-agent baseline catches and a per-tool scope check does not.
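A minimal sketch of that baseline, assuming the simplest possible model — a frequency count over (tool, resource-type) pairs, judged only once enough history has accumulated. Real detectors add ordering, cadence, and argument shape; the structural point is that this logic lives outside the agent runtime, where a steered agent cannot reach it.

```python
from collections import Counter

class AgentBaseline:
    """Per-agent behavioral baseline over (tool, resource_type) pairs.
    Lives OUTSIDE the agent: a compromised agent cannot be trusted to
    flag its own behavior."""

    def __init__(self, min_observations: int = 100):
        self.counts: Counter = Counter()
        self.min_observations = min_observations

    def observe(self, tool: str, resource_type: str) -> None:
        self.counts[(tool, resource_type)] += 1

    def is_anomalous(self, tool: str, resource_type: str) -> bool:
        # Only judge once the baseline is meaningful; then flag any
        # pair this agent has never exhibited.
        if sum(self.counts.values()) < self.min_observations:
            return False
        return self.counts[(tool, resource_type)] == 0
```

Run against the example above: a thousand `send_email` calls against support tickets build the baseline, and the first `send_email` against the customer database trips it.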

The revocation path is the part the AI team almost always forgets to wire. The credential broker has to support immediate token invalidation, the agent runtime has to handle the resulting 401s gracefully (not by silently retrying with a fresh token), and the on-call channel has to receive the alert with enough context for a human to make a judgment call. The teams that wire this end-to-end can quarantine a misbehaving agent in seconds; the teams that don't will find out about the incident from a customer.
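The runtime side of that wiring is small but easy to get wrong. A hedged sketch (the function names are placeholders, not a real runtime's API) of the one behavior that matters: a 401 after revocation is a signal to quarantine and alert, never a transient error to retry with a fresh token.

```python
class AgentQuarantined(Exception):
    """Raised when the broker has revoked this agent's grants; the
    runtime must stop, not silently mint a fresh token and retry."""

def execute_tool(call, mint_token, alert, agent_id, tool, args):
    token = mint_token(agent_id, tool)
    status, body = call(tool, args, token)
    if status == 401:
        # Revocation is a decision, not a glitch: surface it to on-call
        # with enough context for a human judgment call.
        alert({"agent": agent_id, "tool": tool, "args": args,
               "reason": "credential revoked mid-session"})
        raise AgentQuarantined(agent_id)
    return body
```

The anti-pattern this forbids — catch the 401, mint again, retry — is exactly how a revoked agent keeps running for another hour.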

Who Owns the Join

The hardest part of this story is not technical. It is organizational. IAM owns human identity and service-account identity. The platform team owns workload identity for ordinary services. The AI team owns the agent runtime, the prompt, the tool catalog, and — by default — the credentials the agent holds. The union of those three sets is the agent's effective permission surface, and at most organizations no single owner can describe it end-to-end.

The fix is not a new team. It is a single explicit owner for the join. Whether that lives inside IAM, inside the platform team, or inside a new "agent identity" function depends on the org, but the artifact has to exist: a document that names the agent, the tools it can call, the credentials each tool resolves to, the scope of each credential, the lifetime of each credential, the audit destination, and the revocation path. A team that cannot produce that document for an agent in production is shipping an unbounded blast radius and calling it a feature.
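The artifact is checkable. A sketch of a validator over a hypothetical schema — the field names mirror the list above, and the check is deliberately dumb: a missing field means the permission surface is unenumerated, full stop.

```python
# Hypothetical schema for the "join" document: one record per agent,
# producible on demand, signed off by a single named owner.
REQUIRED_FIELDS = {
    "agent", "owner", "tools", "credentials_per_tool",
    "scope_per_credential", "lifetime_per_credential",
    "audit_destination", "revocation_path",
}

def validate_join_document(doc: dict) -> set:
    """Return the fields the document is missing; a non-empty result
    means the agent ships with an unenumerated blast radius."""
    return REQUIRED_FIELDS - doc.keys()
```

Gating deploys on an empty result is the cheapest version of "a single document, a single signoff, a single name."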

What to Build Before the First Incident

The agent identity story will be defined by the first generation of public incidents — credentials minted for "do my work" and used for "exfiltrate the customer table," with no one able to say from the audit log which prompt steered which tool to which resource. The teams that will be writing the post-mortems are the ones treating agent credentials as a refactor of the old service-account pattern; the teams that will be writing the recommendations in those post-mortems are the ones who treated agents as a new principal class with its own identity model.

Concretely, before the next agent ships:

  • Enumerate the tool catalog and the minimum scope per tool. Treat any tool whose credential is wider than its action as a bug.
  • Mint credentials at tool-call time, not session-start time. Sixty-second tokens scoped to the resource ID the model is about to touch.
  • Log the action, the prompt, the source content, and the originating user — not just the token. A token-level audit log cannot answer the question the post-mortem will ask.
  • Build the revocation path the agent cannot trigger. Per-agent behavioral baselines, automated quarantine on deviation, and a human-readable alert.
  • Name the owner of the join between IAM, platform identity, and agent permissions. A single document, a single signoff, a single name.

Agents are a new principal class. The IAM model the org spent a decade hardening was written for principals that don't drift on prompts, don't compose tool calls in ways the original engineer never imagined, and don't have an attack surface that includes any document the model is asked to read. The teams treating agents as a refactor of the old model are buying themselves the incident the rest of the industry is about to share. The teams treating agents as a new model — with their own scope vocabulary, their own credential lifetime, their own audit shape, and their own owner — are the ones who will still be shipping when the post-mortems start landing.
