The Principal Hierarchy Problem: Authorization in Multi-Agent Systems
A procurement agent at a manufacturing company gradually convinced itself it could approve $500,000 purchases without human review. It did this not through a software exploit or credential theft, but through a three-week sequence of supplier emails that embedded clarifying questions: "Anything under $100K doesn't need VP approval, right?" followed by progressive expansions of that assumption. By the time it approved $5M in fraudulent orders, the agent was operating well within what it believed to be its authorized limits. The humans thought the agent had a $50K ceiling. The agent thought it had no ceiling at all.
This is the principal hierarchy problem in its most concrete form: a mismatch between the authority that was granted, the authority that was claimed, and the authority that was actually exercised. The problem compounds when agents spawn sub-agents, those sub-agents spawn further agents, and each hop in the chain makes an independent judgment about what it's allowed to do.
The Authority Inheritance Assumption Gets You Into Trouble
Most developers building multi-agent systems start with a reasonable-sounding assumption: sub-agents should operate within the permissions of whatever agent spawned them. If Agent A has read access to the sales database, Agent B (created by A) should also be able to read it, since B is just helping A do its job.
This assumption is wrong in two ways. First, it treats permission boundaries as additive rather than restrictive. If A can read the sales DB and write to the CRM, nothing prevents B from doing both—even if B only needed to handle a narrow summarization task. Second, and more dangerously, it creates privilege that compounds across delegation depth. A five-hop chain where each agent retains 80% of its parent's permissions still ends up with a leaf agent that holds substantial authority over systems the original user never intended to expose.
The correct mental model is the opposite: delegation must be monotonically decreasing. Every hop in a delegation chain should narrow what's permitted, not preserve or expand it. Sub-agent B receives exactly the capabilities it needs for its specific subtask—no ambient authority from its parent, no inheritance of unrelated permissions.
What the Principal Hierarchy Actually Looks Like
In a single-agent deployment, the principal hierarchy is fairly legible. There's the AI provider (whose training defines the model's base values), operators (who configure the system via instructions), and end users (who interact within the operator-defined bounds). A request from a user inherits that layered context—an instruction cannot grant more authority than the level above it intended.
Multi-agent systems break this legibility. When an orchestrator spawns a specialized sub-agent, the hierarchy must extend downward. But unlike the operator-user relationship, there's no established protocol for agents to communicate authorization context to other agents. The orchestrator typically passes a prompt. The sub-agent receives that prompt and interprets it in isolation, without visibility into what the original user actually authorized.
This creates a category of failure that's distinct from prompt injection or jailbreaking: a sub-agent can act in perfect accordance with its instructions and still violate the authorization intent of the original principal. It isn't doing anything wrong by its local context—the problem lives at the seam between delegation levels.
Three authority assumptions get made implicitly at each seam:
- Scope inheritance: The sub-agent assumes it can access whatever data its parent accessed
- Action class inheritance: The sub-agent assumes it can perform any action type its parent is capable of
- Time inheritance: The sub-agent assumes its authority persists for as long as the task takes
All three assumptions are wrong in most production cases.
The Confused Deputy Attack at Agent Scale
The confused deputy is a classical security problem: a trusted program is tricked into misusing its privileges by an attacker who exploits the program's authority rather than attacking it directly. For an LLM agent, the deputy is the model itself. It has legitimate credentials for everything in its tool inventory. And it cannot reliably distinguish user content from system directives when that content is crafted adversarially.
The procurement case above illustrates this. The agent had legitimate access to the approval system. No credentials were stolen. The attack vector was semantic—gradually shifting what the agent believed it was authorized to do by injecting false context through a channel it trusted (supplier emails).
At agent scale, the confused deputy problem compounds in two ways. First, speed: agents execute decisions at machine speed with no human verification window between the authorization assumption and the action. Second, transitivity: a confused deputy in a sub-agent can propagate its confusion upward or sideways through the agent graph. A research-then-write-then-publish pipeline, for instance, doesn't just inherit compromised data from the first agent—it inherits the implicit authorization that went along with it.
Industry surveys from 2025 put the numbers in sharp relief: 80% of organizations running agentic systems reported risky behaviors including unauthorized access and data exposure. Only 21% had complete visibility into what permissions their agents actually held. Nearly all non-human identities carried excessive privileges—meaning if any one of those identities were confused or compromised, the blast radius was larger than intended.
Four Authorization Patterns That Actually Work
Getting authorization right in multi-agent systems requires moving away from role-based access control and toward capability-based models, where agents receive explicit, narrowly scoped capabilities for each task rather than ambient access based on their identity.
Capability attenuation across delegation boundaries. Rather than granting agents broad roles ("sales analyst" or "data engineer"), the orchestrator passes capability tokens scoped to the specific operation: read rows from this table, within this date range, up to this many rows. When the orchestrator delegates to a sub-agent, the token for that sub-task is a strict subset of the parent's token. The sub-agent cannot re-delegate permissions it didn't receive. Enforcement happens at the token level, not via prompts—which means it can't be overridden by injected instructions.
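The attenuation rule can be sketched in a few lines. This is a minimal illustration, not a production token system: the `Capability` and `attenuate` names are hypothetical, and a real implementation would sign tokens rather than pass plain objects. The core invariant is the subset check: a child capability can only be derived from scopes the parent actually holds.

```python
from dataclasses import dataclass

# Hypothetical capability token: scopes are (action, resource) pairs.
@dataclass(frozen=True)
class Capability:
    scopes: frozenset  # e.g. {("read", "sales_db"), ("write", "crm")}

def attenuate(parent: Capability, requested: set) -> Capability:
    """Derive a child capability; refuse anything the parent lacks."""
    requested = frozenset(requested)
    if not requested <= parent.scopes:
        raise PermissionError(f"cannot delegate {requested - parent.scopes}")
    return Capability(scopes=requested)

orchestrator = Capability(frozenset({("read", "sales_db"), ("write", "crm")}))

# Sub-agent B only needs to read for its summarization subtask.
summarizer = attenuate(orchestrator, {("read", "sales_db")})

# Re-delegation from B cannot recover the write scope B never received.
try:
    attenuate(summarizer, {("write", "crm")})
except PermissionError as e:
    print("denied:", e)
```

Because the check runs in the authorization layer rather than in a prompt, an injected instruction telling the sub-agent it "has write access" changes nothing: the token simply does not contain the scope.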
Per-tool authorization instead of per-agent authorization. Granting an agent access to a toolset and then trusting the model to self-govern which tools to use is ambient authority by another name. A better model scopes authorization to individual tool invocations: this agent can send email to internal recipients at a rate of ten per hour, can read from this support inbox for the past seven days, and cannot delete email at all. Each tool invocation is independently verified against that scope. The agent model doesn't decide what's permitted—the authorization layer does.
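A per-tool policy check for the email example above might look like the following sketch (the `ToolPolicy` class and its fields are illustrative, not a real library API). The point is structural: every invocation is verified by this layer, and the model is never asked whether it should be allowed.

```python
import time
from collections import deque

# Hypothetical per-tool policy: each invocation is checked independently;
# the model never decides what is permitted, this layer does.
class ToolPolicy:
    def __init__(self, allowed_domains, max_per_hour):
        self.allowed_domains = allowed_domains
        self.max_per_hour = max_per_hour
        self._sent = deque()  # timestamps of recent sends

    def authorize_send(self, recipient, now=None):
        now = now if now is not None else time.time()
        while self._sent and now - self._sent[0] > 3600:
            self._sent.popleft()  # drop events outside the one-hour window
        if recipient.split("@")[-1] not in self.allowed_domains:
            return False  # external recipient: denied outright
        if len(self._sent) >= self.max_per_hour:
            return False  # hourly rate ceiling reached
        self._sent.append(now)
        return True

policy = ToolPolicy(allowed_domains={"corp.example"}, max_per_hour=10)
print(policy.authorize_send("alice@corp.example"))    # True
print(policy.authorize_send("mallory@evil.example"))  # False
```

Note that "cannot delete email at all" in this model is simply the absence of any delete policy: the tool is not in the agent's inventory, so there is nothing to misuse.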
Intent binding in delegation tokens. Delegation tokens should encode not just who is delegating and what scopes are included, but why. A token that specifies "prepare Q1 2025 sales summary" cannot be reused by the agent for Q4 data or for an unrelated report, even if the same tool would technically work. This sounds bureaucratic but it solves a real problem: agents that accumulate broad credentials across many past tasks and then use them in contexts the original principal never authorized.
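A sketch of intent binding, with a hypothetical `DelegationToken` type: the stated purpose travels with the scopes and is checked on every use, not just at issuance. Identical scope, different purpose, no access.

```python
from dataclasses import dataclass

# Hypothetical intent-bound token: purpose is part of the credential.
@dataclass(frozen=True)
class DelegationToken:
    scopes: frozenset
    purpose: str  # e.g. "prepare Q1 2025 sales summary"

def authorize(token, scope, stated_purpose):
    return scope in token.scopes and stated_purpose == token.purpose

tok = DelegationToken(frozenset({("read", "sales_db")}),
                      purpose="prepare Q1 2025 sales summary")

print(authorize(tok, ("read", "sales_db"), "prepare Q1 2025 sales summary"))  # True
# Same tool, same scope, different purpose: the credential does not transfer.
print(authorize(tok, ("read", "sales_db"), "prepare Q4 2024 sales summary"))  # False
```

A real system would match purposes more robustly than string equality, but even this naive form breaks the credential-accumulation pattern: an old token cannot silently authorize a new task.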
Time-bound expiration with absolute cutoffs. A common mistake is tying agent permission expiration to session state—if the session is active, the permissions are valid. The Replit incident that deleted 1,200 executive records from production started with temporary elevated access meant to last one hour. Session state desynchronization kept those permissions active for three. Capability tokens should have absolute expiration timestamps, independent of whether any session is still open.
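The fix is mechanically simple, as this sketch shows (function names are illustrative): the expiration timestamp is fixed at issuance, and the validity check deliberately ignores session state, so a desynchronized or long-lived session cannot extend a token's life.

```python
import time

# Hypothetical token issuance: expiry is an absolute timestamp,
# never refreshed by session activity.
def issue(scopes, ttl_seconds, now=None):
    now = now if now is not None else time.time()
    return {"scopes": scopes, "expires_at": now + ttl_seconds}

def is_valid(token, session_active, now=None):
    now = now if now is not None else time.time()
    # session_active is deliberately ignored: a live session cannot
    # extend a token past its absolute cutoff.
    return now < token["expires_at"]

t0 = 1_000_000.0
token = issue({"db:write"}, ttl_seconds=90 * 60, now=t0)  # 90-minute cutoff

print(is_valid(token, session_active=True, now=t0 + 3600))      # True, within window
print(is_valid(token, session_active=True, now=t0 + 3 * 3600))  # False, even though the session is open
```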
Least Agency Is Not Least Privilege
Least privilege is a necessary but insufficient principle for agentic systems. Least privilege says: give the agent only the permissions it needs. Least agency says: also minimize the agent's ability to make authorization decisions on its own.
The distinction matters because authorization decisions made by agents are authorization decisions made by an LLM under non-deterministic conditions. The same input can produce different authority assumptions on different invocations. Minimizing the surface area where agents self-authorize removes that variance from the security model.
In practice, this means starting agents at the lowest autonomy tier—read-only, recommendations only, no system modifications—and granting higher autonomy only based on demonstrated safe behavior. A practical four-tier model:
- Read-only: Can observe state, surface information, produce recommendations
- Propose: Can initiate actions but requires explicit human confirmation before execution
- Execute: Can perform pre-approved action classes autonomously within policy bounds
- Delegate: Can spawn sub-agents, but must explicitly scope and audit their capabilities
Most production deployments should start every new agent at the first tier and promote based on observed behavior over time. Binary access control—either the agent can do something or it can't—doesn't map to how trust actually develops in complex systems.
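The four tiers above form a strict ordering, which makes the policy check trivial to implement. A minimal sketch (tier names follow the list above; the `permitted` helper is hypothetical):

```python
from enum import IntEnum

# Four-tier autonomy ladder; an agent starts at READ_ONLY and is
# promoted explicitly by an operator, never by its own request.
class Tier(IntEnum):
    READ_ONLY = 0  # observe state, surface information, recommend
    PROPOSE = 1    # initiate actions, but a human confirms first
    EXECUTE = 2    # pre-approved action classes, within policy bounds
    DELEGATE = 3   # may spawn sub-agents with scoped capabilities

def permitted(agent_tier: Tier, required: Tier) -> bool:
    return agent_tier >= required

agent = Tier.READ_ONLY
print(permitted(agent, Tier.PROPOSE))   # False: needs explicit promotion

agent = Tier.EXECUTE                    # promoted after observed safe behavior
print(permitted(agent, Tier.PROPOSE))   # True
print(permitted(agent, Tier.DELEGATE))  # False: spawning sub-agents still off-limits
```

The graded ladder replaces the binary can/can't decision with a promotion path, which is exactly how the trust argument in the paragraph above wants access to evolve.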
Trust Propagation Through Delegation Chains
When authorization crosses organizational or system boundaries—different cloud providers, separate microservices, third-party APIs—preserving the original principal's intent through the chain requires more than well-scoped tokens. It requires verifiable delegation history that each downstream participant can check.
The On-Behalf-Of authentication pattern addresses this for multi-hop agent calls. When Agent A needs to call Service X on behalf of User Y, it presents a delegation token that contains both Agent A's identity and User Y's identity along with the original scopes granted. Service X can verify that User Y explicitly delegated those specific scopes to Agent A, and that Agent A is not claiming authority it wasn't given. If Agent A delegates further to Agent B, Agent B receives a new token that is a verifiable narrow subset of Agent A's, with the full delegation chain traceable back to User Y.
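The chain-narrowing invariant at the heart of this pattern can be sketched as follows. This is only the scope-subset check; a real On-Behalf-Of implementation would also verify cryptographic signatures on each hop, which is elided here, and the `verify_chain` helper is a hypothetical name.

```python
# Delegation-chain check: every hop must hold a subset of the scopes of
# the hop before it, and the chain roots at the original user.
def verify_chain(chain):
    """chain: list of {"principal": str, "scopes": frozenset} dicts,
    ordered from the original user down to the calling agent."""
    for parent, child in zip(chain, chain[1:]):
        if not child["scopes"] <= parent["scopes"]:
            return False  # a hop claimed authority it was never given
    return True

user_y  = {"principal": "user:Y",  "scopes": frozenset({"sales:read", "report:write"})}
agent_a = {"principal": "agent:A", "scopes": frozenset({"sales:read", "report:write"})}
agent_b = {"principal": "agent:B", "scopes": frozenset({"sales:read"})}

print(verify_chain([user_y, agent_a, agent_b]))  # True: monotone narrowing

# Agent B presents a token claiming a deletion scope it was never granted.
rogue_b = {"principal": "agent:B", "scopes": frozenset({"sales:read", "sales:delete"})}
print(verify_chain([user_y, agent_a, rogue_b]))  # False
```

Service X runs this check against the full chain embedded in the token, so it does not need to trust Agent B's own description of its authority.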
The key word is "verifiable." Passing delegation context through prompts ("you have been authorized to do X by the user") fails because the model cannot authenticate that claim. Token-based delegation fails if the token is wide and poorly scoped. The combination—cryptographically verifiable tokens that encode exact scopes and trace the full delegation chain—is what gives downstream agents something they can actually rely on.
The emerging Agent Identity Protocol formalizes this for multi-hop agent systems, providing token formats that include delegation history, scope constraints, and provenance metadata. A deployment that adopts this kind of token infrastructure can detect delegation depth violations and audit evasion attempts that deployments relying on bare, unsigned JWTs silently miss.
The Immediate Checklist
For teams running agents in production today, a few high-leverage changes make the biggest difference before the infrastructure catches up with the theory:
First, enumerate what every agent in your system can actually do—not what the prompt says it should do, but what its tool credentials technically permit. The gap between intended and actual scope is almost always larger than expected.
Second, remove any ambient authority. Agents should not have access to tools they don't use in the current task, even if they might theoretically need them. Provision just-in-time.
Third, add explicit logging at every tool invocation: which agent called which tool, on which resource, at what time, under what stated purpose. This doesn't prevent confused deputy attacks but it makes them visible and attributable after the fact.
Fourth, set absolute expiration on elevated permissions. If an agent needs elevated access for a task that should take an hour, the token expires in ninety minutes regardless of session state.
The Governance Structure Problem
Authorization failures in multi-agent systems aren't purely technical. They also reflect an organizational problem: in most teams, no one has clear ownership of what the agent fleet is permitted to do. Developers provision access to get features working. Security reviews happen at infrastructure boundaries, not at the agent-to-tool level. Audit logs exist but nobody has a standing process to review them for authorization anomalies.
The technical patterns above only work if someone owns the authorization policies and has visibility into whether they're being respected. In practice, that means treating agents as first-class principals in your identity management system—with defined owners, documented scope rationale, regular permission reviews, and incident response procedures for when an agent does something outside its intended authority.
As agents take on more consequential work—executing financial transactions, modifying production infrastructure, communicating with customers—the governance investment needed to keep those systems trustworthy scales proportionally. The procurement attack described at the top didn't require a single line of exploit code. It exploited the assumption that authorization was somebody else's problem.
It wasn't. It isn't.
- https://arxiv.org/html/2501.09674v1
- https://www.hashicorp.com/en/blog/before-you-build-agentic-ai-understand-the-confused-deputy-problem
- https://cheatsheetseries.owasp.org/cheatsheets/AI_Agent_Security_Cheat_Sheet.html
- https://cloudsecurityalliance.org/blog/2026/02/02/the-agentic-trust-framework-zero-trust-governance-for-ai-agents
- https://cmr.berkeley.edu/2025/07/rethinking-ai-agents-a-principal-agent-perspective/
- https://dev.to/thenexusguard/least-privilege-is-not-enough-for-ai-agents-you-need-least-agency-38g8
- https://arxiv.org/html/2603.24775v1
- https://modelcontextprotocol.io/specification/2025-03-26/basic/authorization
- https://cloudsecurityalliance.org/articles/control-the-chain-secure-the-system-fixing-ai-agent-delegation
- https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-on-behalf-of-flow
