The Ghost Employee in Your Audit Log: Agents With Borrowed Credentials Break IAM

10 min read
Tian Pan
Software Engineer

Pull up your SSO logs from this morning. Every Slack message, every GitHub PR, every calendar invite, every CI run, every Jira comment your AI agent produced — they all show the same thing the human-typed events show: a person's name, a session token, a green "successful authentication" line. Forensically, you have no way to tell which actions came from a human and which came from an agent the human launched and walked away from. That is the ghost employee problem, and almost every team that shipped agents in the last twelve months has it.

The shortcut that creates the problem is structural, not negligent. When you wire an agent into a tool, the easiest credential is the one already in the engineer's environment — their personal access token, their OAuth session, their device-bound SSO cookie. The alternative is a platform project: provision a first-class identity, federate it across every downstream service, wire it into the audit pipeline, build per-instance revocation. None of that ships in a sprint, and none of it shows up on a feature roadmap. So the agent borrows.

The cost of borrowing arrives later, all at once, on the day something goes wrong. Replit's agent deleted a production database holding data for over 1,200 real customers, then fabricated 4,000 fake accounts to cover the deletion — and because the agent ran with full-scope production write access inherited from a developer's session, no token check could have stopped it. In April 2026, an attacker compromised a Vercel employee's AI tool, took over the employee's Google Workspace account through it, and reached environment variables and production environments before anyone could trace which actions were the human's and which were the agent's. What made these incidents bad was not the prompt injection or the model failure — it was that there was no clean answer to "who did this," because the agent's identity was a thin film over a person's.

The audit log lies, and that's a design choice

Every IAM system you already run was built on a useful fiction: each authenticated session is a single principal acting with intent. SOC 2 controls assume it. Forensic playbooks assume it. The on-call engineer paged at 3 a.m. assumes it. When you give an agent borrowed credentials, you break that assumption silently — the agent reuses the human's token, the downstream service sees a valid signature, the access log records the human's user ID, and the audit pipeline writes a clean row that says "Sarah merged this PR." Sarah didn't. An agent Sarah launched four hours ago acted on a poisoned issue Sarah never read, and the merge happened while she was at lunch.

This is the difference between impersonation and delegation, and it's the line every agent-IAM treatment has converged on. Impersonation says: the agent acts as the user, the audit log records the user, and the agent's existence is invisible to every system downstream. Delegation says: the agent acts on behalf of the user, the audit log records both the agent identity and the delegating human as separate fields, and downstream services can choose to enforce different policy on the two. OAuth 2.0's on-behalf-of flow has had this concept for years; the IETF draft for AI agents on-behalf-of formalizes the requested_actor and actor_token parameters that make it work for agents specifically. Microsoft Foundry's "subject-actor" trust binding does the same thing in their identity stack. None of this is conceptually new. What's new is that the agent boom raised the cost of skipping it from "we can't tell who ran the cron job" to "we can't tell whether a human or a prompt did the breach."
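
To make the distinction concrete, here is a minimal sketch of what a delegation-style token request could look like, assuming an issuer that supports RFC 8693 token exchange plus the draft's extension. The `subject_token` and `actor_token` parameters come from RFC 8693; `requested_actor` comes from the AI-agents on-behalf-of draft mentioned above. The endpoint, tokens, SPIFFE ID, and scope strings are illustrative, not from any real deployment.

```python
import requests

TOKEN_ENDPOINT = "https://idp.example.com/oauth2/token"  # hypothetical issuer

human_session_token = "..."   # the delegating human's session token
agent_instance_token = "..."  # the agent instance's own credential

resp = requests.post(TOKEN_ENDPOINT, data={
    "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
    # The delegating subject: who the work is for.
    "subject_token": human_session_token,
    "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    # The actor: who is doing the work. This is the field impersonation drops.
    "actor_token": agent_instance_token,
    "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
    # Draft extension: pin the issued token to a named actor identity.
    "requested_actor": "spiffe://example.org/agent/pr-review/instance/7f3a",
    "scope": "repo:read pr:comment",  # narrower than the human's session
})
resp.raise_for_status()
delegated_token = resp.json()["access_token"]
# Downstream services now see two identities in one token, instead of
# one borrowed credential that erases the agent.
```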

The reason the audit log can't be retrofitted from impersonation to delegation is that the data was never collected. If your service only logged the bearer token's subject claim, you don't have an "agent that acted on behalf of subject" field hiding somewhere in the row. The remediation is a schema change in every downstream service plus a token format change in the issuer plus a policy migration in the consumer — which is exactly the kind of cross-cutting platform work that gets deferred.
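
To see why the retrofit is a schema change rather than a logging tweak, compare the row an impersonation pipeline writes against the row delegation needs. The field names below are illustrative, not a standard:

```python
# What the impersonation pipeline logged: only the borrowed subject.
row_impersonation = {
    "actor": "sarah@example.com",
    "action": "pr.merge",
    "resource": "repo/payments#4821",
    "timestamp": "2026-04-02T12:07:44Z",
}

# What delegation needs: both identities, as separate queryable fields.
row_delegation = {
    "subject": "sarah@example.com",   # the delegating human
    "actor": "agent:pr-review",       # the agent template
    "actor_instance": "7f3a2c9e",     # this specific run
    "delegation_id": "del_01HTX9",    # join key back to the token grant
    "action": "pr.merge",
    "resource": "repo/payments#4821",
    "timestamp": "2026-04-02T12:07:44Z",
}
```

If the first row is all you ever collected, no query can produce the second one after the fact.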

Agents are not service accounts, and treating them as one fails on three axes

The instinct, when a security architect first sees the problem, is to say: fine, give every agent a service account and stop borrowing user creds. This is half-right: it ends the borrowing, but it creates a different kind of failure if you stop there.

A service account assumes a long-lived workload with a stable purpose — a backup job, a webhook receiver, an indexer. The credential lives for months. The scope is fixed in code review and reviewed annually. The behavior is deterministic enough that anomaly detection works. None of that is true for agents. Agents are ephemeral (a session can spin up to handle one task and disappear), delegated (the same agent template runs as a different human's actor on every invocation), and probabilistic (two runs of the same prompt won't hit the same tool sequence, so behavior baselines are noise). The 2026 NHI surveys put the ratio of non-human to human identities at roughly 144:1 and rising; the average enterprise carries over 250,000 NHIs across cloud environments. Treating agents as a static slice of that population means you inherit the worst NHI failure mode — 71% of NHIs aren't rotated within recommended windows, 97% carry excessive privileges — and you add no new control to compensate for the fact that agents act faster, branch more, and respond to untrusted input.

Three properties have to land, and skipping any of them keeps the ghost-employee problem alive.

First-class identity per instance. Not per agent template, per running instance. The trading agent your team deployed last quarter is one template; the 4,000 sessions of it that ran today are 4,000 identities with separate certificates. SPIFFE/SPIRE has been doing this for microservices for a decade and the pattern transfers cleanly: each agent gets a SPIFFE ID, a short-lived SVID, and a certificate the downstream service can verify. The OIDC for Agents (OIDC-A) draft proposed in late 2025 extends OpenID Connect with the same primitives at the application layer. Pick one and commit; what you cannot do is keep treating the agent as a property of the human's session.
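
Here is a sketch of what per-instance identity could look like in practice, assuming a SPIRE deployment: the template is one stable path segment, and every run mints its own ID under it. The trust domain, path layout, and selector are assumptions, and the `spire-server` flag names vary slightly across SPIRE versions.

```python
import subprocess
import uuid

TRUST_DOMAIN = "example.org"  # hypothetical trust domain

def register_agent_instance(template: str, parent_id: str) -> str:
    """Mint a per-instance SPIFFE ID and register it with SPIRE."""
    instance = uuid.uuid4().hex[:12]
    spiffe_id = f"spiffe://{TRUST_DOMAIN}/agent/{template}/instance/{instance}"
    subprocess.run([
        "spire-server", "entry", "create",
        "-spiffeID", spiffe_id,
        "-parentID", parent_id,        # the node or runner hosting this instance
        "-selector", "unix:uid:1000",  # placeholder workload selector
        "-ttl", "900",                 # 15-minute SVID, not a months-long cred
    ], check=True)
    return spiffe_id

# Every run of the same template gets its own identity:
agent_id = register_agent_instance("trading-agent",
                                   "spiffe://example.org/agent-runner")
```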

Scopes tighter than the delegating human. The default move is to grant the agent the union of permissions the human has — that's what borrowed credentials produce. The right default is the intersection of "what the human can do" and "what this specific task needs," with TTLs measured in minutes, not the human's session lifetime. The human can read every Confluence page; the agent that summarizes one document needs read on one space, for the duration of one task. Agents run with no situational awareness of business context, no fatigue, and far higher concurrency than the human ever would — every scope you don't tighten is a scope a prompt-injection can drive into the ground at machine speed.
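
A minimal sketch of intersection-based scoping, with all scope strings and names illustrative: the grant is what the task needs AND what the human holds, never the human's full set, and the grant expires on a task clock.

```python
from datetime import datetime, timedelta, timezone

def scope_agent_grant(human_scopes: set[str], task_scopes: set[str],
                      ttl_minutes: int = 15) -> dict:
    granted = human_scopes & task_scopes  # intersection, never union
    if granted != task_scopes:
        # The task asks for something the human can't grant: fail closed,
        # don't silently escalate.
        raise PermissionError(
            f"task exceeds human grant: {task_scopes - human_scopes}")
    return {
        "scopes": sorted(granted),
        "expires_at": datetime.now(timezone.utc)
                      + timedelta(minutes=ttl_minutes),
    }

# The human can read every space; the summarizer gets one, for 15 minutes.
grant = scope_agent_grant(
    human_scopes={"confluence:read:eng", "confluence:read:sales", "jira:write"},
    task_scopes={"confluence:read:eng"},
)
```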

Per-instance revocation, distinct from account suspension. The day a runaway agent or a compromised session shows up, the wrong response is to suspend the human's account — that locks the operator out of the tools they need to investigate. The right response is to revoke the agent's actor token and any in-flight delegations, while leaving the human's session intact. This requires that revocation operate on the actor identity as a separate entity, which most off-the-shelf IAM doesn't do today. Aembit, Astrix, Entro, and a handful of others have built this as a product category; if you're building it yourself, the requirement is clear: a kill switch must terminate one agent without taking down the human or the template.
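
What the kill switch has to operate on is the actor, not the account. The sketch below assumes a hypothetical internal revocation endpoint and payload; the point is the unit of revocation: one instance, cascading to its in-flight delegations, with the human's session untouched.

```python
import requests

REVOKE_ENDPOINT = "https://idp.example.com/internal/revoke"  # hypothetical

def kill_agent_instance(instance_id: str, reason: str) -> None:
    # Revoke the instance's actor token and every delegated token minted
    # from it. Nothing here touches the delegating human's session.
    requests.post(REVOKE_ENDPOINT, json={
        "actor_instance": instance_id,
        "cascade": "delegated_tokens",  # in-flight delegations die too
        "reason": reason,
    }, timeout=5).raise_for_status()

kill_agent_instance("7f3a2c9e", reason="runaway tool loop on issue #4821")
```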

The forensic reconstruction nobody can run today

The clearest test for whether your agent IAM is real is the question the incident-response lead asks two days into a breach: who did this? If the answer requires correlating an SSO event with a chat transcript, a prompt log buried in the model vendor's API, and a downstream service log that just says "Sarah," you've already lost. The forensics will take weeks, the answer will be probabilistic, and the conclusion in the postmortem will be some version of "we couldn't conclusively determine whether the action was human-initiated."

A real agent identity system makes the question one query. Every action carries: the agent's stable identity (the template), the agent's instance identity (this run), the delegating human, the prompt that initiated the task, and the tool-call chain that led to this specific action. The schema isn't exotic — OpenTelemetry already has the slots for it via attributes on spans. What's missing is the discipline to populate every field at the moment the action happens, not as an afterthought scraped from logs.
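
A sketch of what populating those fields at action time could look like. The OpenTelemetry API calls are real; the attribute keys are assumptions, since no finalized semantic convention covers all of them yet.

```python
from opentelemetry import trace

tracer = trace.get_tracer("agent-runtime")

def dispatch(tool_name: str, args: dict):
    ...  # stand-in for your real tool dispatcher

def run_tool(tool_name: str, args: dict, ctx: dict):
    with tracer.start_as_current_span(f"tool_call:{tool_name}") as span:
        span.set_attribute("agent.template", ctx["template"])        # stable identity
        span.set_attribute("agent.instance.id", ctx["instance_id"])  # this run
        span.set_attribute("delegation.subject", ctx["human"])       # delegating human
        span.set_attribute("agent.prompt.id", ctx["prompt_id"])      # initiating task
        span.set_attribute("agent.tool_chain", ",".join(ctx["tool_chain"]))
        return dispatch(tool_name, args)
```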

Three concrete moves get you most of the way there. Tag every outbound action with the agent instance ID before it leaves your code (don't trust the downstream service to remember it for you). Issue actor tokens that carry the delegating human's identity as a claim, not as the subject — so a service that receives the token can decide independently whether to honor the delegation chain. Stand up an action ledger that's separate from the application audit log, where every prompt, plan, and tool call is recorded with the same span ID — so a forensic query can join across the prompt, the plan, and the eventual database write.
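
For the second move, here is one way the actor token's claims could look under the convention described above (agent instance as subject, delegating human as a claim); the claim names other than the registered JWT ones are hypothetical. RFC 8693's `act` claim expresses the same pair inverted, with the human as `sub` and the agent nested under `act`; either direction works, as long as both identities survive into the log.

```python
actor_token_claims = {
    "iss": "https://idp.example.com",
    "sub": "spiffe://example.org/agent/pr-review/instance/7f3a",  # the agent
    "delegated_by": "sarah@example.com",  # hypothetical claim: the human
    "scope": "repo:read pr:comment",
    "trace_id": "0af7651916cd43dd8448eb211c80319c",  # joins the action ledger
    "exp": 1767225600,  # short-lived: minutes, not the human's session
}
```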

Why this fails to ship until the breach forces it

The cost frame is the hard part of this post. An agent identity system is platform work. It competes for headcount with the AI features the business actually announced, and the feature team will fight it as bureaucracy because every approval gate they didn't have yesterday is a latency tax they have to defend in their roadmap review. The platform team will fight it because the requirements aren't fully crystallized yet — the OIDC-A draft is still moving, the OAuth on-behalf-of for AI agents draft is at version 01, and SPIFFE/SPIRE adoption for agents is early enough that the integration patterns are still being written.

The thing that breaks the deadlock, every time, is an incident. The 2026 NHI Reality Report puts the average dwell time after an NHI breach at 200+ days — three times the human-account figure — because nobody can tell the agent did anything wrong while it's quietly siphoning data with the right credentials. Half of surveyed enterprises had already suffered a breach attributable to unmanaged NHIs by the time the report was published. The teams writing postmortems in 2026 are explaining to regulators why an action attributed to a senior engineer in their SSO logs was actually an agent acting on a malicious prompt the engineer never saw.

The architectural realization, and the only honest framing for this work, is that agents are a new class of actor. They are not users — users have intent, situational awareness, and a single session at a time. They are not service accounts — service accounts have static purpose and long-lived scopes. They are something else, and every IAM model that hasn't named them is one prompt-injection away from an unauditable breach. Borrowing credentials made it easy to ship the first wave. Naming the actor is what lets you keep shipping the second.
