
When Your CLI Speaks English: Least Authority for Promptable Infrastructure

12 min read
Tian Pan
Software Engineer

A platform team I talked to this quarter shipped a Slack bot that wrapped kubectl and accepted English. An engineer typed "clean up the unused branches in staging." The bot helpfully deleted twelve namespaces — including one whose name matched the substring "branch" but which happened to host a long-lived integration environment that the mobile team had been using for a week. No exception was thrown. Every individual call the bot made was a permission the bot legitimately held. The post-mortem could not point to a broken access rule, because no rule was broken. The bot did exactly what its IAM policy said it could do.

The Unix philosophy was a containment strategy hiding inside an aesthetic preference. Small tools with narrow surfaces meant that the blast radius of any single command was bounded by the verbs and flags it accepted. rm -rf was dangerous because everyone agreed it was; kubectl delete namespace required the operator to type out the namespace, and the typing was the gate. The principle of least authority was easy to enforce because authority was lexical: the shape of the command told you the shape of the action.

Then the wrappers started accepting English. Now "the shape of the command" is whatever the LLM decided it meant.

The wrapper is the new confused deputy

The classical confused-deputy pattern — a privileged program tricked by a less-privileged caller into misusing its authority — was supposed to be a relic. It was the kind of bug you saw in early capability systems and decades-old setuid binaries. The recent surge in promptable infrastructure has resurrected it as the dominant authorization failure mode of agentic systems. The Cloud Security Alliance's 2026 research note on agent confused-deputy attacks frames the dynamic precisely: when an AI agent holds broad credentials to act on an operator's behalf, every input channel the agent processes becomes a candidate for authority abuse. An adversary who can write to a Jira ticket, a retrieved document, a tool response, or a Slack message can inject instructions that execute with the operator's full permissions.

But notice: prompt injection is the spectacular case. The everyday case is much quieter. The "deputy" gets confused by its own user, in good faith, because the user asked for "delete the unused branches" and the deputy decided what "unused" meant.

Promptable infrastructure inherits the worst property of natural language: it is paraphrase-rich. The same intent can be phrased a thousand ways, and the same phrase can be interpreted as a thousand intents. A traditional CLI is a parser; a promptable CLI is an interpreter that resolves an underspecified spec into a concrete plan, and then executes the plan with the user's full authority. Every wrapper that translates intent into commands is now sitting on top of an authority-expansion machine.

The Unix model gave you containment by limiting the cardinality of the command set. The promptable model gives you a command set whose cardinality is the size of the language. The blast radius is no longer bounded by the verbs the tool accepts; it is bounded by the imagination of whatever model parses the request.

RBAC was written for verbs, not intents

The IAM stack the industry built for the previous decade — RBAC, sudo, capability tokens, OPA policies, IAM conditions — assumes the principal makes a discrete action request and the policy engine answers yes or no. The grammar is (subject, verb, resource, condition) → allow | deny. It works because verbs are stable nouns: s3:GetObject, pods/delete, secrets/list. You can write a policy against the verb because the verb has a fixed meaning.

Natural language has none of those properties. "Help me clean up the staging environment" is not a verb; it is an underspecified intent that compiles, at runtime, into a sequence of verbs the policy engine has never seen as a unit. RBAC can tell you whether the agent is allowed to call pods/delete. It cannot tell you whether deleting these particular pods was within the scope of what the user asked for. The policy engine sees the leaves; the authorization question lives in the tree.
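
To make the gap concrete, here is a minimal sketch (all names hypothetical, not a real policy engine): a toy RBAC check applied leaf by leaf to a plan compiled from the "clean up staging" intent. Every individual call passes; the plan-level question never gets asked.

```python
# Toy RBAC engine: answers (subject, verb, resource) one call at a time.
# Illustrative only -- not a real policy API.

POLICY = {
    "deploy-bot": {"namespaces/list", "namespaces/delete", "pods/delete"},
}

def allow(subject: str, verb: str, resource: str) -> bool:
    """Classic RBAC: a yes/no answer about a single verb."""
    return verb in POLICY.get(subject, set())

# The plan the model compiled from "clean up the unused branches in staging".
plan = [
    ("namespaces/list", "staging/*"),
    ("namespaces/delete", "staging/branch-ci-42"),
    ("namespaces/delete", "staging/mobile-branch-int"),  # the long-lived env
]

# Every leaf is allowed. "Was deleting mobile-branch-int within the scope
# of what the user asked?" is a question no engine in this path can answer.
for verb, resource in plan:
    assert allow("deploy-bot", verb, resource)
```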

The 2026 framing emerging from the agent-security research community is that this gap deserves its own name: semantic privilege escalation. The Acuvity write-up describes it as the case where the agent stays within its formal permissions while exceeding the authority the user actually delegated. The arXiv work on "Prompt Flow Integrity" (2503.15547) makes the same point at the architecture level: per-call authorization is necessary but not sufficient, because the policy that matters is the policy on the plan, not the policy on each individual step. SUDO, the attack framework from the 2503.20279 paper, demonstrates the symmetric offensive case: a "Detox2tox" pipeline that rephrases a refused harmful request into a benign-looking sequence of authorized actions, then reassembles the harm at execution time. In each of these, the agent never asked the policy engine for anything it didn't already have.

This is why "give the deploy bot LLM access" is one PR away from "give the deploy bot full admin." The bot didn't get more permissions. The natural-language surface in front of those permissions just got large enough that any subset of them is now reachable through some phrasing.

Intent-bound tokens: sign the English, pin the plan

The architectural primitive that fits this problem is one the IAM industry has flirted with for years and is now pushing into actual deployment: an intent-bound token. The idea is straightforward and has antecedents in macaroons, biscuits, and the capability-token designs of the 2010s. The new wrinkle is that the bound thing is the user's English, not just a scope.

The mechanics (a code sketch follows the list):

  1. The user submits an intent in natural language: "scale the checkout service to handle the holiday traffic."
  2. The orchestrator compiles the intent into a concrete plan — a sequence of typed tool calls with resolved arguments. The plan is a deterministic artifact: it can be hashed, serialized, diffed.
  3. The system mints a short-lived token whose payload binds three things: the original prompt, the plan hash, and a scope clause that restricts the token to exactly the calls in the plan, with the resolved arguments.
  4. Each downstream tool re-validates the call against the token. A call that wasn't in the plan fails closed, even if the bot's broader credentials would have permitted it.
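
A minimal sketch of what minting and checking such a token could look like, assuming an HMAC-signed JSON payload and hypothetical helper names; a real deployment would reach for an established token format (macaroons, biscuits, signed JWTs) and a KMS-held key.

```python
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-key"  # placeholder; production keys live in a KMS

def mint_intent_token(prompt: str, plan: list[dict], ttl_s: int = 300) -> dict:
    """Bind the original English, the plan hash, and the exact allowed calls."""
    canonical = json.dumps(plan, sort_keys=True).encode()
    payload = {
        "prompt": prompt,                                    # the user's English
        "plan_hash": hashlib.sha256(canonical).hexdigest(),  # pin the plan
        "scope": plan,                                       # resolved calls + args
        "exp": time.time() + ttl_s,                          # short-lived
    }
    body = json.dumps(payload, sort_keys=True).encode()
    payload["sig"] = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return payload

def validate_call(token: dict, verb: str, args: dict) -> None:
    """Run by each downstream tool: re-check the call against the token."""
    body = json.dumps({k: v for k, v in token.items() if k != "sig"},
                      sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        raise PermissionError("token signature invalid")
    if time.time() > token["exp"]:
        raise PermissionError("token expired")
    if {"verb": verb, "args": args} not in token["scope"]:
        raise PermissionError(f"{verb} not in the authorized plan")  # fail closed
```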

The point is not the cryptography — the point is that authority becomes plan-shaped instead of role-shaped. A bot with full kubectl access plus an intent-bound token authorizing only scale deployment/checkout --replicas=N cannot pivot to delete namespace mid-execution, no matter how the plan evolves under prompt injection. The bot's baseline permissions are irrelevant during the lifespan of the token; the token has narrowed them to the resolved plan.
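
Continuing the sketch, the pivot fails closed even though the service account behind the bot could delete namespaces:

```python
plan = [{"verb": "scale", "args": {"deployment": "checkout", "replicas": 12}}]
token = mint_intent_token("scale the checkout service for holiday traffic", plan)

validate_call(token, "scale", {"deployment": "checkout", "replicas": 12})  # ok

try:
    validate_call(token, "delete_namespace", {"name": "staging"})
except PermissionError as e:
    print(f"blocked: {e}")  # the call was never in the plan, so it fails closed
```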

This is also where the IETF and NIST work on agent authentication has been heading. The draft-klrc-aiagent-auth track and the early Microsoft Entra Agent ID work both treat agent identity as something distinct from human or service identity, with the explicit goal of enabling per-task scoping. WorkOS's framing is the same in product terms: agents need authorization that anchors to the task session, not the long-lived agent identity.

Dry-run-and-confirm: making the compiler step visible

Intent-bound tokens fix the runtime problem. They don't fix the authorization problem — the human still has to say yes to the right plan. The pattern that does that work is dry-run-and-confirm, and it is more subtle than "show a diff and click OK."

The naive version: the agent assembles a plan, prints it, the user reads it, the user clicks confirm. This fails for a reason familiar to anyone who has watched a developer click through a Terraform plan: the cognitive load of reading a plan grows faster than the patience to read it, and after the third "looks fine" the human is rubber-stamping. The friction of approval is what gives it value, and that friction degrades with use.

The version that holds up under load has three properties.

First, the plan is the thing being authorized, not the prompt. The user typed English; the system shows back the resolved tool calls. If the resolved plan does not match what the user thought they were asking for, the discrepancy is the security signal. The mismatch is exactly what an intent-bound token would catch at execution; surfacing it pre-execution turns it into a UX affordance.

Second, the gates scale with reversibility, not with action count. The recently published taxonomy work on tool risk classes makes this point in the agent-tool context; it generalizes cleanly to promptable ops. Reversible internal actions can auto-confirm. Public-facing or irreversible actions need an explicit gate. A plan that mixes both should highlight the irreversible steps and require focused approval on those, not on the whole plan. Otherwise users habituate to "approve all" because most of the plan is boring.

Third, the dry-run is honest. Many "plans" produced by agent harnesses today are sketches that the agent revises mid-execution as tool calls return. If the executed sequence can diverge from the dry-run sequence, the dry-run was theater. Either the plan is binding (the agent commits to it and cannot deviate without re-authorization) or the dry-run is a lie that produces false confidence. Pick one.
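
A sketch of what "binding" could mean in practice, with hypothetical names throughout: approval produces a hash of the resolved plan, execution re-checks that hash before every step, and only irreversible steps demand focused confirmation. Any mid-run revision surfaces as a hash mismatch and forces re-authorization instead of silent divergence.

```python
import hashlib
import json

def plan_hash(plan: list[dict]) -> str:
    """The approved artifact is a hash of the resolved plan, not the prompt."""
    return hashlib.sha256(json.dumps(plan, sort_keys=True).encode()).hexdigest()

def approve(plan: list[dict], confirm_step) -> str:
    """Focused approval: only irreversible steps need explicit confirmation."""
    for step in plan:
        if not step.get("reversible", False) and not confirm_step(step):
            raise PermissionError(f"user rejected irreversible step: {step['verb']}")
    return plan_hash(plan)

def execute(plan: list[dict], approved_hash: str, run_step) -> None:
    for step in plan:
        # If the agent revised the plan after approval, the hash no longer
        # matches and execution stops until the new plan is re-approved.
        if plan_hash(plan) != approved_hash:
            raise PermissionError("plan diverged from approved dry-run; re-authorize")
        run_step(step)
```

Splitting approve from execute around a hash means the two can live in different processes, which is what keeps the dry-run from being theater.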

The audit trail records both the prompt and the action graph

Audit logs designed for traditional ops capture what was executed and by whom. Audit logs for promptable infrastructure have to capture two things in a linked pair: the prompt that came in, and the action graph that came out. Without the prompt, you cannot reconstruct intent; without the graph, you cannot reconstruct what the system did with that intent. The pair is what makes incident response tractable.

The Agent-Sentry work (arXiv 2603.22868) and the broader execution-provenance line of research are converging on a consistent shape: each agent action is a node in a graph, with edges that carry data provenance — which inputs flowed into which outputs, which untrusted source a particular field originated from, which credentials authorized which call. Open standards like W3C PROV are being repurposed for this in the workflow-provenance papers. The 2026 LLM observability tooling (Arize, Confident AI, Portkey) has started shipping the trace half of this; the provenance half is still mostly research, but the direction is clear.
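
A minimal shape for the linked pair, with hypothetical field names; the provenance edges follow the node-and-edge framing of the graph work cited above.

```python
from dataclasses import dataclass, field

@dataclass
class ActionNode:
    call: str               # e.g. "namespaces/delete"
    args: dict
    authorized_by: str      # which clause of which token allowed this call
    inputs_from: list[str] = field(default_factory=list)
    # provenance edges: ids of earlier nodes, or untrusted-source labels

@dataclass
class AuditRecord:
    prompt: str                # the English that came in
    plan_hash: str             # binds the record to the authorized plan
    actions: list[ActionNode]  # the action graph that came out

record = AuditRecord(
    prompt="clean up the unused branches in staging",
    plan_hash="sha256:...",
    actions=[ActionNode(
        call="namespaces/delete",
        args={"name": "staging/branch-ci-42"},
        authorized_by="scope[1]",
        inputs_from=["prompt", "namespaces/list#0"],  # provenance of the name field
    )],
)
```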

The operational consequence is that "this bot is misbehaving" stops being a forensic mystery. You can answer what intent led to this action, which downstream call was authorized by which clause of which token, and which input field carried the data that flipped the agent's plan into the harmful branch. The Sysdig report on the November 2025 cloud intrusion that achieved admin in under ten minutes is the canonical recent example of why this matters: when LLM-assisted attackers operate at machine speed, post-hoc reconstruction has to be machine-fast too.

The org reality you have to plan for

The architecture is the easy part. The org reality is the part that ships agents to production with admin permissions.

A fictional but representative sequence: a platform team writes a wrapper that lets engineers ask Claude to run common kubectl commands. The wrapper holds a service account with edit permissions on a few namespaces. Adoption is good. Someone files a ticket asking the wrapper to also handle Helm operations; the team broadens the service account. Someone else asks for cross-namespace queries; another grant. Six months in, the wrapper holds cluster-admin because every individual broadening was justified, and no one held the line that adding a natural-language frontend to a credential is a privilege grant in itself.

The lesson the OWASP Gen AI incident roundup and the AquilaX privilege-escalation analyses keep repeating, in different words: the promptable surface is the privilege. When you give an LLM access to a credential, you have not granted it the credential's stated permissions; you have granted it the closure of those permissions under any phrasing of any intent the model can generate. Treat that closure as the actual scope on the audit-and-approval review, not the nominal scope on the IAM page.

A handful of operating principles that follow from this:

  • The scope of an LLM-fronted credential is the credential's permission set, not your intended use of it. Right-size the credential to the worst case, not the typical case.
  • Sub-agents with narrow tools beat one agent with many tools. Snyk's guardrails framing and the OWASP Agent Security cheat sheet both push this: the composition graph is harder to attack when the nodes can't reach each other.
  • An LLM frontend on a tool is a security review event. It does not inherit the previous review of the underlying tool, because the threat model changed when the input language changed.
  • The dry-run is the contract. If your wrapper cannot show the user a plan it will be bound to, you do not have an authorization model — you have a hope.

What this changes about how you build

The work for the next year is to stop treating the LLM in front of the CLI as a UX flourish and start treating it as a new permission boundary that sits in front of every credential it touches. The IAM primitives need to evolve: intent-bound tokens that scope to a plan, dry-run protocols that are binding rather than illustrative, audit trails that capture the prompt-to-action pair as a unit, and reviews that recognize the addition of a natural-language frontend as a meaningful change in attack surface.

The Unix philosophy worked because authority was visible at the syntactic surface. Promptable infrastructure breaks that because the syntactic surface is now arbitrary English. The recovery move is not to abandon the philosophy — it is to push the containment one layer deeper, so the bounded thing is no longer the command but the resolved plan that the command compiled to. The CLI can speak English. The credential underneath should still answer in verbs.
