Skip to main content

101 posts tagged with "security"

View all tags

The Reply-All That Wasn't: Agent Outbound Fan-Out Hazards

· 9 min read
Tian Pan
Software Engineer

The user asked the agent to "let Karen know we're done." The agent called send_email with the recipient field set to karen-team@, the most plausible address its contact-lookup tool returned. The message — three paragraphs of internal-only project status, including a candid line about a customer's renewal risk — landed in forty inboxes. One of those inboxes belonged to the customer in question. The postmortem ran for two weeks.

There was no prompt injection. There was no model jailbreak. The tool worked exactly as specified. The contract the team wrote for send_email was "send a message to a recipient." The contract the world enforces is "broadcast to a group whose composition the sender did not audit." That gap — between what the tool is named and what the tool can actually do — is where most outbound agent incidents live.

Email is the obvious example, but the same hazard hides in every messaging tool an agent ever touches. The thirty years of muscle memory humans built for these channels did not transfer to the planner pattern-matching its way through a contact list.

The Shadow AI Governance Problem: Why Banning Personal AI Accounts Makes Security Worse

· 9 min read
Tian Pan
Software Engineer

Workers at 90% of companies are using personal AI accounts — ChatGPT, Claude, Gemini — to do their jobs, and 73.8% of those accounts are non-corporate. Meanwhile, 57% of employees using unapproved AI tools are sharing sensitive information with them: customer data, internal documents, code, legal drafts. Most executives believe their policies protect against this. The data says only 14.4% actually have full security approval for the AI their teams deploy.

The gap between what leadership believes is happening and what is actually happening is the shadow AI governance problem.

The instinct at most companies is to respond with a ban. Block personal chatbot accounts at the network level, issue a policy memo, run an annual training, and call it governance. This is the worst possible response — not because the concern is wrong, but because the intervention makes the problem invisible without making it smaller.

The SIEM Bill Your AI Feature Forgot to Include

· 10 min read
Tian Pan
Software Engineer

The math is simple and nobody did it. Pre-AI, a single user action — "summarize this ticket," "send this email" — produced one application log line. Post-AI, the same action emits a request log, an LLM call trace, a tool-invocation span for each tool the agent called, a retrieval span per chunk it read, a response log, and an eval log if you sample for offline scoring. The fan-out for one user click is now 30 to 50 records on the floor of your observability pipeline, and that's before retries, before sub-agents, before the planner-executor split that 2x's everything again.

You shipped an AI feature in Q1. In Q2, your security director walks into a budget review with a Splunk renewal that's 4x higher than last cycle. Nobody on the AI team is in the room. The conversation that happens next — about who owns the cost, why the threat-detection rules stopped working, and whether legal hold on every conversation is actually mandatory — is a conversation you should have had at design time and didn't, because the cost didn't show up on the LLM invoice. It showed up downstream, in a tool the AI team has never logged into.

Tool Schema Design Is Your Blast Radius: When Function Definitions Become Security Boundaries

· 10 min read
Tian Pan
Software Engineer

The most dangerous file in your agent codebase is the one you've been writing as if it were API documentation. The tool registry — that JSON or Pydantic schema that tells the model what functions exist and what arguments they take — is no longer a docstring. It is your authorization layer. And if you designed it the way most teams do, you handed the LLM a master key and called it good engineering.

Consider the canonical first cut at a tool: query_database(sql: string). The intent is reasonable — let the model formulate the right SQL for the user's question. The reality is that the model is now an untrusted client with unlimited DDL and DML rights to whatever database the connection string points at. The system prompt that says "only run SELECTs on the orders table" is a suggestion, not a control. When a prompt-injected tool result — an email body, a webpage, a PDF — tells the model to run DROP TABLE users, your authorization model is the model's instruction-following discipline. That is not authorization. That is hope.

Agent IAM Is Not Service IAM: Why OAuth Breaks When Intent Is Constructed at Runtime

· 12 min read
Tian Pan
Software Engineer

The bearer token model has one assumption that agents quietly violate: the caller knows what they want when they ask. OAuth scopes, IAM roles, and API keys are all designed around a principal whose intent is fixed before authentication begins. Your CI runner has stable intent. Your microservice has stable intent. An agent does not. An agent's intent is assembled at request time out of a user prompt, a system prompt, retrieved documents, and the outputs of tools that may themselves have been written by an attacker. By the time the agent reaches for a token, the policy decision that the IAM layer has to make has already been made — by inputs the IAM layer never saw.

This is why the same auth pattern that has worked for fifteen years of service-to-service traffic is now producing a class of incidents nobody has good language for. A prompt injection lifts a long-lived bearer token. An agent "remembers" a permission across sessions because the token outlived the user's intent. A multi-step task that legitimately needs three scopes holds all of them for the entire session instead of acquiring and releasing them per step. None of these are OAuth bugs in the strict sense. They are consequences of stretching a model that assumes static intent to cover a caller whose intent is reconstructed every turn.

The Air-Gapped LLM Blueprint: What Egress-Free Deployments Actually Need

· 11 min read
Tian Pan
Software Engineer

The cloud AI playbook assumes one primitive that nobody writes down: outbound HTTPS. Vendor APIs, hosted judges, telemetry pipelines, model registries, vector stores, dashboard SaaS, secret managers — every one of them quietly resolves to a domain on the public internet. Pull that one cable and the stack does not degrade gracefully. It collapses.

That is the moment most teams discover their architecture has an egress dependency they never accounted for. A "small" prompt update needs to call out to a hosted classifier. The eval suite hits an LLM judge over the wire. The observability agent phones home. The model registry pulls weights from a CDN. None of it is malicious, and none of it is unusual. It is just what the cloud-native stack looks like when you stop noticing the cable.

Agent Credential Blast Radius: The Principal Class Your IAM Model Never Enumerated

· 11 min read
Tian Pan
Software Engineer

The security org spent a decade killing off the "service account that can do everything." Scoped tokens, short-lived credentials, JIT access, per-action audit — the whole least-privilege playbook landed and stuck. Then the AI team wired up an agent, the prompt asked for a tool catalog, and the engineer requested the broadest OAuth scope the platform would issue. The deprecated pattern is back, wearing new clothes, and this time the principal calling the API is a stochastic loop nobody is sure how to scope.

The agent has read-write on the calendar, the file store, the CRM, and the deploy pipeline because the API surface couldn't be enumerated up front. The token is long-lived because no one wired the refresh path. The audit log records the bearer, not the action. And IAM owns human and service identity, the platform team owns workload identity, the AI team owns the agent's effective permissions, and the union of those three sets is owned by no one.

The Agent Permission Prompt Has a Habituation Curve, and Your Safety Story Lives on Its Slope

· 10 min read
Tian Pan
Software Engineer

There is a number that should be on every agent product's safety dashboard, and almost nobody tracks it: the per-user approval rate over time. Ship a permission prompt for "may I send this email" or "may I run this query against production," and the curve goes the same way every time. Day one, users hesitate, read, sometimes click no. By week two, the prompt is the fifth one this hour, the cost of saying no is doing the work yourself, and the click-through rate converges to something north of 95%. The team's safety story still claims that the user approved every action. The user, in any meaningful cognitive sense, did not.

This is not a UX problem that better copy can fix. It is the same habituation phenomenon that flattened cookie banners, browser SSL warnings, and Windows UAC dialogs, applied to a substrate that operates orders of magnitude faster than any of those. A consent gate is a security control with a half-life. Ship it without measuring how fast it decays, and you ship a checkbox the user is trained to ignore by week two — and a compliance narrative that depends on a click that no longer means anything.

The Customer Record Hiding in Your Few-Shot Prompt Template

· 11 min read
Tian Pan
Software Engineer

The privacy auditor's question came two days before the SOC 2 renewal: "Why is the email field in your onboarding prompt's example a real customer address?" The product team rebuilt the chain in their heads. A year earlier, when they shipped the AI summarizer, someone needed a "see how this works" example for the few-shot template. They picked a representative customer record from staging, scrubbed the obvious fields — name, account ID, phone — and committed the file. The customer churned six months later. Their record was deleted from the database per the data retention policy. Their record was not deleted from the prompt template, which had been shipped to every tenant in production.

The team had assumed, like most teams, that the privacy boundary was the database. The prompt template was code. Code goes through review. Review doesn't flag PII because reviewers aren't looking for it in YAML strings labeled example_input:. The DLP scanner that catches PII in Slack messages and email attachments doesn't scan committed code, and even if it did, it wouldn't recognize a partially-scrubbed customer record as personal data because the fields it knew to look for had been removed. Everything that remained — the company size, the industry, the rare job title, the specific city — was data the scanner had no rule for.

Agent Traffic Is Not Human Traffic: Designing APIs for Two Species of Caller

· 11 min read
Tian Pan
Software Engineer

The API you shipped two years ago was designed for a single species of caller: a person, behind a browser or a mobile client, clicking once and waiting for a response. That assumption is now wrong on roughly half of every interesting endpoint. The other half of the traffic is agents — your own, your customers', third-party integrations using your endpoints as tools — and they have different physics. They burst. They retry forever. They parallelize. They parse error strings literally. They act on behalf of a human who will not be available to clarify intent when something breaks.

Most of the production weirdness landing in postmortems this year traces back to one architectural mistake: treating both species as the same caller class. Rate limits sized for human pacing get blown apart by an agent's parallel fanout. Error messages designed to be human-readable get parsed wrong by an agent that retries forever on a 400. Idempotency assumptions that humans satisfy by default get violated when an agent retries the same payload from a recovered checkpoint. Auth logs lose the ability to distinguish "the user did this" from "the user's agent did this on the user's behalf."

The fix is not a smarter WAF or a bigger rate-limit bucket. It is a deliberate API design that names two caller classes, treats their traffic as different shapes, and records the delegation chain so accountability survives the indirection.

Your Agent Has Two Release Pipelines, Not One

· 10 min read
Tian Pan
Software Engineer

A team I worked with shipped a "small prompt tweak" on a Wednesday afternoon. The same PR also added one new tool to the agent's registry — a convenience wrapper around an internal admin API that the prompt would now occasionally invoke. The eval suite passed. The canary looked clean. By Thursday morning a customer's billing record had been mutated by an agent acting on a prompt-injected support ticket, the audit trail showed the admin tool firing exactly as designed, and the on-call engineer's first instinct — roll back the prompt — did nothing useful, because the credential had already been used and the row had already been written.

The post-mortem framed it as a security review failure. It wasn't. It was a release-pipeline failure. The team had shipped two completely different asset classes — a behavioral nudge to the model and a new authority granted to the agent — through the same review, the same gate, and the same rollback story, as if they were the same kind of change. They aren't. And once you see them as two pipelines, most "agent governance" debates become much less mysterious.

The Ghost Employee in Your Audit Log: Agents With Borrowed Credentials Break IAM

· 10 min read
Tian Pan
Software Engineer

Pull up your SSO logs from this morning. Every Slack message, every GitHub PR, every calendar invite, every CI run, every Jira comment your AI agent produced — they all show the same thing the human-typed events show: a person's name, a session token, a green "successful authentication" line. Forensically, you have no way to tell which actions came from a human and which came from an agent the human launched and walked away from. That is the ghost employee problem, and almost every team that shipped agents in the last twelve months has it.

The shortcut that creates the problem is structural, not negligent. When you wire an agent into a tool, the easiest credential is the one already in the engineer's environment — their personal access token, their OAuth session, their device-bound SSO cookie. The alternative is a platform project: provision a first-class identity, federate it across every downstream service, wire it into the audit pipeline, build per-instance revocation. None of that ships in a sprint, and none of it shows up on a feature roadmap. So the agent borrows.