
79 posts tagged with "security"


Policy-as-Code for Agents: OPA, Rego, and the Decision Point Your Tool Loop Doesn't Have

· 12 min read
Tian Pan
Software Engineer

The first time a regulator asks you to prove that your support agent did not access a Tier-2 customer's billing record on March 14th, you discover an unpleasant truth about your authorization architecture: the system prompt said "do not access billing for Tier-2," the YAML tool manifest said tools: [search_orders, refund_order, get_billing], and somewhere in between, the model decided. There is no record of a decision, because no decision point existed. Whether the agent did the right thing is not auditable, only inferable from logs of what happened.

This is the part of agent engineering that nobody put on the architecture diagram. Tool permissions today still live in a YAML file edited by whoever spawned the agent, surfaced to the model through a system prompt that describes intent, and enforced — when it is enforced at all — by application code wrapping each tool call in an if user.tier == "premium" check. As tool catalogs cross fifty entries and conditions multiply across tenants, data classes, and user roles, that hand-rolled lattice stops scaling, and the system prompt stops being a reliable enforcement surface. The model is not your authorization layer, even when it acts like one.

What is replacing it is policy-as-code: a dedicated policy engine — OPA with Rego, AWS Cedar, or a similar declarative tool — sitting in front of every tool call as a Policy Decision Point. The engine answers a single question per call: given this principal, this tool, these arguments, this context, is the action allowed? The agent runtime never gets to vote. This post is about what that architecture looks like in practice, and the four problems it solves that no amount of prompt engineering can.
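To make the decision point concrete, here is a minimal sketch of the check from the agent runtime's side, assuming an OPA instance on localhost:8181 with a policy loaded under a hypothetical agent/authz package. The tool names, principal fields, and fail-closed behavior are illustrative choices, not the only way to wire it.

```python
# Minimal Policy Decision Point client: the agent runtime asks OPA before
# every tool call and never decides authorization on its own.
# Assumes OPA running locally with a policy under a hypothetical
# "agent.authz" package exposing an "allow" rule.
import requests

OPA_URL = "http://localhost:8181/v1/data/agent/authz/allow"  # assumed deployment

def is_allowed(principal: dict, tool: str, arguments: dict, context: dict) -> bool:
    """Ask the policy engine: may this principal invoke this tool with these args?"""
    decision = requests.post(
        OPA_URL,
        json={"input": {
            "principal": principal,   # e.g. {"id": "u123", "tier": "free", "roles": [...]}
            "tool": tool,             # e.g. "get_billing"
            "arguments": arguments,   # the exact arguments the model proposed
            "context": context,       # tenant, data class, session id, time
        }},
        timeout=2,
    )
    decision.raise_for_status()
    # Fail closed: if the policy has no opinion, the call does not happen.
    return decision.json().get("result", False) is True

def call_tool(principal, tool, arguments, context, registry):
    if not is_allowed(principal, tool, arguments, context):
        # The denial itself is the audit record the regulator asks about later.
        raise PermissionError(f"policy denied {tool} for {principal['id']}")
    return registry[tool](**arguments)
```

The structural point is that the allow/deny answer, and therefore the audit record, exists outside the model's context window.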

The MCP Server Graveyard: When Your Agent's Dependencies Stop Shipping

· 10 min read
Tian Pan
Software Engineer

The last commit to the MCP server your agent calls every five minutes was eight months ago. The upstream API it wraps rolled out a new authentication model in February. There are 47 open issues, 12 of them flagged security. The maintainer's GitHub account hasn't shown activity since October. Your agent still connects, still receives tool descriptions, still executes calls — and silently, every one of those calls flows through a piece of infrastructure that nobody is watching.

This is the shape of MCP abandonment. Not a malicious rug pull, not a compromised package, just neglect. Somebody published a useful server in 2025, got adopted, then moved on. The server kept working because nothing forced it to break. Until it does — and by then, the trust boundary your agent was crossing every five minutes has already failed.

Most teams adopted community MCP servers the way they adopted npm packages: by running install and reading the README. That mental model makes sense for libraries that sit in your dependency tree, get audited at build time, and surface their deprecations through your package manager. It does not survive contact with MCP, where the dependency is a live trust boundary that the LLM invokes in a loop, with credentials, on production data.

Your OAuth Tokens Expire Mid-Task: The Silent Failure Mode of Long-Running Agents

· 11 min read
Tian Pan
Software Engineer

The first time a production agent runs for forty minutes and hits a 401 on step 27 of 40, the incident review is almost always the same. Someone in the room asks why the token wasn't refreshed. Someone else points out that the refresh logic exists, but it lives in the HTTP client the agent's tool wrapper was never wired into. A third person notices that even if the refresh had fired, two of the agent's parallel tool calls would have tried to rotate the same refresh token at the same instant and blown up the session anyway. Everyone nods. Then the team spends the next week retrofitting credential lifecycle into an architecture that assumed requests finish in 800 milliseconds.

OAuth was designed for a world where an access token outlives the request that uses it. Long-running agents inverted that assumption. The request — really, a chain of tens or hundreds of tool calls orchestrated across minutes or hours — now outlives the token. The industry spent a decade building libraries, proxies, and refresh flows around the short-request assumption, and almost none of it transplants cleanly to agent loops.
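A sketch of the minimum viable correction, under the assumption that parallel tool calls in one session share a single refresh token: fetch the token through a manager at call time, refresh behind a lock, and re-check expiry after acquiring the lock so only one caller rotates. The refresh_grant helper is a placeholder for your provider's token endpoint, not a real library call.

```python
# Sketch: a token manager that long-running tool loops share.
# Assumptions: one refresh token per session, parallel tool calls in threads,
# and a hypothetical refresh_grant() that exchanges it at the provider.
import threading
import time

class TokenManager:
    def __init__(self, access_token: str, refresh_token: str, expires_at: float):
        self._access = access_token
        self._refresh = refresh_token
        self._expires_at = expires_at
        self._lock = threading.Lock()

    def get_token(self, skew: float = 60.0) -> str:
        """Return a token valid for at least `skew` more seconds, refreshing if needed."""
        if time.time() < self._expires_at - skew:
            return self._access
        with self._lock:
            # Re-check after acquiring the lock: another tool call may have
            # already rotated the token while we were waiting.
            if time.time() < self._expires_at - skew:
                return self._access
            new = refresh_grant(self._refresh)  # hypothetical provider exchange
            self._access = new["access_token"]
            self._refresh = new.get("refresh_token", self._refresh)
            self._expires_at = time.time() + new["expires_in"]
            return self._access

def refresh_grant(refresh_token: str) -> dict:
    # Placeholder for the real OAuth token-endpoint call.
    raise NotImplementedError

# Every tool wrapper asks for the token at call time, never at loop start:
#   headers = {"Authorization": f"Bearer {manager.get_token()}"}
```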

Your Planner Knows About Tools Your User Can't Call

· 9 min read
Tian Pan
Software Engineer

A free-tier user opens your support chat and asks, "Can you issue a refund for order #4821?" Your agent replies, "I'm not able to issue refunds — that's a manager-only action. You could escalate via the dashboard, or I can transfer you." The refusal is correct. The ACL at the refund tool is correct. And you have just told an anonymous user that a tool named issue_refund exists, that it is gated by a role called manager, and that your platform accepts order IDs of the shape #NNNN.

Your planner knows about tools your user can't call. That asymmetry — full catalog visible to the reasoning layer, partial catalog executable at the action layer — is where most agent authorization goes quietly wrong. ABAC at the tool boundary catches the unauthorized invocation. It doesn't catch the capability disclosure that already happened one token earlier, in the plan, the refusal, or the "helpful" suggestion of a workaround.
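One way to close the disclosure gap, sketched with illustrative names: resolve the principal's entitlements first and build the only tool catalog the planner ever sees from that, so an unauthorized tool is absent from the plan rather than refused in the reply.

```python
# Sketch: the planner only sees tools the current principal may call.
# can_invoke() stands in for whatever your authorization check is
# (ABAC engine, policy engine, entitlement service).
from dataclasses import dataclass

@dataclass
class ToolSpec:
    name: str
    description: str
    required_role: str  # illustrative gating attribute

CATALOG = [
    ToolSpec("search_orders", "Look up a customer's orders", "support"),
    ToolSpec("issue_refund", "Refund an order", "manager"),
]

def can_invoke(principal: dict, tool: ToolSpec) -> bool:
    return tool.required_role in principal.get("roles", [])

def visible_catalog(principal: dict) -> list[ToolSpec]:
    """Build the tool list handed to the planner: deny means absent, not refused."""
    return [t for t in CATALOG if can_invoke(principal, t)]

# A free-tier principal never learns issue_refund exists, so the model
# cannot name it, refuse it, or suggest a workaround for it.
print([t.name for t in visible_catalog({"roles": ["support"]})])
```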

Semantic Cache Is a Safety Problem, Not a Perf Win

· 12 min read
Tian Pan
Software Engineer

A semantic cache hit is the only LLM optimization that can serve the wrong answer to the wrong user in under a millisecond. SQL caches return your row or someone else's because somebody wrote a bad join — the failure mode is a query bug. Semantic caches return another tenant's response because two embeddings landed within 0.03 cosine of each other, which is the system working exactly as designed. The cache is doing its job. The job is the problem.

Most teams ship semantic caching as a cost initiative — there's a "70% bill reduction" deck floating around every AI engineering Slack — and review the cache key the way they'd review a Redis TTL: not at all. That review goes to the perf team. The safety team never sees the design doc because nobody filed a security review for "we added a faster path." Six months later somebody's compliance audit finds that "I can't log into my account, my email is [email protected]" and "I can't log into my account, my email is [email protected]" both vectorized within threshold of "I can't log into my account," and the cache cheerfully served Bob the response originally generated for Jane, including the password reset link her account had asked for.

This post is about why semantic caches deserve the same review rigor as SQL predicates, the cache-key design that prevents cross-user leaks by construction, and the audit trail you need to distinguish "cache hit served the right answer" from "cache hit served someone else's answer at sub-millisecond latency."
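As a preview of that key design, here is a sketch in which the tenant and user are part of the cache namespace by construction, so similarity search never runs across a trust boundary no matter how close two embeddings land. The embedding function and vector store are stubs, not any particular vendor's API.

```python
# Sketch: a semantic cache whose namespace is scoped per tenant and user,
# so nearest-neighbor lookup only ever runs inside one principal's entries.
import hashlib

class ScopedSemanticCache:
    def __init__(self, store, threshold: float = 0.95):
        self._store = store          # any vector store that supports namespaces
        self._threshold = threshold

    @staticmethod
    def _namespace(tenant_id: str, user_id: str) -> str:
        # The scope is part of the key, not a post-hoc filter someone can forget.
        return hashlib.sha256(f"{tenant_id}:{user_id}".encode()).hexdigest()

    def get(self, tenant_id: str, user_id: str, query: str):
        ns = self._namespace(tenant_id, user_id)
        hit = self._store.nearest(ns, embed(query))
        if hit and hit.similarity >= self._threshold:
            return hit.response
        return None

    def put(self, tenant_id: str, user_id: str, query: str, response: str):
        ns = self._namespace(tenant_id, user_id)
        self._store.insert(ns, embed(query), response)

def embed(text: str):
    raise NotImplementedError  # your embedding model goes here
```

The trade-off is explicit: a per-user namespace gives up cross-user hit rates in exchange for making the Bob-gets-Jane's-answer failure impossible rather than unlikely.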

Token Spend Is a Security Signal Your SOC Isn't Watching

· 10 min read
Tian Pan
Software Engineer

The fastest-moving breach signal in your stack isn't in your SIEM. It's in a spreadsheet someone in finance opens on the first of the month. When an attacker steals an LLM API key, exploits a prompt injection to exfiltrate data, or rides a compromised tenant session to query an adjacent customer's memory, the footprint shows up first as a token-usage anomaly — long before any DLP rule fires, any auth alert trips, or any endpoint agent notices something weird. Billing sees it. Security doesn't.

That gap is not theoretical. Sysdig's threat research team coined "LLMjacking" after watching attackers rack up five-figure daily bills on stolen cloud credentials, and the category has since matured into an organized criminal industry with $30-per-account marketplaces and documented campaigns pushing victim costs past $100,000 per day. OWASP catalogued a startup that ate a $200,000 bill in 48 hours from a leaked key. A Stanford research group burned $9,200 in 12 hours on a forgotten token in a Jupyter notebook. The common thread in every one of these incidents: the billing graph told the story hours or days before anyone in security noticed.
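What moving that signal into security monitoring can look like, as a rough sketch: emit per-principal token usage as an hourly metric and alert when the latest hour exceeds a multiple of that principal's own baseline. The window and spike factor here are illustrative, not tuned detection thresholds.

```python
# Sketch: a rolling per-principal baseline over token usage, with an alert
# when the hour just ended exceeds a multiple of that principal's recent norm.
from collections import defaultdict, deque

class TokenSpendMonitor:
    def __init__(self, window_hours: int = 24, spike_factor: float = 5.0):
        self._hourly = defaultdict(lambda: deque(maxlen=window_hours))
        self._current = defaultdict(int)
        self._spike_factor = spike_factor

    def record(self, principal_id: str, tokens: int):
        self._current[principal_id] += tokens

    def roll_hour(self, alert):
        """Call once per hour: compare the hour just ended to the baseline."""
        for principal, used in self._current.items():
            history = self._hourly[principal]
            baseline = (sum(history) / len(history)) if history else None
            if baseline and used > baseline * self._spike_factor:
                alert(principal, used, baseline)  # page security, not finance
            history.append(used)
        self._current.clear()

monitor = TokenSpendMonitor()
for _ in range(6):                          # seed a quiet baseline
    monitor.record("api-key-7f3", 50_000)
    monitor.roll_hour(lambda *a: None)
monitor.record("api-key-7f3", 1_200_000)    # LLMjacking-sized spike
monitor.roll_hour(lambda p, used, base: print(f"ALERT {p}: {used} vs baseline {base:.0f}"))
```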

AI Compliance Infrastructure for Regulated Industries: What LLM Frameworks Don't Give You

· 10 min read
Tian Pan
Software Engineer

Most teams deploying LLMs in regulated industries discover their compliance gap the hard way: the auditors show up and ask for a complete log of which documents informed which outputs on a specific date, and there is no answer to give. Not because the system wasn't logging — it was — but because text logs of LLM calls aren't the same thing as a tamper-evident audit trail, and an LLM API response body isn't the same thing as output lineage.

Finance, healthcare, and legal are not simply "stricter" versions of consumer software. They require infrastructure primitives that general-purpose LLM frameworks were never designed to provide: immutable event chains, per-output provenance, refusal disposition records, and structured explainability hooks. None of the popular orchestration frameworks give you these out of the box. This article describes the architecture gap and how to close it without rebuilding your entire stack.
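To ground one of those primitives, here is a sketch of a tamper-evident event chain: every record carries the hash of the one before it, so editing or deleting any entry breaks verification from that point forward. Field names are illustrative; the chaining is the point.

```python
# Sketch: an append-only, hash-chained audit log. Any edit or deletion of an
# earlier event invalidates every hash after it, which is what makes the
# trail tamper-evident rather than merely a log.
import hashlib
import json
import time

class AuditChain:
    def __init__(self):
        self._events = []
        self._last_hash = "0" * 64  # genesis value

    def append(self, event: dict) -> dict:
        record = {
            "ts": time.time(),
            "prev_hash": self._last_hash,
            "event": event,  # e.g. {"output_id": ..., "source_doc_ids": [...], "model": ...}
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["hash"] = hashlib.sha256(payload).hexdigest()
        self._last_hash = record["hash"]
        self._events.append(record)
        return record

    def verify(self) -> bool:
        prev = "0" * 64
        for r in self._events:
            body = {k: r[k] for k in ("ts", "prev_hash", "event")}
            if r["prev_hash"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != r["hash"]:
                return False
            prev = r["hash"]
        return True

chain = AuditChain()
chain.append({"output_id": "o-1", "source_doc_ids": ["d-17", "d-42"], "disposition": "answered"})
assert chain.verify()
```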

Multi-User AI Sessions: The Context Ownership Problem Nobody Designs For

· 9 min read
Tian Pan
Software Engineer

In August 2024, security researchers discovered that Slack AI would pull both public and private channel content into the same context window when answering a query. An attacker in a public channel could craft a message that, when ingested by Slack AI, would inject instructions into a victim's session — and since Slack AI doesn't cite its sources, the resulting data exfiltration was nearly untraceable. The attack could leak API keys embedded in private DMs. Slack patched it after responsible disclosure.

This wasn't a bug in the traditional sense. It was a consequence of treating context as a shared mutable resource with no per-user access control. And it's a mistake that most teams building shared AI assistants are making right now, just more quietly.
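The structural fix is to give every retrieved context item an owner and an access check, and to filter on the requesting user before anything reaches the prompt. A minimal sketch, with made-up field names:

```python
# Sketch: context assembly that enforces per-user access control before the
# prompt is built. Field names (allowed_user_ids, public) are illustrative.
from dataclasses import dataclass, field

@dataclass
class ContextItem:
    text: str
    source: str                               # e.g. channel or document id
    allowed_user_ids: set[str] = field(default_factory=set)
    public: bool = False

def assemble_context(items: list[ContextItem], requesting_user: str) -> list[ContextItem]:
    """Only context the requesting user could read directly may enter their session."""
    return [
        it for it in items
        if it.public or requesting_user in it.allowed_user_ids
    ]

retrieved = [
    ContextItem("Q3 launch date is May 2", source="#general", public=True),
    ContextItem("prod API key: sk-...", source="DM:alice", allowed_user_ids={"alice"}),
]
# Bob's session never sees Alice's private DM, however highly the retriever scored it.
print([c.source for c in assemble_context(retrieved, "bob")])
```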

Privacy-Preserving Inference in Practice: The Spectrum Between Cloud APIs and On-Prem

· 9 min read
Tian Pan
Software Engineer

Most teams treat LLM privacy as a binary: either you send data to the cloud and accept the risk, or you run everything on-prem and accept the cost. Both framings are wrong. In practice, there is a spectrum of approaches with very different risk profiles and engineering budgets — and most teams are operating at the wrong point on that spectrum without realizing it.

Researchers recently demonstrated they could extract authentic PII from 3,912 individuals at a cost of $0.012 per record with a 48.9% success rate. That statistic tends to get dismissed as academic threat modeling until a security audit or compliance review lands on your desk. The question isn't whether to care about LLM privacy; it's which controls actually move the needle and how much each one costs to implement.

RBAC Is Not Enough for AI Agents: A Practical Authorization Model

· 11 min read
Tian Pan
Software Engineer

Most teams building AI agents today treat authorization as an afterthought. They wire up an OAuth token, give the agent the same scopes as the human user who triggered it, and call it done. Then, months later, they discover that a manipulated prompt caused the agent to exfiltrate files, or that a compromised workflow had been silently escalating privileges across connected services.

The problem is not that RBAC is bad. It is that RBAC was designed for humans with stable job functions, and AI agents are neither stable nor human. An agent's "role" can shift from read-only research to write-capable code execution within a single conversation turn. Static roles cannot express this, and the mismatch creates a predictable vulnerability surface.
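One alternative shape, sketched with illustrative names: the agent holds no standing role at all, and each task phase asks for a short-lived capability grant naming the specific actions and resources it needs, checked on every tool call.

```python
# Sketch: task-scoped capability grants instead of a static agent role.
# Grants are narrow (specific actions, specific resources) and expire with
# the task phase that requested them. Names are illustrative.
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class Grant:
    actions: frozenset       # e.g. {"repo:read"}
    resources: frozenset     # e.g. {"repo:acme/site"}
    expires_at: float

def authorize(grants: list[Grant], action: str, resource: str) -> bool:
    now = time.time()
    return any(
        action in g.actions and resource in g.resources and now < g.expires_at
        for g in grants
    )

# Research phase: read-only grant. The later "write code" phase would need a
# separate, explicit grant rather than inheriting the user's full scopes.
grants = [Grant(frozenset({"repo:read"}), frozenset({"repo:acme/site"}), time.time() + 300)]
assert authorize(grants, "repo:read", "repo:acme/site")
assert not authorize(grants, "repo:write", "repo:acme/site")
```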

AI Content Provenance in Production: C2PA, Audit Trails, and the Compliance Deadline Engineers Are Missing

· 12 min read
Tian Pan
Software Engineer

When the EU AI Act's transparency obligations take effect on August 2, 2026, every system that generates synthetic content for EU-resident users will need to mark that content with machine-readable provenance. Most engineering teams building AI products are vaguely aware of this. Far fewer have actually stood up the infrastructure to comply — and of those that have, a substantial fraction have implemented only part of what regulators require.

The dominant technical response to "AI content provenance" has been to point at C2PA (the Coalition for Content Provenance and Authenticity standard) and declare the problem solved. C2PA is important. It's real, it's being adopted by Adobe, Google, OpenAI, Sony, and Samsung, and it's the closest thing to a universal standard the industry has. But a C2PA implementation alone will not satisfy EU AI Act Article 50. It won't survive your CDN. And it won't prevent bad actors from producing "trusted" provenance for manipulated content.

This post is about what AI content provenance actually requires in production — the technical stack, the failure modes, and the compliance gaps that catch teams off guard.

The AI Output Copyright Trap: What Engineers Need to Know Before It's a Legal Problem

· 11 min read
Tian Pan
Software Engineer

When a large language model reproduces copyrighted text verbatim in response to a user prompt, who is legally responsible — the model provider, your company that built the product, or the user who typed the query? In 2026, courts are actively working through exactly this question, and the answers have consequences that land squarely on your production systems.

Most engineering teams have absorbed the basic narrative: "AI training might infringe copyright, but that's the model provider's problem." That narrative is wrong in two important ways. First, output-based liability — what the model produces at inference time — is largely distinct from training-data liability and remains an open legal question in most jurisdictions. Second, the contractual indemnification you think you have from your AI provider is probably narrower than you believe.

This post covers the practical risk surface for engineering teams: what verbatim memorization rates look like in production, how open source license contamination actually shows up in generated code, where enterprise AI agreements leave you exposed, and the engineering controls that meaningfully reduce liability without stopping AI adoption.
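One of those controls can be sketched directly: before an output ships, scan it for long verbatim runs against a reference corpus of licensed code or known copyrighted text and hold matches for review. The 12-token window and in-memory corpus below are placeholders; production systems index fingerprints at scale, but the control point is the same.

```python
# Sketch: flag model outputs containing long verbatim runs from a reference
# corpus before they reach the user. The 12-token window is illustrative.
def ngrams(tokens: list[str], n: int) -> set[str]:
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def verbatim_overlap(output: str, corpus_docs: list[str], n: int = 12) -> list[str]:
    """Return the n-gram spans the output shares verbatim with any corpus document."""
    out_grams = ngrams(output.split(), n)
    hits = []
    for doc in corpus_docs:
        hits.extend(out_grams & ngrams(doc.split(), n))
    return hits

corpus = ["...licensed or copyrighted reference text indexed here..."]
flagged = verbatim_overlap("the model's generated output text", corpus)
if flagged:
    pass  # hold the response, record provenance, route to human review
```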