101 posts tagged with "security"

Prompt Injection Is a Supply Chain Problem, Not an Input Validation Problem

April 19, 2026 · 9 min read

Software Engineer

Five carefully crafted documents hidden among a million clean ones can achieve a 90% attack success rate against a production RAG system. Not through zero-days or cryptographic breaks — through plain text that instructs the model to behave differently than its operators intended. If your defense strategy is "sanitize inputs before they reach the LLM," you have already lost.

The framing matters. Teams that treat prompt injection as an input validation problem build perimeter defenses: regex filters, LLM-based classifiers, output scanners. These are useful but insufficient. The real problem is that modern AI systems are compositions of components — retrievers, knowledge bases, tool executors, external APIs — and each component is an ingestion point with its own attack surface. That is the definition of a supply chain vulnerability.

Red-Teaming Consumer LLM Features: Finding Injection Surfaces Before Your Users Do

April 19, 2026 · 9 min read

Tian Pan

Software Engineer

A dealership deployed a ChatGPT-powered chatbot. Within days, a user instructed it to agree with anything they said, then offered $1 for a 2024 SUV. The chatbot accepted. The dealer pulled it offline. This wasn't a sophisticated attack — it was a three-sentence prompt from someone who wanted to see what would happen.

At consumer scale, that curiosity is your biggest security threat. Internal LLM agents operate inside controlled environments with curated inputs and trusted data. Consumer-facing LLM features operate in adversarial conditions by default: millions of users, many actively probing for weaknesses, and a stochastic model that has no concept of "this user seems hostile." The security posture these two environments require is fundamentally different, and teams that treat consumer features like internal tooling find out the hard way.

Sandboxing Agents That Can Write Code: Least Privilege Is Not Optional

April 19, 2026 · 12 min read

Tian Pan

Software Engineer

Most teams ship their first code-executing agent with exactly one security control: API key scoping. They give the agent a GitHub token with repo:read and a shell with access to a working directory, and they call it "sandboxed." This is wrong in ways that become obvious only after an incident.

The threat model for an agent that can write and execute code is categorically different from the threat model for a web server or a CLI tool. The attack surface isn't the protocol boundary anymore — it's everything the agent reads. That includes git commits, documentation pages, API responses, database records, and any file it opens. Any of those inputs can contain a prompt injection that turns your research agent into a data exfiltration pipeline.

Text-to-SQL at Scale: What Nobody Tells You Before Production

April 19, 2026 · 11 min read

Tian Pan

Software Engineer

Text-to-SQL demos are deceptively easy to build. You paste a schema into a prompt, ask GPT-4 a question, get back a clean SELECT statement, and suddenly your Slack is full of "what if we built this into our data platform?" messages. Then you try to actually ship it. The benchmark says 85% accuracy. Your internal data team reports that about half the answers are wrong. Your security team asks who reviewed the generated queries before they hit production. Nobody has a good answer.

This is the gap between text-to-SQL as a research problem and text-to-SQL as an engineering problem. The research problem is about getting models to produce syntactically valid SQL. The engineering problem is about schema ambiguity, access control, query validation, and the fact that your enterprise database looks nothing like Spider or BIRD.

Agent Identity and Delegated Authorization: OAuth Patterns for Agentic Actions

April 18, 2026 · 10 min read

Tian Pan

Software Engineer

When an AI agent books a calendar event, sends an email, or submits a form, it isn't acting on its own identity — it's acting under delegated authority from a human who said "go do this." That distinction sounds philosophical until an agent leaks sensitive data, takes an irreversible action the user didn't intend, or gets compromised. At that point, the question isn't what happened but who authorized it, when, and can we revoke it.

The blast radius of poorly scoped agent credentials is larger than most teams realize. An agent authenticated with broad API access isn't one point of failure — it's a standing invitation. In 2025, agentic AI CVE counts jumped 255% year-over-year, and most incidents traced back to credentials that were too broad, too long-lived, or impossible to revoke cleanly. Building agents right means designing the authorization layer before you hit production.

Prompt Injection at Scale: Defending Agentic Pipelines Against Hostile Content

April 18, 2026 · 10 min read

Tian Pan

Software Engineer

A banking assistant processes a customer support chat. Embedded in the message—invisible because it's rendered in zero-opacity white text—are instructions telling the agent to bypass the transaction verification step. The agent complies. By the time the anomaly surfaces in logs, $250,000 has moved to accounts the customer never touched.

This isn't a contrived scenario. It happened in June 2025, and it's a precise illustration of why prompt injection is the hardest unsolved problem in production agentic AI. Unlike a chatbot that produces text, an agent acts. It calls tools, sends emails, executes code, and makes API requests. When its instructions get hijacked, the blast radius isn't a bad sentence—it's an unauthorized action at machine speed.

According to OWASP's 2025 Top 10 for LLM Applications, prompt injection now ranks as the #1 critical vulnerability, present in over 73% of production AI deployments assessed during security audits. Every team building agents needs a coherent threat model and a defense architecture that doesn't make the system useless in the name of safety.

The Insider Threat You Created When You Deployed Enterprise AI

April 17, 2026 · 10 min read

Tian Pan

Software Engineer

Most enterprise security teams have a reasonably well-developed model for insider threats: a disgruntled employee downloads files to a USB drive, emails a spreadsheet to a personal account, or walks out with credentials. The detection playbook is known — DLP rules, egress monitoring, UEBA baselines. What those playbooks don't account for is the scenario where you handed every one of your employees a tool that can plan, execute, and cover multi-stage operations at machine speed. That's what deploying AI coding assistants and RAG-based document agents actually does.

The problem isn't that these tools are insecure in isolation. It's that they dramatically amplify what a compromised or malicious insider can accomplish in a single session. The average cost of an insider incident has reached $17.4 million per organization annually, and 83% of organizations experienced at least one insider attack in the past year. AI tools don't introduce a new threat category — they multiply the capability of every threat category that already exists.

The Minimal Footprint Principle: Least Privilege for Autonomous AI Agents

April 17, 2026 · 10 min read

Tian Pan

Software Engineer

A retail procurement agent inherited vendor API credentials "during initial testing." Nobody ever restricted them before the system went to production. When a bug caused an off-by-one error, the agent had full ordering authority — permanently, with no guardrails. By the time finance noticed, $47,000 in unauthorized vendor orders had gone out. The code was fine. The model performed as designed. The blast radius was a permissions problem.

This is the minimal footprint principle: agents should request only the permissions the current task requires, avoid persisting sensitive data beyond task scope, clean up temporary resources, and scope tool access to present intent. It is the Unix least-privilege principle adapted for a world where your code makes runtime decisions about what it needs to do next.

The reason teams get this wrong is not negligence. It is a category error: they treat agent permissions as a design-time exercise when agentic AI makes them a runtime problem.

Prompt Injection Detection at 100,000 Requests Per Day: Why Simple Defenses Break and What Actually Works

April 17, 2026 · 11 min read

Tian Pan

Software Engineer

Most teams discover their prompt injection defense is broken after a user finds it, not before. You add "ignore all previous instructions" to your blocklist and ship. Three months later an attacker encodes the payload in Base64, or buries instructions in HTML comments retrieved via RAG, or uses typoglycemia ("ignroe all prevuois insrtucioins"), and your entire defense evaporates. The blocklist doesn't help because prompt injection has an unbounded attack surface — there is no closed vocabulary of malicious inputs.

At low traffic volumes you can absorb the cost of calling a second LLM to validate each request. At 100,000 requests per day, that math becomes ruinous and the latency becomes user-visible. This post is about what the architecture looks like when brute-force approaches stop working.

Vector Store Access Control: The Row-Level Security Problem Most RAG Teams Skip

April 17, 2026 · 11 min read

Tian Pan

Software Engineer

Most teams building multi-tenant RAG systems get authentication right and authorization wrong. They validate that users are who they claim to be, then retrieve documents from a shared vector index and filter the results before sending them to the LLM. That filter—the post-retrieval kind—is security theater. By the time you remove unauthorized documents from the list, they're already in the model's context window.

The real problem runs deeper than a misplaced filter. Most RAG systems treat document authorization as an ingest-time concern ("can this user upload this document?") but fail entirely to enforce it at query time ("can this user see documents matching this query?"). The gap between those two checkpoints is where silent data leakage lives—and it's where most production incidents originate.

Agent Identity and Least-Privilege Authorization: The Security Footgun Your AI Team Is Ignoring

April 16, 2026 · 9 min read

Tian Pan

Software Engineer

Most AI agent architectures have a quiet security problem that nobody discovers until something goes wrong. You build the agent, wire it to your internal APIs using the app's existing service account credentials, ship it to production, and move on. The agent works. Users are happy. And somewhere in your audit log, a single service account identity is silently touching every customer record, every billing table, and every internal document that agent ever needs — with no trace of which user asked for what, or why.

This isn't a theoretical risk. When the breach happens, or when a regulator asks "who accessed this data on March 14th," the answer is the same every time: [email protected]. Every action, every request, every read and write — all collapsed into one identity. The audit trail is technically correct and forensically useless.

LLMs in the Security Operations Center: Acceleration Without Liability

April 16, 2026 · 11 min read

Tian Pan

Software Engineer

A senior analyst I respect described her team's first six months with an LLM-powered triage agent like this: "It made the easy alerts disappear, and made the hard ones harder to trust." The phrase has stayed with me because it captures the actual shape of the trade. AI in the security operations center is not a productivity story. It is a confidence calibration story, and most teams are getting the calibration wrong in the same direction.

The seductive version goes: drop a model in front of the alert queue, let it cluster duplicates, summarize raw events, and auto-close obvious noise. The MTTR graph drops. The pager quiets. The Tier-1 backlog evaporates. The version that actually gets you breached goes: the model confidently mis-attributes a real intrusion as a benign backup job, and a tired analyst — told that "the AI already triaged this, it's clean" — never opens the case. The first version is real. So is the second. They are the same system viewed at different confidence levels.

About Tian Pan