The Prompt-Injection Bug Bounty: Scoping a Program When 'Broken' Has No Clear Definition

12 min read
Tian Pan
Software Engineer

Your security team runs a bug bounty that works. A CSRF gets paid. An XSS gets paid. An IDOR gets paid. The rules of engagement are sharp, the severity rubric is industry-standard, the triage queue moves, and the program produces a steady stream of fixed bugs. Then, last quarter, your AI team shipped a feature (a chat surface, an agent that calls tools, a RAG pipeline that pulls from customer data), and the question that lands on the security team's desk is "what's the bounty scope for this thing?" Nobody can answer.

The reason nobody can answer is that the standard bug bounty rubric was built around a system whose specified behavior is deterministic. A login endpoint either authenticates correctly or it doesn't. An access control check either holds or it doesn't. The AI feature you just shipped has no equivalent ground truth: its specified behavior is "respond helpfully to user input," and a researcher who makes it respond unhelpfully has not necessarily found a bug — they may have found something the model has always done, that nobody knew about, that you're not sure you can fix, and that may or may not reproduce on a second attempt.
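One way to make the reproducibility gap concrete is to measure it. The sketch below is illustrative only: `reproduction_rate`, `access_check_bypassed`, and `make_flaky_injection` are hypothetical names, and the "model" is a seeded random stub standing in for a real AI feature. It assumes a triager can replay a reported finding many times and record whether the behavior shows up on each attempt.

```python
import random
from typing import Callable


def reproduction_rate(attempt: Callable[[], bool], trials: int = 20) -> float:
    """Return the fraction of trials in which the reported behavior reproduced."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    hits = sum(1 for _ in range(trials) if attempt())
    return hits / trials


def access_check_bypassed() -> bool:
    # Hypothetical stand-in for a classic deterministic bug (e.g. an IDOR):
    # the same request produces the same outcome on every replay.
    return True


def make_flaky_injection(success_probability: float, seed: int) -> Callable[[], bool]:
    # Hypothetical stand-in for a prompt injection against a sampled model:
    # the same prompt only sometimes produces the reported behavior.
    rng = random.Random(seed)

    def attempt() -> bool:
        return rng.random() < success_probability

    return attempt


if __name__ == "__main__":
    print("classic bug:", reproduction_rate(access_check_bypassed, trials=20))
    flaky = make_flaky_injection(success_probability=0.3, seed=7)
    print("prompt injection:", reproduction_rate(flaky, trials=20))
```

A classic finding replays at a rate of 1.0, so a single reproduction settles triage. A finding against the AI feature lands somewhere between 0 and 1, which means the program would need to decide how many replays count as "reproduced" before any severity discussion can start.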