The Customer Record Hiding in Your Few-Shot Prompt Template
The privacy auditor's question came two days before the SOC 2 renewal: "Why is the email field in your onboarding prompt's example a real customer address?" The product team rebuilt the chain in their heads. A year earlier, when they shipped the AI summarizer, someone needed a "see how this works" example for the few-shot template. They picked a representative customer record from staging, scrubbed the obvious fields — name, account ID, phone — and committed the file. The customer churned six months later. Their record was deleted from the database per the data retention policy. Their record was not deleted from the prompt template, which had been shipped to every tenant in production.
The team had assumed, like most teams, that the privacy boundary was the database. The prompt template was code. Code goes through review. Review doesn't flag PII because reviewers aren't looking for it in YAML strings labeled example_input:. The DLP scanner that catches PII in Slack messages and email attachments doesn't scan committed code, and even if it did, it wouldn't recognize a partially-scrubbed customer record as personal data because the fields it knew to look for had been removed. Everything that remained — the company size, the industry, the rare job title, the specific city — was data the scanner had no rule for.
