Skip to main content

2 posts tagged with "ai-privacy"

View all tags

Eval Datasets Are Customer Data With a Right Answer Attached

· 12 min read
Tian Pan
Software Engineer

Your golden eval set is a privacy boundary your security team didn't know existed. It is built by sampling production traces, which means it is a curated collection of real customer queries — often containing names, emails, account numbers, transcripts of frustrated calls, half-typed credit card digits — paired with the canonical correct response on top, and then committed to whatever bucket the eval pipeline reads from.

That last part is what makes eval data uniquely dangerous. A raw production trace is sensitive because it captures what the customer said. An eval case is sensitive in a new way because it captures what the customer said plus the labeled correct answer. The label is a derivative work that someone, often an annotator or a domain expert, applied with intent. It signals "this is canonical." It gives the trace a longevity that the original log never had — log retention will eventually rotate the trace out, but the eval case is now a permanent test fixture that the team is committed to keeping green.

Adding a Modality Is a Privacy-Classification Event, Not a Feature Flag

· 11 min read
Tian Pan
Software Engineer

A product manager pings the AI team on a Tuesday: "Customers want to paste screenshots into the support agent. Should be a small lift, right? The model already takes images." The eng lead checks the SDK, confirms the vision endpoint accepts JPEGs and PNGs, ships the change behind a feature flag, and rolls it to ten percent. Two weeks later, the legal team forwards a regulator letter asking why a user's bank statement, an image of their driver's license, and a screenshot containing another customer's order ID all appeared in the agent's training-eligible logs. Nobody on the AI team flagged the modality change, because nobody thought a modality change was a change. The privacy review that approved the text agent never re-ran for the image variant — and the image variant turned out to live under entirely different consent, retention, and residency rules.

This is not a story about a careless engineer. It is a story about a category error built into how most teams ship AI features. Text input is a known data class with a stable threat model: the user types, the user sees what they typed, the engineering team has years of habit around what to log and what to drop. Images are a different data class with a different threat model — they smuggle in metadata the user cannot see, capture surrounding content the user did not intend to share, and create storage and processing footprints with their own residency and contract terms. Treating "now with vision" as a UX iteration, when it is actually a privacy-classification event, is how teams discover at the regulator's request that their PII inventory understated their actual exposure by an order of magnitude.