2 posts tagged with "ai-fairness"

Production Bias Auditing: Catching AI Discrimination Before Your Users Do

· 11 min read
Tian Pan
Software Engineer

The most expensive bias bug I've seen in production was discovered by a Twitter thread, not a dashboard. A small team had shipped a credit-scoring assistant. They'd run the standard pre-launch audit: balanced training set, adversarial debiasing, equalized-odds gap under five percent on the holdout. A month after launch, a user posted screenshots showing women in their household consistently received lower limits than men with identical financials. By the time the team's monitoring caught up, the regulator had already opened an inquiry.

The lesson isn't that the team was lazy. They ran exactly the audit the literature recommends. The lesson is that pre-launch audits measure a snapshot of a model that no longer exists by the time real users hit it. The input distribution shifts. New populations show up. A prompt-template change introduces a phrasing artifact that interacts with names. A model upgrade quietly trades calibration for a fluency win. The audit you ran in November does not protect the model running in production in May.
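One way to make the November-versus-May point concrete is to recompute the same fairness metric on recent production outcomes instead of only on the holdout. The sketch below is not the team's actual tooling. It assumes binary labels and predictions, a group attribute available for each record, and one common definition of the equalized-odds gap (the larger of the true-positive-rate and false-positive-rate differences across groups). The function names and the record layout are hypothetical.

```python
from collections import defaultdict


def equalized_odds_gap(records):
    """Return the larger of the TPR gap and FPR gap across groups.

    records: iterable of (group, y_true, y_pred) with binary labels (assumed layout).
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, y_true, y_pred in records:
        if y_true:
            counts[group]["tp" if y_pred else "fn"] += 1
        else:
            counts[group]["fp" if y_pred else "tn"] += 1

    tprs, fprs = [], []
    for c in counts.values():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        # Skip a rate for a group that has no examples of that class.
        if positives:
            tprs.append(c["tp"] / positives)
        if negatives:
            fprs.append(c["fp"] / negatives)

    tpr_gap = max(tprs) - min(tprs) if tprs else 0.0
    fpr_gap = max(fprs) - min(fprs) if fprs else 0.0
    return max(tpr_gap, fpr_gap)


def gap_by_window(windowed_records):
    """Compute the gap separately for each time window.

    windowed_records: iterable of (window, group, y_true, y_pred).
    The window key (for example a month label) is a hypothetical field.
    """
    by_window = defaultdict(list)
    for window, group, y_true, y_pred in windowed_records:
        by_window[window].append((group, y_true, y_pred))
    return {w: equalized_odds_gap(r) for w, r in by_window.items()}
```

Run per window on labeled production decisions, a gap that was under the threshold at launch and rises in later windows is the signal a one-time pre-launch audit cannot produce.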

The 20% Problem in Model Routing: When Cost Optimization Creates Second-Class Users

· 9 min read
Tian Pan
Software Engineer

Your routing system works exactly as designed. Eighty percent of queries go to the cheap model; twenty percent escalate to the capable one. Latency is down, costs are down 60%, and leadership is happy. Then someone pulls the data by user segment, and you see it: users writing in non-native English are escalated at half the rate of native speakers, and their satisfaction scores are 18 points lower. The routing system treated the query complexity signal as neutral, but it wasn't. It was a proxy for language proficiency, and you've been giving a systematically worse product to a specific group of users for months.

This is the 20% problem. It's not a bug in the router. It's an emergent property of any cost-optimized routing system that nobody measures until it's too late.
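The measurement that surfaces this is small: break the escalation rate down by user segment and compare each segment to a reference group. The following is a minimal sketch under assumptions, not the router's real logging schema. The `(segment, escalated)` event shape, the segment labels, and the function names are all hypothetical.

```python
from collections import Counter


def escalation_rates(events):
    """Return the fraction of queries escalated to the capable model, per segment.

    events: iterable of (segment, escalated) pairs, where escalated is a bool
    (assumed log layout).
    """
    total = Counter()
    escalated = Counter()
    for segment, was_escalated in events:
        total[segment] += 1
        if was_escalated:
            escalated[segment] += 1
    return {segment: escalated[segment] / total[segment] for segment in total}


def rate_ratio(rates, segment, reference):
    """Escalation rate of `segment` relative to `reference` (1.0 means parity)."""
    if not rates[reference]:
        return float("nan")
    return rates[segment] / rates[reference]


# Toy data: 4 of 10 native-speaker queries escalate, 2 of 10 non-native queries do.
events = (
    [("native", True)] * 4 + [("native", False)] * 6
    + [("non_native", True)] * 2 + [("non_native", False)] * 8
)
rates = escalation_rates(events)
ratio = rate_ratio(rates, "non_native", "native")
```

On the toy data the ratio comes out to one half, matching the "half the rate" disparity described above. A ratio that stays well below 1.0 for any segment is worth investigating even when the aggregate 80/20 split looks healthy.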