2 posts tagged with "dogfooding"

Dogfooding Is Not an Eval Strategy

· 9 min read
Tian Pan
Software Engineer

Every team building an AI product reaches the same comfortable conclusion: "We use it every day, and it works great." That sentence feels like evidence. It is not. It is the single most misleading signal in the room, and it gets stronger — more convincing, more wrong — the better your team is.

Dogfooding tells you the product runs. It does not tell you the product works. Those are different claims, and the gap between them is exactly where your launch goes sideways. The people who built the system are, statistically, the worst possible sample of the people who will use it. They share its mental model, they know its soft spots, and they have spent months training themselves to phrase requests the way the model likes. That is not a test population. That is a control group for a study you never ran.

The Demo-to-Dogfood Gap: Why Your AI Feature Dies Between the Launch Slide and Monday Morning

· 11 min read
Tian Pan
Software Engineer

The demo went perfectly. The room clapped. Two weeks later, the same feature lands in the company Slack for internal use, and by Wednesday a senior engineer is posting screenshots with the caption "did anyone test this?" By Friday the channel has gone quiet — not because the bugs were fixed, but because the people who would have flagged them gave up and went back to their old workflow. The launch is still on the calendar. Nobody has cancelled it. Nobody has the political capital to.

This is the demo-to-dogfood gap. Last year the MIT NANDA initiative put a number on something close to it: 95% of enterprise generative AI pilots produced no measurable P&L impact, and almost all of them had a demo somebody loved. The model was not the problem. The gap between the demo and the first week of internal use was the problem, and every team that has shipped an AI feature has watched some version of it play out.