Dogfooding Is Not an Eval Strategy
Every team building an AI product reaches the same comfortable conclusion: "We use it every day, and it works great." That sentence feels like evidence. It is not. It is the single most misleading signal in the room, and it grows more convincing, and more wrong, the better your team is.
Dogfooding tells you the product runs. It does not tell you the product works. Those are different claims, and the gap between them is exactly where your launch goes sideways. The people who built the system are about the least representative sample you could draw of the people who will use it. They share its mental model, they know its soft spots, and they have spent months training themselves to phrase requests the way the model likes. That is not a test population. That is a control group for a study you never ran.
