The Self-Critique Tax: When Asking the Model to Check Its Own Work Costs Double for Modest Wins
A team ships a self-critique loop into production because the benchmark numbers looked irresistible: Self-Refine reported a 20 percent absolute improvement averaged across seven tasks, Chain-of-Verification cut hallucinations by 50 to 70 percent on QA workloads, and reflection prompts pushed math-equation accuracy up 34.7 percent in one widely-cited paper. A month later the finance review surfaces the bill. The product's per-request cost has roughly tripled, p99 latency is up by a factor of three, and the actual quality lift that survived contact with production traffic is closer to three percent than thirty. The self-critique loop is doing exactly what it advertised. The team just never priced it.
This is the self-critique tax: a reliability pattern that reads like a free quality win on a slide and reads like a structural cost increase on an invoice. The pattern itself is sound — there are real cases where generate-then-verify is the right answer. The failure mode is shipping it as a default instead of as a calibrated intervention, and discovering at the wrong time of the quarter that "the model checks its own work" was actually a procurement decision.
