The Annotation Economy: Why Every Label Source Has a Hidden Tax
Most teams pick their annotation strategy by comparing unit costs: crowd workers run about $0.003 per label, human domain experts around $1. Run the spreadsheet, pick the cheapest option that seems "good enough," and ship. This math consistently gets teams into trouble.
The real decision is not about cost per label in isolation. Every label source carries a hidden quality tax: compounding costs in the form of garbage gradients, misleading eval curves, or months spent debugging production failures that clean labels would have caught at training time. The cheapest source is often the most expensive one once you count the downstream cost of trusting it.
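To make the arithmetic concrete, here is a minimal sketch of what the spreadsheet leaves out. The unit costs reuse the two figures quoted above; the error rates, the downstream cost per bad label, and the names `LabelSource` and `effective_cost_per_label` are hypothetical, chosen only to illustrate the shape of the calculation, not measured from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelSource:
    name: str
    unit_cost: float   # dollars paid per label
    error_rate: float  # fraction of labels that are wrong (assumed, not measured)


def effective_cost_per_label(source: LabelSource, cost_per_bad_label: float) -> float:
    """Unit cost plus the expected downstream cost of trusting a wrong label."""
    return source.unit_cost + source.error_rate * cost_per_bad_label


# Unit costs come from the text; error rates are illustrative assumptions.
sources = [
    LabelSource("low-cost source", unit_cost=0.003, error_rate=0.20),
    LabelSource("domain expert", unit_cost=1.00, error_rate=0.02),
]

# Assumed average downstream cost (debugging, retraining) per wrong label.
COST_PER_BAD_LABEL = 10.0

# Rank sources by effective cost rather than by sticker price.
ranked = sorted(
    sources, key=lambda s: effective_cost_per_label(s, COST_PER_BAD_LABEL)
)

for s in ranked:
    total = effective_cost_per_label(s, COST_PER_BAD_LABEL)
    print(f"{s.name}: unit ${s.unit_cost:.3f}, effective ${total:.3f}")
```

Under these assumed numbers the ranking flips: the source that is cheapest per label becomes the more expensive one once errors are priced in. With a low enough downstream cost per bad label the cheap source wins again, so the estimate of that downstream cost is the number worth arguing about.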
