What Structured Outputs Actually Cost You: The JSON Mode Quality Tax
Most teams adopt structured outputs because they're tired of writing brittle regex to extract data from model responses. That's a reasonable motivation. What they don't anticipate is discovering months later, when they finally measure task accuracy, that their "reliability improvement" also degraded the quality of the underlying content by 10 to 15 percent on reasoning-heavy tasks. The syntactic problem was solved. A semantic one was introduced.
This post is about understanding that tradeoff precisely: what constrained decoding actually costs, when the tax is worth paying, and how to build the evals that tell you whether it's hurting your system before you ship.
