The Abstention Tax You Didn't Budget For
You taught the agent to say "I don't know" when the context was thin and called it a safety win. The OpenAI bill went down. Everyone agreed it was the responsible move. Three months later your VP of Support is asking why headcount projections are off by 40% and nobody in the AI org has an answer, because the metric you tracked was abstention rate and the metric that moved was tickets-per-week — and nobody owned the line that summed them.
This is the abstention tax. It's not a model cost. It doesn't show up on the inference invoice. It shows up downstream, in the queue depth of the human team that catches every "I cannot answer," in the second model call that runs against the enriched context the human had to assemble, in the customer who churned during the wait. The model-only cost frame quietly hides it. And the org seam where the AI team owns abstention and the ops team owns the queue means nobody is incentivized to see it.
