3 posts tagged with "abstention"

Abstention as a Routing Decision: Why 'I Don't Know' Belongs in the Router, Not the Prompt

10 min read
Tian Pan
Software Engineer

Most teams handle abstention with a single sentence in the system prompt: "If you are not confident, say you don't know." The model occasionally honors it, frequently doesn't, and the failure mode is asymmetric. A confidently wrong answer ships at full velocity — it lands in the user's hands, gets quoted in a Slack thread, gets cited in a downstream summary. An honest abstention triggers a customer-success escalation because the user expected the agent to handle the request and now somebody has to explain why it didn't. Six months in, the team has learned which kind of failure costs less to ship, and the system prompt edit that nominally controls abstention has been quietly tuned for compliance, not for honesty.

The discipline that fixes this isn't better wording. It's recognizing that abstention is a routing decision, not a prompt pattern. It deserves a first-class output channel, its own SLO, its own evaluation harness, and its own place in the system topology — somewhere outside the prompt, where it can be tested, owned, and scaled.
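
To make that concrete, here is a minimal sketch of what a first-class abstention channel can look like: a router that returns an explicit ABSTAIN decision, carrying the score it decided on, before any prompt is rendered. The names (route_query, confidence_fn) and the threshold values are illustrative assumptions, not an interface the post defines.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Optional


class Route(Enum):
    ANSWER = auto()    # hand the query to the generation pipeline
    ABSTAIN = auto()   # return a structured "I don't know" to the caller
    ESCALATE = auto()  # hand off to a human or a fallback system


@dataclass
class RoutingDecision:
    route: Route
    confidence: float            # the score the decision was made on; logged for the SLO
    reason: Optional[str] = None


def route_query(
    query: str,
    confidence_fn: Callable[[str], float],  # any scorer you own: a classifier, a heuristic
    abstain_below: float = 0.35,
    escalate_below: float = 0.60,
) -> RoutingDecision:
    """Decide whether this query gets answered at all, before any prompt is rendered."""
    score = confidence_fn(query)
    if score < abstain_below:
        return RoutingDecision(Route.ABSTAIN, score, "insufficient evidence to answer")
    if score < escalate_below:
        return RoutingDecision(Route.ESCALATE, score, "low confidence; needs review")
    return RoutingDecision(Route.ANSWER, score)
```

Because the abstention is a returned value rather than a behavior hoped for in generated text, it can be unit-tested against a labeled set of unanswerable queries and given its own error budget.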

Calibrated Abstention: The Capability Every Layer of Your LLM Stack Punishes

11 min read
Tian Pan
Software Engineer

There is a capability your model could have that would, on the days it mattered, be worth more than any other behavioral upgrade you could ship: the ability to say "I don't have a reliable answer to this" and mean it. Not the keyword-matched safety refusal. Not the hedging tic the model picked up from RLHF on controversial topics. The real thing — a calibrated abstention that fires when, and only when, the model's internal evidence does not support a confident response.

You will never get it by accident. Every default in the LLM stack pushes the other way.
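
One cheap proxy for that internal evidence, sketched below, is self-consistency: sample the model several times and abstain when the samples disagree. The post doesn't commit to a mechanism; sample_fn, the sample count, and the agreement threshold here are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Callable, List, Optional


def calibrated_answer(
    question: str,
    sample_fn: Callable[[str], str],  # any function that returns one sampled model answer
    n_samples: int = 5,
    min_agreement: float = 0.8,
) -> Optional[str]:
    """Answer only when repeated samples agree; otherwise return None (abstain)."""
    # Naive normalization: exact match after whitespace/case cleanup.
    answers: List[str] = [sample_fn(question).strip().lower() for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    if count / n_samples < min_agreement:
        return None  # the model cannot reproduce its own answer, so it should not assert it
    return top_answer
```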

The Retrieval Emptiness Problem: Why Your RAG Refuses to Say 'I Don't Know'

10 min read
Tian Pan
Software Engineer

Ask a production RAG system a question your corpus cannot answer and watch what happens. It rarely says "I don't have that information." Instead, it retrieves the five highest-ranked chunks — which, having nothing better to match, are the five least-bad chunks of unrelated content — and hands them to the model with a prompt that reads something like "answer the user's question using the context below." The model, trained to be helpful and now holding text that sort of resembles the topic, produces a confident answer. The answer is wrong in a way that's architecturally invisible: the retrieval succeeded, the generation succeeded, every span was grounded in a retrieved document, and the user walked away misled.

This is the retrieval emptiness problem. It isn't a bug in any single layer. It's the emergent behavior of a pipeline that treats "top-k" as a contract and never asks whether the top-k is any good. Research published at ICLR 2025 on "sufficient context" quantified the effect: when Gemma receives sufficient context, its hallucination rate on factual QA is around 10%. When it receives insufficient context — retrieved documents that don't actually contain the answer — that rate jumps to 66%. Adding retrieved documents to an under-specified query makes the model more confidently wrong, not less.