3 posts tagged with "uncertainty"

Epistemic Trust in Agent Chains: How Uncertainty Compounds Through Multi-Step Delegation

· 10 min read
Tian Pan
Software Engineer

Most teams building multi-agent systems spend a lot of time thinking about authorization trust: what is Agent B allowed to do, which tools can it call, what data can it access. That's an important problem. But there's a second trust problem that doesn't get nearly enough attention, and it's the one that actually kills production systems.

The problem is epistemic: when Agent A delegates a task to Agent B and gets back an answer, how much should A believe what B returned?

This isn't a question of whether B was authorized to answer. It's a question of whether B actually could.
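As a rough illustration of how uncertainty can compound across delegation, here is a minimal sketch. It assumes each step in the chain is correct independently of the others, and the 90% per-step figure is an invented example, not a measurement from the post. The function name `chain_reliability` is hypothetical.

```python
from math import prod


def chain_reliability(step_reliabilities):
    """Probability that every step in a delegation chain is correct.

    Assumption (for illustration only): steps fail independently,
    so the per-step probabilities simply multiply.
    """
    return prod(step_reliabilities)


# Five delegated steps, each assumed to be right 90% of the time.
end_to_end = chain_reliability([0.9] * 5)
print(f"{end_to_end:.2f}")  # 0.59
```

Under these assumptions, a chain of individually decent agents is right end to end only about 59% of the time, which is why A cannot simply inherit B's apparent certainty.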

The Confident Hallucinator: Runtime Patterns for Knowledge Boundary Signaling in LLMs

· 10 min read
Tian Pan
Software Engineer

GPT-4 achieves an AUROC of roughly 62% when its own confidence scores are used to separate correct answers from incorrect ones. That's barely above the 50% a coin flip would score. The model sounds certain and polished in both cases. If you're building a production system that assumes high-confidence responses are reliable, you're relying on a signal that's only slightly better than random.
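To make the AUROC figure concrete, here is a minimal sketch of how such a number can be computed from logged confidences and correctness labels. It uses the pairwise definition: the fraction of (correct, incorrect) answer pairs in which the correct answer received the higher confidence, with ties counted as half. The function name `auroc` and the sample data are made up for illustration and do not come from any real evaluation.

```python
def auroc(confidences, is_correct):
    """AUROC of confidence as a predictor of correctness.

    Computed as the fraction of (correct, incorrect) pairs where the
    correct answer has the higher confidence; ties count as 0.5.
    """
    pos = [c for c, ok in zip(confidences, is_correct) if ok]
    neg = [c for c, ok in zip(confidences, is_correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect answer")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical log: four answers with self-reported confidence.
confidences = [0.9, 0.8, 0.7, 0.6]
is_correct = [True, False, True, False]
print(auroc(confidences, is_correct))  # 0.75
```

A score of 0.5 means confidence carries no information about correctness, and 1.0 means every correct answer outranks every incorrect one. The pairwise loop is quadratic, which is fine for a small evaluation set but would need a rank-based version at scale.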

This is the knowledge boundary signaling problem, and it sits at the center of most real-world LLM quality failures. The model doesn't know what it doesn't know — or more precisely, it knows internally but can't be trusted to express it. The engineering challenge isn't getting models to refuse more; it's designing systems that make uncertainty actionable without making your product feel broken.

Confidence Strings, Not Scores: Why Your 0.87 Badge Moves Nobody

· 10 min read
Tian Pan
Software Engineer

The product team ships a confidence badge next to every AI suggestion. Green for ≥85%, yellow for 60–84%, red below. They run an A/B test six weeks later and find no change in user behavior at any threshold. False positives at 0.92 confidence get accepted at the same rate as false positives at 0.61 confidence. The team's instinct is to tune the calibration — fit a temperature scaling layer, regenerate the badges, run the A/B again. The numbers shift; the behavior doesn't.

The problem isn't that the model is miscalibrated, though it almost certainly is. The problem is that calibrated probability is the wrong output. The signal a user can act on isn't "how sure" the model is. It's "what specifically the model didn't check." A 0.87 badge tells the user nothing they can verify. "I'm reasonably confident in the address but I haven't checked the unit number" tells them exactly where to look.
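One way to picture the difference is a suggestion object that carries what was and wasn't verified instead of a single score. The sketch below is hypothetical: the `Suggestion` type, its field names, and the wording of the rendered string are assumptions chosen for illustration, not the post's design.

```python
from dataclasses import dataclass, field


@dataclass
class Suggestion:
    """An AI suggestion plus the parts the model did and did not verify."""
    value: str
    checked: list[str] = field(default_factory=list)
    unchecked: list[str] = field(default_factory=list)


def confidence_string(s: Suggestion) -> str:
    """Render a message the user can act on, rather than a numeric badge."""
    parts = []
    if s.checked:
        parts.append("Verified: " + ", ".join(s.checked) + ".")
    if s.unchecked:
        parts.append("Not checked: " + ", ".join(s.unchecked) + ".")
    return " ".join(parts) or "Nothing was verified."


# Hypothetical example mirroring the address case above.
s = Suggestion(
    value="12 Example St",
    checked=["street address"],
    unchecked=["unit number"],
)
print(confidence_string(s))
# Verified: street address. Not checked: unit number.
```

The design choice here is that the unchecked list, not a probability, is the primary output: it points the user at the one thing worth verifying.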