The Streaming UI That Committed a Partial Answer Your Model Never Finished
The post-mortem read like a hallucination report. A user had acted on a confidently worded recommendation that was wrong in a way it would not have been had the model finished its answer. But the trace showed the model had not finished. The provider connection dropped at token 412 of an expected 800. The client's error handler logged the failure. The partial message, persisted to the conversation history as tokens arrived, sat in the user's UI looking exactly like every other complete answer. The user acted on it. Support categorized the ticket as a content-quality issue. It took two weeks to route it to the platform team.
Nothing in this chain was a model failure. The model behaved correctly for the 412 tokens it produced. The failure was that the streaming UI and the durable conversation history had quietly disagreed about what counts as a message — and during the exact failure mode that streaming was supposed to make tolerable, the disagreement became the canonical record.
What counts as a message, and at what point it counts, is the contract between optimistic rendering and durable storage. Most chat products inherit that contract from a tutorial or a framework without ever treating it as one, and the gap shows up as a tail of incidents that look like model bugs and aren't.
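One way to make that contract explicit is to give every stored message a completion status, so that a partial write can never be mistaken for a finished answer. The following is a minimal sketch under assumptions, not a description of any particular framework: the names `StoredMessage`, `StreamEvent`, `applyEvent`, and `renderLabel` are hypothetical, and the stream is modeled as a plain list of events rather than a real provider connection.

```typescript
// Hypothetical types for illustration; none of these names come from a real library.
type MessageStatus = "streaming" | "complete" | "interrupted";

interface StoredMessage {
  id: string;
  text: string;
  status: MessageStatus;
  tokensReceived: number;
}

// The three things a stream can do: deliver a token, finish, or fail.
type StreamEvent =
  | { kind: "token"; text: string }
  | { kind: "done" }
  | { kind: "error"; reason: string };

// A record starts in the "streaming" state before any token arrives.
function beginMessage(id: string): StoredMessage {
  return { id, text: "", status: "streaming", tokensReceived: 0 };
}

// Pure transition function: the durable record changes only through here.
function applyEvent(msg: StoredMessage, event: StreamEvent): StoredMessage {
  // Once a message is complete or interrupted, later events are ignored.
  if (msg.status !== "streaming") {
    return msg;
  }
  switch (event.kind) {
    case "token":
      return {
        ...msg,
        text: msg.text + event.text,
        tokensReceived: msg.tokensReceived + 1,
      };
    case "done":
      return { ...msg, status: "complete" };
    case "error":
      // The partial text is kept, but it is marked as unfinished.
      return { ...msg, status: "interrupted" };
  }
}

// The UI renders from the stored status, so a dropped stream looks different
// from a finished answer.
function renderLabel(msg: StoredMessage): string {
  switch (msg.status) {
    case "complete":
      return msg.text;
    case "streaming":
      return msg.text + " ...";
    case "interrupted":
      return msg.text + " [response interrupted; this answer is incomplete]";
  }
}
```

In this sketch the partial text can still be persisted as tokens arrive, because each write carries `status: "streaming"`. The record reads as a complete answer only after an explicit `done` event, and the error path has to move it to `interrupted` rather than just log the failure.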
