The Summarizer That Paraphrased Away the User's Literal Question
A user asks: "Does this qualify as a 'transfer' under article 28?" Forty turns later, the model answers a different question. The transcript shows the model answered the question it was given. The user's complaint reads like a hallucination report. Both are right. The model never saw the user's question; it saw your summarizer's polite translation of it: "user asked about article 28 applicability."
The word "transfer" was the question. The summarizer threw it away because it was tuned to preserve facts, not wording, and its rubric never distinguished paraphrasing the topic from paraphrasing the constraint. The topic survived. The constraint became fog.
This failure mode is structural, not anecdotal. Any application that compresses long conversations with a model-generated summary has a second model in the critical path, one whose quality contract is usually treated as a token-budget knob rather than as a piece of product logic. That mismatch, between how much the summarizer decides and how little scrutiny it gets, is where the bug lives.
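One way to make the topic-versus-constraint distinction concrete is a check that runs on every summary before it replaces the original turns. The sketch below is a minimal illustration, not a description of any particular product's pipeline. It assumes that terms the user put in quotation marks are the literal constraints worth preserving, which is a heuristic chosen for this example; the function names `quoted_literals` and `missing_literals` are hypothetical.

```python
import re

# Assumption for this sketch: a term the user wrapped in quotes is a literal
# constraint, so it must survive summarization word for word.
OPEN_QUOTES = "\"'\u2018\u201c"
CLOSE_QUOTES = "\"'\u2019\u201d"

# Lookarounds skip apostrophes inside words such as "user's" or "isn't".
QUOTED = re.compile(
    rf"(?<!\w)[{OPEN_QUOTES}]([^{OPEN_QUOTES}{CLOSE_QUOTES}]+)[{CLOSE_QUOTES}](?!\w)"
)


def quoted_literals(message: str) -> list[str]:
    """Return every quoted span in the user's message, in order of appearance."""
    return QUOTED.findall(message)


def missing_literals(user_message: str, summary: str) -> list[str]:
    """Return quoted terms from the user's message that the summary dropped."""
    summary_lower = summary.lower()
    return [
        term
        for term in quoted_literals(user_message)
        if term.lower() not in summary_lower
    ]


question = "Does this qualify as a 'transfer' under article 28?"
lossy = "user asked about article 28 applicability."
faithful = "User asked whether this qualifies as a 'transfer' under article 28."

print(missing_literals(question, lossy))     # the dropped constraint
print(missing_literals(question, faithful))  # nothing dropped
```

A non-empty result says the summary kept the topic and lost the wording. What to do next (carry the original question verbatim, or regenerate the summary) is a product decision that this check only surfaces.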
