The RAG Read-After-Write Race: When Your Vector Index Cites a Document That No Longer Exists
A user asks your assistant a question at 14:32:07. Your retriever fires at 14:32:08 and pulls back five chunks from the policy handbook. The model thinks for a few seconds, drafts a response, and at 14:32:12 streams back an answer that confidently cites section 4.3 — the section that an admin deleted at 14:32:10 because it was wrong. The user reads an authoritative quotation from a document that no longer exists, complete with a clickable link that returns 404.
Nothing in your stack errored. The retriever returned a valid hit. The model produced fluent, grounded prose. The citation pointed at a real chunk ID that was real when the retrieval happened. And yet the answer is, by every reasonable definition, a hallucination — not because the model made something up, but because the world changed underneath the pipeline between the moment it looked and the moment it spoke.
This is the RAG read-after-write race, and most production pipelines have no defense against it.
