Retrieval Cascade Failure: How Document Deletion Poisons Your RAG Pipeline
A user asks your support bot when the refund window closes. The bot answers "60 days" with cheerful confidence and a citation. The policy page that said "60 days" was deleted from the CMS three months ago. The new policy is 14 days. Nobody on your team knows the bot is wrong until a customer escalates.
This is a retrieval cascade failure: the document is gone from the source of truth, but its embedding is still in the index, still ranking high on cosine similarity, still feeding the model a ghost. RAG pipelines treat embedding indexes as caches of source content, but most teams build the cache without building the invalidation. Inserts get all the engineering attention. Deletes get a TODO comment.
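To make the missing invalidation step concrete, here is a minimal sketch of a reconciliation pass. It assumes each indexed chunk records the ID of the source document it came from, and that the CMS can list the IDs of documents that still exist. `ToyIndex`, `purge_orphans`, and the document IDs are hypothetical names for illustration, not any particular vector store's API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """In-memory stand-in for a vector store (hypothetical, for illustration).

    Maps chunk ID -> (source document ID, embedding).
    """
    chunks: dict[str, tuple[str, list[float]]] = field(default_factory=dict)

    def upsert(self, chunk_id: str, doc_id: str, embedding: list[float]) -> None:
        self.chunks[chunk_id] = (doc_id, embedding)

    def delete_by_doc(self, doc_id: str) -> int:
        """Remove every chunk derived from doc_id; return how many were removed."""
        stale = [cid for cid, (did, _) in self.chunks.items() if did == doc_id]
        for cid in stale:
            del self.chunks[cid]
        return len(stale)

    def doc_ids(self) -> set[str]:
        """Source document IDs that still have at least one chunk indexed."""
        return {did for did, _ in self.chunks.values()}


def purge_orphans(index: ToyIndex, live_doc_ids: set[str]) -> dict[str, int]:
    """Delete chunks whose source document no longer exists in the source of truth.

    Returns a mapping of removed document ID -> number of chunks deleted.
    """
    orphaned = index.doc_ids() - set(live_doc_ids)
    return {doc_id: index.delete_by_doc(doc_id) for doc_id in orphaned}


if __name__ == "__main__":
    index = ToyIndex()
    index.upsert("c1", "refund-policy-v1", [0.1, 0.9])  # deleted from the CMS
    index.upsert("c2", "refund-policy-v1", [0.2, 0.8])
    index.upsert("c3", "refund-policy-v2", [0.3, 0.7])  # still live

    removed = purge_orphans(index, live_doc_ids={"refund-policy-v2"})
    print(removed)
    print(sorted(index.chunks))
```

A periodic sweep like this is only a backstop: the ghost stays retrievable until the next run, so in practice it would likely sit alongside delete events propagated from the CMS at the moment a page is removed.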
