2 posts tagged with "consistency"

The Wiki Edit Mid-Flight When Your RAG Pipeline Read It

· 11 min read
Tian Pan
Software Engineer

A tech writer on your platform team is moving a paragraph. Not metaphorically — literally cutting a section from the onboarding page, pasting it into the runbook, deleting a stub draft on a third page, and rewording a deprecated warning on a fourth. The whole edit takes her about eleven minutes. Your RAG ingest job runs every fifteen. It happens to fire at minute six.

For the next fifteen minutes, your retrieval index contains a state of the wiki that did not exist at any single moment in her mind. The onboarding page still has the section. The runbook still doesn't. The stub draft is captured halfway through being deleted, with a placeholder sentence she never intended to publish. The old deprecated warning is still indexed. When an engineer asks the agent "how do we handle credential rotation in this service," the model retrieves contradictory chunks from the same source and confidently synthesizes whichever was ranked higher. The answer is wrong in a shape no one wrote.

This is a failure mode most teams ship without noticing: the source of truth is transactional, the ingest is a poll, and the gap between them is where dirty reads live.
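To make the torn read concrete, here is a minimal sketch that replays the scenario above. The page names, the per-save minute marks, and the `snapshot_at` helper are all hypothetical, chosen only so that a poll at minute six lands in the middle of the eleven-minute edit; nothing here reflects a real wiki or ingest API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    minute: int    # minute at which this individual save lands (assumed)
    page: str
    new_text: str  # empty string stands in for "page deleted"


def snapshot_at(initial, edits, poll_minute):
    """Return the page contents a poller would see at poll_minute."""
    pages = dict(initial)
    for edit in sorted(edits, key=lambda e: e.minute):
        if edit.minute <= poll_minute:
            pages[edit.page] = edit.new_text
    return pages


# Hypothetical wiki state before the writer starts.
initial = {
    "onboarding": "intro + rotation section",
    "runbook": "runbook body",
    "stub": "draft stub",
    "warning": "old deprecated warning",
}

# Hypothetical save times within the eleven-minute edit.
edits = [
    Edit(2, "stub", "placeholder sentence"),  # halfway through deletion
    Edit(7, "stub", ""),                      # stub fully removed
    Edit(8, "onboarding", "intro"),           # section cut
    Edit(9, "runbook", "runbook body + rotation section"),  # section pasted
    Edit(11, "warning", "reworded warning"),
]

before = snapshot_at(initial, edits, 0)   # what the writer started from
after = snapshot_at(initial, edits, 11)   # what the writer intended
torn = snapshot_at(initial, edits, 6)     # what the poll at minute six ingests
```

Under these assumed timings, `torn` matches neither `before` nor `after`: it keeps the old onboarding section and the old warning, lacks the runbook paste, and carries the placeholder sentence that was never meant to be published.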

The RAG Read-After-Write Race: When Your Vector Index Cites a Document That No Longer Exists

· 10 min read
Tian Pan
Software Engineer

A user asks your assistant a question at 14:32:07. Your retriever fires at 14:32:08 and pulls back five chunks from the policy handbook. The model thinks for a few seconds, drafts a response, and at 14:32:12 streams back an answer that confidently cites section 4.3 — the section that an admin deleted at 14:32:10 because it was wrong. The user reads an authoritative quotation from a document that no longer exists, complete with a clickable link that returns 404.

Nothing in your stack errored. The retriever returned a valid hit. The model produced fluent, grounded prose. The citation pointed at a chunk ID that was valid when the retrieval happened. And yet the answer is, by every reasonable definition, a hallucination: not because the model made something up, but because the world changed underneath the pipeline between the moment it looked and the moment it spoke.

This is the RAG read-after-write race, and most production pipelines have no defense against it.
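The timeline above can be replayed in a few lines. This is a sketch under stated assumptions, not a description of any particular vector store: `ChunkStore`, `stale_citations`, and the chunk IDs are hypothetical names, and the in-memory dict stands in for whatever index the pipeline actually uses. It only shows that a citation can be valid at retrieval time and dangling at answer time, and that a re-check just before responding is enough to notice.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkStore:
    """Hypothetical in-memory stand-in for a retrieval index."""
    chunks: dict = field(default_factory=dict)  # chunk_id -> text

    def retrieve(self, ids):
        # Return only the chunks that exist right now.
        return {cid: self.chunks[cid] for cid in ids if cid in self.chunks}

    def delete(self, chunk_id):
        self.chunks.pop(chunk_id, None)

    def still_exists(self, chunk_id):
        return chunk_id in self.chunks


def stale_citations(store, cited_ids):
    """Return cited chunk IDs that no longer exist at answer time."""
    return [cid for cid in cited_ids if not store.still_exists(cid)]


store = ChunkStore({
    "handbook-4.2": "section 4.2 text",
    "handbook-4.3": "section 4.3 text",
})

# 14:32:08  retriever fires; both chunks are valid hits.
retrieved = store.retrieve(["handbook-4.2", "handbook-4.3"])

# 14:32:10  an admin deletes section 4.3 while the model is drafting.
store.delete("handbook-4.3")

# 14:32:12  just before streaming, re-check every chunk the draft cites.
stale = stale_citations(store, list(retrieved))
```

Here `stale` ends up holding only `"handbook-4.3"`, even though the retrieval step itself returned it without error. What the pipeline should do with a stale citation (drop it, re-retrieve, or flag the answer) is a separate design decision that this sketch does not settle.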