Retrieval Sprawl: When 'Just Add RAG' Becomes the Architectural Diversion

11 min read
Tian Pan
Software Engineer

The pattern is so familiar it's invisible. The model hallucinates a fact, so the team adds a retrieval step. Three weeks later, the model picks the wrong tool from a growing inventory, so they add a retrieval step on the tool catalog. The model's answers feel too generic, so they add a retrieval step on past good answers. A quarter passes, and the system is now a pile of retrievers gluing together a prompt that, fundamentally, still has the original problem.

What changed isn't the failure rate — it's the failure mode's name. "Model wrong" became "retrieval missed," which sounds more tractable but isn't. The eval suite scores higher because the retrieved context is, by construction, in-distribution for the test set. Production tells a different story, but by then the architecture has three retrieval layers, each with its own embedding model, index refresh cadence, and on-call rotation, and nobody wants to be the engineer who proposes ripping them out.
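One way to see the renaming concretely is to report the overall failure rate next to the label each failure gets filed under. The sketch below is a minimal illustration, not part of any particular eval framework: `EvalRecord`, its fields, `failure_label`, and `failure_report` are all hypothetical names, and it assumes the grader can tell whether the needed fact actually appeared in the retrieved context.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EvalRecord:
    """One graded request. All field names here are hypothetical."""
    correct: bool             # did the final answer pass the grader?
    retrieval_used: bool      # did the request go through a retriever?
    evidence_retrieved: bool  # was the needed fact in the retrieved context?


def failure_label(record: EvalRecord) -> Optional[str]:
    """Return the name a failure gets filed under, or None for a pass."""
    if record.correct:
        return None
    if record.retrieval_used and not record.evidence_retrieved:
        return "retrieval missed"
    # Evidence was present (or no retriever was involved) and the answer
    # was still wrong, so the prompt or the model is the remaining suspect.
    return "model wrong"


def failure_report(records: List[EvalRecord]) -> Dict[str, object]:
    """Summarize the overall failure rate alongside the per-label counts."""
    labels = [failure_label(r) for r in records]
    by_label = Counter(label for label in labels if label is not None)
    failures = sum(by_label.values())
    rate = failures / len(records) if records else 0.0
    return {"failure_rate": rate, "by_label": dict(by_label)}
```

Run against the same production sample before and after adding a retriever, a report like this makes the pattern visible: if `failure_rate` barely moves while the counts shift from "model wrong" to "retrieval missed", the new layer renamed the problem rather than fixing it.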

This is retrieval sprawl. It's an architectural diversion: a way of moving a hard problem (prompt design, model capability, ambiguous specifications) into a more comfortable problem (information retrieval engineering) without actually solving anything.