Agentic RAG: When Your Retrieval Pipeline Needs a Brain

10 min read
Tian Pan
Software Engineer

Ninety percent of agentic RAG projects failed in production in 2024. Not because the technology was broken, but because engineers wired up vector search, a prompt, and an LLM, called it a retrieval pipeline, and shipped — without accounting for the compounding failure costs at every layer between query and answer.

Classic RAG is a deterministic function: embed query → vector search → stuff context → generate. It runs once, in one direction, with no feedback loop. That works when queries are clean single-hop lookups against a well-chunked corpus. It fails spectacularly when a user asks "compare the liability clauses across these five contracts," or "summarize what's changed in our infra config since the Q3 incident," or any question that requires synthesizing evidence across documents before forming an answer.
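To make the single-pass shape concrete, here is a minimal sketch of that four-step pipeline. Everything in it is an illustrative assumption rather than a real stack: `embed` is a toy bag-of-words stand-in for an embedding model, `vector_search` is a brute-force cosine ranking instead of a vector database, `generate` is a hypothetical placeholder for an LLM call, and `CORPUS` is made-up sample data.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (assumed stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vector_search(query_vec: Counter, corpus: list[str], k: int = 2) -> list[str]:
    """Brute-force nearest-neighbour search: rank every chunk, keep the top k."""
    ranked = sorted(corpus, key=lambda chunk: cosine(query_vec, embed(chunk)), reverse=True)
    return ranked[:k]

def generate(prompt: str) -> str:
    """Hypothetical LLM call; a real pipeline would send the prompt to a model here."""
    return f"[answer based on {len(prompt)} prompt characters]"

def classic_rag(query: str, corpus: list[str], k: int = 2) -> tuple[str, list[str]]:
    """One pass, one direction: embed, search, stuff context, generate."""
    query_vec = embed(query)                      # 1. embed query
    chunks = vector_search(query_vec, corpus, k)  # 2. vector search
    prompt = "Context:\n" + "\n".join(chunks) + f"\n\nQuestion: {query}"  # 3. stuff context
    return generate(prompt), chunks               # 4. generate

# Made-up sample chunks for illustration only.
CORPUS = [
    "The refund policy allows returns within 30 days.",
    "Contract A caps liability at the fees paid in the prior year.",
    "Contract B excludes liability for indirect damages.",
]

if __name__ == "__main__":
    answer, chunks = classic_rag("what is the refund policy", CORPUS, k=1)
    print(chunks)
    print(answer)
```

Note that `classic_rag` never inspects what it retrieved. Whatever `vector_search` returns goes straight into the prompt, so a multi-document question gets one retrieval attempt and no opportunity to re-query.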