The Query Rewrite Layer Your RAG System Is Missing
Most teams tuning a RAG system focus on two levers: chunking strategy and embedding model selection. When retrieval quality degrades, they re-chunk. When recall numbers look bad, they upgrade the embedding model. Both are reasonable moves — but they're optimizing the middle of the pipeline while leaving the highest-leverage point untouched.
The user's query is almost never in the ideal form for vector retrieval. It's terse, colloquial, ambiguous, or assumes context that the index doesn't have. No matter how good your embeddings are, if you're searching with a poorly formed query, you're going to retrieve poorly. The fix isn't downstream — it's transforming the query before it reaches the vector index.
