When Vector Search Fails: Why Knowledge Graphs Handle Queries Embeddings Can't
Vector search has become the default retrieval primitive for RAG systems. Embed your documents, embed the query, find nearest neighbors — it's simple, fast, and works surprisingly well for a wide class of questions. But production deployments keep hitting the same wall: certain queries return garbage results despite high similarity scores, certain multi-document reasoning tasks fail silently, and certain entity-heavy queries degrade to random noise as complexity grows.
The issue isn't embedding quality or index size. It's that semantic similarity is the wrong abstraction for a significant class of retrieval problems. Knowledge graphs aren't a replacement for vector search — they solve a structurally different problem. Understanding which problems belong to which tool is what separates a brittle RAG pipeline from one that holds up in production.
The Two Primitives and What They Actually Do
Vector search answers the question: what documents are semantically similar to this query? It compresses text into a high-dimensional point, then finds nearby points. The fundamental operation is distance in embedding space.
Knowledge graphs answer a different question: what entities exist, how are they related, and what can I infer by traversing those relationships? The fundamental operation is graph traversal: following edges between nodes.
These are not equivalent operations on the same data. Distance and connectivity are orthogonal properties. A document about "treatment protocols for Type 2 diabetes" and a document about "insulin resistance in adolescents" may be far apart in embedding space yet critically connected through a patient entity who has both conditions. Vector search will miss that connection unless the query itself bridges the gap. A graph traversal finds it by following edges.
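The distinction is easy to see in miniature. The sketch below uses toy three-dimensional "embeddings" and a hand-built edge map (both illustrative, not real model output) to show two documents that score as dissimilar by cosine distance yet are one hop apart through a shared patient entity:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy 3-d "embeddings": the two documents point in different directions.
doc_a = [0.9, 0.1, 0.0]   # "treatment protocols for Type 2 diabetes"
doc_b = [0.1, 0.0, 0.9]   # "insulin resistance in adolescents"
print(cosine(doc_a, doc_b))  # low similarity: vector search won't pair them

# The same documents, linked through a shared patient entity in a graph.
edges = {
    "patient:123": {"doc_a", "doc_b"},  # the patient has both conditions
}
connected = "doc_a" in edges["patient:123"] and "doc_b" in edges["patient:123"]
print(connected)  # True: one hop through the patient node finds the link
```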
Where Semantic Similarity Breaks Down
Multi-hop relationship queries are the most obvious failure mode. "Find all papers that cite papers which cite Smith 2019" requires traversing node → node → node. No amount of embedding quality makes this expressible as a nearest-neighbor search. The query is a graph traversal by definition. The same applies to organizational hierarchy queries, supply chain tracing, citation networks, and any question with "of the X that Y" structure.
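The citation example can be written directly as a two-hop traversal. This is a minimal in-memory sketch (a plain adjacency map rather than a graph database), but the shape of the operation is the point: it is edge-following, not distance computation.

```python
# paper -> set of papers that cite it (toy data, illustrative only)
cited_by = {
    "smith2019": {"p1", "p2"},
    "p1": {"p3", "p4"},
    "p2": {"p4"},
}

def citers_of_citers(paper, graph):
    """Papers that cite papers which cite `paper`: two edge hops."""
    first_hop = graph.get(paper, set())
    if not first_hop:
        return set()
    return set().union(*(graph.get(p, set()) for p in first_hop))

print(sorted(citers_of_citers("smith2019", cited_by)))  # ['p3', 'p4']
```

In a production system the same query would be one Cypher or Gremlin pattern over a graph database; no embedding model, however good, expresses it as a nearest-neighbor lookup.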
Entity disambiguation under noise is a subtler failure. When the same real-world entity appears under different surface forms — "JPMorgan", "JP Morgan Chase", "JPM", "the bank" — vector search handles it poorly without explicit entity resolution. Embeddings compare mention vectors to candidate entity vectors but ignore structural context: which other entities co-occur, what relationships exist, what the local graph neighborhood looks like. EDEGE and similar hybrid approaches that combine semantic embeddings with subgraph structure consistently outperform pure embedding disambiguation, because the graph structure provides global semantic context that a single embedding vector cannot capture.
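One way to see why structural context helps: blend the embedding similarity score with the overlap between the mention's co-occurring entities and each candidate's graph neighborhood. The scoring function, candidate data, and blending weight below are all illustrative assumptions, not taken from any specific system.

```python
# Hypothetical disambiguation scoring: semantic similarity blended with
# graph-neighborhood overlap. All names and numbers are illustrative.
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

candidates = {
    "JPMorgan Chase (bank)": {"sim": 0.62, "neighbors": {"Jamie Dimon", "NYSE", "Chase"}},
    "J. P. Morgan (person)": {"sim": 0.58, "neighbors": {"US Steel", "Gilded Age"}},
}
mention_context = {"Jamie Dimon", "NYSE"}  # entities co-occurring with the mention "JPM"

def score(c, alpha=0.5):
    # alpha blends embedding similarity with structural overlap
    return alpha * c["sim"] + (1 - alpha) * jaccard(mention_context, c["neighbors"])

best = max(candidates, key=lambda name: score(candidates[name]))
print(best)  # the bank wins on structural context despite close sim scores
```

The embedding scores alone are nearly tied; the neighborhood overlap is what separates the candidates.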
Aggregation queries collapse entirely. "How many documents mention both X and Y in the context of Z?" is a structured query over document metadata and content. Vector search returns similar documents, not an answer to a combinatorial question. Research benchmarks show graph-based retrieval performing 3x better on aggregation queries than vector RAG, specifically because graph traversal can count edges, filter by node properties, and aggregate across relationships.
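The aggregation case reduces to set operations over entity-document edges plus a property filter, which is trivial for a graph and inexpressible as a nearest-neighbor lookup. A minimal sketch with made-up data:

```python
# entity -> documents that mention it (toy data)
mentions = {
    "X": {"d1", "d2", "d3"},
    "Y": {"d2", "d3", "d5"},
}
doc_topics = {"d1": "Z", "d2": "Z", "d3": "W", "d5": "Z"}  # context property per doc

# "How many documents mention both X and Y in the context of Z?"
count = sum(1 for d in mentions["X"] & mentions["Y"] if doc_topics[d] == "Z")
print(count)  # 1: only d2 mentions both X and Y in the context of Z
```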
Cross-document reasoning at scale degrades as the number of entities grows. In controlled benchmarks, vector RAG accuracy drops to near 0% when queries involve five or more distinct entities. GraphRAG maintains stable performance at ten or more entities, because it explicitly models relationships rather than relying on cosine similarity to discover connections that span documents.
There are also silent failure modes that are harder to catch. A version mismatch between the model that embedded the query and the model that built the index produces valid-looking similarity scores backed by meaningless comparisons. Rare terms, SKUs, specific identifiers, and short queries without semantic context all suffer from overgeneralization: the embedding encodes "similar meaning" when you need an exact match.
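The version-mismatch failure is cheap to guard against: store the embedding model identifier alongside the index and refuse to compare vectors from a different model. A defensive sketch, with illustrative field names and a placeholder embedder:

```python
# Metadata stored with the index; field names are an assumption.
INDEX_METADATA = {"embedding_model": "text-embedder-v2", "dim": 768}

def embed_query(text, model_id="text-embedder-v2"):
    # Fail loudly instead of silently comparing incompatible vectors.
    if model_id != INDEX_METADATA["embedding_model"]:
        raise ValueError(
            f"query model {model_id!r} != index model "
            f"{INDEX_METADATA['embedding_model']!r}: scores would be meaningless"
        )
    return [0.0] * INDEX_METADATA["dim"]  # placeholder for the real model call

vec = embed_query("find the Q3 SKU report")          # OK: models match
# embed_query("...", model_id="text-embedder-v3")    # would raise ValueError
```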
What Knowledge Graphs Bring to Retrieval
A knowledge graph represents entities as nodes and relationships as typed edges. "Author A wrote Paper B, which cites Paper C, which was funded by Organization D" becomes a traversable structure. Queries that require following these chains — breadth-first or depth-first — become expressible operations instead of approximations.
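The chain in that sentence can be stored as typed-edge triples and walked edge type by edge type. A minimal sketch with a linear scan standing in for a real graph index:

```python
# The chain "A wrote B, B cites C, C funded_by D" as typed edges (triples).
triples = [
    ("AuthorA", "WROTE", "PaperB"),
    ("PaperB", "CITES", "PaperC"),
    ("PaperC", "FUNDED_BY", "OrgD"),
]

def follow(node, rel):
    """Return targets reachable from `node` via edges of type `rel`."""
    return [t for (s, r, t) in triples if s == node and r == rel]

# Who funded the work cited by AuthorA's paper? Three typed hops.
paper = follow("AuthorA", "WROTE")[0]
cited = follow(paper, "CITES")[0]
print(follow(cited, "FUNDED_BY"))  # ['OrgD']
```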
Microsoft's GraphRAG research demonstrated this concretely on news corpora. Using thousands of Russian and Ukrainian news articles, GraphRAG discovered entities like "Novorossiya" and traced relationship chains across documents where baseline vector RAG returned nothing relevant. The difference wasn't retrieval quality on individual documents — it was the ability to connect information that was distributed across documents and linked only through entity relationships.
The architecture has two stages: first, an LLM extracts entities and relationships from source documents; second, graph community detection generates domain-specific summaries at multiple granularities. The retrieval path then uses graph traversal rather than nearest-neighbor search, while still supporting vector and full-text search where those are appropriate.
For entity-heavy domains — healthcare records, legal documents, financial filings, technical specifications — this architecture produces measurable results. A healthcare implementation connecting patient records, research literature, and treatment protocols reported 18% improvement in diagnostic accuracy for complex cases and 31% reduction in treatment plan development time. The improvement came from surfacing relationships that existed in the data but were invisible to semantic similarity.
Hybrid Retrieval: When You Need Both
The practical architecture for most production systems isn't a choice between graphs and vectors — it's running both and merging results.
HybridRAG pipelines operate in three stages:
- Vector search finds semantically relevant entities and document chunks
- Graph traversal explores relationships between the entities the vector search returned
- Weighted merging combines results using a scoring system that accounts for both similarity and graph distance
This addresses the core weakness of each approach in isolation. Pure vector search surfaces relevant individual documents but misses relationships. Pure graph retrieval is precise about relationships but depends entirely on the quality of entity extraction and graph construction. Hybrid retrieval uses the vector component to find entry points into the graph, then uses graph traversal to follow the connections that semantics alone can't surface.
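The weighted-merging stage can be sketched in a few lines. This assumes each retriever returns `{doc_id: score}` maps already normalized to [0, 1]; the weights are illustrative and would be tuned per corpus.

```python
def merge(vector_hits, graph_hits, w_vec=0.6, w_graph=0.4):
    """Linear weighted merge of two normalized score maps, best first."""
    docs = set(vector_hits) | set(graph_hits)
    scored = (
        (d, w_vec * vector_hits.get(d, 0.0) + w_graph * graph_hits.get(d, 0.0))
        for d in docs
    )
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

vector_hits = {"d1": 0.9, "d2": 0.7}   # semantic entry points
graph_hits = {"d2": 0.8, "d3": 0.9}    # documents one hop away in the graph
print(merge(vector_hits, graph_hits))
# d2 ranks first: relevant both semantically and structurally
```

A document that scores moderately in both channels can outrank one that scores highly in only one, which is exactly the behavior hybrid retrieval is after.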
The technology stack for a hybrid pipeline typically involves parallel indexing: a vector index (Pinecone, Qdrant, Weaviate) handles semantic retrieval; a graph database (Neo4j, Amazon Neptune) handles relationship traversal; a BM25 index handles keyword and identifier exact match. The retrieval layer executes all three, merges results with a weighted scorer, and passes the combined context to the generation step.
Research benchmarks on this hybrid approach show +11% context relevance and +8% factual correctness compared to either approach alone. The gains come from different query types playing to each component's strengths.
The Construction Problem
The main reason teams stick with vector search isn't theoretical — it's operational. Building and maintaining a knowledge graph is significantly more expensive than maintaining an embedding index.
Entity resolution is the hardest part. Unresolved and semantically duplicated entities create inconsistent graphs where "JPMorgan" and "JP Morgan" are separate nodes with no connecting edge, making relationship queries fail. Traditional knowledge graph construction is often domain-dependent, semi-automated, and requires predefined entity taxonomies with extensive manual annotation.
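The simplest layer of entity resolution is canonicalizing surface forms before node creation. The alias table below is a hand-maintained illustration; real pipelines layer fuzzy matching, embedding similarity, and human review on top of it.

```python
import re

# Hand-maintained alias table; illustrative, not a standard resource.
ALIASES = {
    "jpmorgan": "JPMorgan Chase",
    "jp morgan": "JPMorgan Chase",
    "jp morgan chase": "JPMorgan Chase",
    "jpm": "JPMorgan Chase",
}

def canonicalize(mention):
    # Lowercase, strip punctuation, look up; fall back to the raw form.
    key = re.sub(r"[^a-z0-9 ]", "", mention.lower()).strip()
    return ALIASES.get(key, mention)

nodes = {canonicalize(m) for m in ["JPMorgan", "JP Morgan Chase", "JPM"]}
print(nodes)  # one node instead of three disconnected ones
```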
LLM-assisted construction has made this substantially more tractable. Tools like Neo4j's LLM Graph Builder can convert unstructured text — PDFs, documents, transcripts — to knowledge graphs using multiple LLM providers without manual annotation. The accuracy limitations (LLMs hallucinate relationships, miss nuanced entity types outside their training distribution) require validation pipelines and human review for high-stakes domains, but the operational cost is now an order of magnitude lower than it was before 2024.
Dynamic updates remain unsolved at scale. Embedding indexes can be updated incrementally by adding new vectors. Knowledge graphs require more complex update logic: new entities must be resolved against existing ones, new relationships must be validated for consistency, and graph structure changes can affect traversal paths in non-obvious ways. For rapidly changing corpora, this is a genuine constraint.
The temporal dimension is also worth calling out explicitly. Vector search has no concept of time — a 2015 document and a 2025 document are equidistant from the query if their embeddings are similar. Hybrid systems with temporal metadata filters handle this by restricting retrieval to documents within specified time windows, but this is bolted on rather than intrinsic to the retrieval primitive.
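A bolted-on temporal filter is typically just a metadata predicate applied before or after scoring. A sketch, assuming each hit carries a `year` field (the field name and data are illustrative):

```python
hits = [
    {"doc": "report-2015", "year": 2015, "score": 0.91},
    {"doc": "report-2024", "year": 2024, "score": 0.88},
    {"doc": "report-2025", "year": 2025, "score": 0.86},
]

def filter_window(results, start, end):
    """Keep only hits whose year falls inside the inclusive window."""
    return [h for h in results if start <= h["year"] <= end]

recent = filter_window(hits, 2023, 2025)
print([h["doc"] for h in recent])  # ['report-2024', 'report-2025']
```

Note that the 2015 document had the highest similarity score; only the metadata filter, not the retrieval primitive itself, keeps it out.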
Decision Framework
The question of when to use which approach comes down to what the query structure actually requires:
Use vector search when:
- Queries are semantic and open-ended ("what documents discuss X?")
- The corpus is unstructured and relationship extraction would be expensive
- Retrieval latency is a hard constraint
- The domain doesn't have well-defined entities with consistent naming
Use knowledge graphs when:
- Queries require multi-hop traversal ("find all X that are related to Y through Z")
- Entity disambiguation across documents is critical
- Aggregation over relationships is needed
- The domain has well-defined entities (healthcare, legal, technical specifications)
- Cross-document reasoning is the primary use case
Use hybrid when:
- The query mix includes both semantic and relational patterns
- Entity-heavy queries must work alongside free-form semantic questions
- The corpus is large enough that neither approach alone covers all retrieval needs
What This Means for Your Architecture
The failure mode to watch for in production is confident retrieval of incorrect context. Vector search returns high-similarity results even when those results don't actually answer the query — the similarity score tells you something about semantic relatedness, not about whether the retrieved content contains the answer. For multi-hop queries, the retrieved documents are often individually relevant but fail to surface the connecting information. The system appears to be working until you run evaluations on the query types that actually matter for your use case.
If your query distribution is dominated by simple semantic questions, vector search is the right default. It's cheaper to build, easier to maintain, and handles the majority of retrieval workloads adequately.
If your query distribution includes multi-hop reasoning, entity disambiguation, or aggregation across relationships — and most enterprise knowledge bases eventually encounter all three — the hybrid architecture is the right investment. Start by instrumenting retrieval failures against real queries to identify which failure mode you're actually hitting. Entity-heavy queries degrading with scale and relationship queries returning irrelevant but semantically similar context are the clearest signals that graph traversal needs to enter the picture.
The underlying principle is that retrieval primitives should match query structure. Semantic similarity is a powerful primitive for a specific class of questions. Relationship traversal is a different primitive for a different class. Getting them confused is the root cause of most retrieval failures that can't be fixed by scaling the embedding model.