Knowledge Graphs Are Back: Why RAG Teams Are Adding Structure to Their Retrieval
Your RAG pipeline answers single-fact questions beautifully. Ask it "What is our refund policy?" and it nails it every time. But ask "Which customers on the enterprise plan filed support tickets about the billing API within 30 days of their contract renewal?" and it falls apart. The answer exists in your data — scattered across three different document types, connected by relationships that cosine similarity cannot see.
This is the multi-hop reasoning problem, and it's the reason a growing number of production RAG teams are grafting knowledge graphs onto their vector retrieval pipelines. Not because graphs are trendy again, but because they've hit a concrete accuracy ceiling that no amount of chunk-size tuning or reranking can fix.
The Multi-Hop Wall
Vector search works by embedding text into high-dimensional space and finding the chunks closest to your query. For single-hop questions — "What does feature X do?" — this is remarkably effective. The relevant chunk sits in a predictable neighborhood of the embedding space.
Multi-hop questions break this model. Consider: "What scientific work influenced the mentor of the person who discovered the double helix structure of DNA?" Answering this requires three separate retrieval steps:
- Watson and Crick discovered the double helix
- Their mentor was Lawrence Bragg
- Bragg was influenced by X-ray crystallography work
Each fact lives in a different chunk, and only the first shares much vocabulary with the original question. The bridging facts, the ones connecting person to mentor to influence, score low on cosine similarity because they don't share surface-level vocabulary with the question.
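To make the vocabulary gap concrete, here is a minimal sketch that scores the three facts above against the question. It uses bag-of-words cosine similarity as a crude stand-in for a real embedding model (an assumption for illustration; learned embeddings capture more than word overlap), and the names `bow`, `cosine`, and `chunks` are hypothetical.

```python
import math
import re
from collections import Counter

def bow(text):
    """Lowercased bag-of-words vector, stored as a Counter."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

question = ("What scientific work influenced the mentor of the person "
            "who discovered the double helix structure of DNA?")

# The three facts from the example, one per chunk.
chunks = {
    "hop1": "Watson and Crick discovered the double helix",
    "hop2": "Their mentor was Lawrence Bragg",
    "hop3": "Bragg was influenced by X-ray crystallography work",
}

q_vec = bow(question)
scores = {name: cosine(q_vec, bow(text)) for name, text in chunks.items()}

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In this toy setup the first-hop chunk outranks both bridging chunks, so a top-1 or top-2 cutoff would drop at least one link in the chain.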
Microsoft's research quantified this gap: baseline RAG captures only 22–32% of comprehensive answers on multi-hop questions. GraphRAG, by contrast, achieves 72–83%, roughly a 3x improvement that comes from preserving the relational structure that vectors discard.
How Graph-Enhanced Retrieval Actually Works
The architecture is simpler than most teams expect. You're not replacing your vector database — you're adding a layer that captures entity relationships your embeddings miss.
Graph construction starts with entity extraction. An LLM (or a lighter NLP pipeline) reads your documents and pulls out entities and their relationships: (Watson) --[MENTORED_BY]--> (Bragg), (Bragg) --[INFLUENCED_BY]--> (X-ray Crystallography). These triples form a knowledge graph that sits alongside your vector index.
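As a rough sketch of what those triples look like once loaded, the snippet below stores them in a plain adjacency dictionary and follows a fixed sequence of edge labels. The helper names `build_graph` and `follow` are hypothetical, and a production system would more likely use a graph database than an in-memory dict.

```python
from collections import defaultdict

# The two triples from the example above.
triples = [
    ("Watson", "MENTORED_BY", "Bragg"),
    ("Bragg", "INFLUENCED_BY", "X-ray Crystallography"),
]

def build_graph(triples):
    """Map each subject to a list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

def follow(graph, start, predicates):
    """Walk one edge per predicate from `start`; return the node reached or None."""
    node = start
    for predicate in predicates:
        targets = [o for p, o in graph.get(node, []) if p == predicate]
        if not targets:
            return None
        node = targets[0]  # take the first match; real graphs may branch
    return node

graph = build_graph(triples)
answer = follow(graph, "Watson", ["MENTORED_BY", "INFLUENCED_BY"])
print(answer)
```

The two-edge walk from Watson reaches the entity that no single chunk connects to the question.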
At query time, the system runs a dual retrieval path:
- Vector path: standard semantic search for chunks relevant to the query
- Graph path: entity recognition on the query, followed by graph traversal to find connected entities and their associated text
The results merge before being sent to the LLM for generation. The vector path handles straightforward factual retrieval. The graph path handles the relational reasoning — following edges between entities to assemble multi-hop evidence chains.
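The dual path can be sketched in a few lines. Everything here is an illustrative assumption: `vector_path` is a stub standing in for a real vector store query, entity recognition is naive substring matching, and `entity_text` is a hypothetical mapping from each entity to a chunk that mentions it.

```python
from collections import deque

graph = {
    "Watson": [("MENTORED_BY", "Bragg")],
    "Bragg": [("INFLUENCED_BY", "X-ray Crystallography")],
}
entity_text = {
    "Watson": "Watson and Crick discovered the double helix.",
    "Bragg": "Their mentor was Lawrence Bragg.",
    "X-ray Crystallography": "Bragg was influenced by X-ray crystallography work.",
}

def vector_path(query):
    """Stub for semantic search; a real system would query a vector index."""
    return ["Watson and Crick discovered the double helix."]

def graph_path(query, max_hops=2):
    """Find entities named in the query, then collect text within max_hops edges."""
    seeds = [e for e in graph if e.lower() in query.lower()]
    seen = set(seeds)
    queue = deque((s, 0) for s in seeds)
    found = []
    while queue:
        node, depth = queue.popleft()
        if node in entity_text:
            found.append(entity_text[node])
        if depth < max_hops:
            for _, neighbor in graph.get(node, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, depth + 1))
    return found

def merge(vector_chunks, graph_chunks):
    """Concatenate both result lists, dropping duplicates but keeping order."""
    merged, seen = [], set()
    for chunk in [*vector_chunks, *graph_chunks]:
        if chunk not in seen:
            seen.add(chunk)
            merged.append(chunk)
    return merged

query = "What influenced the mentor of Watson?"
context = merge(vector_path(query), graph_path(query))
```

Here `context` holds all three facts in hop order, even though the vector stub returned only the first. Real pipelines usually rerank the merged list rather than simply concatenating it.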
Recent benchmarks on the BABILong dataset showed that graph-RAG with personalized PageRank significantly outperformed GPT-4o's 128k context window on multi-hop questions. The advantage comes from filtering: RAG retrieves only the relevant subgraph, while long-context models struggle with distraction from irrelevant noise.
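For readers unfamiliar with personalized PageRank, the sketch below is a textbook power-iteration version, not the implementation used in the benchmark. The graph, the extra `Franklin` and `Photo 51` nodes, and the parameter defaults are assumptions chosen for illustration. The idea is that the random walk restarts at the query's entities, so nodes unreachable from them score zero and are filtered out.

```python
def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Power iteration where the random walk restarts only at `seeds`."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            out = graph.get(n, [])
            if out:
                share = alpha * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seed set.
                for m in nodes:
                    nxt[m] += alpha * rank[n] * restart[m]
        rank = nxt
    return rank

# Hypothetical entity graph: one chain relevant to the query, one unrelated edge.
graph = {
    "Watson": ["Bragg"],
    "Bragg": ["X-ray Crystallography"],
    "Franklin": ["Photo 51"],
}
rank = personalized_pagerank(graph, seeds={"Watson"})
relevant = [n for n, score in rank.items() if score > 0]
```

With `Watson` as the only seed, the unrelated `Franklin` edge receives no score, which is the filtering effect described above.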
Three Construction Patterns That Work Without a PhD in Ontology
The biggest objection to knowledge graphs has always been construction cost. Traditional knowledge graph projects required months of schema design, domain expert interviews, and manual curation. LLMs have compressed this timeline from months to hours — but you still need to pick the right construction pattern for your use case.
Pattern 1: LLM-Driven Extraction. Feed your documents through an LLM with a prompt like "Extract all entities and relationships from this text as (subject, predicate, object) triples." GPT-4 or Claude achieves about 65.8% accuracy on entity-relation extraction. This is the highest-quality option but also the most expensive, because you're making an LLM call for every document in your corpus.
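A minimal sketch of this pattern follows. The `llm` argument is any callable that takes a prompt string and returns text, so no particular vendor API is assumed. `fake_llm` is a hypothetical stand-in that returns a canned response so the example runs offline, and the regex assumes the model answers with one parenthesized triple per line.

```python
import re

PROMPT = (
    "Extract all entities and relationships from this text as "
    "(subject, predicate, object) triples.\n\nText:\n{text}"
)

# Matches "(a, b, c)" where no part contains commas or parentheses.
TRIPLE_RE = re.compile(r"\(([^(),]+),([^(),]+),([^(),]+)\)")

def extract_triples(text, llm):
    """Send one document to the LLM and parse triples out of its reply."""
    response = llm(PROMPT.format(text=text))
    return [tuple(part.strip() for part in match)
            for match in TRIPLE_RE.findall(response)]

def fake_llm(prompt):
    """Canned reply for illustration only; replace with a real model call."""
    return (
        "(Watson, MENTORED_BY, Bragg)\n"
        "(Bragg, INFLUENCED_BY, X-ray Crystallography)\n"
        "Note: no further relationships found."
    )

triples = extract_triples("Their mentor was Lawrence Bragg.", fake_llm)
```

Lines that don't parse as triples are silently skipped here. Given the extraction accuracy quoted above, a real pipeline would probably want to log or validate them instead.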
- https://neo4j.com/blog/developer/knowledge-graph-vs-vector-rag/
- https://dev.to/ambalogun/why-rag-is-failing-at-complex-questions-and-how-knowledge-graphs-fix-it-5chj
- https://www.zyphra.com/post/understanding-graph-based-rag-and-multi-hop-question-answering
- https://www.microsoft.com/en-us/research/blog/lazygraphrag-setting-a-new-standard-for-quality-and-cost/
- https://arxiv.org/abs/2501.00309
- https://dl.acm.org/doi/10.1145/3777378
