Graph Reasoning Gaps in LLMs: Scaffolding Relational Tasks That Fool Sequence-Trained Models
A common mistake in AI system design is asking a language model to reason over a graph as if it were reading a document. The model will generate a confident, fluent answer. The answer will be wrong in a way that looks right: it will name real nodes, reference plausible paths, and describe relationships that almost exist. Then you discover that your org-chart traversal hallucinates skip-level managers, your dependency resolution misses cycles in graphs with more than ten nodes, and your three-hop knowledge-graph query has a 60% error rate at step two.
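To make the cycle example concrete, here is a minimal sketch of why this failure mode is so easy to measure. The dependency graph and the `ask_llm` stub are hypothetical stand-ins; the deterministic check uses networkx. The same graph that a serialized prompt routinely gets wrong yields its cycle to a single library call.

```python
import networkx as nx

# Hypothetical dependency graph: 11 packages with one cycle buried in them.
edges = [
    ("app", "auth"), ("auth", "tokens"), ("tokens", "crypto"),
    ("crypto", "rng"), ("rng", "logging"), ("logging", "config"),
    ("config", "auth"),  # closes the cycle auth -> ... -> config -> auth
    ("app", "ui"), ("ui", "widgets"), ("widgets", "theme"),
    ("theme", "config"), ("app", "metrics"),
]
graph = nx.DiGraph(edges)

# Ground truth is one deterministic call; no fluent-but-wrong answer possible.
cycles = list(nx.simple_cycles(graph))
print("deterministic:", cycles)  # one six-node cycle through auth ... config

# The sequence-model version: flatten the edges into prose and ask.
prompt = "Given these dependencies, list any circular dependencies:\n" + \
    "\n".join(f"{a} depends on {b}" for a, b in edges)
# answer = ask_llm(prompt)  # hypothetical call; the long cycle is easy to miss
```

The gap between those two paths, a structural query answered structurally versus the same query answered by pattern-matching over flattened text, is the subject of the rest of this piece.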
This is not a prompt-quality problem. It is an architecture problem, and you can diagnose it before writing a single prompt.
