The Intent Classification Layer Most Agent Routers Skip
When you hand your agent a list of 50 tools and let the LLM decide which one to call, accuracy hovers around 94%. Reasonable. Ship it. But when that list grows to 200 tools, which happens faster than anyone expects, accuracy drops to 64%. At 417 tools it hits 20%. At 741 tools it falls to 13.6%. That is still well above the roughly 0.13% you would get by picking one of 741 tools uniformly at random, but it means the router is wrong on more than 86% of calls, which is unusable in practice.
The fix is a pattern that most teams skip: an intent classification layer that runs before tool dispatch. Not instead of the LLM—before it. The classifier narrows the tool namespace so that the LLM only sees the tools relevant to the user's actual intent. The LLM's reasoning stays intact; it just operates on a curated, relevant subset rather than an ever-expanding haystack.
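As a rough illustration of the idea, here is a minimal sketch in Python. Everything in it is hypothetical: the `Tool` type, the tool names, the intent labels, and the keyword-overlap classifier are stand-ins chosen for brevity. A real classifier would more likely be a small trained model or an embedding lookup. The only point is the shape of the flow: classify first, then hand the LLM the narrowed tool list.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    intent: str  # hypothetical intent label this tool is filed under


# Hypothetical registry; a real one would hold hundreds of entries.
TOOLS = [
    Tool("create_invoice", "billing"),
    Tool("refund_payment", "billing"),
    Tool("search_docs", "knowledge"),
    Tool("summarize_page", "knowledge"),
    Tool("book_meeting", "calendar"),
]

# Assumption: a keyword table stands in for a real intent classifier.
INTENT_KEYWORDS = {
    "billing": {"invoice", "refund", "payment", "charge"},
    "knowledge": {"docs", "documentation", "search", "summarize"},
    "calendar": {"meeting", "schedule", "calendar"},
}


def classify_intent(message: str) -> Optional[str]:
    """Return the intent with the most keyword hits, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=lambda intent: scores[intent])
    return best if scores[best] > 0 else None


def tools_for(message: str, tools: List[Tool] = TOOLS) -> List[Tool]:
    """Narrow the tool list to the classified intent before the LLM sees it."""
    intent = classify_intent(message)
    if intent is None:
        # No confident intent: fall back to the full list rather than hide tools.
        return list(tools)
    return [t for t in tools if t.intent == intent]


if __name__ == "__main__":
    subset = tools_for("I need a refund for my last payment")
    print([t.name for t in subset])
```

The fallback to the full list when no intent matches is a design choice in this sketch, not a requirement: it trades some accuracy on ambiguous requests for the guarantee that the classifier never hides the right tool outright.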
This post explains why teams skip it, what the cost looks like when they do, and how to build the layer properly—including the feedback loop that makes it compound over time.
