MCP Tool Deprecation: Why the Model Still Calls the Old Name
You renamed get_user_email to lookup_contact six weeks ago. The new name shipped, the old handler was removed, the changelog noted it, and your eval set passed. Then last Tuesday a customer support engineer pinged you: an agent had returned an error on roughly three percent of its tool calls during the previous week — tool_not_found: get_user_email. The renamed-away name. The one nothing in the live system advertises anymore.
The prior is sticky. The model your agent is talking to was trained on a corpus where get_user_email was overwhelmingly the canonical way to ask "what is this person's email." Even when the tools array you pass at inference time lists only lookup_contact, the model occasionally — under certain context conditions, especially long traces or recovery-after-error states — falls back to the name it remembers. A hard cutover doesn't eliminate the long tail; it just turns soft failures into hard ones.
This is the failure mode the public MCP versioning conversations keep circling. Servers advertise tools dynamically; clients refresh; the protocol gives you tools/list_changed to nudge stale caches. All true. None of it fixes the model's prior. The protocol is the easy part. The hard part is that the thing on the other end of the JSON-RPC pipe is a probability distribution that watched five years of GitHub and your changelog isn't in its weights.
The cutover model is wrong for tools
The mental model most teams reach for is API deprecation: announce a sunset date, send a Deprecation header for six months, return 410 Gone on the cutover date. That model assumes the caller is a piece of software that reads response codes and updates its code. It works for REST clients written by humans because humans react to 410 by filing a Jira ticket.
A model does not read 410 and file a Jira ticket. A model gets a tool_not_found error, decides what to do based on its training distribution and the surrounding context, and frequently retries with the same name. Or it apologizes to the user. Or it hallucinates a different tool name that's also wrong. Or it succeeds via a different path and you never learn the deprecated name was ever attempted, because the error was absorbed into a recovery branch.
The cutover model also assumes the deprecation window is the costly thing — you want the window short so you can remove the old code. With models, the window is the cheap thing. The deprecated path is a one-line shim that forwards to the new name. The costly thing is the failure mode you create the moment that shim disappears.
Tool renames in MCP-mediated agent systems are not API versioning events. They are model-input distribution shifts, and they need the same kind of dual-write discipline you use when migrating a database column.
The deprecation shim that actually works
The pattern that survives contact with production has three parts, none of which the MCP spec mandates but all of which the protocol allows.
Keep both names live. When you rename get_user_email to lookup_contact, you advertise both in tools/list for at least one model-generation cycle and one client-update cycle, whichever is longer. The old tool's handler is a one-line forward to the new tool with the same arguments. The schema is identical or strictly broader. The description on the old tool starts with a deprecation marker — [DEPRECATED: use lookup_contact] — that the model will sometimes read and react to, and that humans will definitely see when they grep your manifest.
Log every call to the deprecated name. Tag it with the originating session ID, the model version, the position in the trace, and whether the call was the first thing the model tried or a retry after some other failure. This is the data you don't have right now and the only data that tells you when it's actually safe to remove the shim. Without it, your decision to remove is a coin flip.
Define the removal trigger by traffic, not by date. "We'll remove get_user_email six weeks from now" is the wrong commitment. "We'll remove it once weekly invocations are under N for two consecutive weeks across all model versions we route to" is the right one. The model is doing the work of telling you when it has internalized the new name; let it.
This is dual-write for tools. The old tool keeps working, the new tool is the canonical one, and the cutover happens on the empirical decay curve of the old name's invocation rate — not on a date some PM picked because the next release branch cuts that day.
- https://modelcontextprotocol.io/specification/versioning
- https://modelcontextprotocol.io/specification/2025-06-18/server/tools
- https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1915
- https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1039
- https://github.com/orgs/modelcontextprotocol/discussions/76
- https://mcp-toolbox.dev/reference/versioning/
- https://www.speakeasy.com/mcp/tool-design/dynamic-tool-discovery
