
You Can't Email a Changelog to a Model: Why API Deprecation Breaks When the Caller Is an LLM

· 10 min read
Tian Pan
Software Engineer

API deprecation is a communication protocol that assumes the receiver can read. You publish a changelog, send an email to registered developers, add a Deprecation header, give six months of notice, and trust that a human on the other end will see the warning, file a ticket, and migrate before the sunset date. That entire workflow quietly stopped working the moment your most active caller became a language model.

An LLM does not subscribe to your developer newsletter. It does not have a Slack channel where someone pastes your migration guide. It rediscovers your API on every single call — from a tool description it was handed, a documentation page that may be eighteen months stale, or a memory of how your API looked in its training data. There is no persistent client you can version, notify, or page. Each request is a fresh negotiation with an entity that has no memory of your last announcement and no obligation to read your next one.

This is not a hypothetical. As agents become the dominant consumers of internal and external APIs, the deprecation playbook every backend team has used for fifteen years is failing in a specific, diagnosable way — and most teams discover it only when a "deprecated for six months" endpoint is still serving an agent in production with no path to make it stop.

The caller you cannot reach

Traditional deprecation rests on an assumption so basic nobody writes it down: there is a stable client, and behind that client is a person. The OpenAI deprecations page, RFC 8594's Sunset header, Zalando's API guidelines — they all describe the same shape. Announce the change, give a generous window, point to a migration guide, and the maintainer of the calling code does the work.

An LLM caller breaks every link in that chain.

It has no stable identity. The same logical "client" might be three different model versions across a week of rollouts, each with different training data and different memories of your API. You cannot pin it, you cannot enumerate it, and you cannot diff what it knew yesterday against what it knows today.

It has no inbox. There is no email address, no registered webhook, no dashboard login. Your migration email lands nowhere. The human who owns the agent might read it — but they are not in the loop on the specific call where the agent picked your deprecated endpoint, and they often do not even know their agent is calling you.

It does not read your docs at call time. It reads whatever fragment of your API surface made it into the context window: a tool schema, a retrieved doc chunk, or nothing at all, in which case it falls back to memory. Research on code generation against evolving APIs found that models often default to memorized API patterns even when the prompt explicitly provides updated information, because the deprecated call is simply the higher-probability continuation. Your fresh documentation loses to your old documentation because the old documentation was in the training set and was reinforced many times over.

So the "six months of notice" you gave reached exactly one audience: the humans who already know how to migrate. The caller actually generating the traffic never got the memo, because there is no memo format it can receive.

Why the model keeps calling the dead endpoint

It helps to be precise about how an LLM ends up calling something you deprecated, because each path needs a different fix.

Path one: training-data memory. The model learned your API from GitHub, Stack Overflow, and your own docs as they existed before its knowledge cutoff. If you renamed POST /v1/charge to POST /v1/payments last year, the model still has thousands of examples of the old name and a handful of the new one. Studies of LLMs generating code against post-cutoff APIs found old-API usage and hallucinated APIs among the dominant failure modes — over 40% of execution failures traced to wrong parameters or fabricated behavior. The model is not being careless. It is being statistical.

Path two: stale retrieval. Your agent uses RAG over a documentation corpus, and that corpus was indexed before the deprecation. The model is doing exactly what you asked — grounding its call in retrieved docs — and the docs are wrong. Deprecation that updates the canonical doc site but not every downstream index has not actually propagated.

Path three: a stale tool schema. The agent was configured with an MCP tool definition or an OpenAPI spec that still describes the deprecated surface. Here is the subtle part: in the LLM era, a tool's description is part of its contract. Changing the wording of a parameter description can change the model's probability of selecting that tool, even if the underlying code is untouched. A deprecation that does not reach the tool schema is invisible to such an agent, because the schema is the only description of your API it consults at call time.

The common thread: in every path, the model is calling your old endpoint because the old endpoint is what its available evidence describes. "Deprecated" is a fact about your intentions. The model only sees your interfaces. If the interface still presents the endpoint as live and selectable, the endpoint is live and selectable, regardless of what your changelog says.

"Deprecated" is a state of mind until something enforces it

A Deprecation response header is a message to a human reading logs. An LLM generating the next request does not parse your response headers as policy. At best it sees them as data in a tool result, and it has no instinct to treat that data as a directive to change behavior. Soft deprecation ("still works, please stop") assumes the caller feels social pressure or schedules migration work. The model feels nothing and schedules nothing. It will call the deprecated endpoint on request one and request one million with identical enthusiasm.

This means soft deprecation does not exist for LLM callers. There is no honor-system middle state. The endpoint is either selectable or it is not, and what decides between the two is the enforcing layer that sits between the model and your API. Concretely, that layer is your tool gateway, your MCP server, or your API gateway: the component that decides which tools the model is offered and what happens when it calls one.

That reframes the entire deprecation effort. The work is not "announce the change and wait." The work is "change what the model can see and do." Three mechanisms actually carry weight with a model caller:

  • Remove it from the tool surface. If the deprecated endpoint is no longer in the tool schema handed to the model, the model cannot select it. This is the single most effective control, because it operates on the only API representation the model reads. The tradeoff is timing — pull it too early and in-flight agents break.
  • Make the error message do the teaching. When a deprecated call does come in, the response is your one guaranteed channel to the model. The model reads tool results. A 410 with a body of "endpoint removed" teaches nothing. A response that states what failed, why, and the exact replacement call with correct parameter names gives the model everything it needs to self-correct on the very next turn, and a well-built agent loop will do exactly that.
  • Shim it server-side. Route the deprecated path to the new implementation transparently, so old calls keep producing correct results while you migrate the surface. This buys time without breaking agents, at the cost of carrying the shim.
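The first of these mechanisms can be sketched as a filter that runs every time the gateway builds the tool list for a model. This is a minimal illustration, not a real gateway API: the Tool class, the hidden_from field, the tool names, and the dates are all hypothetical, and only the /v1/charge to /v1/payments rename echoes the example used earlier.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Tool:
    """One entry in the tool surface offered to the model (hypothetical shape)."""
    name: str
    description: str
    # Assumed field: the first day the tool is no longer offered to the model.
    hidden_from: Optional[date] = None


def visible_tools(registry: list[Tool], today: date) -> list[Tool]:
    """Return only the tools the model should be able to select today."""
    return [
        tool for tool in registry
        if tool.hidden_from is None or today < tool.hidden_from
    ]


# Hypothetical registry: the legacy tool leaves the surface on 2025-01-01.
REGISTRY = [
    Tool("create_charge", "Legacy wrapper for POST /v1/charge.",
         hidden_from=date(2025, 1, 1)),
    Tool("create_payment", "Create a payment via POST /v1/payments."),
]

if __name__ == "__main__":
    for tool in visible_tools(REGISTRY, date.today()):
        print(tool.name)
```

Because the filter runs on every call, the removal takes effect for every agent at once, which is also why the timing tradeoff above matters.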

Notice none of these is "send a notification." You are not informing a caller. You are reshaping the environment the caller acts in.

Designing deprecation for an audience that reads only interfaces

Once you accept that the model only ever sees interfaces and tool results, deprecation becomes a design problem with a clear target. A few principles follow directly.

Treat the tool schema as the contract, and version it. The MCP ecosystem has been converging on exactly this — using metadata fields to signal version, what a tool deprecates, and minimum protocol versions, because there is no other place a model-facing change can live. The safest evolution strategy is the boring one: only add, never silently mutate. A new endpoint gets a new tool entry; the old one gets pulled from the surface on a schedule, not patched in place. If you must change a description, treat it as a behavioral change and re-test, because you just edited the contract.
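One way to enforce "only add, never silently mutate" is a check that compares two snapshots of the tool surface and flags any existing tool whose definition changed, including its description. The sketch below assumes tool definitions are plain JSON-serializable dictionaries keyed by tool name. The function names and the example tools are hypothetical and not part of any MCP library.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Stable hash of a tool definition, description text included."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_surface(old: dict[str, dict], new: dict[str, dict]) -> dict[str, list[str]]:
    """Classify changes between two tool surfaces keyed by tool name."""
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # Any change to an existing entry counts as a contract edit.
        "mutated": sorted(
            name for name in shared
            if fingerprint(old[name]) != fingerprint(new[name])
        ),
    }


if __name__ == "__main__":
    old = {"create_charge": {"description": "Create a charge."}}
    new = {
        "create_charge": {"description": "Create a charge (legacy)."},
        "create_payment": {"description": "Create a payment."},
    }
    print(diff_surface(old, new))
```

A non-empty "mutated" list could fail a CI check or trigger the behavioral re-test, on the assumption that a reworded description is a contract change.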

Write error messages for a model, not a sysadmin. Schema-design research on tool APIs is blunt about this: generic errors rarely lead to a successful retry, while errors carrying schema hints — the expected field, the allowed values, the corrective action — let the model issue a corrected call without any human in the loop. Your deprecation error is a teaching opportunity aimed at the only reader who matters. Spend words on it.
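As a rough illustration, a deprecation error written for a model might carry the failed call, the reason, the replacement call, and any parameter renames in one structured body. The field names and the parameter rename below are assumptions made for this sketch. Only the 410 status and the /v1/charge to /v1/payments rename come from the discussion above.

```python
import json


def deprecation_error(old_call: str, new_call: str,
                      renamed_params: dict[str, str]) -> tuple[int, str]:
    """Build a 410 response body that tells the model how to retry."""
    body = {
        "error": "endpoint_removed",
        "failed_call": old_call,
        "reason": f"{old_call} has been removed and no longer accepts requests.",
        "replacement_call": new_call,
        # Maps each old parameter name to the name the replacement expects.
        "renamed_parameters": renamed_params,
        "corrective_action": (
            f"Retry the same request against {new_call}, "
            "renaming parameters as listed in renamed_parameters."
        ),
    }
    return 410, json.dumps(body, indent=2)


if __name__ == "__main__":
    # Hypothetical rename: the old API took amount_cents, the new one takes amount.
    status, body = deprecation_error(
        "POST /v1/charge", "POST /v1/payments", {"amount_cents": "amount"}
    )
    print(status)
    print(body)
```

The body stays short enough to fit in a tool result, and every field is something the model can act on in its next call.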

Plan for hard cutoffs, because soft ones do not land. With human callers you lean on the long soft-deprecation window. With model callers, the soft window accomplishes nothing except letting traffic accumulate on the dead endpoint. The honest design is a shim plus a dated hard cutoff: keep the old surface working through transparent routing, but stop presenting it, and commit to a removal date you actually enforce at the gateway.
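The shim-plus-cutoff design can be reduced to one routing decision at the gateway. In this sketch the SHIMS table, the cutoff date, and the route function are hypothetical. A real gateway would forward the request and translate parameters instead of returning a path.

```python
from datetime import date

# Hypothetical shim table: deprecated path -> (replacement path, last day served).
SHIMS = {
    "/v1/charge": ("/v1/payments", date(2025, 6, 30)),
}


def route(path: str, today: date) -> tuple[int, str]:
    """Return (status, path to serve or point the caller at)."""
    if path not in SHIMS:
        return 200, path  # Not deprecated: serve as requested.
    replacement, cutoff = SHIMS[path]
    if today <= cutoff:
        # Shim window: the old path is served by the new implementation.
        return 200, replacement
    # Hard cutoff: refuse the call and name the replacement.
    return 410, replacement


if __name__ == "__main__":
    print(route("/v1/charge", date.today()))
```

Keeping the cutoff in the same table as the shim means the removal date is enforced by the routing code itself, not by someone remembering to delete the route.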

Instrument the surface, not the mailing list. You cannot ask agents who is still on the old version. You can watch your own gateway. Track which tool schemas are being served, which deprecated paths still see traffic, and which agent operators that traffic belongs to. That telemetry, not a registered-developer list, is your real picture of migration progress.
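A minimal version of that telemetry is a count of deprecated-path requests grouped by operator. The log record shape used here (dictionaries with "operator" and "path" keys), the operator names, and the DEPRECATED_PATHS set are assumptions for illustration. In practice the records would come from gateway access logs.

```python
from collections import Counter

# Hypothetical set of paths that have been deprecated.
DEPRECATED_PATHS = {"/v1/charge"}


def deprecated_traffic(log: list[dict]) -> Counter:
    """Count requests to deprecated paths, keyed by (operator, path)."""
    counts: Counter = Counter()
    for entry in log:
        if entry["path"] in DEPRECATED_PATHS:
            counts[(entry["operator"], entry["path"])] += 1
    return counts


if __name__ == "__main__":
    sample_log = [
        {"operator": "acme-agent", "path": "/v1/charge"},
        {"operator": "acme-agent", "path": "/v1/payments"},
        {"operator": "beta-bot", "path": "/v1/charge"},
        {"operator": "acme-agent", "path": "/v1/charge"},
    ]
    for (operator, path), n in deprecated_traffic(sample_log).most_common():
        print(f"{operator} {path} {n}")
```

Watching these counts fall toward zero before the cutoff date gives a more direct migration signal than any acknowledgement from a mailing list.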

The shift: deprecation is an environment change, not an announcement

The deeper lesson generalizes past APIs. For fifteen years, deprecation was a social process: you communicated an intention, gave humans time, and trusted them to act. That worked because the caller was a person who could read, remember, and plan.

The caller is increasingly not a person. It is a model that reads only what you put in front of it, remembers only what it was trained on, and plans nothing. You cannot inform it. You can only change what it sees — the tool schema it is offered, the error messages it gets back, the endpoints the gateway will actually route. Deprecation stops being a message and becomes a property of the environment you construct for the model on every call.

Teams that internalize this stop writing migration emails for an audience that cannot read them and start treating the tool gateway as the real deprecation control plane. The changelog still matters — for the humans who own the agents. But the model was never going to read it. If you want a model to stop calling something, the only language it understands is the interface itself.
