
The JSON Schema Your Output Passed and Your Downstream Consumer Rejected for Semantic Drift

· 10 min read
Tian Pan
Software Engineer

A JSON schema validates the shape of your output. It does not validate the meaning of the values inside that shape. For nine months, every output your AI pipeline produces passes validation cleanly, your monitoring shows schema validity at 100%, and your team treats a schema-valid response as a contractually correct one. Then a model upgrade ships, every output continues to validate, and your Slack alerting channel goes from 50 messages a day to 800 overnight.

The schema did not break. The distribution of values inside it did. That is the gap most AI teams discover in production: the JSON contract is a type system, not a behavior system, and a downstream consumer was depending on a value distribution the contract was never asked to enforce.

The Failure Mode

Picture a triage pipeline. The model takes an inbound support ticket and outputs a JSON object that looks like this:

{
  "category": "billing" | "technical" | "account",
  "priority": "low" | "medium" | "high",
  "summary": string
}

The schema is enforced at the integration boundary. The validation layer rejects anything that does not match. The downstream service consumes these objects and applies a business rule: any ticket with priority: high escalates to an on-call Slack channel.

For nine months, the historical mix is roughly 70% low, 25% medium, 5% high. The on-call channel sees about 50 escalations per day, which is a load one person can triage. Capacity planning, alert routing, and the rotation schedule are all calibrated to that mix.

A model upgrade ships on a Tuesday morning. The new model is more confident in its judgments. The same prompts now yield a 30/35/35 split across (low, medium, high). Every output is still schema-valid. Every enum value is still in the allowed set. The validation layer reports zero rejections. The inference team's dashboard shows the same green it has shown for nine months.
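To make "zero rejections" concrete, here is a minimal sketch in Python. It assumes a hand-rolled shape check standing in for a real JSON Schema validator, and synthetic batches built to match the two priority mixes above. The names `is_schema_valid` and `make_batch` are hypothetical, not part of any library.

```python
from collections import Counter

# Allowed enum values, taken from the triage schema in the article.
ALLOWED = {
    "category": {"billing", "technical", "account"},
    "priority": {"low", "medium", "high"},
}

def is_schema_valid(ticket: dict) -> bool:
    """Stand-in for a schema validator: checks shape and enum membership only."""
    if set(ticket) != {"category", "priority", "summary"}:
        return False
    if not isinstance(ticket["summary"], str):
        return False
    return all(
        isinstance(ticket[field], str) and ticket[field] in allowed
        for field, allowed in ALLOWED.items()
    )

def make_batch(low: int, medium: int, high: int) -> list[dict]:
    """Build a synthetic batch with the given priority counts (illustrative data)."""
    priorities = ["low"] * low + ["medium"] * medium + ["high"] * high
    return [
        {"category": "billing", "priority": p, "summary": "example ticket"}
        for p in priorities
    ]

old_batch = make_batch(70, 25, 5)    # the 70/25/5 mix before the upgrade
new_batch = make_batch(30, 35, 35)   # the 30/35/35 mix after the upgrade

for name, batch in [("old", old_batch), ("new", new_batch)]:
    rejected = sum(not is_schema_valid(t) for t in batch)
    high = Counter(t["priority"] for t in batch)["high"]
    print(f"{name}: rejected={rejected}, high={high}")
# old: rejected=0, high=5
# new: rejected=0, high=35
```

Both batches produce the same validation result. The only number that moved is one the validator never looks at.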

The Slack channel receives 800 messages on Tuesday. The on-call engineer cannot keep up. The incident is filed against the downstream service for "spammy alerting." The root cause takes a week to find because nobody is looking at the producer; the producer's contract says it is healthy.

The bug is not in the model. The bug is in the assumption that a syntactic contract is sufficient to govern a semantic dependency.

Why the Schema Cannot Catch This

A JSON schema is, by design, a description of structure. It says what fields exist, what types they hold, and what enum values are permitted. It does not say anything about the rate at which each enum value should appear, the joint distribution of fields, or the correlations the downstream system has come to depend on.

This is not a flaw in JSON Schema. It is the boundary of what a type system can do. A type system answers: is this value in the allowed set? It does not answer: is the mix of values in the allowed set consistent with what you saw yesterday?
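The second question can be asked in code, just not by the schema. The sketch below is one possible check, not a prescribed one: it compares the observed priority mix against a recorded baseline using total variation distance. The baseline is the 70/25/5 mix from the example, and the 0.10 threshold is an arbitrary assumption you would tune for your own pipeline.

```python
from collections import Counter

# Baseline mix from the nine months before the upgrade (from the example above).
BASELINE = {"low": 0.70, "medium": 0.25, "high": 0.05}

# Hypothetical alert threshold; an assumption, not a recommended value.
TVD_THRESHOLD = 0.10

def observed_mix(priorities: list[str]) -> dict[str, float]:
    """Share of each baseline enum value in a batch of schema-valid outputs."""
    if not priorities:
        raise ValueError("cannot compute a mix from an empty batch")
    counts = Counter(priorities)
    total = sum(counts.values())
    return {value: counts.get(value, 0) / total for value in BASELINE}

def total_variation_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Half the sum of absolute differences between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def mix_has_drifted(priorities: list[str]) -> bool:
    """True when the batch's mix is further from the baseline than the threshold."""
    return total_variation_distance(BASELINE, observed_mix(priorities)) > TVD_THRESHOLD

yesterday = ["low"] * 70 + ["medium"] * 25 + ["high"] * 5
today = ["low"] * 30 + ["medium"] * 35 + ["high"] * 35

print(mix_has_drifted(yesterday))  # False
print(mix_has_drifted(today))      # True
```

Total variation distance is used here only because it is easy to read: 0 means identical mixes, 1 means no overlap. A chi-squared test or a population stability index would serve the same purpose.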

When the output producer is a deterministic piece of code, the question rarely comes up. Deterministic code does not silently shift its output distribution under your feet. A model does. A model's output distribution is a function of its weights, its prompt, its training data, and the inference-time settings. Any of those can change. When they change, the schema does not notice.

The team that treats schema validity as the production contract has signed a contract the producer can keep honoring while changing everything that matters. The provider can ship an upgrade that changes the meaning of every downstream decision, and the contract will not flag it, because the upgrade did not change the contract's referent. The referent was always the shape, never the meaning.

The Three Places the Drift Hides

There are three reliable places a structurally valid output drifts in ways the schema will not catch.

Enum mix shift. The model starts producing a different ratio of allowed enum values. The example above is this case. The downstream system has a business rule keyed on a specific value, and the rule's blast radius is sensitive to how often that value appears.

Field correlation shift. Individual fields look fine in isolation. The joint distribution changes. The old model produced category: billing paired with priority: high 2% of the time. The new model produces that pairing 18% of the time. A downstream queue routing rule that key-partitions by category was sized for the old joint distribution and is now hot on the billing partition.
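A correlation shift can be invisible to per-field monitoring. The sketch below uses synthetic counts, chosen so that the category mix and the priority mix are identical before and after, while the billing/high pairing moves from 2% to 18% as in the example. The 0.05 tolerance and the helper names are assumptions for illustration.

```python
from collections import Counter

def rates(items) -> dict:
    """Relative frequency of each distinct item."""
    counts = Counter(items)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}

def build(pair_counts: dict) -> list:
    """Expand {(category, priority): n} into a list of n pairs each (synthetic data)."""
    return [pair for pair, n in pair_counts.items() for _ in range(n)]

old = build({("billing", "high"): 2, ("billing", "low"): 18,
             ("technical", "high"): 18, ("technical", "low"): 62})
new = build({("billing", "high"): 18, ("billing", "low"): 2,
             ("technical", "high"): 2, ("technical", "low"): 78})

# Per-field views: both marginals are unchanged, so per-field dashboards stay green.
print(rates(c for c, _ in old) == rates(c for c, _ in new))  # True
print(rates(p for _, p in old) == rates(p for _, p in new))  # True

# Joint view: compare the rate of each (category, priority) pair.
PAIR_TOLERANCE = 0.05  # hypothetical tolerance on absolute rate change
old_joint, new_joint = rates(old), rates(new)
shifted = {
    pair: (old_joint.get(pair, 0.0), new_joint.get(pair, 0.0))
    for pair in set(old_joint) | set(new_joint)
    if abs(old_joint.get(pair, 0.0) - new_joint.get(pair, 0.0)) > PAIR_TOLERANCE
}
print(shifted[("billing", "high")])  # (0.02, 0.18)
```

Checking pairs rather than single fields is what surfaces the shift here. The cost is that the number of cells to track grows with the product of the enum sizes, so it is worth limiting the check to the pairings a downstream rule actually depends on.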
