Provider-Side Safety Drift: When Your Product Regresses Without a Deploy
A prompt that worked on Tuesday returns "I can't help with that" on Thursday. The CI eval is green. The model name in your config didn't change. The prompt is byte-identical, hashed and pinned in source control. And yet a customer support thread is forming around the new refusal — the AI team won't see it for two weeks because it has to bubble through tier-one support, get triaged, and finally land on someone who can read the trace.
This is provider-side safety drift, and it is the most underbuilt monitoring gap in production AI today. Frontier providers tune safety filters, refusal thresholds, and content classifiers server-side on a cadence that is not on your release calendar. Your team isn't subscribed to it. There is often no release note. And the regressions are asymmetric in a way that is genuinely hard to detect: refusals creep up for legitimate intents while harmful queries you assumed the provider was filtering quietly start slipping through. The boundary moves on both sides, independently, without warning.
