A team I worked with last quarter had shipped an AI assistant with a tidy stack of supporting documents: an in-product tooltip warning that the AI may produce inaccurate results, a help-center article titled "How does the assistant work?", an internal support runbook for handling escalations, and a public model card listing the underlying model, the tools the assistant could call, and the data domains it covered. The launch went well. Six months later the prompt had been edited fourteen times, the model had been swapped from one tier to another with subtly different refusal behavior, two new tools had been added, one tool had been deprecated but not removed from the prompt, and the language settings had been expanded from English-only to nine locales.

Every single one of those documents was wrong. Not catastrophically wrong, but the kind of wrong where a sentence is half-true, a capability is described in language the model no longer matches, a refusal pattern is documented that the new model never triggers, and the help article names a tool the assistant will no longer call. The kind of wrong that produces a slow drip of confused support tickets, a few customer-trust regressions when the AI does something the docs say it won't, and, because the company sells into a regulated vertical, a small but real compliance gap that nobody on the AI team had thought to track.