The Compliance Audit That Asked Which Model Produced Which Output
The auditor's question sounds simple. She has your appeals log open, points at a row from eight months ago, and asks which model decided that case. Your engineer pulls up the schema: there is a model column, and every decision in the audit window says v1. Then someone from the platform team mentions, almost in passing, that the alias behind v1 rotated four times during the audit period: a base model upgrade, a fine-tune refresh, a vendor-side capacity move, and one rollback that lasted six hours during an incident. The honest answer is that you cannot say which checkpoint produced that decision. The auditor writes something down. "We cannot say" is not an answer a regulator will accept, and you have just learned that the system you shipped has been failing an audit requirement it was never designed to meet.
The gap here is not a missing log line. The gap is between two different ideas of what "model" means. To the engineers shipping the system, v1 is an endpoint — a stable contract callers can point at while the thing behind it gets upgraded for free. To the auditor, "the model that produced this decision" is a specific artifact: a weight checkpoint, a hash, a thing you could in principle re-run on the same input and get a defensibly similar output. Endpoint aliases were invented to hide checkpoint rotation from callers. Audit-grade provenance demands the opposite — that every decision be attributable to exactly the checkpoint that produced it. The two ideas were on a collision course from the start; the audit just happened to be where they met.
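To make the distinction concrete, here is a minimal sketch of a decision record that stores both meanings of "model": the alias the caller asked for and the checkpoint that actually answered. Everything in it is an assumption for illustration, not the schema of the system described above: the names ResolvedModel, DecisionRecord, and record_decision, the field names, and the checkpoint identifiers are all hypothetical, and it assumes the serving layer can report which checkpoint and weight hash sat behind the alias at the moment of the call.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ResolvedModel:
    """What an alias pointed at when the request was served (hypothetical shape)."""
    alias: str            # the stable name callers use, e.g. "v1"
    checkpoint_id: str    # assumed identifier for the specific artifact
    weights_sha256: str   # hash of the weights, so the artifact can be verified later


@dataclass(frozen=True)
class DecisionRecord:
    """One row in the decision log, carrying alias AND checkpoint."""
    case_id: str
    decided_at: str
    model_alias: str
    checkpoint_id: str
    weights_sha256: str
    input_sha256: str
    output: str


def record_decision(
    case_id: str,
    model_input: str,
    output: str,
    resolved: ResolvedModel,
    now: Optional[datetime] = None,
) -> DecisionRecord:
    """Build a log row that pins the decision to the checkpoint, not just the alias."""
    ts = now or datetime.now(timezone.utc)
    return DecisionRecord(
        case_id=case_id,
        decided_at=ts.isoformat(),
        model_alias=resolved.alias,
        checkpoint_id=resolved.checkpoint_id,
        weights_sha256=resolved.weights_sha256,
        # Hash the input so a re-run can be checked against the same bytes.
        input_sha256=hashlib.sha256(model_input.encode("utf-8")).hexdigest(),
        output=output,
    )


# Two requests to the same alias, served on either side of a rotation.
# The checkpoint ids and weight bytes are placeholders, not real artifacts.
before = ResolvedModel("v1", "ckpt-a", hashlib.sha256(b"weights-a").hexdigest())
after = ResolvedModel("v1", "ckpt-b", hashlib.sha256(b"weights-b").hexdigest())

row_1 = record_decision("case-001", "appeal text", "deny", before)
row_2 = record_decision("case-002", "appeal text", "deny", after)

print(json.dumps(asdict(row_1), indent=2))
```

Both rows say v1 in model_alias, which is all the original schema recorded. Only checkpoint_id and weights_sha256 tell the two decisions apart, and those are the columns the auditor's question actually needs.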
