Data-Sensitivity-Tier Model Routing: Governing Which Model Sees Which Data
Your AI system routed a patient query to a self-hosted model at 9 AM. At 11 AM, that model's pod restarted during a deployment. The request queue backed up, the router detected a timeout, and it fell back to the cloud LLM you use for generic queries. The query completed successfully. No alerts fired. Your monitoring dashboard showed green. Somewhere in that exchange, protected health information traveled to a vendor with whom you have no Business Associate Agreement.
That's not a hypothetical. It's the default behavior of nearly every AI routing stack that wasn't explicitly designed to prevent it.
