The Customer-Facing AI Postmortem When Nothing Crashed
Your status page is green. Your error rate is zero. Your uptime dashboard reads 100% for the seventh consecutive month. And yet at 9:14 AM on a Tuesday, your account team is forwarding you a message from a Fortune 500 customer that says, "Our team noticed the assistant has been worse this week. Can you tell us what changed?" Twelve more like it land before lunch. None of them will be answered by the existing incident-comms playbook, because that playbook was built for outages, and nothing has crashed.
This is the customer-facing AI postmortem problem, and it is the single most consistent gap I see across teams shipping LLM features into enterprise contracts. The reliability surface has shifted from "is it up" to "is it as good as it was last week," and almost none of the comms infrastructure has caught up. Status pages don't have a tile for it. Severity rubrics don't grade it. Support macros default to "we identified an issue and resolved it," which reads as either dismissive or alarming depending on the customer's mood that day.
