The Human Review Queue Is Your P0 SLA: When HITL Becomes the Bottleneck
The first incident is rarely an outage. It's a Slack message from someone in customer success: "Hey, are we OK? Five customers in the last hour escalated tickets that have been sitting in 'awaiting review' for over a day." You check the model latency dashboard. Green. You check the agent's success rate. Green. You check the cost-per-call graph. Healthy. Everything you instrumented is fine. The thing that's broken is a queue your monitoring stack doesn't know exists, staffed by people whose calendars your capacity planner doesn't read, governed by an SLA that nobody has ever written down.
That queue is your human-in-the-loop escalation path. You added it three months ago "for safety": the agent would defer to a human reviewer on the small fraction of cases where its confidence was low or the action was high-stakes. At launch it caught maybe a dozen items a day. The ops team handled them between other tasks. It was a backstop, not a system. Today it's processing thousands of items, the median time-to-resolution has tripled, and the customers waiting in line are quietly churning. The HITL path didn't fail. It just never got treated like production.
