The LLM Request Lifecycle Your try/catch Is Missing
The most dangerous failure your LLM stack can produce returns HTTP 200. The JSON parses. Your schema validation passes. No exception is raised. And the response is completely wrong — wrong facts, wrong structure, truncated mid-sentence, or fabricated from whole cloth.
A single try/catch around an LLM API call handles the easy failures: rate limits, server errors, network timeouts. These are the visible failures. The invisible ones — a model that hit its token limit and stopped mid-answer, an agent that looped 21 extra tool calls before finding the right parameter name, a validation retry that inflated your costs by 37% — produce no exceptions. They produce results.
The fix is not better error handling. It is modeling the LLM request lifecycle as an explicit state machine, where every state transition emits an observable span, and failure modes are first-class states rather than buried exception handlers.
