The SSE Keep-Alive Your Reverse Proxy Stripped, And The Prompt You Paid For Twice
Your agent called a tool that took 35 seconds. During those 35 seconds, no tokens flowed from the model back to the browser. The provider's SSE stream was still open. Your tool was still running. The user's spinner was still spinning. And somewhere in the middle, a reverse proxy you do not control decided the connection had been quiet for too long, closed it, and your client's reconnection logic dutifully restarted the entire request from scratch.
The first response was 4,200 prompt tokens and 600 completion tokens. The second response was 4,200 prompt tokens and 600 completion tokens. The user got one answer. Your invoice got two.
