Skip to main content

One post tagged with "blue-green-deployment"

View all tags

The KV Cache Warm-Up Cron That Ran in Blue and Never in Green Because the Host Pinning Never Moved

· 11 min read
Tian Pan
Software Engineer

The incident review reconstructed a deployment from twelve days earlier as the cause of a 3.6× spend increase, and nobody on the call had been in the room when the change shipped. The deployment was routine: blue/green swap, traffic moved to green on schedule, blue decommissioned, the pipeline turned green, the release engineer closed the ticket. None of the production SLOs tripped. None of the application-layer alerts fired. The system ran exactly as designed.

What had been designed was a five-minute cron that pre-warmed the provider's prompt cache against the stable system-prompt prefix every five minutes. The warm-up gave the team a 91% cache hit rate on cold starts and roughly a 4× cost advantage on the first request per session. The cron had been authored a year ago when the blue/green pattern was first introduced, and its host selector was pinned to the blue pool to avoid running the warm-up twice during overlap windows. When green became the live color and blue went away, the cron lost its host and silently transitioned from "running every five minutes" to "running never." The cache hit rate decayed over the next 36 hours as the provider's cache TTL aged out the pre-warmed prefixes. The cost dashboard, averaging per-request cost across a daily window, smoothed the slope until the next billing cycle made it loud.