Your Agent Has Two Release Pipelines, Not One
A team I worked with shipped a "small prompt tweak" on a Wednesday afternoon. The same PR also added one new tool to the agent's registry — a convenience wrapper around an internal admin API that the prompt would now occasionally invoke. The eval suite passed. The canary looked clean. By Thursday morning a customer's billing record had been mutated by an agent acting on a prompt-injected support ticket, the audit trail showed the admin tool firing exactly as designed, and the on-call engineer's first instinct — roll back the prompt — did nothing useful, because the credential had already been used and the row had already been written.
The post-mortem framed it as a security review failure. It wasn't. It was a release-pipeline failure. The team had shipped two completely different asset classes — a behavioral nudge to the model and a new authority granted to the agent — through the same review, the same gate, and the same rollback story, as if they were the same kind of change. They aren't. And once you see them as two pipelines, most "agent governance" debates become much less mysterious.
