The Supervisor Agent That Rubber-Stamped Its Subagent Because They Shared a Prompt Template
A team I talked to last month was proud of a number: their supervisor agent approved 97% of its subagents' plans on first review. They read that as "the subagents are competent." A red-team review six weeks later read it as "the supervisor and the subagents are the same evaluator scoring its own output." Both readings fit the data. Only one of them was load-bearing in production.
The supervisor-reviews-subagent pattern is the most common shape multi-agent systems take in 2026 — somewhere around 70% of production deployments, including most of the reference designs the big labs publish. It looks like a check on paper. A planner decomposes the task, specialist workers produce plans, a supervisor reviews each plan before authorizing execution. Separation of concerns, clean audit trail, the works. The problem is that if you build the supervisor and the subagents from the same base prompt template — even with role-specific addenda differing by a paragraph — you have not built a check. You have built a system whose review step is an artifact of the same model agreeing with itself.
