The Second System Effect in AI: Why Your Agent v2 Rewrite Will Probably Fail
Your agent v1 works. It's ugly, it's held together with prompt duct tape, and the code makes you wince every time you open it. But it handles 90% of cases, your users are happy, and it ships value every day. So naturally, you decide to rewrite it from scratch.
Six months later, the rewrite is still not in production. You've migrated frameworks twice, built a multi-agent orchestration layer for a problem that didn't require one, and your eval suite tests everything except the things that actually break. Meanwhile, v1 is still running — still ugly, still working.
This is the second system effect, and it has been destroying software projects since before most of us were born.
A Pattern as Old as Software Itself
In 1975, Fred Brooks described a phenomenon he'd observed at IBM: when an architect designs their second system, it becomes the most dangerous system they will ever design. The first system is "spare and clean" because the designer is cautious, still learning, and constrained by uncertainty. The second system absorbs every feature that was deferred from the first, every optimization that seemed premature, and every generalization that "would be nice to have." The result is bloated, late, and often worse than what it replaced.
Brooks was describing IBM's transition from relatively simple operating systems for the 700/7000 series to the catastrophically ambitious OS/360. The pattern has repeated across every era of computing since. And it is repeating right now, at scale, across the AI agent ecosystem.
The conditions are almost perfectly set for it. Agent v1 is always a prototype that surprised everyone by working. The team accumulated a long list of compromises they wanted to fix. A new framework promises to solve the problems they hit. And someone — usually the most senior engineer — says the magic words: "We should just rewrite it properly this time."
The Three Over-Engineering Traps
Agent rewrites don't fail because the team is incompetent. They fail because of predictable over-engineering patterns that feel like good engineering decisions in the moment.
Trap 1: Premature Multi-Agent Architecture
The most expensive mistake in the current agent ecosystem is reaching for multi-agent orchestration before you've exhausted what a single agent can do. An analysis of 47 production AI deployments found that 68% would have achieved equivalent or better outcomes with well-architected single-agent systems.
The cost difference is not subtle. For a customer service system processing 2.9 million queries per month, a multi-agent architecture cost considerably more than the $22,700 for a single-agent alternative, with only a 2.1 percentage point accuracy improvement. Token amplification in multi-agent systems runs 4.6x higher, with pure coordination overhead accounting for much of the cost.
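To see how token amplification turns into spend, here is a minimal back-of-the-envelope sketch. It assumes the 4.6x figure can be read as a plain multiplier on tokens per query. The function name, the tokens-per-query value, and the price per million tokens are hypothetical placeholders, not numbers from the analysis; only the query volume and the 4.6x factor come from the text.

```python
def monthly_token_cost(queries_per_month, tokens_per_query,
                       usd_per_million_tokens, amplification=1.0):
    """Rough monthly LLM spend in USD.

    `amplification` scales token usage to account for coordination
    overhead (1.0 means no overhead, i.e. a single agent).
    """
    total_tokens = queries_per_month * tokens_per_query * amplification
    return total_tokens / 1_000_000 * usd_per_million_tokens


QUERIES = 2_900_000        # monthly volume quoted in the text
TOKENS_PER_QUERY = 1_500   # hypothetical placeholder
USD_PER_MILLION = 3.0      # hypothetical placeholder price

single = monthly_token_cost(QUERIES, TOKENS_PER_QUERY, USD_PER_MILLION)
multi = monthly_token_cost(QUERIES, TOKENS_PER_QUERY, USD_PER_MILLION,
                           amplification=4.6)

print(f"single-agent: ${single:,.0f}/month")
print(f"multi-agent:  ${multi:,.0f}/month")
```

Plug in your own traffic and pricing. The point is that the multiplier applies to every query, so the gap grows linearly with volume.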
The debugging penalty is even worse. Mean time to resolution jumps from 18 minutes for single-agent failures to 67 minutes for multi-agent failures — a 3.7x increase. When a single agent fails, you debug one component. When a multi-agent workflow fails, you trace interactions across agents, coordination logic, and shared state.
Teams reach for multi-agent because it feels architecturally sophisticated. But the decision threshold is straightforward: for monthly volumes under 10,000 queries with linear workflows, single-agent systems win 90% of the time. Multi-agent economics only start to materialize above 50,000 queries per month with genuinely parallelizable workflows.
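The thresholds above fit in a few lines of code. This is a sketch, not a rule engine: `Workload` and `suggest_architecture` are hypothetical names, and the default for the middle band (between 10,000 and 50,000 queries, or workflows that are not clearly parallelizable) is an assumption the text does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    monthly_queries: int
    linear_workflow: bool   # steps run one after another
    parallelizable: bool    # independent subtasks that can run concurrently


def suggest_architecture(w: Workload) -> str:
    # Low volume and a linear workflow: the text says single-agent wins.
    if w.monthly_queries < 10_000 and w.linear_workflow:
        return "single-agent"
    # High volume and genuinely parallel work: multi-agent may pay off.
    if w.monthly_queries > 50_000 and w.parallelizable:
        return "consider multi-agent"
    # Assumption: everything in between defaults to the simpler option.
    return "single-agent until measurements say otherwise"


print(suggest_architecture(Workload(8_000, True, False)))
print(suggest_architecture(Workload(120_000, False, True)))
```

Writing the rule down forces the team to state its actual volume and workflow shape before the architecture debate starts.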
Trap 2: Framework Lock-In Through Migration
The agent framework landscape is a graveyard of expensive migrations. Teams regularly spend three to six months building on one framework, hit its limitations, and face a 50–80% rewrite to migrate to another. One documented case involved a three-week rewrite in which 60% of the codebase changed and two production bugs were introduced.
The second-system instinct makes this worse. When you rewrite, you don't just port your logic — you "fix" everything. You adopt the new framework's idioms completely. You restructure your tool definitions, your state management, your error handling. Each of these changes is individually reasonable, but collectively they mean you're testing an entirely new system while expecting the reliability of the old one.
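One way to make that gap visible is to replay recorded v1 traffic through the rewrite and list every divergence. The sketch below is one possible approach, not something the cases above describe. `replay_against_v1`, the recorded pairs, and the toy `v2_agent` are all hypothetical. Real agent output is rarely byte-identical, so in practice the `same` comparator would be something looser than exact equality.

```python
from typing import Callable, Iterable, List, Tuple


def replay_against_v1(
    recorded: Iterable[Tuple[str, str]],
    v2_agent: Callable[[str], str],
    same: Callable[[str, str], bool] = lambda a, b: a == b,
) -> List[Tuple[str, str, str]]:
    """Run recorded v1 inputs through v2 and collect the mismatches.

    `recorded` holds (user_input, v1_output) pairs captured from production.
    Returns (user_input, v1_output, v2_output) for each divergence.
    """
    mismatches = []
    for user_input, v1_output in recorded:
        v2_output = v2_agent(user_input)
        if not same(v1_output, v2_output):
            mismatches.append((user_input, v1_output, v2_output))
    return mismatches


# Toy stand-ins for illustration only.
recorded_cases = [
    ("reset password", "send_reset_link"),
    ("cancel order", "open_cancellation_flow"),
]


def v2_agent(user_input: str) -> str:
    # Hypothetical rewrite that silently changed one behavior.
    return "send_reset_link" if "password" in user_input else "escalate_to_human"


for case in replay_against_v1(recorded_cases, v2_agent):
    print(case)
```

Every entry in the result is a behavior change that the rewrite introduced, whether or not anyone intended it.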
The counterintuitive advice from practitioners who've survived multiple migrations: start with high-ceiling frameworks even if they feel like overkill. Growing into a framework is far easier than migrating out. The learning curve for LangGraph is 40–60 hours; for CrewAI, 20–30 hours. But the year-one framework overhead for an eight-engineer team doing a migration runs approximately $11,600 — and that's just the learning cost, not the bugs.
Trap 3: Eval-Driven Development Before You Understand the Task
References
- https://en.wikipedia.org/wiki/Second-system_effect
- https://iterathon.tech/blog/multi-agent-orchestration-economics-single-vs-multi-2026
- https://theaiengineer.substack.com/p/why-ai-agents-keep-failing-in-production
- https://medium.com/@hieutrantrung.it/the-ai-agent-framework-landscape-in-2025-what-changed-and-what-matters-3cd9b07ef2c3
- https://mbrenndoerfer.com/writing/multi-agent-systems-benefits-challenges-when-to-use-multiple-agents
- https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents
