When Everyone Has an AI Coding Agent: The Team Dynamics Nobody Warned You About
A team of twelve engineers adopts AI coding tools enthusiastically. Six months later, each engineer is merging nearly twice as many pull requests. The engineering manager celebrates. Then the on-call rotation starts paging. Debugging sessions last twice as long. Nobody can explain why a particular module was structured the way it was. The engineer who wrote it replies honestly: "I don't know — the AI generated most of it and it seemed fine."
This scenario is playing out at companies everywhere. The individual productivity story is real: developers finish tasks faster, write more tests, and clear backlogs more efficiently. The team-level story is more complicated, and most organizations aren't ready for it.
The Productivity Paradox at Code Review
The first place team-scale AI adoption breaks down is the most visible: code review.
When developers generate code faster, they also open pull requests faster. Research on large Copilot deployments found that teams with high AI adoption completed 21% more tasks and merged 98% more PRs, while PR review time increased 91%. The bottleneck isn't the coder anymore. It's everyone else in the pipeline.
This makes intuitive sense. Reviewers don't get AI assistance that scales proportionally to the volume they're asked to review. They're still reading line by line, checking for correctness, understanding intent, and evaluating architectural fit. Code production has accelerated while review capacity has stayed flat, and most teams haven't changed their review culture to compensate.
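A back-of-the-envelope calculation makes the squeeze concrete. The baseline figures below are invented for illustration; only the percentage changes come from the research cited above.

```python
# Reviewer-load arithmetic using the cited statistics. The baseline
# numbers (10 PRs/week, 40 review-minutes per PR) are hypothetical.

baseline_prs = 10      # PRs merged per week before AI adoption (hypothetical)
minutes_per_pr = 40    # human review minutes per PR (hypothetical)

before = baseline_prs * minutes_per_pr    # 400 reviewer-minutes/week
after_prs = baseline_prs * 1.98           # 98% more PRs merged
after = after_prs * minutes_per_pr        # if per-PR review effort stays flat

print(f"before: {before} reviewer-min/week")
print(f"after:  {after:.0f} reviewer-min/week ({after / before - 1:.0%} increase)")

# The cited 91% rise in review time tracks the 98% rise in PR volume
# closely, which suggests per-PR review effort barely fell: the extra
# code is being paid for almost entirely in reviewer hours.
```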
The result is one of two failure modes. Either reviewers rubber-stamp PRs to keep up with volume — which is how low-quality AI-generated code enters the codebase unchallenged — or they slow everything down trying to review properly, which causes developer frustration and the perception that review is a bottleneck to route around.
What works instead: stop treating AI-generated PRs like handwritten PRs. Automated tools should handle the routine checks — syntax, security patterns, duplicate code detection — so human reviewers focus exclusively on architectural fit, business logic correctness, and intent alignment. The review checklist changes. "Does this do what it says?" becomes less important. "Does this belong in the codebase in this form?" becomes more important.
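Here is what that can look like in practice: a pre-review gate in CI that runs the mechanical checks before a human is ever assigned. This is a minimal sketch assuming a Python codebase; the specific tools (ruff for lint, bandit for security patterns, jscpd for duplicate detection) and the `src` path are illustrative stand-ins for whatever your stack already uses.

```python
#!/usr/bin/env python3
"""Pre-review gate: run the routine checks automatically so human
reviewers can spend their time on architecture, business logic, and
intent. A sketch; swap in your own tools and paths."""
import subprocess
import sys

# Each entry: (label, command). All three example tools exit non-zero
# when they find problems.
CHECKS = [
    ("lint / syntax", ["ruff", "check", "."]),
    ("security patterns", ["bandit", "-r", "src", "-q"]),
    ("duplicate code", ["jscpd", "--threshold", "5", "src"]),
]

def main() -> int:
    failures = []
    for label, cmd in CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((label, result.stdout.strip()))
    for label, output in failures:
        print(f"ROUTINE CHECK FAILED: {label}\n{output}\n")
    if failures:
        print("Fix the above before requesting human review.")
        return 1
    print("Routine checks passed; human review can focus on design and intent.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Wire a gate like this into CI as a required status check so a PR can't even request human review until the mechanical layer is clean.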
How Knowledge Silos Form Faster
Before widespread AI tool adoption, knowledge silos accumulated gradually. A developer would work on a module for months, become the unofficial expert, and others would learn from them over time. This was slow, but it created distributed understanding.
AI tools accelerate code production without accelerating the transfer of understanding. A junior developer can now generate a working authentication module in an hour, have it reviewed by a senior who confirms it looks correct, and merge it — without either of them deeply understanding why it was structured that way. The code works. The knowledge silo formed instantly.
Research measuring code comprehension bears this out. Junior developers using AI assistance show significantly lower understanding of the code they ship compared to code they wrote unaided. The comprehension gap is measurable: juniors who wrote code themselves scored 17 points higher on understanding tests than those who generated it with AI. The code they shipped was equivalent in function. What they internalized was not.
This matters most when things break. Understanding code that you genuinely wrote is qualitatively different from understanding code that was generated and looked correct at review time. When a production incident hits at 2 AM, you need engineers who actually understand the system, not engineers who can describe what the system is supposed to do based on the code's surface appearance.
The protocol shift here is deliberate knowledge attribution. When a PR contains significant AI-generated sections, the PR description should explain the intent and tradeoffs chosen — not just what the code does. This forces the author to develop enough understanding to articulate the rationale. It also gives reviewers a signal when understanding is thin: if the author can't explain why the module is structured as it is, that's a flag.
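One lightweight way to enforce the protocol is a CI check that rejects PRs whose descriptions skip the rationale. A minimal sketch, assuming GitHub Actions (which exposes the PR event payload via the GITHUB_EVENT_PATH environment variable); the section names are one possible template, not a standard.

```python
#!/usr/bin/env python3
"""Fail CI when a PR description lacks the knowledge-attribution
sections. Assumes GitHub Actions; the required section names below
are a hypothetical template."""
import json
import os
import sys

REQUIRED_SECTIONS = ["## Intent", "## Tradeoffs", "## AI-generated portions"]

def main() -> int:
    # GitHub Actions writes the triggering event (including the PR body)
    # to the file named by GITHUB_EVENT_PATH.
    with open(os.environ["GITHUB_EVENT_PATH"]) as f:
        event = json.load(f)
    body = event.get("pull_request", {}).get("body") or ""
    missing = [s for s in REQUIRED_SECTIONS if s.lower() not in body.lower()]
    if missing:
        print("PR description is missing required sections:")
        for section in missing:
            print(f"  {section}")
        print("Explain why the code is structured this way, not just what it does.")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The check is deliberately dumb: it can't tell a thoughtful rationale from a perfunctory one. Its job is to make skipping the rationale impossible, and to leave the quality judgment to the reviewer.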
Code Review Culture Has Already Broken Down
There's a subtler problem than volume: the norms around what code review is for have quietly shifted.
Historically, code review served multiple functions simultaneously. It caught bugs. It ensured style consistency. It spread knowledge across the team. And it was a mechanism for senior engineers to mentor junior engineers — the review comments were part of how juniors learned to write better code.
AI tools disrupt all four functions at once. Style enforcement gets delegated to formatters and linters. Bug detection gets offloaded to AI review tools. Knowledge spreading breaks down because the code is generated, not reasoned through. And mentorship — the function hardest to replace — evaporates because the junior engineer didn't struggle through the problem. They prompted their way past it.
The most experienced engineers on your team are increasingly spending code review time doing archaeology. They're looking at AI-generated code that is locally coherent but globally confused — code that uses correct syntax and passes tests but makes architectural choices nobody would have made deliberately. GitClear's 2024 analysis found an 8x increase in duplicated code blocks in AI-heavy codebases and a drop in traditional refactoring activity from 25% to under 10% of developer activity. That's not because the codebase needed less refactoring. It's because nobody was building the understanding necessary to identify what needed to be refactored.
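Teams that want to watch the duplication trend in their own repository can get a first signal without dedicated tooling. Below is a deliberately naive sketch that counts repeated six-line windows across a source tree; real analyses normalize far more aggressively, so treat the output as directional, not as a GitClear-equivalent metric.

```python
#!/usr/bin/env python3
"""Crude duplicated-block counter: hash every normalized 6-line window
in a source tree and report windows that appear more than once.
A directional signal only; dedicated tools normalize much harder."""
from collections import defaultdict
from pathlib import Path

WINDOW = 6  # lines per block; an arbitrary but common-sized window

def normalized_lines(path: Path) -> list[str]:
    # Strip whitespace and drop blank lines so pure formatting
    # differences don't hide copy-pasted logic.
    return [ln.strip() for ln in path.read_text(errors="ignore").splitlines()
            if ln.strip()]

def duplicate_windows(root: str, suffix: str = ".py") -> dict[str, list[tuple]]:
    seen = defaultdict(list)
    for path in Path(root).rglob(f"*{suffix}"):
        lines = normalized_lines(path)
        for i in range(len(lines) - WINDOW + 1):
            key = "\n".join(lines[i : i + WINDOW])
            seen[key].append((str(path), i + 1))
    return {k: v for k, v in seen.items() if len(v) > 1}

if __name__ == "__main__":
    dupes = duplicate_windows("src")  # "src" is a placeholder path
    print(f"{len(dupes)} duplicated {WINDOW}-line blocks found")
    for block, sites in sorted(dupes.items(), key=lambda kv: -len(kv[1]))[:5]:
        print(f"{len(sites)} occurrences, e.g. {sites[0][0]}:{sites[0][1]}")
```

Run it weekly and chart the count. The absolute number matters less than the slope: a codebase where duplicated blocks compound month over month is one where nobody is refactoring.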
Sources
- https://www.index.dev/blog/ai-pair-programming-statistics
- https://www.gitclear.com/ai_assistant_code_quality_2025_research
- https://arxiv.org/html/2603.28592v1
- https://stackoverflow.blog/2024/04/03/developers-with-ai-assistants-need-to-follow-the-pair-programming-model/
- https://www.softwareseni.com/why-ai-coding-speed-gains-disappear-in-code-reviews/
- https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/
- https://www.faros.ai/blog/enterprise-ai-coding-assistant-adoption-scaling-guide
- https://www.qodo.ai/reports/state-of-ai-code-quality/
- https://arxiv.org/html/2509.20353v2
- https://martinfowler.com/articles/reduce-friction-ai/
