We just hit a paradox that’s reshaping engineering organizations in 2026, and the data is both fascinating and uncomfortable.
The Numbers Don’t Add Up
Teams with high AI adoption complete 21% more tasks and merge 98% more pull requests. That sounds like the productivity revolution we were promised, right?
But here’s the catch: PR review time increased 91%. That’s not a typo. We’re merging nearly twice as many pull requests, and reviewing them takes almost twice as long.
The Bottleneck Shifted
Senior engineers are now spending 19% more of their time on code review, and the PRs themselves are 18% larger due to AI-generated code volume. Human approval capacity has become the constraining factor limiting the productivity gains AI coding tools promised.
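To see how quickly that constraint binds, here’s a back-of-envelope calculation using the figures above. It assumes the 91% increase applies per PR and reads the 19% as the growth in available senior review hours; both are simplifications of the published stats.

```python
# Back-of-envelope: review demand vs. review capacity, using the stats
# quoted above. Assumption: the 91% increase is per-PR review time and
# the 19% figure is the growth in available senior review hours.

pr_volume_growth = 1.98        # 98% more pull requests merged
per_pr_review_growth = 1.91    # 91% longer review time, read as per-PR

# Total review demand scales with both PR count and per-PR review time.
review_demand = pr_volume_growth * per_pr_review_growth  # ~3.8x baseline

review_capacity = 1.19         # seniors spending 19% more time on review

print(f"Review demand:   {review_demand:.2f}x baseline")
print(f"Review capacity: {review_capacity:.2f}x baseline")
print(f"Demand outpaces capacity by {review_demand / review_capacity:.1f}x")
```

Even if you read the 91% as a total rather than a per-PR figure, the direction is the point: demand compounds multiplicatively while capacity creeps up by a fraction.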
At Microsoft, where I spent several years before my current CTO role, we used to worry about keeping developers productive. Now? The concern has flipped. Review capacity, not coding speed, defines engineering velocity.
The Quality Question
Here’s why this is more than a process hiccup: 61% of developers report that AI produces code that looks correct but is unreliable. This isn’t just a volume problem; the verification burden is shifting onto senior engineers.
Amazon recently mandated senior approval for all AI-assisted code after experiencing outages traced to AI-generated logic errors. They’re not alone. Research from CodeRabbit shows pull requests containing AI-generated code had roughly 1.7× more issues than human-written code, with 15-18% more security vulnerabilities.
So What Are Seniors Actually Doing?
This is where the framing matters. Are senior engineers:
A bottleneck limiting the productivity gains AI promised? If we could just speed up reviews, we’d unlock massive velocity improvements.
Or quality gatekeepers preventing costly production incidents and architectural mistakes that would compound over time?
I lean toward the latter. Senior engineers aren’t just checking syntax; they’re validating system design, catching subtle logic errors, and preventing technical debt from accumulating. That work is absolutely necessary.
But it raises uncomfortable questions:
- If review is now the highest-leverage activity, are we recognizing and compensating it appropriately?
- Should we be hiring specifically for review capacity, not just coding capacity?
- Do we need entirely new review processes designed for the AI era?
The Organizational Impact
Here’s the part that keeps me up at night: Any correlation between AI adoption and organizational-level performance metrics has evaporated. The gains at the individual level don’t translate upward.
We’re optimizing for code generation speed without considering the entire delivery pipeline. It’s like buying faster assembly line equipment without checking if your quality inspection capacity can keep up.
CFOs are noticing. Enterprises are deferring 25% of planned AI investments to 2027 amid demands for tangible ROI. This review bottleneck is exactly why: the promised productivity gains aren’t materializing at the business level.
What Should We Do?
I don’t have all the answers, but here’s what I’m experimenting with at my company:
- Tiered review processes: Not everything needs senior architect review. Clear escalation paths based on risk and scope (a rough sketch follows this list).
- AI-assisted review tools: If AI creates the problem, can it help solve it? Tools like Anthropic’s Code Review for Claude Code (shipped March 9) run automated reviews before a human ever opens the PR.
- Review capacity planning: Treating review time as a constrained resource in sprint planning, not an afterthought (capacity sketch below).
- Measuring prevented incidents: Tracking near-misses and bugs caught in review to quantify the value of thorough review (tally sketch below).
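To make the first item concrete, here’s a minimal sketch of risk-based routing. Every threshold, signal, and tier name is a placeholder to tune against your own incident history, not a recommendation:

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    lines_changed: int
    touches_sensitive_area: bool   # e.g. auth, payments (hypothetical signal)
    ai_generated_fraction: float   # 0.0-1.0, from tooling or self-report

def review_tier(pr: PullRequest) -> str:
    """Route a PR to a review tier. All thresholds are illustrative."""
    if pr.touches_sensitive_area:
        return "senior-architect"   # high-risk areas always escalate
    if pr.ai_generated_fraction > 0.5 or pr.lines_changed > 400:
        return "senior-engineer"    # large or mostly AI-written changes
    return "peer-review"            # small, low-risk: any teammate approves

# A 600-line, mostly AI-generated change outside sensitive areas
# still gets a senior reviewer:
print(review_tier(PullRequest(600, False, 0.8)))  # -> senior-engineer
```

The design choice that matters is that the escalation policy is explicit and versioned, so you can tighten or loosen it as you learn which changes actually burn you.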
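For the capacity-planning item, the same idea applies at the sprint level. A toy check, with every number a placeholder for your own historicals:

```python
# Sprint review-capacity check. Every number is a placeholder to
# replace with your own historical data.

reviewers = 4                  # seniors available for review this sprint
review_hours_each = 8          # review hours budgeted per reviewer
capacity_hours = reviewers * review_hours_each          # 32 hours

expected_prs = 45              # forecast from recent sprint throughput
avg_review_hours_per_pr = 0.9  # historical average, inflated for larger PRs
demand_hours = expected_prs * avg_review_hours_per_pr   # 40.5 hours

if demand_hours > capacity_hours:
    print(f"Over capacity by {demand_hours - capacity_hours:.1f} review "
          "hours: cut scope or add reviewers before committing the sprint.")
else:
    print(f"{capacity_hours - demand_hours:.1f} review hours of headroom.")
```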
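And for measuring prevented incidents, even a crude tally beats nothing. A sketch, with made-up severity weights standing in for whatever your incident data supports:

```python
from collections import Counter

# One entry per bug caught during review; severity labels are
# hypothetical, use whatever your tracker records.
caught_in_review = ["high", "low", "medium", "high", "low"]

# Rough cost each bug would have incurred if shipped, in engineer-hours
# (placeholder weights; calibrate against your own incident history).
cost_if_shipped = {"low": 2, "medium": 8, "high": 40}

tally = Counter(caught_in_review)
prevented_hours = sum(cost_if_shipped[sev] * n for sev, n in tally.items())

print(f"Bugs caught in review: {sum(tally.values())}")
print(f"Estimated incident work prevented: {prevented_hours} engineer-hours")
```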
But I want to hear from other leaders: How are you handling this in your organizations? Are we measuring the wrong things? Should we restructure teams around review capacity? Or is this just a temporary growing pain as processes catch up to AI capabilities?
Sources: Faros AI Research, byteiota Analysis, CodeRabbit Study, Opsera 2026 Benchmark