I need to talk about something that’s been bothering me for the past 6 months—and after digging into the data this weekend, I’m convinced we have a massive leadership blind spot that nobody’s addressing.
The Numbers Don’t Make Sense
Our team’s throughput metrics look incredible:
- Pull requests up 98% compared to Q4 2025 (before AI coding assistants rolled out)
- Code commits up 59% across the org
- Individual developer velocity up 40-60% based on self-reported surveys
Sounds amazing, right? Except when I look at what actually matters:
- Deployment frequency: basically flat (only up 8% despite all that extra code)
- Lead time for changes: up 12% (it’s taking longer to ship features)
- Customer-facing releases: down 3% quarter over quarter
We’re generating mountains of code but shipping less value to customers. How is that even possible?
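To put a rough number on that gap, here is a quick back-of-the-envelope sketch. It assumes the PR and deployment figures above are measured against the same baseline period, and the function name is just something made up for illustration.

```python
def relative_change_in_ratio(numerator_growth_pct: float,
                             denominator_growth_pct: float) -> float:
    """Percent change in (numerator / denominator), given each side's percent growth."""
    new_ratio = (1 + numerator_growth_pct / 100) / (1 + denominator_growth_pct / 100)
    return (new_ratio - 1) * 100


# Deployments up 8%, pull requests up 98% (figures from the lists above).
deploys_per_pr_change = relative_change_in_ratio(8, 98)
print(f"Deployments per PR: {deploys_per_pr_change:+.1f}%")  # Deployments per PR: -45.5%
```

If the baselines line up, each PR is now roughly half as likely to correspond to a deployment as it was before.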
What I Think Is Happening
After reviewing 3 months of engineering data and talking to 15+ engineers across teams, here’s my theory:
We’re optimizing for activity, not outcomes.
The AI coding assistants (GitHub Copilot, Cursor, etc.) are making it incredibly easy to write code. Junior engineers who used to spend 6-8 weeks ramping up are now productive in 3-4 weeks. Mid-level engineers are cranking out features at senior-level speed.
But all that code has to be reviewed. And our review processes haven’t scaled with the code volume.
According to recent industry data, PR review times are up 91% when AI is heavily involved. AI-generated PRs wait 4.6x longer to be picked up for review, and only 32.7% of them are accepted, versus 84.4% of human-written PRs.
So we’ve just moved the bottleneck from writing to reviewing.
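A toy model of why this hurts: if PRs arrive faster than reviewers can clear them, the queue grows every week. The sketch below is only an illustration. The baseline of 100 PRs/week and the review capacity of 110 PRs/week are hypothetical numbers; only the 98% increase comes from our data.

```python
def review_backlog(arrivals_per_week: float,
                   capacity_per_week: float,
                   weeks: int,
                   start: float = 0.0) -> list[float]:
    """Open-PR backlog at the end of each week under constant arrival and review rates."""
    backlog = start
    history = []
    for _ in range(weeks):
        # Backlog grows by the shortfall, and cannot drop below zero.
        backlog = max(0.0, backlog + arrivals_per_week - capacity_per_week)
        history.append(backlog)
    return history


baseline_prs = 100             # hypothetical pre-AI volume
review_capacity = 110          # hypothetical: reviewers kept up before
after_ai = baseline_prs * 1.98  # PRs up 98%

print(review_backlog(baseline_prs, review_capacity, weeks=4))
print(review_backlog(after_ai, review_capacity, weeks=4))
```

With these made-up inputs the queue stays empty before the rollout and grows by 88 PRs every week after it, which is the kind of pattern that shows up as longer review pickup times.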
The Uncomfortable Questions
This raises some questions I don’t have answers to yet:
1. Are we measuring the wrong things?
Our dashboards track lines of code, commit velocity, and PR throughput. But these are activity metrics. They measure effort, not impact.
Should we be tracking:
- Features shipped to production?
- Customer value delivered per sprint?
- Time from idea → customer impact?
- Main branch success rate? (Industry benchmark is 90%, we’re at 72%)
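For what it's worth, two of these are cheap to compute once you have deploy and CI records. This is a minimal sketch, not a finished dashboard: the `Change` record and its field names are hypothetical, the sample data is made up, and I'm assuming "main branch success rate" means the share of CI runs on main that pass.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Change:
    """One shipped change. Field names are hypothetical, not from any real tool."""
    first_commit_at: datetime
    deployed_at: datetime


def median_lead_time_hours(changes: list[Change]) -> float:
    """Median hours from first commit to production deploy."""
    hours = [(c.deployed_at - c.first_commit_at).total_seconds() / 3600 for c in changes]
    return median(hours)


def main_branch_success_rate(run_passed: list[bool]) -> float:
    """Percent of CI runs on main that passed (assumed definition)."""
    if not run_passed:
        raise ValueError("no CI runs to score")
    return 100 * sum(run_passed) / len(run_passed)


# Made-up sample data for illustration only.
changes = [
    Change(datetime(2026, 3, 2, 9), datetime(2026, 3, 3, 9)),  # 24 hours
    Change(datetime(2026, 3, 2, 9), datetime(2026, 3, 4, 9)),  # 48 hours
    Change(datetime(2026, 3, 2, 9), datetime(2026, 3, 5, 9)),  # 72 hours
]
runs = [True] * 18 + [False] * 7  # 18 of 25 runs green

print(median_lead_time_hours(changes))  # 48.0
print(main_branch_success_rate(runs))   # 72.0
```

Neither number says anything about customer value on its own, but both measure what reaches production rather than how much code was written.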
2. Are we creating productivity theater?
When engineers know they’re measured on PRs merged and commits pushed, they start gaming the system:
- Splitting features into micro-PRs to hit velocity targets
- Shipping incomplete features just to close tickets
- Generating code because the AI makes it easy, not because it’s necessary
This is Goodhart’s Law in action: “When a measure becomes a target, it ceases to be a good measure.”
3. What happens when the review bottleneck breaks?
Right now, senior engineers are doing 4-6 hours/week of additional code review to handle the AI-generated volume. They’re burning out.
If we don’t fix this, one of three things happens:
- Senior engineers start rubber-stamping reviews (quality drops)
- Senior engineers quit (brain drain)
- We hire more reviewers (expensive, doesn’t scale)
None of these are good outcomes.
What Actually Matters in 2026?
I keep coming back to this article: “More Code, Fewer Releases: The Engineering Leadership Blind Spot of 2026”. The core insight is that most engineering leaders haven’t updated their dashboards to reflect where bottlenecks have actually shifted.
AI hasn’t made us more productive—it’s just moved the constraint.
The real question is: are we building the right things, and are we building them well?
Not: “How many lines of code did we write this week?”
Looking for Perspectives
I’m curious how other engineering leaders are thinking about this:
- What metrics are you actually tracking to measure engineering effectiveness in the AI era?
- How are you handling the review bottleneck when AI is generating 40-60% of your team’s code?
- Have you seen deployment frequency decouple from code volume like we have?
- What does “productivity” even mean when AI can write code faster than we can validate it’s correct?
I don’t have this figured out yet. But I’m pretty sure optimizing for code commits in 2026 is like optimizing for email volume in 2015—you’re measuring activity, not accomplishment.
Would love to hear how others are navigating this.