Eight months ago, we rolled out AI coding assistants across our EdTech engineering team. Adoption was immediate, and developers loved them. Within weeks, our activity metrics hit levels I’d never seen: commits up 40%, pull requests up 65%, story points completed up 35%.
I thought we’d found the holy grail of engineering productivity.
But here’s what’s kept me up at night: our feature delivery velocity is no higher than it was before AI.
We’re generating more code than ever. Our developers genuinely feel more productive. But we’re shipping features at the same rate we did eight months ago. In some sprints, we’re actually shipping less because of the chaos that comes with all this new code.
The Data That Confirms I’m Not Crazy
I started digging into this and found I’m not alone. CircleCI’s 2026 State of Software Delivery report shows a 59% increase in average engineering throughput with AI tools. That tracks with what we’re seeing.
But here’s the kicker: their data also shows that feature branch throughput went up 15% while main branch throughput went DOWN 7%.
Teams are moving faster on their branches, but the code is getting stuck somewhere before it reaches production.
Where I Think the Productivity Is Vanishing
After months of observation and painful retrospectives, here’s my hypothesis: the bottleneck shifted from coding to everything that happens after coding.
Specific patterns I’m seeing on my team:
1. Review Queue Explosion
Our PR queue has tripled in size. PRs are not just more numerous—they’re also larger and more complex. Senior engineers who used to spend 30% of their time reviewing now spend 60%. They’re exhausted, and despite their best efforts, things are slipping through.
2. QA Team Overwhelmed
Our QA team’s capacity didn’t magically scale with the code output. They’re drowning. Features are “done” from an engineering perspective but sitting in a QA backlog for days.
3. Integration Chaos
More parallel development means more merge conflicts, more CI/CD queue time, more deployment coordination. Our main branch integration process wasn’t designed for this volume.
4. More Rollbacks
Because reviews are rushed and testing is overloaded, issues we used to catch before release are now reaching production. Our rollback rate is up 40%.
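To check these patterns against data rather than gut feel, one option is to split each change's lead time by stage and see where the hours pile up. The sketch below is a minimal illustration, not our actual tooling: the stage names, the record format, and every timestamp are hypothetical, and a real version would pull these events from your Git host and deploy logs.

```python
from datetime import datetime
from statistics import median

# Hypothetical stage events; rename to match whatever your tooling records.
STAGES = ["opened", "approved", "qa_passed", "deployed"]

# Made-up example records, one per shipped change.
changes = [
    {"opened": "2026-03-02T09:00", "approved": "2026-03-03T09:00",
     "qa_passed": "2026-03-06T09:00", "deployed": "2026-03-06T13:00"},
    {"opened": "2026-03-03T10:00", "approved": "2026-03-05T10:00",
     "qa_passed": "2026-03-07T10:00", "deployed": "2026-03-07T12:00"},
    {"opened": "2026-03-04T08:00", "approved": "2026-03-05T20:00",
     "qa_passed": "2026-03-09T20:00", "deployed": "2026-03-10T02:00"},
]


def stage_hours(change):
    """Hours spent between each pair of consecutive stage events."""
    times = [datetime.fromisoformat(change[stage]) for stage in STAGES]
    return {
        f"{STAGES[i]}->{STAGES[i + 1]}":
            (times[i + 1] - times[i]).total_seconds() / 3600
        for i in range(len(STAGES) - 1)
    }


def median_stage_hours(records):
    """Median hours per stage transition across all records."""
    per_change = [stage_hours(record) for record in records]
    return {key: median(pc[key] for pc in per_change) for key in per_change[0]}


for transition, hours in median_stage_hours(changes).items():
    print(f"{transition}: {hours:.0f}h")
```

With these invented numbers, the wait between approval and QA sign-off dominates. The useful part is the comparison between stages, not the absolute values.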
The Uncomfortable Realization
I realized we’ve been measuring AI productivity at the input (how fast developers code) instead of the output (how fast we deliver value to users).
Waydev’s research calls this the “engineering leadership blind spot of 2026”—activity goes up, but business outcomes lag. We’re optimizing the wrong part of the system.
It’s like we upgraded the engine on a car but left the transmission, brakes, and steering wheel unchanged. The engine roars, but the car isn’t going any faster because the other systems can’t keep up.
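The same idea as a toy calculation: in a serial pipeline, sustained delivery is capped by the slowest stage, so speeding up a stage that is not the constraint mostly grows the queue in front of the next one. The capacities below are invented for illustration (changes per week), not measurements from my team.

```python
def delivery_rate(stage_capacity):
    """Sustained throughput of a serial pipeline: the slowest stage's capacity."""
    return min(stage_capacity.values())


def bottleneck(stage_capacity):
    """Name of the stage with the lowest capacity."""
    return min(stage_capacity, key=stage_capacity.get)


# Hypothetical capacities in changes per week.
before = {"coding": 20, "review": 24, "qa": 21, "integration": 26}
after = dict(before, coding=32)  # only coding gets faster (+60%)

print(delivery_rate(before), bottleneck(before))  # 20 coding
print(delivery_rate(after), bottleneck(after))    # 21 qa

# Work produced beyond what downstream can absorb accumulates as backlog.
weekly_backlog_growth = after["coding"] - delivery_rate(after)
print(weekly_backlog_growth)  # 11
```

In this made-up scenario, a 60% jump in coding capacity buys a 5% gain in delivery, and eleven changes a week pile up in queues. That is roughly the shape of what I described above.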
What I’m Trying Now
We’re experimenting with:
- Dedicated review capacity: Rotating senior engineers into full-time review weeks
- Stricter PR size limits: AI makes it easy to write 500-line PRs, but they’re impossible to review well
- QA automation investment: Using AI tools to generate test cases, not just implementation code
- Process redesign: Questioning every handoff that was designed for the lower throughput we had before AI
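Of these, the PR size limit is the easiest to automate. Here is a rough sketch of a check that sums the changed lines from `git diff --numstat` output, which prints added lines, deleted lines, and the path separated by tabs, with `-` in both count columns for binary files. The function names and the 400-line threshold are assumptions for illustration; pick a limit that fits your team.

```python
def changed_lines(numstat_text):
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        total += int(added) + int(deleted)
    return total


def pr_within_limit(numstat_text, max_lines=400):
    """True if the diff is small enough to review well (threshold is hypothetical)."""
    return changed_lines(numstat_text) <= max_lines


# Example numstat output with invented paths.
sample = (
    "120\t30\tapp/models.py\n"
    "-\t-\tassets/logo.png\n"
    "310\t45\tapp/views.py\n"
)

print(changed_lines(sample))    # 505
print(pr_within_limit(sample))  # False
```

In CI, the input could come from something like `git diff --numstat main...HEAD`, assuming `main` is the base branch. A failing check is a prompt to split the PR, not a hard rule.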
But I’ll be honest—I don’t know if these will work. The pressure to “move fast” is immense, especially when competitors are adopting the same tools.
Questions for This Community
For those of you managing engineering teams in the AI era:
- Are you seeing the same pattern? Increased activity but flat delivery velocity?
- Where is your bottleneck? Review? QA? Integration? Something else?
- What metrics are you tracking? I’m realizing commits and PRs are vanity metrics—what actually matters?
- How are you adapting your processes? What’s worked? What hasn’t?
I keep asking myself: are we optimizing the wrong part of the system? Should engineering leaders be investing in review infrastructure, QA automation, and integration tooling instead of just coding tools?
Would love to hear how other teams are navigating this.