I’ve been tracking our engineering metrics closely since we rolled out AI coding assistants last quarter, and something isn’t adding up.
The individual-level data looks phenomenal. Our developers report saving an average of 3.6 hours per week. Our most active AI users are merging 60% more pull requests than they did six months ago. When I sit in sprint reviews, I hear stories about features that used to take three days getting knocked out in an afternoon.
But when I zoom out to the quarterly roadmap review? We’re shipping roughly the same number of features we did a year ago. Our time-to-market hasn’t improved. Customer-facing velocity is… unchanged.
The Productivity Paradox in Numbers
This isn’t just my team. The research is showing the same pattern across the industry:
- Individual gains are real: Developers complete 21% more tasks and merge 98% more PRs with AI tools
- But organizational throughput stalls: Review times increase 91%, average PR size balloons 150%
- Quality issues emerge: Bug counts up 9%, security vulnerabilities more common in AI-assisted code
- Perception vs reality: Developers feel 24% faster, but controlled studies show they’re actually 19% slower
That’s a 43-percentage-point gap (24% faster perceived versus 19% slower measured) between how fast we think we’re going and how fast we actually are.
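For anyone who wants to check the arithmetic, here is a minimal sketch of where the 43 points come from. The two inputs are the figures quoted in the list above; the variable names are just my own labels.

```python
# Figures quoted in the list above; names are illustrative only.
perceived_change_pct = 24    # developers feel 24% faster
measured_change_pct = -19    # controlled studies: 19% slower

# The gap is the distance between the two, in percentage points.
gap_points = perceived_change_pct - measured_change_pct
print(f"Perception gap: {gap_points} percentage points")
```

Note this is a difference in percentage points, not a claim that developers are 43% slower than they think.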
Where’s the Constraint?
From my perspective as VP Product, I keep asking: If coding is faster, what’s now the bottleneck?
Some hypotheses I’m working through:
1. The review bottleneck: Humans can’t review AI-generated code as fast as AI can generate it. The 91% increase in review time suggests this is real. Larger PRs + more subtle bugs = slower, more careful reviews.
2. The testing bottleneck: CI/CD pipelines weren’t designed for 60% more PRs. Teams that haven’t invested in automated testing are seeing their build queues explode.
3. The quality bottleneck: Speed gains evaporate when code needs rework due to bugs, security issues, or violations of team standards (design systems, accessibility, etc.).
4. The wrong bottleneck: Maybe coding was never the constraint for product velocity. Product decisions, customer feedback loops, go-to-market execution—those might be the actual limiting factors.
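To make the first hypothesis concrete, here is a rough sketch of how a review queue behaves once PRs arrive faster than they can be reviewed. The weekly numbers are made up for illustration (100 PRs opened against capacity for 110 reviews), and applying the 60% increase to every PR is an assumption, since that figure only describes our most active AI users.

```python
def review_backlog(prs_opened_per_week, reviews_done_per_week, weeks):
    """Return the unreviewed-PR backlog at the end of each week.

    Simple model (an assumption): arrivals and review capacity are
    constant, and the backlog can never drop below zero.
    """
    backlog = 0.0
    history = []
    for _ in range(weeks):
        backlog = max(0.0, backlog + prs_opened_per_week - reviews_done_per_week)
        history.append(backlog)
    return history


# Hypothetical baseline: reviewers have a little slack.
baseline = review_backlog(prs_opened_per_week=100, reviews_done_per_week=110, weeks=4)

# Hypothetical "after AI": 60% more PRs, same review capacity.
with_ai = review_backlog(prs_opened_per_week=160, reviews_done_per_week=110, weeks=4)

print("baseline backlog:", baseline)
print("with AI backlog: ", with_ai)
```

In this toy model the baseline queue stays empty, while the second queue grows by 50 PRs every week and never recovers. If review time per PR also rises, effective capacity drops and the backlog grows even faster.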
The Framework That Helps Me Think About This
I keep coming back to the Theory of Constraints: optimizing a non-constraint doesn’t improve system throughput. If we’ve saved each developer 3.6 hours of coding per week but haven’t touched the constraint (maybe it’s product prioritization, maybe it’s deployment approvals, maybe it’s customer discovery), we’ve just moved the pile of WIP to a different part of the system.
The gains are real at the individual level. But system-level velocity requires optimizing the slowest part of the pipeline.
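Here is a small sketch of that idea, treating the pipeline as a chain of stages where overall throughput is capped by the slowest one. The stage names and capacities (features per quarter) are hypothetical, not measurements from my team.

```python
def system_throughput(stage_capacity):
    """Return (bottleneck stage, throughput) for a serial pipeline.

    Assumes every feature passes through every stage, so the system
    can ship no faster than its lowest-capacity stage.
    """
    bottleneck = min(stage_capacity, key=stage_capacity.get)
    return bottleneck, stage_capacity[bottleneck]


# Hypothetical capacities, in features per quarter.
before = {"prioritization": 10, "coding": 12, "review": 14, "deploy_approval": 11}

# Same pipeline, with coding capacity raised substantially by AI tools.
after = dict(before, coding=20)

print("before:", system_throughput(before))
print("after: ", system_throughput(after))
```

Both calls report the same bottleneck and the same throughput: raising coding capacity changes nothing because coding was not the limiting stage in this example. Throughput only moves when the lowest-capacity stage is raised.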
What Are You Seeing?
For teams that are capturing AI productivity gains at the org level—what did you change besides adopting AI coding tools?
Did you overhaul your review process? Invest heavily in automated testing? Restructure how you break down work? Change your definition of “done”?
Or are you seeing the same paradox—faster coding, same product velocity—and treating it as a signal that coding wasn’t your bottleneck to begin with?
I’m genuinely trying to figure out if we’re missing something structural, or if this is just the reality: AI makes coding faster, but product development is a system, and you need to upgrade the whole system to capture the gains.
Sources: AI Coding Statistics, AI Productivity Paradox, Why Teams Are Busier But Not Faster