I need to share something that’s been bothering me for months, and I’m curious if other product and engineering leaders are seeing the same pattern.
Our engineering team fully adopted AI coding tools about 18 months ago. GitHub Copilot for everyone, Cursor for the senior engineers who wanted it, Claude Code for complex refactoring. The team loves these tools—our developer satisfaction scores are the highest they’ve been in years. Everyone feels more productive.
But here’s the thing: our velocity metrics haven’t budged. Sprint velocity? Flat. Time to ship features? Basically the same. DORA metrics? No meaningful improvement.
Then I started digging into the research, and the numbers are even more alarming than I expected:
The Productivity Paradox:
- AI tools now write 41% of all code across the industry, though only 26.9% of the production code that actually ships
- Developer adoption is at 84%—this is mainstream, not experimental
- Yet organizational productivity gains have plateaued at around 10%
That’s it. 10%. After all this investment, all this adoption, all this excitement.
Even worse—the perception gap is massive:
- Developers estimated they were 20% faster
- Measured against the clock, they were actually 19% slower
- That's a 39-percentage-point gap (20 points of perceived gain plus 19 points of actual loss) between feeling productive and being productive
And the bottleneck just moved:
- Teams with high AI adoption complete 21% more tasks
- But PR review time increased by 91%
- The code review process is now the constraint, not code generation (a quick way to check this on your own repos is sketched below)
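For anyone who wants to test the review-bottleneck claim against their own repos instead of taking the study's 91% at face value, here's the kind of script I've been poking at. It's a rough sketch against the GitHub REST API; the org/repo names and token are placeholders, and you'd want to run it over a pre-rollout window and a post-rollout window (filtering on `merged_at`) to see whether time-to-first-review actually ballooned for your team.

```python
import statistics
from datetime import datetime

import requests

# Placeholders: swap in your own org/repo and a GitHub token with
# repo read scope. None of these values come from the post above.
OWNER, REPO = "your-org", "your-repo"
HEADERS = {"Authorization": "Bearer YOUR_GITHUB_TOKEN"}
API = "https://api.github.com"


def parse(ts):
    """GitHub timestamps are ISO 8601 with a trailing Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def review_latencies(pages=3):
    """Hours from PR creation to first submitted review, merged PRs only."""
    latencies = []
    for page in range(1, pages + 1):
        prs = requests.get(
            f"{API}/repos/{OWNER}/{REPO}/pulls",
            params={"state": "closed", "per_page": 100, "page": page},
            headers=HEADERS,
        ).json()
        for pr in prs:
            if not pr.get("merged_at"):
                continue  # closed without merging; not interesting here
            reviews = requests.get(
                f"{API}/repos/{OWNER}/{REPO}/pulls/{pr['number']}/reviews",
                headers=HEADERS,
            ).json()
            submitted = [parse(r["submitted_at"]) for r in reviews
                         if r.get("submitted_at")]
            if not submitted:
                continue  # merged with no formal review
            created = parse(pr["created_at"])
            latencies.append((min(submitted) - created).total_seconds() / 3600)
    return latencies


if __name__ == "__main__":
    hours = review_latencies()
    if hours:
        print(f"PRs sampled: {len(hours)}")
        print(f"Median hours to first review: {statistics.median(hours):.1f}")
```

Compare the median across the two windows. If the 91% figure is anywhere close to true for your team, it should show up immediately, and the tail of the distribution will tell you whether it's every PR or a few huge ones.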
I’m sitting in budget meetings where our CFO is asking pointed questions about our AI tool spend. And honestly? I don’t have great answers. The developers are happier, but the business outcomes aren’t there.
From a product perspective, this feels like when you ship a feature that gets great NPS scores but doesn’t move retention or revenue. The user sentiment is positive, but the business metrics tell a different story.
Questions for the group:
- Are you seeing actual velocity improvements, or just developer happiness improvements?
- Have you changed your review processes, testing infrastructure, or deployment pipelines to match the new code generation pace?
- How are you measuring AI tool ROI when self-reporting is this unreliable? (I've put a rough break-even model after this list)
- Did we hit some kind of ceiling where code generation speeds up but everything downstream becomes the bottleneck?
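On the ROI question specifically, the crude model I keep sketching for the CFO conversations looks like the snippet below. Every number in it is a made-up placeholder (seat price, headcount, loaded cost, the gain itself), not data from anywhere in this post; the point is the structure: once you value the gain as reclaimed engineering capacity, license spend is a rounding error and the variables that matter are the measured gain and the downstream drag.

```python
# Back-of-envelope ROI model for AI coding tools.
# Every number is an illustrative placeholder; plug in your own
# measured figures (from delivery telemetry, not self-reports).

SEATS = 60                     # engineers with tool licenses
TOOL_COST_PER_SEAT = 40 * 12   # $/seat/year (hypothetical list price)
LOADED_COST_PER_ENG = 220_000  # $/year fully loaded engineer cost
MEASURED_GAIN = 0.10           # observed org-level throughput delta
DOWNSTREAM_DRAG = 0.03         # e.g. extra review load eating the gain

tool_spend = SEATS * TOOL_COST_PER_SEAT
effective_gain = MEASURED_GAIN - DOWNSTREAM_DRAG

# Value the gain as reclaimed engineering capacity.
capacity_value = SEATS * LOADED_COST_PER_ENG * effective_gain

print(f"Annual tool spend:         ${tool_spend:,.0f}")
print(f"Effective throughput gain: {effective_gain:.0%}")
print(f"Capacity value reclaimed:  ${capacity_value:,.0f}")
print(f"ROI multiple:              {capacity_value / tool_spend:.1f}x")

# Break-even: smallest effective gain that covers the tool spend.
break_even_gain = tool_spend / (SEATS * LOADED_COST_PER_ENG)
print(f"Break-even gain needed:    {break_even_gain:.1%}")
```

Which is partly why the 10% number bothers me: on any plausible inputs the seats pencil out trivially, but the model says nothing about the cost of re-plumbing review, testing, and deployment to absorb the extra code. That's where the real investment went.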
I'm not anti-AI; these tools are clearly valuable for developer experience and retention. But if we're being honest about the business case, a 10% productivity gain for this level of investment and organizational change feels… underwhelming.
What am I missing? Or is this just the new reality we need to accept?