TL;DR: Everyone on my team is using AI coding tools and swears they’re 2x faster. Our sprint velocity? Basically flat. What gives? ![]()
I need to vent, and maybe get a reality check from this community.
Three months ago, our engineering team went all-in on AI coding assistants. GitHub Copilot for everyone. Cursor subscriptions. The works. The engineers were thrilled—finally, tooling that keeps up with their brains!
Fast forward to today: In our retros, devs report saving 3-4 hours per week. They’re completing tickets faster. Code review requests are flying. Our Slack is full of “look what AI generated!” screenshots.
But here’s the thing that’s keeping me up at night… our actual delivery velocity hasn’t budged.
We’re shipping roughly the same number of features per sprint as we did before AI. Our design-to-production cycle time? If anything, it’s slightly longer. And our engineering manager is scratching his head because the math just doesn’t add up.
The Numbers Don’t Lie (But They’re Confusing)
I went down a research rabbit hole this week, and wow—we’re not alone:
- 84% of developers now use AI coding tools (source)
- Organizations see only 10% productivity gains despite this massive adoption (source)
- Developers report saving hours, but review time increased 91% while tasks completed rose just 21% (Faros AI research)
That last stat hit me like a truck. 91% increase in review time. That’s our bottleneck right there.
What I’m Seeing From the Design Side 
Here’s my perspective as someone who bridges design and engineering:
Our engineers are cranking out code faster—that part is real. But the PRs are… how do I say this kindly… inconsistent. Some are brilliant. Others feel like they were written by someone (something?) that doesn’t quite understand the product context.
Code review has become this weird quality gate that’s more intense than before. Senior engineers are spending MORE time reviewing, not less, because they have to verify that the AI-generated code actually does what it claims.
Meanwhile, I’m sitting in design reviews wondering: Are we building the right features faster, or just building more features? Because from where I sit, we’re generating a lot of output, but I’m not sure it’s translating to better user experiences.
The Trust Gap That Nobody Talks About
Here’s what really bothers me: I’ve noticed devs using tools they don’t fully trust because they feel like they have to. Like, if you’re not using AI, you’re falling behind.
One of our junior engineers told me privately that Copilot sometimes slows him down, but he’s worried that admitting that makes him look “resistant to innovation.” That broke my heart a little. ![]()
The data backs this up: trust in AI coding tools dropped from 40% to 29% while adoption rose to 84% (source). That’s a 55-point gap between usage and confidence!
Are We Measuring the Wrong Things?
I keep coming back to this question: What does “productivity” even mean?
- Is it PRs merged per week?
- Features shipped per sprint?
- Customer problems solved?
- Business outcomes delivered?
Maybe individual developer speed just isn’t the right metric when we’re building complex products as a team. Maybe the bottleneck was never “how fast can one person write code”—it was always coordination, communication, and making sure we’re building the right thing.
What I Want to Know 
For the engineering leaders here:
- Are you seeing the same disconnect between individual speed and team velocity?
- How are you measuring AI impact beyond “hours saved”?
- What changed in your processes to actually capture the gains?
- How do you create space for people to honestly say “this tool isn’t helping me right now”?
For the product folks:
- Are we optimizing for output or outcomes? Does it matter if we ship features twice as fast if they don’t move metrics?
I want to believe AI tools can genuinely transform how we work—I’ve seen glimpses of it. But right now, it feels like we’re coding faster but shipping slower, and I can’t figure out if that’s a tooling problem, a process problem, or a measurement problem.
Or maybe it’s all three? ![]()
Curious if anyone else is experiencing this productivity paradox, or if we’re just doing it wrong.