I need to share something that’s been bothering me for months. My team has embraced AI coding assistants—GitHub Copilot, Cursor, Claude Code, you name it. Our adoption rate mirrors the industry: 93% of developers now use these tools. Yet when I look at our sprint velocity, deployment frequency, and actual feature delivery, we’re seeing maybe a 10% improvement. Maybe.
I thought I was missing something until I found the research.
The Perception Gap
A METR study ran a controlled experiment with experienced developers. The result? Developers using AI were 19% slower on average. But here’s the kicker: they believed they were 20% faster. Before starting, they predicted AI would make them 24% faster. After finishing—even with objectively slower results—they still thought AI had sped them up by about 20%.
This isn’t just a measurement problem. It’s a perception problem.
The Data Paints a Messy Picture
Let’s be honest about what the research shows:
- 93% adoption, 10% productivity gain - That’s a massive disconnect
- AI-assisted code has 1.7× more issues and 9% more bugs per developer
- PR sizes increased 154% on average with AI tools
- Bain & Company described real-world savings as “unremarkable”
- Meanwhile, GitHub, Google, and Microsoft’s early studies claimed 20-55% faster task completion
Someone’s measuring the wrong thing. Or maybe we all are.
What Are We Actually Measuring?
Here’s my theory: We’re measuring coding speed, not problem-solving speed. We’re counting commits and PRs, not customer value delivered. We’re tracking lines of code written, not bugs prevented or technical debt avoided.
AI tools are incredible at autocompleting boilerplate, generating tests, and converting comments into code. They make typing faster. But typing was never the bottleneck.
In our team, developers spend maybe 32% of their time actually writing code. The rest is meetings, code reviews, debugging, understanding context, aligning with product, waiting for CI/CD, and dealing with the friction of organizational complexity.
If AI makes that 32% twice as fast, we’ve cut total time by… 16%, which works out to roughly a 19% overall speedup at best. And that assumes zero quality tradeoff, which the data suggests isn’t true.
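This back-of-envelope math is just Amdahl’s law applied to a workweek. A minimal sketch, using the 32% coding share and the assumed 2× coding speedup from above (both inputs are estimates, not measurements):

```python
def amdahl_speedup(fraction_accelerated: float, speedup_factor: float) -> float:
    """Overall speedup when only a fraction of total work is accelerated (Amdahl's law)."""
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / speedup_factor)

coding_share = 0.32  # our team's rough estimate of time spent writing code
ai_speedup = 2.0     # assumed: AI doubles raw coding speed

overall = amdahl_speedup(coding_share, ai_speedup)
time_saved = 1.0 - 1.0 / overall
print(f"overall speedup: {overall:.2f}x, time saved: {time_saved:.0%}")
# overall speedup: 1.19x, time saved: 16%
```

Even a generous 2× on the coding slice barely moves the total, because the other 68% of the job is untouched. That ceiling alone could explain most of the adoption/productivity gap.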
So What Should We Measure?
I’m genuinely curious what others think. If lines of code, commits, and PR velocity aren’t the right metrics, what is?
Some candidates:
- Time to value - How long from idea to production?
- Deployment frequency - Are we shipping more often?
- Change failure rate - Are we shipping more bugs?
- Mean time to recovery - Can we fix issues faster?
- Cognitive load - Are developers less stressed and more focused?
- Defect density - Quality per feature, not just speed
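Several of these (deployment frequency, change failure rate) fall straight out of a deploy log. A minimal sketch of how we might compute two of them; the `Deploy` record and its fields are hypothetical, stand-ins for whatever your CI/CD system actually emits:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Deploy:
    day: date
    caused_incident: bool  # did this deploy trigger a rollback or hotfix?

def change_failure_rate(deploys: list[Deploy]) -> float:
    """Fraction of deployments that caused a production incident."""
    if not deploys:
        return 0.0
    return sum(d.caused_incident for d in deploys) / len(deploys)

def deploy_frequency_per_week(deploys: list[Deploy]) -> float:
    """Average deployments per week over the observed span."""
    days = (max(d.day for d in deploys) - min(d.day for d in deploys)).days + 1
    return len(deploys) / days * 7

deploys = [
    Deploy(date(2025, 1, 1), False),
    Deploy(date(2025, 1, 3), True),
    Deploy(date(2025, 1, 7), False),
    Deploy(date(2025, 1, 14), False),
]
print(change_failure_rate(deploys), deploy_frequency_per_week(deploys))
# ~2 deploys/week with a 25% change failure rate
```

The point isn’t these particular helpers; it’s that outcome metrics like these can be segmented by AI-tool usage, which is exactly the correlation the lines-of-code metrics can’t show.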
Or maybe the answer is simpler: developer satisfaction. If people feel more productive and enjoy their work more, does it matter if the spreadsheet doesn’t show a 50% velocity gain?
The Uncomfortable Question
Is the AI coding assistant productivity promise oversold? Or are we just measuring it wrong?
I’m not anti-AI. I use these tools every day. But if 93% of us have adopted something and organizational productivity has barely moved, we need to do one of three things:
- Figure out what we’re missing in how we measure productivity
- Admit that AI coding assistants solve the wrong problem
- Accept that 10% is good enough and set expectations (and prices) accordingly
What are you seeing in your teams? Are you measuring productivity differently? Have you found metrics that actually correlate with AI tool usage?
I’d love to hear what’s working—or not working—for others.