Six months ago, our engineering team adopted AI coding assistants across the board. GitHub Copilot for most folks, a few trying Cursor and Codeium. The feedback from developers has been overwhelmingly positive—they feel faster, more productive, less bogged down by boilerplate.
But here’s what’s keeping me up at night: Our delivery metrics haven’t changed. At all.
The Math That Doesn’t Add Up
Recent research shows developers save 3.6 to 5.4 hours per week using AI coding tools. That’s substantial: roughly half a workday or more. If coding time dropped by that much, you’d expect to see our sprint velocity jump, our cycle time shrink, and features ship faster.
Instead? Our average cycle time is still hovering around 7 days. Our velocity is flat. And when I dig into the data, I see why: coding is only 43% of our cycle time.
The other 57%? That’s pull request reviews, QA testing, integration checks, and deployment. And here’s the kicker—while our developers are writing code faster, our PR review time has increased by nearly 90%.
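To put rough numbers on why the savings vanish, here is a back-of-the-envelope sketch in Python. It uses the figures above (7-day cycle, 43% coding share, 3.6 to 5.4 hours saved, review time up roughly 90%). Everything else is an assumption of mine for illustration: a 40-hour work week, the idea that weekly hours saved shrink the coding phase by the same fraction, and the function names themselves.

```python
# Back-of-the-envelope model: only the coding slice of the cycle gets faster.
CYCLE_DAYS = 7.0         # average cycle time from the post
CODING_SHARE = 0.43      # coding's share of cycle time from the post
REVIEW_INCREASE = 0.90   # PR review time grew by roughly 90%
WORK_WEEK_HOURS = 40.0   # ASSUMPTION: standard 40-hour week


def cycle_after_coding_speedup(cycle_days, coding_share, coding_reduction):
    """Return the new cycle time if only the coding phase shrinks.

    coding_reduction is the fraction of coding time removed (0.0 to 1.0).
    """
    coding_days = cycle_days * coding_share
    other_days = cycle_days - coding_days
    return other_days + coding_days * (1.0 - coding_reduction)


def break_even_review_days(days_saved, review_increase):
    """Return the pre-adoption review duration at which the review
    slowdown cancels the coding savings exactly."""
    return days_saved / review_increase


for hours_saved in (3.6, 5.4):
    # ASSUMPTION: weekly hours saved map to the same fractional cut in coding time.
    reduction = hours_saved / WORK_WEEK_HOURS
    new_cycle = cycle_after_coding_speedup(CYCLE_DAYS, CODING_SHARE, reduction)
    days_saved = CYCLE_DAYS - new_cycle
    offset_at = break_even_review_days(days_saved, REVIEW_INCREASE)
    print(f"{hours_saved} h/week saved: cycle {new_cycle:.2f} days, "
          f"cancelled if review used to take {offset_at:.2f} days")

# Upper bound: even if coding took no time at all, the rest of the cycle remains.
print(f"floor: {cycle_after_coding_speedup(CYCLE_DAYS, CODING_SHARE, 1.0):.2f} days")
```

Under these assumptions the 7-day cycle only drops to about 6.6 to 6.7 days, and even a coding phase that took no time at all would leave about 4 days. A review phase that used to take as little as 0.3 to 0.45 days is enough for a 90% increase to wipe out the savings, which would be consistent with the flat metrics we see.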
The Bottleneck Just Moved
We didn’t eliminate friction; we just shifted it downstream. Now we have:
- Junior developers producing more code than ever before while senior engineers drown in review requests
- AI-generated code that’s harder to review because it looks correct but can have subtle bugs, incomplete error handling, or security issues (research shows 48% of AI-generated code has vulnerabilities)
- Security and QA teams that can’t keep pace with the volume of changes
- Integration and testing phases that have become the new constraint
I greenlit a significant investment in these AI tools thinking we’d see measurable delivery improvements. Our developers are happier, which matters. But from a business perspective, we’re not shipping faster.
What Are We Missing?
I can’t be the only leader facing this. If you’ve adopted AI coding assistants, what have you done to address the downstream bottlenecks?
Have you:
- Changed your code review process?
- Restructured teams to handle increased volume?
- Invested in different tooling for the review/QA/security phases?
- Measured where your actual bottleneck is now?
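On that last question, one way to measure it is to break each work item's cycle time into stages using timestamps you likely already have. The sketch below is illustrative only: the field names (`first_commit`, `pr_opened`, `pr_approved`, `merged`, `deployed`), the stage boundaries, and the sample records are all hypothetical, and you would need to map them to whatever your Git host and deploy pipeline actually export.

```python
from datetime import datetime

# HYPOTHETICAL stage boundaries: (stage name, start event, end event).
STAGES = [
    ("coding", "first_commit", "pr_opened"),
    ("review", "pr_opened", "pr_approved"),
    ("qa_and_integration", "pr_approved", "merged"),
    ("deployment", "merged", "deployed"),
]


def stage_hours(item):
    """Return hours spent in each stage for one work item."""
    return {
        name: (item[end] - item[start]).total_seconds() / 3600.0
        for name, start, end in STAGES
    }


def stage_shares(items):
    """Return each stage's share of total cycle time across all items."""
    totals = {name: 0.0 for name, _, _ in STAGES}
    for item in items:
        for name, hours in stage_hours(item).items():
            totals[name] += hours
    grand_total = sum(totals.values())
    return {name: hours / grand_total for name, hours in totals.items()}


def bottleneck(items):
    """Return the stage that consumes the largest share of cycle time."""
    shares = stage_shares(items)
    return max(shares, key=shares.get)


# Made-up sample records, for illustration only.
sample_items = [
    {
        "first_commit": datetime(2025, 1, 6, 9),
        "pr_opened": datetime(2025, 1, 7, 9),
        "pr_approved": datetime(2025, 1, 10, 9),
        "merged": datetime(2025, 1, 10, 15),
        "deployed": datetime(2025, 1, 11, 9),
    },
    {
        "first_commit": datetime(2025, 1, 13, 9),
        "pr_opened": datetime(2025, 1, 14, 15),
        "pr_approved": datetime(2025, 1, 16, 15),
        "merged": datetime(2025, 1, 17, 9),
        "deployed": datetime(2025, 1, 17, 15),
    },
]

for name, share in stage_shares(sample_items).items():
    print(f"{name}: {share:.0%} of cycle time")
print("largest stage:", bottleneck(sample_items))
```

Comparing these shares from before and after the AI rollout would show whether time really moved from coding into review, rather than relying on how fast things feel.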
I’d love to hear what’s worked—or what hasn’t. Because right now, I’m sitting on happy developers and flat delivery metrics, and I need to figure out where to invest next.
Sources: