I need to have a hard conversation about something I’m seeing across engineering and product teams. Your engineering organization might be celebrating velocity improvements from AI coding tools, but if you’re a PM looking at actual customer-facing delivery, you might be seeing a very different story.
Let me share what happened at my company.
The Disconnect
Three months ago, our VP of Engineering presented impressive metrics to the board:
- 85% increase in PRs merged
- 60% increase in commits per engineer
- “Dramatically improved development velocity” from AI tool adoption
The board was thrilled. Our technical roadmap was apparently accelerating.
Except… as VP of Product, my metrics told a completely different story:
- Feature delivery timeline: unchanged (still averaging 6-8 weeks from idea to customer)
- Customer-facing releases: same monthly cadence
- Product roadmap completion: actually slightly behind plan
- Customer satisfaction with pace of innovation: down 8 points
We were generating more code but delivering the same amount of value. Something wasn’t adding up.
Where the Time Goes
I dug into where the supposed velocity gains were disappearing. Here’s what I found:
The Review Bottleneck (we’ve discussed this extensively in other threads):
- Average time from PR creation to merge: 91% longer
- Senior engineer capacity consumed by review: massive increase
- Review iteration cycles: nearly doubled
Increased Bug Fixing:
- Production incidents: up 23%
- Time spent on bug fixes vs new features: shifted from 20/80 to 35/65
- Customer-reported issues: up 31%
Technical Debt Servicing:
- Refactoring PRs (fixing AI-generated code): up dramatically
- Code quality issues requiring rework: significant increase
- Architecture inconsistency fixes: new category of work
Integration and Testing:
- Features passing code review but failing integration tests
- More time in staging catching issues
- Longer QA cycles for AI-assisted features
The Real Cycle Time
When you measure the full cycle - from feature concept to customer value - the AI productivity gains largely evaporate:
Pre-AI Full Cycle (average feature):
- Design & planning: 1 week
- Implementation: 2 weeks
- Code review: 3-4 days
- QA & testing: 1 week
- Deployment & monitoring: 2-3 days
- Total: ~4.5 weeks
Post-AI Full Cycle (average feature):
- Design & planning: 1 week (unchanged)
- Implementation: 1 week (faster!)
- Code review: 7-8 days (slower!)
- QA & testing: 1.5 weeks (more bugs to catch)
- Bug fixes & rework: 4-5 days (new category!)
- Deployment & monitoring: 2-3 days (unchanged)
- Total: ~4.5 weeks (same!)
We saved a week in implementation but lost it in review, QA, and rework.
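To make “full cycle” concrete, here’s roughly how we now frame estimates: sum the range for every stage, not just implementation. A minimal sketch, assuming you track per-stage duration ranges in weeks - the stage names and numbers below are hypothetical placeholders, not our actual data:

```python
# Minimal sketch: full-cycle estimate as the sum of per-stage duration ranges.
# All stage names and numbers are hypothetical placeholders.

def full_cycle_weeks(stages):
    """Sum per-stage (low, high) ranges into a total cycle-time range."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high

stages = {
    "design_planning":   (1.0, 1.0),
    "implementation":    (0.8, 1.2),   # the only stage AI tools speed up directly
    "code_review":       (1.2, 1.6),
    "qa_testing":        (1.2, 1.8),
    "rework":            (0.6, 1.0),
    "deploy_monitoring": (0.4, 0.6),
}

low, high = full_cycle_weeks(stages)
print(f"Implementation alone: {stages['implementation'][0]}-{stages['implementation'][1]} weeks")
print(f"Full cycle:           {low:.1f}-{high:.1f} weeks")
```

The point isn’t the specific numbers; it’s that implementation is one term in the sum, so halving it barely moves the total when review, QA, and rework grow.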
The Customer Interview Data
I ran customer interviews specifically about product velocity perception. The feedback was concerning:
“Features are announced but seem to take forever to actually ship.”
“Quality has gotten worse - more bugs, more edge cases not handled.”
“New features feel rushed, like they weren’t fully thought through.”
“We asked for [feature A] but got [feature A-] that doesn’t quite solve our problem.”
This last point was particularly telling. AI helps engineers implement quickly, but speed isn’t the same as solving the right problem well.
The Business Case Is Falling Apart
We invested significantly in AI coding tools:
- Copilot subscriptions for 80 engineers: ~K/year
- Training and enablement: ~K
- Infrastructure for AI-assisted development: ~K
The promised ROI was based on 30-40% productivity improvements. But when you measure productivity as “customer value delivered per dollar spent,” we’re seeing maybe 5-8% improvement at best.
And that’s before accounting for:
- Senior engineer satisfaction decline (retention risk)
- Increased technical debt (future cost)
- Higher production incident rate (customer impact)
- Longer feature planning cycles (context-building overhead)
The Hard Conversation with Leadership
Last week I had to present this analysis to our CEO and board. It was not the conversation anyone wanted to have after celebrating our “AI transformation.”
I showed them two charts side by side:
Chart 1: Engineering Metrics (PRs, commits, “velocity”)
- Trending up dramatically
Chart 2: Product Metrics (features shipped, customer satisfaction, business KPIs)
- Flat or slightly declining
The question I posed: “Which metrics actually matter for our business?”
Where Engineering and Product Need to Align
The core issue: engineering and product are optimizing for different things.
Engineering is (understandably) excited about AI tools making coding faster and is measuring success by coding metrics.
Product is focused on customer value delivery and measuring success by customer outcomes.
These need to be the same thing.
What We’re Changing
We’re realigning around shared metrics:
Old Metrics (engineering-focused):
- PRs per week
- Commits per engineer
- Lines of code
New Metrics (outcome-focused):
- Customer-facing features shipped per quarter
- Time from customer request to delivered solution (see the sketch after this list)
- Production incident rate
- Customer satisfaction with product evolution
- Technical quality (measured by refactor/rework rate)
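For anyone who wants to see what measuring these looks like in practice, here’s a rough sketch of two of them (request-to-delivery time and rework rate) computed from an issue-tracker export. The record shape - requested_at, shipped_at, is_rework - is an assumption; map it to whatever your tracker actually stores:

```python
# Rough sketch: two outcome metrics computed from a hypothetical tracker export.
from datetime import date
from statistics import median

records = [
    {"requested_at": date(2024, 1, 8),  "shipped_at": date(2024, 2, 23), "is_rework": False},
    {"requested_at": date(2024, 1, 15), "shipped_at": date(2024, 3, 1),  "is_rework": False},
    {"requested_at": date(2024, 2, 5),  "shipped_at": None,              "is_rework": True},
]

# Time from customer request to delivered solution (shipped items only).
lead_times_days = [(r["shipped_at"] - r["requested_at"]).days
                   for r in records if r["shipped_at"] is not None]
print(f"Median request-to-delivery: {median(lead_times_days)} days")

# Refactor/rework rate: share of work items spent fixing or redoing prior work.
rework_rate = sum(1 for r in records if r["is_rework"]) / len(records)
print(f"Rework rate: {rework_rate:.0%}")
```

Nothing sophisticated; the hard part is agreeing on the definitions, not computing them.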
Process Changes:
- Feature planning now includes “full cycle time” estimates, not just implementation time
- Engineering estimates include review, QA, and likely rework
- We budget senior engineer review capacity as a constraint in planning (sketched after this list)
- AI tool usage is encouraged but not mandated - teams choose based on what actually improves outcomes
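On the review-capacity point, the model is deliberately simple: compare the review hours the plan implies against the hours senior reviewers can realistically give. A toy sketch - every number here is a placeholder assumption, not our data:

```python
# Toy sketch: treating senior review capacity as a planning constraint.
REVIEW_HOURS_PER_SENIOR_PER_SPRINT = 8   # assumed budget per reviewer
SENIOR_REVIEWERS = 6

planned_work = [
    {"feature": "feature_a", "est_review_hours": 10},
    {"feature": "feature_b", "est_review_hours": 14},
    {"feature": "feature_c", "est_review_hours": 18},
    {"feature": "feature_d", "est_review_hours": 12},
]

capacity = REVIEW_HOURS_PER_SENIOR_PER_SPRINT * SENIOR_REVIEWERS
demand = sum(item["est_review_hours"] for item in planned_work)

print(f"Review capacity: {capacity}h, planned review demand: {demand}h")
if demand > capacity:
    print("Plan exceeds review capacity: cut scope or expect the merge queue to grow.")
```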
The Cultural Challenge
The hardest part is managing the narrative. The AI vendor messaging is everywhere: “2x developer productivity,” “ship faster with AI,” “transform your engineering organization.”
Our engineers believe they’re more productive (and in a narrow sense, they are - writing code is faster). But the business isn’t seeing the productivity gains translate to outcomes.
This creates tension. Engineers feel their wins aren’t being recognized. Product feels engineering doesn’t understand the business impact. Leadership is confused about whether AI tools are working.
What Product Leaders Need from Engineering Leaders
If you’re a PM or product leader working with an engineering org using AI tools heavily, here’s what to ask for:
- Full cycle metrics: Not just implementation time, but idea-to-customer delivery time
- Quality metrics: Bug rates, rework rates, technical debt accumulation
- Capacity modeling: Explicit accounting for review bottlenecks in planning
- Shared success criteria: What customer outcomes are we trying to improve?
- Honest assessment: Where are AI tools actually helping vs. creating new problems?
The Path Forward
I’m not against AI tools. The implementation speedup is real and valuable for certain types of work. But I am deeply concerned about the mismatch between engineering metrics and business outcomes.
We need to be honest about the full system impacts - review bottlenecks, quality issues, senior engineer burden - and design our processes and metrics around what actually matters: delivering value to customers sustainably.
The productivity illusion is dangerous. It makes us feel like we’re moving faster while the customer experience suggests otherwise.
Questions for the Community
For other product leaders:
- Are you seeing this disconnect between engineering velocity metrics and actual feature delivery?
- How are you measuring AI tool ROI in product terms?
- What metrics have you aligned on with engineering?
For engineering leaders:
- How do you balance AI productivity narratives with product delivery realities?
- What does “productivity” actually mean when the bottleneck shifts from coding to review?
We need to figure this out together because the current state - celebrating code generation while customers wait the same amount of time for features - isn’t sustainable.