DORA Metrics vs AI Velocity: What Should Engineering Leaders Actually Measure in 2026?

After three years as CTO watching the AI productivity conversation evolve, I need to call out what I think is the fundamental problem:

Everyone’s tracking PR count. But PR count is not productivity.

Metrics Drive Behavior—Choose Carefully

The metrics we track determine what our teams optimize for. And right now, we’re optimizing for the wrong things.

I keep seeing orgs celebrate:

  • “AI users merge 60% more PRs!”
  • “Code output increased 98%!”
  • “Developers report feeling 20% more productive!”

Then I ask: “What happened to your deployment frequency? Change failure rate? Customer satisfaction?”

Usually: silence, or declining trends.

DORA Still Matters—Maybe More Than Ever

The DORA metrics remain the best sanity check because they tie delivery speed directly to system stability:

  1. Deployment frequency - How often do we actually ship to production?
  2. Lead time for changes - Commit to production (not PR merge)
  3. Change failure rate - What percentage of deployments cause incidents?
  4. Time to restore service - When things break, how fast do we recover?

These metrics force you to care about the entire value stream, not just the part AI makes faster.
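
For anyone who wants to compute these from their own deploy log, here is a minimal sketch in Python. The `Deployment` record, its field names, and the `dora_summary` helper are assumptions made for illustration, not the output of any particular tool. Lead time is measured from commit to production, as in the list above, and medians are used so that one slow change does not dominate.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; adapt the fields to your own deploy log.
    commit_at: datetime                      # first commit of the change
    deployed_at: datetime                    # when the change reached production
    caused_incident: bool = False            # did this deploy trigger an incident?
    restored_at: Optional[datetime] = None   # when service was restored, if it did


def dora_summary(deploys: list[Deployment], window_days: int) -> dict:
    """Summarize the four DORA metrics over a fixed reporting window."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.caused_incident]
    lead_times = [d.deployed_at - d.commit_at for d in deploys]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Made-up sample: four deploys in a two-day window, one of which caused an incident.
t0 = datetime(2026, 1, 5, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4), caused_incident=True,
               restored_at=t0 + timedelta(hours=5)),
    Deployment(t0, t0 + timedelta(hours=6)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
summary = dora_summary(sample, window_days=2)
```

Note that nothing in this summary counts PRs; the inputs are production deploys and incidents only.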

The New Reality: Separating Code Output from Value Delivery

Here’s the framework I’m using in 2026:

✗ Vanity Metrics (Stop Tracking These)

  • Pull requests per week
  • Lines of code written
  • “Percentage of code written by AI”
  • Individual developer velocity
  • Time to PR merge

✓ Business Impact Metrics (Start Tracking These)

  • Time to customer value (idea → production → adoption)
  • Deployment success rate (deployments without incidents)
  • Customer satisfaction (NPS, support tickets, adoption)
  • Production stability (MTTR, incident frequency)
  • Team health (engagement, knowledge distribution, collaboration)

What the Research Actually Shows

The data is stark: research from 2026 shows that the correlation between AI adoption and improved output, visible at the team level, evaporates when you look at company-level KPIs.

Teams merge more PRs. Companies don’t ship more value.

Why? Because we’re measuring intermediates, not outcomes.

Writing code is easy. Validating it, integrating it, deploying it safely, ensuring customers actually want it—that’s the hard part. And that’s what actually creates business value.

My Recommendations for 2026

1. Stop celebrating velocity
Start celebrating: successful deployments, resolved customer issues, reduced incidents.

2. Track end-to-end, not intermediates
Measure from customer need → delivered solution, not from code start → PR merge.

3. Make quality visible
Track defect rates, rework rates, and production incidents alongside velocity.

4. Include team health
Knowledge distribution, collaboration time, and psychological safety all matter.

5. Align engineering and business metrics
If engineering hits all its numbers but the business isn’t succeeding, your metrics are wrong.

The Question That Matters

Not: “How many PRs did we merge this quarter?”

But: “How much customer value did we deliver, and how sustainably did we do it?”

For engineering leaders: What are you actually measuring in 2026? Have you stopped celebrating velocity? What metrics tie your team’s work to business outcomes?

I’d love to hear what frameworks are working for others.


Sources: AI Productivity Paradox Research | Rethinking Productivity Metrics | Developer Productivity Measurement 2026

Michelle, this is exactly the conversation we need to be having. Because metrics don’t just measure behavior—they drive it.

When We Measured Wrong Things, We Got Wrong Behavior

Last year, our leadership team was tracking:

  • PRs merged per engineer
  • Lines of code per sprint
  • Ticket completion velocity

You know what we got?

  • Engineers gaming the metrics (small PRs, unnecessary refactors)
  • Competition instead of collaboration
  • Junior devs copying senior devs’ code to boost their stats
  • Team morale in the toilet

We were literally incentivizing the opposite of what we wanted.

The Shift: Team Metrics Over Individual Metrics

We stopped tracking individual velocity entirely and started measuring team outcomes:

Team Health Metrics:

  • Knowledge distribution (can >1 person explain each system?)
  • Collaboration time (pairing, mob programming %)
  • Psychological safety scores (bi-weekly pulse)
  • Learning & development (skills acquired, mentoring hours)
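
The knowledge distribution check above ("can >1 person explain each system?") is easy to turn into a number. A rough sketch, assuming you keep a simple map from system name to the people who can explain it; the `knowledge_distribution` function and the sample data are hypothetical:

```python
def knowledge_distribution(explainers: dict[str, set[str]]) -> tuple[float, list[str]]:
    """Return (share of systems with more than one explainer, sorted at-risk systems)."""
    if not explainers:
        raise ValueError("need at least one system")
    # A system is at risk when zero or one person can explain it.
    at_risk = sorted(name for name, people in explainers.items() if len(people) <= 1)
    covered = len(explainers) - len(at_risk)
    return covered / len(explainers), at_risk


# Made-up example map of system -> people who can explain it.
team_map = {
    "billing": {"ana", "raj"},
    "auth": {"ana"},
    "search": {"li", "raj", "ana"},
    "reports": set(),
}
share, at_risk = knowledge_distribution(team_map)
```

The at-risk list is usually more useful than the percentage, since it tells you where to schedule pairing.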

Delivery Metrics (Team-Level):

  • Sprint goals achieved (not tasks completed)
  • Deployment success rate
  • Customer value delivered (features in production + adopted)
  • Incident response time and frequency

Quality Metrics:

  • Production defect rate
  • Security findings
  • Accessibility compliance
  • Technical debt ratio

The shift: We stopped asking “who was most productive?” and started asking “did the team deliver customer value sustainably?”

What Changed

Behavior shifted immediately:

  • Senior engineers started pairing with juniors (improves team capability)
  • Code reviews became thorough and educational (improves quality)
  • Engineers collaborated on complex problems (knowledge distribution)
  • Team celebrated shared wins, not individual stats

Michelle’s framework is spot-on. Individual velocity metrics in the AI era just measure how fast we can manufacture problems. Team success metrics measure how well we’re actually serving customers.

Coming from financial services, I have to add the compliance and quality perspective that’s often missing from these productivity discussions.

In Regulated Industries, Speed Without Quality Is Liability

Michelle’s DORA metrics are crucial. Keisha’s team health focus makes sense. But let me add what matters in fintech:

Compliance & Risk Metrics:

  • Defect escape rate - Share of all defects that reach production instead of being caught in testing
  • Security findings - Vulnerabilities per release
  • Audit readiness - Can we explain every line of code?
  • Regulatory compliance - Are we meeting SOC 2, PCI-DSS requirements?
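
Defect escape rate is the one item in this list with a simple formula. A minimal sketch, assuming the common definition of escaped defects divided by all defects found in the period; the function name is just illustrative:

```python
def defect_escape_rate(found_in_production: int, caught_in_testing: int) -> float:
    """Share of all known defects that escaped to production (0.0 to 1.0)."""
    if found_in_production < 0 or caught_in_testing < 0:
        raise ValueError("defect counts cannot be negative")
    total = found_in_production + caught_in_testing
    if total == 0:
        return 0.0  # no defects recorded at all in this period
    return found_in_production / total


# Made-up example: 5 production bugs against 45 caught before release.
example_rate = defect_escape_rate(5, 45)
```

A rising rate alongside rising PR volume is a sign that validation is not keeping up with output.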

What Happened When We Optimized for Velocity

Six months ago, our PR velocity increased 73% with AI adoption. Leadership celebrated.

Then:

  • Security audit found 2.3× more vulnerabilities than the previous quarter
  • Production incident rate increased 41%
  • Failed compliance audit for the first time in company history

The root cause? Fast code that wasn’t properly validated for security, compliance, or production resilience.

We were measuring speed. We should have been measuring safe, compliant delivery.

What We Track Now

Quality Gates (Must-Pass):

  • Security scan results (no critical/high findings)
  • Compliance checklist completion
  • Test coverage > 80% for financial logic
  • Architecture review sign-off for sensitive systems
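
As a sketch of how gates like these could be encoded as a single must-pass check: the `ReleaseCandidate` fields and the `failed_gates` helper below are hypothetical, and in practice each value would come from your own scanner, checklist, and coverage tooling. The coverage rule mirrors the list above, so exactly 80% does not pass.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCandidate:
    # Hypothetical fields; populate them from your own tooling.
    critical_or_high_findings: int    # open critical/high security findings
    compliance_checklist_done: bool
    financial_logic_coverage: float   # percent, 0-100
    touches_sensitive_system: bool
    architecture_signoff: bool


def failed_gates(rc: ReleaseCandidate, min_coverage: float = 80.0) -> list[str]:
    """Return the names of the gates this release candidate fails (empty means pass)."""
    failures = []
    if rc.critical_or_high_findings > 0:
        failures.append("security scan")
    if not rc.compliance_checklist_done:
        failures.append("compliance checklist")
    if rc.financial_logic_coverage <= min_coverage:  # must be strictly above the bar
        failures.append("test coverage")
    if rc.touches_sensitive_system and not rc.architecture_signoff:
        failures.append("architecture review")
    return failures


# Made-up examples: one clean candidate, one that fails every gate.
clean = ReleaseCandidate(0, True, 85.0, True, True)
risky = ReleaseCandidate(2, False, 80.0, True, False)
```

Returning the list of failed gates, instead of a bare pass/fail, makes it easier to see which gate blocks releases most often.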

Health Metrics:

  • Production incidents per 1000 deployments
  • Mean time to detect + resolve security issues
  • Compliance violations (should be zero)
  • Customer-reported defects per release

Sustainable Delivery:

  • Deployment frequency (with safety gates)
  • Change lead time (including compliance review)
  • Rollback rate (should be < 5%)
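
Two of the numbers above are plain ratios. A small sketch, with hypothetical function names and made-up counts, that normalizes incidents per 1000 deployments and checks the rollback rate against the "< 5%" target:

```python
def incidents_per_1000(incidents: int, deployments: int) -> float:
    """Production incidents normalized per 1000 deployments."""
    if deployments <= 0:
        raise ValueError("need at least one deployment")
    return 1000 * incidents / deployments


def rollback_rate(rollbacks: int, deployments: int) -> float:
    """Fraction of deployments that were rolled back."""
    if deployments <= 0:
        raise ValueError("need at least one deployment")
    return rollbacks / deployments


def rollback_within_target(rollbacks: int, deployments: int, target: float = 0.05) -> bool:
    """True when the rollback rate is strictly below the target (5% by default)."""
    return rollback_rate(rollbacks, deployments) < target


# Made-up quarter: 400 deployments, 3 incidents.
quarter_incidents = incidents_per_1000(3, 400)
```

Normalizing by deployment count matters here: if deployment frequency goes up, raw incident counts will rise even when each deploy is no riskier than before.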

The message to my team: I don’t care how fast you merged PRs if we fail an audit or have a security breach.

In regulated industries, the only productivity metric that matters is safe, compliant value delivery. Everything else is vanity.

From the product side, I’m grateful for this thread because it’s highlighting a fundamental misalignment I’ve been struggling with:

Engineering Metrics Must Tie to Business Metrics

Michelle’s DORA framework makes sense. Keisha’s team health focus is important. Luis’s quality gates are critical.

But here’s my challenge: All of those are still engineering-internal metrics.

What about the business outcomes we’re supposed to be driving?

The Framework I’m Proposing

Engineering-Product Joint Metrics:

1. Features Shipped → Adoption Rate → Customer Value

  • Not just “merged” or “deployed”
  • Track: % of users who actually use the feature within 30 days
  • Measure: Customer satisfaction, retention impact, business metric movement

2. Cycle Time: Need Identified → Solution Delivered → Impact Measured

  • End-to-end, not just code → deployment
  • Include: discovery, validation, iteration based on feedback
  • Goal: Learning and value, not just shipping

3. Quality from Customer Perspective

  • Production defects (Luis’s metric) ✓
  • But also: Customer support tickets, user-reported issues
  • Time to fix customer-impacting bugs vs internal technical debt

4. Shared Accountability

  • Engineering owns: deployment success, system stability
  • Product owns: feature adoption, customer value
  • Joint ownership: Time to customer value, customer satisfaction
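
For the 30-day adoption measure in item 1, a minimal sketch of the calculation. It assumes you can get the set of active users and each user's first use date for the feature from product analytics; the `adoption_rate` function, its parameters, and the sample data are hypothetical.

```python
from datetime import date, timedelta


def adoption_rate(
    released_on: date,
    active_users: set[str],
    first_use: dict[str, date],
    window_days: int = 30,
) -> float:
    """Share of active users who first used the feature within the window after release."""
    if not active_users:
        raise ValueError("need at least one active user")
    cutoff = released_on + timedelta(days=window_days)
    adopters = {
        user
        for user, used_on in first_use.items()
        if user in active_users and released_on <= used_on <= cutoff
    }
    return len(adopters) / len(active_users)


# Made-up example: four active users, two of whom adopt within 30 days.
release = date(2026, 3, 1)
users = {"a", "b", "c", "d"}
first_use = {
    "a": date(2026, 3, 2),
    "b": date(2026, 3, 31),   # exactly on the 30-day cutoff, so it counts
    "c": date(2026, 4, 15),   # too late
    "z": date(2026, 3, 5),    # not an active user, so ignored
}
rate = adoption_rate(release, users, first_use)
```

The denominator is the design choice to agree on jointly: all active users, or only the segment the feature was built for.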

Why This Matters

Last quarter, engineering hit every metric:

  • Deployment frequency: ✓ Improved
  • Change failure rate: ✓ Reduced
  • Lead time: ✓ Faster

But product missed our goals:

  • Feature adoption: ✗ 23% below target
  • Customer satisfaction: ✗ Declined 8 points
  • Revenue impact: ✗ 40% below projections

If engineering succeeds but product fails, whose metrics were right?

I think the issue is we’re measuring different things. Engineering optimizes for delivery. Product optimizes for outcomes. And the gap between them is growing.

Proposal: Engineering and product should co-own end-to-end value delivery metrics. Not separate scorecards. Shared success criteria.

Because shipping fast code that customers don’t want or use is not productivity—it’s waste.

As a practitioner actually doing the work, I love the strategic thinking here. But I want to add the human perspective on metrics:

Metrics Should Help Teams Improve, Not Punish Them

I’ve been on teams where metrics were used to:

  • Rank engineers against each other
  • Justify layoffs
  • Create competition instead of collaboration
  • Punish “low performers”

And I’ve been on teams where metrics were used to:

  • Identify bottlenecks and help
  • Celebrate team wins
  • Guide process improvements
  • Support struggling engineers

Same metrics. Completely different cultures. Completely different outcomes.

What Happened When We Tracked Individual Velocity

When my previous company tracked PRs per engineer:

  • We started splitting up work artificially to boost numbers
  • Code reviews became rubber stamps (don’t slow down teammates)
  • No one wanted to work on complex, low-PR-count projects
  • Knowledge hoarding (sharing slows you down)

When we switched to team-level outcome metrics:

  • Collaboration increased
  • Code quality improved
  • People took on hard problems
  • Knowledge sharing became normal

My Alternative Metric Framework

Focus on Flow, Not Speed:

  • Cycle time - How long does work actually take?
  • Blockers - What’s preventing progress?
  • Rework rate - How often do we rebuild/fix?
  • Team satisfaction - Are people burnt out or energized?
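
Cycle time and rework rate can both come straight from the ticket tracker. A rough sketch, assuming each work item has a start time, a finish time, and a flag for whether it was reopened after being marked done; the `WorkItem` shape and `flow_summary` helper are illustrative only:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class WorkItem:
    # Hypothetical record shape; map it from your own tracker's export.
    started_at: datetime
    finished_at: datetime
    reopened: bool = False   # was the item reopened or redone after "done"?


def flow_summary(items: list[WorkItem]) -> dict:
    """Team-level median cycle time and rework rate for a set of finished items."""
    if not items:
        raise ValueError("need at least one work item")
    cycle_times = [item.finished_at - item.started_at for item in items]
    return {
        "median_cycle_time": median(cycle_times),
        "rework_rate": sum(item.reopened for item in items) / len(items),
    }


# Made-up example: three items, one of which had to be redone.
start = datetime(2026, 2, 2)
items = [
    WorkItem(start, start + timedelta(days=1)),
    WorkItem(start, start + timedelta(days=3), reopened=True),
    WorkItem(start, start + timedelta(days=10)),
]
flow = flow_summary(items)
```

The summary is deliberately computed per team, not per person, so it cannot be used to rank engineers against each other.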

The Question That Matters:
“Did we build the right thing, the right way, sustainably?”

Not:

  • “Did we build fast?”
  • “Did we merge lots of PRs?”
  • “Did we write lots of code?”

Michelle, your DORA metrics answer “did we deploy safely and quickly?”

David, your adoption metrics answer “did customers want this?”

Luis, your compliance metrics answer “did we build it correctly?”

But I want to add: “Are we building a team that can sustain this long-term?”

Because burnout, knowledge loss, and morale collapse are also productivity killers. We just don’t measure them until it’s too late.