DORA Metrics Predict Performance, But Are We Using Them for Learning or Surveillance? Culture Determines Which

Three months into my current role, I inherited a team with “great” DORA metrics. Deployment frequency? Check. Lead time? Impressive. Change failure rate? Within acceptable range. The dashboards looked beautiful.

But the team was miserable. Quality issues kept surfacing weeks after deployment. Retrospectives felt like performance reviews. Engineers were stressed, burned out, and planning their exits.

What went wrong? The metrics were being used as surveillance tools, not learning instruments.

The Same Metrics, Opposite Outcomes

Here’s what I’ve learned across 18 years in engineering leadership: DORA metrics predict performance, but culture determines whether they help or hurt. The same five metrics (yes, deployment rework rate joined the party as the fifth metric) can drive two completely different outcomes:

The Surveillance Trap:

  • Metrics become individual performance evaluations
  • Teams game the system (empty commits, hiding incidents, splitting PRs artificially)
  • Trust erodes, psychological safety collapses
  • Metrics improve on paper while actual delivery degrades

The Learning Approach:

  • Metrics indicate system health, not people scorecards
  • Focus shifts to “What’s blocking us?” not “Why are you slow?”
  • Blameless retrospectives identify bottlenecks
  • Continuous improvement becomes team-driven, not manager-driven

Real Examples from My Teams

Team A (Previous Company - Surveillance Culture):

  • High deployment frequency achieved by deploying configuration changes and empty commits
  • Incidents were “resolved” in tools but root causes never addressed
  • Lead time looked great because work was split into tiny PRs to game the metric
  • Result: Burned out team, technical debt explosion, customer satisfaction plummeting

Team B (Current Company - Learning Culture):

  • Lower deployment frequency, but every deploy delivers real value
  • Incidents trigger blameless postmortems that improve the system
  • Lead time conversations focus on reducing handoffs and wait times
  • Result: Happy team, declining incident rate, customers love the quality

The 2026 AI Complication

Here’s a new wrinkle we’re all dealing with: AI tools are raising throughput across the industry. The 2025 DORA Report found that AI adoption is associated with higher delivery throughput but also with greater delivery instability.

This means:

  • Deployment frequency is likely rising across your organization
  • Change failure rate may be rising right behind it
  • You need to monitor quality metrics with extra care
  • The surveillance approach becomes even more dangerous (blaming engineers for AI-amplified failures)
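
One way to watch both signals together is to compare throughput and instability between two periods at team level. The sketch below is only an illustration: the `Deployment` record, its `failed` flag, and the sample numbers are hypothetical and not taken from any real tool or from the DORA report.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Deployment:
    """Hypothetical team-level deployment record (no author field, on purpose)."""
    day: date
    failed: bool  # True if the deploy led to a rollback, hotfix, or incident


def summarize(deploys: list[Deployment], days_in_period: int) -> dict[str, float]:
    """Return deployment frequency (per day) and change failure rate for one period."""
    total = len(deploys)
    failures = sum(1 for d in deploys if d.failed)
    return {
        "deploys_per_day": total / days_in_period,
        "change_failure_rate": failures / total if total else 0.0,
    }


def throughput_up_stability_down(previous: dict[str, float], current: dict[str, float]) -> bool:
    """True when deployment frequency rose and change failure rate rose with it."""
    return (
        current["deploys_per_day"] > previous["deploys_per_day"]
        and current["change_failure_rate"] > previous["change_failure_rate"]
    )


# Made-up sample data: 10 deploys with 1 failure, then 20 deploys with 5 failures.
previous_period = [Deployment(date(2026, 1, 1 + i), failed=(i == 0)) for i in range(10)]
current_period = [Deployment(date(2026, 2, 1 + i), failed=(i < 5)) for i in range(20)]

prev = summarize(previous_period, days_in_period=30)
curr = summarize(current_period, days_in_period=30)

if throughput_up_stability_down(prev, curr):
    print("Throughput and instability rose together: review validation, not people.")
```

The check is a prompt for a team conversation about validation and review capacity, not a verdict. It says nothing about who shipped what.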

Psychological Safety Is the Foundation

The research is clear: psychological safety is among the strongest predictors of software delivery performance. Teams where members feel safe to take risks and voice concerns tend to perform better across the DORA metrics.

But surveillance-style metrics destroy psychological safety. When engineers fear their deployment frequency will be used in performance reviews, they optimize for the metric, not for the outcome.

What Actually Works

In the teams I’ve rebuilt using a learning approach:

  1. Never discuss individual metrics in 1:1s or reviews - DORA metrics are team-level indicators only
  2. Weekly “system improvement” sessions - Review metrics to identify bottlenecks, not to blame people
  3. Celebrate when metrics temporarily worsen - If a team slows deployments to fix technical debt, that’s success
  4. Pair DORA with quality metrics - Code quality, customer satisfaction, and technical debt alongside delivery speed
  5. Platform engineering investment - Reduce cognitive load system-wide rather than pushing individuals harder

The most powerful shift: When a metric looks bad, the first question is “What’s blocking the team?” not “Who’s underperforming?”
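
The "team-level only" rule can also be enforced in the data pipeline, so individual numbers never reach a dashboard in the first place. Here is a minimal sketch, assuming change records arrive as dictionaries with hypothetical `team`, `author`, and `lead_time_hours` keys; your tooling will use different names.

```python
from collections import defaultdict
from statistics import median


def team_lead_times(changes: list[dict]) -> dict[str, float]:
    """Median lead time in hours per team. Author fields are never read or returned."""
    by_team: dict[str, list[float]] = defaultdict(list)
    for change in changes:
        by_team[change["team"]].append(change["lead_time_hours"])
    return {team: median(hours) for team, hours in by_team.items()}


# Made-up sample records for illustration only.
raw_changes = [
    {"team": "payments", "author": "a", "lead_time_hours": 20.0},
    {"team": "payments", "author": "b", "lead_time_hours": 30.0},
    {"team": "payments", "author": "a", "lead_time_hours": 100.0},
    {"team": "onboarding", "author": "c", "lead_time_hours": 8.0},
    {"team": "onboarding", "author": "d", "lead_time_hours": 12.0},
]

print(team_lead_times(raw_changes))
# → {'payments': 30.0, 'onboarding': 10.0}
```

Using the median rather than the mean keeps one unusually slow change from dominating the team's number. Aggregating before anything is stored or displayed removes the temptation to slice by person later.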

The Question for This Community

How are you using DORA metrics in your organizations?

Are they tools for learning and improvement? Or have they become surveillance instruments that teams learn to game?

For those who’ve successfully built a learning culture around metrics, what practices made the difference? For those dealing with surveillance cultures, what strategies have helped shift the conversation?

I’m particularly curious how other engineering leaders are handling the AI-amplified throughput challenge while maintaining quality and team health.


Context: Director of Engineering at a Fortune 500 financial services company, formerly Intel and Adobe. Leading teams of 40+ engineers through digital transformation while trying to maintain the human elements that make engineering work sustainable.

Luis, this hits close to home. I’ve seen the surveillance trap destroy teams at scale.

The challenge at the CTO level is that board members and investors now ask about DORA metrics directly. “What’s your deployment frequency?” “How fast do you recover from incidents?” These questions come from a good place—they want evidence that engineering is performing—but they create enormous pressure to use metrics for evaluation rather than learning.

The External Pressure Problem

When your board asks about DORA metrics quarterly, there’s a natural temptation to cascade that pressure down the organization. Show me the numbers. Explain the trends. Why isn’t this team performing like that team?

I watched a peer CTO go down this path. Six months later, their engineering organization had:

  • Teams gaming every metric imaginable
  • The best engineers planning their exits
  • Trust between engineering and leadership completely broken
  • Board meetings where the metrics looked great while product quality collapsed

What Actually Works at Scale

Here’s the framework I’ve implemented across our 80+ engineer organization:

1. Never discuss individual metrics in performance reviews

This is non-negotiable. DORA metrics are team-level indicators. The moment you start slicing to individuals, you lose all context and create perverse incentives.

2. Team-level metrics with blameless retrospectives

Every sprint, teams review their own DORA metrics and answer: “What’s blocking us from improving?” Not “Why are we slow?” but “What systemic issues can we address?”

3. Focus board conversations on “What’s blocking us?”

When the board asks about metrics, I reframe: “Here’s what DORA tells us about our delivery system health. Here are the three bottlenecks we’re addressing. Here’s our investment in platform engineering to improve the system.”

This shifts the conversation from surveillance to strategy.

4. Platform engineering as DORA investment

We’ve made significant platform engineering investments specifically to improve DORA metrics system-wide. Internal developer portals, standardized CI/CD, observability tooling—all reduce cognitive load and improve delivery without pushing individuals harder.

The Question That Haunts Me

Luis, how do you handle executives who want to use metrics for performance evaluation? I’ve had hiring managers ask “What’s this candidate’s deployment frequency at their current company?” and had to explain why that’s the wrong question.

The cultural shift from surveillance to learning starts at the top, but maintaining it requires constant vigilance. One wrong comment in a leadership meeting can undo months of trust-building.

The AI complication you mentioned is real. We’re seeing deployment frequency rise across teams using Copilot and other tools, but we’re also seeing more subtle bugs slip through. The surveillance approach would blame engineers for “not testing properly.” The learning approach asks “What validation systems do we need for AI-assisted code?”

Appreciate you raising this conversation. It’s one of the most important cultural questions engineering leaders face in 2026.

The scaling challenge hits different. Metrics become more important AND more dangerous during rapid growth.

I experienced this firsthand at Slack versus at my current EdTech startup. At Slack, we had a strong learning culture that made DORA metrics genuinely useful. Teams used them to identify bottlenecks and improve workflows. It felt natural.

At my current startup, we’ve grown from 25 to 80+ engineers in 18 months. I had to actively prevent a surveillance culture from forming, and it required constant intervention.

The “New Manager” Problem

Here’s what I didn’t expect: First-time engineering managers default to metrics for performance evaluation. They’re anxious about evaluating engineers, looking for “objective data,” and DORA metrics seem like the answer.

Three months in, I started hearing phrases like:

  • “Sarah’s deployment frequency is lower than the team average”
  • “Why isn’t Mike’s lead time improving?”
  • “This team’s change failure rate is concerning”

All red flags. All surveillance language. All from managers who genuinely meant well but didn’t know better.

What I Did About It

1. Explicit Team Charters

Every team has a charter that states: “DORA metrics measure system health, not individual performance. We use these metrics to improve our workflows, not to evaluate team members.”

New managers read this on day one. It’s not subtle.

2. Manager Training on Metrics as Learning Tools

Monthly training sessions for engineering managers:

  • How to read DORA metrics as system indicators
  • How to facilitate blameless retrospectives
  • How to ask “What’s blocking us?” instead of “Why are you slow?”
  • How to celebrate when teams slow down to fix technical debt

The first-time managers need this repeatedly. It’s not intuitive.

3. Metrics Retrospectives (Not Reviews)

Every sprint, teams hold a “metrics retrospective”:

  • Review DORA trends (not individual data)
  • Identify system bottlenecks
  • Propose improvements
  • Track whether improvements work
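
For the last step, tracking whether an improvement worked, a simple before/after comparison of a team-level metric is often enough. The sketch below assumes weekly median lead times in hours for one team and a hypothetical date on which the improvement landed; all values are invented for illustration.

```python
from datetime import date
from statistics import median


def before_after(samples: list[tuple[date, float]], change_day: date) -> tuple[float, float]:
    """Median of a team-level metric before, and on or after, the day a change landed."""
    before = [value for day, value in samples if day < change_day]
    after = [value for day, value in samples if day >= change_day]
    if not before or not after:
        raise ValueError("need samples on both sides of change_day")
    return median(before), median(after)


# Made-up weekly lead times (hours) for one team.
weekly_lead_time = [
    (date(2026, 3, 2), 40.0),
    (date(2026, 3, 9), 44.0),
    (date(2026, 3, 16), 42.0),
    (date(2026, 3, 23), 30.0),
    (date(2026, 3, 30), 28.0),
    (date(2026, 4, 6), 26.0),
]

baseline, latest = before_after(weekly_lead_time, change_day=date(2026, 3, 20))
print(f"Lead time median: {baseline} h before, {latest} h after")
```

With only a handful of samples, this is a conversation starter for the retrospective rather than proof that the change caused the difference.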

Managers facilitate but don’t evaluate. The team owns the conversation.

4. Celebrate System Improvements (Even When Metrics Worsen)

Last quarter, one of our teams intentionally slowed their deployment frequency to refactor a critical service. Their DORA metrics looked worse for six weeks.

In our all-hands, I publicly celebrated them. “This team made a strategic decision to fix technical debt. Their metrics temporarily worsened, and that’s exactly what we want to see. They’re optimizing for long-term system health, not short-term numbers.”

That moment did more to establish learning culture than any policy document.

The Question for This Community

How do others train new managers to use metrics properly?

Michelle mentioned the challenge from the CTO level. I’m dealing with it at the VP level during hypergrowth. The managers I’m hiring have never managed before, and they’re desperate for “data” to guide their decisions.

What training, frameworks, or practices have helped you shift new managers from surveillance thinking to learning thinking?

Luis, your distinction between Team A and Team B resonates deeply. I’ve seen both outcomes within the same organization based entirely on how managers frame metrics conversations. The learning approach isn’t just better for morale—it actually delivers better software.

Coming from design land, I have a probably-controversial take: DORA metrics can accidentally hurt design-engineering collaboration when teams optimize for them without balancing quality.

My Experience: When Speed Metrics Broke Collaboration

Last year, the engineering team at my previous company started tracking DORA metrics seriously. Great! They wanted to improve delivery speed. Also great!

Then things got weird.

Our design system adoption stalled because engineers started pushing back on design changes: “This will slow our deployment frequency.” Feature polish got cut because “it increases lead time.” Visual bugs were marked as “won’t fix” because every fix meant another deploy, and every extra deploy was another chance to hurt the team’s change failure rate.

The engineering metrics looked phenomenal. Deployment frequency up, lead time down, everything green.

But the product looked increasingly rough. User complaints about UX issues went up. Design system adoption—which we’d spent six months building—basically stopped.

The Hidden Cost

Here’s what I learned from watching this unfold: When engineering optimizes DORA metrics without balancing quality and UX, collaboration breaks down.

Engineers weren’t being malicious. They were responding to incentives. Their manager reviewed DORA metrics in 1:1s. Their team celebrated hitting deployment frequency targets. The metrics became the goal.

But software isn’t just about delivery speed. It’s about delivering value. And value includes things that don’t show up in DORA metrics: design quality, accessibility, user satisfaction, design system consistency.

My Startup Failure Lesson

At my failed startup, we made the opposite mistake first, and then the same one.

First: We over-indexed on design quality. Every pixel perfect. Every interaction delightful. We shipped beautifully slow.

Then: We panicked and optimized for speed. We shipped fast with terrible UX. Churned users faster than we could acquire them.

Both approaches failed because we didn’t balance speed with quality.

The Question That Haunts Me

Do we need “design system DORA metrics” or is that the wrong approach entirely?

Part of me wants metrics for design system adoption, accessibility compliance, design QA pass rates. Concrete numbers that balance engineering’s velocity metrics.

But another part of me worries that’s just creating more metrics to game. More surveillance potential. More ways to miss the point.

Maybe the answer is simpler: Pair DORA metrics with user satisfaction and quality metrics. Don’t just track how fast you ship. Track whether what you shipped actually worked.

Luis, your point about “celebrate when metrics temporarily worsen” resonates. If a team slows down to improve accessibility or fix design debt, that’s success. But it requires engineering leadership to recognize that design quality IS system quality.

The surveillance vs learning distinction applies to design-engineering collaboration too. Are we learning together how to ship quality faster? Or are we optimizing separate metrics that accidentally put us in conflict?

Product leader’s honest take: Sometimes I don’t care about DORA metrics at all.

Don’t get me wrong—I care deeply about engineering efficiency. But what actually matters to me (and to the business) is: Are we shipping the right things? Are customers happy? Are we moving the business metrics?

You can have perfect DORA metrics and still build the wrong product. I’ve seen it happen.

The Tension I’ve Observed

At my previous company, engineering got really serious about DORA metrics. They wanted to deploy daily, reduce lead time, minimize change failure rate. All good goals.

But it created friction with product strategy:

  • We needed 2-week beta cycles for our B2B customers to test features properly
  • Engineering wanted to deploy daily, which broke our beta process
  • Product needed time for proper user research before releases
  • Engineering optimized for speed, which meant less time for iteration

Neither side was wrong. But we were optimizing for different things.

When Surveillance Metrics Make Everything Worse

When engineering treats DORA metrics as surveillance tools (that is, as a basis for performance evaluation), the product-engineering tension becomes conflict.

Surveillance approach scenario:

  • Engineering team measured on deployment frequency
  • Product asks for a complex feature requiring careful design
  • Engineers push back because it “hurts our metrics”
  • Product leader (me) frustrated that engineering priorities don’t align with business needs
  • Engineers frustrated that product “doesn’t understand velocity”

Result: Dysfunction.

Learning approach scenario:

  • Engineering team uses DORA metrics for system health
  • Product asks for a complex feature
  • Engineering explains technical trade-offs: “We can ship this fast but increase technical debt, or take longer and maintain system health. What’s the business priority?”
  • Product and engineering collaborate on timeline that balances speed, quality, and business needs

Result: Alignment.

The Framework That Works

Here’s what I’ve learned across 12 years in product leadership:

Engineering owns DORA metrics for system health.
These are internal operational metrics. How fast can the system deliver changes? How reliable is the system? How quickly can we recover from failures?

Product-Engineering jointly own business outcome metrics.
Feature adoption, customer satisfaction, retention, revenue impact. These are the metrics that actually matter to the business.

Regular alignment on trade-offs (speed vs quality vs scope).
Weekly product-engineering syncs specifically to discuss: “What are we optimizing for this sprint? Speed? Quality? A specific business outcome?”

Maya’s Point About Design Resonates

Maya’s comment about engineering optimizing DORA at the expense of design quality hits home. I’ve seen the same thing with product requirements.

“We can’t do that user research because it slows lead time.”
“We can’t iterate based on feedback because it hurts our deployment frequency.”
“We can’t fix that UX issue because it increases change failure rate.”

All symptoms of metrics becoming goals rather than indicators.

The Question for Luis

How do you balance DORA optimization with product delivery needs?

Specifically, when product asks for something that will temporarily worsen your DORA metrics (big refactor, slower deployment cadence for stability, etc.), how do you frame that conversation with your team and with executive leadership?

The surveillance approach makes these conversations adversarial. The learning approach makes them collaborative. But I’m curious about the practical mechanics.


Thanks for raising this, Luis. Critical conversation for cross-functional teams. The surveillance vs learning distinction applies across the entire product development process, not just within engineering.