DevEx Elevated from 'Soft Concern' to Performance Indicator. But What Are We Actually Measuring?

DevEx is now a KPI on my executive dashboard. That happened fast.

Six months ago, developer experience was a “nice to have” – something we acknowledged in planning sessions but never quite prioritized. Now? Our board asks about it in quarterly reviews. Our Series B investors want to see DevEx metrics alongside velocity and quality. Our CTO is expected to report on it monthly.

I should be celebrating. This is what we wanted, right? Engineering effectiveness finally getting executive attention. Developer happiness treated as a business outcome, not just an engineering concern.

But here’s where I’m stuck: What are we actually measuring?

The Framework Buffet

I’ve spent the last three weeks researching this. The landscape is… overwhelming.

There are the DORA metrics – the OG of engineering effectiveness: deployment frequency, lead time for changes, change failure rate, and mean time to recovery. Clean, measurable, widely adopted (40.8% of orgs according to recent data). But DORA tells you about your delivery pipeline, not necessarily about developer experience.
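
For concreteness, here's a minimal sketch of how the two DORA throughput metrics fall out of raw deploy data – the event list, field layout, and weekly window are hypothetical, not pulled from any particular tool:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deploy log: (commit_time, deploy_time) pairs for one service.
deploys = [
    (datetime(2026, 1, 5, 9, 0), datetime(2026, 1, 5, 15, 30)),
    (datetime(2026, 1, 6, 11, 0), datetime(2026, 1, 7, 10, 0)),
    (datetime(2026, 1, 9, 14, 0), datetime(2026, 1, 9, 16, 45)),
]

first = min(d for _, d in deploys)
last = max(d for _, d in deploys)

# Deployment frequency: deploys per week over the observed window.
weeks = max((last - first) / timedelta(weeks=1), 1.0)
deploy_frequency = len(deploys) / weeks

# Lead time for changes: median time from commit to production deploy.
lead_time = median(d - c for c, d in deploys)

print(f"{deploy_frequency:.1f} deploys/week, median lead time {lead_time}")
```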

Then there’s SPACE – satisfaction and well-being, performance, activity, communication and collaboration, efficiency and flow. More holistic, captures the human element. But also more subjective, harder to track consistently, and honestly a bit fuzzy when you’re trying to explain it to a CFO who wants hard numbers.

Now we have DX Core 4 – which combines DORA, SPACE, and DevEx metrics into speed, effectiveness, quality, and business impact dimensions. It’s comprehensive. It’s also complex. And it requires buy-in from multiple teams to instrument properly.

Oh, and there’s the Developer Experience Index with 14 factors. And build duration metrics. And PR velocity. And developer survey scores. And time-to-first-commit for new hires.

The choice paralysis is real.

What My Leadership Actually Wants

Here’s what happened in our last exec meeting:

“David, what’s our DevEx score?”

I don’t have a single score. I have deployment frequency trending up, developer satisfaction surveys at 7.2/10, average PR review time at 4.3 hours, and build times that vary wildly by service (2 minutes for frontend, 23 minutes for our monolith).

“So… is that good?”

I don’t know. Good compared to what? Good according to which framework? Good enough to justify the platform engineering team we’re building?

The ROI Problem

The research says each one-point gain in developer experience saves 13 minutes per developer per week. Over a year, that’s 10 hours per engineer. For our 35-person engineering team, that’s 350 hours annually – roughly $35K in reclaimed time per point.
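
As a back-of-the-envelope sketch of that claim (the 48 working weeks and $100/hour loaded cost are my assumptions, not part of the research):

```python
# Hypothetical back-of-the-envelope: value of a one-point DevEx gain.
minutes_saved_per_dev_per_week = 13
working_weeks_per_year = 48       # assumption
team_size = 35
loaded_cost_per_hour = 100        # assumption (USD)

hours_per_dev = minutes_saved_per_dev_per_week * working_weeks_per_year / 60
team_hours = hours_per_dev * team_size
annual_value = team_hours * loaded_cost_per_hour

print(f"{hours_per_dev:.1f} h/dev/year, {team_hours:.0f} team hours, ${annual_value:,.0f}")
# ≈ 10.4 h/dev/year, ≈ 364 team hours, ≈ $36,400 – in the ballpark of the $35K figure
```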

Great! Except… how do I measure that one-point gain? Is it a survey question (“Rate your overall developer experience 1-10”)? Is it a composite score across multiple metrics? Is it comparing our numbers to industry benchmarks we can’t access?

And here’s the uncomfortable truth: I can’t prove we’re improving DevEx without measuring it consistently. But I also can’t measure it consistently until I pick a framework. And I can’t pick a framework without understanding what we’re actually trying to optimize for.

What I Think We’re Missing

The more I dig into this, the more I suspect we’re asking the wrong question.

Instead of “What framework should we adopt?” maybe we should be asking:

  • What specific friction are our developers experiencing right now? (Not theoretical DevEx, actual pain points)
  • Which of those friction points correlate with business outcomes we care about? (Shipping speed, quality, retention)
  • What’s the simplest metric that would tell us if we’re reducing that friction? (Not the most comprehensive – the simplest)

Because right now, it feels like we’re at risk of building DevEx measurement theater – dashboards full of numbers that make executives feel good but don’t actually help developers ship better software faster.

So Here’s My Question for This Community

What are you actually measuring when it comes to developer experience?

Not what you should be measuring according to the frameworks. Not what looks good on slides. What metrics are you tracking that actually drive decisions and improvements?

And more importantly: How did you choose them? Did you start with DORA and expand? Did you run developer surveys and let the pain points guide your metrics? Did you just measure build times because that was easiest to instrument?

I’m especially curious to hear from folks who’ve tried multiple approaches. What worked? What was just measurement theater? Where did you waste time?

Our engineering team deserves better than fuzzy feel-good metrics that don’t lead to action. But they also deserve better than surveillance dashboards that track everything and improve nothing.

How do we get this right?

Real talk: we’re tracking DORA + quarterly developer surveys. That’s it.

Started with grand ambitions to implement the full SPACE framework last year. Built dashboards tracking 23 different metrics. Activity levels, communication patterns, code review velocity, merge frequency, PR sizes, cycle time breakdowns… the works.

Three months in, nobody was looking at the dashboards. Six months in, we couldn’t remember why we picked half the metrics.

The Dashboard Problem

You know what we discovered? Too many metrics = no metrics. Leadership stopped checking because there was no clear signal. Engineers stopped caring because it felt like surveillance. And I spent more time explaining the dashboard than actually improving developer experience.

Classic case of measuring everything and improving nothing.

What Actually Correlates

After we simplified, here’s what we found matters for our team:

Developer satisfaction scores (quarterly anonymous survey, 8 questions) → directly correlates with retention. When satisfaction drops below 6.5/10, we see engineers start interviewing within 2-3 months. When it’s above 7.5/10, retention is solid.

Build times (p50 and p95) → correlates with deployment frequency. When our monolith build crossed 25 minutes, deploy frequency dropped 40% because devs batched changes to avoid waiting. Got it under 15 minutes, frequency went back up.

Time to unblock (self-reported in survey: “How often do you feel blocked waiting for others?”) → correlates with both satisfaction and velocity. High blockers = low satisfaction = missed sprint goals.

That’s it. Three metrics we actually act on.

The Goodhart’s Law Problem

Here’s what worries me about DevEx as a KPI: when a measure becomes a target, it ceases to be a good measure.

Example from my Adobe days: Team was measured on PR velocity. Know what happened? Devs started breaking work into tiny PRs to hit velocity targets. PRs got smaller, velocity went up, but features took longer to ship because of coordination overhead.

We were optimizing the metric, not the outcome.

So my advice: Start simple. DORA + one DevEx metric that your team actually complains about. Not what the framework says you should measure. What pain point keeps coming up in retros?

Is it slow CI/CD? Measure build times.

Is it unclear requirements? Measure rework percentage or “changed after PR opened” rate.

Is it knowledge silos? Measure PR review times across teams.

Pick the friction that hurts most, measure it simply, fix it, then add the next metric.
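
For the "measure it simply" part, a minimal sketch of what that can look like for the slow-CI case – the CSV export and column name are hypothetical, and your CI system will have its own way of exposing durations:

```python
import csv
from statistics import median

# Hypothetical export of CI build durations, one row per build.
# build_durations.csv columns: pipeline, duration_seconds
durations = []
with open("build_durations.csv") as f:
    for row in csv.DictReader(f):
        durations.append(float(row["duration_seconds"]))

durations.sort()
p50 = median(durations)
p95 = durations[int(0.95 * (len(durations) - 1))]  # simple nearest-rank p95

print(f"builds: {len(durations)}, p50: {p50 / 60:.1f} min, p95: {p95 / 60:.1f} min")
```

Track that week over week and you have a number the team recognizes as their pain, not framework coverage.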

Because at the end of the day, your developers will tell you what’s broken. The trick is listening to them, not to the framework vendors.

@product_david - what’s the #1 complaint you hear from your engineering team in sprint retros? Start there.

Okay but can we talk about the elephant in the room?

Metrics often feel like micromanagement disguised as “improving developer experience.”

I’ve been on both sides of this. As a design systems lead, I track some metrics. As someone who coded at a startup, I’ve been on the receiving end of… let’s call it “enthusiastic measurement.”

Story Time: My Failed Startup

We were obsessed with velocity metrics. Sprint points completed. Story count. Commit frequency. PR merge rate. All tracked religiously in our Monday leadership meetings.

You know what we weren’t measuring? Whether anyone actually wanted the product we were building.

We had beautiful charts showing increasing velocity. We were shipping so much code. And we had zero product-market fit.

DevEx was probably great – we had fast builds, clear processes, minimal blockers. We were a well-oiled machine… building the wrong thing.

Good DevEx metrics told us nothing about whether we were succeeding as a business.

The Trust Problem

Here’s my hot take: when management starts measuring everything developers do, it breaks trust.

I get that’s not the intent. The intent is improvement. But the impact is often “my manager is tracking how long my PRs stay open.”

Especially when metrics aren’t paired with genuine improvement investments. “We tracked your pain points! Now… keep dealing with them because we’re not prioritizing platform work this quarter.”

That’s not DevEx. That’s surveillance theater.

What I’d Actually Want Measured

If I were still coding full-time, here’s what I’d want leadership tracking:

  • Time to get unblocked when I’m stuck waiting for someone else
  • Clarity of requirements when I pick up a ticket (maybe “percentage of tickets that change scope mid-development”?)
  • Ease of local setup for new engineers (time from laptop arrival to first commit)
  • Documentation quality (can I find answers without asking?)

Notice what’s missing? Output metrics. Because output metrics make me feel like a code factory, not a builder.

The Real Question

@product_david you asked “how do we get this right?”

I think you start by asking: Are we measuring to improve the experience, or to justify headcount to executives?

Because those are different goals with different metrics.

If it’s genuinely about experience: ask your engineers what makes their job harder. Then measure whether those things improve. Simple.

If it’s about executive reporting: be honest about that. Don’t pretend surveillance dashboards are “for the developers.”

Also… did we ship the feature users wanted? Did the design solve the problem? Did we learn something from building it?

Good DevEx should make us ship better things faster. Not just more things faster.

Just my $0.02 from someone who’s lived through both sides. :woman_shrugging:

Both perspectives here are valid, and I think they point to the real challenge: DevEx measurement must serve both developer needs and business accountability.

Let me share how we’re approaching this at our company, because I’ve struggled with exactly this tension.

The Strategic Frame

Here’s the reality: as CTO, I need to defend engineering investments to the board. When we ask for platform engineering headcount, infrastructure spend, or tooling budgets, someone will ask “what’s the ROI?”

“Our developers will be happier” doesn’t cut it.

“We’ll improve deployment frequency by 30% and reduce incidents by 25%, leading to $X in reclaimed engineering time and $Y in reduced downtime costs” – that gets approved.

So yes, we measure. But we measure with purpose.

Our Approach: DX Core 4 (Customized)

We use a modified version of DX Core 4 that balances four dimensions:

1. Speed (DORA delivery metrics + perceived productivity)

  • Deployment frequency
  • Lead time for changes
  • Developer survey: “I can ship my work without excessive waiting”

2. Effectiveness (Developer Experience Index, simplified)

  • Quarterly survey: 6 core questions about tooling, documentation, and blockers
  • We track the index score AND the individual pain points

3. Quality (DORA stability + code quality perceptions)

  • Change failure rate
  • MTTR
  • Developer survey: “I’m confident in our code review and testing processes”

4. Business Impact (the part most frameworks skip)

  • Engineering cost as % of revenue
  • Feature delivery vs roadmap commitments
  • Customer-reported bugs per release
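
To make that concrete, here's a rough sketch of how per-dimension scores can roll up into something reportable – not our actual implementation, and the metric names, targets, and normalization are all illustrative assumptions:

```python
# Illustrative scorecard: each dimension mixes hard metrics and survey scores,
# normalized against a target so they can be compared and trended.
dimensions = {
    "speed": {
        "deploys_per_week": (5, 7),        # (current, target)
        "lead_time_hours": (30, 24),       # lower is better
        "survey_no_waiting": (7.1, 8.0),   # 1-10 scale
    },
    "effectiveness": {"dxi_survey": (6.8, 8.0)},
    "quality": {"change_failure_rate": (0.12, 0.10), "mttr_hours": (3.5, 2.0)},
    "business_impact": {"roadmap_delivery_pct": (0.82, 0.90)},
}

LOWER_IS_BETTER = {"lead_time_hours", "change_failure_rate", "mttr_hours"}

def score(metric, current, target):
    ratio = target / current if metric in LOWER_IS_BETTER else current / target
    return min(ratio, 1.0)  # cap at 100% of target

for name, metrics in dimensions.items():
    dim_score = sum(score(m, c, t) for m, (c, t) in metrics.items()) / len(metrics)
    print(f"{name:16s} {dim_score:.0%} of target")
```

The exact weighting matters less than tracking the same definition consistently quarter over quarter.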

Real Example: Build Time Investment

Q3 2025: Build times hit 35 minutes for our main service. Developer satisfaction survey flagged “slow CI/CD” as top pain point.

Investment: 2 engineers for 6 weeks to optimize builds and implement better caching.

Outcome:

  • Build time: 35 min → 14 min (60% reduction)
  • Deployment frequency: 2x per week → 5x per week (150% increase)
  • Developer satisfaction (CI/CD question): 5.2/10 → 7.8/10
  • ROI calculation: 21 min saved per build × 200 builds/week ≈ 70 engineering hours/week; at $150/hour that’s roughly $550K annually

That’s the story I told the board. And it’s why they approved our platform engineering team expansion.

The Framework Trap

@eng_director_luis is absolutely right about metric overload. We tried comprehensive measurement initially – failed spectacularly.

Here’s what works for us:

Don’t just adopt DORA or SPACE verbatim. Customize to your business context.

Ask: What outcomes matter for OUR business? A B2B SaaS with enterprise customers needs different metrics than a B2C mobile app.

For us: deployment frequency matters because we have aggressive feature commitments. For a medical device company? Maybe stability and compliance audit-readiness matter more.

The Trust Concern

@maya_builds – your point about surveillance vs improvement is critical. We address this through transparency:

  1. Developers see the same dashboards executives see. No hidden metrics.
  2. Survey is anonymous and aggregated. We see themes, not individual responses.
  3. Metrics drive investment decisions, not performance reviews. If build times are slow, we invest in builds – we don’t blame developers.

The moment developers feel metrics are used against them rather than for them, trust collapses and metrics become useless.

Board Expectations

One more reality check: in 2026, boards expect engineering effectiveness metrics. That’s not going away.

The question isn’t whether to measure. It’s what to measure and how to use those measurements.

Measure what you can act on. Act on what you measure. And connect engineering improvements to business outcomes the board cares about.

For @product_david: Start with the pain point Luis mentioned (what comes up in retros?), measure it simply, fix it, prove the ROI, then expand your measurement as you build credibility.

Don’t try to implement comprehensive DevEx measurement before you’ve proven one win.

Adding one more dimension that I haven’t seen mentioned yet: whose experience are we measuring, and are we measuring it equitably?

The Aggregation Problem

Here’s what happened when we first rolled out DevEx surveys at my current company:

Overall score: 7.1/10. Looks decent, right?

Then we segmented by cohort:

  • Senior engineers (5+ years): 7.8/10
  • Mid-level engineers (2-5 years): 7.2/10
  • Junior engineers (<2 years): 5.9/10

And when we further segmented by background:

  • CS degree from target schools: 7.6/10
  • Bootcamp graduates: 5.8/10
  • Self-taught: 5.4/10

Average metrics hide disparate experiences. And those disparate experiences often correlate with retention problems.

What We Discovered

Our junior engineers were struggling with completely different friction than senior engineers.

Senior engineers complained about:

  • Slow code review cycles
  • Unclear architectural decisions
  • Meeting overhead

Junior engineers complained about:

  • Incomplete onboarding docs
  • Uncertainty about who to ask for help
  • Fear of “looking stupid” in front of senior engineers
  • Unclear expectations about code quality

We were measuring DevEx as a monolith. But onboarding experience, mid-career experience, and senior experience are different problems requiring different solutions.

Example: Time to First Commit

One metric we started tracking by cohort: time from start date to first merged PR.

Across all engineers: 8.3 days average

By segment:

  • CS grads from target schools: 4.2 days
  • Bootcamp grads: 11.7 days
  • Self-taught: 13.1 days

That’s a 3x difference. And it correlated directly with 6-month retention rates.

Why? Our onboarding docs assumed familiarity with our tech stack. CS grads had the background. Bootcamp and self-taught engineers were learning on the fly.

Fix: Created role-specific onboarding paths. Added “foundations” modules for common gaps. Assigned peer mentors.

Outcome: Time to first commit for bootcamp grads dropped to 6.8 days. 6-month retention improved by 30%.
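
A small sketch of the segmentation idea, assuming you’ve joined HR cohort data with first-PR dates into a single export – the file and column names here are hypothetical:

```python
import csv
from collections import defaultdict
from statistics import mean

# Hypothetical joined export: one row per engineer.
# onboarding.csv columns: cohort, days_to_first_merged_pr, retained_6mo (0/1)
by_cohort = defaultdict(list)
with open("onboarding.csv") as f:
    for row in csv.DictReader(f):
        by_cohort[row["cohort"]].append(
            (float(row["days_to_first_merged_pr"]), int(row["retained_6mo"]))
        )

for cohort, rows in sorted(by_cohort.items()):
    days = mean(d for d, _ in rows)
    retention = mean(r for _, r in rows)
    print(f"{cohort:20s} n={len(rows):3d}  first PR: {days:5.1f} days  "
          f"6-mo retention: {retention:.0%}")
```

The whole point is that the per-cohort rows tell you something the company-wide average hides.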

The Inclusive Measurement Question

When you choose DevEx metrics, ask:

  • Who defined what “good” experience looks like? (Was it mostly senior engineers? Mostly people who look like the existing team?)
  • Do these metrics capture the experience of your most junior engineers? (Or just the people who already know the system?)
  • Are you measuring equity in experience? (Do different cohorts have meaningfully different DevEx scores?)

If your DevEx metrics show great results on average but you’re losing junior engineers and underrepresented talent, your metrics are hiding a problem.

What We Track Now

We still track the standard stuff @cto_michelle mentioned (DORA, surveys, build times). But we always segment by:

  • Seniority level (junior/mid/senior)
  • Time at company (<6 months / 6-18 months / 18+ months)
  • Educational background (when we have the data)

And we track onboarding-specific metrics separately:

  • Time to first commit
  • Time to first on-call rotation
  • Time to independently leading a feature
  • 30/60/90 day onboarding satisfaction

Why This Matters

You can have “good” DevEx on average while systematically providing worse experience to certain groups. That creates retention problems and limits your hiring pipeline.

In an industry where we’re all competing for talent, and where underrepresented groups already face higher barriers to entry, equitable DevEx isn’t just about fairness. It’s about competitiveness.

@product_david – when you look at your DevEx data, can you see whether junior engineers have the same experience as senior engineers? Whether your hiring pipeline from different backgrounds is supported equally?

If not, your measurements might be telling you everything’s fine when there’s actually significant inequity in experience.