If We Can't Measure Codebase Health, We Can't Sell It to Execs: What Metrics Actually Work?

I spent six months trying to get our CFO to approve a major tech debt initiative. Every meeting ended the same way:

CFO: “Prove that tech debt is causing our problems.”
Me: (Shows cyclomatic complexity charts)
CFO: “I don’t know what that means. Show me business impact.”

We were speaking different languages. And until I learned to translate, I was losing the argument.

The Breakthrough

The turning point came when I stopped talking about code quality metrics and started talking about business outcomes.

Here’s what finally resonated:

1. Velocity Decline Trend

What I showed:

  • Q1 2024: Average 42 story points per sprint
  • Q4 2024: Average 29 story points per sprint
  • 31% decline in delivery capacity with same headcount

How I framed it:
“We’re delivering the equivalent of 3 fewer features per quarter than we were a year ago. Same team size, lower output. Technical debt is the drag coefficient.”

This got attention because it directly impacted product roadmap commitments.
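For anyone reproducing the framing, here's the arithmetic as a minimal sketch. The sprint averages are the figures quoted above; the six-sprints-per-quarter cadence is an assumption, not something the post states.

```python
# Back-of-the-envelope math behind the velocity framing.
q1_avg = 42  # story points per sprint, Q1 2024 (from the post)
q4_avg = 29  # story points per sprint, Q4 2024 (from the post)

decline = (q1_avg - q4_avg) / q1_avg   # ~0.31, the quoted 31% decline
points_lost = (q1_avg - q4_avg) * 6    # per quarter, assuming 6 sprints/quarter

print(f"{decline:.0%} decline, {points_lost} points lost per quarter")
```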

2. Incident Rate Acceleration

What I showed:

  • Production incidents increased 2.3x year-over-year
  • Each P1 incident costs ~$50K in engineering time + customer impact
  • Annual tech-debt-related incident cost: $800K+

How I framed it:
“We’re spending more on firefighting than on building. And every incident erodes customer trust, which Finance cares about because it impacts retention and NRR.”

3. Time-to-Market Degradation

What I showed:

  • Tracked 3 “reference features” we’d built in 2023
  • Built same-sized features in 2024
  • 40% longer delivery time for equivalent complexity

How I framed it:
“In the time it took us to respond to a competitive threat, our competitor shipped AND iterated twice. We lost 2 major deals specifically because we were ‘too slow to adapt.’”

This was the killer metric for our CEO. Lost deals = lost revenue. Clear cause and effect.

4. Onboarding Time Expansion

What I showed:

  • Time from start date to 10th merged PR
  • 2023 average: 3.2 weeks
  • 2024 average: 8.1 weeks
  • 2.5x slower ramp time

How I framed it:
“We’re paying new engineers full salary for 5 extra weeks before they’re productive. Across 20 hires this year, that’s $400K in sunk onboarding cost.”
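The sunk-cost figure can be reconstructed like this. The ramp times and hire count are from the post; the fully loaded weekly cost (~$4K) is an assumption chosen to land near the quoted $400K.

```python
# Reconstructing the onboarding-cost estimate.
ramp_2023 = 3.2   # weeks to 10th merged PR, 2023 average (from the post)
ramp_2024 = 8.1   # weeks to 10th merged PR, 2024 average (from the post)
hires = 20
weekly_cost = 4_000  # assumed fully loaded weekly cost per engineer

extra_weeks = ramp_2024 - ramp_2023            # ~4.9 extra weeks per hire
sunk_cost = extra_weeks * weekly_cost * hires  # ~$392K, close to the quoted $400K

print(f"{extra_weeks:.1f} extra weeks/hire, ~${sunk_cost:,.0f} total")
```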

5. Developer Satisfaction Correlation

What I showed:

  • Quarterly developer satisfaction surveys
  • 2023 Q4: 78% satisfied with “quality of codebase”
  • 2024 Q4: 45% satisfied
  • Voluntary attrition increased from 8% to 18%

How I framed it:
“The engineers who are frustrated with codebase quality are 2.5x more likely to leave. Each departure costs $200K in replacement + lost productivity. This isn’t a morale issue—it’s a retention crisis with real financial impact.”

The Pitch That Worked

I combined all of this into a single slide:

“Technical Debt Costs Us 2 Engineering Teams Worth of Productivity Every Quarter”

  • Lost velocity = 13 engineers (26% of capacity)
  • Incident response = 4 engineers (8% of capacity)
  • Total drag: equivalent to 17 engineers out of 50 (34%)

“We can either:
A) Hire 17 more engineers ($4M+ annual cost), or
B) Invest $2M to fix the tech debt that’s reducing our effective capacity by 34%.

Option B is the cheaper way to add capacity.”

The CFO approved it in that meeting.
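The slide's math, spelled out as a sketch. The engineer counts are from the slide; the fully loaded cost per engineer ($235K) is an assumed figure chosen so 17 engineers lands near the quoted $4M.

```python
# The hire-vs-fix comparison from the slide.
team_size = 50
drag = 13 + 4                      # lost velocity + incident response (from the slide)
drag_fraction = drag / team_size   # 0.34, i.e. 34% of capacity

cost_per_engineer = 235_000            # assumed fully loaded annual cost
option_a = drag * cost_per_engineer    # hire your way out: ~$4M/year, recurring
option_b = 2_000_000                   # fix the debt: one-time investment

print(f"{drag_fraction:.0%} drag; hire ${option_a:,}/yr vs fix ${option_b:,} once")
```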

The Ongoing System We Built

Now we track a “Codebase Health Score” (0-100) that combines:

  1. Velocity trend (story points per sprint, 3-quarter moving average)
  2. Deployment metrics (frequency, failure rate, rollback rate)
  3. Build/test performance (time to green build, test flakiness %)
  4. Time-to-production (PR opened to deployed in prod)
  5. Developer satisfaction (quarterly survey, codebase quality question)
  6. Onboarding velocity (time to 10th merged PR)

This score gets reviewed quarterly by the board, right alongside our financial metrics.

When the score trends down, we automatically allocate capacity to quality work. No debate, no negotiation. It’s baked into our planning process.
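As a sketch, six inputs like these could be rolled into a 0-100 score with per-metric normalization and a weighted average. The weights, bounds, and example values below are invented for illustration; the post doesn't specify how its score is composed.

```python
# One possible composition of a 0-100 codebase health score.

def normalize(value, worst, best):
    """Map a raw metric onto 0-100, clamping out-of-range values."""
    score = (value - worst) / (best - worst) * 100
    return max(0.0, min(100.0, score))

def health_score(scores, weights):
    """Weighted average of per-metric 0-100 scores."""
    total = sum(weights.values())
    return sum(scores[k] * w for k, w in weights.items()) / total

scores = {  # each input already normalized to 0-100 (illustrative values)
    "velocity_trend": 60, "deploy": 75, "build_test": 50,
    "time_to_prod": 65, "dev_satisfaction": 45, "onboarding": 40,
}
weights = {k: 1.0 for k in scores}  # equal weights as a starting point
print(round(health_score(scores, weights)))
```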

The Hard Part Nobody Talks About

Getting executive buy-in once is easy compared to maintaining that prioritization when feature pressure mounts.

We made it stick by:

  1. Making it non-negotiable - 20% of capacity for quality work is protected in planning
  2. Executive sponsorship - CEO explicitly backed this commitment in all-hands
  3. Visible progress - Monthly infrastructure wins demos to entire company
  4. Tying it to manager OKRs - Eng managers evaluated on codebase health improvements

The metrics got us the initial buy-in. The cultural changes kept the commitment alive.

What I’m Curious About

For other technical leaders who’ve fought this battle:

  1. What metrics resonated with YOUR executives?
  2. How did you quantify the “invisible work” of tech debt?
  3. What arguments DIDN’T work? (So we can all avoid them)
  4. How do you maintain momentum when priorities shift?

I’m particularly interested in hearing from leaders at earlier-stage companies. At a startup, losing 3 months of velocity could be existential. How do you make this trade-off when you’re racing for product-market fit?

This is exactly the playbook I wish I’d had 18 months ago.

@cto_michelle you nailed the critical insight: Technical metrics don’t resonate with business leaders. Impact metrics do.

The Retention Correlation I Added

Your slide about “costing us 2 engineering teams” is brilliant. I added a retention dimension that made it even more compelling:

Exit Interview Analysis: Tech Debt vs. Compensation Mentions

I started tracking how often departing engineers mentioned “code quality” or “codebase frustration” versus “compensation” as reasons for leaving.

The data shocked our exec team:

  • Total exits in 2024: 12 engineers
  • Mentioned compensation: 4 (33%)
  • Mentioned tech debt/code quality: 9 (75%)
  • Mentioned BOTH: 3 engineers

The insight: Departing engineers were more than twice as likely to cite codebase frustration as compensation.

But here’s the kicker: We spent $150K on retention bonuses (targeting comp concerns) and $0 on codebase improvements (the actual primary driver).

The Pitch That Landed

We’re solving for the wrong retention problem.

I showed this to our CEO and CFO:

  • Cost to give 10% retention raises to at-risk engineers: $200K/year
  • Cost of 3-month tech debt sprint: $800K one-time
  • Expected retention improvement from raises alone: Minimal (it’s not the real problem)
  • Expected retention improvement from codebase quality: 12 → 2 voluntary exits = $2M saved in replacement costs

ROI: 2.5x return in first year, and ongoing benefits after that.

The CFO approved it immediately. Why? Because I showed that our current retention strategy was addressing the symptom (comp) while ignoring the disease (working conditions).
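The ROI comparison above, spelled out. All figures here are from the post.

```python
# Retention ROI: tech debt sprint vs. doing nothing.
debt_sprint = 800_000        # one-time cost of the 3-month sprint
replacement_cost = 200_000   # per voluntary exit (replacement + lost productivity)
exits_before, exits_after = 12, 2

savings = (exits_before - exits_after) * replacement_cost  # $2M avoided
roi = savings / debt_sprint                                # 2.5x in year one

print(f"${savings:,} saved, {roi}x first-year return")
```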

The Metric That Surprised Me

Your “Developer Satisfaction Correlation” metric is critical, but I’d add a specific question that’s been illuminating:

“On a scale of 1-10, how proud are you of the code you’re shipping?”

This question cuts through everything. Engineers who answer 6 or below are 4x more likely to leave within 6 months.

It’s not about “satisfaction with tools” or “happiness at work.” It’s about professional pride in the work itself.

When engineers feel like they’re shipping duct-taped solutions instead of quality work, they start looking elsewhere. Not for more money—for a place where they can do work they’re proud of.

The Ongoing Challenge

Your point about maintaining prioritization is THE challenge. We’ve had exec support for 8 months now, and here’s what I’ve learned:

What works:

  • Quarterly “codebase health” reviews with exec team (same rigor as financial reviews)
  • Making it part of eng manager OKRs (they’re accountable for the metrics)
  • Celebrating infrastructure wins publicly (so people see the value)

What doesn’t work:

  • Assuming “we agreed on this once” means it’s permanent
  • Letting quality time get eaten by “urgent” features without pushback
  • Accepting “just this quarter” exceptions that become the norm

The metrics open the door. Culture keeps it open.

The “velocity decline” metric is what finally broke through for us, but I’d add a visual element that made it even more compelling.

The Pain Point Heatmap

@cto_michelle your metrics are all quantitative, which is powerful. I added a qualitative overlay that made the problem concrete and spatial for executives who don’t live in the codebase.

What I built:

A visual heatmap of our codebase showing:

  • Red zones: Areas that generate the most bugs, take longest to change, receive most engineer complaints
  • Yellow zones: Moderate friction areas
  • Green zones: Well-maintained, easy to work with

Then I overlaid:

  • Which red zones touch customer-facing features
  • Which red zones block strategic initiatives
  • Which red zones cause the most production incidents

The visualization made it obvious: Our biggest business priorities were sitting on top of our worst technical debt.

The Conversation This Enabled

CFO: “Why is the checkout flow red?”
Me: “Because it was built in 2019 as a quick MVP, never refactored, and now every change takes 2-3x longer than it should. That’s why the mobile checkout feature you wanted took 8 weeks instead of 3.”

CFO: “Why haven’t we fixed it?”
Me: “Because it’s never been prioritized. Every quarter we choose new features over maintenance.”

That one conversation got us $500K allocated to checkout system refactoring.

The Metric That Resonates in Financial Services

We added a metric specific to our industry: Regulatory Compliance Velocity

How long does it take to implement regulatory changes? (In fintech, this is life-or-death)

  • 2023 average: 4 weeks
  • 2024 average: 11 weeks
  • 2.75x slower to implement regulatory changes

When I showed our Chief Risk Officer that tech debt was making us slower to comply with regulations, we got immediate executive support. In our industry, compliance delays = existential risk.

The lesson: Translate tech debt into the language of what YOUR executives care most about.

  • Fintech: Regulatory compliance velocity
  • E-commerce: Checkout conversion optimization speed
  • B2B SaaS: Time to respond to RFP requirements
  • Consumer: A/B test iteration velocity

The Counter-Metric I Track

Here’s something I don’t see others talking about: Cost of NOT fixing tech debt.

I track “features we couldn’t build” or “opportunities we couldn’t pursue” because of tech debt constraints.

Examples:

  • Mobile app delayed 6 months because our API wasn’t designed for mobile use cases
  • International expansion delayed 9 months because of hardcoded US assumptions
  • Enterprise tier blocked because multi-tenancy would require massive refactoring

Each of these represents millions in potential revenue we couldn’t capture.

When I presented “Tech debt cost us $3M in lost opportunities this year,” the conversation shifted from “Should we invest in tech debt?” to “How fast can we start?”

The Metrics I Wish I’d Tracked Earlier

Looking back, I wish I’d tracked:

  1. Feature estimation accuracy over time - Are we getting worse at estimating because of growing tech debt?
  2. “Emergency hotfix” frequency - How often are we context-switching to fix production issues?
  3. “Can’t do that because…” log - Every time engineering says “that’s not possible with our current architecture”

These would have made the case even earlier.

From product leadership, I want to highlight something that often gets lost in these conversations:

The velocity decline metric only works if you can show the TREND, not just a snapshot.

The Mistake I See Engineering Leaders Make

They show: “We’re moving slower than we used to”

Execs hear: “You’re being less productive”

The narrative matters as much as the numbers.

The Framing That Works Better

@cto_michelle your approach is solid, but I’d add one critical element: Show that velocity decline is ACCELERATING, not linear.

What resonates:

“In Q1, we delivered 5% less than the quarter before. In Q2, 12% less. In Q3, 19% less. If this trend continues, by Q4 2025 we’ll be delivering 40% less than we did in Q1 2024.”

This is the “burning platform” narrative that drives urgency.

Linear decline looks like a productivity problem. Accelerating decline looks like a structural problem that requires investment.
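The projection can be sketched like this. The three per-quarter declines are from the narrative above; extending them as a linear progression (+7 points per quarter) is my extrapolation, not necessarily the exact model behind the quoted 40% figure.

```python
# Extrapolating an accelerating quarterly decline.
declines = [0.05, 0.12, 0.19]  # delivered X% less than the prior quarter (from the post)

step = declines[-1] - declines[-2]   # acceleration: +0.07 per quarter
while len(declines) < 6:             # project three more quarters
    declines.append(declines[-1] + step)

capacity = 1.0
for d in declines:
    capacity *= 1 - d                # compound the per-quarter losses

print(f"Remaining capacity vs. baseline: {capacity:.0%}")
```

Note that compounding makes the picture even worse than the headline number: a per-quarter decline that reaches 40% leaves only about a fifth of baseline capacity.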

The Product Velocity Lens

I measure something slightly different that complements your metrics:

“Feature Delivery Time Trend” for equivalent complexity

Instead of story points (which are relative and gameable), I track:

  • Pick 3 “reference features” we built in Year 1 (e.g., “Add new payment method,” “Build export to CSV,” “Implement SSO”)
  • Track how long it takes to build equivalent features in Year 2, Year 3
  • Control for complexity by using PM estimates, not eng estimates

Our data:

  • Year 1: Reference features averaged 3.2 weeks
  • Year 2: Same complexity averaged 4.8 weeks (50% slower)
  • Year 3: Same complexity averaged 6.1 weeks (90% slower)

This metric is undeniable for product and exec teams because:

  • It’s not “engineering making excuses”
  • It’s not “changing story point estimates”
  • It’s the same work taking objectively longer

And it directly impacts our ability to make roadmap commitments to customers and the board.

The Lost Opportunity Cost Frame

@eng_director_luis mentioned this, and I want to emphasize it:

“Tech debt doesn’t just slow us down—it closes off strategic options.”

We couldn’t build:

  • Mobile app (API not designed for it)
  • Enterprise tier (no multi-tenancy)
  • International expansion (US-centric assumptions baked in)

Each of these was a board-approved strategic priority that we had to push out 12-18 months because of tech debt.

When the board asked “Why are we behind on strategic initiatives?” I pointed to tech debt. That got their attention in a way that velocity charts never did.

The Metrics That Didn’t Work

Let me share what DIDN’T resonate with our exec team:

❌ Code coverage percentage - “Why does 85% coverage matter?”
❌ Technical debt in hours/days - “How accurate is this estimate?”
❌ Cyclomatic complexity scores - “I don’t know what this means”
❌ Number of open bugs - “Are these real bugs or nice-to-haves?”

These are all valuable metrics for engineering teams, but they don’t translate to business impact.

The Question for Michelle

Your Codebase Health Score (0-100) is interesting, but how did you prevent it from becoming a vanity metric?

At my last company, we had a similar “Engineering Health” score. The problem was:

  • It was composed of too many sub-metrics (9 different inputs)
  • No one could explain what “73” meant vs “68”
  • It was easily gamed (fix the easy metrics, ignore the hard ones)

How did you make yours meaningful and actionable rather than just another dashboard number?

I love that everyone’s focusing on quantitative metrics, but I want to argue for something different:

Qualitative data matters, especially for executive communication.

The Power of Engineer Quotes

When @vp_eng_keisha shared exit interview data, that’s powerful. But what made the difference for us was pairing the numbers with actual engineer quotes.

Instead of: “75% of exiting engineers mentioned tech debt”

Try: “75% of exiting engineers mentioned tech debt. Here’s what they said:”

“I felt like I was maintaining a house of cards instead of building software.”

“Every PR was a negotiation with a codebase that fought back.”

“I couldn’t do the quality work I’m proud of, so I left.”

Those quotes hit differently than percentages. They make the problem emotional and human, not just analytical.

The Survey Question That Changed Everything

We added one question to our quarterly eng surveys:

“On a scale of 1-10, how proud are you of the code you’re shipping?”

The answers were devastating:

  • Q1 2024: Average 6.2
  • Q2 2024: Average 4.8
  • Q3 2024: Average 3.9

When I showed this trend to our CEO along with the quote: “I’m embarrassed to show my portfolio work from this company,” he immediately understood why we were losing senior engineers.

Numbers show scope. Quotes show pain. You need both.

The Visualization That Worked

@eng_director_luis mentioned the pain point heatmap—I love this approach because it makes the abstract concrete.

I did something similar for design systems:

“Component Graveyard” - A visual showing:

  • 47 different button variants across the codebase
  • 23 different modal implementations
  • 14 different spacing systems
  • 8 different color palettes (we have a “design system” with 1 defined palette)

I showed this to our exec team and asked: “How would you feel if our brand looked like this to customers?”

The answer: “Horrified. We’d never allow that.”

My response: “This IS what our internal systems look like. And this is what our engineers and designers deal with every single day.”

That visualization got us design system funding in the next budget cycle.

The Metric I Wish Existed

Here’s what I struggle with: How do you measure the compound effect of small frustrations?

It’s not one big problem. It’s the death by a thousand cuts:

  • 30-second delay to start local server (10x per day = 5 minutes lost)
  • Flaky tests requiring 2-3 re-runs (15 minutes lost)
  • Searching for the right component (20 minutes lost)
  • Build times (30 minutes lost)
  • Context switching from broken workflows (45 minutes lost)

Total: 2 hours per day lost to friction.

Across 30 engineers: 60 hours/day = 7.5 FTEs lost to fighting tools.

But there’s no metric that captures this aggregate effect. Each individual item is “not worth fixing.” Collectively, they’re destroying productivity and morale.

How do we measure this? How do we make executives understand the cumulative cost of small frictions?
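One crude starting point is just to total the list above. The minutes are the figures given; the FTE conversion assumes an 8-hour day (the narrative rounds to 2 hours and 7.5 FTEs).

```python
# Aggregating the daily frictions listed above.
frictions_min = {
    "local server startup (10x/day)": 5,
    "flaky test re-runs": 15,
    "component hunting": 20,
    "build times": 30,
    "context switching": 45,
}
per_engineer_h = sum(frictions_min.values()) / 60   # ~1.9 h/day per engineer
team_h = per_engineer_h * 30                        # ~57.5 h/day across the team
fte_lost = team_h / 8                               # ~7.2 FTEs, assuming 8-h days

print(f"{per_engineer_h:.1f} h/engineer/day -> {fte_lost:.1f} FTEs lost")
```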

The Honest Truth

All these metrics are valuable. But the hardest part isn’t convincing execs once—it’s maintaining that conviction when priorities shift.

Six months ago, our exec team was fully bought in. Tech debt was a priority.

Now, we’re behind on roadmap commitments. Sales is screaming for features. The board wants to see growth.

And guess what’s getting cut? The 20% quality time we fought so hard for.

@cto_michelle you mentioned this challenge—I’d love to hear more about how you’ve maintained momentum when the business says “we can’t afford to slow down right now.”

Because that’s the real test of whether these metrics create lasting change or just temporary buy-in.