Adobe Cut Lead Time 65% by Ditching Story Points. Should We All Stop Playing Estimation Poker?

So here’s a story I haven’t told publicly before. :sparkles:

Back in 2021, my B2B SaaS startup was dying. We had 3 paying customers, 18 months of runway left, and a board meeting in 6 weeks. Our CEO (bless him) decided the problem was that engineering wasn’t moving fast enough. So we implemented story points. Velocity tracking. Sprint retrospectives focused on “why we only hit 32 points instead of 40.”

We went from shipping features slowly to shipping features really fast. Our velocity chart looked amazing. :chart_increasing:

We were out of business 8 months later.


Adobe cut lead time by 65% by ditching velocity metrics entirely

I just read about Adobe’s productivity transformation and it hit me like a ton of bricks. Between 2024 and 2026, they completely abandoned velocity-based performance evaluations and moved to a composite productivity scorecard built on DORA metrics + flow metrics + business outcomes.

The results:

  • Lead time: 7 days → 2 days (reported as a 65% improvement, though 7 → 2 days works out to about 71%)
  • Cycle time: reduced by 40% through WIP limits
  • Local development friction: reduced by 50% with faster build tools
  • Engineering work tied directly to feature adoption and revenue

They stopped measuring “how much work are we doing?” and started measuring “how quickly does valuable work reach customers?”


The problem with velocity theater :performing_arts:

Here’s what I learned from my startup failure: Story points incentivized us to ship features fast, not to ship the right features.

Our design team (me) would spend 2 weeks researching a customer problem. Engineering would estimate it at 13 points. Product would say “that’s too expensive, do this 5-point feature instead.” We optimized for velocity, not for customer impact.

When we finally shut down, you know what our customers told us? “You kept adding features we didn’t ask for. We just wanted the core product to work reliably.”

We were so busy moving fast we forgot to ask if we were moving in the right direction. :broken_heart:


The design-engineering incentive misalignment

From a design systems perspective, velocity metrics create perverse incentives:

  • Velocity rewards new features. So engineers resist refactoring, accessibility improvements, design system consistency work - anything that doesn’t ship visible features.

  • Velocity rewards individual throughput. So engineers avoid collaborative design reviews, pair programming, knowledge sharing - anything that slows down their personal point total.

  • Velocity rewards predictable work. So teams avoid ambitious technical challenges, innovative solutions, or anything with uncertainty.

The work that makes products great doesn’t show up on a velocity chart. The work that prevents tech debt doesn’t earn you story points. The mentoring that develops junior engineers doesn’t boost team velocity.

Adobe figured this out. They replaced velocity with metrics that actually matter:

  • How fast do changes reach production? (Deployment frequency)
  • How quickly can we deliver value? (Lead time for changes)
  • How reliable are our releases? (Change failure rate)
  • Are customers adopting what we ship? (Feature adoption rates)
  • Is this work tied to business outcomes? (Revenue impact)

So here’s my question for the community :thinking:

What are you actually measuring in your engineering teams? And more importantly - does it incentivize the behavior you actually want?

Because I’m looking at our design system roadmap right now and I’m realizing: if I measured my team on velocity, we’d ship 40 new components this quarter. If I measured on impact, we’d ship 3 components that 12 product teams actually adopt.

Those are wildly different outcomes.

I’m also curious: Has anyone here successfully moved away from story points? What did you replace them with? Did your engineers resist? Did leadership freak out about losing “quantifiable productivity metrics”?

And honestly - has anyone else been burned by velocity theater? Where you’re shipping fast but shipping the wrong things?

I can’t be the only one who learned this lesson the hard way. :speech_balloon:



Maya, this hits home. I’ve been on both sides of this conversation over 18 years - as an engineer gaming story points at Intel, and now as a director trying to measure 40+ engineers without creating the same perverse incentives. :light_bulb:

Your startup story is painful but valuable. The challenge you’re describing - optimizing for output over outcomes - is exactly what I see happening at my financial services company right now.

Story points are broken when tied to performance reviews

Here’s the data point that should terrify everyone: A 2025 DORA study found that 42% of teams admitted to manipulating velocity metrics when tied to performance reviews.

Think about that. Nearly half of teams are gaming the system. Engineers inflate estimates. Teams cherry-pick easy work. Junior engineers get pressured to accept unrealistic point commitments.

I’ve seen this firsthand. An engineer on my team once spent 3 days refactoring a critical authentication module that was creating security vulnerabilities. Made the codebase 40% faster, eliminated 3 production incidents per week. Zero story points. Looked like a “low productivity week” on the velocity chart.

That’s insane.

But DORA alone isn’t enough either :thinking:

Here’s where I’ll push back a bit on the Adobe example. Yes, DORA metrics are a massive improvement over story points. Lead time, deployment frequency, change failure rate, MTTR - these measure flow and reliability, not arbitrary complexity estimates.

But in my world (financial services with compliance requirements), DORA metrics alone miss critical context:

  • Security review time doesn’t show up in deployment frequency
  • Regulatory compliance work doesn’t ship fast but is legally required
  • Cross-team dependencies can tank your lead time through no fault of your team

A recent GetDX analysis (March 2026) makes this point explicitly: DORA metrics alone are insufficient for modern engineering teams, especially AI-assisted ones.

What actually works: Flow metrics + SPACE framework

My current approach combines three measurement systems for different purposes:

1. Flow metrics for planning:

  • Cycle time (how long work takes once started)
  • Work in progress limits (preventing context switching)
  • Throughput (completed work per time period)
  • These help us identify bottlenecks and optimize process

2. SPACE framework for team health:

  • Satisfaction and well-being (survey quarterly)
  • Performance (business outcomes, not velocity)
  • Activity (DORA metrics)
  • Communication and collaboration quality
  • Efficiency and flow (developer experience metrics)

3. Outcome metrics for impact:

  • Feature adoption rates (are customers using what we ship?)
  • Incident reduction (is quality improving?)
  • Time to customer value (from idea to customer hands)
  • Revenue/cost impact (business outcomes)

Different metrics for different purposes. Not one number to rule them all.
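For anyone who wants to see the flow metrics from item 1 concretely, here is a minimal sketch of how cycle time, throughput, and WIP could be pulled from a list of work items. The `WorkItem` shape, the function names, and the sample dates are all made up for illustration; your tracker’s export will look different.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass
class WorkItem:
    # Hypothetical record shape; map your tracker's export onto it.
    started: date
    finished: Optional[date]  # None means the item is still in progress


def cycle_times_days(items):
    """Days from start to finish for each completed item."""
    return [(i.finished - i.started).days for i in items if i.finished is not None]


def throughput(items, period_start, period_end):
    """Number of items finished inside [period_start, period_end]."""
    return sum(
        1
        for i in items
        if i.finished is not None and period_start <= i.finished <= period_end
    )


def wip_on(items, day):
    """Items started on or before `day` and not yet finished on that day."""
    return sum(
        1
        for i in items
        if i.started <= day and (i.finished is None or i.finished > day)
    )


# Made-up sample data, only to show the calculations.
items = [
    WorkItem(date(2026, 3, 2), date(2026, 3, 6)),    # 4 days
    WorkItem(date(2026, 3, 3), date(2026, 3, 13)),   # 10 days
    WorkItem(date(2026, 3, 9), date(2026, 3, 16)),   # 7 days
    WorkItem(date(2026, 3, 12), None),               # still open
]

median_cycle = median(cycle_times_days(items))                        # 7
done_in_march = throughput(items, date(2026, 3, 1), date(2026, 3, 31))  # 3
wip_mid_march = wip_on(items, date(2026, 3, 12))                      # 3
```

The sketch uses the median rather than the mean for cycle time so that one long-running item doesn’t skew the picture.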

The pendulum swing danger :balance_scale:

Here’s my concern with the “ditch all estimation” movement: I’ve seen teams swing from “measure everything badly” to “measure nothing” and declare victory.

Then 6 months later, leadership has no idea why projects are late, teams can’t identify their own bottlenecks, and everyone’s back to intuition-driven chaos.

Maya, your question is spot on: Does it incentivize the behavior you actually want? But you still need some measurement to answer that question.

The key insight from Adobe isn’t “stop measuring.” It’s “measure what actually matters to customers and the business.”

That’s a huge difference.

What works for mid-size teams?

Since Keisha asked (hi Keisha! :waving_hand:) about what works for 20-50 person teams:

Start with lead time for changes as your north star metric. It’s simple, hard to game, and directly measures value delivery speed.

Add deployment frequency to measure how often you’re delivering.

Layer in team satisfaction surveys quarterly to catch velocity theater before it metastasizes.

And critically: Decouple metrics from individual performance reviews. Team metrics for process improvement. Individual reviews based on impact, collaboration, growth.

Adobe did this right. We’re trying to do it right. But it requires leadership buy-in that “productivity” isn’t a single number.

Anyone else navigating this transition in a regulated industry? Financial services, healthcare, etc.? How do you balance flow metrics with compliance overhead?

Maya, your startup story is exactly why I joined this forum. The honest post-mortem about chasing the wrong metrics. :bar_chart:

I’m coming at this from the product side, and I want to celebrate the part of the Adobe transformation that Luis didn’t emphasize enough: They tied engineering work directly to feature adoption and revenue.

That’s the game-changer.

Story points never helped me prioritize the roadmap

Here’s my frustration as VP of Product: Story points tell me how complex something is to build. They don’t tell me if it’s worth building.

I’ve sat through hundreds of roadmap prioritization meetings that go like this:

Engineering: “That feature is a 21-pointer. It’ll take the whole sprint.”
Product (me): “Okay, but will customers pay for it?”
Engineering: “That’s not an engineering question.”
Sales: “Customers are asking for it.”
Me: “How many customers? What’s the ARR impact?”
Sales: “All of them. It’s a deal-breaker.”
Engineering: “So we’re building it?”
Me: dies inside

Story points optimize for “can we build it?” Product strategy optimizes for “should we build it?” These are completely different questions, and velocity metrics actively obscure the second one.

What Adobe got right: Outcome metrics :bullseye:

Here’s what I love about Adobe’s approach - they didn’t just replace story points with better process metrics. They added business outcome metrics:

  • Feature adoption rates - Are customers actually using what we ship?
  • Revenue impact - Does this work tie to money?

At my fintech startup, we adopted a similar framework 9 months ago. Our north star metric is now “time to customer value” - measured as days from “customer need identified” to “customer successfully using solution in production.”

Not “time to code complete.” Not “time to QA approved.” Time to customer value.

The features that never shipped :money_with_wings:

Luis asked a version of this question, but I want to make it explicit:

How do you measure what never shipped because you were too busy shipping fast?

My previous company (well-funded Series C) had killer velocity. 40-50 story points per sprint, consistently. Shipped 12 major features in Q4 2024.

Customer NPS dropped 8 points that quarter.

You know why? We shipped features customers didn’t want while ignoring the 3 features they desperately needed - because those 3 were each estimated at 34 points and would “destroy our velocity.”

We optimized for the wrong outcome. We were data-driven, but driving toward the wrong destination.

What we measure now

Our current product metrics framework:

1. Customer outcome metrics (the “why”):

  • Problem resolution rate (did we solve the customer problem?)
  • Feature adoption rate (are customers using it?)
  • NPS delta (did satisfaction improve?)
  • Revenue/churn impact (business outcomes)

2. Delivery health metrics (the “how”):

  • Time to customer value (idea → production usage)
  • Deploy frequency (are we delivering iteratively?)
  • Change failure rate (are we maintaining quality?)

3. Learning velocity (the “what’s next”):

  • Experiment cycle time (how fast can we test hypotheses?)
  • Customer conversations per feature (are we learning?)
  • Pivot/kill rate (are we willing to change course?)

Notice what’s missing? Story points. Velocity. Sprint commitment accuracy.

The uncomfortable question :thinking:

Here’s what I’m still grappling with:

If we stop estimating, how do I answer “when will this be done?” for my Series B pitch deck?

I need some way to forecast. Investors want a roadmap. Sales wants commitment dates. Partnerships need timelines.

Luis’s flow metrics help with “how long does work typically take once started.” But that doesn’t help me answer “should we commit to building this for Q3 or Q4?”

Adobe is a massive company with established revenue. They can afford to ship when it’s ready. I’m at a startup where a missed deadline means a lost partnership worth $2M ARR.

How do you balance outcome-focused measurement with the very real need to forecast delivery timelines?

The ROI conversation I wish more product leaders would have :money_bag:

Maya, you asked if anyone else was burned by velocity theater. Here’s my version:

Last year, my team shipped 47 features. Customers actively use 12 of them.

That’s a 74% waste rate.

If each feature cost an average of $40K in engineering time (conservative estimate for our team size), that’s $1.4M in sunk cost building stuff nobody uses.
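Spelled out as a quick calculation (a sketch; the variable names are mine, the numbers are the ones above):

```python
shipped = 47
actively_used = 12
cost_per_feature = 40_000  # USD, the average estimate quoted above

unused = shipped - actively_used            # features nobody uses
waste_rate = unused / shipped               # share of shipped work that is unused
sunk_cost = unused * cost_per_feature       # engineering spend on unused features

print(f"{unused} unused features, {waste_rate:.0%} waste, ${sunk_cost:,} sunk")
# prints: 35 unused features, 74% waste, $1,400,000 sunk
```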

Imagine if we’d shipped 15 features that customers actually wanted instead of 47 that looked good on a velocity chart.

That’s the conversation product leaders need to have with engineering leaders. Not “how can you ship faster?” but “how can we ship the right things?”

Where I’m struggling: Forecasting without estimation

David from product here being vulnerable: I don’t know how to do roadmap planning without some form of estimation.

Yes, story points are broken. But if engineering tells me “we have no idea how long anything takes,” how do I have an honest conversation with the board about when we’ll be enterprise-ready?

Luis mentioned flow metrics and historical cycle time. I’d love to hear more about how that actually works for roadmap forecasting in practice.

Anyone doing outcome-driven product development at a startup (not a BigCo with infinite runway)? How do you balance “build the right thing” with “commit to a timeline”?

:rocket:

This conversation is giving me life. :flexed_biceps: Maya’s vulnerability, Luis’s nuanced take, David’s ROI analysis - this is exactly the cross-functional dialogue we need.

I’m coming at this from the VP Engineering seat while scaling from 25 to 80+ engineers. And I need to be honest: I’m terrified of losing all measurement.

I’ve lived the velocity theater nightmare

Maya, your story about shipping the wrong features hit hard. At my previous company (Director of Eng at a growth-stage SaaS), we had the same problem.

Our engineering team had a 92% sprint commitment success rate. Leadership loved our predictability. We were the “most reliable engineering team in the company.”

Then our biggest customer churned because we hadn’t fixed their critical integration bug for 9 months. Why? Because bug fixes don’t generate story points.

We were optimizing for predictability, not for customer retention. That $480K/year customer is gone forever.

Velocity theater kills businesses. I get it. I’ve seen it.

But scaling 25→80 engineers requires some measurement :glowing_star:

Here’s where I’m struggling with the “ditch all metrics” narrative:

When you’re a 5-person startup, you can coordinate via Slack and gut feel. When you’re scaling to 80 engineers across 8 product teams, you need data to identify where things are breaking down.

Without measurement, I can’t answer:

  • Which teams are blocked by dependencies?
  • Where are our quality issues concentrated?
  • Why does Team A ship 2x as fast as Team B on similar work?
  • Are we creating bottlenecks by hiring faster than we’re onboarding?

Luis is right - flow + SPACE + outcome metrics. But implementing that across 8 teams while also interviewing 40 candidates and negotiating with VCs about runway… it’s a lot.

The mistake: Using the same metrics for different purposes :key:

Here’s what I think we’re all discovering: The problem isn’t measurement. The problem is using the same metric for multiple incompatible purposes.

Story points fail because we use them for:

  1. Estimating completion dates (forecasting)
  2. Measuring team productivity (performance)
  3. Comparing individuals (promotion decisions)
  4. Calculating team capacity (resourcing)

That’s four completely different goals with different success criteria. Of course it fails.

What I’m experimenting with now:

For forecasting: Historical cycle time + Monte Carlo simulation

  • “Based on the last 50 features of similar complexity, there’s a 70% chance this ships in 3-5 weeks”
  • No individual estimation poker. Just statistical forecasting from actual delivery data.
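A rough sketch of what the Monte Carlo part can look like, assuming you forecast a backlog of similar-sized items from past weekly throughput. This is one common way to do it, not necessarily the exact setup we run; the function names and the throughput history are invented for illustration.

```python
import random


def simulate_weeks_to_finish(backlog_size, weekly_throughput_history,
                             trials=10_000, seed=0):
    """Resample past weekly throughput until the backlog is done.

    Returns the number of weeks each simulated run needed, sorted ascending.
    """
    if not any(t > 0 for t in weekly_throughput_history):
        raise ValueError("history needs at least one week with completed work")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for _ in range(trials):
        remaining, weeks = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput_history)
            weeks += 1
        results.append(weeks)
    return sorted(results)


def weeks_at_confidence(sorted_results, confidence):
    """A week count that at least `confidence` of the runs finished within."""
    index = min(len(sorted_results) - 1, int(confidence * len(sorted_results)))
    return sorted_results[index]


# Made-up history: items completed in each of the last 8 weeks.
history = [0, 1, 2, 2, 3, 1, 2, 4]
runs = simulate_weeks_to_finish(backlog_size=12, weekly_throughput_history=history)

p70 = weeks_at_confidence(runs, 0.70)
p90 = weeks_at_confidence(runs, 0.90)
print(f"70% of runs finished within {p70} weeks, 90% within {p90} weeks")
```

The point is that the forecast comes out as a range with a confidence level attached, built from what the team actually delivered rather than from what anyone guessed in a planning meeting.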

For team health: SPACE framework surveys quarterly

  • Satisfaction, performance, activity, communication, efficiency
  • Not tied to compensation. Used for identifying support needs.

For business impact: Outcome metrics at the team level

  • Customer satisfaction delta
  • Revenue/cost impact
  • System reliability improvements
  • Feature adoption rates

For individual performance: Impact-based reviews with narrative

  • What problems did you solve?
  • How did you help others succeed?
  • Where did you grow?
  • Zero metrics. All narrative and peer feedback.

Measurement nihilism is a real danger :warning:

Luis mentioned the pendulum swing. I’m seeing it happen right now in our industry.

Teams are so burned by velocity theater that they’re rejecting all measurement. “We don’t estimate anymore. We ship when it’s ready.”

That’s fine for established products with patient stakeholders. But I’m at an EdTech startup where:

  • Sales needs to commit to school district implementation timelines (9-month sales cycles)
  • Compliance requires specific features for regulatory approval
  • Investors expect a roadmap in the Series B deck

“We’ll ship when it’s ready” doesn’t work when missing a deadline means losing a $3M contract.

David, I feel your pain about forecasting without estimation. Here’s what’s working for us:

Probabilistic forecasting without individual estimation :bar_chart:

Instead of story pointing, we track:

  1. Historical throughput: How many features (by size category) does this team typically complete per month?

  2. Cycle time distribution: What’s the spread? (50th percentile: 2 weeks, 85th percentile: 4 weeks, 95th percentile: 7 weeks)

  3. Uncertainty buffers: Build in risk based on dependencies, new tech, team composition

Result: “Based on Team A’s historical data, there’s a 70% chance Feature X ships by the end of Q3 and a 90% chance it ships by the end of Q4.”

Not a precise date. But a probability distribution that’s actually honest about uncertainty.
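If it helps, here is a minimal sketch of how the cycle time percentiles can be computed. It assumes a nearest-rank percentile (other definitions interpolate and give slightly different numbers), and the sample data is made up so that it lands on the 2 / 4 / 7 week spread mentioned above.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("need at least one cycle time")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up cycle times in days for 20 completed features.
cycle_days = [5, 6, 8, 9, 10, 11, 12, 13, 14, 14,
              15, 16, 18, 20, 22, 25, 28, 33, 49, 60]

for pct in (50, 85, 95):
    days = percentile(cycle_days, pct)
    print(f"{pct}th percentile: {days} days ({days / 7:.0f} weeks)")
# 50th percentile: 14 days (2 weeks)
# 85th percentile: 28 days (4 weeks)
# 95th percentile: 49 days (7 weeks)
```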

We stopped asking engineers “how many points?” and started asking “is this more like Feature X (shipped in 2 weeks) or Feature Y (took 6 weeks)?”

Relative sizing without the point system. Forecasting based on actual historical performance, not estimated capacity.

What’s working for mid-size teams (20-50 engineers)? :thinking:

Maya asked. Luis gave a great answer. Here’s my addition:

Start simple. Add complexity only when you feel the pain of not having it.

Minimum viable metrics for a 20-person team:

  1. Lead time for changes (commit → production)
  2. Deployment frequency
  3. Change failure rate
  4. Quarterly team satisfaction survey (5 questions, anonymous)

That’s it. Those four metrics will tell you:

  • Are we delivering value quickly? (Lead time)
  • Are we delivering continuously? (Deploy frequency)
  • Is our quality acceptable? (Failure rate)
  • Is the team burning out? (Satisfaction)
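The first three can usually be computed from deployment records you already have. A minimal sketch, assuming a hypothetical `Deployment` record with a commit timestamp, a deploy timestamp, and a failure flag (your CI/CD and incident tools will name these differently, and the sample data is invented):

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Deployment:
    # Hypothetical record; fields would come from CI/CD and incident tooling.
    committed_at: datetime   # first commit in the change
    deployed_at: datetime    # when the change reached production
    caused_failure: bool     # needed a rollback or hotfix, or caused an incident


def lead_time_hours(deploys):
    """Median hours from commit to production."""
    return median(
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
    )


def deploys_per_week(deploys, weeks):
    """Average number of production deployments per week."""
    return len(deploys) / weeks


def change_failure_rate(deploys):
    """Share of deployments that caused a failure."""
    return sum(d.caused_failure for d in deploys) / len(deploys)


# Made-up sample: four deployments over two weeks.
deploys = [
    Deployment(datetime(2026, 3, 2, 9), datetime(2026, 3, 2, 15), False),   # 6 h
    Deployment(datetime(2026, 3, 3, 10), datetime(2026, 3, 4, 10), True),   # 24 h
    Deployment(datetime(2026, 3, 9, 9), datetime(2026, 3, 9, 21), False),   # 12 h
    Deployment(datetime(2026, 3, 11, 8), datetime(2026, 3, 12, 2), False),  # 18 h
]

lead_time = lead_time_hours(deploys)        # 15.0 hours
frequency = deploys_per_week(deploys, 2)    # 2.0 per week
failure_rate = change_failure_rate(deploys) # 0.25
```

Keep these at the team level. The moment they show up in someone’s individual review, you are back to the gaming problem Luis described.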

When you scale to 40-50 engineers, add:

  • Cycle time by team (to identify bottlenecks)
  • Customer-facing outcome metrics (adoption, NPS, support tickets)
  • Work in progress limits (to prevent context switching)

Only measure what you’re willing to act on. If you won’t change anything based on a metric, stop tracking it.

The conversation I wish we’d have at the C-level :fire:

Here’s what I want to say to every CEO obsessed with “engineering productivity”:

You can’t measure productivity by counting outputs. You can only measure it by evaluating outcomes.

If my team ships 12 features that customers don’t use, that’s not productivity - that’s waste.

If my team ships 3 features that increase revenue by 40%, reduce churn by 15%, and improve NPS by 12 points - that’s productivity.

Adobe figured this out. They tied engineering work to business outcomes. That’s the transformation.

Not “engineers are shipping more.” But “engineering work is creating more customer and business value.”

My question for the group

For those of you who’ve successfully moved away from story points:

How did you handle the transition period? We’re 3 months into this experiment and some senior engineers are resisting hard. They want quantified productivity metrics because:

  1. They’ve been rewarded for high velocity for years
  2. They’re worried impact-based reviews are “too subjective”
  3. They don’t trust leadership to evaluate them fairly without numbers

How do you build trust in a qualitative performance evaluation system when your culture has been quantitatively driven for a decade?

And David - I’d love to jam on probabilistic forecasting. We’re figuring this out in real-time and I’m sure we’re making mistakes. Always learning. :glowing_star: