Experienced Developers Are 19% SLOWER With AI Tools—Does "Productivity" Mean Different Things for Juniors vs Seniors?

I’ve been wrestling with some counterintuitive research that challenges everything we think we know about AI coding assistants. :thinking:

The Perception Gap is WILD

A METR study from early 2025 found that experienced developers took 19% longer to complete tasks when using AI tools (primarily Cursor Pro with Claude 3.5/3.7 Sonnet) compared to working without AI.

But here’s the kicker: these same developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.

That’s a massive disconnect between perception and reality. We feel faster, but the data shows we’re actually slower.

Why Are Experienced Devs Slower?

The study recruited 16 experienced developers from large open-source repos (averaging 22k+ stars and 1M+ lines of code) that they had contributed to for years. These aren’t junior devs learning the ropes—they’re experts in their own codebases.

The theory: experienced developers approach their work with tons of context that AI assistants don’t have. They have to reconcile the AI’s output with their own mental models and problem-solving strategies, then spend time debugging what it generates.

In my own work on design systems, I see this play out constantly. When I use AI to generate component code, I spend more time reviewing and adjusting it to fit our existing patterns than if I’d just written it myself from scratch. The AI doesn’t understand our specific constraints, naming conventions, or the architectural decisions we made 6 months ago.

But Juniors Are FASTER

Here’s where it gets interesting. Research shows juniors benefit way more:

  • Junior developers: 21-40% productivity boost
  • Senior developers: 7-16% productivity boost (and sometimes negative, as we saw)

Yet seniors ship 2.5x more AI-generated code than juniors, with 32% of seniors reporting over half their production code comes from AI (vs. 13% for juniors).

So what gives? Seniors use AI fundamentally differently—they treat it like a talented but fallible junior dev that needs oversight, structured prompts, and iteration. Juniors treat it like a genius oracle.

The Real Question: What IS Productivity?

This is where the design perspective kicks in. :artist_palette:

Are we measuring the right thing? For juniors, “productivity” might mean learning to ship code faster. For seniors, “productivity” might mean maintaining system coherence, architectural quality, and long-term maintainability.

Speed ≠ Value when you’re a senior engineer.

If I ship a component 40% faster but it creates 3 follow-up PRs to fix edge cases, integrate with existing systems, and update documentation—was I actually more productive? Or did I just shift the work around?
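
That arithmetic is easy to check. Here’s a back-of-the-envelope sketch—every number in it is hypothetical, just to show how quickly follow-up work can eat a headline speedup:

```python
# Does a "40% faster" initial ship actually save time once follow-ups count?
# All hours below are hypothetical illustrations, not measured data.

baseline_hours = 10.0              # writing the component by hand
ai_initial = baseline_hours * 0.6  # first pass is 40% faster with AI
followup_prs = 3                   # edge cases, integration, docs
hours_per_followup = 2.5

ai_total = ai_initial + followup_prs * hours_per_followup
print(f"by hand: {baseline_hours}h, with AI + follow-ups: {ai_total}h")
```

With those (made-up) numbers, the AI path costs 13.5 hours against 10 by hand: the work didn’t disappear, it moved downstream.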

My Failed Startup Taught Me This

At my failed B2B SaaS startup, we moved FAST. We shipped features constantly. Our velocity was incredible.

But we never stopped to ask: are we shipping the right things? Are we building a coherent product or a Frankenstein’s monster of features?

Spoiler: it was the latter. :sweat_smile:

I see the same pattern with AI-generated code. Teams ship faster but create more technical debt, more inconsistency, more “wait, why did we build it this way?” moments.

So What Do We Do?

I don’t have answers, but I have questions:

  1. Should we measure different productivity metrics for juniors vs seniors? Maybe juniors should optimize for learning velocity, while seniors optimize for system health?

  2. Is the 19% slowdown worth it for other benefits? Maybe seniors are slower but produce better code? The study didn’t measure code quality, maintainability, or long-term outcomes.

  3. Are we teaching juniors to skip the learning that made seniors valuable? If juniors rely on AI as a “genius oracle,” do they miss the trial-and-error lessons that shape senior-level intuition?

  4. What happens when these AI-native juniors become seniors? Will they have the context and intuition to use AI effectively, or will they just be fast at shipping code they don’t deeply understand?

My Hot Take

AI tools are revealing what we’ve always known but rarely measured: senior developers create value differently than junior developers, and speed is a terrible proxy for senior-level impact.

Maybe the 19% slowdown is seniors doing what seniors do—thinking about context, consequences, and long-term system health. And maybe that’s exactly what we should be paying them for.

What do you think? Are you experiencing this in your teams?


This hits close to home. I’m seeing exactly this pattern with my team at the financial services company.

The Data From Our 40+ Engineer Team

We’ve been tracking AI tool usage for 8 months now, and the numbers mirror what you’re describing:

  • Junior engineers (0-3 years): 30-35% faster at completing tickets, learning curve compressed from 6-8 weeks to 3-4 weeks
  • Mid-level engineers (4-7 years): 15-20% faster, but with 25% more follow-up work
  • Senior engineers (8+ years): Roughly the same speed or slightly slower, but they’re spending that time on architecture, code review, and mentoring

The interesting part? Our incident rate went up 12% in the first 6 months of AI adoption, driven almost entirely by code that “looked right” but had subtle bugs that only manifested under specific conditions.

The Review Burden Nobody Talks About

Your point about seniors acting as oversight for AI-generated code is critical. Our senior engineers are now spending 4-6 hours per week reviewing AI-generated code that has issues they can spot in 30 seconds but juniors can’t see at all.

We had a payment processing bug last month that came from AI-generated error handling logic. It worked fine in happy-path scenarios but failed catastrophically when the payment gateway returned a non-standard error code. A senior engineer would have caught this because they’ve seen that failure mode before. The junior who wrote it with AI assistance had never encountered it.

Different Productivity for Different Roles

I think you’re absolutely right that we need different productivity metrics:

Juniors: We measure learning velocity—are they understanding the patterns? Can they explain why the code works? Are they asking good questions?

Mid-level: We measure consistent delivery—can they ship features with minimal rework? Do they anticipate edge cases? Are they reducing the review burden?

Seniors: We measure architectural quality, system health, and team leverage—are they preventing problems before they happen? Are they making the team around them better? Are they maintaining system coherence?

Speed is only relevant for juniors who are learning to deliver. For seniors, speed is almost irrelevant compared to impact.

The Career Pipeline Question

Your question #3 worries me the most: “Are we teaching juniors to skip the learning that made seniors valuable?”

I’m seeing juniors who can ship code fast but can’t debug it when it breaks. They don’t understand the “why” behind the patterns. They’re dependent on AI to generate solutions because they never built the mental models through trial and error.

In 5-7 years, will these AI-native juniors have the intuition and context to be effective seniors? Or will we have a generation of developers who are fast at shipping but slow at thinking?

That’s the question that keeps me up at night. :confused:

Maya, this research validates what I’ve been arguing with my board about for 6 months. They see “AI productivity gains” in every pitch deck and ask why we’re not moving faster.

The Executive Pressure Problem

The narrative at the board level is simple: AI makes developers 40-60% more productive, so why aren’t you shipping 40% more features?

But that’s not how it works. As you and Luis pointed out, the gains are uneven, the overhead is real, and we’re optimizing for the wrong metrics.

What We’re Actually Seeing at Scale

We have 120 engineers across 8 product teams. We deployed AI coding assistants 9 months ago. Here’s what the data shows:

Velocity Metrics (what the board cares about):

  • Story points per sprint: +22%
  • PRs merged per week: +18%
  • Features shipped per quarter: +12%

Quality Metrics (what engineering leadership cares about):

  • Incidents per 1000 deployments: +15%
  • Time to resolve incidents (MTTR): +8%
  • Code review cycles: +25%
  • Technical debt growth: +30% (measured by “follow-up work” tickets)

So yes, we’re shipping faster. But we’re also creating more problems and spending more time cleaning them up.
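
One way to keep both sides of that ledger visible is to never report a velocity delta without its quality counterpart. A minimal sketch, using deltas loosely mirroring the hypothetical numbers above (the crude averaging is an assumption, not a standard formula):

```python
# Pair velocity gains with quality costs so neither gets reported alone.
# Deltas are hypothetical, loosely mirroring the figures in this thread.

velocity_deltas = {
    "story_points_per_sprint": +0.22,
    "prs_merged_per_week": +0.18,
    "features_per_quarter": +0.12,
}

quality_deltas = {  # for all of these, higher is worse
    "incidents_per_1k_deploys": +0.15,
    "mttr": +0.08,
    "review_cycles": +0.25,
    "tech_debt_growth": +0.30,
}

# A deliberately crude "net" view: average gain minus average cost.
net = (sum(velocity_deltas.values()) / len(velocity_deltas)
       - sum(quality_deltas.values()) / len(quality_deltas))
print(f"net productivity delta: {net:+.1%}")
```

With these numbers the net comes out slightly negative—“shipping faster” and “more productive” are not the same claim.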

The ROI Reality Check

Your question #2 is the one I’m wrestling with: “Is the 19% slowdown worth it for other benefits?”

Here’s my contrarian take: for seniors, maybe the slowdown is actually the ROI.

If a senior engineer is 19% slower but spending that extra time on:

  • Architectural thinking
  • Preventing technical debt
  • Mentoring juniors
  • Code review and quality gates
  • System-wide optimization

…then that 19% slowdown might be exactly what we want. We’re not paying seniors for speed. We’re paying them for judgment, context, and leverage.

The problem is our productivity metrics don’t capture this. We measure output (code shipped) not outcomes (system health, team effectiveness, long-term velocity).

The Two-Track Future?

I’m starting to think we need two development tracks:

Track 1: AI-Augmented Delivery (juniors, mid-levels, well-scoped features)

  • Optimize for speed
  • Use AI heavily for boilerplate, implementation
  • Measure velocity and learning

Track 2: Human-Centered Architecture (seniors, complex problems, system design)

  • Optimize for quality and coherence
  • Use AI sparingly, for research and analysis
  • Measure impact and leverage

The question is: does this create a two-tier team? Do juniors on Track 1 ever learn enough to graduate to Track 2?

What I’m Telling My Board

I’m pushing back on the “AI = more features” narrative. Instead, I’m framing it as:

“AI lets us handle the same ambitious roadmap with fewer engineers, while maintaining quality and avoiding technical debt.”

We’re not shipping 40% more. We’re avoiding hiring 8-10 engineers we would have needed otherwise. That’s the ROI—capability preservation, not velocity multiplication.

But I’ll be honest: it’s a hard sell when every other CEO is claiming 2x productivity gains. :sweat_smile:

This conversation is giving me life because it gets at something I’ve been trying to articulate to my leadership team: we’re measuring productivity like it’s a single dimension when it’s actually multidimensional.

The Equity Dimension Nobody’s Discussing

Can we talk about who benefits from the “junior productivity gains” and who bears the cost of the “senior slowdown”?

At my EdTech startup (80 engineers), I’m seeing patterns that worry me:

Junior engineers (who disproportionately benefit from AI):

  • 40-60% less experienced
  • Often bootcamp grads or self-taught
  • More likely to be from non-traditional backgrounds
  • More diverse (gender, race, geography)

Senior engineers (who bear the review and mentorship burden):

  • More likely to be 10+ year industry veterans
  • Predominantly from traditional CS backgrounds
  • Less diverse overall

So when we say “juniors are 30% faster with AI” but “seniors are slower because they’re reviewing/mentoring more,” we’re essentially saying: AI is making diverse, non-traditional talent more productive, while increasing the burden on a less diverse group.

That’s not inherently bad—in fact, it could be a massive equity win—but only if we:

  1. Recognize and reward the mentorship/review work seniors are doing
  2. Ensure juniors are actually learning, not just shipping
  3. Build pathways for AI-native juniors to develop senior-level skills

The Learning Velocity Question

Maya, your question about whether AI-native juniors will develop senior-level intuition is THE question for our industry’s future.

I’m running an experiment with two cohorts of junior engineers:

Cohort A: AI-Heavy (4 engineers)

  • Use AI tools freely from day one
  • Optimize for shipping velocity
  • Measure: tickets closed, features shipped

Cohort B: Foundations-First (4 engineers)

  • No AI tools for first 3 months
  • Focus on fundamentals, debugging, learning patterns
  • Then introduce AI tools with structured guidance

Early results (6 months in):

  • Cohort A: 35% more tickets closed, but 40% more follow-up work, struggle with debugging
  • Cohort B: Slower initially, but after AI introduction, they’re using it better—more thoughtful prompts, faster at spotting AI errors, better code quality

It’s too early to draw conclusions, but my hypothesis: AI amplifies your foundation. If you have weak fundamentals, AI makes you fast but fragile. If you have strong fundamentals, AI makes you a force multiplier.

Redefining Senior “Productivity”

Michelle’s point about two-track development resonates, but I want to push back slightly: I don’t think we need two tracks. I think we need to redefine what “productive” means for seniors.

Right now, we measure seniors on the same metrics as juniors: velocity, output, speed. But senior value is:

  • Preventative work: The incident that didn’t happen because of good architecture
  • Leverage: Making the team around them 2x more effective
  • Judgment: Knowing when to move fast and when to move deliberately
  • Context: Understanding the “why” behind past decisions

None of this shows up in velocity metrics. But it’s 80% of senior value.

What We’re Doing About It

  1. Separate senior performance reviews from velocity metrics. We evaluate seniors on: architecture quality, mentorship impact, incident prevention, technical decision-making.

  2. Explicitly track mentorship time as “productive work”. If a senior spends 6 hours/week reviewing AI-generated code and teaching juniors, that’s not “overhead”—it’s their job.

  3. Create learning milestones for AI-native juniors. They need to hit proficiency in debugging, architectural thinking, and “knowing when AI is wrong” before we consider them mid-level.

  4. Celebrate different kinds of productivity. Our “wins” channel highlights both “shipped this feature fast” and “prevented this architectural mistake.”

The Uncomfortable Truth

If we only measure speed, we’ll optimize for speed. And we’ll end up with a generation of developers who are incredibly fast at creating technical debt they don’t know how to fix.

AI is exposing what was always true: junior productivity and senior productivity are different things, and we’ve been pretending they’re the same because it was easier to measure just one number.

As the non-technical person in this thread, I’m finding this fascinating because it mirrors a product management problem we’ve been wrestling with.

The Product Parallel: Shipping vs Learning

In product, we have the same tension between shipping features and learning what customers actually need.

Early-stage products optimize for learning velocity: “How fast can we test hypotheses and validate assumptions?”

Mature products optimize for execution velocity: “How fast can we deliver on a validated roadmap?”

The mistake is optimizing for execution before you’ve finished learning. You end up shipping the wrong thing very efficiently.

I wonder if AI is doing something similar to engineering teams—accelerating execution before the learning is complete.

Juniors are in the “learning phase” of their careers. They need to build mental models, intuition, and pattern recognition. AI lets them skip straight to execution. But if the learning isn’t happening, they’re just shipping code they don’t understand.

Seniors are in the “execution + teaching” phase. They’ve done the learning. AI doesn’t help them learn (they already know), so it just creates overhead in the form of review and correction.

The Metrics Problem is a Product Problem

Maya, your question about measuring different productivity metrics for juniors vs seniors is actually a product thinking problem.

For juniors, the job-to-be-done is: Learn how to write good code while delivering value
Success metric: Learning velocity + delivery quality (not just speed)

For seniors, the job-to-be-done is: Maintain system health while enabling the team
Success metric: Team leverage + architectural quality + incident prevention (not individual output)

If we measure both groups on “velocity,” we’re measuring the wrong success criteria for seniors.

The Question I’m Asking My Engineering Partners

When engineering tells me “we’re 20% more productive with AI,” I ask:

  1. More productive at what? Shipping code? Learning? Preventing problems? Maintaining quality?
  2. Over what time horizon? Are we faster this sprint but creating debt for next quarter?
  3. Who’s more productive? Juniors? Seniors? The team overall?
  4. Productive toward what goal? Moving fast? Building right? Learning?

Usually, the answer is: “We’re shipping more code this sprint.” Which is… fine? But not necessarily the productivity we actually need.

The Real Productivity Metric: Value Per Unit of Effort

In product, we’ve learned that feature count is a terrible proxy for customer value. 100 mediocre features < 10 excellent features.

I suspect engineering is learning the same lesson: velocity and lines of code are terrible proxies for engineering value.

Maybe the real productivity metric is:
Value Created / (Engineering Effort + Technical Debt + Operational Overhead + Team Friction)

By that measure, a senior engineer who is “19% slower” but prevents 3 major incidents, reduces tech debt, and levels up 2 junior engineers is massively more productive than a junior who ships 40% more code that requires 6 hours of senior review and creates 3 follow-up tickets.
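
That formula can be made concrete. The sketch below plugs in purely hypothetical numbers—the point isn’t the figures, it’s that once debt and review overhead sit in the denominator, the “slower” engineer can come out ahead:

```python
# Back-of-the-envelope "value per unit of effort" comparison.
# All figures are hypothetical illustrations, not data from the METR study.

def value_per_effort(value_created, effort_hours, debt_hours,
                     overhead_hours, friction_hours):
    """Value Created / (Effort + Tech Debt + Operational Overhead + Friction)."""
    return value_created / (effort_hours + debt_hours
                            + overhead_hours + friction_hours)

# A senior who is "19% slower" but prevents incidents and mentors:
senior = value_per_effort(
    value_created=100,   # incidents prevented, juniors leveled up, feature shipped
    effort_hours=47.6,   # 40h of work x 1.19 slowdown
    debt_hours=2,        # little follow-up work
    overhead_hours=0,
    friction_hours=0,
)

# A junior shipping 40% more code that needs senior review and rework:
junior = value_per_effort(
    value_created=60,    # more raw output, narrower impact
    effort_hours=28.6,   # 40h of work / 1.40 speedup
    debt_hours=12,       # three follow-up tickets
    overhead_hours=6,    # senior review time consumed
    friction_hours=2,
)

print(f"senior: {senior:.2f}, junior: {junior:.2f}")
```

Under these (invented) inputs the senior scores meaningfully higher per unit of total effort, even while being individually slower.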

My Controversial Take

The METR study found experienced developers were 19% slower on their own familiar codebases.

But what if we measured productivity at the team level instead of individual level?

A team with seniors using AI thoughtfully + juniors using AI under guidance might be 30% more productive as a unit even if seniors are individually slower, because:

  • Juniors ship faster with senior oversight
  • Seniors catch problems early (cheaper to fix)
  • Technical debt is managed, not accumulated
  • The codebase stays coherent

We’ve been measuring individual productivity when we should be measuring team effectiveness.

Sound familiar, engineering leaders? :blush: