AI dependency is masking skill gaps on my team—how do you actually assess capability now?

I need to be honest about something that’s been keeping me up at night: I can no longer accurately assess the actual capabilities of my engineering team, and it’s creating serious problems for capacity planning.

Three months ago, I hired three mid-level engineers who absolutely crushed their technical interviews. Code challenges? Excellent. System design? Solid. References? Glowing. Fast forward to today, and I’m realizing I have no idea what their baseline capabilities actually are versus what they can accomplish with AI assistance.

The wake-up call

Last sprint planning was a disaster. The team committed to a roadmap based on their recent velocity—which looked great on paper. Then we hit a set of novel architectural challenges that our AI coding assistants couldn’t help with effectively, and velocity dropped 40%.

I thought maybe it was just bad estimation. Then I read the recent Anthropic research showing that developers using AI assistance while learning performed significantly worse on follow-up tests—a 17% drop in skill retention. That’s when it clicked: my team might be more AI-dependent than I realized.

The capacity planning crisis

Here’s what worries me most: when you’re planning capacity for Q2, how do you estimate what your team can actually handle? Traditional metrics assume you know your engineers’ baseline capabilities. But in an AI-augmented world:

  • Can this engineer debug complex production issues independently?
  • Do they understand the architectural trade-offs they’re making, or are they just picking the AI’s first suggestion?
  • Will they struggle when faced with a problem outside their AI tool’s training data?

I’m seeing developers who can ship features quickly but can’t explain the underlying design decisions. Code reviews that look fine on the surface but reveal concerning knowledge gaps when you dig deeper. The research backs this up—teams with high AI adoption merge 98% more pull requests, but PR review time increases 91%. That bottleneck? It’s reviewers trying to assess quality they can’t trust anymore.

The assessment challenge

Traditional engineering assessment doesn’t work anymore:

  • Code challenges: Are you testing the engineer or their prompting skills?
  • Code reviews: When everyone uses AI assistants, how do you spot individual skill gaps?
  • Velocity metrics: 30% faster coding only translates to 8% delivery improvement after testing and reviews—so what are we actually measuring?
  • Pair programming: Used to reveal how engineers think, but now you’re watching their AI interaction, not their problem-solving

Only 38% of companies offer AI training, yet we’ve deployed these tools everywhere. We’re basically running an uncontrolled experiment on our team’s long-term capability.

What I’m trying

I’ve started implementing quarterly “fundamentals check-ins” where engineers work through problems without AI assistance. The results are… sobering. Some of my “high performers” struggle with basic debugging when the AI isn’t available.

I’m also looking at architecture discussions and incident response as capability indicators—AI can’t design systems or handle novel production emergencies. But I’m worried this isn’t enough.

My questions for this community

For other engineering leaders dealing with this:

  1. How are you separating AI-augmented productivity from actual developer capability? Traditional metrics feel broken.

  2. What assessment methods work in an AI-assisted environment? How do you identify skill gaps before they become capacity problems?

  3. Should hiring and promotion criteria change? Are we evaluating the wrong things now?

  4. How do you plan capacity when you’re not sure of your team’s baseline capabilities without AI assistance?

I can’t be the only leader facing this. The speed gains are real (our coding velocity is up 25%), but I’m increasingly worried we’re building overconfidence on a shaky foundation. When the next major technical challenge hits—something truly novel—will my team be ready, or have we accidentally created a dependency we can’t see?

This hits close to home, Keisha. I’m dealing with the exact same challenge at enterprise scale—100+ engineers, and I honestly can’t tell you the baseline capability of about 40% of them anymore.

What you’re describing isn’t just a team management problem. It’s a strategic risk that most leadership teams don’t understand yet. When I present capacity plans to the board, I’m making educated guesses about capabilities I can no longer directly observe.

Our three-tier assessment framework

After six months of struggling with this, we implemented what I call the “Capability Reality Check” framework. It’s not perfect, but it’s revealing truths I wish I didn’t have to face:

1. Pairing sessions without AI tools (quarterly)

  • 90-minute problem-solving sessions with no AI assistance allowed
  • Focus on debugging, architecture decisions, and code reading
  • Results: About 35% of our “high performers” drop to average or below
  • This is painful but necessary—you need to know who can function when AI fails

2. Architecture and design discussions

  • AI can generate code, but it can’t design systems holistically
  • Weekly architecture reviews where engineers must defend design decisions
  • We ask: “What trade-offs did you consider?” “What alternatives did you reject and why?”
  • This reveals whether they understand what they’re building or just implementing suggestions

3. Incident response and production debugging

  • How engineers handle real production issues is the ultimate capability test
  • AI tools struggle with novel failure modes and complex system interactions
  • We track: Time to root cause, quality of analysis, effectiveness of fixes
  • This separates people who understand systems from people who can prompt AI

The 91% bottleneck is real

You mentioned PR review time increasing 91%—we’re seeing the exact same thing. The research you cited matches our internal data perfectly. Reviewers spend vastly more time because they can’t trust the quality anymore. They’re essentially re-reviewing everything at a fundamental level.

We’ve shifted to design document reviews ahead of code. If engineers can’t explain their approach in a design doc without AI assistance, they probably don’t understand it deeply enough to maintain it long-term.

Capacity planning with the “AI dependency factor”

For sprint planning and quarterly capacity estimates, we now apply what I call an “AI dependency discount”:

  • Novel work (new architectures, unfamiliar domains): Assume 30-40% lower velocity than recent AI-augmented performance
  • Maintenance and extensions (familiar patterns, similar to training data): Closer to observed velocity
  • Production issues and debugging: Plan as if AI tools don’t exist—they’re unreliable here
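
To make the discount concrete, here is a minimal sketch of how the three rules above could be turned into planning numbers. It is an illustration, not the actual tooling behind the framework: the function name, the category labels, the 0.35 factor (the midpoint of the 30-40% range), and the sample velocities are all assumptions, and the no-AI baseline is assumed to come from something like the no-AI pairing sessions.

```python
# Assumption: midpoint of the 30-40% range quoted for novel work.
NOVEL_DISCOUNT = 0.35


def planning_velocity(category: str, observed: float, no_ai_baseline: float) -> float:
    """Return the velocity (points per sprint) to plan with for one work category.

    observed       -- recent AI-augmented velocity
    no_ai_baseline -- velocity measured without AI tools (hypothetical input)
    """
    if category == "novel":
        # New architectures, unfamiliar domains: discount observed velocity.
        return observed * (1 - NOVEL_DISCOUNT)
    if category == "maintenance":
        # Familiar patterns: plan close to observed velocity.
        return observed
    if category == "incident":
        # Production debugging: plan as if AI tools don't exist.
        return no_ai_baseline
    raise ValueError(f"unknown category: {category!r}")


if __name__ == "__main__":
    # Hypothetical team: 20 points/sprint with AI, 14 without.
    for category in ("novel", "maintenance", "incident"):
        v = planning_velocity(category, observed=20.0, no_ai_baseline=14.0)
        print(f"{category:12s} plan with {v:.1f} points/sprint")
```

With those hypothetical inputs, novel work is planned at roughly 13 points per sprint instead of 20, which is the gap executives need to see before the commitment is made rather than after.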

It’s uncomfortable explaining this to executives who see the velocity metrics and expect them to hold. But the alternative is missing commitments badly when reality hits.

The harsh truth

Here’s what keeps me up at night: We’re creating a generation of engineers who are incredibly productive within AI’s comfort zone but fragile outside it. The 17% skill retention drop you cited is just the beginning—I suspect it’s worse for engineers who’ve never worked without AI.

Some of our best engineers from 5 years ago could debug anything, read unfamiliar code, and solve novel problems. Many of our recent hires can ship features fast but crumble under pressure when the AI can’t help.

The security vulnerability increase (23.7% according to recent research) is another warning sign. Speed without understanding is dangerous, especially in production systems.

What I’m still figuring out

Two big questions I don’t have answers for:

  1. How do we maintain this dual assessment system long-term? It adds significant overhead, and managers resist it because it slows teams down.

  2. Should promotion criteria fundamentally change? Is “can work effectively with AI” more valuable than “deep fundamental understanding”? I don’t think so, but market pressure suggests otherwise.

Your quarterly “fundamentals check-ins” are a good start. I’d add: make incident response part of your capability assessment, and push for architecture discussions where AI can’t hide gaps.

The uncomfortable reality is that traditional engineering metrics are broken in the AI era. We need new frameworks—and leaders who are willing to acknowledge the dependency risk we’ve created.

The capacity planning challenge you both describe is real, but I’m even more worried about what this means for early-career engineers. We’re creating a generation gap that could be devastating for long-term team health.

The junior engineer crisis

I have a junior developer on my team—let’s call him Alex. Six months in, his metrics looked great. Shipped features on time, code passed reviews, velocity was solid. Leadership loved him.

Then we had a production incident. Critical service degrading, users affected, all hands on deck. Alex completely froze. Couldn’t read stack traces effectively, didn’t know where to start debugging, panicked when AI suggestions didn’t apply to our specific architecture.

It wasn’t his fault—he’d learned to work with AI, not without it. The 17% skill retention drop Keisha mentioned? For engineers who started their careers post-AI, I suspect it’s closer to 50%. They never built the fundamentals in the first place.

The mentorship breakdown

What keeps me up at night is how AI is destroying organic mentorship. The research mentioned a 45% productivity boost but increased isolation, and I’m seeing that isolation destroy team cohesion.

Before AI tools:

  • Junior engineers would get stuck and ask seniors for help
  • Seniors could see the thought process, identify knowledge gaps, teach fundamentals
  • Code reviews were learning opportunities
  • Pair programming transferred deep knowledge

With AI tools:

  • Juniors get “unstuck” by AI without understanding why
  • Seniors review finished code but can’t see the learning process
  • Code reviews become “does this work?” instead of “do you understand this?”
  • Pair programming is watching someone prompt AI

I can’t mentor effectively when I don’t see how my team members actually think through problems. And when I do see it (during incidents, architecture discussions), I’m shocked by the fundamental gaps.

What we’re trying

Michelle’s framework is similar to what we’ve implemented, but I’ve added some junior-focused elements:

1. “AI-Free Fridays” for skill building

  • Every Friday afternoon, juniors work on fundamentals without AI assistance
  • Focus on: reading complex code, debugging, algorithm analysis
  • Seniors available for pairing and mentoring
  • Management resists (“why slow down productivity?”), but this is non-negotiable

2. Mandatory pair programming for early-career engineers

  • At least 8 hours per week with AI tools disabled
  • Seniors get to see actual problem-solving ability
  • Juniors learn fundamentals they’d otherwise skip
  • Yes, it’s slower. Yes, it’s necessary.

3. Code walkthroughs where engineers explain AI-generated code

  • If you can’t explain why the AI suggested this approach, you don’t merge it
  • Forces engineers to understand rather than just accept
  • Reveals who’s learning and who’s just prompting

4. Incident response rotation for all engineers

  • Including juniors, with senior backup
  • Nothing reveals capability gaps faster than production pressure
  • AI tools are unreliable in novel failure scenarios

The cultural battle

Here’s my frustration: I’m fighting an uphill battle against velocity metrics. Executives see 25-30% productivity gains and ask why I’m “slowing the team down” with fundamentals training.

I had to show leadership the 30% coding speedup vs 8% delivery improvement data to get buy-in. Even then, the response was “optimize the pipeline” not “ensure our team can function without AI.”

The 23.7% security vulnerability increase should terrify every CTO, but it’s treated as a code review problem, not a fundamental understanding problem.

The diversity and inclusion angle

This particularly impacts first-generation engineers and underrepresented groups. Many are breaking into tech through bootcamps and non-traditional paths—which is fantastic. But if their entire learning experience is AI-augmented, they’re building on a shaky foundation.

I mentor Latino engineers through SHPE. The ones starting careers now face a different challenge than I did: they need to learn when to use AI and when to build skills without it. Many bootcamps and entry programs haven’t figured this out yet.

If we let AI mask skill gaps for early-career engineers, we’re setting up a diversity failure at senior levels in 5-10 years. These engineers won’t have the fundamentals to grow into leadership.

My questions for both of you

Keisha, Michelle—how do you balance this with business pressure for velocity? I’m constantly defending “unproductive” time spent on fundamentals.

And how do you assess potential for growth vs current capability? Some of my “AI-dependent” engineers are smart and could build strong fundamentals with the right environment. But if I only measure current output, I’ll promote the wrong people.

The long game

We’re playing a dangerous long game. The productivity gains are real and immediate. The skill erosion is real but delayed. By the time we realize we’ve created a generation of engineers who can’t function without AI assistance, it’ll be too late to fix easily.

I’d rather have an engineer who understands systems deeply and uses AI to accelerate than one who relies on AI to compensate for missing fundamentals. But the market right now rewards the latter.

We need to decide: are we building long-term engineering capability, or optimizing for short-term velocity? Because increasingly, these feel like mutually exclusive choices.

Reading this thread from a product perspective is… eye-opening and honestly a bit terrifying. This isn’t just an engineering problem—it’s fundamentally undermining how product and engineering collaborate on roadmap planning.

The planning nightmare

Here’s what’s happening on my side: Engineering committed to our Q2 roadmap based on recent velocity metrics that looked incredible. We’re talking about 25-30% faster feature delivery compared to last year. Leadership loved it. Sales promised these features to prospects. Customer success planned onboarding timelines.

Now we’re 6 weeks into Q2, and half the roadmap is at risk. Not because requirements changed or priorities shifted—but because the engineering team is struggling with challenges that “shouldn’t” be hard based on their recent performance.

When I ask what’s different, the answer is always some version of: “This is a novel architecture problem” or “The AI tools aren’t helping with this one.”

The trust breakdown

This is creating a serious trust problem between product and engineering. When engineering says “this will take 3 sprints,” I used to know what that meant. Now I have to ask:

  • Is this estimate based on AI-augmented velocity or baseline capability?
  • Have you done something similar before, or is this novel?
  • If the AI can’t help, what’s the real timeline?

These aren’t questions I should have to ask, but Keisha’s point about that 30% coding speedup only translating to 8% delivery improvement is exactly what I’m seeing in our metrics. The velocity boost is real until it isn’t.

The business impact

Let me make this concrete with real numbers from our last quarter:

Sprints 1-4 (familiar features, patterns AI knows well):

  • Planned: 12 story points
  • Delivered: 14 story points
  • Engineering looked amazing, velocity up 25%

Sprints 5-8 (new payment integration, unfamiliar architecture):

  • Planned: 12 story points (based on previous velocity)
  • Delivered: 6 story points
  • Massive miss, team struggled, blamed “underestimated complexity”

The problem? The complexity was normal for new integrations. The difference was AI couldn’t help as much, and the baseline capability wasn’t where our velocity suggested it should be.
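
One way to surface this split is to compute the delivered-to-planned ratio per work type instead of one blended number. The sketch below uses only the story-point figures quoted above; the tuple layout, the "familiar"/"novel" labels, and the function name are illustrative assumptions.

```python
from collections import defaultdict

# (sprints, work type, planned points, delivered points), as quoted in the post.
SPRINTS = [
    ("1-4", "familiar", 12, 14),
    ("5-8", "novel", 12, 6),
]


def delivery_ratio_by_type(sprints):
    """Return {work_type: delivered / planned}, summed over matching rows."""
    planned = defaultdict(float)
    delivered = defaultdict(float)
    for _, work_type, plan, done in sprints:
        planned[work_type] += plan
        delivered[work_type] += done
    return {t: delivered[t] / planned[t] for t in planned}


ratios = delivery_ratio_by_type(SPRINTS)
blended = sum(row[3] for row in SPRINTS) / sum(row[2] for row in SPRINTS)

for work_type, ratio in ratios.items():
    print(f"{work_type:9s} delivered/planned = {ratio:.2f}")
print(f"{'blended':9s} delivered/planned = {blended:.2f}")
```

The blended ratio (about 0.83) looks like an ordinary miss, while the per-type view shows familiar work running ahead of plan (about 1.17) and novel work landing at half of plan (0.50). Planning the second block from the first block's ratio is where the commitment went wrong.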

Cross-functional ripple effects

This isn’t just affecting engineering timelines. It’s cascading across the entire organization:

Sales:

  • Closed deals based on promised features that are now delayed
  • Credibility hit with prospects when we push delivery dates
  • Revenue impact from delayed launches

Customer Success:

  • Planned onboarding programs assuming features would ship
  • Had to redesign onboarding flows mid-quarter
  • Customer satisfaction risk from unmet expectations

Marketing:

  • Campaign timelines built around product launches
  • Launch delays affect demand gen plans and budget allocation
  • Market positioning based on capabilities we haven’t delivered

All because we planned capacity based on AI-augmented metrics that didn’t reflect actual team capability for novel work.

What I need from engineering leadership

I’m not blaming engineering—this is a shared problem we need to solve together. But as a product leader, here’s what I need:

1. Transparent capability assessment

  • Help me understand the difference between “AI can help with this” and “AI probably can’t help”
  • Be honest about baseline capability vs augmented performance
  • Flag when estimates assume AI assistance

2. Dual velocity tracking

  • Show me both: velocity on familiar work and velocity on novel work
  • Michelle’s “AI dependency factor” framework makes total sense
  • I need to plan differently for these two categories

3. Weekly calibration conversations

  • Regular sync on: Is the current sprint in AI’s comfort zone or not?
  • Early warning when work is harder than AI-augmented estimates suggested
  • Shared ownership of realistic roadmap planning

4. Risk-adjusted roadmaps

  • Build in buffer for “novel work” that AI can’t accelerate
  • Be explicit about which features have AI-velocity assumptions
  • Plan conservatively for architectural work

The ROI question nobody’s asking

If AI tools give us 30% faster coding but only 8% faster delivery, and create capacity planning uncertainty that risks entire quarter commitments—what’s the actual ROI?
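
A back-of-the-envelope way to read those two numbers together is Amdahl's law: if only the coding stage gets faster, the end-to-end gain is capped by how much of delivery time is coding in the first place. The sketch below assumes "30% faster coding" means a 1.3x speedup of the coding stage and "8% faster delivery" means a 1.08x end-to-end speedup; the thread does not define either figure precisely, so treat the result as illustrative.

```python
def coding_share(coding_speedup: float, delivery_speedup: float) -> float:
    """Fraction p of baseline delivery time spent coding, solved from Amdahl's law:

        delivery_speedup = 1 / ((1 - p) + p / coding_speedup)
    """
    if coding_speedup <= 1.0:
        raise ValueError("coding_speedup must be greater than 1")
    if not 1.0 <= delivery_speedup <= coding_speedup:
        raise ValueError("delivery_speedup must be between 1 and coding_speedup")
    return (1 - 1 / delivery_speedup) / (1 - 1 / coding_speedup)


def max_delivery_speedup(share: float) -> float:
    """Upper bound on end-to-end speedup if the coding stage took zero time."""
    if not 0.0 <= share < 1.0:
        raise ValueError("share must be in [0, 1)")
    return 1 / (1 - share)


p = coding_share(coding_speedup=1.30, delivery_speedup=1.08)
print(f"implied coding share of delivery time: {p:.0%}")
print(f"ceiling on delivery speedup: {max_delivery_speedup(p):.2f}x")
```

Under those assumptions, coding accounts for roughly a third of delivery time, and even infinitely fast coding would cap end-to-end speedup at about 1.47x. The remaining two thirds (review, testing, design, coordination) is where the ROI conversation has to happen.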

The license costs are trivial. The risk of over-committed roadmaps and missed business targets? That’s expensive.

I’m not suggesting we abandon AI tools—the productivity gains are real in the right contexts. But we need to be honest about where they help and where they create false confidence.

My ask to this community

For other product leaders or PMs:

  1. How are you adjusting roadmap planning to account for AI-augmented vs baseline engineering velocity?

  2. What questions do you ask engineering to understand if estimates are realistic or based on inflated AI-assisted performance?

  3. How do you communicate these dynamics to sales and leadership without throwing engineering under the bus?

For engineering leaders:

  1. How can product better partner with you on transparent capacity planning in the AI era?

  2. What signals should product watch for that indicate work is outside AI’s comfort zone?

The bigger strategic question

Luis’s point about long-term vs short-term trade-offs hits hard. From a product strategy perspective, I need engineering teams that can handle novel problems, pivot when needed, and build things AI hasn’t seen before.

If AI dependency is creating teams that are fast at familiar patterns but fragile with new challenges, that’s a strategic liability, especially in competitive markets where differentiation comes from doing things differently, not from shipping faster versions of what everyone else does.

We need to align on this across product and engineering leadership. The current approach—maximize AI-assisted velocity and hope it holds—is creating business risk that’s becoming impossible to ignore.

Coming at this from a design perspective, but wow—this entire thread is giving me chills because we’re seeing the exact same patterns with AI design tools. And it’s making me rethink everything about craft, learning, and what it means to actually understand your work.

The design parallel

I lead design systems, and about 8 months ago our team started using AI tools heavily—Figma AI, Midjourney for explorations, ChatGPT for component documentation, AI-assisted accessibility checks. Productivity went through the roof. Leadership was thrilled.

Then I asked a junior designer to explain why they chose a specific accessibility pattern for a form component. They couldn’t. The AI had suggested it, they’d implemented it, it passed automated checks, so they shipped it.

Turns out the pattern worked for screen readers but created a nightmare for keyboard navigation—something the AI didn’t catch and the designer didn’t understand deeply enough to spot.

Sound familiar? It’s Luis’s “Alex freezing during the production incident” but for design.

Speed without understanding is dangerous

That 23.7% increase in security vulnerabilities Michelle mentioned? In design, we’re seeing similar issues with accessibility, information architecture, and user experience patterns.

What I’m seeing:

  • Designers who can generate beautiful components quickly but can’t explain the design decisions
  • Perfect visual execution with no understanding of the underlying UX principles
  • AI-generated solutions that work in common cases but break in edge cases
  • Massive productivity on familiar design patterns, paralysis on novel UX challenges

The research about AI coding tools is basically describing what’s happening in design too: fast output, shallow thinking, skill erosion.

The craft vs speed trade-off

Here’s why this really scares me: I founded a startup that failed. The hardest lesson I learned was that understanding why things work matters more than making things that work.

AI tools let you skip the “why” and jump straight to “what.” And when you do that consistently, you never build the judgment needed for complex decisions.

My startup failed because:

  • We shipped features users asked for without understanding the underlying problems
  • We optimized for velocity without understanding product-market fit
  • We looked productive based on output metrics while missing fundamental strategic issues

Sound familiar? It’s exactly what David is describing with roadmap planning based on AI-augmented velocity that doesn’t reflect true capability.

The learning gap

When I learned design, I spent hours studying why certain patterns work, analyzing competitors, understanding cognitive load and visual hierarchy. It was slow, sometimes frustrating, but it built deep understanding.

Today’s junior designers can prompt an AI and get a component in minutes. They skip the learning process entirely. And when they face a novel UX challenge—something AI hasn’t been trained on—they’re lost.

Luis’s point about mentorship breakdown hits hard. I can’t mentor effectively when I don’t see the thought process. Design reviews become “does this look right?” instead of “do you understand the principles?”

Our “manual design hours” experiment

Similar to Luis’s “AI-Free Fridays,” we implemented “manual design hours”:

Requirements:

  • 4 hours per week, junior designers work without AI assistance
  • Focus on fundamentals: typography, layout, accessibility principles, user research
  • Senior designers available for mentoring and critiques
  • They must explain their design decisions, not just produce artifacts

Results (painfully honest):

  • About 40% of our “high performers” struggle with basic design principles
  • Many can’t articulate why a design works beyond “the AI suggested it”
  • Some have never done proper user research—they just iterate on AI suggestions
  • Gap between AI-assisted output quality and fundamental understanding is shocking

Management pushback was intense: “Why slow them down?” But after one major redesign failed user testing because nobody understood the actual user needs, they got it.

Questions for the engineers

Reading this thread, I have questions for the engineering side:

1. How do you maintain code craftsmanship when AI makes it easy to skip steps?

In design, we talk a lot about “craft”—the care and understanding that goes into good work. How do you preserve that when AI tools optimize for speed over understanding?

2. Is there an engineering equivalent of “manual design hours”?

Would something like dedicated time for fundamentals work in engineering, or does the business pressure make it impossible?

3. How do you balance tool adoption with skill development?

We want juniors to learn modern tools (including AI), but not at the expense of fundamentals. How do you navigate this?

The uncomfortable truth about productivity

David’s ROI question is the right one: if AI gives 30% faster coding but only 8% faster delivery, and creates planning uncertainty and skill gaps—what’s the real value?

I’d add: if AI creates a generation of practitioners who are incredibly productive within the AI’s comfort zone but fragile outside it, are we actually building valuable teams or creating systemic risk?

From my failed startup experience: short-term productivity gains that sacrifice long-term capability are a trap. We scaled fast, shipped features quickly, and went out of business because we didn’t understand our market deeply enough.

The craft perspective

Here’s my maybe-controversial take: real craft is understanding the principles so deeply that you know when to break the rules.

AI tools are amazing at following rules. They’re terrible at knowing when to break them. If we train a generation of designers and engineers who can follow AI suggestions but can’t think independently, we’re creating teams that can execute the predictable but fail at innovation.

Michelle mentioned engineers who “can ship features fast but crumble under pressure when the AI can’t help.” In design, I’d describe it as: designers who can make things look good but can’t solve hard UX problems.

What I’m trying to preserve

I require every designer on my team to be able to:

  1. Explain their decisions - if you can’t articulate why, you don’t merge/ship it
  2. Work without AI periodically - maintain baseline capability
  3. Handle novel challenges - prove you can think independently when patterns don’t exist
  4. Understand accessibility deeply - not just “it passes automated checks”
  5. Do actual user research - AI can’t tell you what your specific users need

It’s slower. It’s harder to defend to leadership. But Keisha’s question about capacity planning resonates: I need to know my team’s actual capabilities, not their AI-augmented performance.

My questions for this community

For anyone managing creative/technical teams:

  1. How do you preserve craft and deep understanding in an AI-augmented workflow?

  2. What does “AI literacy” actually mean - is it prompting skills or knowing when NOT to use AI?

  3. How do you assess growth potential when current output is AI-augmented?

For Luis specifically, since you mentioned diversity and inclusion: I’m worried AI tools are creating a false sense of capability for early-career folks from non-traditional backgrounds. They look productive initially, but hit a ceiling when fundamentals matter. How do we support them without setting them up for future failure?

The long game (from someone who learned the hard way)

My startup failed because we optimized for speed over understanding. We shipped fast, looked productive, and missed the fundamental strategic insights that would have saved us.

This AI capability gap discussion feels eerily similar. We’re optimizing for velocity metrics while potentially missing fundamental capability building that matters for long-term success.

I’d rather have a designer (or engineer) who understands principles deeply and uses AI to accelerate than one who relies on AI to compensate for missing fundamentals. The former can handle anything. The latter is fragile.

Luis is right: we’re choosing between long-term capability and short-term velocity. And from painful experience, I know which one actually matters when things get hard.