The AI Adoption Cliff: Why Power Users Disable AI Features First

Tian Pan · Software Engineer · 8 min read

A study run in early 2025 by METR asked 16 experienced open-source developers to complete real GitHub issues using their preferred AI coding tools. The result: they took 19% longer than the control group working without AI. More striking — those same developers estimated they'd been 20% faster. The perception gap was nearly 40 percentage points.

This isn't an isolated finding. Across companies, across tools, and across role levels, a consistent pattern is emerging: the engineers most capable of evaluating AI are also the ones most likely to turn it off.

The standard narrative around AI tools is an adoption curve — skeptics become converts as the tooling matures. The AI adoption cliff is the opposite: power users arrive first, push hardest, and disengage fastest. Understanding why matters for both the engineers experiencing it and the teams building tools for them.

The Productivity Paradox in Practice

The METR result was initially puzzling. These weren't AI skeptics — they were developers from major open-source projects with meaningful context on their codebases. Yet the slowdown was real and consistent across the group, even as participants reported feeling more productive.

The researchers identified where the time went:

  • Context switching between AI tool interfaces and actual codebase navigation
  • Reading, verifying, and correcting AI-generated code (which looked plausible but often introduced subtle regressions)
  • Prompt engineering overhead — explaining enough context to get usable output
  • Post-generation cleanup: reformatting, removing duplication, adding documentation

Only one developer in the study achieved the promised productivity gain — a 38% speedup — and they had logged over 50 hours with Cursor specifically. The learning curve to get value from AI in complex, long-lived codebases is measured in dozens of hours, not days.

The PR review data tells a similar story from a different angle. In studies of teams using AI coding assistants, developers completed 21% more tasks — but PR review time increased by 91%. Throughput went up; actual shipping speed didn't. The constraint moved from writing code to the human review step that experienced engineers can't skip.

Why Experts Opt Out First

Novices and experts use AI tools differently, and the tools are mostly optimized for novices.

For a junior developer, AI autocomplete is additive. It fills gaps in knowledge, provides patterns they haven't memorized, and accelerates work that would otherwise require documentation diving. The cost of incorrect output is bounded — they're not in a position to make far-reaching architectural decisions anyway.

For a senior developer, the calculus inverts. They already have the pattern library. They already know the approach. What they need isn't generation — it's execution. And execution in a complex codebase requires context that's hard to transfer to an AI in a prompt.

The expert's additional burden is verification. Every AI output creates a review obligation. Experts have the taste to know when something feels wrong and the paranoia to check anyway. That verification tax — applied to every suggestion, every completion, every generated test — adds up.

One senior developer who documented building 150,000 lines of AI-generated code found himself running git reset repeatedly to discard entire sessions. The code was syntactically correct, even functionally plausible — but architecturally incoherent. He had delegated the reasoning, not just the typing.

Skill Atrophy Is the Longer-Term Risk

Beyond productivity, experienced engineers worry about something harder to measure: what they're losing by not doing things themselves.

Anthropic's own research on AI assistance and coding skills found a striking gap. In a randomized controlled trial, developers using AI scored 50% on comprehension quizzes versus 67% for those writing code by hand — nearly a two-letter-grade difference. The largest gap appeared in debugging: identifying when code is wrong and why.

This is a specific and important skill. Debugging requires building a mental model of execution, holding intermediate state in your head, reasoning about failure modes. It's exactly the kind of reasoning that AI offloads — and exactly the kind that atrophies when you stop practicing it.

The automation paradox compounds this over time. When an AI tool is correct 99% of the time, humans stop checking carefully. The skill to catch the 1% — which in software often takes the form of output that is superficially plausible but semantically wrong — erodes from disuse. Senior developers understand this dynamic, and many decide the risk isn't worth it.

Automation Bias Doesn't Spare Experts

A common assumption is that expertise protects against automation bias — that experienced practitioners will catch AI errors that novices miss. The research is more complicated.

In studies of medical AI decision support, even very experienced radiologists and pathologists showed measurable automation bias. They accepted AI recommendations that matched their priors while resisting ones that contradicted their intuitions. Expertise didn't eliminate bias; it filtered it through prior judgment. A confident wrong AI recommendation that happens to align with a plausible-looking case is still dangerous.

For engineers, this translates to a specific failure mode: the AI-generated code that matches your first instinct about what the solution should look like, but contains a subtle bug in the edge case you didn't think to consider. Expertise narrows the search space for AI error, but it doesn't close it.

What Sophisticated Users Actually Need

The tools that retain expert users share several properties that most AI features lack by default.

Transparency about uncertainty. Expert users don't want AI to project confidence it doesn't have. They want signals about where the model is extrapolating, where the context is thin, and where the output should receive closer review. A completion that says "I'm guessing at the intended behavior here" is more useful to a senior engineer than one that confidently asserts the wrong answer.
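A minimal sketch of what surfacing uncertainty could look like. The `Completion` shape, field names, and thresholds are all hypothetical, not any real tool's API; the point is that low-confidence or thin-context output gets an explicit flag instead of uniform confidence.

```python
from dataclasses import dataclass

# Hypothetical shape for a completion that carries its own uncertainty
# signals instead of presenting every output with equal confidence.
@dataclass
class Completion:
    text: str
    confidence: float   # model's self-reported confidence, 0.0 to 1.0
    context_lines: int  # how much surrounding code the model actually saw

def review_flag(c: Completion, min_confidence: float = 0.8,
                min_context: int = 30) -> str:
    """Label a completion so thin-context or low-confidence output
    is explicitly routed to closer human review."""
    if c.confidence < min_confidence:
        return "guessing: verify intended behavior"
    if c.context_lines < min_context:
        return "thin context: verify against surrounding code"
    return "ok"

# A low-confidence completion is flagged rather than silently accepted.
print(review_flag(Completion("return cache.get(key)", 0.55, 120)))
```

The mechanism is trivial; the design choice is that the flag travels with the suggestion, so the reviewing engineer sees "I'm guessing here" at the point of acceptance.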

Meaningful control over autonomy levels. The binary of "AI on / AI off" isn't useful. Experts want to calibrate — high autonomy for boilerplate and test scaffolding, low autonomy for core business logic. Tools that offer an autonomy dial, where users set thresholds by task type, retain expert users better than all-or-nothing defaults.
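An autonomy dial like the one described could be as simple as a per-task-type policy table. The task names and autonomy levels below are illustrative assumptions, not any shipping tool's configuration:

```python
# Hypothetical "autonomy dial": per-task-type levels instead of a single
# on/off switch. Task names and levels are illustrative.
AUTONOMY = {
    "boilerplate": "auto_apply",    # high autonomy: apply without prompting
    "tests": "auto_apply",
    "refactor": "suggest",          # medium: show a diff, wait for approval
    "business_logic": "ask_first",  # low: explain intent before generating
}

def action_for(task_type: str) -> str:
    # Unknown task types default to the most conservative level.
    return AUTONOMY.get(task_type, "ask_first")

print(action_for("tests"))           # high-autonomy path
print(action_for("business_logic"))  # low-autonomy path
```

Defaulting unknown work to the most conservative level matters here: the expert failure mode is high autonomy applied to a task the tool misclassified.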

Audit trails and undo. The expert's relationship with AI is adversarial in the productive sense — they're looking for errors. A chronological log of AI actions with per-step reversal enables exactly that kind of engaged skepticism. Without it, accepting an AI suggestion feels like trusting a black box, and experienced engineers distrust black boxes.
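A chronological log with per-step reversal can be sketched in a few lines: record the prior state alongside each AI action, and undoing a step is just restoring it. Everything here, from the class name down, is a hypothetical illustration, and a real implementation would have to handle overlapping edits to the same file.

```python
# Minimal sketch of an audit trail: each AI action records the file state
# it replaced, so any single step can be reversed. Names are hypothetical.
class AuditLog:
    def __init__(self):
        self.entries = []  # chronological list of (description, path, before)

    def record(self, description: str, path: str, files: dict, new_text: str):
        """Apply an AI edit and log enough to undo it later."""
        self.entries.append((description, path, files.get(path)))
        files[path] = new_text

    def undo(self, step: int, files: dict):
        """Revert one logged step by restoring the prior content."""
        _, path, before = self.entries[step]
        if before is None:
            files.pop(path, None)  # the file did not exist before this step
        else:
            files[path] = before

files = {"app.py": "def main(): ..."}
log = AuditLog()
log.record("AI rewrote main()", "app.py", files, "def main(): run()")
log.undo(0, files)  # app.py is back to its original content
```

The log doubles as the review surface: the expert can scan what the AI did, in order, and discard exactly the steps that fail inspection.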

Explainable rationale. When AI makes a choice — selecting one approach over another, omitting a consideration, reordering logic — experts want to know why. "I chose X because your codebase uses pattern Y consistently" is reviewable. "Here is code" is not.

Privacy and data controls. Expert developers are the ones who read the terms of service. When an AI tool changes its data training defaults — as GitHub did in April 2025 with Copilot interaction data — experienced engineers are the ones who notice, flag it internally, and route around it. Transparent, per-repository data controls aren't a nice-to-have for power users; they're a prerequisite.

The Design Lesson

There's a useful frame from Notion's AI adoption story. Notion AI reached 90% adoption among users in its first week, not because the AI was dramatically better than alternatives but because of how it was integrated. The AI commands lived inside the familiar slash menu — not a separate tool, not a plugin, but woven into an interface experts already had muscle memory for. Customization was deep enough that professionals could make it fit their workflows rather than adapting their workflows to fit the AI.

The products that fail with expert users tend to optimize for the demo experience: fast, impressive output that works on clean inputs. The products that succeed with them optimize for the collaboration experience: incremental, controllable, explainable, reversible.

Experts have more to lose from bad AI outputs than novices do. They have more context to verify, more systems depending on their judgment, more accumulated taste that the AI cannot replicate. The tools that acknowledge this — that treat expert engagement as adversarial review rather than frictionless acceptance — end up with the users who do the most interesting work.

Moving Forward

The AI adoption cliff isn't inevitable. It's a product failure mode, not a fundamental limit of the technology.

The path through it is designing AI features for the most skeptical, capable user in the room rather than the most credulous one. That means transparency about confidence, meaningful controls over scope, audit mechanisms for every action, and explicit privacy guarantees. It means treating expert pushback as signal, not resistance to change.

The 19% slowdown from the METR study isn't an argument against AI tools. It's an argument for building them differently — where the benefit flows to the people most equipped to extract it, rather than to the people least likely to notice the cost.
