Positive AI Sentiment Dropped from 70% to 60% - What Happened?

Stack Overflow’s 2025 Developer Survey just dropped, and if you care about developer tools — whether you’re building them, buying them, or using them — the headline number should give you pause: positive sentiment toward AI coding tools has fallen from over 70% in 2023 to just 60% in 2025.

Let me break down the data, because, as a data scientist, I find the numbers tell a more nuanced story than the headline suggests. This isn’t developers rejecting AI — it’s developers getting more realistic about what AI can and can’t do.

The Trust Paradox: Using More, Trusting Less

The adoption numbers are unambiguous: 84% of developers now use or plan to use AI tools in their workflow, up from 76% in 2024, and 51% of professional developers use them daily. By any measure, adoption is accelerating.

But trust is moving in the opposite direction:

Metric                      2024    2025    Change
Positive sentiment          70%+    60%     -10pp
Trust AI accuracy           40%     29%     -11pp
Actively distrust AI        31%     46%     +15pp
“Highly trust” AI output    ~5%     3%      -2pp

That last number is the most telling. Only 3% of developers highly trust what AI tools produce. And among experienced developers (10+ years), the “highly trust” rate drops to 2.6%, while the “highly distrust” rate is 20%.

This isn’t luddism. The people who distrust AI the most are the ones who’ve used it the most and have the deepest understanding of code quality.

The “Almost Right” Problem

The survey identified the #1 developer frustration with AI: 66% cite “solutions that are almost right, but not quite.” Another 45% say debugging AI-generated code is more time-consuming than writing it from scratch would have been.

This matches what I see at Anthropic. The “almost right” problem is arguably harder than “obviously wrong.” When AI generates code that looks correct, passes a cursory review, and maybe even works in simple test cases — but has subtle bugs in edge cases, incorrect assumptions about data types, or security issues that only manifest under load — the time cost of finding and fixing those issues can exceed the time saved by generating the code in the first place.
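
To make this concrete, here is a hypothetical illustration of the “almost right” failure mode (the function and scenario are invented, not from the survey): code that passes the obvious test and fails only at a boundary.

```python
# Hypothetical "almost right" code: a pagination helper that looks correct
# and passes the obvious test case, but silently drops the final partial
# page because it uses floor division.
def total_pages_naive(total_items: int, page_size: int) -> int:
    return total_items // page_size  # wrong when total_items % page_size != 0

# Corrected version: ceiling division accounts for a final partial page.
def total_pages(total_items: int, page_size: int) -> int:
    return -(-total_items // page_size)  # ceil(a / b) using integer math

assert total_pages_naive(100, 10) == 10  # the cursory-review test: passes
assert total_pages_naive(101, 10) == 10  # edge case: should be 11
assert total_pages(101, 10) == 11        # the fix handles the partial page
```

The naive version survives a quick review and simple tests; the cost only shows up later, as a missing last page in production.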

A Clutch survey of 800 developers found that 59% use AI-generated code they don’t fully understand. That’s not a tooling problem — that’s a professional liability.

The METR Study: Feeling Fast vs Being Fast

This is the data point that should worry the entire industry. A rigorous METR study from July 2025 tracked experienced open-source developers using AI tools on real tasks. The results:

  • Developers estimated AI would make them 24% faster
  • After the experiment, developers believed they had been 20% faster
  • Actual measured performance: 19% slower

The perception gap is massive. Developers feel like AI is helping even when it measurably isn’t. This has significant implications for any organization using “developer satisfaction” or “perceived productivity” as a metric for AI tool ROI.

It also helps explain a counterintuitive finding: senior developers (10+ years) ship 2.5x more AI code than juniors. Not because AI is better for experts — but because experts have the judgment to catch AI mistakes before they ship. AI is a ceiling raiser for people who already know what correct looks like, not a floor raiser for those learning.

What Developers Actually Value

When asked what they value in development tools, developers ranked:

  • Reputation for quality (ranked first)
  • Robust and complete API (ranked second)
  • AI integration (ranked second to last)

Let that sink in. The feature vendors are pouring billions into is the one developers rank near the bottom of their selection criteria. Developers want tools that work reliably, not tools that generate code they then have to verify.

The Numbers That Aren’t Talked About

  • 72% of developers are NOT vibe coding. The narrative that everyone is prompting their way through development doesn’t match the data.
  • 52% either don’t use AI agents or stick to simpler tools. The agentic future is not here yet for most developers.
  • 68% of developers expect employers will mandate AI proficiency. Adoption is increasingly employer-driven, not developer-driven. That’s a significant difference in motivation.

What This Means

I don’t read this data as “AI tools are failing.” I read it as the market maturing. The hype cycle is normalizing into something more sustainable:

  1. The 60% positive sentiment is probably the real baseline. The 70%+ was novelty-inflated.
  2. Trust will follow accuracy improvements, not marketing. When tools hallucinate 42% of the time, trust at 29% is rational.
  3. The best teams are building verification into their workflow, not relying on AI output blindly.
  4. Developer experience matters more than AI integration. Teams choosing tools based on reliability over AI features will ship more dependable software.

The question for everyone building, buying, or mandating AI tools: are you measuring actual productivity impact, or are you relying on developer perception? Because the METR study suggests those are very different things.

What’s your team’s experience? Has your trust in AI coding tools changed over the past year?

Rachel, the “almost right” problem is the best description of my daily frustration with AI tools. I want to share what this looks like from the trenches, because I think my experience maps pretty closely to that 60% sentiment number.

My Journey from Enthusiast to Skeptic

A year ago, I was an AI coding tool evangelist. I wrote blog posts about how Copilot doubled my productivity. I told everyone on my team to use it. I genuinely believed we were in a paradigm shift.

Today? I still use AI tools every day, but my relationship with them has fundamentally changed. Here’s what happened:

Months 1-3: The honeymoon. Everything AI generated felt like magic. Boilerplate code, test scaffolding, regex patterns — all faster with AI. I estimated I was 30-40% more productive.

Months 4-8: The plateau. I started noticing that AI suggestions were subtly wrong more often than I initially realized. It would import a package that didn’t exist. It would use a deprecated API. It would write a function that passed basic tests but had an off-by-one error in a loop condition. I was spending more time reviewing AI code than I expected.
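
A minimal, invented sketch of that off-by-one pattern: the loop condition excludes the final element, so a trivial test passes while the intended inclusive behavior fails.

```python
# Hypothetical off-by-one in a loop condition: intended to sum lo..hi
# inclusive, but range() excludes its upper bound.
def sum_inclusive_buggy(lo: int, hi: int) -> int:
    total = 0
    for i in range(lo, hi):  # bug: stops at hi - 1
        total += i
    return total

def sum_inclusive(lo: int, hi: int) -> int:
    return sum(range(lo, hi + 1))  # includes hi as intended

assert sum_inclusive_buggy(0, 0) == 0  # a basic test that happens to pass
assert sum_inclusive_buggy(1, 3) == 3  # bug: 1 + 2, missing the 3
assert sum_inclusive(1, 3) == 6        # 1 + 2 + 3
```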

Months 9-12: The reckoning. I shipped a bug to production that traced back to AI-generated code I hadn’t reviewed carefully enough. It was a race condition in an async handler — the kind of thing that looks fine in a code review but fails under concurrent load. After that, I became much more selective about when I use AI and much more rigorous about reviewing what it produces.
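
That class of bug, reconstructed as a hypothetical Python asyncio sketch (not the actual incident), is a read-modify-write on shared state with an await in the middle, so concurrent handlers overwrite each other’s updates:

```python
import asyncio

counter = 0  # shared state touched by many concurrent handlers

async def handler_racy() -> None:
    """Read-modify-write with an await in the middle: a lost-update race."""
    global counter
    current = counter
    await asyncio.sleep(0)   # stands in for an awaited DB or network call
    counter = current + 1    # writes back a stale value

async def run_racy(n: int = 100) -> int:
    global counter
    counter = 0
    await asyncio.gather(*(handler_racy() for _ in range(n)))
    return counter

async def run_safe(n: int = 100) -> int:
    global counter
    counter = 0
    lock = asyncio.Lock()  # serialize the critical section

    async def handler_safe() -> None:
        global counter
        async with lock:
            current = counter
            await asyncio.sleep(0)
            counter = current + 1

    await asyncio.gather(*(handler_safe() for _ in range(n)))
    return counter

racy = asyncio.run(run_racy())
safe = asyncio.run(run_safe())
assert racy < 100 and safe == 100  # lost updates vs. serialized updates
```

Both versions read identically in a diff view; only a concurrency test or production load exposes the difference.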

Where AI Actually Helps Me (Still)

I’m not anti-AI. But I’ve narrowed my use to tasks where the “almost right” failure mode is cheap:

  • Boilerplate and scaffolding — generating a new API endpoint skeleton, test file structure, config templates
  • Documentation — first drafts of JSDoc comments, README sections, API docs
  • Regex and one-liners — pattern matching, data transformation snippets
  • Learning — “how does this library work?” as a starting point for exploration

Where I’ve stopped using AI almost entirely:

  • Business logic — too many subtle correctness requirements
  • Database queries — getting JOINs and edge cases right matters too much
  • Security-sensitive code — authentication, authorization, input validation
  • Complex state management — AI doesn’t understand my app’s state model

The 59% Number Terrifies Me

The statistic that 59% of developers use AI code they don’t fully understand — that keeps me up at night. Not because of AI, but because of what it says about the review culture we’re building.

If I submit a PR and tell my reviewer “I wrote this,” they review it with one level of scrutiny. If I say “Copilot generated this and I reviewed it,” they review it with maybe the same scrutiny. But if neither of us fully understands what the code does because it was AI-generated and only superficially reviewed, we’ve created a collective accountability gap.

I’ve started a practice on my team: any PR that includes AI-generated code gets a comment tag [ai-assisted] so reviewers know to apply extra scrutiny. It’s a small thing, but it’s changed how we think about review responsibility.

The Real Question

Rachel’s framing is right — this is maturity, not rejection. But I’d add: the 60% positive sentiment number will stay flat or decline further unless the tools get meaningfully more accurate. Developer trust is earned in months and lost in one bad production incident. The tools need to be right more often, not just fast more often.

Rachel, I want to push back on one framing here, because I think the product side of this conversation is being missed.

The Business Case Is Still Strong — Even at 60% Sentiment

From a VP of Product perspective, I care about outcomes, not sentiment scores. If 84% of developers use or plan to use AI tools and 60% view them positively, that’s a product with 84% adoption and a 60% satisfaction rate. In the SaaS world, most products would kill for those numbers.

The question isn’t whether developers like AI tools — it’s whether teams with AI tools ship better products than teams without them. And on that metric, the evidence is still directionally positive, even if the effect size is smaller than the hype suggested.

At my company, we rolled out Copilot Business to all engineers six months ago. Here’s what we actually measured:

  • PR volume: Up ~15% (more code being written and reviewed)
  • Time to first commit for new hires: Down 22% (AI helps with onboarding)
  • Bug rate per 1,000 LOC: Roughly flat (not worse, not better)
  • Developer satisfaction survey: Down from 8.1 to 7.4 over 6 months

That last number maps to Rachel’s thesis. Developers started enthusiastic and became more measured. But the first three numbers tell me the tool is net positive for the business, even if individual developers are frustrated.

The Employer-Driven Adoption Is Rational

Rachel flagged that 68% of developers expect employers to mandate AI proficiency. I know that sounds dystopian from a developer perspective, but from a business perspective, it’s the same story as every previous productivity tool mandate.

Companies mandated version control. Companies mandated CI/CD. Companies mandated code review. In every case, individual developers initially resisted, and in every case, the industry standardized because the aggregate benefits were real even when individual experiences were mixed.

AI coding tools are following the same adoption curve. The question is whether the tooling improves fast enough to convert the reluctant 40% into genuine advocates, or whether it plateaus at “mandated but resented.”

Where I Disagree: The “AI Integration Ranked Last” Argument

When Rachel says developers ranked “AI integration” second to last for tool selection criteria, I think that’s misleading. Developers also ranked “nice UI” low on every tool survey for decades — while consistently choosing tools with better UIs. Stated preferences and revealed preferences diverge.

What I see in our engineering org: when we gave teams a choice between two similar tools — one with strong AI integration and one without — 78% chose the AI-integrated option. They just don’t want “AI” to be the primary selling point. They want a good tool that also has AI, not an AI product that happens to be a tool.

The Real Risk for Product Leaders

The risk I’m watching isn’t the 60% sentiment number. It’s the METR study’s perception gap. If developers feel more productive but aren’t measurably faster, and I’m making investment decisions based on developer perception surveys, I’m optimizing for a phantom metric.

We’ve started measuring AI tool ROI differently:

  1. Code review turnaround time (has it changed?)
  2. Incident rate (are AI-assisted projects more or less stable?)
  3. Time to feature completion (end-to-end, not just coding time)
  4. Customer-facing bug reports (ultimate quality signal)

So far, the data is noisy. Some metrics improved, some didn’t. But at least we’re measuring real outcomes instead of vibes.

This thread is hitting on something I’ve been struggling with as a VP of Engineering: the gap between what leadership wants to believe about AI tools and what the data actually shows.

The Leadership Pressure Is Real

When Satya Nadella says 20-30% of Microsoft’s code is AI-generated, my CEO sends me that article and asks why we aren’t there yet. When Gartner says AI tools improve developer productivity by 30%, my board asks me why our velocity metrics haven’t moved. When a competitor claims their team of 10 ships like a team of 40 thanks to AI, my investors want to know our “AI strategy.”

I’m scaling an engineering org from 25 to 80+ engineers. Every new headcount request now comes with the question: “Have you considered whether AI can do this instead?” The pressure to show AI-driven efficiency gains is enormous, and the METR study showing a 19% slowdown is not the narrative anyone in my leadership chain wants to hear.

What I’m Actually Seeing in My Org

We have about 60 engineers using AI tools across multiple teams. Here’s my honest assessment:

The top 20% of our engineers are genuinely more productive with AI. They use it selectively, they verify everything, and they’ve developed an intuition for when AI suggestions are likely to be right vs wrong. For them, the tools are a net positive.

The middle 60% are roughly neutral. They use AI tools, they sometimes get a speed boost, they sometimes waste time debugging bad suggestions. The net effect is close to zero, with maybe a slight positive for boilerplate tasks.

The bottom 20% are measurably worse. These are mostly junior engineers who accept AI suggestions without sufficient review, introduce subtle bugs, and create technical debt that senior engineers have to clean up. The review burden on the team has increased, not decreased.

The Skill Erosion Problem Nobody Wants to Discuss

Rachel mentioned that senior devs ship 2.5x more AI code than juniors. I want to unpack why this terrifies me as a leader building the next generation of engineers.

We’re in a hiring crisis for junior developers (67% junior hiring decline). The juniors we do hire are entering an environment where AI handles the “practice reps” they used to learn from. At my company, I’ve noticed:

  • Junior engineers are less comfortable reading code they didn’t write (because AI wrote so much of their codebase)
  • They’re weaker at debugging because they’ve had fewer opportunities to write buggy code and trace the problems
  • They struggle with system design because AI handles component-level implementation but can’t teach architectural thinking

I’ve started mandating “AI-free weeks” for our junior engineers — periods where they write everything from scratch. It’s unpopular, it’s slower, and my VP of Product thinks I’m being a luddite. But if we don’t invest in building foundational skills, we won’t have the senior engineers who can actually use AI effectively in 5 years.

My Framework for AI Tool Investment

David’s right that from a business perspective, 84% adoption with 60% satisfaction is strong. But I’m making a different calculation:

  1. AI tools are table stakes for recruiting. Engineers expect them. Not offering Copilot or Cursor is a competitive disadvantage in hiring, regardless of their productivity impact.
  2. The ROI is team-specific, not org-wide. Some teams see 20%+ gains. Others see net negative. Blanket mandates don’t work — you need team-level enablement.
  3. Investment should go to verification, not generation. The tools that help us validate AI code (better testing, static analysis, AI-powered code review) are more valuable than the tools that generate more code faster.
  4. Training budgets need to shift. We’ve spent $40K on AI tool licenses this year. We’ve spent $0 on training engineers to use them effectively. That ratio is wrong.

The Number I Watch

The metric I care about most isn’t sentiment or adoption — it’s escaped defect rate. If AI tools are increasing the number of bugs that make it past code review and into production, I don’t care how fast we’re moving. Speed without quality is just creating future incidents.
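
As a sketch of what tracking that looks like (the function and example counts are illustrative, not from any particular tool or team):

```python
def escaped_defect_rate(escaped_to_production: int,
                        caught_before_release: int) -> float:
    """Fraction of all found defects that slipped past review and testing."""
    total = escaped_to_production + caught_before_release
    return escaped_to_production / total if total else 0.0

# Example: 6 production bugs vs 94 caught in review/CI over a quarter.
assert escaped_defect_rate(6, 94) == 0.06
```

Tracked per quarter and segmented by AI-assisted vs hand-written changes, this is the signal that tells you whether speed is coming at the cost of quality.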

So far, our escaped defect rate is flat. Not worse, not better. I consider that a temporary win. But if it starts climbing, AI tool mandates come off the table immediately, regardless of what the CEO read in a Gartner report.