Last April, Satya Nadella said at LlamaCon that “maybe 20%, 30% of the code inside our repos today” was written by AI. Sundar Pichai made a similar claim for Google. Zuckerberg predicted half of Meta’s development would be AI-driven within a year. Then Anthropic’s CEO said 90% of all code would be AI-written within six months.
I manage 40+ engineers at a Fortune 500 financial services company. When I heard these numbers, my first reaction wasn’t excitement — it was: how are they even measuring this?
The Measurement Problem Nobody Talks About
There is no reliable, standardized way to measure the percentage of AI-generated code in a repository. Nadella’s actual language was peppered with “maybe,” “probably,” “something like” — not the confident declarations the headlines portrayed.
Think about what “AI-generated” even means:
- Code that Copilot suggested and a developer accepted verbatim?
- Code that an AI drafted but a developer substantially modified?
- Code where a developer asked ChatGPT for an approach, then wrote it themselves?
- Auto-generated boilerplate from AI-powered scaffolding tools?
At my org, we tried to track this. We instrumented our Copilot deployment to log acceptance rates. Our numbers: roughly 18-22% of committed code originated from AI suggestions across our teams. But that number varies wildly:
- Python/TypeScript teams: 28-35% AI-originated
- Java enterprise services: 15-20%
- Legacy C++ systems: Under 8%
- Infrastructure-as-code: 40%+ (Terraform, CloudFormation)
The language disparity matches what Nadella acknowledged — more progress in Python, less in C++. But even our “high” numbers come with a massive asterisk: most of that code was modified after acceptance.
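Our tracking pipeline is internal, but the core calculation behind numbers like these is simple. Here's a minimal sketch in Python, assuming a hypothetical export of per-commit records where lines traced to accepted AI suggestions have already been attributed — the field names (`language`, `lines_total`, `lines_from_ai`) are illustrative, not Copilot's actual telemetry schema:

```python
# Hypothetical sketch: share of committed lines that originated from
# accepted AI suggestions, broken down by language. The record fields
# are illustrative, not a real Copilot telemetry schema.
from collections import defaultdict

def ai_share_by_language(commits):
    """commits: iterable of dicts with 'language', 'lines_total',
    and 'lines_from_ai' (lines traced to accepted suggestions)."""
    totals = defaultdict(lambda: [0, 0])  # language -> [ai_lines, all_lines]
    for c in commits:
        totals[c["language"]][0] += c["lines_from_ai"]
        totals[c["language"]][1] += c["lines_total"]
    return {
        lang: round(100 * ai / total, 1)
        for lang, (ai, total) in totals.items() if total
    }

sample = [
    {"language": "python", "lines_total": 200, "lines_from_ai": 60},
    {"language": "python", "lines_total": 100, "lines_from_ai": 30},
    {"language": "cpp", "lines_total": 300, "lines_from_ai": 20},
]
print(ai_share_by_language(sample))  # {'python': 30.0, 'cpp': 6.7}
```

The hard part isn't this arithmetic — it's the attribution step that produces `lines_from_ai`, which is exactly where "accepted verbatim" versus "substantially modified" gets blurry.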
What the Industry Data Actually Shows
The aggregate statistics are striking but need careful reading:
- 41% of all code written in 2025 is reportedly AI-generated (industry-wide)
- 91% of engineering orgs have adopted at least one AI coding tool
- 65% of developers use AI coding tools weekly (Stack Overflow 2025)
- GitHub has reported Copilot completing roughly 46% of code in files where it's enabled, yet developers accept only about 30% of its suggestions
The gap between “suggested” and “accepted” is where the real story lives. AI is proposing a lot of code. Developers are rejecting most of it.
The Productivity Paradox
Here’s where it gets really interesting. The vendor studies paint a rosy picture — 20-55% faster task completion. But independent research tells a different story:
- Bain & Company described real-world savings as “unremarkable”
- The METR randomized controlled trial with experienced open-source developers found they were actually 19% slower with AI tools
- Developers only spend 20-40% of their time actually writing code, so even significant code-generation speedups translate to modest overall productivity gains
- Large enterprises report 33-36% reduction in code-related development time — but that’s “code-related,” not total engineering time
The 19% slowdown finding deserves attention. Experienced developers estimated they were 20% faster but measured 19% slower. The cognitive overhead of reviewing, validating, and integrating AI suggestions ate more time than it saved.
The Quality Tax
The 2025 DORA Report found that a 90% increase in AI adoption was associated with:
- 9% climb in bug rates
- 91% increase in code review time
- 154% increase in pull request size
- Code duplication up 4x
That last number — 4x code duplication — should concern every engineering leader. AI tools are excellent at generating plausible-looking code, but they optimize for local correctness over global architectural coherence.
The experience-level pattern is particularly revealing:
- Junior devs accept 31.9% of AI suggestions but encounter 8.2 quality issues per PR
- Senior devs accept 23.7% but encounter only 3.1 quality issues per PR
Seniors are more selective. They recognize when AI output “looks right but isn’t.” Juniors don’t have the mental models yet to catch the subtle architectural mismatches.
What I’m Doing About It
Rather than chasing a “percentage AI-generated” vanity metric, we’re focusing on:
- Establishing quality gates — AI-generated code goes through the same review standards, plus automated checks for common AI patterns (duplicate logic, unnecessary abstractions)
- Tracking acceptance quality, not quantity — measuring how often accepted AI suggestions survive code review unchanged, and how often they’re flagged in production
- Segmenting by use case — AI is genuinely excellent for boilerplate, test generation, and documentation. It’s mediocre for business logic and actively risky for security-sensitive code
- Investing in reviewer skills — training senior engineers specifically on AI code review patterns, because reviewing AI output is a different skill than reviewing human output
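To give a flavor of the duplicate-logic check mentioned above, here's a toy sketch that flags repeated windows of normalized lines. It's a stand-in for real clone detectors, not our production gate, and the sample code it scans is invented for illustration:

```python
# Toy duplicate-logic check: hash normalized sliding windows of lines
# and report pairs of positions that repeat. A stand-in for real clone
# detectors, not a production quality gate.
import hashlib

def normalize(line):
    # Collapse whitespace so formatting differences don't hide clones.
    return " ".join(line.strip().split())

def duplicate_windows(source, window=4):
    lines = [normalize(l) for l in source.splitlines() if l.strip()]
    seen, dupes = {}, []
    for i in range(len(lines) - window + 1):
        digest = hashlib.sha256(
            "\n".join(lines[i:i + window]).encode()
        ).hexdigest()
        if digest in seen:
            dupes.append((seen[digest], i))  # (first seen, repeat)
        else:
            seen[digest] = i
    return dupes

code = """
def get_user(uid):
    row = users.get(uid)
    if row is None:
        log.warning("missing record")
        return None
    return row

def get_order(oid):
    row = orders.get(oid)
    if row is None:
        log.warning("missing record")
        return None
    return row
"""
print(duplicate_windows(code, window=3))  # [(2, 8), (3, 9)]
```

A check this crude only catches verbatim repetition; the AI-generated duplication we worry about is often renamed-identifier clones, which is why we lean on review training rather than tooling alone.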
The Real Question
When a CEO says “30% of our code is AI-generated,” what they’re really signaling is tool adoption and modernization velocity. It’s an investor relations narrative, not an engineering metric.
The question isn’t “what percentage of your code is AI-generated.” It’s:
- Is your team’s defect rate going up or down since AI adoption?
- Is your cycle time actually improving, or just your lines-of-code throughput?
- Are your senior engineers spending more time reviewing AI output than they saved by not writing it?
- Is your technical debt growing faster than your feature velocity?
I’d love to hear from other engineering leaders: what are your actual numbers? Not the headline metrics, but the messy reality. What’s your AI acceptance rate? What’s happening to your code review cycles? And most importantly — are you actually shipping better software faster, or just shipping more code?