AI Now Writes 30% of Microsoft's Code and Over 25% of Google's — What Does That Mean for Code Ownership?

cto_michelle · February 14, 2026, 5:53am

Satya Nadella recently revealed that 20-30% of code in Microsoft’s repositories is now “written by software.” Sundar Pichai shared similar numbers for Google: more than 25% of new code is AI-generated. These aren’t small experiments. This is production code at two of the world’s largest tech companies.

As a CTO, this raises questions I don’t have good answers to yet.

The Headline Numbers

Microsoft (April 2025):

20-30% of code in repositories is AI-generated
Nadella says “some projects may have all of its code written by AI”

Google (2025):

25%+ of code is AI-assisted
Pichai emphasizes velocity gains (+10% speed), not replacement

Industry-wide:

41% of all code is now AI-generated
76% of developers are using (62%) or planning to use (14%) AI coding tools
75% still manually review every AI-generated snippet before merging

The Ownership Questions I’m Wrestling With

1. Who owns AI-generated code?

The legal framework is murky. Generally:

If a human provides “sufficient creative input” (iterative prompting, editing, refining), copyright may attach to the human author
Through employment agreements, that ownership typically transfers to the employer
But what if the “creative input” is just “write a function that does X”?

Microsoft offers IP indemnity for Copilot outputs (if guardrails are enabled). That’s a meaningful commitment. But not all AI tools offer this protection.

2. What about license contamination?

Research suggests ~35% of AI-generated code samples contain licensing irregularities. This is a real liability risk. If an AI tool was trained on GPL code and reproduces recognizable portions of that code, are you now obligated to open-source your project?

Microsoft and Google use sophisticated license detection tools. Most companies don’t have that infrastructure.

3. The “review” problem

75% of developers say they manually review AI-generated code. But do they? Really?

I’ve watched developers accept AI suggestions with barely a glance. The review is increasingly cursory as AI output quality improves. This creates a gap between stated practice and actual practice.

4. Attribution and credit

If 30% of your codebase is AI-generated, how do you:

Evaluate developer performance?
Attribute bugs to authors?
Assign ownership of code quality?
Handle code reviews?

What I’m Doing at My Company

Explicit AI code policies - We require annotation when AI generates substantial portions of code
License scanning - We run automated tools to detect potential license contamination
Enterprise tiers only - We use tools with IP indemnification clauses
Review standards - AI-generated code gets the same review standards as human code
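
To make the license scanning item concrete, here is a deliberately naive sketch in Python. It is not the tooling referred to above, and real scanners typically match code against known corpora rather than headers. It only looks for explicit `SPDX-License-Identifier` headers (a real convention) that survive in pasted or generated code. The denylist and the function name `flagged_licenses` are hypothetical.

```python
import re

# SPDX-License-Identifier is a real, widely used header convention.
SPDX_PATTERN = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")

# Hypothetical denylist; an actual policy would come from legal counsel.
FLAGGED_PREFIXES = ("GPL-", "AGPL-", "LGPL-")


def flagged_licenses(source_text: str) -> list[str]:
    """Return SPDX identifiers found in the text that match the denylist."""
    found = SPDX_PATTERN.findall(source_text)
    return [lic for lic in found if lic.startswith(FLAGGED_PREFIXES)]


snippet = """\
# SPDX-License-Identifier: GPL-3.0-or-later
def helper():
    return 42
"""
print(flagged_licenses(snippet))  # ['GPL-3.0-or-later']
print(flagged_licenses("# SPDX-License-Identifier: MIT\n"))  # []
```

A check like this catches only the obvious case. For compound expressions such as `MIT OR Apache-2.0` it captures just the first identifier, and it cannot detect copied code whose header was stripped.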

Questions for the Community

How are you handling attribution for AI-generated code?
Has anyone actually experienced a license contamination issue?
Do your code review practices change for AI-generated PRs?

This feels like we’re building on uncertain legal foundations. The technology has moved faster than the governance.

security_sam · February 14, 2026, 5:53am

Michelle, this is exactly the kind of discussion we need to be having. The IP and license issues are real, but I want to add the security dimension to this conversation.

The audit trail problem:

When 30% of your code is AI-generated, your security audit process breaks down. Traditional security reviews assume:

Developers understand the code they write
There’s institutional knowledge about why code exists
You can interview the author about edge cases

With AI-generated code:

The “author” is an LLM that can’t be interviewed
The prompt history may not be preserved
The developer who accepted the code may not fully understand it

Vulnerability introduction at scale:

Research shows 45% of AI-generated code contains vulnerabilities. If 30% of your codebase is AI-generated and that rate holds across it, roughly 13.5% of your entire codebase (0.45 × 0.30) could have security issues.

That’s not a rounding error. That’s a significant attack surface.
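
As a back-of-the-envelope check on that figure, the arithmetic can be written out in Python. It assumes the quoted vulnerability rate applies uniformly to the AI-generated share of the codebase, which is a simplification. The variable names are illustrative.

```python
# Both inputs are the figures quoted in the post above.
ai_share = 0.30          # fraction of the codebase that is AI-generated
vulnerable_rate = 0.45   # fraction of AI-generated code with vulnerabilities

# Assumption: the vulnerability rate applies uniformly to the AI-generated share.
affected_share = ai_share * vulnerable_rate

print(f"{affected_share:.1%} of the codebase")  # prints "13.5% of the codebase"
```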

What I’m seeing in security assessments:

Pattern reproduction - AI tools reproduce common security anti-patterns from their training data (SQL string concatenation, eval() usage, etc.)
Context blindness - AI doesn’t understand your threat model. It might generate perfectly functional code that’s completely inappropriate for your security requirements.
Review fatigue - Security reviewers are overwhelmed. When AI generates more code faster, security can’t keep up.
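
To illustrate the SQL string concatenation anti-pattern mentioned above, here is a minimal sketch using Python's standard `sqlite3` module. The table, the data, and the function names `find_user_unsafe` and `find_user_safe` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_user_unsafe(name: str) -> list:
    # Anti-pattern: user input is concatenated directly into the SQL text.
    query = "SELECT name, role FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list:
    # Parameterized query: the driver treats the input as data, not SQL.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(len(find_user_unsafe(malicious)))  # 2: the injected OR clause matches every row
print(len(find_user_safe(malicious)))    # 0: no user has that literal name
```

Both functions behave identically on well-formed input, which is why the unsafe version can pass a functional review that never tries hostile input.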

My recommendations:

Static analysis is mandatory - Run SAST tools on all AI-generated code before merge
Security-focused prompting - Train developers to include security requirements in prompts
Separate review tracks - AI-generated code should get security review, not just functional review
Preserve context - Save prompts and AI responses for later audit
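
One possible shape for the "preserve context" recommendation is an append-only JSON Lines log, sketched below in Python. This is an assumption about what such a log could look like, not an established format. Every field name and both function names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_audit_record(prompt: str, response: str, tool: str, author: str) -> dict:
    """Build one audit entry; all field names here are hypothetical."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "author": author,
        "prompt": prompt,
        "response": response,
        # A hash of the response helps match a record to code merged unmodified.
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
    }


def append_record(path: str, record: dict) -> None:
    # JSON Lines: one record per line, opened in append mode so earlier
    # entries are never rewritten.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

A call such as `append_record("ai_audit.jsonl", make_audit_record(prompt, response, "some-tool", "alice"))` would add one line per accepted suggestion. The file name and tool name are placeholders.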

The legal ownership question is important, but I’m more worried about the security ownership question. If a vulnerability in AI-generated code leads to a breach, who’s accountable?

alex_dev · February 14, 2026, 5:54am

I want to push back gently on the framing here. As someone who uses AI tools daily, I think we’re overcomplicating the ownership question.

The practical reality:

When I use AI to generate code, I’m:

Defining the requirements
Choosing which output to use
Modifying it to fit my context
Testing and validating it works
Taking responsibility for the result

This is fundamentally the same as using Stack Overflow, except faster. We didn’t have existential crises about “who owns code copied from Stack Overflow.”

The “30% AI-generated” stat is misleading:

That 30% probably includes:

Boilerplate and scaffolding
Test case generation
Documentation strings
Simple utility functions

The novel, creative, business-logic code is still largely human-written. The AI is handling the boring stuff we would have copy-pasted anyway.

On the review question:

@cto_michelle asked if developers actually review AI code. In my experience:

For autocomplete suggestions (single lines): No, not really
For larger code blocks: Yes, pretty thoroughly
For architecture suggestions: Definitely

The review depth matches the risk level. That’s… rational?

What I think matters more:

Code quality, not code origin - Does it work? Is it maintainable? Is it secure? Those questions matter regardless of who/what wrote it.
Developer responsibility - If I merge it, I own it. Full stop. This has always been true for any code that enters the codebase.
Tooling over policy - Better static analysis, better CI/CD gates, better test coverage. These protect you better than tracking AI attribution.

My honest answer to your questions:

Attribution: I don’t track it. The PR author is responsible for everything in their PR.
License contamination: Haven’t experienced it. Running license scanners would catch obvious issues.
Review practices: Same standards. Code is code.

Maybe I’m being naive, but I think we’re creating process overhead for a problem that existing engineering practices already solve.

eng_director_luis · February 14, 2026, 5:54am

Both Michelle and Alex make valid points. Let me offer the middle-management perspective — I’m caught between policy and practicality daily.

The attribution problem is real, but not for the reasons you think:

My interest in AI attribution isn’t legal. It’s about performance management.

When I look at commit history to understand who contributed what:

Did this engineer solve the problem or did they prompt well?
Is this person growing technically or just becoming a better AI whisperer?
When this code breaks, who actually understands it well enough to fix it?

These questions matter for career development, team composition, and incident response.

The “code is code” argument has limits:

@alex_dev makes a fair point that the developer takes responsibility. But there’s a spectrum:

High-confidence code:

Engineer wrote it from scratch
Engineer modified AI output significantly
Engineer can explain every line

Low-confidence code:

AI generated with minimal review
Engineer accepted because “it works”
Engineer might not understand edge cases

The codebase doesn’t distinguish between these. That’s a problem when you need to modify it later.

What I’m experimenting with:

Commit message conventions - We’re testing a convention where commits containing AI-generated code include “[AI-assisted]” in the commit message. Not enforcement, just visibility.
Design review separation - We now require human-written design docs before AI implementation. The human does the thinking; AI does the typing.
Ownership rotation - Engineers rotate through code areas they didn’t write (AI or human). This forces knowledge transfer.
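
The commit message convention above lends itself to simple visibility tooling. Here is a minimal sketch in Python that counts tagged commits. The case-insensitive match, the function names, and the sample messages are assumptions for illustration, not part of the convention described.

```python
AI_TAG = "[AI-assisted]"  # the tag named in the convention above


def is_ai_assisted(commit_message: str) -> bool:
    # Case-insensitive match so "[ai-assisted]" also counts (an assumption).
    return AI_TAG.lower() in commit_message.lower()


def ai_assisted_share(commit_messages: list[str]) -> float:
    """Fraction of commits carrying the tag; 0.0 for an empty history."""
    if not commit_messages:
        return 0.0
    tagged = sum(is_ai_assisted(m) for m in commit_messages)
    return tagged / len(commit_messages)


# Hypothetical commit messages for illustration.
messages = [
    "[AI-assisted] Add retry wrapper for payment client",
    "Fix off-by-one in pagination",
    "Refactor config loader [AI-assisted]",
    "Update README",
]
print(ai_assisted_share(messages))  # 0.5
```

In practice the messages could come from `git log --format=%s`, and the share could be reported per directory to show where tagged code concentrates.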

My honest assessment:

The 30% number will be 50% in a year and 70% in three years. We need to figure this out now, while the percentage is still manageable. Waiting until the majority of code is AI-generated to establish norms will be too late.

@cto_michelle - Your annotation policy is exactly right. The cost of tracking is low; the cost of not tracking could be significant later.