I’ve been doing bug bounty work for eight years, and lately I’m seeing a pattern that should terrify every engineering team: nearly half of all AI-generated code contains security vulnerabilities.
Let me share what the data actually says, because the numbers are worse than the headlines suggest.
The Numbers Don’t Lie
Recent research shows that 45-48% of AI-generated code contains security flaws. But here’s the kicker: AI-generated code has 2.74x more vulnerabilities than human-written code. In an analysis of over 50,000 AI-generated codebases, 68% of projects had at least one high-severity vulnerability, averaging 4.2 security issues per project.
For those of you using GitHub Copilot: 29.5% of Python snippets and 24.2% of JavaScript snippets had security weaknesses. SQL injection appeared in 31% of projects, XSS in 27%, broken authentication in 24%.
I’ve personally found production systems with AI-generated auth flows that leaked Azure service principals, payment processing code with improper input validation, and cryptographic implementations that would make any security engineer weep.
The Paradox That’s Breaking Our Mental Model
Here’s what really keeps me up at night: while trivial syntax errors dropped 76% and logic bugs fell 60%, architectural flaws increased 150% and privilege issues skyrocketed 300%.
Think about what that means for code review. We’ve trained ourselves to catch syntax errors, race conditions, edge cases. But AI doesn’t make those mistakes. Instead, it creates entirely new categories of vulnerabilities:
- Systemic privilege escalation patterns that look fine in isolation (see the sketch after this list)
- Architectural security flaws that span multiple files
- Missing security controls that AI simply doesn’t know your application needs
- Inconsistent security patterns that create exploitable edge cases
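Here’s a minimal, hypothetical sketch of that first bullet. The Flask handlers and field names are mine, a composite of things I’ve seen rather than any single finding. Each function passes review on its own; together, any authenticated user can grant themselves admin:

```python
# Hypothetical Flask app: each handler looks fine in isolation,
# but together they form a privilege escalation path.
from flask import Flask, request, session, abort

app = Flask(__name__)
app.secret_key = "dev"  # placeholder for the sketch

@app.route("/admin/users")
def list_users():
    # Reviewer sees a proper server-side role check. Looks fine.
    if session.get("role") != "admin":
        abort(403)
    return {"users": ["..."]}

@app.route("/profile", methods=["POST"])
def update_profile():
    # Reviewer sees a routine profile update. Also looks fine.
    # But this mass-assignment copies every client-sent field into
    # the session, including "role". One request later,
    # POST /profile {"role": "admin"}, and list_users() trusts it.
    for key, value in request.json.items():
        session[key] = value
    return {"ok": True}
```

Neither diff would trip a reviewer’s alarm on its own. The vulnerability only exists across the two handlers.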
Traditional code review is built to catch human mistakes. But AI doesn’t make human mistakes—it makes AI mistakes.
Code Review: Last Line of Defense or First Point of Failure?
I’ve been thinking about this question a lot lately. In my bug bounty work, I’m increasingly finding vulnerabilities that passed code review. When I trace them back, they’re almost always AI-generated.
The root cause? Code reviewers are looking for the wrong things.
We’re pattern-matching against human error patterns:
- “Did the developer forget to sanitize input?”
- “Did they handle the error case?”
- “Is the logic correct?”
But AI-generated vulnerabilities look different (I’ll sketch the first two after this list):
- The code appears to handle inputs correctly—but uses an insecure pattern
- Error handling exists—but exposes sensitive information in error messages
- The logic works—but creates a privilege escalation vector
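To make that concrete, here’s a hedged, hypothetical composite of the first two bullets; the function name and database path are invented. Both “defenses” are visible to a reviewer, and both are broken:

```python
# Hypothetical lookup function: visible input handling and visible
# error handling, so it pattern-matches as safe. It isn't.
import sqlite3

def get_user(user_id: str) -> dict:
    # Looks like sanitization: strips quotes. But an input like
    # "1 OR 1=1" contains no quotes and injects anyway.
    cleaned = user_id.replace("'", "").replace('"', "")
    try:
        conn = sqlite3.connect("users.db")  # invented path
        row = conn.execute(
            f"SELECT id, email FROM users WHERE id = {cleaned}"
        ).fetchone()
        return {"id": row[0], "email": row[1]} if row else {}
    except sqlite3.Error as exc:
        # Looks like error handling, but str(exc) can leak table
        # and column names straight to the caller.
        return {"error": str(exc)}
```

The boring fix works here: a parameterized query, `conn.execute("SELECT id, email FROM users WHERE id = ?", (user_id,))`, plus a generic error message.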
The Volume Problem
Making matters worse: PRs are getting 18% larger as AI adoption increases, incidents per PR are up 24%, and change failure rates are up 30%.
We’re asking human reviewers to catch more complex vulnerabilities in more code, faster. That’s not a recipe for security—it’s a recipe for disaster.
What Changes in the Threat Model?
From a threat modeling perspective, here’s what AI-assisted development changes:
- Attack surface expansion: More code, more features, more potential entry points
- Secret leakage: 6.4% of Copilot repos leak secrets, 40% higher than traditional development (sketched after this list)
- Trust boundary confusion: Who’s accountable when AI generates the vulnerable code?
- Supply chain risks: AI can amplify vulnerabilities from training data
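On the secret-leakage point, the generated pattern is depressingly consistent. A hypothetical before/after, where the key, URL, and env var name are all invented:

```python
# Hypothetical example of the secret-leakage pattern: completions
# often inline credentials because that's what training data shows.
import os
import requests  # third-party; pip install requests

# Typical generated code: a plausible-looking key baked into source,
# leaked the moment it's committed.
API_KEY = "sk-live-EXAMPLE-DO-NOT-USE"

def fetch_report_bad():
    return requests.get(
        "https://api.example.com/report",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )

# Safer: read from the environment (or a proper secrets manager).
def fetch_report():
    key = os.environ["REPORT_API_KEY"]  # hypothetical variable name
    return requests.get(
        "https://api.example.com/report",
        headers={"Authorization": f"Bearer {key}"},
    )
```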
The Hard Questions We Need to Answer
I don’t have all the answers, but I think we need to seriously rethink our approach:
- Should we treat AI-generated code as untrusted by default, like we do with third-party dependencies?
- Do we need separate review paths for AI code, with different checklists and reviewer training?
- Are traditional code reviews even the right control? Or do we need something new: automated security analysis specifically tuned for AI-generated code patterns?
- Who’s responsible when AI code ships vulnerabilities? The engineer who accepted the suggestion? The reviewer who approved it? The org that deployed the tool?
- How do we balance velocity with security? Because the entire promise of AI coding is speed, and speed without security is just expensive technical debt.
What I’m Seeing in the Wild
In the last six months, I’ve found:
- Authentication bypass vulnerabilities in AI-generated OAuth flows
- Injection flaws in AI-written SQL query builders that developers trusted because “the AI knows SQL”
- Cryptographic failures using deprecated algorithms that AI pulled from outdated training data (a sanitized sketch follows below)
- Broken access control in permission systems that looked correct but had exploitable logic flaws
Every single one of these passed code review. Every single one made it to production.
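The crypto one is worth showing because it’s so checkable. A sanitized, hypothetical version of what I found, next to what the standard library has offered since Python 3.6:

```python
import hashlib
import os

# What the AI generated (sanitized): fast, unsalted MD5. Trivially
# crackable with rainbow tables or GPU brute force.
def hash_password_bad(password: str) -> str:
    return hashlib.md5(password.encode()).hexdigest()

# What it should look like: a memory-hard KDF with a per-user salt.
# hashlib.scrypt ships with Python 3.6+ (OpenSSL builds).
def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode(), salt=salt, n=2**14, r=8, p=1
    )
    return salt, digest
```

The bad version isn’t hidden in some obscure corner. It’s the kind of line a reviewer scrolls right past because it compiles, runs, and looks like every tutorial they’ve ever read.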
So: Safety Net or Blind Spot?
I think the uncomfortable truth is that right now, code review is becoming a blind spot.
We’re reviewing AI code with human-focused mental models. We’re trusting that the AI “knows better” about security patterns. We’re moving fast and assuming someone else will catch the problems.
That needs to change. Fast.
What does your team do differently when reviewing AI-generated code? Or are you treating it the same as human code and hoping for the best?
I’d especially love to hear from folks who’ve found AI-generated vulnerabilities in their own systems, or teams who’ve successfully adapted their review processes.
Because right now, 48% is not a statistic we can ignore.
Sources: Veracode AI Security Report, Practical DevSecOps AI Statistics 2026, ACM Study on Copilot Security, GitHub Copilot Vulnerability Research