I’ve spent the last 15 years in security engineering — 6 years at Stripe building fraud detection systems, 4 years at CrowdStrike hunting nation-state actors, and the last 5 years consulting for fintech companies across Africa and Europe. I’ve seen a lot of code. Good code, bad code, and code that makes you question whether the developer had ever heard of OWASP.
But what I’m seeing now with AI-generated code is different. It’s not just bad — it’s systematically vulnerable in ways that suggest we’re optimizing for the wrong metrics.
The Numbers Don’t Lie
Let me start with some data that should make every engineering leader sit up:
- 48% of AI-generated code contains security vulnerabilities (Panto AI Statistics, 2025)
- AI code is 2.74x more likely to introduce XSS vulnerabilities than human-written code
- It’s 1.88x more likely to have password handling issues
- Only 55% of AI-generated code passes security tests according to Veracode’s latest research
- 42% contains hallucinations — including phantom functions, non-existent libraries, and incorrect API usage
The language breakdown is even more alarming. Java projects using AI assistance have a 72% security failure rate. JavaScript and Python aren’t far behind.
What I’m Seeing in the Field
In my consulting work, I’ve reviewed codebases from 14 companies in the last 18 months. Eight of them use GitHub Copilot or similar tools extensively. Here’s what I’ve found:
SQL Injection everywhere. AI loves to concatenate strings into SQL queries. I found this gem last month:
```python
query = f"SELECT * FROM users WHERE email = '{user_input}'"  # AI-generated
```
No parameterization. No input validation. Just raw user input directly into SQL. This is a vulnerability pattern that we mostly eliminated 15 years ago, and AI is bringing it back.
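For contrast, here is the decades-old fix: hand the user input to the driver as a parameter, so it can never be interpreted as SQL. A minimal sketch using Python’s stdlib sqlite3 (the table, data, and payload are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice@example.com', 'Alice')")

payload = "x' OR '1'='1"  # classic injection string

# Vulnerable: the payload is spliced into the SQL text and matches every row
vulnerable = f"SELECT * FROM users WHERE email = '{payload}'"
leaked = conn.execute(vulnerable).fetchall()

# Safe: the ? placeholder sends the payload as data, never as SQL
matched = conn.execute("SELECT * FROM users WHERE email = ?", (payload,)).fetchall()

# The string-built query leaks the row; the parameterized one matches nothing
print(len(leaked), len(matched))
```

The same placeholder pattern exists in every mainstream driver (`%s` in psycopg, `?` in JDBC prepared statements); the point is that the query text and the data travel separately.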
Broken authentication. I’ve seen AI generate JWT validation code that doesn’t verify signatures. Password reset flows that don’t expire tokens. Session management that stores sensitive data client-side. These aren’t edge cases — these are OWASP Top 10 fundamentals.
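To make the signature point concrete, here is roughly what validating an HS256 token requires, sketched with only the stdlib. The secret and claims are hypothetical; in production use a maintained library such as PyJWT, and check expiry and audience claims as well:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # hypothetical key; never hardcode a real one


def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def sign(payload: dict) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = header + b"." + body
    sig = b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return (signing_input + b"." + sig).decode()


def verify(token: str) -> bool:
    # The step broken AI-generated validators skip: recompute the
    # signature server-side and compare it in constant time
    signing_input, _, sig = token.rpartition(".")
    expected = b64url(
        hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    ).decode()
    return hmac.compare_digest(sig, expected)


token = sign({"sub": "alice"})
# Forge a token: swap in a new payload but keep the old signature
header, _, sig = token.split(".")
forged = header + "." + b64url(json.dumps({"sub": "mallory"}).encode()).decode() + "." + sig
print(verify(token), verify(forged))  # the forged token fails verification
```

A validator that decodes the payload without that `verify` step is accepting whatever claims the client chose to write.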
Insecure deserialization. AI will happily generate pickle, marshal, or eval code without warnings. One client had an AI-generated API endpoint that accepted serialized Python objects from untrusted sources. Remote code execution on a silver platter.
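The danger is easy to demonstrate: pickle lets the sender choose code to run, while JSON can only ever produce data. A harmless illustration (a real payload would call os.system instead of print):

```python
import json
import pickle


class Exploit:
    # pickle calls __reduce__ to decide how to rebuild the object,
    # so the *sender* picks a callable that runs on the receiving server
    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))


evil = pickle.dumps(Exploit())
result = pickle.loads(evil)  # the side effect fires right here

# Safe alternative for untrusted input: a data-only format plus validation
safe = json.loads('{"user": "alice", "role": "admin"}')
print(safe["user"])
```

The rule of thumb holds for marshal and eval too: if the bytes come from outside your trust boundary, deserialize with a format that cannot express code.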
The phantom dependency problem. This one is insidious. AI hallucinates libraries that don’t exist, or suggests outdated versions with known CVEs. Developers don’t always catch this in review. I found a production service using a crypto library version from 2018 with 3 critical vulnerabilities — all because Copilot suggested it and the team assumed it was current.
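There is no substitute for auditing your lockfile with a scanner like pip-audit, but even a crude pre-flight check catches the pure hallucinations, i.e. suggested imports that resolve to nothing. A sketch (the last module name is a made-up stand-in for a hallucinated library):

```python
import importlib.util

# Modules an AI assistant might have suggested; the last one does not exist
suggested = ["json", "hashlib", "totally_made_up_cryptolib"]

# find_spec returns None for a top-level module that cannot be resolved
missing = [name for name in suggested if importlib.util.find_spec(name) is None]
print(missing)  # only the hallucinated name is flagged
```

This only proves a module exists in your environment; it says nothing about known CVEs in the version you have installed, which is exactly what dependency scanners are for.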
The Root Cause: Training on Vulnerable Code
Here’s the thing — AI models are trained on public repositories. GitHub, StackOverflow, tutorial sites. And a huge percentage of that code is insecure. Stack Overflow is full of answers from 2012 that were never secure to begin with. Tutorial code that explicitly says “don’t use this in production.”
The AI doesn’t know the difference. It learns patterns. And if the pattern appears frequently in training data, the AI will reproduce it — even if it’s a security anti-pattern.
Are We Optimizing for Speed at the Cost of Security?
Here’s my controversial take: the industry is moving too fast, and we’re creating a security debt that will take years to pay down.
We’re measuring AI coding tools on velocity. How fast can you ship features? How many lines of code per hour? How quickly can junior developers become productive?
But we’re not measuring security. We’re not measuring technical debt. We’re not measuring the long-term cost of shipping vulnerable code faster.
I’ve had CTOs tell me: “We’ll fix security issues when they come up.” But security doesn’t work that way. Vulnerabilities compound. They hide in production for months or years. And when they’re exploited, the cost isn’t just a patch — it’s data breaches, regulatory fines, customer trust, and engineering time spent on incident response instead of building features.
What Should We Do?
I’m not anti-AI. I use Copilot myself for boilerplate and refactoring. But I think we need to be honest about the risks:
- Never AI-generate security-critical code. Authentication, authorization, cryptography, input validation — these should be human-written or use well-tested libraries.
- Automated security scanning is mandatory. SAST tools should run on every PR. No exceptions.
- Security-focused code review. If a PR contains AI-generated code, reviewers should specifically check for OWASP Top 10 vulnerabilities.
- Training. Developers need to understand common AI security pitfalls. What to watch for. What to never accept from autocomplete.
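Starting on this doesn’t require heavyweight tooling. Real pipelines should run dedicated scanners (Semgrep, Bandit, CodeQL) on every PR, but the core idea of a SAST pass fits in a few lines of stdlib ast: walk the syntax tree and flag dangerous calls. The rule set here is deliberately naive and the scanned snippet is hypothetical:

```python
import ast

# Naive rule set: flags any call named eval, exec, or loads
# (so it would also flag json.loads; real tools scope rules far more precisely)
DANGEROUS = {"eval", "exec", "loads"}

source = """
import pickle
def handler(raw):
    return pickle.loads(raw)
"""

findings = []
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Call):
        func = node.func
        # Handle both pickle.loads(...) (Attribute) and eval(...) (Name)
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", "")
        if name in DANGEROUS:
            findings.append((node.lineno, name))

print(findings)  # line number and name of each flagged call
```

Wiring even a check like this into CI as a required status makes “no exceptions” enforceable rather than aspirational.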
But here’s my question for this community:
Are we creating a security debt that will take years to pay down?
When you’re shipping features 55% faster but 48% of that code is vulnerable, are you actually moving faster? Or are you just deferring the cost to your future self — and your users?
I’d love to hear from other security engineers, engineering leaders, and developers in the trenches. What are you seeing? How are you balancing velocity with security?
Sam Okoye
Senior Security Engineer | Lagos, Nigeria
ex-Stripe, ex-CrowdStrike