87% of AI Agent PRs Contain Vulnerabilities—Are We Building on Broken Foundations?

As CTO, I spend a lot of time thinking about technical risk. But this stat from the recent DryRun Security study stopped me cold: 87% of pull requests from AI coding agents (Claude, Codex, Gemini) contain vulnerabilities, ranging from access control gaps and injection risks to outright design flaws.

Let me put that in context. If 42% of our code is AI-generated and 87% of that contains vulnerabilities, we’re talking about potentially 36% of our entire codebase having security issues. And by 2027, when AI-generated code hits 65%, that could be 56% of our codebase.

We’re not just shipping faster. We’re potentially building on broken foundations.

What’s the Baseline?

The critical question: how does this compare to human-written code? Another study found that 62% of AI-generated code contains design flaws or known security vulnerabilities—even when using the latest foundation models. Meanwhile, human developers writing comparable code from scratch show vulnerability rates closer to 20-30%, depending on the domain.

That’s 2-3× worse. And unlike human developers who learn from mistakes, AI models keep making the same categories of errors.

The Types of Vulnerabilities We’re Seeing

In our cloud migration project, security audits have flagged AI-generated code for:

1. Access control gaps - AI assumes happy path authentication, misses authorization edge cases
2. Injection vulnerabilities - SQL injection, command injection, XSS in user input handling
3. Insecure defaults - Permissive configurations, missing encryption, weak validation
4. Logic flaws in security-critical paths - Race conditions, incomplete error handling
5. Excessive I/O operations (~8× more common in AI code) - Potential DoS vectors

The pattern: AI optimizes for “it works” not “it’s secure.” Security requires adversarial thinking—imagining how things could break or be exploited. AI models are trained on making things work, not breaking them.
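To make the injection category concrete, here is a minimal sketch (Python's sqlite3 with a made-up users table) contrasting the string-interpolated query shape our audits keep flagging in AI output with the parameterized version a security review should demand:

```python
import sqlite3

# Hypothetical in-memory table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def find_user_ai_style(name: str):
    """The shape AI agents often emit: user input interpolated into SQL."""
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'").fetchall()

def find_user_parameterized(name: str):
    """What review should demand: a bound parameter, injection-proof."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)).fetchall()

payload = "nobody' OR '1'='1"
print(find_user_ai_style(payload))       # [('alice',)] — every row leaks
print(find_user_parameterized(payload))  # [] — payload treated as data
```

Both versions "work" on happy-path input, which is exactly why the interpolated one survives a casual review.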

The CTO Responsibility Dilemma

I’m responsible for balancing innovation with risk management. AI coding tools promise competitive advantage through speed. But if they’re introducing 2-3× more vulnerabilities, what’s my fiduciary responsibility?

In our mid-stage SaaS company, a security breach could:

  • Destroy customer trust
  • Trigger regulatory penalties
  • Expose us to lawsuits
  • Tank our Series B valuation

The velocity isn’t worth it if it creates existential risk.

What We’re Changing

After seeing the security data, I’ve implemented:

1. Enhanced review for AI-generated code

  • Security-focused code review checklist for AI-assisted PRs
  • Mandatory security engineer review for authentication/authorization code
  • Static analysis tools tuned to catch common AI vulnerabilities
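As one illustration of what tuned static analysis can look like, here is a toy Python AST check (a sketch, not our actual tooling) that flags f-strings passed to `.execute()`, a common injection smell in AI-generated database code:

```python
import ast

def flag_fstring_sql(source: str) -> list[int]:
    """Return line numbers where an f-string is the first argument to a
    .execute() call — a cheap heuristic for interpolated SQL."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args
                and isinstance(node.args[0], ast.JoinedStr)):
            findings.append(node.lineno)
    return findings

snippet = '''
cur.execute(f"SELECT * FROM users WHERE id = {user_id}")
cur.execute("SELECT * FROM users WHERE id = ?", (user_id,))
'''
print(flag_fstring_sql(snippet))  # [2] — only the f-string query is flagged
```

A rule like this runs in CI in milliseconds; the point is that AI-specific smells are regular enough to encode.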

2. AI-free zones

  • Banned AI assistance in security-critical components
  • Authentication, authorization, encryption, PII handling = human-only
  • Payment processing, data export, admin functions = human-only

3. Verification before velocity

  • Acceptance criteria now include “security review complete”
  • Velocity metrics no longer count until security sign-off
  • Promoting engineers who catch security issues, not just those who ship fast

4. Audit trail requirements

  • PRs must indicate AI-assisted sections
  • Security reviews must explicitly verify AI-generated code
  • Incident reports track whether AI was involved

The Uncomfortable Questions

If 87% of AI PRs contain vulnerabilities:

  • Should we treat AI-generated code as untrusted by default?
  • Are we creating security debt that will take years to remediate?
  • What’s our liability if an AI-generated vulnerability causes a breach?
  • Can we maintain SOC 2 / ISO 27001 compliance with high-risk AI code?

I don’t have all the answers. But I know that ignoring security implications because we want velocity is how breaches happen.

What are other CTOs and security-minded leaders doing?


Michelle Washington | CTO | Mid-stage SaaS | 25 years in tech | Security-first mindset

Michelle, the financial services regulatory perspective makes this even more urgent.

At my Fortune 500 company, we’re subject to:

  • SOX compliance for financial reporting systems
  • PCI DSS for payment processing
  • GLBA for customer data protection
  • Federal Reserve guidance on technology risk management
  • OCC guidelines on third-party risk (which arguably includes AI tools)

When auditors ask “How do you ensure code quality and security?” we can’t say “Well, AI wrote it and an engineer glanced at it.”

The Audit Trail Problem

Our compliance team has flagged a critical issue: AI-generated code breaks our audit trail assumptions.

Traditional audit trail:

  • Developer with security clearance writes code
  • Code review by another cleared engineer
  • Security team review for sensitive systems
  • Documentation of security considerations
  • Traceability: who made what decision and why

AI-generated code breaks this:

  • Who made the architectural security decisions? (Nobody)
  • What security trade-offs were considered? (Unknown)
  • Why was this approach chosen? (It’s what AI suggested)
  • How do we prove due diligence? (We can’t)

Our external auditors have explicitly said: code without clear human accountability doesn’t meet our security standards.

Enhanced Review for Critical Components

We’ve created security tiers:

Tier 1: AI-prohibited (like your AI-free zones)

  • Authentication and authorization
  • Encryption and key management
  • Transaction processing
  • Audit logging
  • Admin privileges

Tier 2: AI-assisted, security review mandatory

  • User-facing features with data access
  • API endpoints
  • Database queries
  • File operations

Tier 3: AI-assisted, standard review

  • UI components
  • Utility functions
  • Test code (with verification)

This slows us down. But in regulated industries, compliance isn’t optional. The question isn’t “can we use AI?” It’s “where can we use AI without creating regulatory risk?”
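To keep tier assignment consistent rather than ad hoc, the routing can live in code. A hypothetical sketch (the path prefixes are invented) that maps changed file paths to the tiers above:

```python
# Hypothetical mapping of repository paths to review tiers.
# Tier 1: AI-prohibited; Tier 2: mandatory security review; Tier 3: standard.
TIER_RULES = [
    (1, ("auth/", "crypto/", "payments/", "audit/", "admin/")),
    (2, ("api/", "db/", "files/")),
]

def review_tier(path: str) -> int:
    """Return the strictest tier whose prefix matches the changed file."""
    for tier, prefixes in TIER_RULES:
        if path.startswith(prefixes):
            return tier
    return 3  # everything else gets standard review

print(review_tier("auth/login.py"))  # 1
print(review_tier("api/users.py"))   # 2
print(review_tier("ui/button.tsx"))  # 3
```

Wiring something like this into the PR pipeline means the policy is enforced mechanically, not remembered by reviewers.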

The Specialized Review Challenge

Your point about AI not understanding adversarial thinking is critical. We’re training engineers to look for AI-specific vulnerabilities:

  • Injection flaws (AI often misses input sanitization)
  • Broken access control (AI assumes authenticated = authorized)
  • Insecure design (AI doesn’t consider threat modeling)
  • Security misconfiguration (AI uses permissive defaults)
  • Cryptographic failures (AI may use deprecated or weak crypto)

This is OWASP Top 10 stuff—but AI makes these mistakes more frequently than junior engineers.
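The authenticated-equals-authorized confusion is worth spelling out. A minimal Python sketch (the user model and record store are hypothetical) of the insecure direct object reference this produces, alongside the reviewed fix:

```python
from dataclasses import dataclass

@dataclass
class User:
    id: int
    is_admin: bool = False

# Hypothetical record store: record 101 belongs to user 1.
RECORDS = {101: {"owner_id": 1, "body": "Q3 report"}}

def get_record_ai_style(user: User, record_id: int):
    """Typical AI-generated handler: checks *that* you're logged in,
    never *whether* the record is yours (IDOR, CWE-639)."""
    if user is None:
        raise PermissionError("login required")
    return RECORDS[record_id]  # any authenticated user wins

def get_record_reviewed(user: User, record_id: int):
    """Reviewed version: an authorization check on the object itself."""
    if user is None:
        raise PermissionError("login required")
    record = RECORDS[record_id]
    if record["owner_id"] != user.id and not user.is_admin:
        raise PermissionError("not your record")
    return record

attacker = User(id=2)
print(get_record_ai_style(attacker, 101)["body"])  # leaks "Q3 report"
try:
    get_record_reviewed(attacker, 101)
except PermissionError as e:
    print(e)  # not your record
```

The first version passes every functional test, which is why this class of flaw only surfaces in an adversarial review.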

The ROI Question

Michelle, your velocity vs risk question resonates. If AI speeds up development 30% but creates 2-3× more security vulnerabilities, what’s the true cost once you factor in remediation, audit findings, and potential breach impact?

In financial services, a single security breach can:

  • Impose significant remediation costs
  • Trigger major regulatory fines
  • Cause customer attrition
  • Damage reputation for years

Is 30% faster development worth a 200% increase in breach probability? The math doesn’t work.

I’m increasingly convinced: AI adoption in regulated industries requires security-first guardrails that will negate much of the velocity benefit. That might be okay—if we’re honest about it upfront.


Luis Rodriguez | Director of Engineering | Fortune 500 Financial Services | Compliance is my middle name

The accessibility angle on this is terrifying.

In design systems, I work with a lot of AI-generated UI component code. And here’s what I’ve learned: AI is systematically terrible at accessibility.

WCAG Violations Everywhere

Common AI accessibility failures I see constantly:

  • Missing ARIA labels on interactive elements
  • Keyboard navigation broken (AI assumes mouse-only interaction)
  • Color contrast failures (AI picks colors that look nice, not accessible)
  • Form validation without screen reader support
  • Focus management missing in modal dialogs and dynamic content
  • Semantic HTML ignored (div soup everywhere)

These aren’t security vulnerabilities in the traditional sense, but they’re legal compliance risks under ADA and Section 508. And they exclude users with disabilities—which is an ethical failure.
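Some of these failures are mechanically detectable. A toy heuristic using Python's html.parser (a sketch only; real audits need axe-core plus manual screen reader testing) that flags icon-only buttons with no accessible name:

```python
from html.parser import HTMLParser

class IconButtonCheck(HTMLParser):
    """Flags <button> elements with neither visible text nor an aria-label —
    a pattern AI-generated icon buttons exhibit constantly."""
    def __init__(self):
        super().__init__()
        self.depth = 0          # nesting level inside a <button>
        self.has_label = False
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self.depth += 1
            self.has_label = any(k == "aria-label" and v for k, v in attrs)

    def handle_data(self, data):
        if self.depth and data.strip():
            self.has_label = True   # visible text counts as a name

    def handle_endtag(self, tag):
        if tag == "button" and self.depth:
            self.depth -= 1
            if self.depth == 0 and not self.has_label:
                self.findings.append("button without accessible name")

checker = IconButtonCheck()
checker.feed('<button><svg></svg></button>'              # flagged
             '<button aria-label="Close"><svg></svg></button>'
             '<button>Save</button>')
print(checker.findings)  # ['button without accessible name']
```

Automated checks like this catch maybe a third of the problems; the keyboard and screen reader testing above is still non-negotiable.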

The Context Blindness Problem

Michelle, your point about AI optimizing for “it works” not “it’s secure” applies to accessibility too: AI optimizes for “it looks right” not “it works for everyone.”

AI doesn’t understand:

  • Screen reader user workflows
  • Keyboard-only navigation patterns
  • Visual impairment scenarios
  • Cognitive accessibility needs
  • Responsive design implications across assistive tech

Why would it? AI is trained on existing code, and a lot of existing code has terrible accessibility. AI is learning and amplifying bad patterns.

My Verification Checklist for AI UI Code

When reviewing AI-generated components:

Always test:

  • Keyboard navigation (tab order, focus indicators, shortcuts)
  • Screen reader announcement (NVDA, JAWS, VoiceOver)
  • Color contrast ratios (WCAG AA minimum: 4.5:1)
  • Responsive behavior (zoom to 200%, mobile viewport)
  • Semantic HTML structure (proper heading hierarchy, landmarks)

This takes time. For a moderately complex component, accessibility verification can take 2-3× longer than the AI generation took. Again: the verification cost often exceeds the generation benefit.
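Of the checks above, contrast is the easiest to automate. A sketch of the WCAG 2.x relative luminance and contrast ratio formulas behind the 4.5:1 threshold:

```python
def relative_luminance(hex_color: str) -> float:
    """sRGB relative luminance per WCAG 2.x."""
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))
    def linearize(c: float) -> float:
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = map(linearize, (r, g, b))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: str, bg: str) -> float:
    """(L_lighter + 0.05) / (L_darker + 0.05); AA text needs >= 4.5."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)),
                    reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

print(round(contrast_ratio("#000000", "#ffffff"), 1))  # 21.0 (the maximum)
print(contrast_ratio("#777777", "#ffffff") >= 4.5)     # False — this gray
                                                       # narrowly fails AA
```

That last line is the trap: #777 on white is a gray AI picks all the time because it looks fine, and it misses AA by a hair.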

Domain-Specific Vulnerabilities

Michelle’s security tier system is smart. I’d propose an accessibility tier system too:

Tier 1: AI-assisted with mandatory accessibility review

  • Forms and user inputs
  • Interactive components (modals, dropdowns, accordions)
  • Navigation elements
  • Data visualization

Tier 2: AI-assisted with standard accessibility checks

  • Static content components
  • Layout utilities
  • Styling and theming

Tier 3: AI-prohibited

  • Complex interactive patterns
  • Custom ARIA implementations
  • Assistive technology specific code

The pattern: AI should augment human expertise, not replace domain knowledge. For accessibility, there’s no substitute for understanding the user experience of people with disabilities.

Luis’s compliance point applies here too. ADA lawsuits are real. Shipping inaccessible AI-generated code creates legal exposure.


Maya Rodriguez | Design Systems Lead | Confluence Design Co. | Accessibility advocate

Michelle, this is exactly the kind of strategic risk assessment that VPs and CTOs need to be having. The 87% vulnerability rate isn’t just a technical concern—it’s an organizational readiness warning.

At my EdTech startup, we handle student data protected by FERPA. That means:

  • PII of minors
  • Educational records
  • Behavioral data
  • Assessment results

A breach isn’t just bad PR—it could shut us down. Parent trust is everything in EdTech.

The Security Review Gate

We’ve implemented what I call security review gates for AI-generated code:

Gate 1: Engineering Review

  • Standard code review by peer
  • Explicit verification of AI-generated sections
  • Security-focused checklist for common AI vulnerabilities

Gate 2: Security Review (for sensitive data paths)

  • Dedicated security engineer review
  • Threat modeling for AI-generated security logic
  • Pen testing for authentication/authorization changes

Gate 3: Privacy Review (for student data)

  • Privacy team sign-off
  • FERPA compliance verification
  • Data minimization principles checked

This adds 2-4 days to delivery timelines for features touching student data. But the alternative—shipping vulnerabilities that expose student data—is unacceptable.

Training Engineers on AI Security

The challenge: most engineers haven’t been trained to spot AI-specific vulnerabilities. We’re running internal workshops on:

  • Common AI security anti-patterns (based on DryRun Security research)
  • Adversarial thinking (how attackers exploit AI-generated code)
  • Defense in depth (not trusting AI for security-critical logic)
  • Threat modeling (explicitly considering attack vectors in AI code)

This is an investment in organizational capability, not just tooling.

The ROI Question Returns

Michelle asked about balancing innovation with risk. Here’s my framework:

Acceptable AI use:

  • UI components (non-sensitive)
  • Test scaffolding (with verification)
  • Documentation generation
  • Utility functions

Questionable AI use:

  • Business logic (high verification cost)
  • Database access (security risks)
  • API implementations (authentication/authorization concerns)

Unacceptable AI use:

  • Authentication systems
  • Authorization logic
  • Encryption implementations
  • Sensitive data handling

If the security review cost exceeds the generation benefit, we shouldn’t be using AI for that task.

Luis’s compliance point about audit trails is critical. In EdTech, we face compliance audits from:

  • School districts
  • State education departments
  • Privacy regulators
  • Third-party security assessors

We can’t say “AI wrote it” when asked to justify security decisions. Human accountability is non-negotiable.

The Uncomfortable Reality

I think we’re going to see a bifurcation:

  • Low-stakes code (internal tools, non-sensitive features) → heavy AI use
  • High-stakes code (security, privacy, financial) → AI-prohibited or AI-assisted with rigorous review

The 65% AI-generated code future might apply to the low-stakes category. But for mission-critical systems? I don’t see how we get there responsibly.

Michelle, your AI-free zones approach is the right call for now. Maybe in 3-5 years, AI models will be trained specifically for security-conscious code generation. But today? The risk outweighs the velocity benefit for anything that touches sensitive data or security boundaries.


Keisha Johnson | VP of Engineering | EdTech Startup | Student data protection is my top priority