I’ve been noticing something at our EdTech startup over the past few months—our PR review process has gotten… messier. More back-and-forth, more bugs caught in review, more “wait, this logic doesn’t handle edge cases” comments. At first, I thought we’d hired too fast or needed better onboarding. Then I saw the data.
The Numbers Don’t Lie
CodeRabbit just released their State of AI vs Human Code Generation report, analyzing 470 open-source GitHub pull requests. The finding that jumped out: AI-generated code creates 1.7× more issues compared to human-written code.
But it’s not just volume—it’s severity and type:
- Logic and correctness issues: up 75% (business logic errors, misconfigurations, unsafe control flow)
- Code quality and maintainability: 1.64× more frequent
- Security vulnerabilities: 1.57× higher
- Performance issues: 1.42× more common
And the kicker? AI-authored PRs contain 1.4× more critical issues and 1.7× more major issues on average.
The Productivity Paradox
Here’s what makes this complicated: while pull requests per author increased by 20% year-over-year (thank you, AI coding assistants), incidents per pull request increased by 23.5%. We’re shipping more code faster, but we’re also shipping more problems.
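To make the paradox concrete: if those two year-over-year figures compound (more PRs per author, and more incidents per PR), the net effect on total incidents per author is roughly:

```python
# Back-of-the-envelope arithmetic on the two figures quoted above.
prs_growth = 1.20                # 20% more pull requests per author, YoY
incidents_per_pr_growth = 1.235  # 23.5% more incidents per pull request, YoY

# Total incidents per author scale with both factors.
total_incident_growth = prs_growth * incidents_per_pr_growth  # ~1.48

print(f"~{(total_incident_growth - 1) * 100:.0f}% more total incidents per author")
```

In other words, a 20% throughput gain paired with a 23.5% per-PR incident increase means we're absorbing nearly half again as many incidents per author as before.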
At my previous role at Slack, we obsessed over defect escape rates. If we’d seen this trend, alarms would be going off. But in 2026, with 85% of developers using AI coding tools, this is just… normal now?
So What Do We Actually Do About It?
This is where I need the community’s perspective. Should we be treating AI-generated PRs differently in our review process? Here are the options I’m considering:
Option 1: Add a Flag/Label for AI-Generated Code
Simple PR label: “AI-assisted” or similar. Makes reviewers aware, but doesn’t change the process.
Pros: Transparency, easy to implement, opt-in
Cons: Could create stigma, relies on honor system, might be ignored
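Mechanically, this could be as small as a PR-template checkbox that a CI step turns into a label. A minimal sketch of the decision logic (the checkbox text and label name below are conventions I'm making up, not a standard; you'd still wire the result into your forge's labeling API):

```python
# Hypothetical convention: contributors tick a checkbox in the PR template,
# and a CI step maps that checkbox to a label. Honor system by design.
AI_CHECKBOX = "[x] this pr contains ai-generated or ai-assisted code"

def labels_for(pr_description: str) -> list[str]:
    """Return the extra labels to apply to a pull request."""
    if AI_CHECKBOX in pr_description.lower():
        return ["ai-assisted"]
    return []  # no checkbox ticked: no label, nothing else changes
```

The appeal is that it costs almost nothing to try; the honor-system weakness in the cons list above is baked right into the design.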
Option 2: Require Extra Scrutiny on High-Risk Areas
Focus reviews on where AI struggles most: authentication, authorization, state management, security boundaries. Bright Security’s best practices recommend treating any AI code that touches identity, access, or state as high-risk by default.
Pros: Targeted, evidence-based, addresses actual risk
Cons: Requires defining “high-risk,” needs reviewer training
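One way to operationalize "defining high-risk" is a path-based filter over the files a PR touches, so review tooling can request extra scrutiny automatically. A sketch, with illustrative globs (each team would define its own list):

```python
# Sketch: flag changed files that fall in high-risk areas.
# The glob patterns are examples only; fnmatch's "*" matches across
# path separators, which is what we want here.
from fnmatch import fnmatch

HIGH_RISK_GLOBS = [
    "*/auth/*",         # authentication flows
    "*/permissions/*",  # authorization checks
    "*/state/*",        # state management
]

def high_risk_files(changed_paths: list[str]) -> list[str]:
    """Return the subset of changed paths matching a high-risk glob."""
    return [p for p in changed_paths
            if any(fnmatch(p, g) for g in HIGH_RISK_GLOBS)]
```

A PR that returns a non-empty list could require an extra approval or a dedicated checklist, regardless of whether AI wrote the code.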
Option 3: Use AI Review Tools to Catch AI Mistakes
Anthropic just launched Claude Code Review specifically to address this. Fight fire with fire?
Pros: Scalable, catches patterns humans miss, reduces reviewer burden
Cons: False positives, cost, still need human oversight
Option 4: Keep Process the Same, Trust Human Reviewers
Maybe this is a temporary problem. Maybe reviewers will adapt and get better at catching AI-generated issues.
Pros: No process change, avoids complexity
Cons: Ignores the data, increases reviewer cognitive load
Where I’m Landing
I’m leaning toward a combination of Option 2 (focused scrutiny on high-risk areas) with a lightweight version of Option 1 (optional flagging). Based on what I’ve read, AI consistently struggles with security boundaries—it optimizes for the happy path while attackers exploit the edge cases.
In our Q2 planning, I’m proposing:
- Enhanced review checklist for authentication, authorization, and state transitions (regardless of AI usage)
- Optional “AI-assisted” label for transparency
- Team education on common AI code patterns and their failure modes
- Experiment with AI review tools on a subset of repos
But I could be totally wrong!
What I’m Curious About
- Have you noticed similar quality issues with AI-generated code on your teams?
- What review practices have you changed (if any) to adapt to AI-assisted development?
- Are you measuring this? What metrics tell you if AI is helping or hurting overall code quality?
- Is this a permanent tradeoff (speed vs. quality) or a temporary growing pain as AI tools improve?
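On the measurement question: even without fancy tooling, the simplest signal I can think of is review-issue counts per PR, split by the AI-assisted flag. A sketch, assuming you log per-PR records somewhere (the field names are hypothetical):

```python
# Sketch: average review-issue count per PR, split by AI-assisted flag.
# Assumes each PR is logged as {"issues": int, "ai_assisted": bool}.
from statistics import mean

def issues_per_pr(prs: list[dict]) -> dict[str, float]:
    """Average review-issue count for AI-assisted vs. other PRs."""
    ai = [p["issues"] for p in prs if p["ai_assisted"]]
    human = [p["issues"] for p in prs if not p["ai_assisted"]]
    return {
        "ai": mean(ai) if ai else 0.0,
        "human": mean(human) if human else 0.0,
    }
```

If your own ratio looks anything like the report's 1.7×, that's a strong argument for process change; if it doesn't, maybe your reviewers have already adapted.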
The data says we have 1.7× more issues to deal with. The question is: do we adapt our processes, or do we accept the tradeoff as the cost of 20% more throughput?
I’d love to hear how other engineering leaders are thinking about this.