Okay, I need to share something that’s been driving me absolutely bonkers.
Three months ago, my design systems team started using AI coding assistants. The promise? Ship components faster, iterate quicker, spend less time on boilerplate. The reality? Our PR queue has basically exploded, and I’m watching talented engineers drown in review work.
The Numbers Don’t Lie (But They’re Confusing)
Here’s what actually happened after we adopted AI tools:
- Our team is merging 98% more pull requests than before
- Individual developers report feeling way more productive
- BUT… PR review time increased by 91%

Wait, what? How does that even make sense?
It gets weirder. The average PR size grew by 154%. Turns out, when AI can generate a whole component implementation in minutes instead of hours, developers don’t break their work into smaller, reviewable chunks anymore. They just… ship the whole thing at once.
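To make the compounding concrete, here’s the back-of-the-envelope math. This is a sketch that assumes the 91% is per-PR review time (the stat doesn’t actually specify) and picks a made-up baseline of 50 PRs a month at an hour of review each:

```ts
// Rough arithmetic only. The baseline numbers are invented for illustration;
// the multipliers are the ones from our stats above.
const baseline = { prsPerMonth: 50, hoursPerReview: 1 };

const after = {
  prsPerMonth: baseline.prsPerMonth * 1.98,       // 98% more PRs merged
  hoursPerReview: baseline.hoursPerReview * 1.91, // 91% longer per review (assumed per-PR)
};

const baselineLoad = baseline.prsPerMonth * baseline.hoursPerReview; // 50 hours/month
const afterLoad = after.prsPerMonth * after.hoursPerReview;          // ~189 hours/month

console.log(`Total review load: ${(afterLoad / baselineLoad).toFixed(2)}x`); // ≈ 3.78x
```

If that read is even close to right, our review capacity would need to nearly quadruple just to stand still. No wonder the queue exploded.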
The Design Systems Nightmare
From a design systems perspective, this created a weird dynamic I didn’t expect.
Engineers can now generate UI code really fast based on our component specs. That should be great, right? More design system adoption, faster implementation, happy designers!
Except the review bottleneck broke our entire feedback loop. We’d iterate on a component design, engineering would update the implementation with AI in like 20 minutes, and then… crickets for 3-4 days while it sat in the review queue. By the time we got feedback that something didn’t work, we were already two sprints ahead.
The collaboration that made our design system actually good — that tight back-and-forth between design and engineering — basically collapsed under the weight of the queue.
Why AI Code Is Harder to Review
I’m not an engineer (well, not primarily), but I’ve been watching this closely, and here’s what I’ve noticed about reviewing AI-generated code:
1. Less familiar patterns: AI doesn’t write code like your teammates do. It’s technically correct, but the patterns are just… different. Reviewers spend extra mental energy parsing unfamiliar approaches.
2. More verbose: AI really likes to be thorough. That PR that would’ve been 50 lines from a human? AI writes 200 lines. All technically fine, but way more surface area to review.
3. Lacks context: When a human writes code, they leave breadcrumbs — variable names that reflect the domain, comments about why they chose an approach. AI code often looks good but doesn’t tell you the story of the implementation.
4. Edge cases everywhere: I’ve noticed AI-generated code tends to handle edge cases I didn’t even ask for. Again, seems good! But reviewers have to validate all of it, and sometimes those edge cases introduce subtle bugs. (There’s a quick sketch of what I mean right after this list.)
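Here’s a contrived TypeScript sketch of points 2 and 4. It’s not real code from our repo, just the shape of it:

```ts
// Contrived illustration, not from our codebase.
interface ButtonProps {
  disabled?: boolean;
  loading?: boolean;
}

// What a teammate who knows our conventions writes:
const isDisabled = (props: ButtonProps): boolean =>
  Boolean(props.disabled || props.loading);

// What the assistant tends to hand back: defensive checks for inputs the
// component contract already rules out. Technically fine, but every extra
// branch is surface area the reviewer has to validate.
function isDisabledVerbose(props: ButtonProps | null | undefined): boolean {
  if (props === null || props === undefined) {
    return true; // a null-props "edge case" nobody asked for
  }
  if (typeof props.disabled === "boolean" && props.disabled) {
    return true;
  }
  if (typeof props.loading === "boolean" && props.loading) {
    return true;
  }
  return false;
}
```

Multiply that by every function in a 400-line PR and you can see where the review hours go.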
What We Tried (Spoiler: Nothing Really Worked)
Our team has been scrambling to address this:
- Rotating review responsibilities: Just spread the pain around. Everyone’s equally underwater now!
- Trying AI code review tools: Anthropic just launched their Code Review tool in March, and we’re experimenting with it. Early days, but it’s finding stuff humans miss (and also flagging stuff that’s totally fine, so… mixed results)
- Review time limits: “Keep reviews under 30 minutes!” Great in theory, terrible in practice when PRs are 400 lines long (more on a size-based alternative below)
- Async review cycles: Tried doing reviews in batches. Just meant bigger context-switching overhead.
None of it really solved the fundamental problem. We’re generating code faster than we can responsibly review it.
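The one idea still on my list: enforce PR size mechanically instead of asking reviewers to watch the clock. A minimal sketch using Danger JS (danger.systems/js), where the 300/800-line thresholds are numbers I invented, not team policy:

```ts
// dangerfile.ts: a minimal size-gate sketch. Thresholds are invented for
// illustration; tune them to your team's real review capacity.
import { danger, fail, warn } from "danger";

const { additions, deletions } = danger.github.pr;
const changedLines = additions + deletions;

const SOFT_LIMIT = 300; // assumed: above this, nudge
const HARD_LIMIT = 800; // assumed: above this, block

if (changedLines > HARD_LIMIT) {
  fail(`This PR touches ${changedLines} lines. Please split it into reviewable chunks.`);
} else if (changedLines > SOFT_LIMIT) {
  warn(`This PR touches ${changedLines} lines; PRs this size are where our queue stalls.`);
}
```

It wouldn’t fix the underlying mismatch, but it would at least push the “break your work into chunks” habit back into the workflow.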
The Productivity Paradox
The really frustrating part? According to the data, developers using AI believe they’re 20% more productive. They feel faster because they’re writing code in less time.
But a recent study found that developers are actually 19% slower at completing tasks when you measure end-to-end delivery. The review bottleneck completely wipes out the generation speed gains.
And then there’s the bug rate. Our incidents went up about 9% since adopting AI tools. Nothing catastrophic, but definitely the wrong direction.
The Real Question
So here’s what I’m struggling with: Is this just the new normal?
Like, is this a temporary transition period while we figure out new workflows? Or did we fundamentally break something about how code review is supposed to work?
Some days I think we need to completely redesign our review process from scratch — maybe treat AI-generated PRs differently, have different approval gates, I don’t know.
Other days I think we should just slow down the AI code generation to match our review capacity, but that feels like leaving productivity on the table (even if the productivity gains are somewhat illusory).
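For what it’s worth, here’s roughly what “different approval gates” could look like, sketched with Danger JS again. The “ai-assisted” label convention and the two-approval rule are hypothetical, not anything we’ve adopted:

```ts
// dangerfile.ts: a hypothetical stricter gate for AI-assisted PRs.
// The "ai-assisted" label and the two-approval rule are invented here.
import { danger, fail } from "danger";

const labels = danger.github.issue.labels.map((label) => label.name);
const approvals = danger.github.reviews.filter(
  (review) => review.state === "APPROVED"
).length;

if (labels.includes("ai-assisted") && approvals < 2) {
  fail("AI-assisted PRs need two approvals before merge under this policy sketch.");
}
```

Whether deliberately slowing down AI-generated merges is a feature or a bug is exactly the thing I can’t decide.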
How Are You All Handling This?
I can’t be the only one seeing this pattern, right?
- Have you found workflows that actually work with AI-assisted development?
- Are you treating AI-generated code differently in your review process?
- Did you just… accept the slower velocity as the price of AI code?
- Or did you find some magic solution I’m missing?
At our current pace, I’m genuinely worried about long-term quality. Our design system is too critical to let technical debt sneak in because we couldn’t keep up with reviews. But I also don’t want to be the person who says “stop using the productivity tools” when everyone feels like they’re shipping faster.
Help?
Sources that made me realize this isn’t just us:
- AI Coding Statistics 2026 — where I found the 91% review time stat
- AI Productivity Paradox Research — the belief vs reality productivity data
- Developer Productivity Statistics 2026 — 41% of code is AI-generated now (!)