So our team adopted AI coding assistants about six months ago.
Everyone was pumped—management promised we’d ship 30% faster, engineers were excited to spend less time on boilerplate. And you know what? Individually, it feels true. I can spin up a component in half the time, AI handles the tedious parts, and I get to focus on the creative stuff.
But here’s what nobody warned us about: our code review time exploded by 91%.
Not “got a little slower.” Not “needs some optimization.” Ninety-one percent.
I first noticed it when our most senior engineer—let’s call her Sarah—started declining meetings. When I asked why, she said: “I’m drowning in PR reviews. Can’t make progress on my own work.” Turns out she wasn’t exaggerating. We pulled the data:
The numbers from our team:
- PRs per developer: +98% (almost double!)
- Average PR size: +154% (massive!)
- Time to review each PR: +47% (so much more to read)
- Bugs that made it to staging: +9% (quality took a hit)
- Senior engineer time spent on reviews: +91% (the bottleneck)
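For anyone who wants to pull the same numbers for their own team, here is a minimal sketch of the comparison we ran. The record fields and sample values are purely illustrative (not our real data), and the `pct_change` helper is just the usual before/after percentage delta:

```python
from statistics import mean

def pct_change(before: float, after: float) -> int:
    """Percentage change from a baseline period to a later period."""
    return round((after - before) / before * 100)

# Hypothetical per-PR records exported from your Git host for the
# six months before and after AI adoption; values are made up.
before = [{"lines": 180, "review_hours": 1.5}, {"lines": 220, "review_hours": 2.0}]
after  = [{"lines": 450, "review_hours": 2.5}, {"lines": 560, "review_hours": 3.1}]

size_delta = pct_change(mean(p["lines"] for p in before),
                        mean(p["lines"] for p in after))
review_delta = pct_change(mean(p["review_hours"] for p in before),
                          mean(p["review_hours"] for p in after))
print(f"PR size: {size_delta:+d}%  review time per PR: {review_delta:+d}%")
```

The point isn't the helper, it's that every stat in the list above is a plain before/after ratio you can reproduce from PR exports in an afternoon.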
This isn’t just us. Faros AI research across 10,000+ developers found the same pattern. Anthropic just launched Claude Code Review specifically because “code output per engineer is up 200% this year and reviews were the bottleneck.”
The Paradox Nobody Talks About
Here’s the mind-bending part: everyone believes they’re faster, but we’re not shipping any faster as a team.
There’s even research showing this perception gap. In the METR study, developers using AI took 19% longer to complete tasks, yet before starting they forecast AI would speed them up by about 24%, and even after finishing slower they still estimated it had made them roughly 20% faster.
We’re experiencing something similar. Devs on my team feel productive because they’re churning out code. But our cycle time—idea to production—hasn’t budged. In some sprints, it’s gotten worse.
Why Senior Engineers Became the Bottleneck
The old workflow looked like this:
- Junior writes code (learns patterns, makes mistakes)
- Senior reviews (catches issues, teaches better approaches)
- Code ships
The new workflow:
- Junior (or mid-level) prompts AI to write code
- AI generates 2-3x more code in the same time
- Senior engineer tries to review all of it
- Senior drowns, becomes bottleneck
- Mentorship time evaporates
Amazon just mandated that seniors must sign off on all AI-assisted code after multiple AI-related outages. That’s not a solution—that’s admitting the problem.
Senior engineers didn’t sign up to be AI code validators. They’re supposed to architect systems, mentor juniors, and tackle the hardest problems. Instead, they’re spending their days verifying that an AI didn’t introduce subtle bugs into a 500-line PR.
The Questions Keeping Me Up at Night
Is this just growing pains? Will we adapt and find equilibrium? Or is this a fundamental mismatch between AI coding speed and human review capacity?
Should we invest in AI code review tools? (They exist now—Anthropic, CodeRabbit, others) Or is that just adding more AI to fix the problems AI created?
Are we measuring the wrong things? We track PRs merged and code velocity. Should we care about cycle time and customer value instead?
How do we preserve mentorship? If juniors aren’t writing the initial code, how do they learn? If seniors aren’t teaching during reviews, when does knowledge transfer happen?
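On the measurement question: cycle time is easy to compute once you decide what the endpoints are. A minimal sketch, assuming you can export a "work started" and a "deployed to production" timestamp per change (the ISO format and the sample values here are assumptions, not from our tooling):

```python
from datetime import datetime

def cycle_time_days(started_at: str, deployed_at: str) -> float:
    """Idea-to-production time for one change, in days."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    delta = datetime.strptime(deployed_at, fmt) - datetime.strptime(started_at, fmt)
    return delta.total_seconds() / 86400

# Illustrative example: work started Monday morning, shipped the
# following Monday evening.
print(cycle_time_days("2025-01-06T09:00:00", "2025-01-13T17:30:00"))
```

Track the median of that number per sprint and you have a metric that can't be inflated by generating more code; PRs merged and lines written can.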
My Startup’s Failed Experiment
When I ran my startup (before it failed, but that’s a story for another post), we tried AI coding to move faster. We were a 3-person team, and AI felt like having a 4th developer. We generated so much code!
But then we’d spend days debugging issues we didn’t fully understand because we hadn’t written the code ourselves. We’d ship features faster but break old ones. We optimized for velocity and got fragility.
That experience taught me: Code that ships isn’t the same as code that works and code you understand.
So What Do We Do?
I don’t have answers—just a lot of questions and a pile of data that doesn’t align with the AI coding hype.
From a design systems perspective, this feels like a classic “local optimization, global pessimization” problem. We optimized the “write code” step and created chaos everywhere else.
Has your team experienced this? Are senior engineers drowning in reviews? Have you found solutions that actually work? Or am I missing something obvious here?
Would love to hear especially from engineering leaders who’ve navigated this—and from seniors who are living it right now.
Sources if you want to dive deeper: