I need to talk about the crisis nobody’s discussing openly: AI coding tools are burning out our senior engineers.
## The Review Avalanche
Six months ago, we adopted GitHub Copilot across our EdTech startup (80 engineers, growing fast). The productivity gains were immediate—our junior engineers especially became significantly more prolific. But we created a new problem we didn’t anticipate.
Our PR metrics before/after AI adoption:
| Metric | Before AI | After AI | Change |
|---|---|---|---|
| PRs per week | 45 | 127 | +182% |
| Average review time | 4 hours | 9 hours | +125% |
| Senior engineer review load | 6 PRs/week | 18 PRs/week | +200% |
| Senior engineer time spent reviewing | 30% | 60% | +100% |
Let me be blunt: my senior engineers are spending 60% of their time reviewing code instead of architecting systems, mentoring, or solving hard technical problems.
## The Human Cost (This Is What Keeps Me Up at Night)
Three weeks ago, one of my best senior engineers—let’s call her Sarah—came to my office and said: “Keisha, I’m drowning. I barely write code anymore. I’m just a code review machine.”
Sarah’s not alone. In our last engagement survey:
- 73% of senior engineers report review burden as #1 frustration
- 2 seniors have started looking externally (I found out through a backchannel)
- Junior engineers are frustrated too—waiting 48+ hours for review feedback
This isn’t sustainable. We’re at risk of losing our most experienced people because AI made everyone else faster.
## The Irony: AI Helps Writing But Not Reviewing (Or Does It?)
Here’s the paradox: AI tools are excellent at helping engineers write code. GitHub Copilot suggests completions, writes boilerplate, even generates tests.
But reviewing AI-generated code is arguably HARDER than reviewing human-written code:
- AI can generate syntactically correct code that’s subtly wrong
- Pattern matching is harder—AI doesn’t always follow team conventions
- Reviewing 200-line PRs takes longer than reviewing 50-line PRs, even if quality is similar
- Junior engineers using AI may not understand the code well enough to explain it
One of my seniors put it perfectly: “I don’t just review the code—I review whether the engineer understands what they wrote.”
## The Solutions We’re Implementing
After three months of experimentation, here’s our multi-pronged approach:
### 1. AI-Assisted Code Review (Fight Fire with Fire)
We’re piloting AI review tools to pre-screen PRs:
- CodeRabbit: Automated review comments on patterns, potential bugs, style issues
- GitHub Copilot Workspace: Helps reviewers understand code context faster
- Custom Linters: Enhanced to catch AI-common patterns we’ve identified
Early results: AI review tools catch 40% of the issues seniors would have flagged, freeing them to focus on architecture and logic.
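To make the "custom linters" point concrete: most of our checks are small AST rules, not anything exotic. Here's a minimal sketch of one such rule in Python; the specific pattern (silently swallowed exceptions) is just an illustrative example of the kind of thing we flag before a human ever looks at the PR, not our actual rule set:

```python
import ast

def find_swallowed_exceptions(source: str) -> list[int]:
    """Return line numbers of `except` handlers that silently swallow errors.

    Overly broad, empty exception handlers are one pattern that shows up
    often in generated code; flagging them automatically saves a senior
    from writing the same review comment over and over.
    """
    tree = ast.parse(source)
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler):
            # Handler body is a single `pass`: the error vanishes silently.
            if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
                findings.append(node.lineno)
    return findings

sample = """
try:
    sync_roster()
except Exception:
    pass
"""
print(find_swallowed_exceptions(sample))  # → [4]
```

A rule like this runs in CI and posts a comment, so the PR arrives at a human reviewer with the mechanical issues already annotated.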
### 2. Tiered Review Process (Not All PRs Are Equal)
This was culturally hard but necessary. We created explicit review tiers:
**Tier 0 - Auto-merge:**
- Criteria: Tests pass + < 50 lines + documentation/config only + automated security scan clean
- Reviewer: Automated tooling only
- Time to merge: < 10 minutes
- Volume: ~20% of PRs
**Tier 1 - Peer Review:**
- Criteria: Feature work within established patterns + < 200 lines
- Reviewer: Another engineer in same domain (can be mid-level)
- Time to merge: < 4 hours
- Volume: ~50% of PRs
**Tier 2 - Senior Review:**
- Criteria: New patterns + performance implications + security-sensitive code
- Reviewer: Senior engineer or tech lead
- Time to merge: < 12 hours
- Volume: ~25% of PRs
**Tier 3 - Architecture Review:**
- Criteria: Cross-service changes + data model changes + API contracts
- Reviewer: Staff engineer + relevant domain lead
- Time to merge: 1-2 days
- Volume: ~5% of PRs
The key cultural shift: Appropriate review for risk level, not one-size-fits-all.
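In tooling terms, the routing logic is deliberately boring. Here's a sketch of how a PR might be bucketed; the field names and thresholds are illustrative stand-ins for whatever metadata your CI already produces, and the highest-risk criteria win:

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    lines_changed: int
    tests_pass: bool
    docs_or_config_only: bool
    security_scan_clean: bool
    touches_security: bool = False
    new_pattern: bool = False
    perf_sensitive: bool = False
    cross_service: bool = False
    data_model_change: bool = False
    api_contract_change: bool = False

def review_tier(pr: PullRequest) -> int:
    """Route a PR to a review tier; highest-risk criteria are checked first."""
    # Tier 3: architecture review for cross-cutting changes.
    if pr.cross_service or pr.data_model_change or pr.api_contract_change:
        return 3
    # Tier 2: senior review for new patterns, perf, or security-sensitive code.
    if pr.new_pattern or pr.perf_sensitive or pr.touches_security:
        return 2
    # Tier 0: auto-merge only for small, safe, docs/config-only changes.
    if (pr.tests_pass and pr.lines_changed < 50
            and pr.docs_or_config_only and pr.security_scan_clean):
        return 0
    # Tier 1: everything else goes to a peer in the same domain.
    return 1

print(review_tier(PullRequest(30, True, True, True)))    # → 0
print(review_tier(PullRequest(150, True, False, True)))  # → 1
```

Checking risk criteria before size criteria matters: a 20-line change to an API contract should never slip into auto-merge just because it's small.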
### 3. “Review Office Hours” (Batching Over Interrupts)
Senior engineers were being interrupt-driven all day—every new PR was a context switch.
We implemented structured review time:
- Morning Review Block: 9-11am, dedicated review time
- Afternoon Review Block: 2-3:30pm, dedicated review time
- Outside these windows: Only urgent/blocking reviews
This reduced context switching and gave seniors protected time for deep work.
### 4. Review Capacity Planning (Treat It Like Any Other Resource)
We now forecast review capacity in sprint planning:
- Estimate review hours needed based on planned work complexity
- Allocate senior review time as a constrained resource
- If review capacity is fully allocated, delay low-priority work
Sounds obvious, but we weren’t doing this before. PRs were “infinite demand” on senior time.
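The forecast itself is back-of-envelope arithmetic, which is part of why skipping it was so painless for so long. A sketch, with made-up per-tier review costs and a made-up senior-hours budget:

```python
# Rough senior-hours per PR review, by tier (illustrative numbers only).
# Tiers 0 and 1 cost no senior time by design.
REVIEW_HOURS = {0: 0.0, 1: 0.0, 2: 1.5, 3: 4.0}

def senior_hours_needed(planned_prs: dict[int, int]) -> float:
    """Estimate senior review hours for a sprint's planned PR mix."""
    return sum(REVIEW_HOURS[tier] * count for tier, count in planned_prs.items())

# Example sprint: 25 tier-0, 60 tier-1, 30 tier-2, 6 tier-3 PRs.
demand = senior_hours_needed({0: 25, 1: 60, 2: 30, 3: 6})
capacity = 8 * 10  # e.g. 8 seniors, each budgeting 10 review hours/week
print(demand, capacity, demand <= capacity)
```

When `demand` exceeds `capacity`, something in the sprint gets cut or deferred, exactly as we would for any other over-allocated resource.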
## The Results (3 Months In)
**Quantitative:**
- Average PR review time: Down from 9 hours to 5 hours
- Senior engineer review load: Down from 18 to 11 PRs/week (still high, but manageable)
- Time to merge (P50): 6 hours (down from 24 hours)
**Qualitative:**
- Senior engineer satisfaction: Significantly improved
- Junior engineers feel more trusted (peer review empowers them)
- “Review quality” hasn’t decreased (measured by bug escape rate)
**Sarah’s Update:**
She’s still here, and in our 1:1 last week she said: “I feel like an architect again, not a reviewer.”
## The Challenges We’re Still Facing
Being honest about what’s not working:
- Perception of “junior distrust”: Some junior engineers feel Tier 0/1 reviews mean they’re “not trusted.” We’re working on the communication; it’s about efficiency, not trust.
- Gaming the system: Engineers trying to keep PRs under the size limits to hit Tier 0/1, even when it means splitting work artificially. We’re learning to detect this.
- Edge cases: Some PRs don’t fit neatly into tiers. We need human judgment, which requires review lead training.
- Tool fatigue: Adding AI review tools means another tool to learn, another notification stream. We’re being selective.
## The Bigger Pattern: AI Exposes Organizational Bottlenecks
This is part of a bigger theme I’m seeing: AI tools don’t just make individuals faster—they stress-test your entire organizational design.
Code review was always a bottleneck; we just didn’t notice because it was manageable. AI turned “manageable constraint” into “crisis.”
The same pattern plays out everywhere:
- Testing infrastructure (Luis wrote about this recently—excellent thread!)
- Deployment pipelines
- QA capacity
- Product requirement clarity
If your organizational processes were designed for 50 PRs/week, they’ll break at 150 PRs/week—no matter how good the code is.
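The nonlinearity is what catches people off guard. A single-server queueing approximation (M/M/1, so purely illustrative; all numbers are made up) shows why roughly tripling arrivals can more than quadruple time-in-queue once you approach capacity:

```python
def mm1_wait_weeks(arrival_rate: float, service_rate: float) -> float:
    """Mean time in an M/M/1 system (waiting + service), in weeks.

    arrival_rate: PRs arriving per week; service_rate: PRs reviewable per week.
    """
    assert arrival_rate < service_rate, "system is overloaded"
    return 1.0 / (service_rate - arrival_rate)

capacity = 150.0  # PRs/week the review process can absorb (illustrative)
for load in (45.0, 127.0):
    hours = mm1_wait_weeks(load, capacity) * 168  # 168 wall-clock hours/week
    print(f"{load:.0f} PRs/week -> ~{hours:.1f}h average time in review queue")
# Load grows ~2.8x, but average time-in-queue grows ~4.6x.
```

Real review queues are messier than M/M/1, but the shape of the curve is the point: delay stays flat at low utilization and explodes as you near capacity, which is why a process that felt fine at 50 PRs/week falls over at 150.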
## Questions for the Community
- How are other scaling teams handling review capacity? Especially if you’re 100+ engineers or growing quickly.
- AI review tools: What’s working? We’re seeing value from CodeRabbit, but curious about other experiences.
- Cultural resistance: How did you overcome the “every PR needs a thorough senior review” mindset?
- Metrics: What do you track to measure review effectiveness? We’re tracking time and volume, but what about quality?
My time at Google and Slack taught me a lesson in empathetic leadership that applies here: we can’t just tell people to “review faster.” We need systemic changes that respect everyone’s time and cognitive load.
AI is a gift, but only if we redesign our processes to handle the volume it creates. Otherwise, it’s just a fancy way to burn people out.
What’s your code review bottleneck story? How are you adapting?