The Strangler Pattern: Why Netflix, Amazon, and Uber All Use It for Modernization

If you’ve ever wondered how the largest tech companies migrate their legacy systems without taking their services offline, the answer is almost always the Strangler Pattern.

What Is the Strangler Pattern?

Martin Fowler described this pattern in 2004 after seeing strangler figs in Queensland’s rainforest. A fig seeds in the upper branches of a host tree, gradually works its way down until it roots in the soil, and wraps around the host as it grows. Eventually, the fig becomes self-sustaining, and the original host tree may die - leaving the fig as an echo of its shape.

The software analogy is elegant: rather than replacing a legacy system all at once (the dreaded ‘big bang’ rewrite), you gradually grow new services around the old system until the legacy components are no longer needed.

How It Works in Practice

Step 1: Build the Façade

You create an indirection layer (proxy) that intercepts all requests going to your legacy system. Initially, this façade simply forwards everything to the old system - zero behavior change.
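A minimal sketch of the pass-through stage, with no real network involved. The names `Facade` and `legacy_backend` are hypothetical stand-ins, and a "backend" is modelled as a plain function rather than an HTTP service:

```python
from typing import Callable

# A backend is modelled as a function from (path, payload) to a response dict.
# In a real migration this would be an HTTP call or RPC; that is an assumption here.
Backend = Callable[[str, dict], dict]


def legacy_backend(path: str, payload: dict) -> dict:
    """Hypothetical stand-in for the legacy system."""
    return {"handled_by": "legacy", "path": path, "echo": payload}


class Facade:
    """Indirection layer that every caller goes through."""

    def __init__(self, default: Backend) -> None:
        self._default = default

    def handle(self, path: str, payload: dict) -> dict:
        # Step 1: there are no routing rules yet, so every request is
        # forwarded unchanged to the legacy system.
        return self._default(path, payload)


facade = Facade(default=legacy_backend)
response = facade.handle("/orders/42", {"action": "read"})
```

At this point the façade adds a hop but changes nothing the caller can observe, which is what makes it safe to deploy before any replacement service exists.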

Step 2: Identify Thin Slices

Break your legacy system into manageable pieces. Each slice should be:

  • Independent enough to be replaced in isolation
  • Significant enough to deliver business value
  • Well-defined at its boundaries

Step 3: Build and Route

For each slice:

  1. Build the replacement service
  2. Test it extensively
  3. Update the façade to route traffic to the new service
  4. Monitor for issues
  5. Retire the legacy component
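The routing step can be sketched as a small table inside the façade. This is an illustration under assumptions: `RoutingFacade`, `migrate`, `rollback`, and `new_orders_service` are made-up names, and the prefix matching is deliberately naive.

```python
from typing import Callable, Dict

Backend = Callable[[str, dict], dict]


def legacy_backend(path: str, payload: dict) -> dict:
    """Hypothetical stand-in for the legacy system."""
    return {"handled_by": "legacy", "path": path}


def new_orders_service(path: str, payload: dict) -> dict:
    """Hypothetical replacement for the 'orders' slice."""
    return {"handled_by": "orders-v2", "path": path}


class RoutingFacade:
    """Facade that sends migrated path prefixes to their new services."""

    def __init__(self, legacy: Backend) -> None:
        self._legacy = legacy
        self._routes: Dict[str, Backend] = {}

    def migrate(self, prefix: str, backend: Backend) -> None:
        # Cut a slice over to its replacement service.
        self._routes[prefix] = backend

    def rollback(self, prefix: str) -> None:
        # Removing the rule sends the slice back to the legacy system.
        self._routes.pop(prefix, None)

    def handle(self, path: str, payload: dict) -> dict:
        # Naive string-prefix matching; the longest matching prefix wins.
        matches = [p for p in self._routes if path.startswith(p)]
        if matches:
            return self._routes[max(matches, key=len)](path, payload)
        # Anything not yet migrated still goes to the legacy system.
        return self._legacy(path, payload)


facade = RoutingFacade(legacy_backend)
facade.migrate("/orders", new_orders_service)
after_cutover = facade.handle("/orders/42", {})
untouched = facade.handle("/invoices/7", {})

facade.rollback("/orders")
after_rollback = facade.handle("/orders/42", {})
```

Because each slice is one entry in the table, rollback is per component: removing the entry restores the old path without touching any other slice.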

Step 4: Repeat Until Done

Keep slicing and replacing until the entire legacy system is decommissioned.

Why the Big Companies Use It

Netflix migrated from their aging Reloaded media platform to Cosmos using this exact pattern. Both systems ran in parallel while functionality was incrementally shifted. Cosmos is now fully microservices-based, running on Docker and AWS.

Amazon, Uber, Airbnb, Etsy - all transitioned from monolithic architectures to microservices using Strangler or similar patterns.

The common thread: these are businesses that can’t afford downtime. You can’t tell Netflix customers ‘we’re down for six months while we rewrite everything.’

Strangler vs. Big Bang: The Trade-offs

| Aspect | Strangler Pattern | Big Bang Rewrite |
| --- | --- | --- |
| Risk | Low (incremental changes) | High (all-or-nothing) |
| Business Continuity | Maintained throughout | Disrupted during rewrite |
| New Features | Can ship during migration | Frozen until completion |
| Cost Timing | Spread over project duration | Heavy upfront investment |
| Timeline | Flexible, can pause | Fixed deadline pressure |
| Rollback | Easy (per component) | Near impossible |

When Strangler Pattern Shines

  • Large, business-critical systems that can’t go offline
  • Complex codebases where full understanding is impossible upfront
  • Teams that need to deliver features while migrating
  • Organizations that want to learn and adjust during migration
  • Systems with many independent modules or clear boundaries

The Gotchas

  1. The Data Layer: Shared databases are the killer. You can’t easily strangle a tightly coupled data model.

  2. The Zombie Monolith: Some migrations stall at 60-80% complete. You end up maintaining both systems forever.

  3. The Proxy Bottleneck: Your façade layer can become a single point of failure or performance bottleneck.

  4. Premature Decomposition: If you don’t understand your domain well, you’ll draw the wrong service boundaries.

Gartner has estimated that by 2025, over 95% of new digital workloads will be deployed on cloud-native platforms. The Strangler Pattern is how most enterprises will get there - not through heroic rewrites, but through patient, incremental modernization.

What’s your experience with strangler migrations? Have you seen them succeed? Fail? Stall in the middle?

Alex, your overview is excellent. Let me add the financial services perspective on Strangler Pattern adoption, because regulated environments add layers of complexity that general guidance often misses.

Why Strangler Is Mandatory in Financial Services

In banking and financial services, a ‘big bang’ rewrite isn’t just risky - it’s often not permitted by regulators. The OCC and Fed expect continuous operation of critical systems. A rewrite that takes systems offline, even for planned maintenance windows, requires regulatory notification and can trigger enhanced supervisory scrutiny.

The Strangler Pattern aligns perfectly with regulatory expectations:

  • Parallel running: Regulators love when you can prove the new system produces identical results to the old one
  • Rollback capability: Every production change must be reversible - with Strangler, the façade can route a slice back to the legacy system
  • Audit trail: Each migration step is documented, testable, and auditable

The Challenges We Faced

Our core banking migration took 30 months using Strangler. Three challenges were specific to financial services:

1. Data Consistency Is Non-Negotiable

In banking, you can’t have ‘eventual consistency’ on account balances. We had to implement synchronous data verification at the façade layer - every transaction validated against both systems before the new system became authoritative.
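To make the idea concrete, here is a rough sketch of synchronous verification, not our production code. `Ledger`, `post_transaction`, and `VerificationError` are hypothetical names, and the in-memory ledgers stand in for the two real systems. The legacy ledger stays authoritative; the candidate only has to agree with it.

```python
from decimal import Decimal
from typing import Dict


class VerificationError(Exception):
    """Raised when the two systems disagree on a resulting balance."""


class Ledger:
    """Toy in-memory ledger standing in for either system (hypothetical)."""

    def __init__(self, name: str, balances: Dict[str, Decimal]) -> None:
        self.name = name
        self.balances = dict(balances)

    def apply(self, account: str, amount: Decimal) -> Decimal:
        new_balance = self.balances.get(account, Decimal("0")) + amount
        self.balances[account] = new_balance
        return new_balance


def post_transaction(legacy: Ledger, candidate: Ledger,
                     account: str, amount: Decimal) -> Decimal:
    """Apply the transaction to both ledgers and compare before returning."""
    legacy_balance = legacy.apply(account, amount)
    candidate_balance = candidate.apply(account, amount)
    if legacy_balance != candidate_balance:
        # A real system would also need to reconcile or compensate here;
        # this sketch only surfaces the mismatch.
        raise VerificationError(
            f"{account}: {legacy.name}={legacy_balance} "
            f"{candidate.name}={candidate_balance}"
        )
    # The legacy result is the one returned to the caller.
    return legacy_balance


legacy = Ledger("legacy", {"ACC-1": Decimal("100.00")})
candidate = Ledger("candidate", {"ACC-1": Decimal("100.00")})
balance = post_transaction(legacy, candidate, "ACC-1", Decimal("-25.50"))
```

`Decimal` is used instead of floats so that the comparison is exact; a float-based check would report false mismatches on ordinary amounts.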

2. Compliance Boundaries Don’t Match Technical Boundaries

Our legacy system had compliance logic scattered throughout the codebase. We couldn’t strangle the ‘payments module’ in isolation because AML screening was embedded in 17 different places. We had to first extract and centralize compliance before we could strangle anything.

3. Audit Requirements Added Overhead

Every routing change in the façade required change management documentation, impact assessment, and in some cases regulatory notification. This added 2-3 weeks to each ‘slice’ migration.

What Worked

We used the ‘shadow mode’ pattern extensively: route traffic to both systems, compare results, alert on discrepancies. We ran shadow mode for 90 days on each component before switching production traffic.

This gave us and our regulators confidence that the new system was functionally equivalent.
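A simplified sketch of shadow mode, assuming read-style requests with no side effects. `shadow_call`, `legacy_fee`, and `new_fee` are illustrative names, and the rounding difference between the two fee functions is invented purely to produce a discrepancy.

```python
from typing import Callable, List, Tuple

Backend = Callable[[dict], dict]
Discrepancy = Tuple[dict, dict, dict]  # (request, primary response, shadow response)


def shadow_call(request: dict, primary: Backend, shadow: Backend,
                discrepancies: List[Discrepancy]) -> dict:
    """Serve from the primary system; run the shadow system only for comparison."""
    primary_response = primary(request)
    try:
        shadow_response = shadow(request)
    except Exception as exc:
        # A failing shadow must never affect the caller; record it and move on.
        discrepancies.append((request, primary_response, {"error": repr(exc)}))
        return primary_response
    if shadow_response != primary_response:
        discrepancies.append((request, primary_response, shadow_response))
    return primary_response


def legacy_fee(request: dict) -> dict:
    # Hypothetical legacy behavior: 2% fee, rounded down to whole cents.
    return {"fee_cents": request["amount_cents"] * 2 // 100}


def new_fee(request: dict) -> dict:
    # Hypothetical new behavior: 2% fee, rounded half up.
    return {"fee_cents": (request["amount_cents"] * 2 + 50) // 100}


discrepancies: List[Discrepancy] = []
responses = [
    shadow_call({"amount_cents": amount}, legacy_fee, new_fee, discrepancies)
    for amount in (1000, 1075)
]
```

Here the caller always gets the legacy answer, and only the second request lands in `discrepancies`, because the two rounding rules disagree on it. For requests that write data, the shadow side would need to be isolated so that it cannot double-post anything.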

Alex’s comparison table is useful, but I want to add the strategic decision framework I use when choosing between Strangler and full rewrite. It’s not always Strangler.

The Decision Matrix I Use

| Factor | Favors Strangler | Favors Rewrite |
| --- | --- | --- |
| System size | Large (>500K LOC) | Small (<50K LOC) |
| Business continuity | Mission critical | Can tolerate downtime |
| Domain understanding | Still learning | Well understood |
| Team experience | Mixed/learning | Expert in target stack |
| Data coupling | Can be decoupled | Simple data model |
| Timeline pressure | Flexible | Hard deadline |

When I Choose Full Rewrite

I’ve greenlit full rewrites three times in my career. Each time shared these characteristics:

  1. Small, well-understood system: Under 50K lines of code with clear boundaries and well-documented behavior. The complexity of maintaining a façade exceeded the complexity of just rewriting.

  2. Clean data model: No shared database with other systems, no complex state machines, no regulatory data retention requirements.

  3. Experienced target-stack team: We had engineers who could write the new system faster than they could learn to strangle the old one.

  4. Business could pause: Either a non-revenue-critical system or a willing business sponsor who could tolerate feature freeze.

The Hybrid Approach

Most often, I use a hybrid: Strangle the core business logic, but rewrite the UI layer completely. User interfaces tend to be:

  • Well-isolated from backend services
  • Easier to rebuild than incrementally modify
  • More valuable to the business when users perceive them as ‘new and improved’

This gives you the risk mitigation of Strangler for the dangerous parts (data, transactions, business logic) with the speed benefits of rewrite for the cosmetic parts.

The Question I Ask

‘What’s the cost of being wrong?’

If a failed rewrite means the business dies, use Strangler. If a failed rewrite means we wasted 3 months and learned something, consider a rewrite.

Netflix, Amazon, Uber - they use Strangler because they literally cannot afford downtime. A 3-person startup with 500 users might be better served by a weekend rewrite.

This is fascinating from a design systems perspective! We actually use a variant of the Strangler Pattern for UI modernization that I don’t see discussed as often.

The Design System as Façade

When we modernize legacy applications, the design system itself becomes the strangler layer for the UI. Here’s how it works:

  1. Create the wrapper components: Build design system components that can render legacy UI inside them. The wrapper provides consistent styling, spacing, and interaction patterns.

  2. Strangle from the outside in: Start by replacing the chrome - navigation, layout, typography. Users see a ‘new’ app even though the core functionality is still legacy.

  3. Gradually replace interiors: One form at a time, one data table at a time, replace legacy UI components with design system components.

Why This Works for UX

Users don’t care about your backend architecture. They care about how the app looks and feels. By strangling the UI first:

  • You deliver visible progress early (stakeholder happiness!)
  • Users get improved accessibility immediately
  • You can gather feedback on new patterns before touching business logic
  • The ‘legacy’ feel disappears even before the migration is complete

The Gotcha: Inconsistency During Transition

The hardest part is managing user experience during the strangler phase. Half-old, half-new can be jarring. We mitigate this by:

  • Completing entire user flows before shipping (don’t show half-migrated screens)
  • Using feature flags to hide work in progress
  • Communicating changes to users (‘we’re updating this section next week’)
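The first two mitigations can be combined into one gate. This is only a sketch of the idea, with hypothetical names (`Flow`, `ui_version`) and a plain dict standing in for whatever feature-flag service you use: a flow is served from the new UI only when its flag is on and every screen in it has been migrated.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Flow:
    """A user flow and the subset of its screens already rebuilt."""
    name: str
    screens: Set[str]
    migrated: Set[str] = field(default_factory=set)

    def fully_migrated(self) -> bool:
        # True only when every screen in the flow has a new implementation.
        return self.screens <= self.migrated


def ui_version(flow: Flow, flags: Dict[str, bool]) -> str:
    """Return which UI to serve for the whole flow, never a mix of both."""
    if flags.get(flow.name, False) and flow.fully_migrated():
        return "new"
    return "legacy"


checkout = Flow(
    name="checkout",
    screens={"cart", "payment", "confirm"},
    migrated={"cart", "payment"},
)
flags = {"checkout": True}

before = ui_version(checkout, flags)  # one screen still missing
checkout.migrated.add("confirm")
after = ui_version(checkout, flags)   # whole flow is now ready
```

Deciding per flow rather than per screen is what keeps users from landing on a half-old, half-new journey, even if someone turns the flag on early.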

Michelle’s point about hybrid approaches resonates - we almost always strangle the UI separately from the backend. Different timelines, different teams, different risk profiles.

Has anyone else used design systems as the strangler layer? I’d love to compare notes!