The Strangler Fig Pattern
Replace legacy systems incrementally without the risk of a big bang rewrite. The same way a strangler fig slowly envelops a host tree, your new system gradually takes over -- one route, one feature, one module at a time.
Where the Name Comes From
In 2004, Martin Fowler described this pattern after observing strangler figs in the rainforests of Australia. These remarkable trees germinate in the canopy of a host tree, sending roots down to the ground. Over years, the fig's roots thicken and merge, gradually encasing the host trunk. Eventually the host tree dies and decomposes, leaving the strangler fig standing independently with a hollow core where the original tree once was.
Fowler saw a perfect analogy for legacy system replacement: instead of tearing out the old system all at once, you grow a new system around it. The old system continues to function while the new one gradually takes over its responsibilities. When the migration is complete, you remove the legacy system -- or simply let it wither away.
This approach has become one of the most widely adopted strategies for modernizing legacy applications, and for good reason. It dramatically reduces the risk that comes with replacing systems that businesses depend on every day.
- Lower risk than big bang rewrites, which are commonly reported to fail 60-80% of the time (figures vary by source)
- Continuous delivery: new features ship to production throughout the migration instead of waiting for a "big reveal" launch day
- Fast rollback: if a migrated component fails, route traffic back to the legacy system in seconds -- no full system rollback needed
How It Works: Step by Step
The strangler fig pattern follows five repeatable phases. Each cycle migrates one slice of functionality from the legacy system to the new system. Repeat until nothing is left.
Identify the Next Slice
Analysis Phase: Choose a bounded piece of functionality to migrate. The ideal candidate has clear inputs and outputs, minimal entanglement with other modules, and measurable behavior you can verify. Start with low-risk, high-value slices -- an API endpoint, a report generator, or a self-contained workflow.
Tip: Map your system's request flows first. Slices that handle distinct URL paths or message types are the easiest to isolate and redirect.
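One way to act on this tip is to tally traffic per path prefix from your access logs, so candidate slices can be compared by volume. The following is a minimal sketch, assuming log entries have already been parsed into objects with a `path` field (query strings stripped). The `slicePrefix` and `rankSlices` helpers and the sample paths are hypothetical.

```javascript
// Hypothetical helper: group a request path by its first `depth` segments,
// e.g. "/api/v2/orders/42" -> "/api/v2/orders".
function slicePrefix(path, depth = 3) {
  const segments = path.split('/').filter(Boolean).slice(0, depth);
  return '/' + segments.join('/');
}

// Count requests per prefix and return prefixes sorted by volume (descending).
function rankSlices(entries, depth = 3) {
  const counts = new Map();
  for (const { path } of entries) {
    const key = slicePrefix(path, depth);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([prefix, requests]) => ({ prefix, requests }))
    .sort((a, b) => b.requests - a.requests);
}

// Assumed sample data standing in for parsed access-log entries.
const sampleEntries = [
  { path: '/api/v2/orders/42' },
  { path: '/api/v2/orders/43' },
  { path: '/api/v2/inventory/7' },
];
console.log(rankSlices(sampleEntries));
```

Volume alone does not pick the slice for you. Combine it with what you know about coupling and risk before committing to a candidate.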
Build the Replacement
Implementation Phase: Implement the same functionality in the new system. The replacement must produce identical outputs for the same inputs -- at least initially. Write comprehensive tests that validate behavior parity. This is not the time to add features or change business logic; replicate first, improve later.
Tip: Use contract tests or characterization tests against the legacy system to capture its exact behavior, including edge cases and bugs that users depend on.
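A characterization test can be as simple as recording what the legacy code returns for a set of inputs and replaying those cases against the replacement. The sketch below assumes both implementations are synchronous functions of a single input. `legacyTotal` and `newTotal` are made-up stand-ins, and comparing via `JSON.stringify` is a simplification that only suits plain JSON-like results.

```javascript
// Run a function and capture either its return value or its error message.
// Thrown errors are captured too, since callers may depend on them.
function capture(fn, input) {
  try {
    return { ok: true, value: fn(input) };
  } catch (err) {
    return { ok: false, error: String(err.message) };
  }
}

// Record the legacy function's behavior for each input.
function recordBaseline(legacyFn, inputs) {
  return inputs.map((input) => ({ input, expected: capture(legacyFn, input) }));
}

// Replay the recorded cases against the replacement and list any mismatches.
function findMismatches(newFn, baseline) {
  return baseline
    .map(({ input, expected }) => ({ input, expected, actual: capture(newFn, input) }))
    .filter(({ expected, actual }) => JSON.stringify(expected) !== JSON.stringify(actual));
}

// Hypothetical legacy quirk: totals are truncated, not rounded.
const legacyTotal = ({ price, qty }) => Math.floor(price * qty);
const newTotal = ({ price, qty }) => Math.round(price * qty);

const baseline = recordBaseline(legacyTotal, [
  { price: 1.5, qty: 2 },
  { price: 0.99, qty: 3 },
]);
console.log(findMismatches(newTotal, baseline));
```

In this example the second case is reported as a mismatch because the replacement rounds where the legacy code truncates. That is exactly the kind of quirk you want to surface before cutover.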
Redirect Traffic
Cutover Phase: Place a routing layer (proxy, API gateway, load balancer, or feature flag) between callers and the backend. Start by sending a small percentage of traffic to the new implementation -- 1%, then 5%, then 25%, then 100%. This is where the "strangling" happens: you progressively shift traffic away from the legacy system.
Tip: Feature flags give you the most control. You can target specific users, geographic regions, or account tiers for gradual rollout.
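Percentage rollouts work best when each user lands on the same side every time. A common approach is to hash a stable identifier into a bucket from 0 to 99 and compare it to the current rollout percentage. This is a sketch of that idea, not how any particular feature flag product does it. The function names are hypothetical, and the hash is FNV-1a applied to UTF-16 code units for simplicity.

```javascript
// FNV-1a style 32-bit hash (non-cryptographic), used here only to spread
// user IDs roughly evenly across buckets.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Map a user to a stable bucket in [0, 100). Including the flag name in the
// key keeps rollouts for different slices independent of each other.
function bucketFor(flagName, userId) {
  return fnv1a(`${flagName}:${userId}`) % 100;
}

// A user goes to the new system when their bucket is below the rollout
// percentage. Raising the percentage only ever adds users.
function routesToNew(flagName, userId, rolloutPercent) {
  return bucketFor(flagName, userId) < rolloutPercent;
}

console.log(routesToNew('new-order-processing', 'user-123', 25));
```

Because the bucket is derived from the user ID rather than a random draw per request, a user who sees the new system at 5% still sees it at 25%, which keeps their experience consistent during the ramp.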
Verify and Monitor
Validation Phase: Compare the new system's behavior against the legacy system. Monitor error rates, response times, and business metrics. Run both systems in parallel if possible, comparing outputs for the same inputs (shadow mode). Only proceed to full cutover when metrics confirm parity or improvement.
Tip: Define "done" criteria before you start: error rate below X%, p99 latency under Y ms, zero data discrepancies for Z days.
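Writing the "done" criteria as code makes the cutover decision mechanical instead of a judgment call. Here is a minimal sketch, assuming you can collect request and error counts, a non-empty list of latency samples, and a count of consecutive days without data discrepancies. The threshold values and field names are placeholders, not recommendations.

```javascript
// Hypothetical thresholds standing in for the X, Y, and Z in your criteria.
const criteria = { maxErrorRate: 0.001, maxP99Ms: 250, cleanDaysRequired: 7 };

// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Returns the list of unmet criteria; an empty list means ready to proceed.
function unmetCriteria(stats, limits) {
  const failures = [];
  const errorRate = stats.errors / stats.requests;
  if (errorRate >= limits.maxErrorRate) failures.push('error rate');
  if (percentile(stats.latenciesMs, 99) >= limits.maxP99Ms) failures.push('p99 latency');
  if (stats.daysWithoutDiscrepancy < limits.cleanDaysRequired) failures.push('data discrepancies');
  return failures;
}

// Assumed sample stats: 100 latency samples from 1 ms to 100 ms.
const stats = {
  requests: 10000,
  errors: 5,
  latenciesMs: Array.from({ length: 100 }, (_, i) => i + 1),
  daysWithoutDiscrepancy: 7,
};
console.log(unmetCriteria(stats, criteria));
```

Returning the names of the failing criteria, rather than a single boolean, tells the team what to fix before the next cutover attempt.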
Decommission the Legacy Code
Cleanup Phase: Once 100% of traffic flows through the new system and metrics are stable, remove the legacy code path. Delete the old code, drop the old database tables (after a grace period), and remove the routing rules. This is the step teams most often skip -- do not let zombie code linger. Dead code is still debt.
Tip: Schedule a "decommission sprint" for each migrated slice. If you do not plan it, it will not happen.
Key insight: Each cycle through these five steps should take 2-6 weeks for a single slice. If a slice is taking longer, it is too big -- break it down further. The power of the pattern comes from small, frequent iterations, not one massive migration effort.
When to Use It (and When Not To)
The strangler fig pattern is not always the right tool. Here is how it compares to alternatives, and when each approach makes sense.
Use Strangler Fig When
- The legacy system handles critical business operations that cannot go offline
- You need to deliver value incrementally rather than waiting months for a "big reveal"
- The system has well-defined entry points (HTTP endpoints, message queues, batch jobs)
- Multiple teams need to contribute to the migration in parallel
- You have been burned by a failed rewrite before and need a lower-risk approach
- The business cannot tolerate extended feature freezes during migration
Consider Alternatives When
- The system is small enough to rewrite in 2-4 weeks (just rewrite it)
- The legacy system has no clear routing layer or entry points to intercept
- The data model is so tightly coupled that you cannot migrate one table at a time
- The old and new systems need fundamentally different data models with no clean mapping
- You are replacing a COTS product rather than custom code (vendor migration is different)
- The legacy system will be decommissioned entirely (sunset, not replace)
Pattern Comparison
| Approach | Risk Level | Time to First Value | Rollback Difficulty | Best For |
|---|---|---|---|---|
| Strangler Fig | Low | 2-6 weeks | Easy | Large systems with clear boundaries |
| Big Bang Rewrite | Very High | 6-24 months | Very Hard | Tiny systems only |
| Branch by Abstraction | Medium | 4-8 weeks | Medium | Internal component replacement |
| Parallel Run | Low | 8-16 weeks | Easy | Financial/critical data systems |
Implementation Guide with Code Examples
Three practical implementation strategies, each with concrete code you can adapt. Choose based on your architecture and routing capabilities.
Strategy 1: API Gateway Routing
Best for HTTP-based systems where you can place a proxy in front of the legacy application.
```nginx
# nginx.conf - Route by URL path
upstream legacy_app {
    server legacy-app.internal:8080;
}

upstream new_app {
    server new-app.internal:3000;
}

server {
    listen 80;

    # Migrated endpoints go to the new system
    location /api/v2/orders {
        proxy_pass http://new_app;
    }

    location /api/v2/inventory {
        proxy_pass http://new_app;
    }

    # Everything else still hits legacy
    location / {
        proxy_pass http://legacy_app;
    }
}
```

How it works: Every migrated endpoint gets a new location block pointing to the new system. The catch-all location / sends everything else to legacy. As you migrate more endpoints, the legacy block handles less and less traffic until it serves nothing.
Strategy 2: Feature Flag Routing
Best when legacy and new code live in the same codebase, or when you need per-user or per-tenant rollout control.
```javascript
// orderService.js - Feature flag controlled routing
async function processOrder(order, user) {
  const useNewSystem = await featureFlags.isEnabled(
    'new-order-processing',
    { userId: user.id, tier: user.accountTier }
  );

  if (useNewSystem) {
    // New implementation
    return await newOrderService.process(order);
  } else {
    // Legacy implementation
    return await legacyOrderService.process(order);
  }
}

// Gradual rollout configuration:
// Week 1: Enable for internal users only
// Week 2: Enable for 5% of free-tier users
// Week 3: Enable for 25% of all users
// Week 4: Enable for 100% of all users
// Week 6: Remove flag and delete legacy code path
```

How it works: The feature flag service decides at runtime which code path to use. You can target by user, account tier, geography, or any attribute. This gives you fine-grained rollout control and instant rollback -- just flip the flag back.
Strategy 3: Event-Driven Gradual Migration
Best for systems that communicate via message queues or event buses. The new system subscribes to events alongside the legacy system.
```javascript
// Phase 1: New system subscribes and logs (shadow mode)
// Legacy consumer continues processing all messages
legacyConsumer.subscribe('order.created', (event) => {
  legacyOrderProcessor.handle(event); // Business logic runs here
});

// New consumer runs in parallel but only logs
newConsumer.subscribe('order.created', (event) => {
  const result = newOrderProcessor.handle(event);
  metrics.compare(event.id, 'legacy', 'new', result);
  // Does NOT commit -- read-only shadow mode
});
```

```javascript
// Phase 2: New system processes, legacy becomes shadow
// Swap roles: new system is primary, legacy is shadow
newConsumer.subscribe('order.created', (event) => {
  newOrderProcessor.handle(event); // Business logic runs here
});

legacyConsumer.subscribe('order.created', (event) => {
  // Legacy now in read-only mode for comparison
  metrics.trackLegacyDrift(event.id);
});
```

How it works: Both systems subscribe to the same events. In Phase 1, the new system runs in shadow mode -- processing events but not committing results. You compare outputs to verify parity. In Phase 2, you swap which system is authoritative. In Phase 3, you remove the legacy consumer entirely.
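The comparison step in shadow mode needs somewhere to pair up the two results for each event, since the consumers finish independently. The sketch below is one possible shape for that, assuming every event carries a unique ID and results are plain JSON-like objects. `ShadowComparator` is a hypothetical name, not part of any messaging library, and comparing serialized JSON is sensitive to key order.

```javascript
// Collects results from both consumers, keyed by event ID, and records a
// mismatch once both sides have reported for the same event.
class ShadowComparator {
  constructor() {
    this.pending = new Map(); // eventId -> results seen so far
    this.matches = 0;
    this.mismatches = [];
  }

  // `source` is expected to be either 'legacy' or 'new'.
  report(eventId, source, result) {
    const entry = this.pending.get(eventId) || {};
    entry[source] = JSON.stringify(result);
    if ('legacy' in entry && 'new' in entry) {
      this.pending.delete(eventId);
      if (entry.legacy === entry.new) {
        this.matches += 1;
      } else {
        this.mismatches.push({ eventId, legacy: entry.legacy, new: entry.new });
      }
    } else {
      this.pending.set(eventId, entry);
    }
  }
}

const comparator = new ShadowComparator();
comparator.report('evt-1', 'legacy', { total: 100 });
comparator.report('evt-1', 'new', { total: 100 });
comparator.report('evt-2', 'new', { total: 90 });
comparator.report('evt-2', 'legacy', { total: 95 });
console.log(comparator.matches, comparator.mismatches);
```

Entries that stay in `pending` for a long time are worth alerting on as well, because they indicate that one of the two consumers dropped or never received the event.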
Common Pitfalls and How to Avoid Them
The strangler fig pattern is conceptually simple but operationally tricky. These are the mistakes teams make most often.
Never Finishing the Migration
The most common failure. Teams migrate the easy 80% and then lose momentum. The remaining 20% runs on legacy forever, and now you maintain two systems instead of one.
Fix: Set a hard deadline for full decommission. Track migration progress as a team KPI. Budget time specifically for the "last mile" cleanup.
Slicing Too Large
Trying to migrate an entire module at once instead of individual endpoints or workflows. Large slices take too long, carry too much risk, and negate the incremental benefits of the pattern.
Fix: If a slice takes more than one sprint to build, it is too big. Decompose by HTTP route, message type, or individual workflow step.
Shared Database Coupling
Both the old and new systems read from and write to the same database tables. Schema changes in one system break the other, and you lose the ability to deploy independently.
Fix: Use database views or an anti-corruption layer to decouple. Migrate data ownership one table at a time, not all at once.
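An anti-corruption layer can start as a pair of mapping functions that keep legacy column names and encodings out of the new code. The following is an illustrative sketch only. The legacy column names (`ORD_ID`, `PAID_FLG`, and so on) and the shape of the new order object are invented for the example.

```javascript
// Translate a legacy row (terse column names, 'Y'/'N' flags, numeric IDs)
// into the model the new system works with.
function toOrder(legacyRow) {
  return {
    id: String(legacyRow.ORD_ID),
    customerId: String(legacyRow.CUST_NO),
    totalCents: legacyRow.TOT_AMT_CENTS,
    isPaid: legacyRow.PAID_FLG === 'Y',
    createdAt: new Date(legacyRow.CRT_DT).toISOString(),
  };
}

// The reverse mapping, for writes that must still land in legacy tables.
function toLegacyRow(order) {
  return {
    ORD_ID: Number(order.id),
    CUST_NO: Number(order.customerId),
    TOT_AMT_CENTS: order.totalCents,
    PAID_FLG: order.isPaid ? 'Y' : 'N',
    CRT_DT: order.createdAt,
  };
}

// Hypothetical legacy row.
const row = {
  ORD_ID: 1001,
  CUST_NO: 77,
  TOT_AMT_CENTS: 2599,
  PAID_FLG: 'Y',
  CRT_DT: '2024-01-15T10:00:00.000Z',
};
console.log(toOrder(row));
```

Keeping both directions in one module means that when a table's ownership moves to the new system, there is a single place to delete.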
Adding Features During Migration
"While we are rewriting this, let's also add sorting and filtering." Scope creep turns a 3-week migration slice into a 3-month project. New features make it impossible to verify behavioral parity with the legacy system.
Fix: Replicate first, improve later. Put feature requests in a separate backlog. Add them after the slice is fully migrated and the legacy code is decommissioned.
No Observability on Both Systems
Teams instrument the new system thoroughly but neglect monitoring on the legacy side. Without baseline metrics from the old system, you cannot prove the new system is performing at parity.
Fix: Add logging and metrics to the legacy system before you start migrating. You need a baseline to compare against. This is not wasted effort -- it informs every cutover decision.
Ignoring Data Migration
Focusing on code migration while neglecting the database. The new code is clean and modern, but it still reads from 15-year-old denormalized tables because nobody planned the data migration.
Fix: Include data migration in each slice's definition of done. Use change data capture (CDC) or dual-write patterns to keep data in sync during transition.
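The dual-write pattern mentioned here can be sketched as a thin wrapper around two stores. This version assumes the legacy store is still authoritative, so its write must succeed, while a failed write to the new store is recorded for later repair rather than failing the request. Both stores are hypothetical objects exposing an async `save(record)` method, and `memoryStore` is an in-memory stand-in for a real database.

```javascript
// Dual-write wrapper: write to legacy first (authoritative), then to the new
// store. Failures on the new side are collected instead of thrown.
function createDualWriter(legacyStore, newStore, failedWrites = []) {
  return {
    failedWrites,
    async save(record) {
      await legacyStore.save(record); // errors here propagate to the caller
      try {
        await newStore.save(record);
      } catch (err) {
        failedWrites.push({ id: record.id, reason: err.message });
      }
    },
  };
}

// In-memory stand-in for a database; `failOn` simulates a rejected write.
function memoryStore({ failOn } = {}) {
  const rows = new Map();
  return {
    rows,
    async save(record) {
      if (record.id === failOn) throw new Error('write rejected');
      rows.set(record.id, record);
    },
  };
}

const writer = createDualWriter(memoryStore(), memoryStore({ failOn: 2 }));
writer.save({ id: 1 })
  .then(() => writer.save({ id: 2 }))
  .then(() => console.log(writer.failedWrites));
```

Two sequential writes are not atomic, so the stores can drift apart. The `failedWrites` list is a placeholder for whatever reconciliation job or CDC process you use to bring them back in sync.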
Real-World Application Scenarios
The strangler fig pattern adapts to many migration scenarios. Here are three of the most common, with specific guidance for each.
Monolith to Microservices
The Classic Use Case: Place an API gateway in front of the monolith. Extract one bounded context at a time into its own service. The gateway routes requests to the new service for migrated functionality and to the monolith for everything else.
Start With
Authentication, notifications, or reporting -- services with clear boundaries and few cross-dependencies
Key Challenge
Shared database access. Use database views or a data access layer to decouple during transition
Typical Timeline
12-24 months for a mid-size monolith (50-200K lines of code). First service extracted in 4-8 weeks
Legacy API Replacement
Version Migration: Build v2 API endpoints alongside v1. External consumers migrate at their own pace while you strangle the v1 endpoints. This approach is especially effective when you have dozens of API consumers who cannot all migrate simultaneously.
Start With
High-traffic endpoints first -- they offer the biggest bang for the migration effort
Key Challenge
Keeping v1 and v2 data in sync during the transition period. Use dual-write or CDC patterns
Typical Timeline
6-12 months. Set a deprecation date for v1 and communicate it early to all consumers
Database Migration
Schema Evolution: Migrate one table (or group of related tables) at a time to a new schema or a new database engine. Use change data capture to keep old and new databases in sync during the transition. Application code reads from the new database; writes go to both until you are confident the new database is authoritative.
Start With
Read-heavy tables with simple schemas. Avoid tables with complex cross-references or triggers
Key Challenge
Foreign key relationships that span old and new databases. Use application-level joins during transition
Typical Timeline
12-18 months for a major database migration. Each table takes 2-4 weeks including dual-write verification
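An application-level join, as suggested under Key Challenge above, means fetching rows from each database separately and combining them in code. This is a minimal sketch, assuming orders already live in the new database and customers are still in the legacy one. Plain arrays stand in for the two query results, and the field names are hypothetical.

```javascript
// Join orders (new database) with customers (legacy database) in memory.
function joinOrdersWithCustomers(orders, customers) {
  // Index customers by ID once so each lookup is constant time.
  const customersById = new Map(customers.map((c) => [c.id, c]));
  return orders.map((order) => ({
    ...order,
    // null when the referenced customer is missing: with no foreign key
    // constraint across databases, the application must handle orphans.
    customer: customersById.get(order.customerId) ?? null,
  }));
}

// Assumed sample rows from each database.
const orders = [
  { id: 'o1', customerId: 'c1', totalCents: 2599 },
  { id: 'o2', customerId: 'c9', totalCents: 500 },
];
const customers = [{ id: 'c1', name: 'Ada' }];
console.log(joinOrdersWithCustomers(orders, customers));
```

In practice you would fetch only the customers referenced by the current page of orders, rather than loading the whole table, to keep the join cheap.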
Timeline Planning: The 6-18 Month Roadmap
Most strangler fig migrations follow a predictable arc. Here is a realistic timeline with milestones, based on patterns observed across dozens of successful migrations.
Foundation
- Map legacy system boundaries, request flows, and data dependencies
- Set up the routing layer (API gateway, reverse proxy, or feature flag service)
- Add observability to the legacy system (metrics, logging, tracing)
- Migrate the first "hello world" slice to prove the pattern works end-to-end
Momentum
- Migrate 3-5 slices per month, increasing team velocity as the pattern becomes routine
- Decommission legacy code for completed slices (do not let this slide)
- Track percentage of traffic handled by new vs legacy systems
- Aim for 60-70% of traffic on the new system by end of this phase
The Long Tail
- Tackle the hardest, most coupled slices (the ones you deferred earlier)
- Migrate shared data models and cross-cutting concerns
- Address the "last 20%" that holds 80% of the complexity
- Resist the urge to declare victory early -- a migration that stalls here leaves you maintaining two systems indefinitely
Decommission and Cleanup
- Remove the legacy application, routing rules, and infrastructure
- Archive legacy database backups and drop old schemas
- Update documentation, runbooks, and onboarding materials
- Run a retrospective on what worked, what did not, and lessons for next time
Reality check: Smaller systems (under 50K lines of code) can finish in 6 months. Large enterprise systems (500K+ lines, multiple teams, external integrations) often take 18-24 months. The timeline depends on team size, system complexity, and how much legacy behavior is undocumented. Plan conservatively and celebrate each slice as a win.
Frequently Asked Questions
How is the strangler fig pattern different from a big bang rewrite?
A big bang rewrite replaces the entire system at once -- you build the new version in parallel and switch over on a single launch day. The strangler fig replaces the system incrementally, one slice at a time. The key difference is risk: a big bang rewrite is all-or-nothing (if it fails, you roll back everything), while the strangler fig lets you roll back individual slices without affecting the rest of the system. Big bang rewrites also typically require feature freezes on the legacy system during development, which the strangler fig avoids.
Does the strangler fig pattern work when the old and new systems share a database?
Yes, but the database is usually the hardest part. Start by having the new system read from the legacy database through views or a data access layer. For writes, use a dual-write pattern (write to both old and new) or change data capture (CDC) to replicate changes. Migrate data ownership one table or aggregate at a time. The anti-corruption layer pattern is essential here -- it translates between the old data model and the new one, keeping both sides clean.
How large a team do you need for a strangler fig migration?
A single team of 4-6 engineers can run a strangler fig migration on a mid-size system. For larger systems, you can parallelize by having multiple teams each own different slices. The routing layer is the coordination point -- teams agree on URL paths, message types, or feature flag names, and each team migrates its slices independently. You do not need a dedicated "migration team" -- existing feature teams can migrate their own domains as part of normal sprint work.
How do you convince leadership to choose the strangler fig over a rewrite?
Frame it in terms of risk and continuous value delivery. Big bang rewrites are commonly reported to fail 60-80% of the time, and they deliver zero value until launch day. The strangler fig delivers improvements to production every few weeks and can be paused or stopped at any point without losing the work already completed. For executives, the key selling point is: "We start seeing results in month one, not month twelve." Point to industry case studies where rewrites failed -- the most famous being Netscape's browser rewrite, which nearly killed the company.
What happens if priorities change partway through the migration?
This is one of the biggest advantages of the strangler fig pattern. Because each slice is independently valuable, you can pause the migration at any point. The slices you have already migrated continue running on the new system. The remaining slices continue running on legacy. You have not wasted any work -- unlike a half-finished rewrite that delivers nothing. When priorities shift back, you pick up where you left off. This "pausability" makes the strangler fig uniquely resilient to organizational changes.
How do you measure migration progress?
Track three metrics: (1) Percentage of traffic routed to the new system versus legacy -- this is your primary progress indicator. (2) Number of legacy endpoints or modules remaining -- gives a count-based view. (3) Lines of legacy code deleted -- satisfying to watch go down. Visualize these on a dashboard and share with stakeholders monthly. Avoid using "estimated percentage complete" -- it is too subjective. Traffic percentage is measured directly from the routing layer, which makes it much harder to game.
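The first two metrics are simple to compute from counters you probably already have. Here is a small sketch, assuming the routing layer exposes request counts per backend and you maintain an inventory of endpoints. All field names are hypothetical.

```javascript
// Compute the traffic share on the new system (one decimal place) and the
// number of endpoints still served by legacy.
function migrationProgress({ newRequests, legacyRequests, endpointsTotal, endpointsMigrated }) {
  const totalRequests = newRequests + legacyRequests;
  const share = totalRequests === 0 ? 0 : newRequests / totalRequests;
  return {
    trafficOnNewPct: Math.round(share * 1000) / 10,
    endpointsRemaining: endpointsTotal - endpointsMigrated,
  };
}

console.log(migrationProgress({
  newRequests: 650,
  legacyRequests: 350,
  endpointsTotal: 40,
  endpointsMigrated: 26,
}));
```

With the sample numbers above, 65% of traffic is on the new system and 14 endpoints remain on legacy.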
Ready to Start Strangling Your Legacy System?
The strangler fig pattern works because it turns a terrifying all-or-nothing rewrite into a series of small, safe, reversible steps. Pick your first slice and start today.