I’m writing this because I wish someone had been this honest with me 18 months ago when we started down the DIY Backstage path.
The headline: What we thought would be “free” ended up costing us over ,000 in engineering time, delayed our platform roadmap by a year, and still resulted in only ~12% daily active usage among our developers.
We’re moving to a managed solution. Here’s the full story.
Context: EdTech Startup, Rapid Growth
When we started evaluating IDPs in late 2024, we were a 25-person engineering team scaling fast. We knew we needed better service ownership, clearer dependencies, and faster onboarding as we planned to double the team.
Backstage was the obvious choice - 89% market share, CNCF project, rich plugin ecosystem. The question was: self-host or managed?
We chose self-host. That decision, in hindsight, was wrong for us.
The “Free” Math
Here’s how we justified DIY Backstage:
Managed solutions cost: -100/engineer/month
Our projected team size: 80 engineers
Annual cost at scale: ,000 - ,000/year
DIY Backstage cost: “Free” (just implementation time)
Seemed like an easy decision. Invest a few months upfront, save -100k annually forever.
The Real Math
Here’s what actually happened:
Phase 1 - Planning (6 months):
- 1 senior platform engineer (50% time)
- Architecture design, plugin evaluation, infrastructure planning
- Cost: ~,000
Phase 2 - Implementation (12 months):
- 2 platform engineers (100% time)
- Core setup, plugin integration, custom development, migration
- Cost: ~,000
Ongoing - Maintenance (current):
- 1.5 platform engineers (ongoing)
- Plugin updates, bug fixes, version upgrades, support
- Annual cost: ~,000
Total 18-month cost: ,000+
And we’re STILL not feature-complete compared to Spotify Portal or Roadie.
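For anyone who wants to run the same exercise on their own numbers, here is a minimal sketch of the staffing arithmetic behind the phases above. The headcounts, allocations, and durations come from this post; the loaded monthly cost per engineer is a purely hypothetical placeholder (not our actual figure), and the names `Phase`, `build_effort_engineer_months`, and `maintenance_engineer_months_per_year` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    engineers: float   # number of people assigned
    allocation: float  # fraction of their time (0.5 means 50%)
    months: int

    @property
    def engineer_months(self) -> float:
        return self.engineers * self.allocation * self.months


# Staffing figures as described in the post.
PHASES = [
    Phase("planning", engineers=1, allocation=0.5, months=6),
    Phase("implementation", engineers=2, allocation=1.0, months=12),
]
ONGOING_MAINTENANCE_FTE = 1.5


def build_effort_engineer_months(phases) -> float:
    """Total one-time build effort across all phases."""
    return sum(p.engineer_months for p in phases)


def maintenance_engineer_months_per_year(fte: float) -> float:
    """Recurring effort once the portal is live."""
    return fte * 12


# ASSUMPTION: placeholder loaded cost per engineer-month; substitute your own.
HYPOTHETICAL_LOADED_MONTHLY_COST = 15_000

build = build_effort_engineer_months(PHASES)
upkeep = maintenance_engineer_months_per_year(ONGOING_MAINTENANCE_FTE)
print(f"Build effort: {build} engineer-months")
print(f"Maintenance: {upkeep} engineer-months per year")
print(f"Build cost at placeholder rate: {build * HYPOTHETICAL_LOADED_MONTHLY_COST:,.0f}")
```

Counting in engineer-months first and pricing second keeps the effort estimate separate from salary assumptions, which makes it easier to sanity-check with finance later.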
The Hidden Costs Nobody Warned Us About
Beyond the direct engineering costs, here’s what blindsided us:
Plugin Maintenance Hell
We started with 12 plugins. Each one needs:
- Version compatibility testing before Backstage upgrades
- Monitoring for security vulnerabilities
- Bug fixes when things break
- Custom modifications to fit our needs
- Documentation for our team
Three plugins were abandoned by their maintainers. We had to fork and maintain them ourselves.
Two plugins had conflicting dependencies. Solved with workarounds that broke on every upgrade.
Plugin debt is real and expensive.
The Version Upgrade Treadmill
Backstage releases frequently. That’s great for the ecosystem. Terrible for maintenance.
Every upgrade requires:
- Testing all plugins for compatibility
- Reviewing breaking changes
- Updating custom code
- Regression testing
- Coordinating a deployment window
We’re currently 4 versions behind because we don’t have capacity to upgrade. Which means we can’t use newer plugins. Which means we’re missing features that managed solutions have.
We’re paying to stay behind.
Custom Integration Burden
Every internal tool needs integration:
- GitHub (relatively easy)
- Jenkins (plugin exists but needed customization)
- Our internal deployment system (built from scratch)
- Our service mesh (built from scratch)
- Cost tracking system (built from scratch)
- Security scanning (built from scratch)
Each custom integration is code we own and maintain forever. Each one adds to our technical debt.
The Adoption Challenge
Even after all this investment, our actual usage is disappointing.
Who uses it:
- New engineers (first 2 weeks of onboarding)
- Tech leads (service catalog during architecture review)
- On-call engineers (finding ownership during incidents)
Who doesn’t:
- Most engineers in daily work
- Anyone who can solve their problem with grep + Slack
- Developers who found workarounds
Daily active users: ~10 out of 82 engineers (12%)
This isn’t because the implementation is bad. It’s because we built infrastructure before we built the culture and processes that make it valuable.
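The adoption figure above is just daily active users divided by total engineers. A tiny sketch of that calculation follows, in case you want to track it over time; the function name `daily_active_rate` is hypothetical, and how you count a "daily active user" (portal logins, page views, API calls) is an assumption you would need to pin down for your own setup.

```python
def daily_active_rate(daily_active_users: int, total_engineers: int) -> float:
    """Fraction of engineers who used the portal on a given day."""
    if total_engineers <= 0:
        raise ValueError("total_engineers must be positive")
    return daily_active_users / total_engineers


# Figures from the post: roughly 10 daily users out of 82 engineers.
rate = daily_active_rate(10, 82)
print(f"{rate:.0%}")  # prints "12%"
```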
The Opportunity Cost
Here’s what really kills me: What could we have built instead?
With k and 18 months of platform engineering time, we could have:
- Built golden path templates for all our service types
- Created self-service infrastructure provisioning
- Developed comprehensive local development tooling
- Implemented automated testing and deployment pipelines
- Built internal tools that developers actually asked for
Instead, we built a portal that most developers don’t use.
Why We’re Moving to Managed
We’ve decided to move to Spotify Portal (or possibly Roadie - still evaluating). Here’s the math that changed our minds:
Managed solution cost: /engineer/month × 80 engineers = ,000/year
Current DIY cost: ,000/year (and growing as team scales)
Savings: ,000/year
Plus:
- No more version upgrade burden
- Professional support when things break
- Features we don’t have to build
- Security patches we don’t have to apply
- Onboarding guides we don’t have to write
- Platform engineers freed up for high-value work
The managed solution is literally cheaper AND better than DIY.
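The comparison above can be sketched as a small model you can plug your own numbers into. The 80-engineer headcount and 1.5 maintenance FTE come from this post; the seat price and the loaded annual cost per engineer below are hypothetical placeholders, not our real figures, and the function names are invented for illustration. It also ignores anything a managed plan does not cover, such as custom integrations you would still maintain.

```python
def managed_annual_cost(price_per_engineer_month: float, engineers: int) -> float:
    """Yearly vendor bill for a per-seat managed portal."""
    return price_per_engineer_month * engineers * 12


def diy_annual_cost(maintenance_fte: float, loaded_annual_cost_per_engineer: float) -> float:
    """Yearly maintenance cost of a self-hosted portal (staff time only)."""
    return maintenance_fte * loaded_annual_cost_per_engineer


def break_even_seat_price(
    maintenance_fte: float,
    loaded_annual_cost_per_engineer: float,
    engineers: int,
) -> float:
    """Monthly seat price at which managed and DIY cost the same per year."""
    diy = diy_annual_cost(maintenance_fte, loaded_annual_cost_per_engineer)
    return diy / (engineers * 12)


# From the post: 80 engineers, 1.5 FTE of ongoing maintenance.
ENGINEERS = 80
MAINTENANCE_FTE = 1.5

# ASSUMPTIONS: placeholder values, replace with your own quotes and salaries.
HYPOTHETICAL_SEAT_PRICE = 50.0
HYPOTHETICAL_LOADED_ANNUAL_COST = 180_000.0

managed = managed_annual_cost(HYPOTHETICAL_SEAT_PRICE, ENGINEERS)
diy = diy_annual_cost(MAINTENANCE_FTE, HYPOTHETICAL_LOADED_ANNUAL_COST)
threshold = break_even_seat_price(MAINTENANCE_FTE, HYPOTHETICAL_LOADED_ANNUAL_COST, ENGINEERS)

print(f"Managed: {managed:,.0f}/year")
print(f"DIY maintenance: {diy:,.0f}/year")
print(f"Break-even seat price: {threshold:,.2f}/engineer/month")
```

The break-even price is the useful number when negotiating: any quote below it beats your current maintenance spend before you even count the freed-up engineering time.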
What We Learned
1. “Free” Is a Lie
Open source isn’t free. It’s a different cost model. Instead of paying a vendor, you pay:
- Implementation costs
- Maintenance costs
- Opportunity costs
- Operational costs
For some orgs, that math works out cheaper. For us, it didn’t.
2. Commodity Infrastructure Isn’t Competitive Advantage
Running our own Backstage instance doesn’t differentiate our EdTech platform. Our teaching tools differentiate us.
Platform engineering should enable differentiation, not consume resources.
Great platform teams focus on:
- Golden paths specific to their domain
- Integrations with their specific toolchain
- Custom workflows for their developers
- Unique organizational needs
They don’t focus on:
- Keeping web frameworks up to date
- Managing plugin dependencies
- Running infrastructure for their infrastructure
3. Build vs Buy Requires Honest Assessment
The build vs buy decision should consider:
Build makes sense when:
- You have unique requirements
- You have strong platform team capability
- Maintenance cost is acceptable
- Build time is acceptable
- You can match vendor feature velocity
Buy makes sense when:
- Solution is commodity
- Vendor has economies of scale
- Your team is small or stretched
- Time to value matters
- You want to focus engineering on differentiation
We convinced ourselves we met the “build” criteria. We didn’t.
4. Platform Team Size Matters
If you have a 10+ person platform team, DIY Backstage might be manageable. If you have 2-3 people trying to support 80+ engineers, managed solutions make way more sense.
We were trying to do everything: CI/CD, cloud infrastructure, developer tooling, AND maintain Backstage. Something had to give.
5. Organizational Maturity Gates Technology Choices
Michelle’s recent post about platform engineering maturity really resonated. We jumped to a Stage 4 solution at Stage 2 maturity.
We didn’t have:
- Clear service ownership model
- Established golden paths
- Strong platform team credibility
- Developer trust and adoption
Technology can’t fix organizational immaturity. We learned this the expensive way.
What I’d Do Differently
If I could restart 18 months ago:
- Start with managed solution: Get portal up fast, learn what’s valuable
- Focus on golden paths: Invest in making developers productive
- Build organizational maturity: Service ownership, standards, processes
- Measure ruthlessly: Track developer productivity, not portal features
- Reassess build vs buy yearly: Maybe DIY makes sense at 200 engineers
The Uncomfortable Truth
The real reason we chose DIY wasn’t the cost math. It was ego.
We’re a tech company. We’re engineers. We wanted to prove we could build it ourselves. Open source means we should self-host, right?
That’s not strategy. That’s pride.
The hardest part of this decision wasn’t the technical analysis. It was admitting we made the wrong call.
Questions for This Community
For those who’ve been through similar evaluations:
- How did you calculate TCO honestly? What costs did we miss?
- At what team size does DIY Backstage become cost-effective?
- For those using managed solutions - what’s your experience been?
- For those successfully running DIY - what’s your team size and structure?
- How do you measure IDP ROI beyond “portal exists”?
I know there are orgs where DIY Backstage is the right call. I want to understand what makes it work for them, so I can better understand where we went wrong.
Moving Forward
We’re not abandoning platform engineering. We’re doubling down on it.
But we’re doing it smarter:
- Managed portal so platform engineers focus on golden paths
- Clear metrics tied to developer productivity
- Quarterly assessment of build vs buy for each component
- Honesty about what differentiates us vs what’s commodity
Platform engineering is a journey, not a destination. And sometimes the journey means admitting when you took a wrong turn.
I’m sharing this because I suspect we’re not alone. The 89% market share means lots of orgs are implementing Backstage. The 10% adoption rate suggests many are struggling like we did.
If this helps even one platform team make a better-informed decision, it was worth the vulnerability of sharing our failure.