Self-Hosting Backstage Takes 12-18 Months. Buying Costs $200K/Year. What's the Break-Even Calculation?

I’m at a crossroads with our platform investment decision, and I need to reality-check my thinking with folks who’ve been through this.

Here’s the scenario: We’re a Series B fintech startup with about 150 engineers today, expected to hit 300 within 18 months. Our internal developer portal situation is… let’s call it “artisanal.” Lots of Confluence docs, Slack channels, and institutional knowledge trapped in senior engineers’ heads.

We need a real platform. Leadership agrees. The debate is how.

The Build Case (Self-Hosted Backstage)

Our platform team put together a detailed proposal:

  • Timeline: 12-18 months to production-ready
  • Team: 7 FTEs initially (1 PM, 2 frontend engineers, 2 full-stack, 2 DevOps)
  • Upfront investment: ~$1,050,000 (year one fully loaded costs)
  • Ongoing costs: ~$1,100,000/year (4 FTEs ongoing development, 2 FTEs maintenance, $50K infrastructure)
  • 3-year total: $3.25M

Their argument: Complete control, no vendor lock-in, unlimited customization, aligns with our “build our own tools” culture.

The Buy Case (Managed Platform)

Finance ran numbers on managed solutions (Roadie, Humanitec, Port):

  • For 300 developers: ~$72,000-$80,000/year (based on Roadie pricing at $20-22/dev/month)
  • Timeline: 1-2 months to production
  • Team: 2-3 engineers for integrations and golden paths
  • 3-year total: ~$240,000 + engineering capacity for custom work

Their argument: Immediate value, predictable costs, frees engineering capacity for differentiated work, lower risk.

The Math Seems Obvious But…

On paper, buying saves us $3M over three years. That’s not a rounding error—that’s 15+ engineering headcount or a year of runway.
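For anyone who wants to check that arithmetic, here is a minimal sketch using only the figures from the two proposals above. The constant and function names are illustrative. The buy side assumes the upper-end $80K/year estimate and, like the finance number, leaves out the 2-3 engineers for integrations.

```python
# Three-year totals from the two proposals (USD).
# Names are illustrative. Assumption: the buy side uses the upper-end
# $80K/year estimate and excludes the 2-3 integration engineers.

BUILD_YEAR_ONE = 1_050_000   # upfront, fully loaded
BUILD_ONGOING = 1_100_000    # per year after year one
BUY_PER_YEAR = 80_000        # ~300 devs at $20-22/dev/month


def build_total(years: int) -> int:
    """Year-one cost plus the ongoing cost for each later year."""
    return BUILD_YEAR_ONE + BUILD_ONGOING * (years - 1)


def buy_total(years: int) -> int:
    """Flat annual subscription, license cost only."""
    return BUY_PER_YEAR * years


if __name__ == "__main__":
    build, buy = build_total(3), buy_total(3)
    print(f"build: ${build:,}  buy: ${buy:,}  delta: ${build - buy:,}")
```

Once the integration engineers are costed in, the delta shrinks, so treat the ~$3M as an upper bound on savings.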

But the engineering team pushes back hard:

  • “We’ll be locked into a vendor”
  • “We have unique security requirements” (do we though?)
  • “We need customization flexibility” (for what specifically?)
  • “Managed platforms are black boxes” (are they?)

I get it. I was a PM at Google and Airbnb—we built everything. But we also had infinite resources.

What I’m Struggling With

  1. Hidden costs: What am I missing in both scenarios? The maintenance burden for self-hosted feels hand-wavy. The “lack of customization” concern for managed feels theoretical.

  2. Real vs perceived constraints: How do I separate genuine technical requirements from engineering preference?

  3. Decision framework: Is there a structured way to evaluate this beyond just cost comparison? How do other product leaders think about build vs buy for platform investments?

  4. Hybrid approaches: Is there a middle ground that gives us both control and speed?

The Real Question

For those who’ve made this decision—what did your break-even analysis actually look like? What factors ended up mattering most? And most importantly, what did you learn that you wish you’d known when you were in my position?

The way I see it, we can either spend $3.25M building a platform or spend $240K buying one and invest the ~$3M delta in features that drive revenue. But I’m a product person, not a platform engineer, so I’m probably missing something critical.

What am I missing?


Cross-posted from internal strategy doc. Numbers slightly adjusted for confidentiality but ratios are accurate.

Your conclusion is right, and the math should be obvious. If anything, your build numbers are too kind. We went through this exact analysis 18 months ago.

Let me tell you what happened.

We Chose Build (And I’d Choose Differently Today)

Initial math favored buying, but we chose to build on Backstage anyway. Here’s why:

  1. Security/compliance requirements: Financial services, need tight control
  2. Long-term strategic control: Platform is core to our technical strategy
  3. Customization needs: Unique workflows that “wouldn’t fit” managed solutions
  4. Engineering culture: “We build, we don’t rent” mentality

Sounds rational, right?

The Reality Check

Timeline: We said 18 months. Actually took 22 months to production-ready state.

Budget: Came in 40% over initial estimates. Why? TypeScript hiring was harder than expected (most of our platform team came from Go/Python backgrounds). Had to bring in contractors for frontend work.

Maintenance burden: This is where your engineering team’s estimates are dangerously optimistic. We thought 1-2 FTEs for maintenance. Reality is 2-3 FTEs just for keeping up with Backstage core updates, plugin compatibility, and security patches.

Hidden Costs Your Calculation Is Missing

  1. Opportunity cost: While platform team maintained Backstage internals, they weren’t building platform capabilities. We delayed critical service onboarding automation by 9 months because team was stuck debugging React component issues.

  2. TypeScript expertise tax: If your platform team isn’t already strong in React/TypeScript, you’re either retraining or hiring. Both are expensive and slow.

  3. Distributed systems complexity: Running Backstage at scale requires separate workers for async jobs, proper queue management, monitoring for a distributed system. That’s not “2 FTEs for maintenance”—that’s real infrastructure work.

  4. Plugin ecosystem maintenance: Every time Backstage core updates, you need to verify plugin compatibility. Some plugins lag behind, forcing you to either wait or fork and maintain yourself.

Break-Even Is The Wrong Question

Here’s what I wish someone had told me: The break-even calculation is only part of the picture. The real question is: Is your platform a competitive differentiator or a commodity?

For us (financial services), we argued it was strategic. In reality, 85% of what we built could have been handled by Humanitec or Roadie with configuration. The 15% that was truly custom? We could have built that as plugins on top of a managed platform.

If your platform enables your business but isn’t your business, buy it.

We spent $1.8M building something we could have bought for $200K. The real loss? The platform features we didn’t build because our team was maintaining framework code instead of solving organizational problems.

What Would I Do Differently?

Hybrid approach: Buy foundation (Humanitec/Roadie/Port), build the differentiating layer. Your engineers still get to solve interesting problems—golden paths, deployment automation, custom integrations. But they’re not maintaining authentication systems and UI component libraries.

Your CFO is right about the math. Your engineers are right about wanting control. The synthesis is: Buy the commodity infrastructure, build the competitive advantage on top.

One more thing: The “vendor lock-in” concern is overweighted. The risk of DIY lock-in (knowledge trapped in 3 engineers who built it) is often higher than managed platform lock-in (standardized system multiple people can understand).

Your instincts are good. Trust the math.

I led our platform team through this evaluation last quarter. Let me share the detailed TCO model we built—it might help you pressure-test your numbers.

Our Context

Fortune 500 financial services, 180 engineers currently, planning for 250 within 12 months. Similar growth trajectory to yours.

The Real Timeline (Not The Optimistic One)

Your engineering team says 12-18 months. Here’s what that actually means:

Phase 1: Basic Portal (4-6 months)

  • Authentication, catalog, basic UI
  • Service documentation templates
  • User management

Phase 2: Catalog Enrichment (2-3 months)

  • Integrations with CI/CD, monitoring, PagerDuty, etc.
  • At least 200 engineering hours just for server-side pagination
  • 2 months for entity enrichment from multiple systems

Phase 3: RBAC & Security (6+ months)

  • Building a full RBAC product is typically 6 months for a 2-person team
  • Roles, admin UI, policy engine, token management
  • This is where security/compliance requirements actually hit

Phase 4: Golden Paths (3-4 months)

  • Service templates, CI/CD automation
  • This is where actual business value starts

Add 20-30% buffer for integration challenges, team ramp-up, scope creep.

So “12-18 months” realistically becomes “18-24 months to actually valuable.” And that’s if everything goes smoothly.
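Adding the phases up, as a rough sketch. The month ranges are the ones listed above; `PHASES`, `BUFFER_PCT`, and `timeline_range` are names made up for illustration. Treating "6+" for RBAC as exactly 6 on both ends is an assumption, so the upper bound here is really a floor.

```python
# Phase estimates in months (low, high), taken from the list above.
# Assumption: "6+" months for RBAC is treated as exactly 6.
PHASES = {
    "basic portal": (4, 6),
    "catalog enrichment": (2, 3),
    "rbac and security": (6, 6),
    "golden paths": (3, 4),
}
BUFFER_PCT = (20, 30)  # buffer for integration issues, ramp-up, scope creep


def timeline_range(phases, buffer_pct):
    """Return (raw_low, raw_high, buffered_low, buffered_high) in months."""
    raw_low = sum(low for low, _ in phases.values())
    raw_high = sum(high for _, high in phases.values())
    buffered_low = raw_low * (100 + buffer_pct[0]) / 100
    buffered_high = raw_high * (100 + buffer_pct[1]) / 100
    return raw_low, raw_high, buffered_low, buffered_high


if __name__ == "__main__":
    raw_low, raw_high, buf_low, buf_high = timeline_range(PHASES, BUFFER_PCT)
    print(f"raw: {raw_low}-{raw_high} months")
    print(f"with buffer: {buf_low:.1f}-{buf_high:.1f} months")
```

This assumes the phases run strictly one after another. Any overlap would shorten the total, and any RBAC overrun would lengthen it.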

The TCO Model We Built

Build Option (3-year horizon):

  • Year 1: $1.2M (7 FTEs fully loaded + infrastructure + contractor help for React work)
  • Year 2: $1.1M (4 FTEs dev + 2 FTEs maintenance + $50K infra)
  • Year 3: $1.1M (same ongoing costs)
  • Total: $3.4M

Buy Option - Roadie for 100 developers initially, scaling to 200:

  • Year 1: $60K platform cost (100 devs × $20/month × 12 = $24K in seats, the rest onboarding), plus ~$300K for 2 engineers on integrations
  • Year 2: $96K (200 devs, volume-discount pricing)
  • Year 3: $96K
  • Total: $252K + $900K engineering capacity = $1.15M

Net savings: $2.25M over 3 years.
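Here is the same model laid out year by year, as a sketch. The yearly figures are the ones above. The one assumption is that the $900K of engineering capacity is spread evenly at $300K per year, matching the year-one figure for 2 engineers. Variable names are illustrative.

```python
from itertools import accumulate

# Yearly costs in USD from the TCO model above.
# Assumption: $900K engineering capacity is split evenly, $300K/year.
BUILD = [1_200_000, 1_100_000, 1_100_000]
BUY_LICENSE = [60_000, 96_000, 96_000]
BUY_ENGINEERING = [300_000, 300_000, 300_000]

# Running totals per year for each option.
build_cum = list(accumulate(BUILD))
buy_cum = list(
    accumulate(lic + eng for lic, eng in zip(BUY_LICENSE, BUY_ENGINEERING))
)

if __name__ == "__main__":
    for year, (build, buy) in enumerate(zip(build_cum, buy_cum), start=1):
        print(f"year {year}: build ${build:,}  buy ${buy:,}  gap ${build - buy:,}")
```

The gap opens in year one and keeps widening, so there is no point inside the 3-year horizon where build catches up.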

But the real kicker is the opportunity cost.

The Maintenance Reality Check

Your platform team says “2 FTEs for maintenance.” Here’s what maintenance actually looks like:

Monthly Backstage upgrades: 30-40 hours (not optional—security patches, features, dependency updates)

Plugin compatibility verification: 20-30 hours (every core upgrade breaks something)

Troubleshooting/support: 50-60 hours (developers reporting issues, unclear docs, edge cases)

Infrastructure maintenance: 20 hours (database migrations, monitoring updates, scaling adjustments)

New feature requests from developers: 60-80 hours (this is where “2 FTEs” becomes “full team”)

Total: 180-230 hours/month = ~1.5 FTEs just for keeping the lights on.

That doesn’t include building new platform capabilities. That’s just maintenance.
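To make the hours-to-FTE conversion explicit, a small sketch. The hour ranges come from the breakdown above. The hours-per-FTE-month figure is not in our model as posted, so it is a parameter here, and the 160 used in the example is an assumed nominal full-time month.

```python
# Monthly maintenance hours (low, high) from the breakdown above.
TASKS = {
    "backstage upgrades": (30, 40),
    "plugin compatibility": (20, 30),
    "troubleshooting/support": (50, 60),
    "infrastructure": (20, 20),
    "feature requests": (60, 80),
}


def monthly_hours(tasks):
    """Sum the low and high ends of every task's monthly estimate."""
    low = sum(lo for lo, _ in tasks.values())
    high = sum(hi for _, hi in tasks.values())
    return low, high


def to_fte(hours, hours_per_fte_month):
    """Convert monthly hours to FTEs. The divisor is an assumption."""
    return hours / hours_per_fte_month


if __name__ == "__main__":
    low, high = monthly_hours(TASKS)
    # 160 = assumed nominal full-time month; adjust for your org.
    print(f"{low}-{high} hours/month")
    print(f"{to_fte(low, 160):.2f}-{to_fte(high, 160):.2f} FTEs at 160 h/month")
```

At a nominal 160 hours the range sits a little under 1.5 FTEs. With fewer productive hours per person per month (meetings, on-call, leave), it moves up toward or past that figure.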

The Skills Gap Problem

Michelle mentioned this, but it’s worth emphasizing: Do you have React/TypeScript expertise on your platform team?

Most platform engineers come from infrastructure backgrounds: Kubernetes, Terraform, Python, Go. Backstage requires frontend chops. That’s a mismatch.

Options:

  1. Hire for it (expensive, slow, niche skillset)
  2. Retrain existing team (3-6 months of reduced productivity)
  3. Contract it out (adds $150-200K/year, coordination overhead)

None of these options are in your initial estimate.

What We Decided

We decided to buy (Roadie) and focus our engineering capacity on:

  • Golden path templates specific to our tech stack
  • Custom deployment automation
  • Cost tracking and optimization dashboards
  • Security compliance automation
  • Integration with our internal tools

These are the things that actually differentiate us. We’re not differentiated by having a hand-built UI or authentication system.

The Productivity Math That Closed The Deal

We calculated the productivity loss across our engineering org from not having good platform capabilities:

  • Before IDP: ~5-6 hours/week per developer on undifferentiated tasks (finding docs, waiting for access, manual deployments)
  • Target after IDP: ~1 hour/week

4-5 hours/week × 180 developers × $75 average hourly cost × 48 weeks = $2.6M - $3.2M annual productivity gain.

Even if we only achieve half that, the ROI is massive compared to platform costs.

The 18-month delay in the build scenario means 18 months of lost productivity: 1.5 years × $2.6M-$3.2M = roughly $3.9M-$4.9M in developer time wasted.
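The productivity math, sketched so you can plug in your own headcount and rates. All inputs are the figures quoted above. The function names are illustrative, and the delay cost assumes the full saving would otherwise have been realized for every month of the delay.

```python
# Inputs from the productivity estimate above.
DEVELOPERS = 180
HOURLY_COST = 75               # USD, average
WORK_WEEKS = 48
HOURS_SAVED_PER_WEEK = (4, 5)  # from ~5-6 h/week down to ~1 h/week


def annual_gain(hours_saved_per_week: int) -> int:
    """Yearly value of the developer hours saved across the org."""
    return hours_saved_per_week * DEVELOPERS * HOURLY_COST * WORK_WEEKS


def delay_cost(hours_saved_per_week: int, months: int) -> int:
    """Value forgone while the platform is not yet delivering.

    Assumption: the full saving applies to every month of the delay.
    """
    return annual_gain(hours_saved_per_week) * months // 12


if __name__ == "__main__":
    for hours in HOURS_SAVED_PER_WEEK:
        print(
            f"{hours} h/week saved: ${annual_gain(hours):,}/year, "
            f"${delay_cost(hours, 18):,} over an 18-month delay"
        )
```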

Bottom Line

Your engineering team’s concerns about lock-in and customization are valid feelings, but are they based on actual requirements?

Challenge them: “What specific customization do we need that Roadie/Humanitec/Port can’t support via their plugin systems?”

In our case, when we got specific, 90% of “requirements” were actually preferences or assumptions. The remaining 10% could be built as plugins on a managed platform.

Your platform engineers’ time is your scarcest resource. Spend it on building organizational capability, not maintaining open-source frameworks.

Coming at this from a different angle—the product experience trade-off that nobody’s talking about.

I was design systems lead at a startup that tried to DIY our internal platform. We spent 14 months building on Backstage. It was… a learning experience.

The Hidden UX Cost

When you choose “build,” you’re not just building backend infrastructure. You’re committing to:

  • Product design for every interaction
  • Frontend engineering for every component
  • Continuous UX refinement as you learn what developers actually need

Our team was fantastic at backend systems, Kubernetes, infrastructure automation. We were not good at building React applications.

What Actually Happened

Our portal worked functionally, but it felt homemade:

  • Inconsistent interaction patterns
  • Components that looked slightly off
  • Loading states that were janky
  • Mobile experience that was basically broken
  • Accessibility that was… let’s just say non-existent

Developers avoided using it. Not because it didn’t work, but because the experience was frustrating compared to tools they used daily (like GitHub, Datadog, etc.).

Portal adoption is a product challenge, not just a technical one.

The Comparison That Hurt

I demoed Roadie to our team after 10 months of building. Their portal was:

  • Polished, professional UI
  • Consistent design system
  • Responsive across devices
  • Actually accessible
  • Regular UX improvements shipping automatically

We had spent $900K and had something that looked like a student project. For about $60K, Roadie customers were getting something that looked like a professional SaaS product.

That’s when I realized: We were competing with products built by teams who do only this, all day, every day.

The Ongoing Product Debt

Even after launch, maintaining product quality requires:

  • Designers who understand developer tools
  • Frontend engineers who care about details
  • User research to understand pain points
  • Continuous iteration based on feedback

If you’re serious about build, are you budgeting for a product manager, a designer, and frontend engineers long-term? Because that’s what it takes to compete with the UX quality of managed platforms.

The Opportunity Cost Question

Your CFO’s framing is brilliant: What if you spent that money, roughly $1M a year of your $3M three-year delta, on features that drive revenue instead of internal infrastructure?

That’s $1M in product development capacity. For a Series B fintech, that could be:

  • 3-4 senior engineers building customer-facing features
  • A year of product experimentation
  • An entire new product line

We chose to build our platform. Our competitors who bought theirs shipped 2 major features while we were debugging React hooks.

Guess who had better retention and growth that year? (Hint: not us.)

What I’d Ask Your Engineering Team

“If we build this, who owns the product experience? Who’s accountable for developer satisfaction with the portal? Who decides priorities when developers request UX improvements?”

Because in my experience, nobody wants to own that. Engineers want to build capabilities, not maintain UI component libraries.

The Hybrid Approach That Makes Sense

If your engineers really want to build (and I get the appeal), suggest this:

  • Buy the portal UI, authentication, catalog, RBAC: all the commodity infrastructure
  • Build the golden paths, workflows, and integrations that encode your organizational knowledge

That way they’re solving interesting problems (how do we onboard services faster? how do we encode security requirements into templates?) instead of React problems (how do we make this dropdown keyboard-accessible?).

Your engineers will be more engaged, your developers will have a better experience, and your product team can focus on actual products.

The math is clear. The UX quality difference is clear. The opportunity cost is clear.

What’s keeping you from the obvious decision?

The break-even question misses the talent development and organizational scalability angles. Let me share what we learned the hard way.

The Knowledge Concentration Risk

When you build, you create specialized knowledge concentrated in a small team. At our previous company, we had 3 engineers who deeply understood our self-hosted Backstage implementation.

Then two of them left within 6 months of each other.

Suddenly we had:

  • One person with complete knowledge (massive key-person risk)
  • New platform team members ramping for 4-6 months each
  • Organizational knowledge trapped in code without documentation
  • Fear of making changes because “only Sarah really understands this”

Bus factor became organizational risk.

The TCO calculation didn’t include “what happens when the team who built it leaves?” The real cost: 8 months of platform stagnation while we rebuilt institutional knowledge.

The Skills Ecosystem Question

Building Backstage requires niche skills: Backstage framework expertise, React, TypeScript, plugin architecture.

Using a managed platform requires common skills: understanding IDP concepts, API integration, workflow automation.

Which skillset:

  • Is easier to hire for?
  • Is more transferable if someone leaves?
  • Helps engineers grow their careers?
  • Scales as your org grows?

“Backstage framework expert” is a tiny talent pool. “Platform engineer who can integrate systems and build golden paths” is a much larger pool.

The Hiring and Retention Angle

When we recruited for our platform team:

DIY Backstage pitch:
“Build and maintain our internal developer portal built on Backstage framework. You’ll work on TypeScript, React, plugin development, and keeping up with framework updates.”

Managed platform pitch:
“Build platform capabilities that accelerate 200+ engineers. You’ll design golden paths, build automation, integrate our tool ecosystem, and directly impact developer productivity.”

Which job is more appealing to a senior platform engineer?

Most want to solve novel organizational problems, not maintain an open-source framework. The second pitch got way more interest and better candidates.

Retention risk: Engineers burn out from maintenance treadmill. Your best people leave when they’re just “keeping the lights on” instead of building.

The Scaling Consideration

As you grow from 150 to 300 to 500 engineers:

Build costs stay high or increase:

  • More maintenance burden (more users = more issues)
  • More feature requests (more teams = more diverse needs)
  • More platform engineers needed (someone estimated 1 platform engineer per 50-75 developers)

Buy costs scale more predictably:

  • Per-seat pricing is transparent
  • Vendor handles scaling infrastructure
  • Platform team size scales with actual differentiated work, not maintenance

From 150 to 500 engineers, your DIY costs might double. Your managed platform costs scale linearly at ~$20-25/dev/month.
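A quick sketch of what linear per-seat scaling looks like at those headcounts. The $20-25/dev/month range is the one quoted above. The function name is illustrative, and real vendor pricing may include tiers or minimums that this ignores.

```python
def annual_seat_cost(developers: int, per_dev_month: int) -> int:
    """Yearly license cost under flat per-seat pricing (USD).

    Assumption: no tiers, minimums, or volume discounts.
    """
    return developers * per_dev_month * 12


if __name__ == "__main__":
    for devs in (150, 300, 500):
        low = annual_seat_cost(devs, 20)
        high = annual_seat_cost(devs, 25)
        print(f"{devs} devs: ${low:,}-${high:,}/year")
```

Even at 500 engineers the license line stays below the fully loaded cost of one platform engineer, which is the comparison that matters against a DIY team that grows with headcount.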

The Organizational Design Question

If you build: 4-6 engineers split between framework maintenance and capability development.

If you buy: 2-3 engineers fully focused on golden paths, automation, and organizational patterns.

Which team is more energized? Which has clearer impact? Which is easier to staff and grow?

The Talent Strategy Perspective

Your $3M delta isn’t just about platform costs. It’s about:

  • 15+ engineering headcount you could hire for customer-facing work
  • Career development for platform team (building vs maintaining)
  • Organizational resilience (distributed knowledge vs concentrated)
  • Hiring velocity (common skills vs niche framework expertise)

What I’d Do Differently

At my previous company, we chose DIY. Platform team was constantly firefighting, morale was low, key people left, and we stagnated.

Current company: We use Humanitec for foundation, build golden paths on top. Platform team is energized, retention is high, and we’re shipping platform capabilities monthly instead of quarterly.

From a talent strategy perspective: Buy the platform, build your golden paths.

Your people are your scarcest resource. Spend them on problems that grow the business and their careers, not on maintaining open-source frameworks.

The math says buy. The talent reality says buy. The scalability case says buy.

What’s the actual blocker to the obvious decision?