We Spent 9 Months Building Our Backstage Portal. Developers Used It For 2 Weeks. Here's What We Learned

carlos_ml · March 22, 2026, 4:14am

I need to write this while it’s still fresh. We just had our quarterly platform retrospective, and the conversation was… uncomfortable. In a good way, but still. It’s time to be honest about what happened with our Backstage implementation.

TL;DR: We spent 9 months building our internal developer portal. Developers used it enthusiastically for about 2 weeks. Then usage dropped to nearly zero. Here’s what we learned the hard way.

The Beginning: So Much Optimism

18 months ago, our CTO announced we’d be building an internal developer portal using Backstage. The excitement was real. We were going to solve all the problems:

Service discovery (which team owns what?)
Documentation sprawl (spread across Confluence, GitHub wikis, Google Docs)
Onboarding friction (new engineers taking 3+ weeks to ship their first PR)
Self-service infrastructure (tired of waiting 2 days for environment provisioning tickets)

We dedicated 4 platform engineers full-time. All incredibly skilled—they built our microservices infrastructure, manage our Kubernetes clusters, handle PCI compliance frameworks. We thought: “How hard could this be?”

Reality Check #1: The Skills Mismatch

Here’s what we didn’t anticipate: platform engineers and web developers have completely different skillsets.

Our team works in Go, Python, and YAML all day. We’re great at distributed systems, infrastructure as code, observability pipelines. But TypeScript? React component development? Frontend state management? That’s a different world.

We assumed web development would be easier than our day job. We were wrong. Building production-grade internal tooling requires expertise we didn’t have. After 3 months of struggling, we hired two frontend-focused engineers specifically for Backstage development.

That wasn’t in the plan or budget.

Reality Check #2: The Timeline Deception

The research said 6-12 months. We were confident we’d hit 6 months because our team is experienced.

We launched in 9 months. But “launched” is generous. What we actually shipped:

Basic service catalog with ownership data
Links to documentation (not integrated, just links)
A scaffolding template for new microservices (that only worked for one language)
Dashboard showing recent deployments

The sophisticated stuff we promised—self-service infrastructure provisioning, automated compliance checks, integrated observability, golden path templates—those features kept getting pushed to “phase 2.”

18 months later, we’re still building phase 2 features. “Later” keeps getting later when you’re maintaining what you’ve already built.

Reality Check #3: The Adoption Disaster

Initial adoption looked promising: ~15% of our 200 developers used it in the first month. We celebrated! But watch what happened:

Month 1: 15% weekly active users
Month 3: 12% weekly active users
Month 6: 8% weekly active users
Month 12: 5% weekly active users
Today: ~3% weekly active users
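
To make the decline concrete, here is a rough sketch that turns those percentages into headcount. It assumes the 200-developer figure above and treats each percentage as exact, which the last one is not; the function names are just for illustration.

```python
# WAU percentages from the list above; 200 is the developer headcount
# mentioned earlier in the post.
TOTAL_DEVELOPERS = 200

wau_percent = {
    "Month 1": 15,
    "Month 3": 12,
    "Month 6": 8,
    "Month 12": 5,
    "Today": 3,  # approximate in the post
}


def active_developers(percent: float, total: int = TOTAL_DEVELOPERS) -> int:
    """Convert a WAU percentage into an approximate number of people."""
    return round(total * percent / 100)


def retained_share(first: float, last: float) -> float:
    """Fraction of the initial active base that is still active."""
    return last / first


headcount = {period: active_developers(p) for period, p in wau_percent.items()}

for period, count in headcount.items():
    print(f"{period}: about {count} developers")

share = retained_share(wau_percent["Month 1"], wau_percent["Today"])
print(f"Share of the month-1 active base still around: {share:.0%}")
```

In other words, roughly 30 people at launch and about 6 today, out of 200.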

We built this beautiful portal with service catalogs, scaffolding templates, documentation search, deployment dashboards. And developers… went back to their old workflows.

Why? Because the old workflows were faster and more familiar. Our portal solved problems we thought developers had, not problems they actually have.

The Critical Mistake: Building What We Thought They Needed

Here’s the painful truth: we never asked developers what they actually needed. We made assumptions:

“Of course they want service discovery!” → Most devs work on 2-3 services, already know who owns them
“Documentation should be centralized!” → Developers prefer docs near code (READMEs) over portals
“Scaffolding templates will standardize services!” → Teams have specific needs, one-size-fits-all templates don’t work
“Deployment dashboards provide visibility!” → CI/CD already shows this, portal adds no value

We built 15 plugins thinking “more features = more value.” What we created was cognitive overload. Developers looked at it, saw complexity, and stuck with simple tools they already knew.

What Managed Solutions Would Have Changed

I’ve been thinking a lot about this lately. If we’d started with a managed Backstage solution (like Roadie), what would be different?

Time to value: We’d have been live in 2-3 weeks instead of 9 months. We could have discovered adoption problems in month 2, not month 12.

Validation before investment: We could have tested whether developers actually care about service catalogs before dedicating 4 engineers for a year.

Feature prioritization: Managed solutions force you to start with basics. We might have learned earlier that our comprehensive approach was wrong.

Opportunity cost: Those 4 engineers could have been improving CI/CD pipelines, building better observability, automating compliance—things that would have directly reduced developer friction.

The sunk cost fallacy is real. We’ve invested so much in self-hosted that migrating to managed feels like admitting defeat. But continuing to throw resources at low-adoption tooling is its own form of defeat.

Lessons for Others Considering Backstage

If you’re where we were 18 months ago, here’s what I wish someone had told me:

1. Developer Buy-In Must Come First

The stat that 20% cite “lack of developer buy-in” as the top failure reason? We’re a textbook case. Build with developers, not for developers. Shadow them. Interview them. Identify friction points through observation, not assumption.

2. Maintenance Is Severely Underestimated

We thought: “Once we build it, we’ll need maybe 1 engineer for maintenance.” Reality: 2 FTE permanently, just to keep it running, upgrade Backstage versions, fix broken plugins, handle security patches.

That’s $400K/year in ongoing costs we didn’t budget for.

3. Managed Lets You Validate Faster

If your goal is to learn whether an IDP solves real problems, managed solutions let you run that experiment in weeks instead of quarters. You can always migrate to self-hosted later if you prove ROI.

Starting with self-hosted means you’re betting big before you know if developers will even use it.

4. TypeScript/React Expertise Isn’t Optional

If your platform team doesn’t have deep frontend skills, self-hosted Backstage will be painful. You’re not just configuring YAML—you’re building React applications. That requires different expertise than infrastructure engineering.

5. Measure Adoption Obsessively

We should have had clear adoption targets: “If we don’t hit 30% WAU by month 3, we pause and reassess.” Instead, we kept building features hoping adoption would magically improve. It didn’t.

What We’re Doing Now

We’re seriously considering migrating to a managed solution. The ROI conversation with leadership is straightforward:

Current cost: 2 FTE of maintenance = $400K/year, largely wasted at ~3% adoption
Managed cost: ~$100K/year with better features and support
Savings: $300K/year + ability to redeploy engineers to higher-value work
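
A back-of-the-envelope version of that comparison, as a sketch. Both annual figures are the rough estimates above, and it deliberately ignores migration effort and any one-off costs, which would need to be added for a real business case.

```python
# Rough annual figures from the post; estimates, not vendor quotes.
SELF_HOSTED_ANNUAL = 400_000  # 2 FTE of maintenance
MANAGED_ANNUAL = 100_000      # approximate managed cost


def cost_over(months: int, annual_cost: float) -> float:
    """Pro-rate an annual cost over a number of months."""
    return annual_cost * months / 12


def savings_over(months: int) -> float:
    """Difference between staying self-hosted and moving to managed."""
    return cost_over(months, SELF_HOSTED_ANNUAL) - cost_over(months, MANAGED_ANNUAL)


for months in (12, 18):
    print(
        f"{months} months: self-hosted ${cost_over(months, SELF_HOSTED_ANNUAL):,.0f}, "
        f"managed ${cost_over(months, MANAGED_ANNUAL):,.0f}, "
        f"difference ${savings_over(months):,.0f}"
    )
```

The 18-month row is the one that matters for the decision: it is the price of staying on the current path for as long as we have already been on it.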

The hard part is admitting we made the wrong choice 18 months ago. But that’s cheaper than making the same wrong choice for another 18 months.

Questions for This Community

For those who’ve been through IDP implementations:

How did you drive adoption beyond 30%? What actually worked?
Did you start managed or self-hosted? Would you make the same choice again?
How did you validate use cases before building? What process prevented assumptions?
What metrics proved ROI to leadership? Beyond deployment frequency—actual business impact?

I’m sharing this because I don’t think we’re alone. The 89% market share with 10% adoption rate suggests this is a common problem. Maybe if we’re honest about failures, we can help others avoid them.

Update: Reading the comments on Michelle’s thread about build vs buy really drove this home. We’re all making similar mistakes. Let’s learn from each other’s experience instead of repeating the same patterns.

data_rachel · March 22, 2026, 4:17am

Luis, thank you for writing this. Seriously. This level of honesty about platform engineering failures is rare, and it’s exactly what leaders need to hear before making similar investments.

This Validates Everything I Was Worried About

Your timeline—9 months to launch, still building “phase 2” features 18 months later—is exactly what concerned me about self-hosted. That’s not your team’s fault. That’s the nature of building complex web applications when it’s not your core expertise.

The adoption curve you shared (15% → 3% over a year) is devastating. But here’s what I appreciate: you’re not sugarcoating it. Most post-mortems I read either hide the failure or blame users for “not getting it.” You’re owning the real problem: you built what you thought developers needed instead of validating what they actually need.

The Opportunity Cost Hits Different

Your math: 4 engineers × 9 months is the sunk cost, and the 2 FTE of ongoing maintenance is $400K/year for 3% adoption. That’s about $133K a year per percentage point of adoption. Absolutely brutal ROI.
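
For anyone who wants to plug in their own numbers, a minimal sketch of that unit-cost arithmetic. The inputs are the figures from the original post (200 developers, ~3% WAU, $400K/year); the function names are made up for illustration.

```python
ANNUAL_MAINTENANCE_COST = 400_000  # 2 FTE, per the original post
TOTAL_DEVELOPERS = 200
WAU_PERCENT = 3                    # approximate current adoption


def cost_per_adoption_point(annual_cost: float, wau_percent: float) -> float:
    """Annual cost divided by percentage points of weekly active users."""
    if wau_percent <= 0:
        raise ValueError("adoption must be above zero")
    return annual_cost / wau_percent


def cost_per_active_developer(
    annual_cost: float, total_developers: int, wau_percent: float
) -> float:
    """Annual cost divided by the number of weekly active developers."""
    active = total_developers * wau_percent / 100
    if active <= 0:
        raise ValueError("no active developers")
    return annual_cost / active


per_point = cost_per_adoption_point(ANNUAL_MAINTENANCE_COST, WAU_PERCENT)
per_dev = cost_per_active_developer(
    ANNUAL_MAINTENANCE_COST, TOTAL_DEVELOPERS, WAU_PERCENT
)
print(f"Per adoption point: ${per_point:,.0f}/year")
print(f"Per active developer: ${per_dev:,.0f}/year")
```

Framed per person, 3% of 200 is about 6 developers, so the portal costs on the order of $67K a year for each developer who actually uses it.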

But here’s what really stings: What could those 6 FTEs have built instead?

Improved CI/CD pipelines that reduce build times by 50%?
Better observability that catches production issues before customers report them?
Automated compliance workflows that reduce audit prep time from weeks to days?
Infrastructure cost optimization that saves $1M/year in cloud spend?

All of those would have delivered measurable business value. The portal might still deliver value eventually, but the opportunity cost of what you didn’t build is the hidden expense nobody talks about.

Platform Engineering As Product Engineering

I’m becoming convinced that platform teams need product managers, not just engineering managers. Your failure wasn’t technical—it was product management:

No user research to validate assumptions
No MVP to test adoption before full build
No success metrics defined upfront
No kill criteria (“if adoption < 30% by month 3, we pivot”)

Infrastructure engineers are incredible at building reliable, scalable systems. But building products developers love requires different skills: empathy, user research, ruthless prioritization, willingness to kill features that don’t work.

My team is now requiring that any internal platform investment over $100K goes through the same product validation process we use for customer-facing features. If we wouldn’t build it that way for customers, why would we build it that way for developers?

The Managed Migration Makes Sense

Your ROI case for migrating to managed is compelling:

Current: $400K/year for low adoption and maintenance burden
Managed: ~$100K/year with better features, no maintenance, professional support
Net: $300K savings + ability to redeploy 2 engineers to higher-value work

That’s not admitting defeat. That’s resource allocation based on data. The sunk cost fallacy would be continuing to throw money at something that isn’t working just because you’ve already invested in it.

One question though: Have you investigated why adoption failed before migrating platforms?

I worry that switching from self-hosted to managed Backstage doesn’t fix the fundamental problem if developers don’t care about IDPs at all. Managed gets you faster time-to-value and better UX, but if the core use cases aren’t valuable to developers, you might end up with the same 3% adoption on a managed platform.

Before migrating, I’d recommend:

Interview the 3% who still use it—what value are they getting?
Interview the 97% who don’t—what would make them use it?
Validate whether the problem is the implementation or the concept
Only then decide whether managed Backstage, proprietary alternatives, or no IDP is the right answer

What I’m Taking From This

My team was leaning toward self-hosted because “we can customize it exactly how we want.” Your experience suggests that’s a trap. Customization doesn’t matter if nobody uses it.

We’re now planning to:

Start with managed Backstage (de-risk the technical implementation)
Focus on 1-2 specific use cases that we’ve validated through developer interviews
Set clear adoption targets (30% WAU by month 2 or we reassess)
Measure business outcomes, not just portal features

If we can’t drive adoption with managed, we definitely won’t drive it by building our own. At least with managed, the learning is cheap.

Luis, one more question: If you could go back 18 months, what’s the first thing you would have done differently? Not “use managed instead of self-hosted”—I mean the very first step in the process. What should have happened before any platform decision was made?

vp_eng_keisha · March 22, 2026, 4:17am

Luis, I just forwarded this to my entire design systems team. Not because we’re thinking about Backstage—because this is exactly what happens with design systems nobody uses, and the patterns are identical.

You Built the Portal I’ve Seen a Dozen Times

Service catalog? Documentation search? Scaffolding templates? Deployment dashboards?

These are the equivalent of design systems with:

47 button variants
Comprehensive typography scales
Exhaustive color palettes
Complex state management patterns

Nobody asked for any of it. Platform team (or design team) assumed people needed it. So they built it. And it sits empty while people use simple, “wrong” solutions that actually work for them.

The Core Problem: Building For Creators, Not Users

This line killed me: “Our portal solved problems we thought developers had, not problems they actually have.”

EXACTLY.

You know what developers actually needed? Probably one thing:

Faster environment provisioning

That’s it. Not a portal. Not a service catalog. Not documentation search. Just “I want a staging environment and it takes 2 days to get one.”

If you’d built self-service environment creation as a CLI tool (no portal needed), would it have solved the actual friction? Probably yes. Would it have required TypeScript experts? No. Would it have taken 9 months? Definitely not.

But instead, you built a comprehensive solution to a comprehensive problem you never validated existed.

What Developer Research Would Have Revealed

If you’d shadowed developers for a week (not surveyed them, watched them work), you would have seen:

Most developers work on 2-3 services—they already know the owners, don’t need discovery
Documentation near code (READMEs) is preferred over centralized portals—context matters
One-size-fits-all scaffolding doesn’t work—teams have specific patterns and preferences
CI/CD already shows deployment status—adding it to a portal is redundant

You could have learned all of this before writing a single line of code. Instead, you learned it after a 9-month build and a $400K/year maintenance bill.

The “15 Plugins = More Value” Trap

This is where platform teams (and design teams) go wrong: assuming more features = more value.

Reality: More features = more cognitive load.

When I see a design system with 47 components, my brain shuts down. “This is too complex, I’ll just write custom CSS.” Same thing happened with your portal. Developers saw 15 plugins, felt overwhelmed, went back to simple tools they already knew.

The best products I’ve used do 1-2 things exceptionally well:

Stripe: accepts payments
Twilio: sends messages
Vercel: deploys websites

They don’t try to solve every problem. They nail one specific job and make it 10x easier than alternatives.

If you’d built self-service environment provisioning and made it take 30 seconds instead of 2 days, developers would have loved you. That’s orders of magnitude better, well past the 10x bar. Everything else is just 10% better, which isn’t enough to overcome switching costs.

Managed Won’t Fix the Real Problem

Luis, I appreciate your honesty about considering managed Backstage, but I’m worried you’re solving the wrong problem.

The problem isn’t “self-hosted is expensive to maintain.” The problem is “we built something developers don’t value.”

Managed Backstage gets you:

Faster time to launch
Better UX and polish
No maintenance burden
Professional support

But it doesn’t get you: Solving problems developers actually care about.

If the core issue is that IDPs don’t address real friction points, managed Backstage just makes you fail faster and cheaper. Which is actually good! Better to learn in month 1 than month 12. But don’t assume migrating platforms fixes adoption.

What I’d Recommend Before Migrating

Michelle asked about change management in her thread. I think you’re asking the wrong question. It’s not “how do we drive adoption of the portal we built?” It’s “what do developers actually need?”

Before migrating to managed:

Interview developers from the 97% who aren’t using it, especially those who tried it and stopped. Ask specifically: “What were you hoping to get from the portal?” and “Why did you stop using it?”
Shadow developers for 3-5 days—see where they actually waste time
Identify the top friction point—not the top 5, just the #1 thing
Prototype a solution—doesn’t have to be Backstage, could be a script
Measure whether it actually saves time—real metrics, not surveys

If that prototype drives 50%+ adoption, then consider whether Backstage (managed or self-hosted) is the right platform. But start with the problem, not the solution.

The Question Nobody’s Asking

Here’s what I keep wondering: Does anyone actually need a full developer portal?

Or do we just need:

Faster environment provisioning (solved by Terraform + Slack bot)
Service ownership clarity (solved by CODEOWNERS files + documentation)
Documentation findability (solved by better search in existing tools)
Onboarding efficiency (solved by runbooks + mentorship + better README files)

Maybe the answer is “we don’t need a portal, we need to fix 4 specific workflows.” And maybe that’s cheaper, faster, and gets higher adoption than any comprehensive platform.

Luis, gut check: If you could only fix ONE developer friction point, what would it be? Not “build a portal”—what specific thing wastes developers’ time today?

alex_technical · March 22, 2026, 4:17am

Luis, as someone who’s spent the last decade in product management, this is one of the clearest examples of a product-market fit failure I’ve seen in internal tooling. Let me break down where this went wrong from a product lens.

This Is A Classic PMF Failure, Not A Technical Failure

Your team executed brilliantly on the technical side. You built what you set out to build, on a reasonable timeline, with skilled engineers. The problem is: you never validated that developers wanted what you were building.

Product-market fit failures in B2B SaaS look exactly like this:

Built based on assumptions, not customer research
Feature bloat trying to solve every problem
Adoption starts okay, then drops as users realize it doesn’t solve real pain
Team keeps adding features hoping to reverse declining usage

The 9-month build time isn’t the problem. The problem is going 9 months before learning you’re building the wrong thing.

Let’s Talk About Those Metrics

You mentioned tracking weekly active users (15% → 3%). Good. But that’s a lagging indicator. By the time WAU drops, you’ve already lost the battle.

What you needed to track during development:

Leading Indicators:

Time to complete core workflows (is portal faster than existing tools?)
Developer NPS/satisfaction (surveyed monthly, not just at launch)
Feature utilization rates (which plugins get used vs ignored?)
Return usage (do people come back or one-and-done?)
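
As an illustration of how cheap two of these are to compute, here is a sketch that derives weekly active users and return usage from a usage log. The log format, developer IDs, and dates are all hypothetical; a real portal would pull this from its own analytics or access logs.

```python
from datetime import date, timedelta

# Hypothetical usage log: (developer_id, day the portal was opened).
events = [
    ("dev-01", date(2026, 3, 2)),
    ("dev-01", date(2026, 3, 10)),
    ("dev-02", date(2026, 3, 3)),
    ("dev-03", date(2026, 3, 11)),
]


def week_start(day: date) -> date:
    """Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def weekly_active_users(log):
    """Map each week (keyed by its Monday) to the developers active that week."""
    weeks = {}
    for developer, day in log:
        weeks.setdefault(week_start(day), set()).add(developer)
    return weeks


def wau_percent(active: set, total_developers: int) -> float:
    """Weekly active users as a percentage of all developers."""
    return 100 * len(active) / total_developers


def return_rate(previous: set, current: set) -> float:
    """Share of last week's users who came back this week."""
    if not previous:
        return 0.0
    return len(previous & current) / len(previous)


weeks = weekly_active_users(events)
first, second = sorted(weeks)
print(f"WAU, first week: {wau_percent(weeks[first], 200):.1f}%")
print(f"Return rate into second week: {return_rate(weeks[first], weeks[second]):.0%}")
```

Return rate is the one to watch: it falls before WAU does, because launch curiosity keeps bringing in first-time visitors for a while.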

Business Outcome Metrics:

Time-to-first-deployment for new hires (did onboarding improve?)
Support ticket volume to platform team (did self-service reduce requests?)
Environment provisioning time (did you solve the 2-day wait problem?)
Cross-team collaboration (did service discovery help?)

If you’d measured these monthly, you would have known by month 3 that adoption was failing and pivoted. Instead, you kept building based on the original plan.

The $400K/Year Question

Two FTE maintaining this = $400K/year in ongoing cost, for a portal with low adoption. Let’s put that in business context:

What did you get for $400K/year?

Service catalog that 3% of developers use
Documentation links (not integrated docs, just links)
One scaffolding template that only works for one language
Deployment dashboard that duplicates CI/CD functionality

What could $400K/year buy instead?

Managed Backstage: ~$100K, leaving $300K (about 1.5 FTE at your current rate) to build high-value platform features
Point solutions: Self-service infrastructure ($50K) + better docs ($30K) + onboarding improvements ($20K) = $100K, plus $300K for other work
Direct developer productivity: 2 senior engineers embedded in product teams improving velocity

The opportunity cost is staggering. Platform engineering has to compete for budget against revenue-generating work. If the ROI isn’t clear, it’s a cost center leadership will cut.

Managed Backstage Might Not Be The Answer

You’re considering migrating to managed to save costs and improve UX. But I’m worried you’re about to make a smaller version of the same mistake.

Before migrating, validate the core assumption: Do developers need an IDP at all?

What if the answer is:

“We need faster environment provisioning” → Terraform + Slack bot
“We need better documentation” → Improve READMEs + better search in GitHub/Notion
“We need service ownership clarity” → CODEOWNERS files + documentation
“We need onboarding improvements” → Runbooks + mentorship programs

None of those require a portal. They might not even require new tools—just better processes and existing tools used better.

Managed Backstage might give you 10% better UX and 75% lower costs, but if the core value prop isn’t there, you’re still spending $100K/year on something with low adoption.

What I’d Do If I Were You

If this were my product (and I had your honest data), here’s the process:

Step 1: Understand Why Usage Dropped (2 weeks)

Interview 10-15 developers who stopped using it
Ask: “What were you hoping to get?” and “Why did you stop?”
Shadow 5 developers for a day each—see actual workflows
Identify top 3 friction points in their day-to-day work

Step 2: Validate Whether IDP Is The Right Solution (2 weeks)

For each friction point, list possible solutions (portal vs point tools vs process changes)
Prototype the simplest solution (maybe not Backstage at all)
Test with 10 developers, measure time saved
Only proceed if you get >70% “would use regularly” commitment

Step 3: Choose Implementation Path (1 week)

If IDP is right: Managed Backstage vs self-hosted vs alternatives
If point solutions are right: Buy/build specific tools
If process is right: Change workflows, don’t build tools

Step 4: Set Kill Criteria Before Building (1 day)

“If we don’t hit X% adoption by month Y, we stop investing”
“If time saved per developer < $Z/month, we pivot”
“If developer NPS < 30, we reassess”
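
Writing the criteria down as code (or even a spreadsheet formula) makes them harder to quietly renegotiate later. A minimal sketch, filling in X and Y with the “30% WAU by month 3” target suggested earlier in the thread; the class and function names are invented for the example, and the time-saved criterion is left out because the post gives no numbers for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KillCriteria:
    min_adoption_percent: float  # the "X%" in the first criterion
    deadline_month: int          # the "month Y"
    min_nps: float               # NPS floor from the third criterion


def reasons_to_reassess(
    criteria: KillCriteria,
    month: int,
    adoption_percent: float,
    nps: Optional[float] = None,
) -> list:
    """Return the criteria that have been tripped; an empty list means keep going."""
    reasons = []
    past_deadline = month >= criteria.deadline_month
    if past_deadline and adoption_percent < criteria.min_adoption_percent:
        reasons.append(
            f"adoption {adoption_percent}% is below "
            f"{criteria.min_adoption_percent}% at month {month}"
        )
    # NPS is optional because it may not be surveyed every month.
    if nps is not None and nps < criteria.min_nps:
        reasons.append(f"NPS {nps} is below {criteria.min_nps}")
    return reasons


criteria = KillCriteria(min_adoption_percent=30, deadline_month=3, min_nps=30)

# Month-3 adoption from the original post was 12% weekly active users.
print(reasons_to_reassess(criteria, month=3, adoption_percent=12))
```

Run against the numbers in the original post, the adoption check trips at month 3, which is nine months before the 5% reading.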

Total time to validate: about 5 weeks. Way less than 9 months.

The Broader Question

You asked: “How did others drive adoption beyond 30%?”

I’d reframe: “How did others validate their IDP would get 30% adoption BEFORE building it?”

Because if you build it and then try to drive adoption, you’re doing product management backwards. Adoption should be validated before you write code, not after you launch.

The teams that succeed start with the problem, validate solutions, then build. The teams that fail start with the solution, build it, then discover the problem doesn’t exist.

Luis, my question for you: What’s the one metric that, if it improved, would prove your platform team is delivering value? Not “portal adoption”—actual business outcome. Time-to-production for new services? Developer satisfaction? Deployment frequency? Cloud cost efficiency?

Once you identify that metric, work backwards: What’s the minimum solution that moves it? Might not be Backstage at all.

alex_infrastructure · March 22, 2026, 4:17am

Luis, I’ve read this post three times and every time I find something new to think about. Thank you for being this vulnerable about what went wrong. This is exactly the kind of learning our industry needs more of.

Adoption Is A Leadership Challenge, Not A Technology Challenge

What strikes me most: this wasn’t a technical failure. Your platform team built exactly what they set out to build. The failure was organizational—you didn’t get developer buy-in before, during, or after the build.

I’m scaling engineering from 25 to 80+ people right now, and adoption challenges are becoming my biggest concern. Your experience validates what I’ve been worried about: even perfectly executed technical projects can fail if you don’t solve the people problems first.

The 15% → 3% Curve Is A Change Management Failure

Looking at your adoption trajectory:

Month 1: 15% (curiosity + launch enthusiasm)
Month 3: 12% (novelty wearing off)
Month 6: 8% (reverting to familiar tools)
Today: 3% (only the true believers remain)

This isn’t developers being stubborn or resistant to change. This is the portal not being obviously better than existing workflows. Change management 101: people only adopt new tools if the new way is significantly easier than the old way.

Your portal might be better in objective ways (centralized, standardized, comprehensive). But if it’s not easier in subjective ways (faster, more familiar, less cognitive load), adoption will fail.

Questions I’m Wrestling With For My Own Team

Your experience raises questions I need to answer before we invest in any platform:

1. How Do You Get Developer Buy-In Early?

Options I’m considering:

Embed platform engineers in product teams (1 week rotations) to understand workflows
Create advisory council of developers who review platform decisions
Require user stories from actual developers before any platform investment
Pilot with willing teams before company-wide rollout

What would have worked for you? If you could go back, how would you ensure developers were partners in the design process, not recipients of a finished product?

2. How Do You Balance “What Developers Want” vs “What The Org Needs”?

This tension is real:

Developers might want freedom and flexibility
Organization needs standardization and governance
Platform team stuck in the middle

Your service catalog might not help developers ship features faster (what they want), but it might help with compliance, security, ownership clarity (what org needs). How do you reconcile that gap?

3. When Do You Force Adoption?

I keep coming back to this. Should you:

Option A: Make the new tool so good that adoption is voluntary

Pros: High satisfaction, organic evangelism
Cons: Takes time, might never reach 100% adoption

Option B: Sunset old tools and force migration

Pros: Guarantees adoption, enables standardization
Cons: Resentment, lower satisfaction, need to be confident new tool is actually better

Your 3% adoption suggests voluntary didn’t work. Would forcing migration have helped, or just made developers resentful?

4. How Do You Measure Whether Platform Engineering Is Working?

You mentioned 2 FTE maintenance = $400K/year for low adoption. But even if adoption was 80%, how do you prove value?

Platform engineering is infrastructure—it’s supposed to be invisible when it works. How do you measure:

Velocity improvements (hard to attribute to platform vs other factors)
Developer satisfaction (surveys are lagging indicators)
Time saved (self-reported and often inaccurate)
Business outcomes (deployment frequency doesn’t equal revenue)

This is the conversation our CFO will have with me when I ask for platform headcount. I need better answers.

The Managed Migration Decision

Your ROI case makes sense on paper:

Save $300K/year in maintenance costs
Better UX and features from managed solution
Redeploy 2 engineers to higher-value work

But Maya’s point haunts me: what if the problem isn’t implementation, but concept? What if developers just don’t value IDPs at all, regardless of how well-executed they are?

Before migrating, I’d want to know:

Who are the 3% still using it and why? What value are they getting that the 97% aren’t?
What would the 97% actually use? Interview them, not survey them
Is there a forcing function that would drive adoption? Like sunsetting old tools
Would managed Backstage change anything? Or just deliver the same low-value product more efficiently?

If the answer to that last question is “it wouldn’t change adoption,” then migrating platforms doesn’t solve the real problem.

What I’m Taking From This

My team was planning to build a comprehensive developer portal as part of our scaling initiative. Your post is making me completely rethink that approach.

New plan:

Start with developer interviews (not platform team assumptions)
Identify the #1 friction point (not the top 10)
Prototype solution in 1-2 weeks (script, process change, whatever works)
Measure adoption and time saved
Only scale if prototype proves value

If I can’t get 50%+ adoption on a simple prototype, I definitely won’t get it on a comprehensive platform.

One Final Question

Luis, you mentioned considering managed Backstage. But have you also considered just… shutting down the portal entirely?

Not being snarky—genuinely asking. What happens if you sunset it? Do those 3% lose critical functionality? Or do they just go back to tools that the 97% were already using?

If sunsetting it doesn’t break anything critical, that tells you something important: the portal never solved a must-have problem. And if it never solved must-have problems, migrating platforms won’t fix that.

Sometimes the right answer is “we learned this doesn’t work, let’s redirect resources to things that do.” That’s not failure—that’s evidence-based decision making.

Curious what your leadership is saying about this. Are they pushing for migration to managed? Or questioning whether to continue investing in IDPs at all?