OpenTelemetry Hits 95% Adoption for New Cloud-Native Projects — Is Your Observability Vendor Lock-In Finally Over?

The numbers from the latest CNCF survey are hard to ignore: OpenTelemetry has reached a 95% adoption rate for new cloud-native projects in 2026. Not “awareness” — actual adoption. And 89% of production users now say OTel compliance is “very important” or “critical” when evaluating observability vendors. We’ve crossed a tipping point, and the implications for how we think about observability architecture are significant.

But as someone who just completed a major observability migration using OTel as the abstraction layer, I want to give you the full picture — the genuine wins, the hidden traps, and the places where the promise still falls short of reality.

The Promise: Vendor-Agnostic Telemetry

The core value proposition of OpenTelemetry is elegant: standardize how applications generate telemetry data (traces, metrics, logs), and decouple that from where the data goes. Instrument your code once using OTel SDKs, and you can send that data to Datadog, Grafana Cloud, Elastic, Honeycomb, Lightstep, or any other backend. If your vendor doubles their pricing — and let’s be honest, Datadog’s pricing has made this a very real concern — you can switch backends without touching your application code.
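
Concretely, the decoupling lives in the OTel Collector: applications speak OTLP to a Collector you run, and the backend is just an exporter entry in its config. A minimal sketch (the endpoint is a placeholder, not a real backend):

```yaml
receivers:
  otlp:                          # applications send traces/metrics/logs here via OTLP
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}                      # batch telemetry before export

exporters:
  otlp/backend:                  # the only vendor-specific part of the pipeline
    endpoint: my-backend.example.com:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/backend]
```

Swapping vendors means editing the `exporters` section; the application side never changes.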

In theory, this eliminates the observability vendor lock-in that has been a pain point for infrastructure teams for the better part of a decade. In practice? It’s complicated.

The Reality: Soft Lock-In Is the New Hard Lock-In

Every major observability vendor now advertises “full OTel support.” But here’s what they don’t highlight: they’re all adding proprietary extensions on top of OTel that create soft lock-in. Datadog’s “Enhanced OTel” adds custom attributes and semantic conventions that only render properly in the Datadog UI. Grafana’s OTel integration works best with their custom resource detectors. Elastic’s APM agent wraps OTel with proprietary correlation logic.

None of this violates the OTel spec — it extends it. But the practical effect is that teams who use these “enhanced” features find themselves just as locked in as they were before, because their telemetry only makes full sense in one vendor’s UI.

The lesson: OTel gives you portability of the base telemetry layer, but you have to be disciplined about staying within the standard spec and avoiding vendor-specific extensions if you actually want to maintain the ability to switch.

Our Migration Story

My team migrated from Datadog to Grafana Cloud over the past quarter, using OTel as the abstraction layer. Here’s what that looked like:

The good: Our services were already instrumented with OTel SDKs (we made that investment 18 months ago specifically for this flexibility). Switching the backend meant changing OTel Collector configuration: swapping the Datadog exporter for Grafana Cloud's OTLP endpoint. For 80% of our telemetry pipeline, this was a configuration change. The migration took 3 weeks instead of the 6 months we estimated it would take without OTel.
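
To show how small the 80% case is, here's a before/after sketch of the exporter swap (the endpoint and credential names are illustrative, not our actual config):

```yaml
exporters:
  # Before: the contrib Datadog exporter
  datadog:
    api:
      key: ${env:DD_API_KEY}
  # After: Grafana Cloud's OTLP/HTTP endpoint
  otlphttp/grafana:
    endpoint: https://otlp-gateway.example.grafana.net/otlp
    headers:
      Authorization: "Basic ${env:GRAFANA_CLOUD_TOKEN}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/grafana]   # was: [datadog]
```

One line in the pipeline changes; the instrumented services never know the difference.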

The painful: The remaining 20% was brutal. We had Datadog-specific custom metrics with semantics that didn’t translate cleanly. Dashboard queries written in Datadog’s proprietary query language had to be rewritten for Grafana’s PromQL/LogQL. Alert definitions couldn’t be ported. All the operational knowledge embedded in “how to investigate X in Datadog” had to be rebuilt for Grafana.

The cost savings: Moving from Datadog to Grafana Cloud reduced our observability bill by approximately 60%. At our scale, that’s a substantial annual savings. I’ll be honest — cost was the primary driver for this migration, not philosophical commitment to open standards. OTel made the migration feasible; cost made it necessary.

Where OTel Still Falls Short

Despite the impressive adoption numbers, there are real gaps:

Logs: OTel’s logging support is still the weakest leg of the observability triad. The log data model was only stabilized recently, and SDK support varies significantly across languages. If you’re a Python or Java shop, you’re in reasonable shape. If you’re running Go or Rust services, the logging SDK maturity is still catching up. Most teams I talk to still use a separate logging pipeline (Fluentd, Vector) alongside OTel for traces and metrics.

Profiling: Continuous profiling support in OTel is still in its early stages. The profiling signal was accepted as an OTel signal type last year, but production-grade SDK support is limited. If profiling is core to your observability strategy, you’re still largely dependent on vendor-specific agents.

Spec velocity: The OTel spec moves slowly by design — stability is a feature, not a bug. But this means emerging observability patterns (eBPF-based instrumentation, AI workload telemetry, LLM token tracking) aren’t covered by the spec yet, and vendors are filling the gap with proprietary solutions.

AI-Powered Observability: The Next Frontier

One area where things are moving fast is AI-driven root cause analysis layered on top of OTel data. Grafana launched an AI assistant that correlates traces, metrics, and logs to suggest root causes. Elastic’s AI-powered anomaly detection works directly with OTel-formatted data. Datadog’s Watchdog has been doing this for a while but now accepts OTel-native inputs.

The interesting dynamic: OTel standardization is making AI-powered observability more viable because the data is structured consistently regardless of source. AI models trained on OTel-format traces can generalize across different services and even different organizations in ways that vendor-specific data formats couldn’t support.

The Real Question

So here’s what I want to discuss: has anyone successfully gone multi-vendor with OTel? I mean actually sending the same telemetry to multiple backends simultaneously — using Grafana for dashboarding, Honeycomb for trace exploration, and a data lake for long-term analytics?

The OTel Collector’s fan-out exporter capability makes this technically possible, but I’m curious about the operational reality. Does anyone actually run multi-vendor, or does everyone end up picking one backend and sticking with it, with OTel just serving as insurance against future vendor changes?
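
For reference, the fan-out itself is just listing multiple exporters on one pipeline; the operational questions are everything around it. A sketch of the shape I'm describing (endpoints and keys are placeholders):

```yaml
exporters:
  otlp/honeycomb:                # trace exploration
    endpoint: api.honeycomb.io:443
    headers:
      x-honeycomb-team: ${env:HONEYCOMB_API_KEY}
  otlphttp/grafana:              # dashboarding and alerting
    endpoint: https://otlp-gateway.example.grafana.net/otlp
  file/datalake:                 # OTLP JSON on disk, batch-loaded into a lake by a separate job
    path: /var/lib/otel/traces-export.json

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/honeycomb, otlphttp/grafana, file/datalake]
```

Every exporter receives a copy of the same data, which is exactly why the cost and security questions below matter.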

I’d also love to hear from anyone who’s migrated to Datadog using OTel. The migration stories I hear are almost always away from Datadog — curious if the traffic goes both directions.

Alex, this is a great writeup and mirrors almost exactly the journey we went through, except we learned some of the painful lessons the hard way before OTel was mature enough to help.

The $400K/Year Savings Story

Let me give the executive perspective on this. My company was spending approximately $680K/year on Datadog — and the trajectory was pointing toward $900K+ as we scaled. When I presented the observability line item to our board, the reaction was immediate: “Why does monitoring cost more than some of your engineering teams?”

We evaluated alternatives and ultimately migrated to a combination of Grafana Cloud for dashboarding and alerting, with Honeycomb for deep trace analysis. The annual savings came to roughly $400K. OTel was the enabler — without it, we estimated the migration would take 12-18 months of engineering time, which would have eaten into the savings significantly.

With OTel as the abstraction layer, we completed the migration in about 10 weeks. But I want to be honest about the parts that weren’t smooth.

What OTel Doesn’t Abstract Away

You mentioned Datadog-specific semantics, and I want to expand on this because it was our biggest pain point. Our teams had built years of operational knowledge around Datadog’s specific conventions:

  • Custom metrics using Datadog’s tagging conventions that didn’t map cleanly to OTel’s attribute model
  • Dashboard queries written in Datadog’s query syntax — hundreds of dashboards that had to be manually recreated
  • Alert conditions that relied on Datadog-specific aggregation behaviors
  • Runbooks that referenced Datadog UI navigation (“click on the APM tab, filter by service X”)

The telemetry collection migrated cleanly because we were already using OTel SDKs. But the operational layer above the telemetry — dashboards, alerts, runbooks, team workflows — had to be rebuilt from scratch. That’s the part that took the bulk of the 10 weeks.

My Advice: New Services First, Legacy Later

If I were advising a CTO who’s considering this move, here’s what I’d say:

Standardize on OTel from day one for every new service. Use the OTel SDKs directly, avoid vendor-specific instrumentation libraries, and stay within the standard semantic conventions. This costs almost nothing and gives you optionality.

Don’t rush migrating legacy instrumentation. If you have services with Datadog agents deeply integrated, the ROI of migrating them to OTel is questionable unless you’re planning a vendor switch. The migration effort is non-trivial and the risk of breaking existing monitoring during the transition is real.

The real ROI of OTel is in avoiding future lock-in, not in painlessly migrating existing setups. Think of it as insurance: the premium is low (using standard SDKs instead of vendor SDKs), and the payout is enormous if you ever need to switch.

On the Multi-Vendor Question

We do run a mild version of multi-vendor. Our OTel Collectors export traces to both Grafana Tempo and Honeycomb simultaneously. Grafana handles our day-to-day dashboarding and alerting; Honeycomb is used by senior engineers for deep-dive investigations when Grafana’s trace UI isn’t powerful enough.

It works, but it’s not free — you’re paying for two backends ingesting the same data, and there’s cognitive overhead in knowing which tool to use for which investigation type. For most companies, I’d recommend picking one primary backend and treating multi-vendor as a future option rather than a day-one architecture.

The 95% adoption number is meaningful, but adoption of the SDK is the easy part. True vendor independence requires discipline at every layer of the observability stack, and that discipline is harder to maintain than most teams realize.

I want to raise a dimension of the OTel conversation that I think is critically underappreciated in most adoption discussions: the security and privacy implications of standardized, portable telemetry.

Telemetry Data Is Sensitive Data

When your observability stack was vendor-specific — say, Datadog’s agent sending data directly to Datadog’s backend — you had a relatively simple trust model. Your telemetry data flowed through one vendor’s infrastructure, governed by one vendor’s security posture and one set of contractual data handling obligations.

With OTel, the architecture is fundamentally different. Your telemetry might pass through:

  1. OTel SDK in your application (your code)
  2. OTel Collector running in your infrastructure (your ops)
  3. An intermediary processing layer (could be third-party)
  4. One or more backend vendors (Grafana, Elastic, etc.)
  5. Potentially a data lake for long-term storage (your data team)

Each hop is a potential data exposure point. And here’s the thing most teams don’t think about until it’s too late: telemetry data routinely contains sensitive information.

What Lives in Your Traces

I did a security audit of our OTel trace data last quarter, and what I found was alarming:

  • User IDs and session tokens embedded in span attributes from HTTP middleware
  • IP addresses and geolocation data captured by default in many instrumentation libraries
  • Request parameters including search queries, form inputs, and API payloads — some containing PII
  • Database queries with literal values in span events, including email addresses and phone numbers
  • Internal service topology that reveals architecture details useful for reconnaissance

With a single-vendor solution, this data lived in one place with one access control model. With OTel’s fan-out capability — the same feature that enables multi-vendor — this sensitive data potentially flows to multiple destinations, each with different security properties.

OTel-Native Security Controls

The good news is that the OTel Collector provides a robust pipeline for implementing security controls at the telemetry layer. Here’s what we’ve implemented:

Attribute-level data masking: We use OTel Collector processors to scrub PII from span attributes before they leave our infrastructure. Any attribute matching patterns like email, phone, SSN, or credit card gets hashed or redacted. This runs at the Collector level, so it applies regardless of which backend the data flows to.
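
For anyone implementing this, the contrib `attributes` processor covers the simple cases: hash or drop known-sensitive keys before any exporter sees them. The key names below are examples, not an exhaustive PII list:

```yaml
processors:
  attributes/scrub-pii:
    actions:
      - key: user.email
        action: hash      # replace the value with its hash
      - key: user.phone
        action: hash
      - key: http.request.header.authorization
        action: delete    # drop the attribute entirely
      - key: credit_card.number
        action: delete

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes/scrub-pii, batch]
      exporters: [otlphttp]
```

This only matches known attribute keys; pattern-based scrubbing (anything that looks like an email in any attribute) needs the `redaction` or `transform` processor instead.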

Sampling policies with security awareness: Not all traces need to be exported. We’ve implemented tail-based sampling that’s aware of data sensitivity — traces touching our user authentication service are sampled at a lower rate and have additional attribute scrubbing applied.
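
A sketch of what sensitivity-aware tail sampling can look like with the contrib `tail_sampling` processor (the service name and percentages are illustrative, not our production values):

```yaml
processors:
  tail_sampling:
    decision_wait: 10s            # buffer spans this long before deciding per trace
    policies:
      - name: keep-errors         # always keep traces containing an error
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: auth-low-rate       # traces touching the auth service: keep ~5%
        type: and
        and:
          and_sub_policy:
            - name: touches-auth
              type: string_attribute
              string_attribute: {key: service.name, values: [auth-service]}
            - name: five-percent
              type: probabilistic
              probabilistic: {sampling_percentage: 5}
```

Note that policies are OR-ed: a trace is kept if any policy samples it, and traces matching no policy are dropped, so the policy list has to cover everything you want exported.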

Egress controls: Our OTel Collector configuration explicitly whitelists which exporters are allowed and which endpoints they can reach. This prevents accidental data leakage if someone adds a new exporter without security review.

Encryption in transit: All OTel Collector-to-backend communication uses mTLS, not just TLS. We verify the identity of the receiving endpoint, not just the encryption of the channel.
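
On the Collector side, mTLS is a few lines of exporter TLS settings (the paths and endpoint are placeholders): the CA file pins which server we'll talk to, and the client cert/key is what lets the backend verify us:

```yaml
exporters:
  otlp:
    endpoint: backend.example.com:4317
    tls:
      ca_file: /etc/otel/ca.crt        # verify the backend's certificate against our CA
      cert_file: /etc/otel/client.crt  # client certificate presented for mTLS
      key_file: /etc/otel/client.key   # private key for the client certificate
```
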

The Compliance Angle

For anyone operating under GDPR, HIPAA, SOC2, or similar frameworks, OTel actually makes compliance harder, not easier, at least initially. With a single vendor, you could point to one Data Processing Agreement covering all your telemetry. With OTel enabling multi-vendor architectures, you need DPAs with every backend that receives telemetry, and you need to ensure your Collector pipeline enforces data minimization before export.

My recommendation: treat your OTel Collector as a security boundary, not just a routing layer. Every piece of telemetry that passes through it should be subject to the same data classification and handling policies you’d apply to any system processing user data.

Alex, your migration story is compelling, and the cost savings are real. But I’d encourage everyone reading this to do a security audit of their telemetry data before enabling fan-out to multiple backends. You might be surprised what’s hiding in your spans.

This thread is hitting on something I’m really excited about, so let me add the data engineering angle — because OTel data is becoming way more valuable than just ops monitoring.

OTel as a Product Analytics Data Source

My team has been doing something that I think will become a standard pattern within the next year: we’re feeding OTel data into our analytics pipeline for product metrics, not just operational monitoring. And the results have been transformative for how we understand user experience.

Here’s the insight that started this: service latency data, when correlated with user sessions, tells you something profoundly useful about product experience. If a user’s checkout flow involves 12 service calls and the P95 latency for that flow is 4.2 seconds, that’s a product metric, not just an SRE metric. It tells you something about conversion probability, user satisfaction, and revenue impact that no amount of frontend analytics can capture with the same precision.

We built a pipeline that takes OTel trace data from the Collector, enriches it with user and session context from our product database, and feeds it into our analytics warehouse. Product managers can now query “what was the P95 latency for users in the premium tier during the checkout flow last week?” without asking the infrastructure team.

The OTel Collector as a Data Engineering Tool

Alex, you mentioned the Collector’s fan-out exporter capability, and I want to emphasize how powerful this is from a data engineering perspective. We’re currently exporting the same OTel data to three destinations simultaneously:

  1. Grafana Cloud — for real-time operational dashboarding and alerting (SRE team)
  2. BigQuery — for long-term analytics and product metrics (data and product teams)
  3. A custom Kafka topic — for real-time ML feature extraction (ML engineering team)
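
The wiring for this is one pipeline with three exporters. A hypothetical sketch, not our literal config; in particular there is no first-party BigQuery exporter, so our warehouse leg goes through an intermediate loader, which the `file` exporter stands in for here:

```yaml
exporters:
  otlphttp/grafana:              # SRE team: real-time dashboards and alerting
    endpoint: https://otlp-gateway.example.grafana.net/otlp
  file/warehouse:                # data team: OTLP JSON, loaded into BigQuery by a separate job
    path: /var/lib/otel/traces-export.json
  kafka/ml:                      # ML team: streaming feature extraction
    brokers: [kafka-1.internal:9092]
    topic: otel-traces

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/grafana, file/warehouse, kafka/ml]
```
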

The same trace data that tells an SRE “service X is slow” also tells a product manager “checkout conversion dropped 3% because service X was slow” and tells an ML engineer “user engagement features should account for latency degradation.” One instrumentation investment, three teams getting value.

This is only possible because OTel standardizes the data format. Before OTel, getting this same data to three destinations meant maintaining three separate instrumentation systems or building fragile ETL pipelines to transform vendor-specific formats.

The Volume Problem Is Real

Now for the honest part: the data volume is enormous, and storage costs are becoming the new observability expense.

We process approximately 2TB of telemetry data daily across our services. At our current BigQuery rates, storing and querying this data costs significantly more than we initially budgeted. The irony is not lost on me — we saved money by switching observability vendors, and some of those savings are being consumed by the analytics pipeline that feeds on the same data.

Our approach to managing this:

  • Aggressive sampling at the Collector level — only 10% of traces go to BigQuery; the other 90% go to Grafana with shorter retention
  • Trace summarization — we built a Collector processor that extracts key metrics from traces (latency, error rate, span count) and exports just the summary to BigQuery, keeping full traces only for the sampled subset
  • Tiered storage — hot data in BigQuery for 30 days, warm data in GCS for 90 days, cold storage after that
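
The per-destination sampling lives in the Collector as two pipelines off the same receiver, each of which gets its own copy of the stream. A sketch (exporter names are placeholders; here the full stream feeds the Grafana leg and a 10% sample feeds the warehouse leg):

```yaml
processors:
  batch: {}
  probabilistic_sampler/warehouse:
    sampling_percentage: 10            # only ~10% of traces reach the analytics leg

service:
  pipelines:
    traces/grafana:                    # full stream, shorter-retention backend
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/grafana]
    traces/warehouse:                  # sampled stream for long-term analytics
      receivers: [otlp]
      processors: [probabilistic_sampler/warehouse, batch]
      exporters: [file/warehouse]
```
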

Even with these optimizations, our telemetry storage costs are growing 15-20% month-over-month as we instrument more services. This is the hidden cost of OTel success: once teams realize how valuable standardized telemetry data is, everyone wants more of it, and the data volumes compound fast.

The Multi-Vendor Answer

To directly answer your question, Alex: yes, we’re running multi-vendor, but not in the way most people imagine. We’re not sending the same data to competing observability platforms. We’re sending the same data to platforms that serve fundamentally different use cases — ops monitoring, product analytics, and ML features.

I think this is the real unlock of OTel: not “I can switch from Datadog to Grafana” (though that’s valuable), but “I can use the same instrumentation investment to serve my SRE team, my product team, and my data science team simultaneously.” That’s where the compounding ROI lives.

Sam’s point about security is well taken though — our pipeline to BigQuery required careful PII scrubbing, and we learned that lesson the hard way when a product manager queried trace data and found raw user emails in span attributes. OTel’s flexibility is powerful, but it requires treating telemetry data with the same care you’d give any user data pipeline.