Your AI Feature Has No DRI: Why It's Drifting Without a Quarterly Goal Owner
Walk into a quarterly business review and ask whose name is on the AI feature. Watch what happens. The PM points at the platform team. The platform team points at the research engineer who wrote the eval harness. The research engineer points at the FinOps analyst who keeps emailing about the cost graph. The FinOps analyst points back at the PM. Four people, one feature, zero owners. The eval score has been drifting downward for six weeks and nobody has triaged it because the dashboard lives in a Notion page that was last edited the day after launch.
This is the most predictable outcome of how organizations actually ship AI features in 2026. The feature was launched by a tiger team that got disbanded the moment the launch press release went out. The instrumentation was bolted on by an infra group that has no product mandate. The prompt lives in a `prompts/v3.txt` file whose git blame is split across nine engineers, none of whom remember why line 47 says what it does. The user-facing tile has a PM whose OKRs moved on to the next launch two quarters ago. The feature is technically in production, technically owned, and structurally orphaned.
Every other product surface in the org has a directly responsible individual whose promotion case rests on it. The checkout flow has a name. The notification system has a name. The search ranking model — even the one that predates the LLM era — has a name. The AI feature has an org chart. That's the difference, and that's why one set of features gets better quarter over quarter while the other set drifts.
Symptoms of an Orphaned AI Feature
The drift is rarely loud. It looks like nothing for a while, then it looks like a series of small awkward facts that nobody connects until an incident review forces the connection.
A pattern I see repeatedly: the eval suite catches a regression on Tuesday. The Slack alert fires into a channel called #ai-quality that 23 people are in and none of them are paged. The alert sits for nine days. Someone finally notices in a stand-up, digs through the prompt repo, and learns that an unrelated dependency bump changed the JSON-mode behavior of the model client. The fix is two lines. The latency on the fix was nine days because nobody's calendar said "if #ai-quality fires, you respond."
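The missing calendar entry is small enough to sketch. Here is a minimal version of an eval job that pages a named owner directly instead of posting into the channel; the threshold value, the paging endpoint, and the handle are all placeholders, not a real API.

```python
# eval_gate.py: hypothetical nightly job that pages the named DRI on a
# regression instead of posting into a 23-person channel.
import json
import urllib.request

EVAL_THRESHOLD = 0.87  # the agreed "below this we roll back" line (placeholder)
PAGER_URL = "https://pager.example.com/v1/page"  # hypothetical paging endpoint
DRI_HANDLE = "jane.doe"  # a person with a calendar, not a team alias

def run_eval_suite() -> float:
    """Stub for the real eval harness; returns a composite quality score."""
    return 0.84  # replace with the real suite's composite number

def page_dri(score: float) -> None:
    body = json.dumps({
        "recipient": DRI_HANDLE,  # one calendar says "if this fires, you respond"
        "summary": f"eval score {score:.3f} below threshold {EVAL_THRESHOLD}",
        "severity": "high",
    }).encode()
    req = urllib.request.Request(
        PAGER_URL, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)

if __name__ == "__main__":
    score = run_eval_suite()
    if score < EVAL_THRESHOLD:
        page_dri(score)  # minutes of latency instead of nine days
```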
A second pattern: prompts accumulate TODO comments. `// TODO: figure out why this still hallucinates on the medical disclaimer case.` `// TODO: add a fallback when the tool call returns [].` The TODOs pile up because no one has the merge authority — or more accurately, no one has the responsibility — to make a call. Every change to a prompt is a low-grade product decision masquerading as a config edit, and product decisions without an owner default to "leave it as-is and add a comment."
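One mechanical fix is to make the repo enforce the decision. A sketch of a CI gate that fails prompt edits lacking an explicit sign-off, assuming a git checkout; the paths and the trailer convention are invented for illustration.

```python
# check_prompt_owner.py: hypothetical CI gate. Prompt edits fail the build
# unless the commit carries an explicit DRI sign-off trailer.
import subprocess
import sys

PROMPT_PATHS = ("prompts/",)  # assumed repo layout
SIGNOFF = "DRI-Approved:"     # made-up trailer convention for this sketch

def changed_files(base: str = "origin/main") -> list[str]:
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def last_commit_message() -> str:
    out = subprocess.run(
        ["git", "log", "-1", "--format=%B"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

if __name__ == "__main__":
    touched = [f for f in changed_files() if f.startswith(PROMPT_PATHS)]
    if touched and SIGNOFF not in last_commit_message():
        print(f"prompt files changed without a '{SIGNOFF}' trailer: {touched}")
        sys.exit(1)  # a prompt edit is a product decision; it needs a named decider
```

A CODEOWNERS entry pointing `prompts/` at the DRI gets you much of the same effect with less machinery. Either way, the repo, not a meeting, knows whose call it is.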
A third pattern: an incident retro names "the AI team" as the action owner. There is no AI team. There is a platform group, a research group, a product group, and a feature flag the on-call SRE wishes someone would explain. The action item gets re-assigned three times over the next month and eventually closed as "deferred."
A fourth pattern: leadership asks for the AI feature's adoption number and three groups send three different dashboards. Product reports weekly active users of the parent feature (the one the AI tile lives inside). Platform reports raw API call volume. FinOps reports unique billing accounts that touched a model endpoint. None of these numbers measure the same thing. None of them are wrong. None of them answer the question.
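What the missing shared definition might look like, written down once where all three groups can read it. The event names and the value-signal rule below are illustrative assumptions, not a standard.

```python
# adoption.py: one canonical definition of "weekly active AI user."
# Event names and the value rule are illustrative assumptions.
from datetime import datetime, timedelta

VALUE_EVENTS = {"ai_suggestion_accepted", "ai_answer_copied"}  # assumed events

def weekly_active_ai_users(events: list[dict], week_start: datetime) -> set[str]:
    """Counts a user only on a value signal: not parent-feature WAU,
    not raw API volume, not billing accounts."""
    week_end = week_start + timedelta(days=7)
    return {
        e["user_id"]
        for e in events
        if e["name"] in VALUE_EVENTS and week_start <= e["ts"] < week_end
    }
```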
If two or more of these are happening, the feature has no DRI. The org chart has been substituted for a person, and the org chart cannot make decisions, cannot be paged, cannot have a promotion case.
Why AI Features Are Especially Prone to This
Traditional product surfaces fit cleanly into one team's scope. The checkout PM owns conversion; the checkout EM owns reliability; the checkout designer owns the form. There's still cross-team work — payments infra, fraud, taxes — but the feature has a center of gravity.
AI features don't have a center of gravity by default. They straddle four functions and depend on all four just to work:
- Product owns whether the feature solves a user problem.
- Engineering owns whether the feature works.
- Research / ML owns whether the model and eval harness reflect reality.
- Finance / FinOps owns whether the unit economics make sense.
Each of these functions has its own backlog, its own metrics, and its own quarterly review. If the AI feature isn't on any of their top-line OKRs, it lives in the seams. The seams are where features go to drift.
There's also a launch dynamic that makes this worse. AI features get shipped through tiger teams — small cross-functional groups assembled to push a high-stakes initiative across the line. Tiger teams are great for shipping. They're terrible for ownership transition, because the moment the launch is done, the team disbands and nobody planned the handoff. The Tempo guidance on tiger teams is explicit: define who owns the output once the team disbands, or you've created a handoff nightmare. Most launches skip that step. The "handoff doc" is a Notion page nobody opens after week two.
The Reforge analysis of AI-native product teams puts it bluntly: organizations are still measuring people for jobs that no longer exist in their original form. The PM who shipped the AI tile is still being measured on shipping the next tile. There's no incentive structure that rewards them for the slow, unglamorous work of keeping the existing tile from rotting. So they ship the next tile, and the rot happens anyway.
The Four Numbers a Real DRI Owns
The fix is mechanical. Name a single AI-feature DRI and put four numbers on their OKR sheet. If you can't agree on a name, you don't have a DRI; you have an org chart with optimistic labels.
The four numbers are below, with a minimal scorecard sketch after the list:
- Adoption. Weekly active users of the AI surface, measured against a denominator the DRI agrees to in advance. Not parent-feature WAU. Not API call volume. Users who actually got value from the AI capability, defined the same way every week.
- Eval score. A composite quality number from a maintained eval suite, with a clear "below this number we roll back" threshold. The DRI owns the suite — meaning they're responsible for making sure it reflects what users actually do, which means the suite needs new test cases when user behavior shifts.
- Cost per active user. Total inference, retrieval, and tool-call spend divided by the adoption denominator. This is the unit economic that tells you whether you're building a product or a science project. McKinsey's five-layer measurement framework calls this "value capture per use" — the number that distinguishes promising experiments from durable businesses.
- Incident count. Production incidents attributed to the AI feature, including silent quality regressions caught by the eval suite, not just user-visible outages. Trend matters more than absolute level.
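Put concretely, the OKR sheet is small enough to be a data structure. A minimal sketch of the scorecard, assuming thresholds agreed in advance; every field name and number here is illustrative.

```python
# scorecard.py: the four numbers on one person's OKR sheet. Field names and
# the threshold are illustrative; the point is one owner for all four at once.
from dataclasses import dataclass
from typing import ClassVar

@dataclass
class AIFeatureScorecard:
    dri: str                     # a name, not an org chart
    weekly_active_users: int     # same agreed denominator every week
    eval_score: float            # composite from the maintained eval suite
    cost_per_active_user: float  # inference + retrieval + tool spend over WAU
    incidents_this_quarter: int  # includes silent regressions caught by evals

    EVAL_ROLLBACK_THRESHOLD: ClassVar[float] = 0.87  # "below this we roll back"

    def should_roll_back(self) -> bool:
        return self.eval_score < self.EVAL_ROLLBACK_THRESHOLD
```

When a model swap raises eval_score and cost_per_active_user at the same time, the trade-off lands in one structure and one head instead of three meetings.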
These four numbers are not new. The shift is putting them on one person's OKR sheet. When the same person is accountable for all four, the trade-offs that used to live in cross-team meetings collapse into one head. Should we ship a more capable but more expensive model? The DRI weighs adoption against cost-per-user and decides. Should we tighten a guardrail that lowers eval scores but reduces incidents? Same person, same trade-off, same decision. The work still gets done by a group; the decision just stops being everybody's job and therefore nobody's.
The Worklytics analysis of early Copilot adoption points at the same pattern from the other side: the organizations that converted pilot adoption into durable productivity were the ones with named owners whose quarterly reviews included AI-specific KRs. The ones that didn't, didn't.
The Adjacent Investment: A Platform Whose Customer Is the DRI
A common failure mode when organizations finally name AI-feature DRIs is to also stand up a centralized "AI platform team" — and then have the platform team try to own the same surfaces the DRIs own. This produces a worse problem than the one it solved: now there are two owners, both with strong opinions, neither with the final word.
The platform team's job is to be a service provider, not a co-owner. Its customers are the DRIs. Its product is the eval harness, the inference gateway, the prompt-management tooling, the cost dashboards, the incident playbooks. It does not own the feature; it owns the surface area that lets the feature owner do their job without rebuilding observability from scratch every quarter.
When the platform team starts trying to enforce its own roadmap on AI features — picking models, dictating prompts, owning the eval thresholds — the DRI structure collapses. The Reforge framing for AI-native product teams emphasizes this split: centralize infrastructure and governance, decentralize the product decisions. The platform team should be measured on internal NPS from the DRIs, on the time it takes a new AI feature to go from concept to instrumented production, and on platform reliability. Not on the success of any individual feature. That belongs to a name on a feature's OKR sheet.
The 2026 platform-team alignment analysis from the AI infra space frames this as a recurring pattern: platform teams lose alignment with business outcomes when their OKRs become "engineering goals" disconnected from any specific feature's success. The fix is to put their OKRs in service of the DRIs, not in parallel to them.
The Shadow Test
Here's the diagnostic I suggest to leadership teams who aren't sure whether they have this problem. Pick the AI feature you're most proud of. Ask three questions:
- Who is on the hook if the eval score drops below threshold tomorrow morning? Name a person, not a team.
- What is the feature's cost per active user this quarter? Not total spend, not API volume — the unit economic.
- Whose promotion case will be hurt if the feature gets quietly deprecated in two quarters because adoption flatlined?
If you can answer all three with the same name, you have a DRI. If you can answer two, you have a half-DRI and the feature will drift in the area you couldn't answer. If you can't answer any, the feature is already drifting; you just haven't surfaced the evidence yet.
The 362 AI incidents recorded by the AI Incident Database in 2025 — up from 233 the year before — are not random. They're heavily concentrated in features that nobody owns. The pattern is consistent across the case studies: the team that built the feature has moved on, the team that operates it has no mandate to change it, and the team with the mandate doesn't have the context. The incident is the mechanism that finally forces an owner to be assigned. By then it's expensive.
The Org Pattern That Actually Works
The shape that holds up over time looks like this:
- One AI-feature DRI per significant AI surface. Their OKRs include adoption, eval score, cost-per-active-user, and incident count. They have merge authority on the prompts, the eval suite, and the model selection. They sit in the product org if the feature is user-facing; in the platform org if the feature is internal-facing.
- A platform team whose customer is the DRIs. Its OKRs are about how fast and how cheaply DRIs can ship and operate AI features. It has no opinion on whether any individual feature should exist.
- A research / eval function that consults across DRIs and is measured on the quality of the methodology, not on any individual feature's score. This function helps the DRI build and maintain the eval suite but doesn't own the suite.
- A FinOps integration that gives the DRI a direct line of sight on cost-per-user, ideally the same dashboard updated in near-real-time, not a monthly report that arrives three weeks late.
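Mechanically, "direct line of sight" can be as small as a daily job that folds the billing export into the adoption denominator. A sketch under assumed record shapes; swap in whatever your billing export actually emits.

```python
# cost_per_user.py: hypothetical daily rollup of the billing export against
# the adoption denominator. Record shapes are assumptions about the export.
AI_COST_CATEGORIES = {"inference", "retrieval", "tool_calls"}  # assumed tags

def cost_per_active_user(line_items: list[dict], active_users: set[str]) -> float:
    """Divides by the same denominator as the adoption number;
    any other denominator is a different metric."""
    spend = sum(i["usd"] for i in line_items if i["category"] in AI_COST_CATEGORIES)
    return spend / len(active_users) if active_users else float("inf")
```

Fed from the same events table as the adoption number, this stays in lockstep with the denominator instead of arriving three weeks late.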
The AI features that get better quarter over quarter are the ones with a name on them. The ones that drift are the ones with an org chart on them. If your AI feature is drifting, the first lever isn't a model upgrade or a prompt rewrite or a new eval framework. It's a name on the door. Everything else follows from that.
- https://github.com/resources/insights/enterprise-ai-program-dri
- https://uplevelteam.com/blog/ai-engineering-team-structure
- https://handbook.gitlab.com/handbook/people-group/directly-responsible-individuals/
- https://www.mckinsey.com/capabilities/quantumblack/our-insights/from-promise-to-impact-how-companies-can-measure-and-realize-the-full-value-of-ai
- https://www.aakashg.com/product-okr-examples/
- https://www.worklytics.co/resources/ai-adoption-okrs-2025-templates-copilot-pioneers
- https://www.reforge.com/blog/ai-native-product-teams
- https://www.ai-infra-link.com/why-platform-teams-lose-alignment-with-business-goals-in-2026/
- https://www.tempo.io/blog/tiger-team
- https://posthog.com/blog/why-small-teams-crush-tiger-teams
- https://www.statsig.com/perspectives/tiger-team-structure-roles-use
- https://www.confident-ai.com/knowledge-base/top-5-llm-monitoring-tools-for-ai
- https://venturebeat.com/infrastructure/monitoring-llm-behavior-drift-retries-and-refusal-patterns
- https://www.finops.org/wg/finops-for-ai-overview/
- https://www.helpnetsecurity.com/2026/04/14/ai-adoption-safety-transparency-report/
