
The AI Team Topology Problem: Why Your Org Chart Determines Whether AI Ships

Tian Pan · Software Engineer · 8 min read

Most AI features die in the gap between "works in notebook" and "works in production." Not because the model is bad, but because the team that built the model and the team that owns the product have never sat in the same room. The AI team topology problem — where AI engineers sit in your org chart — is quietly the biggest predictor of whether your AI investments ship or stall.

The numbers bear this out. Only about half of ML projects make it from prototype to production — at less mature organizations, the failure rate reaches 90%. Meanwhile, CircleCI's 2026 State of Software Delivery report found that AI-assisted code generation boosted feature branch throughput by 59%, yet production branch output actually declined 7% for median teams. Code is being written faster than ever. It's just not shipping.

The Three Models and Their Failure Modes

Organizations typically land on one of three structures for their AI teams: centralized, embedded, or platform. Each has a characteristic way of failing.

The centralized model — a dedicated AI team or center of excellence serving the whole company — concentrates scarce talent and builds deep technical expertise. It also creates a service desk dynamic. Product teams submit requests, the AI team prioritizes, and a queue forms. The AI team optimizes for model quality because that's what they can measure. The product team needs something that handles edge cases in their specific domain. Projects die in the gap between those two goals.

The fully embedded model — ML engineers report directly into product teams — solves the alignment problem but creates new ones. Each team reinvents the same infrastructure. The recommendation team builds their own feature store. The search team builds their own. The fraud team builds their own. Three incompatible feature stores, three different deployment pipelines, zero shared learning.

The platform model — a shared ML platform team with product-embedded ML engineers who use it — is the theoretical ideal. In practice, it requires a level of organizational maturity that most companies haven't reached. The platform team builds for generality while the product teams need specificity, and the negotiation between the two becomes its own bottleneck.

None of these models are inherently wrong. The problem is that most organizations pick one based on their current headcount rather than their current failure mode.

The Handoff That Kills Projects

The most destructive pattern in AI team organization is the research-to-production handoff. A data scientist or ML researcher develops a model in a notebook, validates it against a holdout set, presents impressive metrics to stakeholders, and then "hands off" to an engineering team for productionization.

This handoff fails predictably for several reasons:

  • Training-serving skew. Features engineered in notebooks don't match production implementations. The scientist used a pandas join on a static dataset. Production needs a real-time feature lookup with p99 latency under 50ms.
  • Misaligned constraints. The scientist designed a model without knowing the latency SLA. The engineer discovers the model needs 500ms per prediction when the product requires 50ms. That's a month of wasted work.
  • Reproducibility gaps. Models trained ad hoc in personal environments lack versioning, dependency pinning, and deterministic data pipelines. Retraining requires archaeology.
  • Late-stage discovery. Production requirements surface only after development completes. By then, the model architecture may need fundamental changes — not just engineering polish.
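Training-serving skew in particular has a structural fix: make one function the single source of truth for feature logic, imported by both the training pipeline and the serving layer. A minimal sketch, assuming a hypothetical order schema and a `compute_user_features` helper (names are illustrative, not a real feature-store API):

```python
import pandas as pd

def compute_user_features(orders: pd.DataFrame) -> pd.DataFrame:
    """Single source of truth for feature logic, imported by BOTH
    the training pipeline and the serving layer (hypothetical schema)."""
    return (
        orders.groupby("user_id")
        .agg(
            order_count=("order_id", "count"),
            avg_order_value=("amount", "mean"),
        )
        .reset_index()
    )

# Training path: batch computation over historical data.
history = pd.DataFrame({
    "user_id": [1, 1, 2],
    "order_id": [10, 11, 12],
    "amount": [20.0, 40.0, 15.0],
})
train_features = compute_user_features(history)

# Serving path: the SAME function runs on the live slice for one user,
# so there is no hand-ported SQL or Java reimplementation to drift.
live_slice = history[history["user_id"] == 1]
online_features = compute_user_features(live_slice)
```

The point is not the pandas code itself but the import graph: when the notebook and the endpoint call the same function, skew is prevented by construction rather than caught by reconciliation tests.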

The fix isn't better documentation or more detailed handoff specs. It's eliminating the handoff entirely. Teams that ship AI features reliably are the ones where ML engineers and product engineers work in the same sprint, attend the same standups, and share ownership of the same production service.

What Actually Ships: The Federated Model

The organizational pattern with the best track record is what some call the federated model: cross-functional teams with domain focus that report into a central ML organization. This structure preserves a shared knowledge base — common tooling, standardized practices, career paths for ML specialists — while embedding those specialists deeply enough in product teams to understand the domain.

The key distinction from pure embedding is dual accountability. An ML engineer in a federated model has a dotted line to the product team and a solid line to the ML organization. The product team sets priorities and provides domain context. The ML organization sets standards, maintains shared infrastructure, and prevents the duplication that plagues fully embedded teams.

Concretely, this means:

  • Joint kickoffs where ML and product engineers align on latency targets, data availability, and infrastructure constraints before any modeling begins.
  • Shared feature stores and version-controlled pipelines that prevent training-serving skew by construction.
  • Production constraints as guardrails during development, not requirements discovered at deployment time.
  • Collaborative validation gates where both ML quality metrics and production reliability metrics must pass before launch.
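The last item, the collaborative validation gate, can be as simple as a launch check that fails unless both sides' metrics clear their thresholds. A minimal sketch with hypothetical metric names and illustrative thresholds, not a prescription:

```python
from dataclasses import dataclass

@dataclass
class LaunchMetrics:
    auc: float             # ML quality (offline holdout)
    p99_latency_ms: float  # production reliability (shadow traffic)
    error_rate: float      # production reliability

def validation_gate(m: LaunchMetrics) -> tuple[bool, list[str]]:
    """Both ML quality AND production reliability must pass before launch.
    Thresholds are illustrative; real ones come from the joint kickoff."""
    failures = []
    if m.auc < 0.80:
        failures.append(f"AUC {m.auc:.2f} below 0.80")
    if m.p99_latency_ms > 50:
        failures.append(f"p99 latency {m.p99_latency_ms}ms above 50ms SLA")
    if m.error_rate > 0.001:
        failures.append(f"error rate {m.error_rate:.4f} above 0.1%")
    return (len(failures) == 0, failures)

# A model that looks great offline but misses the latency SLA is blocked
# here, before launch, rather than discovered after deployment.
ok, reasons = validation_gate(
    LaunchMetrics(auc=0.86, p99_latency_ms=120, error_rate=0.0004)
)
```

The gate encodes the dual accountability directly: the ML organization owns the quality thresholds, the product team owns the reliability ones, and neither can waive the other's.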

The New York Times found this approach more efficient than having parallel teams solve similar problems independently: it's cheaper to build one model and adapt it across domains than to have two teams build two models from scratch.

The Verification Bottleneck

There's a newer dimension to this problem that the rise of AI-assisted coding has made acute. As AI generates more code, the bottleneck shifts from writing to verification. CircleCI's data shows this clearly: main branch build success rates have dropped to 70.8%, the lowest in five-plus years, against a benchmark of 90%. Recovery time is up 13% year over year. Teams are pushing more code and breaking more things.

This creates a new team topology question: who verifies AI-generated output?

Some organizations are experimenting with a new role — the AI Reliability Engineer — whose job is validating AI output rather than generating code. The "Centaur Pod" structure pairs one senior architect with two AI reliability engineers and an autonomous agent fleet. The architect sets direction. The reliability engineers verify. The agents execute.

Whether or not that specific structure takes hold, the underlying insight matters: AI teams now need to treat verification as a first-class function, not a side effect of code review. Teams that treat review as an afterthought are the ones showing up in the delivery statistics as median performers with zero production growth.
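One way to make verification first-class rather than an afterthought is to encode it in the merge gate itself. A crude sketch of such a pre-merge heuristic, assuming a hypothetical repo layout where tests mirror `src/` paths; real policies would be richer, but the shape is the same:

```python
def requires_verification(changed_files: list[str]) -> list[str]:
    """Flag source files in a change set that arrive with no accompanying
    test change (hypothetical layout: tests/ mirrors src/ with flat names)."""
    src = [f for f in changed_files if f.startswith("src/") and f.endswith(".py")]
    tests = {f for f in changed_files if f.startswith("tests/")}
    unverified = []
    for f in src:
        # src/util/io.py -> tests/test_util_io.py under this convention.
        expected_test = "tests/test_" + f.removeprefix("src/").replace("/", "_")
        if expected_test not in tests:
            unverified.append(f)
    return unverified

# A change touching src/util/io.py without a matching test gets routed to
# human review rather than merging on a green build alone.
flagged = requires_verification(
    ["src/ranker.py", "tests/test_ranker.py", "src/util/io.py"]
)
```

This is deliberately simple; the structural point is that "who verifies AI-generated output" becomes a routing decision the pipeline makes, not a favor a reviewer remembers to do.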

The Maturity Progression

The right team topology depends on where you are, not where you want to be. Organizations progress through recognizable stages:

Stage 1: Consulting model. A small centralized team takes on high-priority projects across the business. This works when you have fewer than five ML engineers and need to prove value before investing in infrastructure. The risk is becoming a permanent service desk.

Stage 2: Specialized model. ML engineers embed into specific product domains as the team grows. This works when you have enough talent to staff multiple domains without spreading thin. The risk is infrastructure duplication and knowledge silos.

Stage 3: Platform model. A shared ML platform emerges to serve embedded engineers. This works when the organization has enough scale to justify platform investment. The risk is over-engineering the platform at the expense of product delivery.

Stage 4: Federated model. Embedded engineers with central governance, shared tooling, and dual accountability. This works when the organization is mature enough to handle matrix reporting without paralysis. The risk is bureaucratic overhead.

Skipping stages doesn't save time. An organization that jumps to the platform model without the consulting phase to prove value, or without the embedded phase to understand domain needs, builds a platform nobody uses.

Organize by Problem, Not Function

One last structural insight that separates teams that ship from teams that demo: organize by use case, not by business function.

When you organize AI teams by business unit — one team for marketing, one for operations, one for finance — you get three teams independently solving similar problems. Three separate recommendation engines. Three duplicate data pipelines. Three teams that never talk to each other.

When you organize by problem type — one team for recommendation systems, one for forecasting, one for natural language understanding — you get shared solutions adapted across domains. The recommendation team builds a core engine that marketing, operations, and finance all use with domain-specific tuning.

This requires a business sponsor in each domain who can translate between the AI team's technical capabilities and the domain's specific needs. Without that translation layer, the problem-organized team drifts toward building technically impressive solutions that don't fit any specific domain well — which is just the centralized model's service desk failure in a different disguise.

The Org Chart Is the Architecture

Conway's Law applies to AI teams with particular force. The structure of your organization will determine the structure of your AI systems, which will determine whether those systems ship or stall.

If your ML engineers and product engineers sit in different reporting chains with different incentive structures, you will get models that work beautifully in isolation and fail in production. If your AI team is centralized without embedded domain context, you will get a queue of unshipped projects. If your AI engineers are fully embedded without shared infrastructure, you will get duplicated effort and inconsistent quality.

The organizations getting value from AI in 2026 aren't the ones with the best models or the most compute. They're the ones that solved the team topology problem first — aligning incentives, eliminating handoffs, and building structures where shipping is the default outcome rather than a heroic exception.
