The Agent Portfolio Audit: How to Consolidate 15 Independent Agents Into a Platform Without Killing Team Autonomy

· 9 min read
Tian Pan
Software Engineer

Six months after launching their first AI agent, most engineering organizations discover they have fifteen of them. Not because anyone planned a fleet — because each team solved a real problem and shipped. The customer support team built a triage agent. The data team built a report-generation agent. Platform engineering built a runbook agent. Infrastructure built three more. None of them share auth, logging, tooling, or evaluation methodology. Tokens are bleeding from a dozen provider accounts and nobody can tell you which agent is responsible.

This is the moment that separates engineering organizations that can scale AI from those that can't. The answer is not to slow down agent development — it's to run a portfolio audit before entropy makes consolidation impossible.

Why You End Up Here

Agent sprawl follows a predictable pattern. The barrier to building an agent — a prompt, a few tool definitions, a loop — is orders of magnitude lower than traditional software development. Any team with Python skills and an API key can ship a working agent in a day. This is mostly good: it accelerates domain-specific automation and lets the people closest to a problem build the solution.
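The "prompt, a few tool definitions, a loop" pattern really is this small. Here is a minimal sketch in Python; the model call is a stub standing in for a real LLM API, and the tool name is hypothetical:

```python
import json

# Hypothetical tool registry -- any team function can be plugged in.
TOOLS = {
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
}

def call_model(messages):
    """Stub for a real LLM call: first asks for a tool, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_order", "args": {"order_id": "A-123"}}
    return {"final": "Order A-123 has shipped."}

def run_agent(user_message, max_steps=5):
    messages = [{"role": "user", "content": user_message}]  # the prompt
    for _ in range(max_steps):                              # the loop
        reply = call_model(messages)
        if "final" in reply:
            return reply["final"]
        result = TOOLS[reply["tool"]](**reply["args"])      # the tools
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("agent exceeded max_steps")
```

A day's work for any team with an API key, which is exactly why fifteen of these appear so quickly.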

The problem is structural. Teams build agents the same way they build internal tools: fast, locally, optimized for the immediate problem. Authorization is an afterthought. Logging is ad-hoc. Evaluation is manual. Costs get charged to a single shared API key with no attribution. When an agent breaks, the on-call engineer discovers there are no traces and no owner.

Gartner reported a 1,445% surge in multi-agent system inquiries between Q1 2024 and Q2 2025 — the fastest adoption of any enterprise technology in the last decade. Surveys show 82% of organizations have discovered at least one AI agent that their security or IT team didn't previously know existed. This is the agent version of shadow IT, and it creates the same governance debt.

The threshold where you can no longer manage this informally tends to arrive around ten agents. By fifteen, you have real problems: duplicate API integrations, conflicting tool definitions, zero portfolio-level cost visibility, and no way to answer basic questions like "which of our agents handles customer data?" or "what's our monthly LLM spend per business unit?"

The Four-Axis Audit

Before you can consolidate anything, you need a clear picture of what you have. A portfolio audit covers four dimensions:

Capability map. For each agent: what does it do, what tools does it call, what data does it access, and who owns it? The goal is to surface capability overlap — teams that have independently built agents with 70%+ functional overlap without knowing the other existed. Overlap isn't always bad (some redundancy is fine), but it needs to be a deliberate choice, not an accident.
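One way to make the capability map concrete is to record each agent as a structured entry and compute pairwise tool-set overlap. A minimal sketch, assuming a hypothetical audit schema and using Jaccard similarity as the overlap measure:

```python
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    # Hypothetical inventory schema -- adapt fields to your audit.
    name: str
    owner: str
    tools: frozenset
    data_scopes: frozenset = field(default_factory=frozenset)

def tool_overlap(a, b):
    """Jaccard similarity of two agents' tool sets."""
    union = a.tools | b.tools
    return len(a.tools & b.tools) / len(union) if union else 0.0

def flag_overlaps(agents, threshold=0.7):
    """Return agent pairs whose tool sets overlap at or above the threshold."""
    pairs = []
    for i, a in enumerate(agents):
        for b in agents[i + 1:]:
            score = tool_overlap(a, b)
            if score >= threshold:
                pairs.append((a.name, b.name, round(score, 2)))
    return pairs
```

Any pair the function flags becomes an agenda item: merge, share tools, or document why the redundancy is deliberate.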

Evaluation coverage. Which agents have any formal evaluation at all? Which are tested against a structured benchmark? Which are only checked manually when something goes wrong? Most portfolios divide cleanly into agents that have CI-integrated evals (usually the ones built by ML engineers) and agents with no repeatable evaluation (usually the ones built by application teams). No evaluation means no ability to catch regressions from model upgrades, prompt changes, or tool schema drift.
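A repeatable evaluation does not need to be elaborate to be CI-integrated. A toy sketch of the gate, assuming a simple substring check as the grader (real harnesses would use structured graders or LLM judges):

```python
def run_eval(agent_fn, cases, baseline=0.9):
    """Run an agent against a structured benchmark and gate on pass rate.

    agent_fn: callable mapping an input string to an output string.
    cases:    list of {"input": ..., "expect": ...} dicts -- a toy
              grading scheme for illustration only.
    Raises if the pass rate drops below the baseline, failing the build.
    """
    passed = sum(1 for c in cases if c["expect"] in agent_fn(c["input"]))
    rate = passed / len(cases)
    if rate < baseline:
        raise AssertionError(f"pass rate {rate:.0%} below baseline {baseline:.0%}")
    return rate
```

Even a gate this crude catches the failure mode the paragraph describes: a model upgrade or prompt edit that silently breaks behavior nobody is watching.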

Infrastructure dependencies. Inventory what each agent actually needs to run: which LLM providers it calls, whether it uses shared or isolated credentials, whether it emits traces to a central observability system, and what memory backend it maintains. Pay particular attention to agents that maintain their own persistent state — these tend to have the most surprising failure modes and the most security exposure.

Cost attribution. Run LLM spend across the portfolio for the last 30 days and determine what fraction is attributable to known agents with known owners. In most unmanaged portfolios, 40–60% of LLM spend is effectively unattributed — charged to shared API keys with no metadata to tie it back to a team or feature. You cannot optimize what you cannot measure.
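The attribution calculation itself is simple once the billing export is in hand. A sketch, assuming a hypothetical row schema with a `cost_usd` field and an optional `team` metadata tag:

```python
def attribution_report(billing_rows):
    """Split LLM spend into attributed vs. unattributed dollars.

    billing_rows: iterable of dicts with "cost_usd" and an optional
    "team" field (hypothetical schema; adapt to your provider's export).
    """
    by_team, unattributed = {}, 0.0
    for row in billing_rows:
        team = row.get("team")
        if team:
            by_team[team] = by_team.get(team, 0.0) + row["cost_usd"]
        else:
            unattributed += row["cost_usd"]
    total = sum(by_team.values()) + unattributed
    return {
        "by_team": by_team,
        "unattributed_usd": unattributed,
        "unattributed_pct": unattributed / total if total else 0.0,
    }
```

The `unattributed_pct` number is the headline result of this audit axis: it tells you how far you are from being able to answer the per-business-unit spend question.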

Run the audit as a cross-functional exercise. Get the data, not the opinions. When teams self-report what their agents do, they often underestimate tool overlap and overestimate evaluation coverage. Parse actual tool call logs and provider billing data rather than relying on READMEs.
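Deriving tool usage from logs rather than self-reports can be as simple as a pass over trace events. A sketch, assuming a hypothetical JSONL log format with `agent` and `tool` fields:

```python
import json
from collections import defaultdict

def observed_tools(log_lines):
    """Derive each agent's actually-used tool set from trace logs.

    log_lines: iterable of JSON strings, each with "agent" and "tool"
    fields (hypothetical log schema). Diffing this against the
    self-reported capability map surfaces both stale READMEs and
    undocumented tools.
    """
    usage = defaultdict(set)
    for line in log_lines:
        event = json.loads(line)
        if "agent" in event and "tool" in event:
            usage[event["agent"]].add(event["tool"])
    return dict(usage)
```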

The Consolidation Playbook

Once you have the audit results, consolidation splits into three tracks that run in parallel.

Track 1: The shared control plane.

This is the infrastructure every agent should run through, regardless of which team owns it. It has three components:

LLM gateway. A centralized proxy between all agents and all LLM providers. The gateway handles API key management (teams get virtual keys, not real provider credentials), enforces per-team and per-agent rate limits, and attaches attribution metadata to every request. Every token sent to any provider flows through the gateway; this is the only way to get portfolio-level cost attribution and the only way to enforce budget controls.
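The core gateway mechanics described above can be sketched in a few lines: virtual keys resolve to teams, budgets are enforced before the provider call, and attribution metadata rides along on every request. All names and the in-memory state here are hypothetical; a production gateway would back this with a database and real provider clients:

```python
import time

# Hypothetical gateway state: virtual keys map to teams; real provider
# credentials never leave the gateway process.
VIRTUAL_KEYS = {"vk-support-01": "support", "vk-data-01": "data"}
BUDGETS_USD = {"support": 500.0, "data": 250.0}
spend_usd = {team: 0.0 for team in BUDGETS_USD}

def gateway_request(virtual_key, agent, payload, est_cost_usd):
    """Resolve a virtual key, enforce the team budget, tag the request."""
    team = VIRTUAL_KEYS.get(virtual_key)
    if team is None:
        raise PermissionError("unknown virtual key")
    if spend_usd[team] + est_cost_usd > BUDGETS_USD[team]:
        raise RuntimeError(f"team '{team}' over budget")
    spend_usd[team] += est_cost_usd
    # Attribution metadata attached to every outbound provider call.
    return {**payload, "metadata": {"team": team, "agent": agent, "ts": time.time()}}
```

Because every request carries `team` and `agent` metadata, the cost-attribution report from the audit becomes a query rather than a forensic exercise.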
