
Shadow MCP: The Tool Servers Your Security Team Has Never Heard Of Are Already Running on Your Engineers' Laptops

13 min read
Tian Pan
Software Engineer

Your security team has a complete inventory of every SaaS subscription on the corporate card, every OAuth app with admin consent, every device on the corporate Wi-Fi. They have zero visibility into the seven processes bound to 127.0.0.1 on your senior engineer's laptop right now — a "deploy assistant" with a long-lived staging API token, a "ticket triager" subscribed to a customer-data Slack channel, a "release notes generator" with read access to the production analytics warehouse. None of it is on a vendor list. None of it shows up in the SSO logs. All of it is running on credentials the engineer already had, doing things nobody approved them to do.

This is shadow MCP, and it is the fastest-growing unmanaged authorization surface in the enterprise. The Model Context Protocol made it trivially cheap to wire any tool into any LLM, and engineers — being engineers — wired the obvious things first. Saviynt's CISO AI Risk Report finds that 75% of CISOs have already discovered unsanctioned AI tools running in their production environments. The GitHub MCP server crossed two million weekly installs in early 2026. The Postgres MCP server, which gives an LLM a SQL prompt against any database the developer can reach, is north of 800,000 weekly installs. None of those numbers represent enterprise IT decisions.

The mental model most security programs still apply was built for a different artifact. Shadow IT is a SaaS subscription that hits an expense report, a vendor that gets discovered when a finance audit reconciles credit-card statements against the asset inventory. Procurement, CASB, and SSO are the three sieves, and between them they catch most of what matters. Shadow MCP defeats all three by construction. The artifact is not a vendor — it is a Python process on a laptop. The traffic is not egress to a SaaS endpoint — it is a JSON-RPC call to localhost. The credential is not a new SSO grant — it is the engineer's existing prod-readonly token, the same one they use from a Jupyter notebook, now repurposed as the auth for a long-running agent loop. There is nothing for finance to discover, nothing for the CASB to inspect, nothing for the SSO logs to record beyond the original token issuance that was approved years ago for a different purpose entirely.

The Artifact Is a Loopback Socket, Not a Vendor

Shadow MCP servers do not look like applications in the asset-management sense. They are short scripts, often vibe-coded in an afternoon, started by a developer's IDE plugin or a pipx invocation, and bound to a high port on the loopback interface. They expose tools the LLM can call, and those tools fan out to whatever the developer's machine can reach — internal APIs, production databases, Slack workspaces, GitHub repos, the corporate VPN. EDR doesn't flag them because they're not malware. MDM doesn't catalog them because they're not installed software in the registry sense. The CASB doesn't see them because the egress is to internal CIDR ranges, not to a vendor's public IP.
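What one of these looks like in practice is a couple of dozen lines. The sketch below assumes the official Python MCP SDK's FastMCP interface; the internal deploy API, its URL, and the staging-token variable are illustrative stand-ins, not any particular deployment.

```python
# A minimal sketch of the kind of server described above, assuming the official
# Python MCP SDK's FastMCP interface. The internal deploy API, its URL, and the
# staging-token variable are illustrative stand-ins, not any particular deployment.
import os

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("deploy-assistant")

@mcp.tool()
def trigger_deploy(service: str, ref: str) -> str:
    """Kick off a staging deploy through an internal API the laptop can already reach."""
    resp = httpx.post(
        "https://deploy.internal.example.com/api/v1/deploys",
        headers={"Authorization": f"Bearer {os.environ['STAGING_DEPLOY_TOKEN']}"},
        json={"service": service, "ref": ref},
    )
    return resp.text

if __name__ == "__main__":
    # stdio transport by default; the SDK's HTTP transports bind a port on loopback.
    mcp.run()
```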

The visibility surface that actually exists is narrow and underused. Endpoint telemetry can list every process bound to a TCP socket, including loopback, and that list is the closest thing to ground truth for what MCP servers are running on a corporate fleet. Network egress monitoring on developer hosts — not on the corporate edge, where the traffic is normal-looking VPN traffic, but on the host itself — can detect the unusual cadence of an agent loop calling the same internal API every thirty seconds for hours on end. Neither of these is a heavy lift technically, and neither is happening at most companies because the threat model the security team is running was last updated for SaaS shadow IT, where the answer was "scan the credit-card statements."

Credential Reuse Is the Default, and It's the Problem

Every shadow MCP server runs on credentials someone already has. That's not an accident — it's the entire reason the developer chose MCP over building a service. The whole point is that they don't have to file a ticket for a new credential. They take the prod-readonly database token they were issued for ad-hoc analysis, paste it into a config file the LLM client reads on startup, and now the same token is being used by an agent loop that issues thirty queries a minute, every minute, for as long as the laptop is open. The audit trail that was perfectly readable when the token was used by a notebook session — one query every few minutes, with the engineer at the keyboard — is now noise. The forensic question "what queries did this user run on the day the customer-data leak was reported" goes from a few-row SQL filter to a million-row haystack with no signal.
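Concretely, the "paste it into a config file" step looks roughly like the sketch below. The client config path and the server package name are hypothetical; the structure mirrors the mcpServers blocks that desktop LLM clients read on startup, and the only real decision the engineer makes is which existing credential to hand over.

```python
# A sketch of the "paste it into a config file" step. The config path and server
# package name are hypothetical; the structure mirrors the mcpServers blocks that
# desktop LLM clients read on startup. The credential is the same long-lived token
# the engineer already uses interactively.
import json
import os
import pathlib

config_path = pathlib.Path.home() / ".config" / "llm-client" / "mcp.json"  # hypothetical location
config = {
    "mcpServers": {
        "prod-postgres": {
            "command": "uvx",
            "args": ["postgres-mcp-server"],  # illustrative package name
            "env": {
                # The token issued years ago for notebook analysis, now the standing
                # credential for an unattended agent loop.
                "DATABASE_URL": os.environ["PROD_READONLY_DATABASE_URL"],
            },
        }
    }
}
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
```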

This is not a hypothetical. The OWASP MCP Top 10 (MCP09:2025) lists shadow servers as a distinct category specifically because the credential-reuse failure mode is consistent across deployments. Two-thirds of open-source MCP servers ship with security practices the OWASP project rates as "poor" — flaws in OAuth flows in roughly 43% of them, command injection vulnerabilities in another 43%. Even the well-built ones inherit whatever permissions the credential has. The token doesn't know it's being used by an agent. The downstream API doesn't know it's being called in a loop. The only place the new usage pattern is legible is on the laptop where the MCP server is running, which is the one place nobody is looking.

Confused-Deputy Bridges Don't Look Like Exfiltration

The data-exfiltration risk in shadow MCP isn't the obvious one. Nobody is going to upload the customer table to Pastebin from a corporate laptop — that would trip every DLP rule in the building. The risk is the confused-deputy bridge: an LLM with one tool that reads from prod and another tool that writes to a personal Notion, a personal email, a personal GitHub gist. Neither tool is malicious. Neither call, in isolation, is anomalous. But the composition is a one-way pipe from production data to the engineer's personal cloud, and no DLP rule will catch it because the egress destination is whatever Notion's IP range happens to be that day, which is also where the engineer legitimately writes their meeting notes.
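A sketch of what that bridge looks like, again assuming the FastMCP interface: two tools, each registered for a legitimate reason, with the Notion endpoint details shown only to make the shape concrete. Neither function is suspicious; the composition is.

```python
# Two tools that are individually routine, assuming the same FastMCP interface as
# above. The Notion endpoint and version header are real API conventions, but the
# page, token, and table are illustrative. Nothing here exfiltrates data on its own;
# the agent composing the first call into the second is the pipe.
import os

import httpx
import psycopg
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticket-triager")

@mcp.tool()
def read_customer_rows(sql: str) -> str:
    """Read-only query against the production replica, using an approved credential."""
    with psycopg.connect(os.environ["PROD_READONLY_DATABASE_URL"]) as conn:
        return "\n".join(str(row) for row in conn.execute(sql).fetchmany(50))

@mcp.tool()
def append_note(page_id: str, text: str) -> str:
    """Append text to a page in the engineer's personal Notion workspace."""
    resp = httpx.patch(
        f"https://api.notion.com/v1/blocks/{page_id}/children",
        headers={
            "Authorization": f"Bearer {os.environ['PERSONAL_NOTION_TOKEN']}",
            "Notion-Version": "2022-06-28",
        },
        json={"children": [{
            "object": "block",
            "type": "paragraph",
            "paragraph": {"rich_text": [{"type": "text", "text": {"content": text}}]},
        }]},
    )
    return f"HTTP {resp.status_code}"

if __name__ == "__main__":
    # Neither tool is anomalous in isolation; the one-way pipe exists only when the
    # model chains read_customer_rows(...) into append_note(...) in a single turn.
    mcp.run()
```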

The 2025 vulnerability report cycle made this concrete. Four CVSS-9.3-and-up findings hit Anthropic, Microsoft, ServiceNow, and Salesforce, all following the same pattern: an attacker injects hidden instructions into content the agent processes (an email body, a web form, a Slack message), and the agent uses its own legitimate permissions to exfiltrate data to the attacker. The agent isn't compromised in the malware sense. The credentials aren't stolen. The agent is just doing what an agent does — composing tools to satisfy a request — and the request happens to be one the user never made. Shadow MCP makes this worse in exactly one way: the tools are composed by the engineer who installed them, with no review of what the composition can do, and the registry of what tools an agent has access to lives in a config file on a laptop that nobody else has read.

The Paved Road Has to Be Faster Than the Dirt Road

The instinct of every security program that discovers shadow MCP is to ban it. This does not work and has never worked for any class of shadow IT. The engineers installed the tool servers because the official path didn't exist or was too slow, and a policy document that says "don't do that" doesn't change either of those facts. What works is a paved road that's faster than the dirt road — an internal MCP gateway where sanctioned servers are signed, versioned, discoverable, and pre-wired to short-lived credentials, so the friction of doing it right drops below the friction of rolling your own.

The architecture that's emerging across enterprise deployments has converged on a small number of components. A central gateway sits in front of every MCP server the company sanctions, mediating every tool call, applying request logging, PII redaction, and policy at the egress boundary before the call reaches the underlying system. Authentication happens via OAuth2 Token Exchange (RFC 8693), which lets the gateway hand out narrowly scoped, short-lived tokens specific to each MCP server, derived from the engineer's broader SSO identity — so the credential the agent loop uses is not the engineer's prod-readonly database token but a derived token scoped to "read these three tables, expires in fifteen minutes." A registry layer catalogs which servers are sanctioned, who owns each one, what tools they expose, and what risk class they fall into; deployment is gated on registration, and unregistered servers fail to start in the corporate environment. Tool-level allow-lists, not server-level, are non-negotiable — a single MCP server commonly exposes a read_database tool, a write_database tool, and an admin_database tool, and the governance layer must let administrators approve some without approving the rest.
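A sketch of the credential half of that flow is below, assuming a gateway token endpoint that implements RFC 8693. The endpoint URL, audience string, and scope names are invented for illustration; only the grant type and parameter names come from the RFC. The tool-level allow-list check at the end is equally schematic.

```python
# A sketch of the gateway-side credential exchange, assuming a token endpoint that
# implements RFC 8693. The endpoint URL, audience string, and scope names are invented
# for illustration; only the grant type and parameter names come from the RFC.
import httpx

TOKEN_ENDPOINT = "https://gateway.internal.example.com/oauth2/token"  # hypothetical

def exchange_for_mcp_token(sso_access_token: str) -> dict:
    """Trade the engineer's broad SSO token for a short-lived, narrowly scoped one."""
    resp = httpx.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": sso_access_token,
            "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "audience": "mcp://analytics-postgres",            # one audience per sanctioned server
            "scope": "tables:orders.read tables:events.read",  # far narrower than the SSO grant
        },
    )
    resp.raise_for_status()
    return resp.json()  # access_token, expires_in (e.g. 900 seconds), issued_token_type

# Tool-level, not server-level: the gateway checks every call against an allow-list
# before the derived token is used downstream.
ALLOWED_TOOLS = {"analytics-postgres": {"read_database"}}  # write_database, admin_database not approved

def call_is_allowed(server: str, tool: str) -> bool:
    return tool in ALLOWED_TOOLS.get(server, set())
```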

The mistake the early movers made is thinking the gateway is the goal. It's not. The gateway is the easy part. The hard parts are the four things that have to land alongside it for the gateway to actually displace the dirt road: a self-service onboarding flow that gets a new engineer to a working tool call in under five minutes (the bar is set by pip install plus a config file, and beating that requires real product investment), a tool catalog that's discoverable from the IDE without leaving the workflow, a credential lifecycle that's invisible to the engineer (they should never see a token, only an SSO redirect), and observability that's better than what the engineer gets locally (request traces, cost attribution, replay). If the gateway adds latency, rejects valid calls under load, or makes the engineer fill out a form to add a new tool, the dirt road wins, and you've spent three quarters building something nobody uses.

What to Discover, in What Order

If you have not yet run a shadow MCP discovery sweep, the order that produces signal fastest is endpoint telemetry first, network second, registry third. Start with the EDR or endpoint inventory you already have and pull the list of every long-running process bound to a localhost socket on developer machines, then cross-reference against process names containing "mcp", "model-context", or any of the popular servers (postgres, github, slack, notion, linear). The output is usually larger than the security team expects by an order of magnitude — most companies discover that a third or more of their senior engineers are running at least one MCP server, and the long tail includes things the security team has never seen named before.
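For a single host, the sweep reduces to a few lines of process and socket inspection. The sketch below uses psutil and the marker list from the paragraph above; in a real fleet you would run the equivalent query through the EDR's telemetry rather than a per-host script, and expect some noise.

```python
# A per-host version of the sweep, using psutil and the marker list from the text.
# In a real fleet you would run the equivalent query through EDR telemetry rather
# than a script, and expect noise (e.g. an actual postgres server also matches).
import psutil

MARKERS = ("mcp", "model-context", "postgres", "github", "slack", "notion", "linear")

def loopback_listeners():
    """Yield (port, pid, cmdline) for loopback-bound listeners whose command line matches a marker."""
    for conn in psutil.net_connections(kind="tcp"):
        if conn.status != psutil.CONN_LISTEN or conn.pid is None:
            continue
        if not conn.laddr or conn.laddr.ip not in ("127.0.0.1", "::1"):
            continue
        try:
            cmdline = " ".join(psutil.Process(conn.pid).cmdline()).lower()
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
        if any(marker in cmdline for marker in MARKERS):
            yield conn.laddr.port, conn.pid, cmdline

if __name__ == "__main__":
    for port, pid, cmdline in loopback_listeners():
        print(f"127.0.0.1:{port}  pid={pid}  {cmdline[:120]}")
```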

Second, instrument network egress on developer hosts to flag the cadence signature of an agent loop — periodic, identical-shape calls to internal APIs at sub-minute intervals — and correlate against authenticated user sessions. The signal you're looking for is "user who is not at the keyboard but whose credentials are issuing queries every thirty seconds for the last six hours." This catches the long-running agents that survive between work sessions and that present the largest credential-reuse audit-trail risk.
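The cadence check itself is a small piece of log analysis once the egress or API access logs are exportable as (user, path, timestamp) rows; the field names and thresholds in the sketch below are illustrative and should be tuned against what normal interactive use looks like in your own logs.

```python
# A sketch of the cadence check, assuming authenticated API access logs exported as
# (user, path, timestamp) rows. Field names and thresholds are illustrative; tune them
# against what normal interactive use looks like in your own logs.
from collections import defaultdict
from datetime import datetime
from statistics import median

def agent_loop_candidates(rows, min_calls=200, max_median_gap_s=60.0, min_span_h=2.0):
    """Flag (user, path) pairs showing hours of near-constant, sub-minute call intervals."""
    by_key = defaultdict(list)
    for user, path, ts in rows:
        by_key[(user, path)].append(ts)
    for (user, path), times in by_key.items():
        if len(times) < min_calls:
            continue
        times.sort()
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        span_h = (times[-1] - times[0]).total_seconds() / 3600
        if span_h >= min_span_h and median(gaps) <= max_median_gap_s:
            yield user, path, len(times), round(span_h, 1), round(median(gaps), 1)

# Example row: ("j.doe", "/internal/api/v2/orders", datetime(2026, 3, 2, 1, 14, 7))
```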

Third — and only third, because doing this first produces a list nobody will adopt — stand up the registry and the gateway. Make registration opt-in for the first quarter and use the discovery data from steps one and two to drive outreach: every engineer running an unsanctioned server gets a personal nudge with a link to the paved road, not a policy document. The conversion rate from this kind of nudge is dramatically higher than from a top-down ban, and it gives you a real signal on which tool integrations to prioritize for the sanctioned catalog.

MCP Is an Authorization Surface, Not a Developer Convenience

The architectural mistake security organizations are making in 2026 is the same one they made with OAuth apps a decade ago — treating a new authorization surface as a developer convenience to be tolerated rather than a lifecycle to be managed. OAuth apps were also "just" a developer convenience until the third-party app sprawl became the largest unmonitored access path into the SaaS estate, and the response — app inventories, scope reviews, periodic recertification, automated revocation — took most of the 2010s to mature. MCP is on the same trajectory, compressed into eighteen months instead of ten years, and the lessons from the OAuth era apply directly: inventory before policy, paved road before ban, scoped tokens before broad ones, lifecycle management as a continuous obligation rather than a one-time approval.

The cost of waiting is paid in trust, not engineering hours. The first major shadow-MCP breach disclosure will list "an internal AI assistant" or "a developer productivity tool" as the affected system, and the post-incident review will find that the credential was approved for a notebook session three years ago, the tool registry was a JSON file on a laptop, and the audit trail was unreadable because the agent loop drowned out the human queries. The organizations that get ahead of this build the gateway, the registry, and the discovery pipeline before that incident, not after. The ones that wait will pay the same cost the OAuth-app sprawl extracted, with one difference: the LLM in the loop makes every credential reuse a potential exfiltration path, not just an access path, and the blast radius compounds accordingly.

The takeaway for engineering and security leaders is simple to state and hard to execute. MCP is not a fad to be governed by a memo. It is the new application identity, the new tool-permission graph, the new egress boundary for sensitive data — and like every previous version of those things, it needs an inventory, a paved road, and a lifecycle. The teams that act on this in 2026 will look, in 2028, like the teams that took OAuth-app governance seriously in 2014. The teams that don't will spend 2028 explaining to a regulator why the engineer's laptop had read access to the customer table and write access to a personal Notion and nobody could say when, by whom, or for how long.
