39 posts tagged with "mcp"

Tool Schema Design Is Your Blast Radius: When Function Definitions Become Security Boundaries

May 2, 2026 · 10 min read

Tian Pan

Software Engineer

The most dangerous file in your agent codebase is the one you've been writing as if it were API documentation. The tool registry — that JSON or Pydantic schema that tells the model what functions exist and what arguments they take — is no longer a docstring. It is your authorization layer. And if you designed it the way most teams do, you handed the LLM a master key and called it good engineering.

Consider the canonical first cut at a tool: query_database(sql: string). The intent is reasonable: let the model formulate the right SQL for the user's question. The reality is that the model is now an untrusted client holding every DDL and DML right the connection's credentials carry, against whatever database the connection string points at. The system prompt that says "only run SELECTs on the orders table" is a suggestion, not a control. When a prompt-injected tool result (an email body, a webpage, a PDF) tells the model to run DROP TABLE users, your authorization model is the model's instruction-following discipline. That is not authorization. That is hope.
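One way to shrink that surface is to have the model choose values and never write SQL text. The sketch below is illustrative only: the get_orders tool, the orders columns, and the status allowlist are hypothetical, and sqlite3 stands in for whatever database sits behind the real connection.

```python
import sqlite3

# Hypothetical allowlist and cap; the real values depend on your schema.
ALLOWED_STATUSES = {"open", "shipped", "cancelled"}
MAX_LIMIT = 100


def get_orders(conn: sqlite3.Connection, customer_id: int, status: str, limit: int = 20):
    """Narrow tool: the model supplies values, the server owns the SQL."""
    if not isinstance(customer_id, int) or isinstance(customer_id, bool):
        raise ValueError("customer_id must be an integer")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}")
    # Clamp the row count so one call cannot return an unbounded result.
    limit = max(1, min(int(limit), MAX_LIMIT))
    rows = conn.execute(
        "SELECT id, status, total FROM orders "
        "WHERE customer_id = ? AND status = ? ORDER BY id LIMIT ?",
        (customer_id, status, limit),  # bound parameters, never string formatting
    ).fetchall()
    return [{"id": r[0], "status": r[1], "total": r[2]} for r in rows]
```

With this shape, an injected instruction can at most ask for a different customer's orders, which is a question for the caller's access check, not for the SQL parser. Pairing it with a read-only database role would be a second, independent control.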

Pagination Is a Tool-Catalog Discipline: Why Agents Burn Context on List Returns

April 28, 2026 · 11 min read

Tian Pan

Software Engineer

Every well-designed HTTP API in your stack returns paginated results. Nobody loads a million rows into memory and hopes for the best. Yet the tools your agent calls return the entire list, and the agent dutifully reads it, because the function signature says list_orders() -> Order[] and the agent has no protocol for "give me the next page" the way a human user has scroll-and-load-more.

The agent burns tokens on rows it could have skipped. The long-tail customer with 50K records hits context-window failures the median customer never sees. The tool author cannot tell from the trace whether the agent needed all those rows or simply could not ask for fewer. And somewhere in your eval suite, the regression that would have flagged this never runs because every test fixture has fewer than 100 records.

Pagination is not a UI affordance. It is a load-shedding primitive — and the agent that consumes a tool without it is reimplementing every SELECT * FROM orders mistake the API designers in your company spent a decade learning to avoid.
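A minimal sketch of what a paginated tool return could look like, assuming an in-memory list as stand-in data. The cursor, page_size, next_cursor, and total_count names are illustrative rather than taken from any particular framework, and a production cursor would usually be an opaque token instead of a raw offset.

```python
from typing import Optional

# Stand-in data; a real tool would query a store instead.
ORDERS = [{"id": i} for i in range(1, 251)]
MAX_PAGE_SIZE = 100


def list_orders(cursor: Optional[int] = None, page_size: int = 50) -> dict:
    """Return one page plus the information the agent needs to ask for more."""
    page_size = max(1, min(int(page_size), MAX_PAGE_SIZE))  # server-side cap
    start = 0 if cursor is None else cursor
    if not isinstance(start, int) or start < 0 or start > len(ORDERS):
        raise ValueError("invalid cursor")
    items = ORDERS[start:start + page_size]
    end = start + len(items)
    return {
        "items": items,
        # None signals the last page, so the agent knows when to stop.
        "next_cursor": end if end < len(ORDERS) else None,
        # Lets the agent decide whether paging further is worth the tokens.
        "total_count": len(ORDERS),
    }
```

Returning total_count alongside the first page gives the agent enough to choose between reading on, narrowing the query, or stopping, and the trace then shows which of those it actually did.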

Shadow MCP: The Tool Servers Your Security Team Has Never Heard Of Are Already Running on Your Engineers' Laptops

April 27, 2026 · 13 min read

Tian Pan

Software Engineer

Your security team has a complete inventory of every SaaS subscription on the corporate card, every OAuth app with admin consent, every device on the corporate Wi-Fi. They have zero visibility into the seven processes bound to 127.0.0.1 on your senior engineer's laptop right now — a "deploy assistant" with a long-lived staging API token, a "ticket triager" subscribed to a customer-data Slack channel, a "release notes generator" with read access to the production analytics warehouse. None of it is on a vendor list. None of it shows up in the SSO logs. All of it is running on credentials the engineer already had, doing things nobody approved them to do.

This is shadow MCP, and it is the fastest-growing unmanaged authorization surface in the enterprise. The Model Context Protocol made it trivially cheap to wire any tool into any LLM, and engineers, being engineers, wired the obvious things first. Saviynt's CISO AI Risk Report finds that 75% of CISOs have already discovered unsanctioned AI tools running in their production environments. The GitHub MCP server crossed two million weekly installs in early 2026. The Postgres MCP server, which gives an LLM a SQL prompt against any database the developer can reach, is north of 800,000 weekly installs. None of those numbers represent enterprise IT decisions.

Your Tool Catalog Is a Power Law and You're Optimizing the Long Tail

April 27, 2026 · 11 min read

Tian Pan

Software Engineer

Pull a week of tool-call traces from any production agent and the shape is the same: three or four tools handle 90% of the calls, and a couple of dozen others split the remaining 10%. The catalog is a power law, but the framework treats it like a uniform list. Every tool description ships in every system prompt, every selection rubric weights tools equally, every eval samples the catalog as if a search-files call and a refund-issue call were drawn from the same distribution. They are not.

The cost of that flatness is invisible until it isn't. A team adds the eighteenth tool, the planner's accuracy on the original three drops two points, nobody can localize the regression to a specific change because everything moved at once, and the eval suite — itself uniform across the catalog — averages the slip into a number that still looks fine. Meanwhile the tokens spent describing tools the model will not call this turn now exceed the tokens spent on the user's actual prompt.
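Measuring the shape is cheap. The sketch below assumes a trace export reduced to a flat list of tool names, one per call; head_tools is a hypothetical helper, not part of any agent framework. It returns the smallest set of most-called tools that reaches a target share of traffic.

```python
from collections import Counter


def head_tools(calls, coverage=0.9):
    """Smallest prefix of tools, by call count, covering `coverage` of calls."""
    counts = Counter(calls)
    total = sum(counts.values())
    head, covered = [], 0
    for name, n in counts.most_common():
        # Stop once the tools collected so far already reach the target share.
        if covered / total >= coverage:
            break
        head.append(name)
        covered += n
    return head
```

Everything outside that head is a candidate for lazy loading into the prompt, and the head itself is a candidate for its own eval slice so a regression on the three tools that matter is not averaged away.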

The Agent Paged Me at 3 AM: Blast-Radius Policy for Tools That Reach Humans

April 23, 2026 · 12 min read

Tian Pan

Software Engineer

The first time an agent pages your on-call four times in an hour because it's looping on a malformed alert signal, leadership learns something the security team already knew: "tool access" and "ability to create human work" were the same permission, and you granted it without either a safety review or a product-ownership review. Nobody owned the question of who's allowed to interrupt a human at 3 AM, because nobody framed it as a question. It was framed as a Slack integration.

The 2026 agent stack has made this failure mode cheap to reach. Anthropic's MCP servers, OpenAI's Agents SDK, and the whole class of vendor-shipped action tools have collapsed the distance between "the model decided to do a thing" and "a human got woken up." Most teams ship those integrations the same way they ship a database client: scope a token, drop in the SDK, write a system prompt, ship. The blast radius of a database client is a row count. The blast radius of a PagerDuty client is a person's sleep.
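One possible guard is a budget on human-interrupting tools, enforced outside the model. The following is a sketch under stated assumptions: PageBudget is a hypothetical class, the limits are placeholders, and a real deployment would need shared state across agent processes rather than an in-memory deque.

```python
import time
from collections import deque


class PageBudget:
    """Sliding-window cap on how often an agent may page a human."""

    def __init__(self, max_pages: int, window_seconds: float, clock=time.monotonic):
        self.max_pages = max_pages
        self.window = window_seconds
        self.clock = clock      # injectable so the policy can be tested
        self.sent = deque()     # timestamps of pages inside the window

    def allow(self) -> bool:
        now = self.clock()
        # Forget pages that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) >= self.max_pages:
            return False        # over budget: caller should escalate, not page
        self.sent.append(now)
        return True
```

The tool wrapper would call allow() before reaching the paging API and, on a refusal, route the alert to a lower-urgency channel. The point is that the cap lives in code someone owns, not in the system prompt.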

The MCP Server Graveyard: When Your Agent's Dependencies Stop Shipping

April 23, 2026 · 10 min read

Tian Pan

Software Engineer

The last commit to the MCP server your agent calls every five minutes was eight months ago. The upstream API it wraps rolled out a new authentication model in February. There are 47 open issues, 12 of them flagged security. The maintainer's GitHub account hasn't shown activity since October. Your agent still connects, still receives tool descriptions, still executes calls — and silently, every one of those calls flows through a piece of infrastructure that nobody is watching.

This is the shape of MCP abandonment. Not a malicious rug pull, not a compromised package, just neglect. Somebody published a useful server in 2025, got adopted, then moved on. The server kept working because nothing forced it to break. Until it does — and by then, the trust boundary your agent was crossing every five minutes has already failed.

Most teams adopted community MCP servers the way they adopted npm packages: by running install and reading the README. That mental model makes sense for libraries that sit in your dependency tree, get audited at build time, and surface their deprecations through your package manager. It does not survive contact with MCP, where the dependency is a live trust boundary that the LLM invokes in a loop, with credentials, on production data.

Tool Schema Deprecation: Why You Can't Just Rename a Parameter

April 23, 2026 · 11 min read

Tian Pan

Software Engineer

You renamed query to search_query on a tool schema. The changelog says "non-breaking: clearer naming." The PR passed review. Three days later, your support queue fills up with reports that the assistant is "searching for blank results." What happened is not what anyone on the thread would tell you. The agents did not fail. They submitted the old field name, your tool server ignored the unknown key, defaulted search_query to the empty string, and returned zero hits. The model, seeing a legitimate-looking empty response, confidently explained to the user why their query returned nothing relevant.

This is the part of agent engineering that does not fit the mental model borrowed from REST API versioning. A REST client that sends a renamed field gets a 400 and a clear error — the field either exists in the validator or it doesn't. An agent that sends a renamed field gets a silent acceptance, a nonsense result, and a hallucinated rationalization. The failure is not at the wire; it is in the joint between the runtime schema and the model's in-context mental model of what the tool looks like.

Tool schemas live in two places. The first is the runtime spec — the JSON schema you publish to the MCP server or the function-calling registry. The second is the model's in-context representation of that spec, reinforced every turn by few-shot examples in your system prompt, by the serialized tool history the agent sees on multi-turn tasks, and by whatever the model already absorbed about your API during pretraining. You can atomically update the first. You cannot atomically update the second. That asymmetry is the whole problem, and it is why "additive only, reserve forever" — the discipline that protobuf and GraphQL operators internalized a decade ago — needs to migrate to the tool-schema layer now.
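A sketch of what "additive only, reserve forever" might look like at the argument-validation step, using the query to search_query rename from the example above. The normalize_args helper and the limit parameter are hypothetical; the idea is that the old name stays accepted as an alias, and anything unrecognized is rejected loudly instead of being dropped.

```python
# Current parameter names for the tool (limit is a hypothetical extra).
ALLOWED = {"search_query", "limit"}
# Old names stay reserved and keep working; they are never reused.
DEPRECATED_ALIASES = {"query": "search_query"}


def normalize_args(args: dict) -> dict:
    """Map deprecated names to current ones and fail on unknown keys."""
    out = {}
    for key, value in args.items():
        canonical = DEPRECATED_ALIASES.get(key, key)
        if canonical not in ALLOWED:
            # A loud error gives the model something to correct against.
            raise ValueError(
                f"unknown argument {key!r}; expected one of {sorted(ALLOWED)}"
            )
        if canonical in out:
            raise ValueError(f"argument {canonical!r} supplied more than once")
        out[canonical] = value
    if not out.get("search_query"):
        # Never default a required field to the empty string.
        raise ValueError("search_query is required and must be non-empty")
    return out
```

An agent still sending the old field gets the right search. An agent sending a misspelled field gets an error it can read and retry from, instead of a plausible empty result.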

Agent Protocol Fragmentation: Designing for A2A, MCP, and What Comes Next

April 19, 2026 · 9 min read

Tian Pan

Software Engineer

Most teams picking an agent protocol are actually making three separate decisions at once — and treating them as one is why so many integrations break the moment a second framework enters the picture.

The three decisions are: how your agent talks to tools and data (vertical integration), how your agent collaborates with other agents (horizontal coordination), and how your agent surfaces state to a human interface (interaction layer). Google's A2A, Anthropic's MCP, and OpenAPI-based REST solve for different layers of this stack. When engineers conflate them, they either over-engineer a single-agent setup with multi-agent machinery, or under-engineer a multi-agent workflow with single-agent tooling. Both failures are expensive to refactor once in production.

MCP Is the New Microservices: The AI Tool Ecosystem Is Repeating Distributed Systems Mistakes

April 14, 2026 · 8 min read

Tian Pan

Software Engineer

If you lived through the microservices explosion of 2015–2018, the current state of MCP should feel uncomfortably familiar. A genuinely useful protocol appears. It's easy to spin up. Every team spins one up. Nobody tracks what's running, who owns it, or how it's secured. Within eighteen months, you're staring at a dependency graph that engineers privately call "the Death Star."

The Model Context Protocol is following the same trajectory, at roughly three times the speed. Unofficial registries already index over 16,000 MCP servers. GitHub hosts north of 20,000 public repositories implementing them. And Gartner is predicting that 40% of agentic AI projects will fail by 2027 — not because the technology doesn't work, but because organizations are automating broken processes. MCP sprawl is a symptom of exactly that problem.

The MCP Composability Trap: When 'Just Add Another Server' Becomes Dependency Hell

April 13, 2026 · 9 min read

Tian Pan

Software Engineer

The MCP ecosystem has 10,000+ servers and 97 million SDK downloads. It also has 30 CVEs filed in sixty days, 502 server configurations with unpinned versions, and a supply chain attack that BCC'd every outgoing email to an attacker for fifteen versions before anyone noticed. The composability promise — "just plug in another MCP server" — is real. But so is the dependency sprawl it creates, and most teams discover the cost after they're already deep in integration debt.

If you've built production systems on npm, you've seen this movie before. The MCP ecosystem is speedrunning the same plot, except the packages have shell access to your machine and credentials to your production systems.

MCP Server Supply Chain Risk: When Your Agent's Tools Become Attack Vectors

April 10, 2026 · 9 min read

Tian Pan

Software Engineer

In September 2025, an unofficial Postmark MCP server with 1,500 weekly downloads was quietly modified. The update added a single BCC field to its send_email function, silently copying every email to an attacker's address. Users who had auto-update enabled started leaking email content without any visible change in behavior. No error. No alert. The tool worked exactly as expected — it just also worked for someone else.

This is the new shape of supply chain attacks. Not compromised binaries or trojaned libraries, but poisoned tool definitions that AI agents trust implicitly. With over 12,000 public MCP servers indexed across registries and the protocol becoming the default integration layer for AI agents, the MCP ecosystem is recreating every mistake the npm ecosystem made — except the blast radius now includes your agent's ability to read files, send messages, and execute code on your behalf.
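A partial mitigation is to pin the tool definitions a server advertised when it was reviewed and flag any drift before the agent sees them. The sketch below assumes tool definitions arrive as plain dictionaries with a name key; fingerprint and changed_tools are hypothetical helpers. It only detects changes to the advertised definition. A change hidden inside the server's code, with the definition left untouched, would pass this check, so version pinning and code review remain necessary.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Stable hash of a tool definition, independent of key order."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(pinned: dict, advertised: list) -> list:
    """Names of advertised tools that are new or differ from the pinned hash."""
    changed = []
    for tool in advertised:
        name = tool["name"]
        # Unknown tools have no pin, so they are reported as changed too.
        if pinned.get(name) != fingerprint(tool):
            changed.append(name)
    return sorted(changed)
```

A non-empty result would block the session or route the diff to a reviewer, which turns a silent update into an explicit approval step.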

What Nobody Tells You About Running MCP in Production

April 8, 2026 · 10 min read

Tian Pan

Software Engineer

The Model Context Protocol sells itself as a USB-C port for AI — plug any tool into any model and watch them talk. In practice, the first day feels like that. The second day you hit a scaling bug. By the third day you're reading CVEs about tool poisoning attacks you didn't know existed.

MCP is a genuinely useful standard. Introduced in late 2024 and quickly adopted across the industry, it has solved real integration friction between LLMs and external systems. But the gap between "got a demo working" and "running reliably under load with real users" is larger than most teams expect. Here's what that gap actually looks like.
