Skip to main content

36 posts tagged with "mcp"

View all tags

The MCP Tool List Grew Mid-Session and Your Agent Called a Tool It Had Never Been Told About

· 10 min read
Tian Pan
Software Engineer

A security incident review opens with a question the team cannot answer: how did the agent learn the name of the tool it just called? The audit trail shows a tools/call for a tool whose name does not appear in any tools/list response the harness logged. The MCP server cheerfully accepted the call and executed it. The model, asked in a postmortem to explain where the tool name came from, offers no answer because there is none — it guessed, and the guess landed on a real action.

This is the failure mode at the seam between two assumptions that look compatible on paper. The client treats the tool list as a contract that names the surface area of authority it has been granted. The server treats the tool list as a snapshot of what is currently available, free to grow when the world grows. Between those two views, the LLM is a bridge that does not know the difference.

The MCP Server Your Team Forgot Was Running with Prod Credentials

· 10 min read
Tian Pan
Software Engineer

A new engineer joined the team on Monday. By Wednesday, she had a working local agent setup: an MCP server bridged to the company's deployment API, pointed at staging, talking to her editor. The onboarding doc walked her through the OAuth flow. The token she pasted into the server's environment file was the one her teammate had emailed her — the same token the CI pipeline uses to ship to staging. By Friday, she had joined the team for a working session at a coworking space.

The MCP server was still running. Bound to 127.0.0.1. No authentication. The token was loaded into the process. She didn't think about it because she was not using it. But any tab that visited any website that day could speak to her local server through her own browser. So could any other laptop on the coworking wifi, because she had not noticed that the server was actually bound to 0.0.0.0. The OAuth token your CI pipeline uses to push to staging was now reachable by anyone who could trick a browser into making a request to a local IP — which, in 2026, is a one-pop-up problem.

This post is about that class of failure: the gap between "I'm developing on my laptop" and "my laptop is a server reachable by adversaries." MCP servers, by design, sit right in that gap. Most teams have not noticed.

The Revoked Tool Your Agent Kept Calling Because the Registry Cache Was an Hour Stale

· 11 min read
Tian Pan
Software Engineer

A user opens the integrations page, finds the Stripe connector they installed last month, clicks Remove, and closes the tab. They believe they have just rescinded an authority. What they have actually done is decrement a row in a database that the agent currently talking to them will not read again for another forty-three minutes. In the interval, the agent will try to call that Stripe tool, the registry's authorization layer will correctly say no, the agent's harness will see the denial as a transient downstream blip and retry three times, and the user's own Stripe audit log will record three unauthorized access attempts arriving from a vendor they thought they had just severed.

The user's escalation will read, almost verbatim: your platform kept trying to access my Stripe after I removed it. That is exactly what happened, and the root cause sits one layer deeper than the bug report ever reaches. The tool registry was the source of truth for what the agent was allowed to do. The agent did not read the source of truth. It read a cache.

The Async Tool Call Your Agent Fired and Forgot

· 10 min read
Tian Pan
Software Engineer

The clearest sign that an agent's tool-call abstraction is broken is when the trace shows the step marked done and the downstream system shows nothing happened. The model called a tool, received a job ID back, treated the job ID as the answer, and moved on. Three minutes later the actual work either succeeded with nobody listening or failed with the error landing in a log nobody reads. The user sees a confident summary; the operations queue sees a stranded task.

This is the failure mode the function-calling abstraction quietly enables. JSON schemas describe parameters and return types, but they do not distinguish between "this tool returns a result" and "this tool returns a receipt for an operation whose result you will need to ask about later." The model treats both the same way, because to the planner they look the same — a successful tool call with a non-error payload.

The OAuth Scope Your Agent Acquired Across Chained Tool Calls

· 10 min read
Tian Pan
Software Engineer

A user clicks "Authorize" on your agent's consent screen once. By the time the session ends, that agent has chained through eleven tool calls, negotiated three step-up authorizations, and now holds the union of scopes across every tool it touched. The user remembers granting one thing. Your audit log shows read-write access to half their account. The OAuth standard says everything is working as designed, and that is exactly the problem.

The classical OAuth consent model was built for a world where one app talks to one API. Agents shattered that assumption two years ago and the standard has not caught up in practice, even where the spec has. The result is a category of silent privilege escalation that no one decides to ship — it accretes, one tool registration at a time, while your security review keeps inspecting the front door.

The prompt injection that survived your sanitizer because the agent read it through a tool

· 11 min read
Tian Pan
Software Engineer

A team I talked to last month had a clean prompt-injection story. Their gateway ran every user message through a classifier. Anything that scored above a threshold got bounced with a polite error. They benchmarked it against a public adversarial set, hit 99.4% block rate, and shipped. Two weeks later, a customer-success ticket revealed that the agent had quietly drafted, approved, and sent an email instructing an internal billing tool to refund a stranger's invoice to a new account. The malicious instruction had never touched the user input. It came in through a Confluence page the agent fetched when the user asked, perfectly innocently, "what does our refund policy say?"

That is the failure mode no input sanitizer catches, and it is now the dominant prompt-injection vector in production agents. The classifier you trained on user prompts never saw the payload, because the payload arrived through a different door. By the time the bytes hit the model, the agent had already labeled them as "context I retrieved to help the user," not "untrusted text from a stranger on the internet." The model treats both with the same compliance instinct, because the model has no concept of trust at all.

The Tool Version Bump Your Agent Quietly Adapted To

· 10 min read
Tian Pan
Software Engineer

A downstream search service ships v2.3.2 on a Tuesday afternoon. The release notes mention a renamed status field, a new nullable confidence value, and a reordered array in the result envelope. Nothing in the CHANGELOG is marked breaking. The provider's own client libraries absorb the change in a point release. Your team's HTTP integrations would have logged a deserialization error inside an hour. Your agent — the one routing customer questions through that search tool — does not. It keeps answering. The questions still resolve. The dashboards stay green.

Six weeks later, someone notices that "out of stock" replies have crept up from two percent of queries to eleven. The root cause is the v2.3.2 bump. The renamed status string changed from in_stock to available, and the agent — being a flexible reasoner over text rather than a schema-strict client — interpreted the absence of the old token as "not available," then phrased that finding into helpful, confident, wrong customer messages. The contract regression was absorbed on the consumer side, where no test suite was watching.

This is the failure mode that conventional API hygiene was never designed to catch. Strict clients break loudly. Agents break quietly. And the longer you treat your agent like a normal HTTP consumer, the longer this class of bug hides inside metrics that look fine.

The Pointer Your Agent Mistook for a Value: Reference vs Value in Tool Outputs

· 11 min read
Tian Pan
Software Engineer

A search tool returns ten document IDs. An asset tool returns an S3 presigned URL. A database tool returns a row handle. A file tool returns a path. Each of those returns is, formally, a pointer — a small string that names a value the agent does not yet possess. The model's downstream behavior depends entirely on whether it knows that and dereferences before reasoning, or whether it treats the pointer as if it were already the thing.

The failure mode is invisible from the trace. The tool call succeeded. The return is well-formed. The model emitted plausible-looking output. Nothing in the log says "the agent reasoned about a filename and called it a document." The pointer-vs-value confusion sits underneath the visible behavior, in a layer your tool schema never named.

The Sandbox Your Agent Didn't Notice Was Real

· 10 min read
Tian Pan
Software Engineer

A team I know has a textbook staging setup. Read-only replicas of the production database. A mock Stripe account that pretends to charge cards. Synthetic users with fake email addresses on a domain nobody owns. The agent is asked to walk through an "account delinquent" escalation flow in staging, end to end, as part of a release rehearsal. The trace looks clean. The agent does what it is supposed to do.

Three minutes later, a real customer — a paying one, who churned six months ago and was still in a dormant export the developer had used to seed a test fixture — replies to a politely-worded payment-overdue email. The "send_email" tool, registered next to a dozen other tools that all terminate in mocks, was wired to the production Mailgun key. The developer who set it up two sprints earlier had been iterating fast on email templates and the sandbox tier capped them at five emails an hour, which broke the inner loop, so they swapped in the real key "just for the afternoon" and forgot. Nobody re-checked. The agent had no way to know.

The Tool You Added For One Agent Is Now In Every Agent's Hand

· 10 min read
Tian Pan
Software Engineer

Six months ago, somebody on the customer-support team wired a send_email tool for their agent. It worked. The platform team noticed it in the shared tool registry, gave a thumbs-up emoji on the PR, and moved on. This week, a security engineer ran an audit and discovered that send_email is in the action surface of the meeting-notes summarizer, the data-quality bot, an analytics assistant nobody officially owns, and a half-built prototype that hasn't been touched since January. None of these agents need to send email. None of them have ever been reviewed for whether they should be allowed to. The PRD for the meeting-notes summarizer is two sentences long and the words "outbound communication" do not appear in it.

This is the default state of every shared tool registry I have ever audited. The act of registering a tool — pushing a JSON schema and a handler into a central catalog — is treated as a developer convenience, like adding a utility function to a shared library. But once the registry is sourced into every agent's prompt, registering a tool is not a library change. It is a deployment to every agent in the company simultaneously, with no review of whether each of them should have received it.

MCP Server Sprawl: The Unbounded Tool Surface Nobody Owns

· 9 min read
Tian Pan
Software Engineer

The Model Context Protocol did exactly what it set out to do: it made giving an agent a new capability almost free. Wiring in a calendar server, a database server, an internal company server, or one of the 30,000-tool catalogs that vendors now publish is a config change, not a project. That frictionlessness is the feature. It is also the problem.

Because adding a tool is cheap, every team adds tools. The data team wires in a warehouse server. The support team adds a ticketing server. Someone connects a filesystem server for a one-off task and never removes it. None of these decisions is wrong. But there is no decision that owns their sum — the aggregate tool surface your agent now carries on every single request. The tool list has become a dependency graph with a real carrying cost, and in most organizations it is the one dependency graph nobody is responsible for.

The result is sprawl: a tool catalog that grows monotonically, gets reviewed by no one, costs more every quarter, and quietly makes the agent worse. This is the unowned surface, and it deserves the same scrutiny you already give your API surface and your npm tree.

Your Tool Descriptions Are an Instruction Channel the Model Obeys

· 8 min read
Tian Pan
Software Engineer

When a security team reviews a new tool integration, they read the code. They check what the function does, what it touches, what scopes it needs, whether it logs secrets. They almost never read the one sentence that decides whether the model calls it at all — the tool's description. That sentence is not documentation. It is an instruction the model treats as authoritative, and in most agent stacks nobody reviews it.

A tool description is written for the model to read. The model uses it to decide when the tool is relevant, what arguments to pass, and how to interpret what comes back. That makes the description a control channel into the model's behavior. And the moment a tool arrives from a third-party registry, a Model Context Protocol (MCP) server you don't operate, or a plugin a teammate installed last week, that control channel is authored by someone you never agreed to trust.

This is the gap. Input sanitization inspects what users type. Code review inspects what functions execute. The tool description sits between them — it is configuration that behaves like input — and it falls through both nets.