Skip to main content

The IDE Plugin Is the Product Now: When Your Coding Agent Outgrows the Editor's Plugin API

· 11 min read
Tian Pan
Software Engineer

The default mental model for an AI coding tool is a panel inside VS Code. A chat box, a few inline suggestions, maybe an "apply diff" button. That framing is two years out of date. The leading products in the category are not VS Code extensions; they are full editors that happen to look like VS Code on launch. Cursor is a fork. Windsurf is a fork. Zed is a from-scratch native editor. The pattern is not coincidence — it is what happens when an agent's surface area finally exceeds what the host editor's plugin API was designed to support.

If you are building a coding agent and still treating "ship a plugin" as the obvious distribution choice, you are about to hit the same wall the leaders walked into around 2024 and chose to climb. The wall has a name: the plugin API was built to add features to an editor controlled by humans, not to host an autonomous agent that wants to control the editor.

What the plugin API was built for

VS Code's extension API is one of the best in the industry. It exposes language servers, debug adapters, custom views, command palettes, hover providers, code actions, tree views, status bar items, webviews, tasks, terminals. You can build a high-quality language extension or a thoughtful productivity tool with it. The API matured around the assumption that the human is the agent: the user types, the user runs commands, the user clicks "apply." Extensions react.

Agents invert this. The agent is the actor; the editor is the surface where the agent's actions become visible to a human who is supervising rather than driving. This inversion stresses parts of the plugin API that human-in-the-loop extensions never touched.

A few examples of where the seams show:

  • Custom diff and approval UI. The agent wants to propose a multi-file change with per-hunk approval, an inline rationale, a side-by-side preview, and an "approve all in this file" affordance. The plugin API gives you a built-in diff editor and a webview. The webview is a sandboxed iframe that cannot directly modify the editor — you marshal everything through the extension host. Every UX detail you want costs an event-bus round trip.
  • Multi-pane orchestration. An agent loop wants to keep a conversation pane visible while a plan pane updates while a file pane shows the current edit while a terminal pane streams test output. The host editor decides the layout grammar, and that grammar was designed for "editor + sidebar + bottom panel," not for an agent's situation room.
  • Sub-second whole-project context. The agent needs to know about the function three files away, not just the line under the cursor. That requires an index that lives next to the editor's own buffer state, not a separate process talking through a JSON-RPC pipe with a 50ms tax on every query.
  • In-editor agent telemetry. What did the agent read? What tools did it call? What was rejected and why? The agent wants to render its own trace UI inline with the code it is touching. Plugin APIs rarely expose the editor's own surfaces deeply enough to do this without a webview that re-implements half the editor.

None of this is a flaw in VS Code. It is a category mismatch. The API was scoped to keep extensions safe, portable, and well-behaved across versions. Agents need surfaces that are unsafe, non-portable, and version-coupled — exactly the things a healthy plugin contract refuses to expose.

The build-vs-fork-vs-extend decision

Once you accept that the plugin model has limits, you face three doors. They are not equivalent, and each one prices a different risk.

Extend. Stay a plugin. Lean on what the API does give you. Live within the chat-pane-and-inline-suggestions box. The advantage is enormous: every existing user can install you in 30 seconds, your extension lives next to all the others a developer already depends on, and you inherit security updates and platform improvements for free. The cost is a ceiling on the agent's surface area. You will not ship a custom diff approval flow that feels right. You will not own the layout. You will route every agent observation through the host's UI primitives. For some products that ceiling is fine forever. For an agent that wants to be the primary loop, the ceiling becomes the product.

Fork. Take the open-source editor and modify the core. Cursor, Windsurf, and Trae all chose this door. You inherit familiarity (the keybindings, the command palette, the muscle memory) and 90% of the engineering investment in modern editing (Tree-sitter, language servers, terminal, debugger). You pay three taxes. First, you maintain a fork — every upstream release is a merge cost, and security patches are now your problem to land in time. Second, you fragment the marketplace. Cursor has had to transition to Open VSX and even publish their own maintained versions of popular extensions because Microsoft's terms restrict the official marketplace to Microsoft products; users discover the gap when their familiar C/C++ extension breaks on a version bump. Third, you take on a brand promise: when Live Share or some other Microsoft-only feature stops working, the user blames you, not the plugin contract.

Build from scratch. Zed's path. Native Rust, GPU-accelerated rendering, no Electron, an indexed CRDT-based buffer model designed from day one to support real-time collaboration and streaming AI. The ceiling is the highest of the three: you own the rendering pipeline, the input loop, the buffer data structure, the file watcher. You can make agent operations feel like first-class editor operations because they are. The cost is everything you do not get for free — the language servers you have to integrate, the debug protocol you have to implement, the muscle memory users have to rebuild, the long road to extension parity. This door is closed to almost everyone because the engineering bill is measured in years of a team that knows how to write an editor.

The decision is rarely made cleanly. Many teams start at "extend," hit the ceiling, prototype a fork, ship the fork to power users, and only then discover what a maintenance burden looks like. The teams that picked the right door early did so because they had a clear thesis about the agent's UI surface — not because they prefer big projects.

What the agent actually needs from the editor

To pick the right door, list the operations the agent must perform and ask, for each one, whether the plugin API can carry it without becoming a webview project of its own.

A reasonable list for an autonomous coding agent in 2026 looks like this:

  • Read the open project at agent speed. The agent runs a master loop where each turn might issue a dozen file reads, regex searches, and symbol lookups. Round-trip latency is the user-facing latency. If your context-fetch tool takes 80ms per call because it crosses a process boundary and serializes JSON, you have just budgeted seconds per turn for plumbing.
  • Render proposed changes in a custom approval surface. Per-hunk acceptance, per-file rationale, optional inline test runs, the ability to reject and re-prompt. The built-in diff editor is a good primitive but rarely the final UX.
  • Stream the agent's own state. Plan, current step, tool calls, costs. Users want to see the agent thinking. That display lives inside the editor surface, not in a separate window.
  • Expose hook points for organizational policy. When does the agent need approval? Which files are off-limits? Which tools require human confirmation? Hooks need to fire reliably, in-process, without the user being able to disable them by uninstalling the extension.
  • Coordinate with terminals, debuggers, and language servers. The agent's "verify" step is "run the test, read the output, parse the failure, make a hypothesis." Plugin APIs let you spawn terminals; they rarely let you read terminal output programmatically with the fidelity an agent needs.
  • Persist agent memory tied to the project. Conventions, recent decisions, where the bodies are buried. This belongs next to the project, in a format the agent can index and cite, not in a vendor cloud you keep round-tripping to.

If most of those are nice-to-have, ship a plugin. If most of them are the product, the plugin is going to feel like a website built inside a browser extension — technically possible, structurally wrong.

The migration cost users actually pay

The discourse on "fork vs. extension" tends to dwell on engineering cost. The cost that decides product success is the user's switching cost when your agent stops being a plugin and starts being a different editor.

Developers do not have a clean editor; they have a stack of years of choices: keybindings, themes, fonts, color schemes, snippets, debug configs, .vscode/settings.json files checked into every repo, language extensions, formatter integrations, pair programming setups, remote development sessions, devcontainers. A fork inherits maybe 80% of this; the user spends a weekend rebuilding the missing 20% and discovers two things: which of their dailies do not exist on Open VSX, and which Microsoft-only features they had silently come to depend on (Pylance, Remote-SSH, Live Share, the C/C++ extension, the .NET tooling).

The teams that handled this best did three things. They picked their fork moment when their agent's value gap was wide enough to overcome the friction — not the day the plugin began to feel cramped. They invested heavily in extension compatibility from day one, including publishing maintained mirrors of the most-installed extensions where licensing allows. And they made the import experience feel like an upgrade rather than a migration: settings sync on first run, keymap import, sensible defaults that match the editor the user just left.

The fork that fails is not the one with worse engineering. It is the one that asks the user to throw away muscle memory before the agent has earned it.

The new shape of the AI coding tool stack

Step back, and a layered stack is forming. At the bottom is the model and its provider. Above that is the agent harness — the master loop, the tool catalog, the context-management strategy, the policy hooks. Above that is the editor surface — the place where the agent's reads, writes, and approvals become visible to a human who is supervising. Around that sits the policy and observability layer that an organization wires in.

The interesting shift is that the editor surface is no longer a thin presentation layer. It is the most differentiated part of the product after the model itself. Two teams using the same frontier model will ship products that feel completely different depending on whether the editor surface gives the agent a chat pane or a situation room.

That is why the discourse frame of "an AI feature inside IDE X" was right for 2024 and is wrong for 2026. The agent is not a feature. It is the loop the user is supervising. The editor is the supervisor's console. Treating the console as a place where you rent some screen real estate from a host application puts a hard cap on the product.

Picking your door honestly

If you are starting an AI coding tool today, the honest version of the decision goes like this. Build an extension if your agent's value lives inside a chat pane and a few inline edits, and your moat is the model, the prompts, or the integrations — not the editor experience. You will move fast and reach every developer through marketplaces they already trust.

Fork if your agent's UX needs surfaces the plugin API cannot carry, the team has the budget to maintain a downstream branch indefinitely, and the value gap will outpace the friction of asking users to install a new editor. Be honest about the ongoing cost; the work of keeping a fork green is not a one-time porting project but a permanent payroll line.

Build from scratch only if you have a fundamental thesis about what the editor itself should be, and the team to execute it. The bar here is brutally high. The reward is that the agent's operations stop feeling grafted on and start feeling native, because they are.

The wrong question is "plugin or fork?" The right question is "what does my agent need from the editor that the editor was not built to give?" Answer that, and the door picks itself.

References:Let's stay in touch and Follow me for more thoughts and updates