Mid-Flight Steering: Redirecting a Long-Running Agent Without Killing the Run
Watch a developer use an agentic IDE for twenty minutes and you will see the same micro-drama play out three times. The agent starts a long task. Two tool calls in, the user realizes they want a functional component instead of a class, or a v2 endpoint instead of v1, or tests written in Vitest instead of Jest. They have exactly one lever: the red stop button. They press it. The agent dies mid-edit. They copy-paste the last prompt, append the correction, and pay for the first eight minutes of work twice.
The abort button is the wrong affordance. It treats "I want to adjust the plan" and "I want to throw away the run" as the same gesture. In practice they are as different as a steering wheel and an ejector seat, and conflating them is why so many agent products feel brittle the moment a task takes longer than a single screen of output.
The fix is not a better progress bar or a more confident model. It is a set of architectural seams that let the user introduce new information without discarding correct work — and a UX vocabulary that distinguishes course correction from override from cancellation so the user can pick the right one without thinking about it.
The Abort-Only UI Is a Distributed Systems Bug in Disguise
Most agent UIs inherit their shape from request-response web apps. You submit a prompt, a stream comes back, you watch it land. The only control signal flowing the other way is the HTTP connection itself — kill the socket and you kill the run. That model is fine when the unit of work fits in a single response. It falls apart when the unit of work is a fifteen-minute plan with a dozen tool calls and the user has new information at minute seven.
The root cause is a missing channel. Streaming tokens and events go down to the client; almost nothing comes back up until the run is done. When users eventually press Escape, they hit a stateless HTTP boundary that has to tear down the stream and the inference pipeline together. There is no primitive for "keep going but listen to me." So the product ships with Stop as the only option, and every correction becomes a restart.
Teams shipping real mid-flight steering in 2026 — from GitHub Copilot's agent mode work to the Codex-style editors — have all converged on the same diagnosis: you need a persistent bidirectional channel, an explicit in-flight run identity, and a place on the server where steering inputs land so the agent can pick them up between actions. Without those three things, every interrupt is a race with cancellation semantics, and the user's correction usually arrives one tool call too late.
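Concretely, the upstream channel can be as small as a tagged message envelope. The sketch below is illustrative, not any product's wire format: the `Verb` enum, the `SteeringMessage` class, and every field name are assumptions. What it shows is the diagnosis above in miniature: an explicit `run_id` gives the steering input an in-flight run to land on, and a verb field separates "adjust" from "abort" at the protocol level instead of at the socket level.

```python
import dataclasses
import enum
import json

class Verb(enum.Enum):
    CORRECT = "correct"    # merge new information, keep the current plan
    OVERRIDE = "override"  # replace part of the plan, force a replan
    CANCEL = "cancel"      # tear the run down; the only destructive verb

@dataclasses.dataclass
class SteeringMessage:
    run_id: str      # explicit in-flight run identity
    verb: Verb
    payload: dict    # e.g. a note, a plan edit, a constraint

    def to_wire(self) -> str:
        # serialize for the persistent bidirectional channel
        return json.dumps({"run_id": self.run_id,
                           "verb": self.verb.value,
                           "payload": self.payload})

# "run-7f3a" is a made-up identifier for illustration.
msg = SteeringMessage("run-7f3a", Verb.CORRECT,
                      {"note": "use Vitest instead of Jest"})
wire = msg.to_wire()
print(wire)
```

Because the message is addressed to a run rather than to a connection, the server can drop it into that run's steering inbox and acknowledge immediately; nothing about delivery races with cancellation.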
Three Architectural Seams Worth Building
Once you accept that steering is a transport-and-state problem rather than a UX skin, three seams do most of the work. They are not alternatives; they compose.
Checkpoint-and-inject. The agent's state — plan, scratchpad, tool-call history, retrieval results — lives behind a checkpointer that writes after every step. A steering input becomes a structured update merged into the checkpoint before the next step reads it. LangGraph's interrupt() primitive is the reference implementation: the graph persists, pauses, and resumes from the same thread ID with the user's payload merged in. The value of this seam is that correct work survives. You do not unwind seven tool calls because the eighth one was headed somewhere wrong.
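A minimal stand-in for this seam, written without LangGraph, might look like the following. The `Checkpointer` class, the `inbox` list, and the state shape are all hypothetical names invented for this sketch; the pattern is the point: persist after every step, merge any steering input before the next step reads state, and resume from the checkpoint so completed work is never repaid.

```python
import json
import pathlib
import tempfile

class Checkpointer:
    """Persists agent state to disk after every step (illustrative only)."""
    def __init__(self, path: pathlib.Path):
        self.path = path

    def save(self, state: dict) -> None:
        self.path.write_text(json.dumps(state))

    def load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text())
        # fresh run: a made-up initial state for the sketch
        return {"step": 0, "plan": {"framework": "Jest"}, "log": []}

def run_agent(ckpt: Checkpointer, steps: int, inbox: list) -> dict:
    state = ckpt.load()                       # resume from last checkpoint
    while state["step"] < steps:
        if inbox:                             # steering merges *between* steps
            state["plan"].update(inbox.pop(0))
        state["log"].append(
            f"step {state['step']} with {state['plan']['framework']}")
        state["step"] += 1
        ckpt.save(state)                      # correct work survives
    return state

ckpt = Checkpointer(pathlib.Path(tempfile.mkdtemp()) / "run.json")
inbox: list = []
run_agent(ckpt, steps=2, inbox=inbox)         # two steps under the old plan
inbox.append({"framework": "Vitest"})         # user steers mid-run
final = run_agent(ckpt, steps=4, inbox=inbox) # resumes at step 2, new plan
print(final["log"])
```

The run resumes at step 2 with the corrected framework: the first two steps are paid for exactly once, which is the whole value of the seam.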
Plan revision hooks. Somewhere in the loop, the agent commits to a plan — an explicit list of steps, a tree of subgoals, a TODO. Exposing that plan as a first-class editable object (not a log line, not a message) gives the user a natural place to intervene. Edit the step, reorder it, delete it, add a constraint to it, and let the agent re-read the plan on the next iteration. This is the difference between shouting "stop, stop, do it differently" into a chat box and pointing at step 4 and typing "use the v2 API here." The second is actionable because it lands on a structured surface the agent already consults.
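As a sketch, a plan exposed as a first-class editable object might look like the following. The `PlanStep` and `Plan` classes and their fields are invented for illustration, not any product's schema; what matters is that an edit lands on a structured surface the agent re-reads at the top of every iteration.

```python
import dataclasses

@dataclasses.dataclass
class PlanStep:
    id: int
    action: str
    constraints: list = dataclasses.field(default_factory=list)
    done: bool = False

class Plan:
    """An editable plan object, not a log line (hypothetical shape)."""
    def __init__(self, steps: list):
        self.steps = steps

    def edit(self, step_id: int, action=None, constraint=None) -> None:
        for s in self.steps:
            if s.id == step_id:
                if action:
                    s.action = action          # rewrite the step
                if constraint:
                    s.constraints.append(constraint)  # add a constraint

    def next_step(self):
        return next((s for s in self.steps if not s.done), None)

plan = Plan([PlanStep(1, "scaffold component"),
             PlanStep(2, "wire data fetching"),
             PlanStep(3, "write tests")])

# User points at step 2 and types "use the v2 API here":
plan.edit(2, constraint="use the v2 API")

prompts = []
step = plan.next_step()
while step:
    # The agent consults the structured plan, constraints included,
    # so the edit takes effect without restarting the run.
    text = step.action
    if step.constraints:
        text += " (" + "; ".join(step.constraints) + ")"
    prompts.append(text)
    step.done = True
    step = plan.next_step()
print(prompts)
```

Reordering and deletion are the same mechanism: mutate the list, and the next `next_step()` call reflects the change.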
Soft-interrupt tokens. Between tool calls — or at any natural checkpoint inside a long-running step — the agent reads from a steering inbox. If a message is waiting, it is spliced into context as a system note and the agent decides whether to replan, acknowledge, or continue. The inbox is non-blocking: the user can drop notes into it at any time, and the agent picks them up at the next safe point. Claude Code's hook events (PreToolUse, PostToolUse, UserPromptSubmit during an active run) are one concrete instantiation; asynchronous steering notes proposed for Copilot's agent mode are another. The shared insight is that the agent should be polling for user input even when it is not pausing.
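A steering inbox can be as simple as a non-blocking queue drained between tool calls. This sketch is an assumption-laden illustration, not Claude Code's or Copilot's implementation; `run_step` and the `transcript` list are invented names.

```python
import queue

# The user-facing side writes here at any time; the agent never blocks on it.
inbox: queue.Queue = queue.Queue()

def run_step(call: str, transcript: list) -> None:
    # Drain the inbox without blocking: steering never stalls the run.
    while True:
        try:
            note = inbox.get_nowait()
        except queue.Empty:
            break
        # Splice the note into context as a system note; the agent can
        # then replan, acknowledge, or continue.
        transcript.append(f"[system note] {note}")
    transcript.append(f"tool: {call}")

transcript: list = []
run_step("read_file", transcript)
inbox.put("switch to the v2 endpoint")   # user steers while a call is running
run_step("edit_file", transcript)
print(transcript)
```

The note lands before the second tool call rather than after the run, which is the entire difference between a correction and a restart.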
The UX Taxonomy: Correction, Override, Cancellation
Having the seams is half the job. The other half is giving the user three different verbs instead of one: correction injects new information while the current plan keeps running, override replaces a decision or a plan step and forces a replan, and cancellation alone tears the run down.
References
- https://ably.com/blog/ai-transport-redirect-steering
- https://github.com/microsoft/vscode/issues/288920
- https://docs.langchain.com/oss/python/langgraph/interrupts
- https://www.langchain.com/blog/making-it-easier-to-build-human-in-the-loop-agents-with-interrupt
- https://platform.claude.com/docs/en/agent-sdk/hooks
- https://strandsagents.com/latest/documentation/docs/user-guide/concepts/experimental/steering/
- https://newsletter.victordibia.com/p/4-ux-design-principles-for-multi
