
Conversation Branching as a First-Class Primitive: Why Linear Threads Force Users to Kill and Restart

Tian Pan · Software Engineer · 10 min read

The clearest signal that your chat product needs branching is also the easiest one to ignore: users keep copy-pasting old conversations into new sessions. They are not migrating providers. They are not bored. They are trying to ask "what if I had pushed back on that earlier assumption?" without losing the forty turns of context they spent building. The linear thread offers them exactly two options — overwrite the next message and lose the original, or start a new chat and lose the prefix. So they invent a third one with a clipboard.

Every time a user does this, your product is leaking a feature request through a workaround. The workaround is bad: it strips message metadata, breaks tool-call linkage, drops file attachments, and creates orphaned threads that no longer map to a coherent task. But it persists because the alternative — abandoning context that took thirty minutes to assemble — is worse. The conversation is structurally a tree. The UI insists it is a list. Users patch the gap manually.

Branching as a first-class primitive means treating divergence the way a version control system treats it: as a normal operation that preserves history, supports parallel exploration, and allows merging back. OpenAI shipped this in ChatGPT in late 2025 as "Branch in new chat." Claude Code stores conversations as a DAG of messages where edits create forks rather than overwrites. LangGraph checkpointers let you replay and fork a thread from any saved checkpoint. The pattern is converging because the linear-thread abstraction was always lossy — it just took a few years of usage data for the loss to become undeniable.

The Three Failure Modes Linear Threads Force

Linear chat UIs collapse three distinct user intents into the same UI gesture. When a user wants to change direction, they edit their last message. But "change direction" hides at least three different needs, each of which deserves a different state transition.

The first is course correction: the user thinks the model misunderstood and wants to restate. The original response is no longer wanted; overwrite is fine. The second is alternative exploration: the user got a reasonable answer but wants to see what a different framing produces — both are valuable. The third is rollback to a fork point: the user realized ten turns ago they should have given different constraints, and now wants to retry the whole subsequent conversation under those constraints, while keeping the original branch as a reference.

In a linear thread, all three look identical: the user clicks edit on a message and rewrites it. The system has no way to distinguish "throw away the rest" from "keep both." Most products default to discard, because keeping creates a navigation problem the linear UI cannot represent. Users who want the keep-both semantic cannot get it without leaving the product.
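The three intents can be made explicit in the state model instead of collapsed into one gesture. A minimal sketch, with hypothetical names (`EditIntent`, `transition` are illustrative, not any product's API):

```python
from enum import Enum, auto

class EditIntent(Enum):
    """Hypothetical taxonomy of what a user means by 'edit'."""
    CORRECT = auto()   # restate a misunderstood message; overwrite is fine
    EXPLORE = auto()   # try an alternative framing; keep both responses
    ROLLBACK = auto()  # fork at an earlier turn; keep the original branch

def transition(intent: EditIntent) -> dict:
    """Map an edit intent to the state change it deserves (sketch)."""
    if intent is EditIntent.CORRECT:
        # discard the old response, stay on the same branch
        return {"fork": False, "keep_original": False}
    # EXPLORE and ROLLBACK both preserve the original path as a sibling
    # branch; ROLLBACK just picks a fork point earlier than the last turn
    return {"fork": True, "keep_original": True}
```

The point is that "keep both" is a legitimate state transition the system should be able to represent, not a behavior users have to fake with new sessions.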

The cost shows up as duplicated work. Researchers running comparative analysis open four browser tabs of the same chat. Marketers testing tone variations spawn fresh sessions and re-paste the brief into each. Engineers debugging a multi-step plan ask the model to "go back to step three" and watch it confabulate the prior context because the conversation it is referencing has already been mutated. The branching pattern emerges from below — clumsily, with high friction, and with the model unable to help because each branch lives in a different session.

Copy-on-Branch Is the Right State Model

The mistake most teams make on the first pass is treating a branch as a deep copy of the conversation. This works until users start branching frequently, at which point storage costs and update semantics become a problem. A user with a 60-turn conversation who creates five branches off message 50 should not be paying for 300 messages of storage, and should never see copies of the shared prefix drift out of sync.

The correct model is copy-on-branch with structural sharing: messages are immutable, branches are pointers into a DAG, and shared prefixes exist exactly once on disk. This is the same insight that makes git scalable. A branch is not a copy of a tree; it is a new ref pointing at a commit, and commits are content-addressed nodes in an append-only graph. Translating to chat: each message is a node with a parent pointer, branches are leaf refs, and the "conversation" the user sees is a path from root to leaf reconstructed at read time.

This makes several operations cheap that would otherwise be expensive. Forking is O(1) — you allocate a new leaf ref pointing at the fork-point message. Switching branches is a pointer change, not a copy. Diffing two branches becomes a tree-diff between paths, useful for showing users "this is where the conversations diverged." Garbage collection becomes reachability analysis: messages with no leaf ref pointing through them are deletable, but never silently — they are someone's history.

The non-obvious benefit is that this model keeps the model's view consistent across branches. The shared prefix is the exact same byte sequence in every branch, so the KV cache stays warm. If you serve traffic with prefix caching, branching becomes nearly free at inference time as long as users stay near recent fork points. A naive deep-copy implementation forfeits this — the prefix bytes are technically the same, but the cache key is different, so each branch pays a cold-start tax on its first turn.
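One way to see why structural sharing preserves cache hits: if the cache key is derived from the serialized prefix content, two branches forked at the same point produce byte-identical prefixes and therefore identical keys. A sketch (the serialization format here is illustrative — real inference servers key on token sequences):

```python
import hashlib

def prefix_cache_key(messages: list[tuple[str, str]]) -> str:
    """Content-derived cache key for a conversation prefix.
    Branches sharing a prefix serialize it to identical bytes,
    so their keys match and the warm KV cache is reused."""
    h = hashlib.sha256()
    for role, content in messages:
        h.update(role.encode())
        h.update(b"\x00")
        h.update(content.encode())
        h.update(b"\x00")
    return h.hexdigest()
```

A deep-copy store can only recover the same property if it keys on content rather than on conversation id — at which point it has reinvented half of structural sharing.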

The UI Problem Is Harder Than the Storage Problem

Storage as a DAG is a solved problem. Showing a DAG to users is not. The honest assessment of conversation branching today is that almost every implementation gets the storage model approximately right and the UI approximately wrong. ChatGPT's "Branch in new chat" sidesteps the visualization problem by punting branches into separate top-level entries in the sidebar, which preserves the linear-thread fiction at the cost of losing the parent-child relationship visually. Claude Code's branch navigation is functional but tucked behind a small toggle most users never discover. Tools like tldraw's branching-chat template and Canvas Chat go the other direction and lay the tree out spatially on a canvas, which is great for exploration but disorienting for users whose mental model of "chat" is still a vertical scroll.

There is no settled answer, but a few patterns are emerging. Inline branch indicators — a small "1/3" widget on a message showing it has alternative siblings, with arrows to flip between them — work well for two or three branches at a single fork point. They scale poorly past five and not at all when branches themselves have branches. Sidebar tree views scale better but compete with the main chat for attention and tend to be ignored. Spatial canvases scale best for power users but require giving up the chat affordance entirely, which is a much bigger UX bet than most teams want to make.

The pragmatic compromise: inline indicators by default, a tree-view drawer for users who have created more than three branches in a session, and an explicit "promote this branch to a new top-level chat" escape hatch for the case where a branch becomes the canonical thread. Whatever you ship, ship it with the understanding that you will rebuild it once after you watch users actually use it. Branching reveals workflow patterns the linear UI was hiding, and those patterns will not match what your designer drew on the whiteboard.

Merge Is the Hard Part — Skip It on Version One

Branching without merge is useful. Merge without branching is incoherent. So once branching ships, the natural next ask is: "can I take the conclusion from branch A and the data from branch B and combine them into a third thread?" The answer should be yes, eventually, but the implementation is much harder than branching, and most teams should consciously punt it.

The core problem is that conversation messages are not commutative. Two parallel branches contain assistant responses that reference each other's prefix — branch A's message 52 was conditioned on the model not having seen branch B's message 51, and vice versa. A naive merge that interleaves messages produces a Frankenstein thread the model cannot meaningfully continue from. The model sees inconsistent self-references and either confabulates a unified history or emits a confused response asking for clarification.

The workable approaches all involve some form of synthesis rather than concatenation. Summary-based merge: ask the model to produce a summary of branch B's conclusions, inject the summary as a system or user message into branch A, and continue from there. Selective extraction: let the user pick specific messages from branch B to copy into branch A as quoted context, with explicit framing ("from a parallel exploration, the model concluded..."). Three-way merge with synthesis: present both branches' conclusions to the model and ask it to produce a synthesis as a new turn. None of these are true merges in the git sense; they are all controlled context injections dressed up in merge UI.
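The summary-based variant is small enough to sketch. This assumes the caller supplies the summarization step (a model call in practice); the function name and message framing are illustrative:

```python
def summary_merge(branch_a: list[dict], branch_b: list[dict], summarize) -> list[dict]:
    """Summary-based 'merge': not a true merge. Summarize branch B's
    conclusions and inject them into branch A as explicit quoted context,
    then continue the conversation from branch A's leaf."""
    summary = summarize([m["content"] for m in branch_b])
    injected = {
        "role": "user",
        "content": f"From a parallel exploration, the model concluded: {summary}",
    }
    return branch_a + [injected]
```

Note what this does not do: it never interleaves branch B's raw messages into branch A's history, so the model never sees inconsistent self-references.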

Forky and ContextBranch experiment with semantic three-way merges, but the technique is research-grade. For most teams the right call is to ship branching without merge, watch how users approximate merge with copy-paste from one branch to another, and let those workflows guide what kind of merge primitive is actually wanted. Often it is just "copy this message into that branch" — a much smaller feature than a real merge.

The Telltale Signal You Needed This Yesterday

The diagnostic question is not "do users want branching?" — they will say yes if you ask, but they say yes to most features. The diagnostic is: how often do users start a new conversation with text that obviously came from another conversation? Pasted blocks that begin with "Earlier you said..." or "In our previous chat we established..." are users hand-rolling a branch operation across session boundaries because the product won't give them one within a session.

Instrumenting this is straightforward. Look for new conversations whose first user turn exceeds some length threshold — say, 500 tokens — and contains second-person references to the model. Look for sessions that share unusual proper nouns or named entities with another recent session by the same user. Look for sessions whose first message contains phrases like "continuing from," "as we discussed," or "in the previous conversation." Each of these is a leak — a workflow your product almost supports but doesn't.
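A first-pass detector for the paste-restart signal might combine the length threshold with the phrase check. A sketch (whitespace split as a rough token count, and the phrase list, are assumptions to tune against your own traffic):

```python
import re

# Phrases that suggest the first turn continues a prior session
RESTART_PHRASES = [
    r"\bearlier you said\b",
    r"\bin our previous chat\b",
    r"\bcontinuing from\b",
    r"\bas we discussed\b",
    r"\bin the previous conversation\b",
]

def looks_like_handrolled_branch(first_turn: str, min_tokens: int = 500) -> bool:
    """Heuristic: flag new conversations whose opening user turn is long
    and references a prior session. Token count is approximated by a
    whitespace split; use your tokenizer in production."""
    long_enough = len(first_turn.split()) >= min_tokens
    references_prior = any(
        re.search(p, first_turn, re.IGNORECASE) for p in RESTART_PHRASES
    )
    return long_enough and references_prior
```

The cross-session entity-overlap signal is heavier to build, but even this phrase-level check will surface the most blatant hand-rolled branches.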

The other signal is edit-then-regret. A user edits a previous message, the conversation overwrites the original branch, and within a few minutes the user pastes back what looks like the prior assistant response into the chat as context. They are reconstructing the lost branch from memory. If this happens more than rarely, your edit-message UX is destroying user work and asking them to reconstruct it manually.

Treat both signals as priority bugs in the conversation model, not feature requests. Users are doing the right thing — exploring divergent paths through context they value — and the product is forcing them to do it the wrong way. The fix is not a better edit dialog or a smarter "are you sure?" warning. It is admitting that conversation is not a list, and rebuilding the data model so the UI can finally show what users have always been trying to do.
