Session Stitching: Why Your Conversation-ID Is a Lie
A user starts negotiating a contract with your agent on her desktop at 9 a.m. She gets a Slack ping, switches to her phone over lunch to ask one clarifying question, and reopens the desktop tab at 4 p.m. to revise the draft. To her, that was one task: a single workday spent on one contract. To your system, that was three sessions on two devices, each with its own conversation-id, each with its own memory window, each presenting a fresh greeting and asking her to re-paste the draft she'd already discussed twice.
The bug is not in the model. The bug is that your platform encoded "session" (a transport-layer artifact about a single connection) as the unit of context, while your user encoded "task" (the contract) as the unit of context. Most frameworks' defaults quietly conflate the two, and a large share of agent UX gets lost in the gap between them.
This is not a niche complaint. Once you start logging task-level traces, you find that a meaningful share of "new conversations" are actually continuations of unfinished tasks — the user gave up on stitching them by hand and started over. The product KPI you call "engagement" partly measures users paying the cost of your missing abstraction.
The Frameworks Hand You a Session-ID and Call It Memory
Open any agent SDK from the last two years and the persistence story rhymes. LangGraph asks you to pass a thread_id; the checkpointer saves graph state per thread, separate from every other thread, and resumes when you re-invoke with the same id. The OpenAI Agents SDK gives you a Session keyed by a session id (SQLiteSession("conversation_123")); same session, full history; new session, blank slate. The Claude Agent SDK persists a session to disk so you can return to it later. Google's ADK ships a Resume feature that picks up an interrupted workflow run.
These primitives are correct at the layer they target — they reliably persist state for a single logical run. But none of them defines what a "logical run" is from the user's point of view, and the default that ships in tutorials is: one session per WebSocket connect, one thread per browser tab, one resume per process. The platform decides where the boundary falls, and the boundary almost always lands in the wrong place.
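To make the failure mode concrete, here is a minimal sketch of memory keyed only by session id. It uses no real SDK: `SessionKeyedMemory` and the session ids are hypothetical names chosen for illustration, and the wiring is an assumption about how tutorial defaults tend to look, not a description of any particular framework.

```python
class SessionKeyedMemory:
    """Toy store that keeps history per session id and nothing else."""

    def __init__(self):
        self._history = {}

    def append(self, session_id, message):
        # History accumulates under whatever id the transport handed us.
        self._history.setdefault(session_id, []).append(message)

    def load(self, session_id):
        # An unknown id means an empty context, even for a known user.
        return list(self._history.get(session_id, []))


memory = SessionKeyedMemory()

# 9 a.m. on the desktop: the contract discussion starts.
memory.append("desktop-tab-0900", "Here is the Acme draft.")
memory.append("desktop-tab-0900", "Let's soften clause 4.")

# Lunch on the phone: same user, same contract, new connection, new id.
desktop_context = memory.load("desktop-tab-0900")
phone_context = memory.load("phone-1230")

print(len(desktop_context), len(phone_context))
```

The store behaves exactly as designed, and the phone still starts from nothing, because nothing in the key says the two connections belong to the same piece of work.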
When you trace a real user, the seams are not where the framework draws them. The user does not switch tasks because their TLS connection idled out. They switch tasks when they finish negotiating the contract and start onboarding a vendor. The transport-layer session-id and the user's mental task boundary are two different things that happen to coincide some of the time, and your product is staking continuity on a coincidence.
Task-ID Has To Be First-Class — And Orthogonal
The fix is not to make sessions live longer. Long sessions accumulate unrelated history, and when a later model upgrade or context limit forces you to trim that history, you cannot decide what to keep, because you never recorded what belongs to what.
The fix is to introduce a task_id that is user-meaningful and orthogonal to session_id. Sessions still exist — they describe a connection, a device, a process lifetime — but they are no longer the unit of memory. A task is the unit of memory, and a session is just one slice of activity that happened to belong to a task.
Concretely:
- A task has a name the user understands ("Acme contract revision," not "conv-7e3f"). The user names it explicitly, or your agent proposes a name from the first turn and lets the user accept or rename.
- A task has a working set: the documents, decisions, open questions, and partial outputs that constitute its in-progress state. This working set is durable across sessions and is the thing your resumption summary is built from.
- A session has a task_id foreign key. When a user reconnects on a new device, the platform looks up open tasks for that user, surfaces them, and asks which one this session continues, or lets the user start a new task explicitly.
- The agent's continuation UX is "you were working on Acme contract revision — pick up where you left off?" with a one-paragraph summary of state. It is not a chronological scroll of "yesterday's chat" that forces the user to re-derive what they were doing.
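The shape described in the list above can be sketched with the standard-library `sqlite3` module. Everything here is an assumption made for illustration: the table and column names, the `open_tasks`, `attach_session`, and `resume_prompt` helpers, and the idea of storing the working set as a single summary string. A real working set would likely be richer than one text column.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tasks (
    task_id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    name TEXT NOT NULL,                       -- user-meaningful name
    status TEXT NOT NULL DEFAULT 'open',
    working_set_summary TEXT NOT NULL DEFAULT ''
);
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    device TEXT NOT NULL,
    task_id TEXT REFERENCES tasks(task_id)    -- NULL until a task is chosen
);
"""


def open_tasks(db, user_id):
    """Return the open tasks to surface when this user reconnects."""
    rows = db.execute(
        "SELECT task_id, name, working_set_summary FROM tasks "
        "WHERE user_id = ? AND status = 'open' ORDER BY name",
        (user_id,),
    )
    return rows.fetchall()


def attach_session(db, session_id, task_id):
    """Record that this session is a slice of activity on the given task."""
    db.execute(
        "UPDATE sessions SET task_id = ? WHERE session_id = ?",
        (task_id, session_id),
    )


def resume_prompt(name, summary):
    """Build the continuation message from task state, not from chat scrollback."""
    return f"You were working on {name}. {summary} Pick up where you left off?"


db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.executescript(SCHEMA)

db.execute(
    "INSERT INTO tasks (task_id, user_id, name, working_set_summary) "
    "VALUES (?, ?, ?, ?)",
    ("task-1", "user-42", "Acme contract revision",
     "Draft v2 is open; clause 4 wording is unresolved."),
)
# The user reconnects from her phone: a new session with no task yet.
db.execute(
    "INSERT INTO sessions (session_id, user_id, device) VALUES (?, ?, ?)",
    ("sess-phone", "user-42", "phone"),
)

candidates = open_tasks(db, "user-42")
task_id, name, summary = candidates[0]   # in a real UI the user picks one
attach_session(db, "sess-phone", task_id)
prompt = resume_prompt(name, summary)
print(prompt)
```

The session row still exists and still describes a device and a connection. The only change is that memory is looked up through `task_id` rather than through the session itself.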
ChatGPT Projects is the closest mainstream pattern that gets this shape right: a project is a durable container, conversations live inside it, and the user can switch devices without losing where they are. It is not perfect, but it demonstrates that a task abstraction above sessions is feasible at consumer scale.
The point is not that everyone needs a Projects clone. The point is that your data model needs an explicit join between sessions and tasks, owned by your team, exposed to your eval suite, and visible in your telemetry. If you do not have that join, you do not have task continuity — you have hope.
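One way to make that join visible in telemetry is a small report over it. The sketch below assumes a hypothetical `tasks`/`sessions` schema with a nullable `task_id` on sessions, plus made-up sample rows. The function names `task_continuity_report` and `unattached_sessions` are illustrative, not part of any library.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tasks (task_id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    device TEXT NOT NULL,
    task_id TEXT REFERENCES tasks(task_id)
);
INSERT INTO tasks VALUES
    ('t1', 'Acme contract revision'),
    ('t2', 'Vendor onboarding');
INSERT INTO sessions VALUES
    ('s1', 'desktop', 't1'),
    ('s2', 'phone',   't1'),
    ('s3', 'desktop', 't1'),
    ('s4', 'desktop', 't2'),
    ('s5', 'phone',   NULL);
""")


def task_continuity_report(db):
    """Per task: how many sessions and how many distinct devices touched it."""
    return db.execute("""
        SELECT t.name,
               COUNT(s.session_id)      AS sessions,
               COUNT(DISTINCT s.device) AS devices
        FROM tasks t
        LEFT JOIN sessions s ON s.task_id = t.task_id
        GROUP BY t.task_id
        ORDER BY t.name
    """).fetchall()


def unattached_sessions(db):
    """Sessions with no task: candidates for continuations nobody stitched."""
    return db.execute(
        "SELECT COUNT(*) FROM sessions WHERE task_id IS NULL"
    ).fetchone()[0]


report = task_continuity_report(db)
orphans = unattached_sessions(db)
for name, sessions, devices in report:
    print(f"{name}: {sessions} sessions on {devices} devices")
print(f"sessions with no task: {orphans}")
```

Tasks that span several sessions or devices are where continuity is being exercised. Sessions with no task at all are worth sampling by hand, since some of them may be the silent restarts described earlier.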
Cross-Device Continuity Is an Authorization Problem in Disguise
The moment you commit to durable, cross-device tasks, the auth model you built for ephemeral sessions stops generalizing.
A short-lived session-bound credential is fine when the user, the device, and the conversation all live and die together. The contract is: this token authorizes this connection; lose the connection, lose the token. But a task that survives across devices needs a different contract: this token authorizes resuming this task from a different device, possibly without the original cookie, possibly after a reboot, possibly while the user is also actively connected from another device.
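The difference between the two contracts can be sketched as a credential whose claims name a user and a task rather than a connection. This is a toy built on standard-library HMAC signing. The names `issue_resume_grant` and `verify_resume_grant`, the claim fields, and the hard-coded secret are all assumptions for illustration. It deliberately omits device binding and any user-facing authorization step, so it should not be read as a secure design.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-secret"  # assumption: real systems use a managed key


def issue_resume_grant(user_id, task_id, ttl_seconds, now=None):
    """Sign a grant that authorizes resuming one task, on any connection."""
    now = time.time() if now is None else now
    claims = {"user_id": user_id, "task_id": task_id, "exp": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(sig).decode())


def verify_resume_grant(token, user_id, task_id, now=None):
    """Check the signature, the expiry, and that the grant names this task."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False  # tampered or signed with another key
    claims = json.loads(payload)
    return (claims["user_id"] == user_id
            and claims["task_id"] == task_id
            and claims["exp"] > now)


grant = issue_resume_grant("user-42", "task-1", ttl_seconds=300, now=1000.0)

same_task = verify_resume_grant(grant, "user-42", "task-1", now=1100.0)
other_task = verify_resume_grant(grant, "user-42", "task-2", now=1100.0)
expired = verify_resume_grant(grant, "user-42", "task-1", now=2000.0)
print(same_task, other_task, expired)
```

Nothing in the grant refers to a cookie, a socket, or a process, which is what lets it survive a device switch. That same property is why a production design would need to add the device binding and transfer authorization this sketch leaves out.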
Recent work on cross-device flows has pushed this into a concrete protocol space. The IETF's Cross-Device Flows BCP draft formalizes session-transfer flows where a user authorizes the transfer on an Authorization Device and then consumes the session from a Consumption Device, with state preserved across the boundary. Device-Bound Session Credentials (DBSC) bind a session cookie to a device-held private key so it cannot be silently lifted. Both matter because they show that cross-device resumption runs into the same constraints the auth-protocol world has been wrestling with, only now on top of a durable task store full of in-progress agent state, including draft contracts and partially executed tool calls.
Two practical consequences:
- https://docs.langchain.com/oss/python/langgraph/persistence
- https://openai.github.io/openai-agents-python/sessions/
- https://platform.claude.com/docs/en/agent-sdk/sessions
- https://help.openai.com/en/articles/10169521-using-projects-in-chatgpt
- https://openai.com/index/memory-and-new-controls-for-chatgpt/
- https://datatracker.ietf.org/doc/html/draft-ietf-oauth-cross-device-security-15
- https://w3c.github.io/webappsec-dbsc/
- https://arxiv.org/html/2602.16313
- https://arxiv.org/html/2507.05257v1
- https://google.github.io/adk-docs/runtime/resume/
- https://temporal.io/blog/building-a-persistent-conversational-ai-chatbot-with-temporal
- https://www.langchain.com/conceptual-guides/runtime-behind-production-deep-agents
