Conversational REST: When Your Chat UI Needs Pagination, Filters, and Sort
A user asks your shopping agent for "running shoes under $150 with good arch support." The model dutifully returns twelve options as a wall of bulleted text inside a single chat bubble that overflows the viewport. The user scrolls, loses their place, and types "show me only Asics" — at which point your agent re-runs the entire search instead of filtering the result set it already has. Three turns later, the user is inventing a query language one prompt at a time, and your product feels like a command line wearing a chat-bubble costume.
This is the failure mode I keep watching teams ship. They built a chat product on top of what users actually wanted to be a faceted-search product. The model is fine. The retrieval is fine. The UI is the problem, and it's the wrong shape for the task.
The shortest way I can put it: chat is an input modality, not an output one. The agent's job is to translate user intent into a structured query. The moment the result set is more than three items, the right answer is to render UI, not to keep talking.
Chat is great for ambiguous one-shots and bad at everything else
Chat shines when the user doesn't know how to formulate the question yet. "I want a book like Cryptonomicon but shorter and less violent" is the kind of query that a faceted search box was never going to get right, because the user is still negotiating with their own taste. The model gets to disambiguate, and a paragraph of natural-language reply is the right output shape.
But the tasks users actually do in production aren't ambiguous one-shots. They're the same things people have always done in software: compare five candidate options side by side, filter a long list down to three, sort by recency or price or relevance, and page through a result set that doesn't fit on screen. None of these are conversational tasks. They're spatial tasks. They want a viewport, not a transcript.
When an agent returns twelve search results as a single message, the user has to do the spatial work in their head: hold the items in working memory, mentally tag the ones they want to keep, and remember which ones the agent already mentioned so they don't ask about them again. The chat log is a transcript of a conversation that was supposed to be a list. Every additional turn makes it worse — pagination by re-prompt is the single most common chat anti-pattern I see, and it accumulates context faster than any real workflow can survive.
The Baymard Institute's e-commerce filter research has been making this point for a decade in a different vocabulary: when users have a result set above roughly seven items, they reach for filters and sort before they reach for "more results." The pattern doesn't change because the search box is now an LLM. The user is the same human; the task is the same comparison. What changed is that we stopped giving them the controls.
The "show me the next 10" parody
The second failure mode is subtler. The team notices that one chat bubble can't hold the result set, so they ship pagination — but they ship it as a conversational verb. The user types "show me the next 10," and the agent obligingly sends the next page as another giant bubble. Then "now sort by price." Then "actually, hide the ones over $200."
Every one of those turns is the user inventing a query language one prompt at a time. The product team congratulates themselves on a "natural" interface, but what they shipped is a worse version of a SQL prompt with no autocomplete, no schema hints, and no guarantee that the model will interpret "actually, hide" the same way twice. There is a reason no shipping search product has a "type your filter as a sentence" UI, and it isn't because nobody thought of it.
The deeper problem is that conversational pagination throws away the structured intent the agent already has. The model already knows the candidate set. It already knows the filter axes — price, brand, color, size, rating. It already knows the sort key the user prefers. Re-asking the model on every refinement is the same architectural mistake as re-running a SQL query for every column toggle: the work belongs in the client, against the result set, not in another round trip to the LLM.
What "structured intent" actually looks like
The architectural shift that fixes this is small to describe and big to actually do: the agent emits a structured search specification, the application renders that spec as real UI, and refinement happens against the rendered state, not against another model call.
In practice, the agent's output for a search task isn't "Here are 12 running shoes: 1. ... 2. ...". It's something closer to:
- A query object — the parsed intent: `{ category: "running_shoes", price_max: 150, attributes: ["arch_support"] }`
- A result set — the candidate items, fetched once
- A facet schema — the filter axes the agent thinks are useful for this query (brand, size, weight, drop, terrain), populated with counts
- A default sort — the agent's best guess at what "good" means here (rating × price, or pure relevance)
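Concretely, that spec can be typed as a small object. This is an illustrative sketch only; the names (`SearchSpec`, `Facet`, and so on) are mine, not taken from any of the protocols mentioned below, and the fields mirror the four bullets above.

```ts
// Illustrative types for the agent's structured output. Names are
// hypothetical, not from A2UI, AG-UI, MCP-UI, or the AI SDK.
interface SearchQuery {
  category: string;
  price_max?: number;
  attributes?: string[];
}

interface Item {
  id: string;
  title: string;
  price: number;
  attributes: Record<string, string | number>; // brand, weight, drop, ...
}

interface FacetValue {
  value: string;
  count: number; // how many items in the result set carry this value
}

interface Facet {
  key: string;   // e.g. "brand", "weight", "drop"
  label: string; // display name for the filter chip
  values: FacetValue[];
}

interface SortSpec {
  key: string; // e.g. "rating", "price"
  direction: "asc" | "desc";
}

interface SearchSpec {
  query: SearchQuery;    // the parsed intent
  results: Item[];       // candidate items, fetched once
  facets: Facet[];       // filter axes the agent judged relevant for this query
  defaultSort: SortSpec; // the agent's best guess at what "good" means here
  explanation?: string;  // conversational gloss rendered alongside the list
}

// The parsed intent from the opening example:
const shoeQuery: SearchQuery = {
  category: "running_shoes",
  price_max: 150,
  attributes: ["arch_support"],
};
```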
The chat thread renders a normal product list with normal filter chips and normal sort controls, alongside whatever conversational explanation the agent wants to add ("I prioritized models with above-average arch support reviews"). When the user toggles a filter, the application updates the rendered state — same JavaScript a regular e-commerce app would run. The agent only gets re-invoked when the user actually says something new in the conversational channel: a new constraint, a clarifying question, a request for a recommendation among the visible options.
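Refinement against the rendered state is then ordinary client code. A minimal sketch, reusing the types above: filter toggles and sort changes run against the cached result set, and nothing here touches the model.

```ts
// Active filters per facet key; an empty set means "no constraint on this axis".
type ActiveFilters = Record<string, Set<string>>;

// Numeric sort-key lookup; string axes would need localeCompare instead.
function sortValue(item: Item, key: string): number {
  if (key === "price") return item.price;
  const v = item.attributes[key];
  return typeof v === "number" ? v : 0;
}

// Runs on every filter-chip toggle or sort change. No model call, no new
// retrieval: just the result set the agent already returned.
function applyViewState(spec: SearchSpec, filters: ActiveFilters, sort: SortSpec): Item[] {
  const dir = sort.direction === "asc" ? 1 : -1;
  return spec.results
    .filter((item) =>
      Object.entries(filters).every(
        ([key, selected]) =>
          selected.size === 0 || selected.has(String(item.attributes[key]))
      )
    )
    .sort((a, b) => dir * (sortValue(a, sort.key) - sortValue(b, sort.key)));
}
```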
This is what the generative-UI work — Google's A2UI, AG-UI, MCP-UI, Vercel's AI SDK generative UI — has been converging on for the last year. The protocols differ, but the shape of the answer is identical: agents emit a structured UI spec, the host renders real components, and the model stops being responsible for layout. The argument I'd make is that you don't need any of those frameworks to do this — you need to stop treating the model's reply as the surface and start treating it as the spec.
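You don't even need a UI protocol for the spec-emission half; any structured-output facility will do. As one hedged example, here is roughly what it looks like with the AI SDK's `generateObject` and a zod schema, which is a trimmed stand-in for the `SearchSpec` above; the model choice is arbitrary, and in this particular split the application, not the model, runs retrieval from the emitted query.

```ts
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Trimmed stand-in for SearchSpec: the model emits intent, facet axes, and a
// default sort; the application fetches the result set from the query.
const searchSpecSchema = z.object({
  query: z.object({
    category: z.string(),
    price_max: z.number().optional(),
    attributes: z.array(z.string()).optional(),
  }),
  facets: z.array(z.object({ key: z.string(), label: z.string() })),
  defaultSort: z.object({
    key: z.string(),
    direction: z.enum(["asc", "desc"]),
  }),
  explanation: z.string(),
});

const { object: spec } = await generateObject({
  model: openai("gpt-4o"),
  schema: searchSpecSchema,
  prompt: "running shoes under $150 with good arch support",
});
// spec.query drives retrieval; spec.facets and spec.defaultSort drive the UI.
```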
Faceted search is the lost ancestor here
The pattern teams keep accidentally reinventing is faceted search, and the irony is thick: the e-commerce industry solved exactly this problem in the early 2000s. Facets are filters dynamically generated from the result set. Counts. Ranges. Toggles. Active-filter chips at the top. One-tap removal. The whole design vocabulary exists, has shipping examples on every retail site you've ever used, and is being silently re-derived — badly — inside chat threads.
The hybrid that's emerging in 2026 is genuinely interesting. Instead of "natural language or facets," the right model is "natural language populates facets." The agent reads the user's prose query, runs hybrid retrieval — vector for semantic intent, keyword for exact matches, structured for filterable attributes — and surfaces the facets that this query actually has variance along. Search the corpus for "lightweight trail shoes" and the facets are weight, drop, lug depth, brand. Search for "office sneakers under $100" and the facets are color, brand, leather/synthetic, price bucket. The agent isn't picking from a static facet list; it's generating the relevant axes per query.
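The facet-population half is mechanical once the agent has proposed candidate axes for the query. A sketch, reusing the `Item` and `Facet` types from earlier: the result set supplies the counts, and axes the items don't actually vary along get dropped.

```ts
// Keep only the axes with real variance in this result set, and populate
// each surviving axis with value counts for the filter chips.
function buildFacets(items: Item[], candidateKeys: string[]): Facet[] {
  return candidateKeys.flatMap((key) => {
    const counts = new Map<string, number>();
    for (const item of items) {
      const v = item.attributes[key];
      if (v === undefined) continue;
      const s = String(v);
      counts.set(s, (counts.get(s) ?? 0) + 1);
    }
    // An axis with zero or one distinct value can't filter anything; drop it.
    if (counts.size < 2) return [];
    return [{
      key,
      label: key.replace(/_/g, " "),
      values: [...counts.entries()].map(([value, count]) => ({ value, count })),
    }];
  });
}
```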
Done well, this collapses the false choice between "the user types in natural language" and "the user clicks filters." Both are happening, in the same query, in the same UI. The agent translates loose intent into a tight query and gives the user the controls to refine it. Elastic's recent work on AI-assisted facet generation and Google's faceted-vector hybrid retrieval pieces are the canonical references, but the pattern is older than the LLM era — it's just being pulled forward into agent products by teams who realize their chat is overflowing.
A decision rule for when chat should stop talking
The heuristic I give teams when they're auditing their own agent UI is brutally short:
- If the answer is a sentence, render it as a sentence. A summary, an explanation, a recommendation among a small set, a yes/no with a reason. Chat is the right surface.
- If the answer is one item, render it as a card. The card has the metadata, the actions, and a chat slot underneath for follow-up. Don't bury structured data in prose.
- If the answer is two or three items, render them as a small comparison view. Side by side, same fields visible, agent's recommended pick called out. Still tractable in working memory.
- If the answer is four or more items, render a list with filters and sort. The agent emits the query and facet schema; the application renders the result set; refinement happens against the rendered state.
- If the answer is a process — a multi-step decision, a workflow, a transaction — render the workflow. The agent shouldn't be conducting a checkout via chat turns; it should produce the cart and let the application's checkout UI take over.
The number "three" is not magic. It's roughly where most users stop being able to hold the option set in working memory and start wanting to compare. Pick your own threshold from your retention data, but pick one — the failure mode I see is that the threshold is implicitly infinite, and the chat thread becomes the dumping ground for any result set the model can serialize.
Why teams resist this
Two reasons, both bad.
The first is that chat is cheap to ship. A single text bubble works on every device, doesn't need design specs, doesn't need a component library, and doesn't need anyone to argue about which fields to show. The team that stops at "the model returned a list, we rendered the list as a paragraph" is making a velocity choice, not a UX one. The cost shows up later as low retention on the workflows users actually came for.
The second is that chat feels like the differentiated thing. The product team is afraid that if the result set is rendered as a normal e-commerce list with normal filters, the user won't realize there's an AI involved. This is the wrong fear. The user came for the task, not for the technology. The chat surface should be the input — where the agent helps the user formulate the query, disambiguate intent, and explain trade-offs — and the structured surface should be the output, where the actual work happens. The "AI-ness" is in the quality of the query and the relevance of the facets, not in the chat bubble.
What to instrument
If you want to find out whether your product is in this trap, the cheapest diagnostic is to look at session traces and count, per session:
- Median number of items in the largest single agent reply
- Frequency of follow-up turns that look like manual filtering ("show me only X," "without Y," "sort by Z")
- Drop-off rate after the third such filter turn
- Time from first query to the user clicking on a final item
If your largest replies are over five items and your filter-by-conversation rate is non-trivial, you're shipping conversational REST and your retention is paying for it. The fix isn't a better model. It's a UI that knows when to stop being a chat.
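A rough per-session pass for those numbers, assuming a trace of turns with role, text, a timestamp, and an item count on agent replies; the filter-intent regex is a crude stand-in for whatever classifier you'd actually use.

```ts
// Hypothetical trace shape; adapt to whatever your logging actually emits.
interface Turn {
  role: "user" | "agent";
  text: string;
  timestampMs: number;
  itemCount?: number;         // items serialized in an agent reply
  clickedFinalItem?: boolean; // terminal selection event, if any
}

const FILTER_INTENT = /\b(only|without|exclude|sort by|under|over|next \d+)\b/i;

function diagnose(session: Turn[]) {
  const largestReply = Math.max(
    0,
    ...session.filter((t) => t.role === "agent").map((t) => t.itemCount ?? 0)
  );
  const filterTurns = session.filter(
    (t) => t.role === "user" && FILTER_INTENT.test(t.text)
  ).length;
  const start = session[0]?.timestampMs ?? 0;
  const finalClick = session.find((t) => t.clickedFinalItem)?.timestampMs;
  return {
    largestReply,       // items in the largest single agent reply
    filterTurns,        // follow-ups that look like manual filtering
    droppedAfterFilters: filterTurns >= 3 && finalClick === undefined,
    msToFinalClick: finalClick === undefined ? null : finalClick - start,
  };
}
// Aggregate medians of these across sessions to get the numbers above.
```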
The realization that lands
The agent's job is to translate user intent into a structured query — not to be the entire interface. Once you accept that, the architecture follows: the model emits a query and a result set, the application renders that result set as real UI, and the chat thread becomes the place where the user negotiates intent, not the place where the user does spatial work.
The teams getting this right in 2026 aren't shipping prettier chat bubbles. They're shipping products where the chat is one channel and the rendered workspace is another, where the agent fills both, and where the user never has to say "show me the next 10." They just scroll.
- https://developers.googleblog.com/introducing-a2ui-an-open-project-for-agent-driven-interfaces/
- https://www.copilotkit.ai/blog/the-state-of-agentic-ui-comparing-ag-ui-mcp-ui-and-a2ui-protocols
- https://www.copilotkit.ai/blog/the-developer-s-guide-to-generative-ui-in-2026
- https://www.copilotkit.ai/blog/generative-ui-explained-how-agents-now-ship-their-own-interfaces
- https://ai-sdk.dev/docs/ai-sdk-ui/generative-user-interfaces
- https://docs.ag-ui.com/introduction
- https://www.vellum.ai/blog/mcp-ui-and-the-future-of-agentic-commerce
- https://medium.com/google-cloud/faceted-filtering-meets-vector-search-building-a-dynamic-hybrid-retail-experience-with-alloydb-d28434ee94ef
- https://www.elastic.co/search-labs/blog/faceted-search-examples-ai
- https://baymard.com/learn/ecommerce-filter-ui
- https://www.smashingmagazine.com/2025/07/design-patterns-ai-interfaces/
- https://uxdesign.cc/where-should-ai-sit-in-your-ui-1710a258390e
