3 posts tagged with "agent-design"

"Done!" Is Not a Return Code: Why Agent Completion Needs a Structured Signal

· 10 min read
Tian Pan
Software Engineer

An agent ends its turn with "All done — let me know if you want any changes!" and your orchestrator has to decide whether to mark the ticket resolved, kick off the next handoff, or retry. That sentence is not a return code. It is a polite closing line trained to sound reassuring at the end of a chat, and every line of automation downstream of it inherits the ambiguity. The teams that treat this as a parsing problem write regexes that catch \b(done|complete|finished)\b and call it a day. The teams whose agents run in production eventually learn that completion is an event, not a mood.

The failure mode is bimodal and boring. Either the agent announces done when it isn't — premature termination — and the orchestrator happily advances the workflow on a half-finished artifact. Or the agent is actually done, but phrases it in a way that doesn't match the detector ("I went ahead and landed the change, though the test for the edge case is still flaky"), and the orchestrator spins up a retry that re-does the work, duplicates the side effect, and sometimes contradicts the successful first pass. Both modes degrade silently. Neither shows up in a dashboard until someone reads a trace and notices that the agent said "I think that covers it" and the billing system treated that as a commit.
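Both halves of that failure are easy to reproduce with the regex detector described above (the pattern is quoted from the post; the example strings are illustrative):

```python
import re

# The naive completion detector: match "done", "complete", or "finished".
DONE_RE = re.compile(r"\b(done|complete|finished)\b", re.IGNORECASE)

# Mode 1 — false positive: the agent says "Done!" prematurely,
# and the orchestrator advances on a half-finished artifact.
premature = "Done! Well, mostly. The migration script still needs review."
assert DONE_RE.search(premature) is not None

# Mode 2 — false negative: the work actually finished, but the
# phrasing never matches, so the orchestrator retries and duplicates
# the side effect.
finished = "I went ahead and landed the change, though the test for the edge case is still flaky"
assert DONE_RE.search(finished) is None
```

No amount of pattern-tuning closes both gaps at once: widening the regex buys more false positives, narrowing it buys more false negatives.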

The fix is not smarter parsing. It is giving the agent a structured way to terminate — a done-tool with an enumerated status, a reason code, and a handle your pipeline can route on — and changing the orchestrator to wait for that event instead of listening to the chat stream.
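As a sketch of what that looks like (the enum values, the `done` tool schema, and the `on_agent_event` orchestrator hook are illustrative, not the post's exact API):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class DoneStatus(Enum):
    SUCCESS = "success"
    PARTIAL = "partial"        # work landed, but with caveats (e.g. a flaky test)
    FAILED = "failed"
    NEEDS_INPUT = "needs_input"

@dataclass
class DoneEvent:
    status: DoneStatus
    reason_code: str                      # machine-routable, e.g. "tests_flaky"
    artifact_ref: Optional[str] = None    # handle the pipeline can route on

# Tool schema the agent is given. Calling this tool is the ONLY way to end a turn;
# the orchestrator never infers completion from chat text.
DONE_TOOL = {
    "name": "done",
    "parameters": {
        "type": "object",
        "properties": {
            "status": {"enum": [s.value for s in DoneStatus]},
            "reason_code": {"type": "string"},
            "artifact_ref": {"type": ["string", "null"]},
        },
        "required": ["status", "reason_code"],
    },
}

def on_agent_event(event: dict) -> Optional[DoneEvent]:
    """Orchestrator side: advance the workflow only on a structured done event."""
    if event.get("tool") != "done":
        return None  # chat text, partial output, politeness: all ignored
    args = event["arguments"]
    return DoneEvent(
        status=DoneStatus(args["status"]),
        reason_code=args["reason_code"],
        artifact_ref=args.get("artifact_ref"),
    )
```

The point of the enum is that "landed the change but the test is flaky" becomes `PARTIAL` with `reason_code="tests_flaky"`, which the pipeline can route deliberately instead of guessing from a closing pleasantry.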

The Intent Gap: When Your LLM Answers the Wrong Question Perfectly

· 9 min read
Tian Pan
Software Engineer

Intent misalignment is the single largest failure category in production LLM systems — responsible for 32% of all dissatisfactory responses, according to a large-scale analysis of real user interactions. It's not hallucination, not refusal, not format errors. It's models answering a question correctly while missing entirely what the user actually needed.

This is the intent gap: the distance between what a user says and what they mean. It's invisible to most eval suites, invisible to error logs, and invisible to the users themselves until they've wasted enough cycles to realize the output was technically right but practically useless.

Routines and Handoffs: The Two Primitives Behind Every Reliable Multi-Agent System

· 8 min read
Tian Pan
Software Engineer

Most multi-agent systems fail not because the models are wrong, but because the plumbing is leaky. Agents drop context mid-task, hand off to the wrong specialist, or loop indefinitely when they don't know how to exit. The underlying cause is almost always the same: the system was designed around what each agent can do, without clearly defining how work moves between them.

Two primitives fix most of this: routines and handoffs. They're deceptively simple, but getting them right is the difference between a demo that works and a system you can ship.
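A minimal sketch of the two primitives, in the style of lightweight agent frameworks (the `Agent`, `Handoff`, and `run_turn` names are assumptions for illustration): a routine is the step-by-step instruction set an agent follows, and a handoff is a tool call whose return value tells the orchestrator to swap the active agent while carrying the conversation forward.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    routine: str  # the step-by-step instructions this agent follows

@dataclass
class Handoff:
    """Returned by a handoff tool: transfer control to another specialist."""
    target: Agent

triage = Agent(
    name="triage",
    routine="1. Classify the request. 2. Hand off to the matching specialist.",
)
billing = Agent(
    name="billing",
    routine="1. Look up the invoice. 2. Resolve the issue or escalate.",
)

def transfer_to_billing() -> Handoff:
    # Exposed to the triage agent as an ordinary tool; calling it is the
    # explicit exit from triage's routine.
    return Handoff(target=billing)

def run_turn(active: Agent, tool_result: object) -> Agent:
    """Orchestrator loop: a Handoff swaps the active agent (context travels
    with the conversation); any other result keeps the current agent."""
    if isinstance(tool_result, Handoff):
        return tool_result.target
    return active
```

Because the handoff is an explicit, typed event rather than a phrase in the transcript, an agent that doesn't know how to exit has a defined way out, and work can't silently loop or land on the wrong specialist.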