Skip to main content

102 posts tagged with "llm"

View all tags

Seven Patterns for Building LLM Systems That Actually Work in Production

· 10 min read
Tian Pan
Software Engineer

The demo always works. Prompt the model with a curated example, get a clean output, ship the screenshot to the stakeholder deck. Six weeks later, the system is in front of real users, and none of the demo examples appear in production traffic.

This is the gap every LLM product team eventually crosses: the jump from "it works on my inputs" to "it works on inputs I didn't anticipate." The patterns that close that gap aren't about model selection or prompt cleverness — they're about system design. Seven patterns account for most of what separates functional prototypes from reliable production systems.

Common Pitfalls When Building Generative AI Applications

· 10 min read
Tian Pan
Software Engineer

Most generative AI projects fail — not because the models are bad, but because teams make the same predictable mistakes at every layer of the stack. A 2025 industry analysis found that 42% of companies abandoned most of their AI initiatives, and 95% of generative AI pilots yielded no measurable business impact. These aren't model failures. They're engineering and product failures that teams could have avoided.

This post catalogs the pitfalls that kill AI projects most reliably — from problem selection through evaluation — with specific examples from production systems.

The Agent Evaluation Readiness Checklist

· 9 min read
Tian Pan
Software Engineer

Most teams building AI agents make the same mistake: they start with the evaluation infrastructure before they understand what failure looks like. They instrument dashboards, choose metrics, wire up graders — and then discover their evals are measuring the wrong things entirely. Six weeks in, they have a green scorecard and a broken agent.

The fix is not more tooling. It is a specific sequence of steps that grounds your evaluation in reality before you automate anything. Here is that sequence.

The Anatomy of an Agent Harness

· 9 min read
Tian Pan
Software Engineer

Most engineers building AI agents spend 80% of their time thinking about which model to use and 20% thinking about everything else. That ratio should be flipped. The model is almost interchangeable at this point — the harness is what determines whether your agent actually works in production.

The equation is simple: Agent = Model + Harness. If you're not the model, you're the harness. And the harness is where nearly all the real engineering lives.

Routines and Handoffs: The Two Primitives Behind Every Reliable Multi-Agent System

· 8 min read
Tian Pan
Software Engineer

Most multi-agent systems fail not because the models are wrong, but because the plumbing is leaky. Agents drop context mid-task, hand off to the wrong specialist, or loop indefinitely when they don't know how to exit. The underlying cause is almost always the same: the system was designed around what each agent can do, without clearly defining how work moves between them.

Two primitives fix most of this: routines and handoffs. They're deceptively simple, but getting them right is the difference between a demo that works and a system you can ship.

Building Effective AI Agents: Patterns That Actually Work in Production

· 9 min read
Tian Pan
Software Engineer

Most AI agent projects fail not because the models aren't capable enough — but because the engineers building them reach for complexity before they've earned it. After studying dozens of production deployments, a clear pattern emerges: the teams shipping reliable agents start with the simplest possible system and add complexity only when metrics demand it.

This is a guide to the mental models, patterns, and practical techniques that separate robust agentic systems from ones that hallucinate, loop, and fall apart under real workloads.