5 posts tagged with "rag"

A Year of Building with LLMs: What the Field Has Actually Learned

· 9 min read
Tian Pan
Software Engineer

Most teams building with LLMs today are repeating mistakes that others made a year ago. The most expensive one is mistaking the model for the product.

After a year of LLM-powered systems shipping into production — codegen tools, document processors, customer-facing assistants, internal knowledge systems — practitioners have accumulated a body of hard-won knowledge that's very different from what the hype cycle suggests. The lessons aren't about which foundation model to choose or whether RAG beats finetuning. They're about the unglamorous work of building reliable systems: how to evaluate output, how to structure workflows, when to invest in infrastructure versus when to keep iterating on prompts, and how to think about differentiation.

This is a synthesis of what that field experience actually shows.

Beyond RAG: Hybrid Search, Agentic Retrieval, and the Database Design Decisions That Actually Matter

· 8 min read
Tian Pan
Software Engineer

Most teams ship RAG and call it a retrieval strategy. They chunk documents, embed them, store the vectors, and run nearest-neighbor search at query time. It works well enough in demos. In production, users start reporting that the system can't find an article they know exists, misses error codes that appear verbatim in the docs, or returns passages that are semantically similar but factually wrong.

The problem isn't RAG. The problem is treating retrieval as a one-dimensional problem when it's always been multi-dimensional.
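
One way to picture retrieval along more than one dimension is to fuse a vector-search ranking with a keyword-search ranking. The sketch below uses reciprocal rank fusion, which is a common choice but not necessarily the approach the post takes. The document ids, the example query, and the constant k=60 are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so a document ranked well by either retriever can rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for a query such as "error E1234 on login".
# The vector retriever ranks the exact-match doc last; the keyword
# retriever ranks it first because the error code appears verbatim.
vector_hits = ["doc_auth_overview", "doc_login_faq", "doc_e1234"]
keyword_hits = ["doc_e1234", "doc_login_faq"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

In this toy case the document containing the verbatim error code moves from last place in the vector ranking to first place in the fused ranking, which is the kind of miss that embedding-only retrieval tends to produce.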

Hard-Won Lessons from Shipping LLM Systems to Production

· 7 min read
Tian Pan
Software Engineer

Most engineers building with LLMs share a common arc: a working demo in two days, production chaos six weeks later. The technology behaves differently under real load, with real users, against real data. The lessons that emerge aren't philosophical—they're operational.

After watching teams across companies ship (and sometimes abandon) LLM-powered products, a handful of patterns appear again and again. These aren't edge cases. They're the default experience.

Seven Patterns for Building LLM Systems That Actually Work in Production

· 10 min read
Tian Pan
Software Engineer

The demo always works. Prompt the model with a curated example, get a clean output, ship the screenshot to the stakeholder deck. Six weeks later, the system is in front of real users, and none of the demo examples appear in production traffic.

This is the gap every LLM product team eventually crosses: the jump from "it works on my inputs" to "it works on inputs I didn't anticipate." The patterns that close that gap aren't about model selection or prompt cleverness — they're about system design. Seven patterns account for most of what separates functional prototypes from reliable production systems.

Common Pitfalls When Building Generative AI Applications

· 10 min read
Tian Pan
Software Engineer

Most generative AI projects fail — not because the models are bad, but because teams make the same predictable mistakes at every layer of the stack. A 2025 industry analysis found that 42% of companies abandoned most of their AI initiatives, and 95% of generative AI pilots yielded no measurable business impact. These aren't model failures. They're engineering and product failures that teams could have avoided.

This post catalogs the pitfalls that kill AI projects most reliably — from problem selection through evaluation — with specific examples from production systems.