Skip to main content

2 posts tagged with "vector-search"

View all tags

Memory Architectures for Production AI Agents

· 10 min read
Tian Pan
Software Engineer

Most teams add memory to their agents as an afterthought — usually after a user complains that the agent forgot something it was explicitly told three sessions ago. At that point, the fix feels obvious: store conversations somewhere and retrieve them later. But this intuition leads to systems that work in demos and fall apart in production. The gap between a memory system that stores things and one that reliably surfaces the right things at the right time is where most agent projects quietly fail.

Memory architecture is not a peripheral concern. For any agent handling multi-session interactions — customer support, coding assistants, research tools, voice interfaces — memory is the difference between a stateful assistant and a very expensive autocomplete. Getting it wrong doesn't produce crashes; it produces agents that feel subtly broken, that contradict themselves, or that confidently repeat outdated information the user corrected two weeks ago.

Beyond RAG: Hybrid Search, Agentic Retrieval, and the Database Design Decisions That Actually Matter

· 8 min read
Tian Pan
Software Engineer

Most teams ship RAG and call it a retrieval strategy. They chunk documents, embed them, store the vectors, and run nearest-neighbor search at query time. It works well enough in demos. In production, users start reporting that the system can't find an article they know exists, misses error codes verbatim in the docs, or returns semantically similar but factually wrong passages.

The problem isn't RAG. The problem is treating retrieval as a one-dimensional problem when it's always been multi-dimensional.