4 posts tagged with "agent-memory"

Amortizing Context: Persistent Agent Memory vs. Long-Context Windows

· 9 min read
Tian Pan
Software Engineer

When 1 million-token context windows became commercially available, a lot of teams quietly decided they'd solved agent memory. Why build a retrieval system, manage a vector database, or design an eviction policy when you can just dump everything in and let the model sort it out? The answer comes back in your infrastructure bill. At 10,000 daily interactions with a 100k-token knowledge base, the brute-force in-context approach costs roughly $5,000/day. A retrieval-augmented memory system handling the same load costs around $333/day — a 15x gap that compounds as your user base grows.
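The gap follows directly from token volume. A minimal back-of-envelope sketch, assuming a representative price of $5 per million input tokens and roughly 6,700 retrieved tokens per query (both figures are illustrative assumptions, not quoted from any provider):

```python
# Back-of-envelope cost comparison (illustrative; assumes a
# representative price of $5 per million input tokens).
PRICE_PER_TOKEN = 5 / 1_000_000
INTERACTIONS_PER_DAY = 10_000

# Brute force: the full 100k-token knowledge base in every prompt.
full_context_tokens = 100_000
brute_force_daily = INTERACTIONS_PER_DAY * full_context_tokens * PRICE_PER_TOKEN

# Retrieval-augmented: only the top-k relevant chunks (~6,700 tokens here).
retrieved_tokens = 6_700
rag_daily = INTERACTIONS_PER_DAY * retrieved_tokens * PRICE_PER_TOKEN

print(f"brute force: ${brute_force_daily:,.0f}/day")  # $5,000/day
print(f"retrieval:   ${rag_daily:,.0f}/day")          # $335/day
print(f"ratio:       {brute_force_daily / rag_daily:.0f}x")  # 15x
```

Whatever the actual per-token price, it multiplies both sides equally, so the ratio is set by context size alone: retrieving 1/15th of the tokens costs 1/15th as much.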

The real problem isn't just cost. It's that longer contexts produce measurably worse answers. Research consistently shows that models lose track of information positioned in the middle of very long inputs, accuracy drops predictably when relevant evidence is buried among irrelevant chunks, and latency climbs in ways that make interactive agents feel broken. The "stuff everything in" approach doesn't just waste money — it trades accuracy for the illusion of simplicity.

Agent Memory Garbage Collection: Engineering Strategic Forgetting at Scale

· 10 min read
Tian Pan
Software Engineer

Every production agent team eventually builds the same thing: a memory store that grows without bound, retrieval that degrades silently, and a frantic sprint to add forgetting after users report that the agent is referencing their old job, a deprecated API, or a project that was cancelled three months ago. The industry has poured enormous effort into giving agents memory. The harder engineering problem — garbage collecting that memory — is where the real production reliability lives.

The parallel to software garbage collection is more than metaphorical. Agent memory systems face the same fundamental tension: you need to reclaim resources (context budget, retrieval relevance) without destroying data that's still reachable (semantically relevant to future queries). The algorithms that solve this look surprisingly similar to the ones your runtime already uses.
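To make the analogy concrete, here is a minimal mark-and-sweep sketch over an agent memory store. The `Memory` class, the edge structure, and the choice of "recently retrieved" as the root set are all illustrative assumptions, not the post's actual design:

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    id: str
    refs: list = field(default_factory=list)  # ids of related memories
    marked: bool = False

def mark_and_sweep(store: dict, roots: list) -> dict:
    """Mark memories reachable from recently retrieved roots, sweep the rest.

    Mirrors tracing GC: 'reachable' here means linked, directly or
    transitively, to a memory the agent recently used.
    """
    # Mark phase: depth-first walk from the roots.
    stack = list(roots)
    while stack:
        mem = store.get(stack.pop())
        if mem and not mem.marked:
            mem.marked = True
            stack.extend(mem.refs)
    # Sweep phase: evict everything unreachable.
    return {mid: m for mid, m in store.items() if m.marked}

store = {
    "job": Memory("job", refs=["team"]),
    "team": Memory("team"),
    "old_api": Memory("old_api"),  # nothing recent points here
}
survivors = mark_and_sweep(store, roots=["job"])
print(sorted(survivors))  # ['job', 'team']
```

The stale `old_api` memory is swept not because it is old, but because nothing the agent still cares about references it, which is exactly how a tracing collector decides.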

The Forgetting Problem: When Unbounded Agent Memory Degrades Performance

· 9 min read
Tian Pan
Software Engineer

An agent that remembers everything eventually remembers nothing useful. This sounds like a paradox, but it's the lived experience of every team that has shipped a long-running AI agent without a forgetting strategy. The memory store grows, retrieval quality degrades, and one day your agent starts confidently referencing a user's former employer, a deprecated API endpoint, or a project requirement that was abandoned six months ago.

The industry has spent enormous energy on giving agents memory. Far less attention has gone to the harder problem: teaching agents what to forget.

Graph Memory for LLM Agents: The Relational Blind Spots That Flat Vectors Miss

· 10 min read
Tian Pan
Software Engineer

A customer service agent knows that the user prefers morning delivery. It also knows the user's primary address is in Seattle. What it cannot figure out is that the Seattle address is a work address used only on weekdays, and the morning delivery window does not apply there on Mondays because of a building restriction the user mentioned three months ago. Each fact is retrievable in isolation. The relationship between them is not.

This is the failure mode that bites production agents working from flat vector stores. Each piece of information exists as an embedding floating in high-dimensional space. Similarity search retrieves facts that match a query. It does not recover the structural connections between facts — the edges that give them meaning in combination.

Most agent memory architectures are built around vector databases because they are fast, simple to set up, and work well for the majority of retrieval tasks. The failure cases are subtle enough that they often survive into production before anyone notices the pattern.