2 posts tagged with "databases"

Why Your Agent Needs a Read Replica: Read/Write Splitting for Agent Memory

· 10 min read
Tian Pan
Software Engineer

Most agent memory is one undifferentiated store. The loop reads from it to assemble context at the start of every step, and writes to it after every action — new observations, running summaries, scratchpad edits. Same store, same access path, no separation. It works fine in a demo and starts to rot the moment the agent runs long enough for the store to get large.

The reason it rots is familiar to anyone who has scaled a database. A single store that serves both reads and writes is a single-primary database with no replica, and it inherits every problem that topology has under load: writes contend with reads, a half-written record gets read mid-update, and there is no isolation between the volatile working set and the durable record. We solved this for databases decades ago by splitting reads from writes. Agent memory deserves the same treatment.

The fix is not a bigger vector index or a smarter embedding model. It is an architectural one — recognizing that "memory" is two different workloads wearing the same name, and giving each the storage discipline it actually needs.
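One way to picture the split is a store where writes go to an append-only log and reads come from an immutable snapshot that is swapped in only when a write batch is complete. The sketch below is a minimal illustration under that assumption; `SplitMemory`, `write`, `publish`, and `read` are hypothetical names chosen for this example, not an API from the post.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SplitMemory:
    """Hypothetical agent memory with separate write and read paths."""

    _log: List[str] = field(default_factory=list)  # write path: append-only
    _snapshot: Tuple[str, ...] = ()                # read path: immutable view

    def write(self, record: str) -> None:
        # Writes only touch the log; readers do not see them until publish().
        self._log.append(record)

    def publish(self) -> None:
        # Replace the snapshot in a single assignment, so a reader gets
        # either the old view or the new one, never a partial update.
        self._snapshot = tuple(self._log)

    def read(self) -> Tuple[str, ...]:
        # Context assembly reads the last published snapshot only.
        return self._snapshot


memory = SplitMemory()
memory.write("observation: build failed")
context_before = memory.read()   # nothing published yet
memory.publish()
context_after = memory.read()    # now includes the observation
memory.write("scratchpad: retry with cache disabled")
context_unchanged = memory.read()  # unpublished write stays invisible
```

The point of the design is that the volatile working set (the log) can churn between steps without changing what the context-assembly path sees.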

Text-to-SQL in Production: Why Correct SQL Is the Easy Part

· 10 min read
Tian Pan
Software Engineer

GPT-4o scores 86.6% on the Spider benchmark. Deploy it against your actual data warehouse and you might get 10%. That gap is not a rounding error. It is the entire problem. The queries behind the missing 76-odd percentage points execute without errors, return rows with the correct schema, and are completely wrong.

Text-to-SQL is not a syntax problem. Every serious production deployment discovers the same uncomfortable truth: the hard failures are silent ones. A query that scans a 10 TB Snowflake table, reports revenue 30% too high because a join duplicates rows, or quietly bypasses row-level security looks identical to a correct query from the outside. It finishes, it returns data, and nobody flags it.
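The join failure is easy to reproduce on a toy scale. The sketch below uses Python's built-in `sqlite3` with a hypothetical `orders` and `shipments` schema (made up for this example, not taken from any real warehouse): both queries run cleanly and return a single number, but the joined one counts an order once per shipment.

```python
import sqlite3

# In-memory database with an illustrative schema (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    -- Order 1 shipped in two parcels, so it matches two shipment rows.
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Revenue computed from the orders table alone.
correct_revenue = conn.execute(
    "SELECT SUM(amount) FROM orders"
).fetchone()[0]

# Same aggregate after a join: order 1 is repeated once per parcel
# before the SUM, so its amount is counted twice.
inflated_revenue = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
""").fetchone()[0]

print(correct_revenue, inflated_revenue)
```

Neither query raises an error or returns an unexpected shape, which is why this class of bug tends to survive until someone reconciles the number against another source.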

This post covers the failure modes that actually bite teams in production, and the layered architecture that prevents them.