3 posts tagged with "migrations"

Per-Vector Version Tags: The Missing Column Behind Every Embedding Migration

10 min read
Tian Pan
Software Engineer

A new embedding model lands. The benchmark numbers are 4% better. A staff engineer files the ticket: "Upgrade embeddings to v3." Two weeks later the index has been re-embedded, the alias has been swapped, and the team has shipped the change behind a feature flag. Six weeks later, support tickets pile up. Search results "feel off." A retro is scheduled. Nobody can explain what regressed because nothing crashed and every dashboard is green.

The problem is not the model swap. The problem is that the vector store has no idea which vectors came from which model. There is no column for it. There is no migration table tracking which records have been backfilled. There is no alembic_version row, no schema_migrations table, no pg_dump of the previous state. The team treated an embedding upgrade like a config flip, and the vector store had no schema-level concept that would have stopped them.

Embedding migrations need the same artifact that database migrations have relied on for two decades: a per-record version tag, written into every vector, queried on every read, and used as the gating criterion for cutover and rollback. It is the single column most teams forget to add, and adding it later costs more than adding it up front.
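
To make the missing column concrete, here is a minimal in-memory sketch. The `VectorRecord` shape, the version-filtered read path, and the `version_counts` cutover gate are illustrative names under assumed semantics, not any particular vector store's API:

```python
from dataclasses import dataclass

import numpy as np

# Hypothetical sketch: every stored vector carries the version of the
# model that produced it, reads filter on it, and cutover counts it.
# Not tied to any real vector store.

@dataclass
class VectorRecord:
    doc_id: str
    embedding: np.ndarray
    embedding_version: str  # the column this post argues for

class VersionedStore:
    def __init__(self) -> None:
        self._records: dict[str, VectorRecord] = {}

    def upsert(self, doc_id: str, embedding: np.ndarray, version: str) -> None:
        self._records[doc_id] = VectorRecord(doc_id, embedding, version)

    def query(self, query_vec: np.ndarray, version: str, k: int = 5) -> list[str]:
        # Read path: only score vectors from the same embedding space
        # as the query. Mixing versions is the silent failure mode.
        same_space = [r for r in self._records.values()
                      if r.embedding_version == version]
        scored = sorted(
            same_space,
            key=lambda r: -float(
                r.embedding @ query_vec
                / (np.linalg.norm(r.embedding) * np.linalg.norm(query_vec))
            ),
        )
        return [r.doc_id for r in scored[:k]]

    def version_counts(self) -> dict[str, int]:
        # Cutover gate: move reads to v3 only when the v3 count matches
        # the corpus; roll back by flipping the read filter back to v2.
        counts: dict[str, int] = {}
        for r in self._records.values():
            counts[r.embedding_version] = counts.get(r.embedding_version, 0) + 1
        return counts
```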

Embedding Migrations Are the New Schema Migrations

12 min read
Tian Pan
Software Engineer

The first time most teams swap an embedding model in production, they treat it as a batch job. Re-run the embedder, build a new index, swap the alias, deploy. Latency stays normal. Error rates stay zero. Every query returns results. And retrieval quality silently regresses for weeks before anyone notices, because the symptom is "users complain the answers feel off," not a red dashboard.

This is not a deployment problem. It is a schema migration that the team has decided to run blind. The old embedding space and the new one are different reference frames; the cosine geometry that used to mean "these two paragraphs are about the same topic" no longer means that with the same numerical confidence. Documents and queries that used to cluster together drift apart non-uniformly. Re-rankers trained on the old distribution start firing on examples that no longer match what they learned. The eval suite that scores green on pointwise relevance misses all of it, because no individual document moved very far while the entire graph rotated.
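
The reference-frame claim is easy to check numerically. The sketch below stands in for two embedders with two random orthogonal rotations of one feature space, a deliberate simplification: each rotation preserves geometry internally, yet cosine between their outputs carries no signal.

```python
import numpy as np

rng = np.random.default_rng(42)
dim = 256

# Model two embedders as different random orthogonal rotations of one
# underlying feature space. Each preserves distances internally; the
# two share no coordinate system. (A simplification for illustration.)
q_old, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
q_new, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

doc = rng.standard_normal(dim)
paraphrase = doc + 0.1 * rng.standard_normal(dim)  # a near-duplicate

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Within either space the pair is clearly "the same topic"...
print(cosine(q_old @ doc, q_old @ paraphrase))  # ~0.995
print(cosine(q_new @ doc, q_new @ paraphrase))  # ~0.995

# ...across spaces the very same pair scores like random noise.
print(cosine(q_old @ doc, q_new @ paraphrase))  # ~0.0 on average
```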

Treat the swap like a database migration and almost everything that goes wrong becomes preventable. Treat it like a batch job and the regressions arrive on a schedule that nobody owns.

Embedding Model Rotation Is a Database Migration, Not a Deploy

11 min read
Tian Pan
Software Engineer

Somewhere in a staging channel, an engineer writes "bumping the embedder to v3, new model scored +4 on MTEB, merging after the smoke test." Two days later support tickets start trickling in about search results that feel "weirdly off." A week later retrieval precision is down fourteen points, cosine scores have collapsed from 0.85 into the 0.65 range, and nobody can explain why — because the deploy looked identical to the last five model bumps. It wasn't a deploy. It was a database migration wearing a deploy's costume.

Embedding model rotation is the most misfiled change type in AI infrastructure. It lands in your system through the same channels as a prompt tweak or a generation-model pin update — a config file, a PR, a CI check — so it gets the governance of a config change. But under the hood, a new embedder does not produce a better version of your old vectors. It produces vectors that live in a different coordinate system entirely, where cosine similarity across the two manifolds is a category error. The correct mental model is not "rev the dependency." It is "swap the primary key encoding on a fifty-million-row table while serving reads."
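
One serving pattern consistent with that mental model is dual-write with version-pinned reads while the backfill runs. The sketch below is hypothetical glue, with `Index` and `Embedder` as stand-ins for whatever store and model client you actually run:

```python
from typing import Callable, Protocol

# Hypothetical interfaces; any real vector index and embedding client
# could stand in for these.
class Index(Protocol):
    def upsert(self, doc_id: str, vector: list[float]) -> None: ...
    def search(self, vector: list[float], k: int) -> list[str]: ...

Embedder = Callable[[str], list[float]]

class DualWriteRouter:
    """Serve reads during the backfill, as an online migration would.

    Writes land in both spaces so neither index goes stale; each read
    is pinned to one space, because a v3 query vector against a v2
    index is the category error described above.
    """

    def __init__(self, old_index: Index, new_index: Index,
                 embed_old: Embedder, embed_new: Embedder) -> None:
        self.old_index, self.new_index = old_index, new_index
        self.embed_old, self.embed_new = embed_old, embed_new
        self.cutover_done = False  # flip only after the backfill is verified

    def write(self, doc_id: str, text: str) -> None:
        # Dual-write: both spaces stay complete while rows migrate.
        self.old_index.upsert(doc_id, self.embed_old(text))
        self.new_index.upsert(doc_id, self.embed_new(text))

    def search(self, text: str, k: int = 5) -> list[str]:
        # Version-pinned read: the query is embedded by the same model
        # that produced the index it is sent to, never mixed.
        if self.cutover_done:
            return self.new_index.search(self.embed_new(text), k)
        return self.old_index.search(self.embed_old(text), k)
```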