
Distillation Is a Product Decision, Not a Research Artifact

10 min read
Tian Pan
Software Engineer

A frontier-model chat feature is roughly a thirty-cents-per-conversation product. The distilled variant of the same feature is roughly a third-of-a-cent-per-conversation product. These are not two implementations of one product. They are two products, with different free-tier economics, different acquisition costs, different markets, and different competitive moats. The team that ships the distilled version as "the same feature, cheaper" wastes the move.
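The roughly 100x gap can be made concrete with back-of-envelope unit economics. The token counts and per-million-token prices below are illustrative assumptions chosen to land near the article's figures, not quotes from any provider:

```python
# Illustrative unit economics for the ~100x gap described above.
# All prices and token counts are hypothetical assumptions, not real quotes.

def cost_per_conversation(price_in_per_mtok, price_out_per_mtok,
                          tokens_in, tokens_out):
    """Dollar cost of one conversation at the given per-million-token prices."""
    return (tokens_in * price_in_per_mtok + tokens_out * price_out_per_mtok) / 1e6

# A multi-turn chat: ~60k input tokens (context resent each turn), ~6k output.
TOKENS_IN, TOKENS_OUT = 60_000, 6_000

# Assumed frontier pricing: $3 / M input tokens, $15 / M output tokens.
frontier = cost_per_conversation(3.00, 15.00, TOKENS_IN, TOKENS_OUT)
# Assumed distilled pricing: 100x cheaper, $0.03 / M input, $0.15 / M output.
distilled = cost_per_conversation(0.03, 0.15, TOKENS_IN, TOKENS_OUT)

print(f"frontier:  ${frontier:.4f}/conversation")   # ~ $0.27
print(f"distilled: ${distilled:.4f}/conversation")  # ~ $0.0027
print(f"ratio:     {frontier / distilled:.0f}x")    # 100x
```

Note that the ratio survives almost any choice of token counts: because input dominates in multi-turn chat (the whole context is resent each turn), the price-per-token ratio between teacher and student is effectively the price-per-conversation ratio.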

Most engineering organizations still treat distillation as a research-team optimization that gets applied after a feature is "done" — a tail-end pass to wring inference cost out of something already spec'd against the frontier model. That framing is wrong by an order of magnitude. The choice of teacher, the choice of student, the eval suite the student is graded against, and the product surface the student is deployed to are product decisions. They determine which capabilities you are consenting to lose, which traffic shape you are designing for, and which price floor you are unlocking. Hand them to a research team to optimize against MMLU and you will ship a model that wins benchmarks the product does not care about.