
4 posts tagged with "model-deprecation"


Fine-Tune Orphan: Recovering Domain Expertise When the Base Model Is Deprecated

9 min read
Tian Pan
Software Engineer

On January 4, 2024, OpenAI retired the /fine-tunes endpoint. Every fine-tuned Ada, Babbage, Curie, and Davinci model stopped responding. Teams that had spent months building production systems on these models — careful prompt design, annotated datasets, labeling pipelines — woke up to HTTP 404s. The fine-tunes didn't migrate. The learned behaviors didn't transfer. The domain expertise was gone.

This wasn't an isolated incident. Google followed in August 2024, decommissioning the PaLM API outright with no grace period for existing deployments. Unlike OpenAI, which at least let existing GPT-3.5 fine-tunes keep running while blocking new training runs, Google's shutdown meant production inference stopped the same day. If your fine-tuned PaLM model was in the critical path, you had a service outage.

The Two Clocks Problem: When Your Model Provider's Cadence Breaks Your Roadmap

10 min read
Tian Pan
Software Engineer

There are two clocks ticking on your AI product, and they are not synchronized. The model providers run on a roughly quarterly heartbeat — Claude Opus 4.6 in February 2026, GPT-5.4 in March, Claude Opus 4.7 in April, GPT-5.5 a week later. Your product roadmap was committed in January and does not look up again until July. Somewhere in between, a capability you spent eight engineer-weeks building gets shipped as a one-line API parameter, and nobody on the team has a process for noticing.

This is not a forecasting problem. The releases were widely telegraphed — anyone who reads the changelog could have seen each of them coming. It is a planning-artifact problem. Roadmaps were invented for a world where the platform underneath your product changed once a decade. The platform now changes once a quarter, and the artifact has not been updated to match.

The 12-Month AI Feature Cliff: Why Your Production Models Decay on a Calendar Nobody Marked

11 min read
Tian Pan
Software Engineer

A feature ships at 92% pass rate. The launch deck celebrates it. Twelve months later the same feature is at 78% — no incident report, no failed deploy, no single change to point at, just a slow erosion that nobody was assigned to watch for. The team blames "hallucinations" or "user behavior shift," picks a junior engineer to investigate, and sets a quarterly OKR to "improve quality." The OKR misses. The feature ships an apologetic dialog telling users the AI sometimes makes mistakes. Six months after that, it's deprecated and replaced with a new version that ships at 91% pass rate, and the cycle starts again.

This isn't bad luck. It's the second clock that AI features run on, the one that nobody marks on the release calendar at launch. Conventional software has feature decay too — dependency drift, codebase rot, the slow accumulation of half-applied refactors — but those decay on a clock the engineering org already understands and budgets for. AI features have all of that, plus a parallel set of decay sources that conventional amortization assumptions don't model: model deprecations, vendor weight rotations, distribution shift in user inputs, prompt patches that compound, judge calibration drift, and the quiet aging of an eval set that no longer represents what production traffic looks like.

The architectural realization that has to land — before the next AI feature ships, not after — is that AI features have a non-zero baseline maintenance cost. The feature isn't done when it launches. It's enrolled in a maintenance schedule it can't escape, and the team that didn't budget for that schedule is going to discover it the hard way.

The Orphan Adapter Problem: When Your Fine-Tune Outlives Its Base Model

12 min read
Tian Pan
Software Engineer

A senior engineer left six months ago. She owned the classifier adapter that routes customer support tickets — a rank-32 LoRA trained on 847 hand-labeled examples, pinned to a base model that hits end-of-life in 43 days. Nobody remembers why those 847 examples were chosen over the 2,000 they started with. The training data sits in an S3 bucket whose lifecycle policy purges objects older than one year. Her laptop was wiped. The fine-tuning notebook has a cell that calls a preprocessing function she imported from her personal dotfiles repo, now private.

This is the orphan adapter — a fine-tune that outlived its maintainers, outlived its data, and is about to outlive the base model it was trained on. It sits in your production stack, routing real user traffic, and nobody left on the team can rebuild it. The deprecation email didn't create this crisis. It just exposed it.