Skip to main content

4 posts tagged with "lora"

View all tags

The Orphan Adapter Problem: When Your Fine-Tune Outlives Its Base Model

· 12 min read
Tian Pan
Software Engineer

A senior engineer left six months ago. She owned the classifier adapter that routes customer support tickets — a 32-rank LoRA trained on 847 hand-labeled examples, pinned to a base model that hits end-of-life in 43 days. Nobody remembers why those 847 examples were chosen over the 2,000 they started with. The training data sits in an S3 bucket whose lifecycle policy purges objects older than one year. Her laptop was wiped. The fine-tuning notebook has a cell that calls a preprocessing function she imported from her personal dotfiles repo, now private.

This is the orphan adapter — a fine-tune that outlived its maintainers, outlived its data, and is about to outlive the base model it was trained on. It sits in your production stack, routing real user traffic, and nobody left on the team can rebuild it. The deprecation email didn't create this crisis. It just exposed it.

LoRA Adapter Composition in Production: Running Multiple Fine-Tuned Skills Without Model Wars

· 9 min read
Tian Pan
Software Engineer

The promise sounds clean: fine-tune lightweight LoRA adapters for each specialized skill — one for professional tone, one for JSON formatting, one for medical terminology, one for safety guardrails — then combine them at serving time. Teams ship this design, it works fine in development, and then falls apart in production when two adapters start fighting over the same weight regions and the output quality collapses to something indistinguishable from the untrained base model. Not slightly worse. Completely untuned.

This post is about what happens when you compose adapters in practice, why naive merging fails so reliably, and what strategies actually work at production scale.

The Adapter Compatibility Cliff: When Your Fine-Tune Meets the New Base Model

· 11 min read
Tian Pan
Software Engineer

Fine-tuning a language model gives you a competitive edge until the provider updates the base model underneath your adapter. At that point, one of two things happens: your service crashes with a shape mismatch error, or — far more dangerously — it silently starts returning degraded outputs while your monitoring shows nothing unusual. Most teams discover the second scenario only when users start complaining that "the AI got dumber."

This is the adapter compatibility cliff. You trained a LoRA adapter on model version N. The provider shipped version N+1. Your adapter is now running on a foundation it was never designed for, and there is no migration path.

Fine-Tuning Economics: The Real Cost Calculation Before You Commit

· 10 min read
Tian Pan
Software Engineer

Most engineers underestimate fine-tuning costs by a factor of three to five. The training run is the smallest part of the bill. Data curation, failed experiments, deployment infrastructure, and ongoing model maintenance are where budgets actually go. Teams that skip this math end up months into a fine-tuning project before realizing that a well-engineered prompt with few-shot examples would have solved the problem in a week.

This post walks through the complete economics — what fine-tuning actually costs across its full lifecycle, when LoRA and PEFT make the math work, and a decision framework for choosing between fine-tuning and prompt engineering based on real production numbers.