4 posts tagged with "platform"

The AI Feature Sunset Playbook Nobody Writes

· 13 min read
Tian Pan
Software Engineer

Every AI org has a graveyard. Not of services — those get a runbook, a deprecation banner, a 30-day migration window, and a slot on the platform team's quarterly roadmap. The graveyard is of features: the smart-summary beta that never graduated, the auto-categorizer that two enterprise customers actually built workflows around, the agentic flow that demoed beautifully and shipped behind a flag that nobody flipped off. The endpoint is easy to deprecate. The four other things attached to it — the prompt, the judge, the regression set, and the incident memory — are what actually take a quarter, and nobody on the team has written the playbook because nobody has been promoted for retiring something.

This is the gap. Most of the public discourse on "model deprecation" is about vendor-side retirements: GPT-4o leaves on a date, Assistants API beta sunsets on August 26, DALL-E 3 retires on May 12, and your platform team has a notification period to migrate. That problem has playbooks because vendors publish dates, because the migration is forced, and because the work fits in a sprint. The internal version — when you decide a feature you built didn't graduate, and you have to actually take it out — has none of those forcing functions. The deprecation date is whatever you say it is. The migration path is whatever you build. And the artifacts you have to retire are not a single endpoint but a tangled stack of model-adjacent assets that your monitoring barely knows exist.

AI Feature Dependency Graphs: When a Prompt Edit Is a Silent Breaking Change

· 12 min read
Tian Pan
Software Engineer

A team owns a summarizer. Another team owns the search ranker that ingests those summaries. A third team owns a router that picks between agent personalities based on the ranker's confidence score. None of these teams shares an on-call rotation, none of them sits in the same standup, and the only contract between them is "the previous feature's output is the next feature's input." On a Tuesday, the summarizer team tightens a prompt to fix a hallucination complaint from a sales demo. The search ranker's quality collapses six hours later. The router starts handing off to the wrong agent personality by Wednesday morning. The post-mortem will record the cause as "prompt change," but the actual cause is that the teams' AI features have quietly composed into a directed graph that nobody drew.

This is the most common shape of an AI outage that doesn't trip any of the alerts you built for AI outages. The model isn't down. The eval suite for the changed feature is green. The token cost line is flat. What broke is the interface between two features, which is a thing your dependency tooling treats as plain text because that's all it is at the API boundary — and treats as inert because plain text doesn't carry a version, a schema, or a deprecation policy.
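One way to make that graph and its missing version tag concrete is sketched below. This is a minimal illustration, not the author's tooling: the feature names mirror the summarizer, ranker, and router from the example above, while `FEATURE_GRAPH`, `downstream_of`, `FeatureOutput`, `accepts`, and the version strings are all hypothetical names introduced here.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical feature graph: each key is a producer, each value lists
# the features that consume its output as plain text.
FEATURE_GRAPH = {
    "summarizer": ["search_ranker"],
    "search_ranker": ["agent_router"],
    "agent_router": [],
}


def downstream_of(feature, graph):
    """Return every feature that transitively consumes `feature`'s output,
    in breadth-first order (nearest consumers first)."""
    seen = []
    queue = deque(graph.get(feature, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already recorded via another path
        seen.append(node)
        queue.extend(graph.get(node, []))
    return seen


@dataclass(frozen=True)
class FeatureOutput:
    """Assumed envelope: the text plus the metadata plain text cannot carry."""
    producer: str
    prompt_version: str
    text: str


def accepts(output, pinned_versions):
    """A consumer accepts an output only if the producer's prompt version
    matches the version the consumer was last evaluated against."""
    return pinned_versions.get(output.producer) == output.prompt_version


# The ranker was evaluated against prompt version "v3" of the summarizer.
ranker_pins = {"summarizer": "v3"}
old = FeatureOutput("summarizer", "v3", "Quarterly revenue rose.")
new = FeatureOutput("summarizer", "v4", "Revenue rose.")

blast_radius = downstream_of("summarizer", FEATURE_GRAPH)
old_ok = accepts(old, ranker_pins)
new_ok = accepts(new, ranker_pins)
```

Under these assumptions, a prompt edit to the summarizer would surface both the ranker and the router as affected before the change ships, and the ranker would reject the `v4` output until someone re-runs its evals and updates the pin. Whether rejection, a warning, or a shadow evaluation is the right response is a design choice this sketch does not settle.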

The Eval Harness, Not the Prompt, Is Your Real Provider Lock-In

· 10 min read
Tian Pan
Software Engineer

Every "we'll just swap providers if needed" plan in the deck has a budget line for prompt rewrites. None of them has a line for the eval suite. That is the bug. The prompts are the visible coupling — the part you wrote, the part you can grep for, the part a junior engineer can rewrite in an afternoon. The eval harness is the invisible coupling, and it is the one that will eat a quarter of your roadmap when you actually try to migrate.

The pattern shows up the moment leverage matters. Your contract is up. A competitor releases a model that benchmarks better on your domain. Pricing on output tokens shifts under you. You go to run the candidate model through your eval suite to make the call, and within a day you discover that you cannot trust any score the harness produces, because the harness itself was written against the incumbent. You are not comparing models. You are comparing one model against a measurement instrument that was calibrated to the other one.

Building a Generative AI Platform: Architecture, Trade-offs, and the Components That Actually Matter

· 12 min read
Tian Pan
Software Engineer

Most teams treating their GenAI stack as a model integration project eventually discover they've actually built—or need to build—a platform. The model is the easy part. The hard part is everything around it: routing queries to the right model, retrieving context reliably, filtering unsafe outputs, caching redundant calls, tracing what went wrong in a chain of five LLM calls, and keeping costs from tripling month-over-month as usage scales.

This article is about that platform layer. Not the model weights, not the prompts—the surrounding infrastructure that separates a working proof of concept from something you'd trust to serve a million users.