Blog

Page 125

12 articles

The Model EOL Clock: Treating Provider LLMs as External Dependencies
Every pinned model version has a deprecation date you don't control. Here's how to treat provider LLMs as external dependencies with behavioral regression suites, EOL runbooks, and migration test harnesses baked in before the notice arrives.
insider, llm
Apr 15 · 11 min
Model Routing Is a System Design Problem, Not a Config Option
Treating LLM selection as a runtime dispatch decision — not a deployment constant — unlocks real cost savings. Here's how to think about routing signals, fallback failure modes, shadow routing, and the cost accounting that most teams skip.
insider, llm
Apr 15 · 11 min
Multi-Model Consistency: When Your Pipeline's Sequential LLM Calls Contradict Each Other
Three LLM calls in a single workflow can produce conflicting facts, entity references, and state claims. Here's how to design pipelines that stay coherent.
ai-engineering, llm
Apr 15 · 9 min
Multi-Session Eval Design: Catching the AI Feature That Gets Worse Over Time
Single-turn evals miss the class of AI failure that emerges only after state accumulates. How to design a multi-session eval harness, decay curves, and regression methodology that catch quality rot before users churn.
ai-engineering, evaluation
Apr 15 · 11 min
Multi-User Shared Agent State: The Concurrency Primitives You Actually Need
Most agent designs assume one user per session. Shared workspaces need distributed systems primitives to prevent silent data corruption when concurrent users give contradictory instructions.
ai-engineering, agents
Apr 15 · 11 min
Multimodal Pipelines in Production: What Breaks When You Go Beyond Text
Going multimodal in production means confronting a new class of failures: silent image rejections, PDF table misalignment, audio latency budgets, and cross-modal hallucination that text evals never surface.
multimodal, llm
Apr 15 · 11 min
The Noisy Neighbor Problem in Shared LLM Infrastructure: Tenancy Models for AI Features
When one feature's batch job eats the shared API quota, paying users see 429s. Detection signals and isolation patterns for shared LLM infrastructure.
insider, llm
Apr 15 · 12 min
PII in the Prompt Layer: The Privacy Engineering Gap Most Teams Ignore
How personally identifiable information flows uncontrolled into LLM inference calls, and the masking, tokenization, and logging architectures that close the compliance gap.
insider, privacy
Apr 15 · 12 min
Pricing Your AI Product: Escaping the Compute Cost Trap
Traditional SaaS pricing assumes near-zero marginal cost per user. LLM features break that assumption — tokens can consume 20–40% of gross margin. Here's how to build a pricing architecture that survives.
insider, ai
Apr 15 · 10 min
Proactive Agents: Event-Driven and Scheduled Automation for Background AI
Most agent design literature assumes a human triggers execution. Production AI increasingly runs in the background — on schedules, change events, and system state transitions. Here's what that changes architecturally.
insider, ai-engineering
Apr 15 · 11 min
Prompt Canary Deployments: Ship Prompt Changes Like a Senior SRE
Prompt edits are as dangerous as code deploys — but almost nobody treats them that way. Here's the traffic-splitting, quality-monitoring, and rollback discipline that separates teams that catch regressions before users do from teams that find out on Twitter.
llmops, prompt-engineering
Apr 15 · 10 min
Prompt Diff Review as a Discipline: What Reviewers Actually Need to Ask
Traditional code review instincts don't map to prompt edits. Here's the checklist, the tooling, and the reviewer-author dialog that turn a prompt PR into a behavioral contract.
prompt-engineering, llm
Apr 15 · 11 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.