Blog

Page 117

12 articles

The Instruction Complexity Cliff: Why LLMs Follow 5 Rules Reliably but Not 15
Frontier models reliably satisfy around 3 stacked constraints and forget rules buried in the middle of long prompts. Here's what the empirical data shows about instruction compliance degradation — and the design patterns that keep system prompts reliable at scale.
insider · llm
Apr 16 · 10 min
The Jagged Frontier: Why AI Fails at Easy Things and What It Means for Your Product
AI's capability curve is jagged, not smooth — superhuman at some tasks, shockingly bad at adjacent ones. Here's how that creates invisible product traps and what to do about it.
insider · ai-engineering
Apr 16 · 10 min
The Knowledge Contamination Problem: When Your RAG System Ignores Its Own Retrieval
LLMs confidently answer from training memory even when retrieval provides better facts. Here's how to detect when a model ignores context versus when retrieval simply fails — and what to do about it.
rag · llm
Apr 16 · 8 min
Knowledge Cutoff Is a Silent Production Bug
A model's training cutoff is not a documentation footnote — it is a class of time-delayed production failure that conventional monitoring cannot see. Here is how to detect it, contain it, and design around it.
llm · production
Apr 16 · 11 min
Live Web Grounding in Production: Why Calling a Search API Is Only the Beginning
Why 'just call a search API' produces a far worse pipeline than engineers expect — the latency math, failure modes, and architectural patterns that separate demo-quality from production-ready web grounding.
rag · llm
Apr 16 · 10 min
LLM-as-Annotator Quality Control: When the Labeler and Student Share Training Data
Using an LLM to label data for fine-tuning another LLM sounds efficient — until both models have absorbed the same internet text. Here's how shared pretraining creates systematic labeling failures, and the detection and mitigation strategies that actually work.
llm · fine-tuning
Apr 16 · 10 min
When LLMs Beat Rule-Based Systems for Data Normalization (And When They Don't)
LLMs handle the long tail of messy production data better than rules — but at a cost that surprises most teams. Here's the hybrid architecture, cost math, and validation patterns that actually hold up in production.
insider · data-engineering
Apr 16 · 11 min
Why LLMs Make Confident Mistakes When Analyzing Your Product Data
LLMs confidently hallucinate metrics, miss denominators, and confuse correlation with causation when analyzing behavioral data. Here's where they fail and how to use them safely.
insider · ai-engineering
Apr 16 · 11 min
The LLM Provider Incident Runbook: Staying Up When Your AI Stack Goes Down
When your LLM provider goes down, you have minutes to decide. An operational playbook for multi-provider failover, graceful degradation, and user communication that keeps your product standing.
llm · production
Apr 16 · 11 min
LLM Rate Limits Are a Distributed Systems Problem
LLM API rate limits behave like distributed locks: batch jobs silently degrade user-facing flows through starvation, head-of-line blocking, and priority inversion, all while your error dashboards stay green.
llm · distributed-systems
Apr 16 · 11 min
The Hidden Switching Costs of LLM Vendor Lock-In
Beyond API compatibility, the real switching costs of changing LLM providers live in prompt rewrites, eval rebuilds, and embedding re-indexing — a map of what survives a model swap and what doesn't.
insider · llm
Apr 16 · 11 min
The Magic Moment Problem: Why AI Feature Onboarding Fails and How to Fix It
The first five minutes determine whether users keep using your AI feature. Here's the engineering behind onboarding flows that actually convert skeptics.
ai · product
Apr 16 · 10 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
