Blog

Page 119

12 articles

The Pretraining Shadow: The Hidden Constraint Your Fine-Tuning Plan Ignores
Fine-tuning changes how a model talks, not what it fundamentally knows or believes. Here's what the research says about the ceiling practitioners keep hitting — and how to build around it.
llm, fine-tuning
Apr 16 · 9 min
Pricing AI Features: The Unit Economics Framework Engineering Teams Always Skip
Variable inference costs break fixed-price SaaS assumptions. A practical framework for per-workflow cost modeling, heavy-user subsidy math, and consumption cap design that preserves margin as usage scales.
insider, ai
Apr 16 · 11 min
Prompt Cache Break-Even: The Exact Math on When Provider-Side Prefix Caching Actually Pays Off
Prompt caching advertises a 90% discount on cache hits, but the write premium means low hit rates cost you more than no caching at all. Here's the exact math and the session architecture decisions that determine whether you capture the discount.
llm, cost-optimization
Apr 16 · 9 min
Prompt Canaries: The Deployment Primitive Your AI Team Is Missing
Code canary deployments catch crashes and latency regressions — but they're blind to the behavioral failures that actually hurt LLM systems. Here's the metric stack, deployment manifest pattern, and auto-rollback design that closes the gap.
insider, llm
Apr 16 · 10 min
Prompt Injection Detection at 100,000 Requests Per Day: Why Simple Defenses Break and What Actually Works
Static filters and LLM-as-judge approaches both fail at high throughput. Here's the layered classifier architecture that actually catches prompt injections under a 200ms latency budget.
security, llm
Apr 16 · 11 min
The Prompt-Model Coupling Trap: Why Your Prompts Only Speak One Model's Dialect
Carefully tuned prompts silently accumulate dependencies on specific model behaviors — JSON formatting quirks, instruction hierarchy, refusal thresholds — that break on migration day. How to build a portability test harness and write lower-coupling prompts.
prompt-engineering, llm
Apr 16 · 10 min
Property-Based Testing for LLM Outputs: Finding the Bugs Your Eval Set Never Imagined
Curated eval sets encode only the failure modes you imagined. Property-based testing generates thousands of adversarial input variants to find the bugs at domain boundaries your test suite structurally cannot reach.
testing, llm
Apr 16 · 11 min
Poisoned at the Source: RAG Corpus Decay and Data Governance for Vector Stores
Production RAG systems silently degrade as corpora accumulate stale chunks, conflicting facts, and adversarially-crafted content. Here's how to treat your retrieval layer as infrastructure — with TTL design, ingest-time conflict detection, and access control patterns that keep it trustworthy.
rag, vector-databases
Apr 16 · 11 min
The RAG Eval Antipattern That Hides Retriever Bugs
Most teams evaluate RAG systems end-to-end, letting the generator mask retrieval failures. Here's how to build a retriever-only eval harness that surfaces bugs before they compound.
insider, rag
Apr 16 · 10 min
Schema-First AI Development: Define Output Contracts Before You Write Prompts
Naive JSON prompting fails 15–20% of the time in production. Schema-first development — defining output contracts before writing prompts — cuts that to near zero, and the approach is now the right default for every automated LLM pipeline.
insider, llm
Apr 16 · 9 min
The Schema Problem: Taming LLM Output in Production
Structured outputs from LLMs feel solved until version drift, optional fields, and downstream parsers collide. A practical framework for versioning and validating LLM output contracts so a model upgrade never silently corrupts your data pipeline.
llm, production
Apr 16 · 9 min
The Discovery Problem: Why Semantic Search Fails Browsing Users
Embedding-based retrieval optimizes for users who know what they want. It quietly fails everyone else — here's how to detect browsing intent and fix your ranking strategy.
search, retrieval
Apr 16 · 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
