Blog

Page 70

12 articles

A/B Testing AI Features When the Treatment Is Non-Deterministic
Standard A/B testing breaks down when your treatment is an LLM — outputs vary per call, model updates ship mid-experiment, and 'success' resists clean operationalization. Here are the statistical adjustments and experiment patterns that produce trustworthy results anyway.
experimentation, llm
Apr 18 · 10 min
Agent Protocol Fragmentation: Designing for A2A, MCP, and What Comes Next
Most teams picking an agent protocol are making three separate decisions at once. A practical breakdown of how MCP, A2A, and OpenAPI solve different layers of the agent stack — and how to design your interface layer to avoid costly refactors.
insider, ai-agents
Apr 18 · 9 min
The Cascade Problem: Why Agent Side Effects Explode at Scale
Agents that pass every unit test in isolation cause cascading side effects when deployed at scale. Here's the engineering taxonomy and the patterns that actually prevent it.
insider, ai-engineering
Apr 18 · 12 min
The Agent Specification Gap: Why Your Agents Ignore What You Write
Specification failures account for 42% of multi-agent system breakdowns in production. Here's why the gap between what you write and what agents interpret is bigger than you think — and the structured spec format that closes it.
insider, ai-engineering
Apr 18 · 12 min
AI as a CI/CD Gate: What Agents Can and Cannot Reliably Block
AI agents are increasingly blocking merges in CI/CD pipelines, but the cases where they add real signal are narrow. A guide to the trust model, integration architecture, and how to avoid building a rubber stamp that slows releases without catching regressions.
AI Engineering, CI/CD
Apr 18 · 9 min
AI Coding Agents on Legacy Codebases: What Works and What Backfires
AI coding agents produce plausible-looking but semantically wrong changes on legacy codebases. A breakdown of which task types transfer safely, where agents silently break implicit contracts, and the characterization-test-first pattern that makes agent-assisted refactoring reliable.
ai-engineering, legacy-systems
Apr 18 · 10 min
AI Coding Agents on Legacy Codebases: Why They Fail Where You Need Them Most
AI coding agents ace greenfield benchmarks but routinely break legacy systems in subtle, hard-to-catch ways. Here's what goes wrong and how to make them safer on mature codebases.
insider, ai
Apr 18 · 9 min
AI Content Provenance in Production: C2PA, Audit Trails, and the Compliance Deadline Engineers Are Missing
C2PA gives you cryptographic proof of who signed AI-generated content and when. But it doesn't survive your CDN, doesn't satisfy the EU AI Act alone, and won't tell you whether the content is truthful. Here's what production provenance actually looks like.
ai, production
Apr 18 · 12 min
Why Users Ignore the AI Feature You Spent Three Months Building
AI features fail not because the model is bad but because users never discover them, trust them, or develop the habit of reaching for them. Here's how to fix that.
insider, ai engineering
Apr 18 · 10 min
When Your AI Feature Ages Out: Knowledge Cutoffs and Temporal Grounding in Production
Products built on models with a fixed training cutoff break as the world diverges from training data. Here's how to detect cutoff-induced failures, manage RAG freshness, and design for temporal drift before it becomes a silent production regression.
insider, ai-engineering
Apr 18 · 10 min
The AI Feature Maintenance Cliff: Why Your AI-Powered Features Age Faster Than You Think
AI features don't just degrade — they degrade silently. Prompt drift, model updates, and distribution shift conspire to erode AI quality in production, and the dashboards stay green the whole time.
llm, production
Apr 18 · 9 min
The AI Feature Retirement Playbook: How to Sunset What Users Barely Adopted
Most engineering teams know how to ship AI features. Almost none have a plan for retiring them. Here's the playbook for knowing when to quit and how to do it without burning users or accumulating compliance debt.
ai, product
Apr 18 · 11 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
