Constrained decoding guarantees schema-valid LLM outputs at the token level — eliminating the validate-retry loop entirely. Here's how it works, why most teams skip it, and when it actually hurts you.
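To make the token-level claim concrete, here is a minimal sketch of grammar-masked sampling over a toy vocabulary. It assumes a hand-written state machine in place of a real JSON-Schema-to-grammar compiler; `fake_logits`, the grammar table, and the token set are illustrative stand-ins, not any particular library's API.

```python
# Minimal sketch of token-level constrained decoding over a toy vocabulary.
import math
import random

VOCAB = ['{', '}', '"name"', ':', '"Ada"', '"Bob"', ',', '"age"', '42']

# Hypothetical grammar: maps a decoding state to (allowed tokens, next state).
# A real implementation derives this from a JSON Schema or CFG compiled
# against the tokenizer.
GRAMMAR = {
    "start": ({"{"}, "key"),
    "key":   ({'"name"'}, "colon"),
    "colon": ({":"}, "value"),
    "value": ({'"Ada"', '"Bob"'}, "close"),
    "close": ({"}"}, "done"),
}

def fake_logits(_prefix):
    """Stand-in for the model's next-token scores."""
    return {tok: random.uniform(-2, 2) for tok in VOCAB}

def constrained_decode():
    state, out = "start", []
    while state != "done":
        allowed, next_state = GRAMMAR[state]
        logits = fake_logits(out)
        # Mask every token the grammar forbids before sampling, so an invalid
        # token can never be emitted; no validate-retry loop is needed.
        masked = {t: v for t, v in logits.items() if t in allowed}
        probs = {t: math.exp(v) for t, v in masked.items()}
        total = sum(probs.values())
        pick = random.choices(list(probs), [p / total for p in probs.values()])[0]
        out.append(pick)
        state = next_state
    return " ".join(out)

print(constrained_decode())  # e.g. { "name" : "Ada" }
```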
Standard coding screens and ML math questions fail to predict LLM engineering success. Here's what practical interview exercises actually reveal about a candidate's ability to ship AI products.
A decision framework for which AI work belongs in the request path, which belongs in a queue, and how to migrate across the boundary when the traffic shape changes.
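As a sketch of where such a boundary might sit, the function below routes work inline only when a human is blocked on the result and the observed p95 latency fits a hypothetical request-path budget; the dataclass fields and the threshold are assumptions for illustration, not the article's framework.

```python
# Hedged sketch of one way to draw the sync/async boundary.
from dataclasses import dataclass

@dataclass
class AIJob:
    name: str
    p95_latency_ms: float   # observed p95 for this model call
    user_is_waiting: bool   # does a human block on the result?

REQUEST_PATH_BUDGET_MS = 800  # hypothetical per-request latency budget

def route(job: AIJob) -> str:
    """Return 'inline' for request-path work, 'queue' for everything else."""
    if job.user_is_waiting and job.p95_latency_ms <= REQUEST_PATH_BUDGET_MS:
        return "inline"
    return "queue"

print(route(AIJob("autocomplete", 250, True)))     # inline
print(route(AIJob("doc-summarize", 4200, False)))  # queue
```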
LLM providers guarantee uptime and latency through SLAs. They don't guarantee that your prompts will produce the same output next month. Here's what engineers need to know about the implicit behavioral contract — and how to test against it.
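One concrete way to test that implicit contract is a golden-prompt regression suite that asserts on invariants of the response (shape, required fields, refusal behavior) rather than exact text. In the sketch below, `call_model()` is a placeholder for your provider client and the invariants are illustrative.

```python
# Behavioral regression check: pin golden prompts, assert on invariants.
import json

GOLDEN_CASES = [
    {
        "prompt": "Extract the invoice total as JSON: 'Total due: $41.50'",
        "invariants": {"is_json": True, "required_keys": ["total"]},
    },
]

def call_model(prompt: str) -> str:
    # Placeholder: swap in your real provider call.
    return '{"total": 41.50}'

def check(case) -> list[str]:
    failures = []
    raw = call_model(case["prompt"])
    inv = case["invariants"]
    if inv.get("is_json"):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            return [f"not valid JSON: {raw!r}"]
        for key in inv.get("required_keys", []):
            if key not in parsed:
                failures.append(f"missing key {key!r}")
    return failures

for case in GOLDEN_CASES:
    problems = check(case)
    print("PASS" if not problems else f"FAIL: {problems}")
```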
Most agent routers load every tool schema on every request and let the LLM decide. At 417 tools, that approach collapses to 20% accuracy. Here's how an intent classification layer fixes it, and why skipping it quietly erodes accuracy and inflates cost at scale.
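A rough sketch of that routing layer: classify the request first, then attach only the matching intent's tool schemas to the LLM call. A keyword lookup stands in for a trained classifier or embedding router, and the tool registry and intent taxonomy are toy assumptions.

```python
# Intent-routing layer: expose a small tool subset instead of all N schemas.
TOOL_REGISTRY = {
    "billing":  ["get_invoice", "refund_payment", "update_card"],
    "calendar": ["create_event", "list_events"],
    "search":   ["web_search", "kb_search"],
}

INTENT_KEYWORDS = {  # stand-in for a trained classifier
    "billing":  {"refund", "invoice", "charge", "card"},
    "calendar": {"meeting", "schedule", "event"},
    "search":   {"find", "search", "lookup"},
}

def classify_intent(query: str) -> str:
    words = set(query.lower().split())
    scores = {intent: len(words & kw) for intent, kw in INTENT_KEYWORDS.items()}
    # Ties (including all-zero scores) fall back to the first intent; a real
    # router would route ambiguous queries to a broader or default toolset.
    return max(scores, key=scores.get)

def tools_for(query: str) -> list[str]:
    """Return the small tool subset to attach to the LLM call."""
    return TOOL_REGISTRY[classify_intent(query)]

print(tools_for("I need a refund for this invoice"))
# ['get_invoice', 'refund_payment', 'update_card']: 3 schemas instead of 417
```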
Using the same model family as both product and judge inflates scores by 8–16% because they share blind spots. Here's how to build evaluation systems that actually catch what your model misses.
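A hedged sketch of one mitigation: score each output with judges drawn from at least two model families and send large disagreements to human review instead of trusting either judge's blind spots. The judge callables, the 1–5 scale, and the disagreement threshold below are placeholders.

```python
# Cross-family judging with a disagreement escalation rule.
from statistics import mean

def judge_family_a(prompt: str, answer: str) -> int:
    return 5  # placeholder: a judge from one model family

def judge_family_b(prompt: str, answer: str) -> int:
    return 2  # placeholder: a judge from a different family

def evaluate(prompt: str, answer: str) -> dict:
    scores = [judge_family_a(prompt, answer), judge_family_b(prompt, answer)]
    return {
        "score": mean(scores),
        # Large disagreement is a signal to route the case to human review.
        "needs_human_review": max(scores) - min(scores) >= 2,
    }

print(evaluate("Summarize the contract", "draft summary"))
```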
Using LLMs to generate your own test cases creates a flattering but misleading feedback loop. Here's how adversarial seeding, human annotation triage, and diversity gap analysis fix the structural blind spots synthetic evals miss.
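Diversity gap analysis is the most mechanical of the three fixes, so here is a small sketch: for every production query, find its nearest neighbor in the synthetic eval set and flag queries that nothing in the set resembles. Character trigrams stand in for real embeddings, and the threshold is an assumption.

```python
# Diversity-gap check: which production queries have no close eval case?
def trigrams(text: str) -> set[str]:
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

def similarity(a: str, b: str) -> float:
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

EVAL_SET = ["Summarize this support ticket", "Translate the ticket to French"]
PRODUCTION = ["Summarize this support ticket please",
              "Why was I charged twice this month?"]

GAP_THRESHOLD = 0.2  # hypothetical: below this, nothing in the eval set is close

for query in PRODUCTION:
    best = max(similarity(query, case) for case in EVAL_SET)
    if best < GAP_THRESHOLD:
        print(f"coverage gap: {query!r} (best match {best:.2f})")
```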
Vector similarity search fails silently on multi-hop queries and schema-dependent facts. Here's when a property graph with traversal queries outperforms embedding lookup — and how to build the hybrid that covers both.
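A minimal sketch of the hybrid: vector search picks entry-point nodes, then a short graph traversal collects the multi-hop facts that similarity alone would miss. Both stores are in-memory toys with hypothetical entities; the hop limit and seeding logic are illustrative.

```python
# Hybrid retrieval: vector seeding plus property-graph traversal.
GRAPH = {  # node -> list of (relation, neighbor)
    "Acme Corp":  [("acquired", "Widget Inc"), ("ceo", "J. Smith")],
    "Widget Inc": [("supplies", "Gadget LLC")],
    "Gadget LLC": [],
    "J. Smith":   [],
}

DOCS = {  # node -> text chunk a vector index would return
    "Acme Corp": "Acme Corp is a manufacturer founded in 1982...",
    "Widget Inc": "Widget Inc makes industrial widgets...",
}

def vector_seed(query: str) -> list[str]:
    """Stand-in for embedding search: return the best-matching entity nodes."""
    return [node for node in DOCS if node.lower() in query.lower()] or ["Acme Corp"]

def traverse(node: str, hops: int = 2) -> list[tuple[str, str, str]]:
    """Collect (subject, relation, object) facts up to `hops` away."""
    facts, frontier = [], [node]
    for _ in range(hops):
        next_frontier = []
        for n in frontier:
            for rel, neighbor in GRAPH.get(n, []):
                facts.append((n, rel, neighbor))
                next_frontier.append(neighbor)
        frontier = next_frontier
    return facts

query = "Who does Acme Corp's acquisition supply?"
for seed in vector_seed(query):
    print(traverse(seed))
# Multi-hop path: Acme Corp -acquired-> Widget Inc -supplies-> Gadget LLC
```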
An LLM that says 'I'm highly confident' is right far less often than that phrase implies. How to measure calibration error, why RLHF makes it worse, and the production design patterns that actually help.
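Calibration error has a standard, easy-to-compute form. The sketch below is the usual binned expected calibration error (ECE) over logged (stated confidence, was-correct) pairs; the sample log and the ten-bin split are illustrative.

```python
# Binned expected calibration error (ECE) over logged predictions.
def expected_calibration_error(records, n_bins=10):
    bins = [[] for _ in range(n_bins)]
    for confidence, correct in records:
        idx = min(int(confidence * n_bins), n_bins - 1)
        bins[idx].append((confidence, correct))
    total = len(records)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        # Weight each bin's |confidence - accuracy| gap by its share of traffic.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Toy log: the model claims ~90% confidence but is right only half the time.
log = [(0.9, True), (0.9, False), (0.92, False), (0.88, True),
       (0.6, True), (0.55, False)]
print(f"ECE = {expected_calibration_error(log):.3f}")
```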
Teams that build directly on one LLM provider accumulate prompt idioms, tool schema conventions, and behavioral dependencies that become migration debt. Here's the abstraction layer design that makes switching providers a configuration change rather than a multi-month rewrite.
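The core of that abstraction layer is a provider-neutral interface with per-vendor adapters, so application code never sees a vendor's wire format and switching is a config key. The sketch below uses stubbed adapters, not real SDK calls; the class and config names are assumptions.

```python
# Provider-neutral chat interface with per-vendor adapter stubs.
from abc import ABC, abstractmethod

class ChatProvider(ABC):
    @abstractmethod
    def complete(self, messages: list[dict], tools: list[dict]) -> str: ...

class OpenAIAdapter(ChatProvider):
    def complete(self, messages, tools):
        # Translate neutral messages/tools into the OpenAI request shape here.
        return "stubbed openai response"

class AnthropicAdapter(ChatProvider):
    def complete(self, messages, tools):
        # Translate into the Anthropic request shape here (system prompt and
        # tool schemas differ from the neutral format).
        return "stubbed anthropic response"

PROVIDERS = {"openai": OpenAIAdapter, "anthropic": AnthropicAdapter}

def get_provider(name: str) -> ChatProvider:
    return PROVIDERS[name]()  # switching vendors is a configuration change

client = get_provider("anthropic")
print(client.complete([{"role": "user", "content": "hello"}], tools=[]))
```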
How to wire LLMs into security operations so they accelerate triage without quietly approving real intrusions — confidence thresholds, log-poisoning defenses, and the metrics that matter.
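The confidence-threshold piece can be as small as a gate in front of the auto-close action. In the sketch below, the threshold and the list of never-auto-close alert classes are hypothetical values, not recommendations.

```python
# Triage gate: auto-close only high-confidence benign verdicts on safe classes.
AUTO_CLOSE_THRESHOLD = 0.95  # hypothetical
NEVER_AUTO_CLOSE = {"lateral_movement", "credential_theft"}

def triage_decision(alert_class: str, model_verdict: str, confidence: float) -> str:
    if (model_verdict == "benign"
            and alert_class not in NEVER_AUTO_CLOSE
            and confidence >= AUTO_CLOSE_THRESHOLD):
        return "auto_close"
    if model_verdict == "malicious":
        return "escalate"
    return "human_review"  # low confidence or sensitive class: never auto-approve

print(triage_decision("phishing_report", "benign", 0.97))   # auto_close
print(triage_decision("lateral_movement", "benign", 0.99))  # human_review
```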
Most teams pad max_tokens to avoid mid-generation cutoffs and pay for the slack forever. Per-route calibration against real output distributions can cut output token spend 20–40% without quality loss.
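The calibration itself is a few lines: take each route's observed output-token distribution, pick a high percentile, and add headroom. The percentile, headroom factor, and sample history below are illustrative knobs, not recommended values.

```python
# Per-route max_tokens calibration from observed output lengths.
import math

def percentile(values, pct):
    ordered = sorted(values)
    idx = min(math.ceil(pct / 100 * len(ordered)) - 1, len(ordered) - 1)
    return ordered[max(idx, 0)]

def calibrated_max_tokens(output_token_counts, pct=99, headroom=1.2):
    """Cap = p99 of real outputs, plus 20% headroom, rounded up."""
    return math.ceil(percentile(output_token_counts, pct) * headroom)

# Toy history: a summarization route that almost never exceeds ~420 tokens.
history = [310, 355, 290, 402, 388, 365, 420, 300, 345, 378]
print(calibrated_max_tokens(history))  # ~504 instead of a padded 2048
```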