Blog

Page 53

12 articles

The 95% Reliability Illusion: Why Your 10-Step Agent Fails 40% of the Time
A ten-step agent at 95% per-step reliability succeeds end-to-end only 60% of the time. Verification placement, redundancy patterns, and shorter chains are the architectural levers that bend the curve.
ai-agentsreliability
Apr 2512 min
Why AI-Generated Comments Rot Faster Than the Code They Describe
AI agents emit a fluent docstring on every function, paraphrasing the code rather than encoding intent. The moment the code shifts, the comment lies — and the next reader trusts the lie over the code. A code-review discipline for the AI-assisted era.
insiderai-engineering
Apr 2511 min
AI Reviewing AI: The Asymmetric Architecture of Code-Review Agents
When the author and reviewer agents share a base model, code review becomes a confidence amplifier rather than a quality gate. The asymmetric architectures, multi-pass critics, and eval discipline that turn AI review into actual signal.
insidercode-review
Apr 2512 min
Your APIs Assumed One Human at a Time. Parallel Agents Broke the Contract.
Internal APIs were designed for human-paced sessions. When users spawn parallel agents, the rate limits, idempotency assumptions, audit-log schemas, and CSRF flows all break at once.
insideragents
Apr 2512 min
Your Model Update Is a Breaking Change: The Behavioral Changelog You Owe Your Integrators
When a vendor silently rolls a minor model update, every downstream prompt becomes a contract that nobody honored — what a behavioral changelog should contain, why nobody publishes one, and what consumers should instrument while waiting.
llmmodel-versioning
Apr 2512 min
The kWh Column Missing From Your Inference Span: Carbon Attribution Per Request
Sustainability disclosure is moving from corporate aggregate to product-level granularity. Engineering teams that measure cost per token but not energy per request are about to discover the dashboards they built solve the wrong problem.
insidersustainability
Apr 2510 min
Content Provenance for AI Outputs: C2PA, SynthID, and the Audit Trail You Will Soon Owe
By August 2026, a generative model's output is a signed artifact, not a string. Here's the architecture C2PA and SynthID demand — and why retrofitting it later costs more than building it now.
provenanceai-governance
Apr 2510 min
The Contestability Gap: Engineering AI Decisions Your Users Can Actually Appeal
Production AI agents turn refund denials, content removals, and verification rejections into final answers by accident. Build the durable record, the appeal endpoint, and a real second-look pipeline before regulators or angry users force you to.
insiderai-agents
Apr 2511 min
Debate Diversity Collapse: When Three Agents Vote 3-0 Because They Read the Same Internet
Multi-agent LLM panels often vote 3-0 not because the answer is right but because frontier models share priors. A practical guide to measuring debate diversity collapse and designing ensembles that actually disagree.
insidermulti-agent
Apr 2511 min
DLP Belongs in Your AI Gateway, Not Bolted Into Every App
Per-app redaction libraries always drift, fork, and get bypassed. Centralize DLP at the LLM gateway as a mandatory egress checkpoint with per-route policies and reversible vault tokens.
insiderai-gateway
Apr 2511 min
The Dual-Writer Race: When Your Agent and Your User Edit the Same Calendar Event
Agents that edit calendars, CRMs, and tickets inherit a concurrency bug class their tools were never designed for. The fix is plumbing version tokens through the tool layer.
insiderai-agents
Apr 2512 min
The Inference Budget Committee: Governance When Token Spend Crosses Seven Figures
How engineering orgs govern LLM token spend at the seven-figure threshold: capacity pools, outcome-based chargeback, and the committee that allocates them.
insiderinference
Apr 2512 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.

Page 53

The 95% Reliability Illusion: Why Your 10-Step Agent Fails 40% of the Time

Why AI-Generated Comments Rot Faster Than the Code They Describe

AI Reviewing AI: The Asymmetric Architecture of Code-Review Agents

Your APIs Assumed One Human at a Time. Parallel Agents Broke the Contract.

Your Model Update Is a Breaking Change: The Behavioral Changelog You Owe Your Integrators

The kWh Column Missing From Your Inference Span: Carbon Attribution Per Request

Content Provenance for AI Outputs: C2PA, SynthID, and the Audit Trail You Will Soon Owe

The Contestability Gap: Engineering AI Decisions Your Users Can Actually Appeal

Debate Diversity Collapse: When Three Agents Vote 3-0 Because They Read the Same Internet

DLP Belongs in Your AI Gateway, Not Bolted Into Every App

The Dual-Writer Race: When Your Agent and Your User Edit the Same Calendar Event

The Inference Budget Committee: Governance When Token Spend Crosses Seven Figures

About Tian Pan