Tool selection accuracy drops to 13% when LLMs face large tool sets. Here's why over-tooling breaks your agents and how to architect around it with routing layers, hierarchical toolsets, and lazy-loading registries.
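As a taste of the pattern this piece covers, here is a minimal sketch of a lazy-loading tool registry with a routing stage, so the model only ever sees a handful of relevant tool schemas per request. All names are hypothetical, and the keyword-overlap scorer stands in for the embedding-based router a real system would use:

```python
# Hypothetical sketch: route a query against a large tool registry and
# lazy-load only the top-k matching tool schemas into the prompt.
from dataclasses import dataclass, field

@dataclass
class Tool:
    name: str
    description: str
    category: str

@dataclass
class ToolRegistry:
    tools: dict = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def route(self, query: str, max_tools: int = 5) -> list:
        # Stage 1: score every tool by keyword overlap with the query
        # (a production router would use embedding similarity instead).
        terms = set(query.lower().split())
        scored = []
        for tool in self.tools.values():
            overlap = len(terms & set(tool.description.lower().split()))
            scored.append((overlap, tool))
        scored.sort(key=lambda pair: -pair[0])
        # Stage 2: expose only the top-k tools that actually matched,
        # keeping the model's effective tool set small.
        return [tool for score, tool in scored[:max_tools] if score > 0]

registry = ToolRegistry()
registry.register(Tool("search_tickets", "search open support tickets", "support"))
registry.register(Tool("refund_order", "issue a refund for an order", "billing"))
selected = registry.route("customer wants a refund on order 123")
```

The point of the two-stage shape is that the full registry never reaches the model; only the routed slice does.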
Semantic similarity doesn't respect data-access boundaries. Here's how RAG pipelines expose sensitive records to unauthorized users—and the layered defenses that stop them.
Embedding a user's documents creates a novel privacy attack surface that traditional databases don't have. Here's how re-identification risks work, where access control breaks down in RAG pipelines, and the architectural patterns that actually close the gap.
When you inherit a production prompt with no documentation, how do you figure out what it was supposed to do? A systematic methodology for recovering intent from undocumented prompts — and the documentation format that prevents the next engineer from facing the same problem.
Production prompts accumulate technical debt through incremental patches that compound into contradictory, bloated instructions. Here's how to recognize the spiral and break it before a prompt becomes unmaintainable.
When you have 50+ active prompts across product, ML, and infra teams, you have a distributed systems problem — not a writing problem. Here's the infrastructure that keeps it from becoming a liability.
Per-request sanitization gives teams a false sense of security. As RAG systems index millions of documents and agents consume third-party tool outputs, the real defense requires architecture-level controls: content provenance, trust-tier enforcement, and sandboxed execution.
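To make trust-tier enforcement concrete, here is a minimal deny-by-default sketch, with made-up tier and action names, that gates what an agent may do based on the provenance of the content influencing it:

```python
# Hypothetical sketch: tag content by provenance tier and enforce a
# per-capability policy floor, deny-by-default.
from enum import IntEnum

class TrustTier(IntEnum):
    SYSTEM = 3       # your own curated content
    FIRST_PARTY = 2  # authenticated user documents
    THIRD_PARTY = 1  # scraped pages, external tool outputs

# Minimum provenance tier allowed to influence each capability.
POLICY = {
    "execute_code": TrustTier.SYSTEM,
    "call_internal_api": TrustTier.FIRST_PARTY,
    "summarize": TrustTier.THIRD_PARTY,
}

def allowed(action: str, content_tier: TrustTier) -> bool:
    """Permit an action only when the content's provenance tier
    meets the policy floor; unknown actions are denied outright."""
    floor = POLICY.get(action)
    if floor is None:
        return False
    return content_tier >= floor
```

Under this policy, third-party text can be summarized but can never trigger code execution, even if it contains an injected instruction to do so.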
Why prompts that perform at 91% in English quietly degrade to 72% in Japanese or Arabic — and how to build the evaluation infrastructure that catches these regressions before they reach non-English users.
Consumer-facing LLM features face attack surfaces that internal agents never see. A practical guide to injection vectors, jailbreak patterns at scale, model inversion risks, and the systematic hardening playbook for production AI.
When every query funnels through a single embedding space, structurally different query types all hit the same systematic retrieval misses. Here's how to audit your retrieval diversity and fix it without blowing your latency budget.
API key scoping is not enough. When your AI agent can execute code, you need container isolation, filesystem namespacing, egress controls, and a capability audit process — or you're one prompt injection away from a lateral movement incident.
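As a flavor of the capability-audit mindset, here is a minimal sketch of a deny-by-default sandbox manifest; every field and name is illustrative, standing in for what a container runtime and network policy would actually enforce:

```python
# Hypothetical sketch: a deny-by-default capability manifest for an
# agent's code-execution sandbox. Anything not listed is refused.
from dataclasses import dataclass

@dataclass(frozen=True)
class SandboxManifest:
    # Filesystem namespacing: writes allowed only under these prefixes.
    writable_paths: frozenset = frozenset({"/tmp/agent"})
    # Egress control: hosts not on this allowlist are blocked.
    allowed_hosts: frozenset = frozenset()

    def can_write(self, path: str) -> bool:
        return any(path.startswith(prefix) for prefix in self.writable_paths)

    def can_connect(self, host: str) -> bool:
        return host in self.allowed_hosts

manifest = SandboxManifest(allowed_hosts=frozenset({"api.internal.example"}))
```

The auditable artifact here is the manifest itself: reviewing one small allowlist per agent is tractable in a way that reviewing every possible prompt is not.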
A practical decision framework for engineers deciding when to move LLM inference to the edge: latency thresholds, cost break-even analysis, the quantization quality tax, and split-inference architectures.
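The cost break-even piece of that framework reduces to simple arithmetic. A sketch with entirely made-up unit costs (the structure, not the numbers, is the point):

```python
# Hypothetical break-even sketch: monthly request volume above which
# amortized edge hardware beats per-request cloud inference pricing.
def edge_monthly_cost(hardware_cost: float, amortization_months: int,
                      power_and_ops_per_month: float) -> float:
    return hardware_cost / amortization_months + power_and_ops_per_month

def break_even_requests(cost_per_1k_requests: float, hardware_cost: float,
                        amortization_months: int,
                        power_and_ops_per_month: float) -> float:
    """Requests/month where edge's fixed cost equals cloud's variable cost."""
    fixed = edge_monthly_cost(hardware_cost, amortization_months,
                              power_and_ops_per_month)
    return fixed / cost_per_1k_requests * 1000

# Assumed numbers: $2,400 device amortized over 24 months,
# $20/month power + ops, cloud priced at $0.50 per 1k requests.
volume = break_even_requests(0.50, 2400, 24, 20)  # 240,000 requests/month
```

Anything above that volume favors the edge on cost alone; the quantization quality tax and latency thresholds then decide whether the trade is actually worth taking.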