Blog

Page 39

12 articles

The Agent Degraded-Mode Spec Is the Document You Didn't Write
Most production agents have a degraded-mode spec — it just lives in scattered catch blocks, untested, and the customer writes the public version of it on the next bad day.
agents, reliability
Apr 27 · 11 min
Agent Disaster Recovery: When Working Memory Dies With the Region
Agent runtimes hide state in places your DR runbook never named. The fix: name the state surface, generate idempotency keys at task scope, checkpoint before every tool call, and default to fail-safe abort over fail-forward replay.
insider, ai-engineering
Apr 27 · 12 min
Agent Incident Forensics: Capture Before You Need It
When an agent issues a wrong refund, your CRO will ask what produced it — and the answer requires a captured-at-write-time tuple of prompt, model id, decode config, tool results, and conversation history. Here is the discipline that makes 'we can reconstruct it' a true statement.
ai-engineering, observability
Apr 27 · 11 min
Output As Payload: Your AI Threat Model Got Half The Boundary
AI threat models usually stop at the model and treat output as safe content. Indirect prompt injection turns rendered markdown, structured output, generated code, and tool-call arguments into attack payloads — and the boundary worth defending is downstream of the model.
ai-security, threat-modeling
Apr 27 · 9 min
The Agent Permission Prompt Has a Habituation Curve, and Your Safety Story Lives on Its Slope
A permission prompt is a security control with a measurable half-life. Track per-user approval rate, tier friction by blast radius, and stop letting a 100% click-through rate carry your safety story.
insider, ai-agents
Apr 27 · 10 min
Your Agent Release Notes List Files. Your Integrators Need Behavior Diffs.
Every agent release ships a bundle of system-prompt, model, tool, rubric, and retriever changes — and a file-diff changelog tells integrators nothing about the behavior shifts they will actually parse, budget against, or get paged on.
ai-engineering, agent-versioning
Apr 27 · 13 min
Agent Trace Sampling: When 'Log Everything' Costs $80K and Still Misses the Regression
Request-level sampling policies break for agent traces. A per-tier policy — always-trace failures, head-sample successes, tail-sample by cost percentile — turns the trace store from a budget hole into an incident-response tool.
insider, llm
Apr 27 · 10 min
Your Prompts Ship Like Cowboys: Why Code Review Discipline Doesn't Extend to AI Artifacts
A four-line bug fix gets three rounds of code review. A forty-line system-prompt edit ships with a single LGTM. A field guide to closing the discipline gap on AI artifacts before it ships your next regression.
llm, code-review
Apr 27 · 11 min
The Demo Was a Single Seed: Why Your AI Rollout Is a Variance Problem, Not a Polish Problem
The wow demo was one realization out of thousands the model would generate against the same input. The rollout craters not because polish is missing, but because nobody measured variance. Here's the n-of-k sampling, worst-case input library, and distribution-shift checklist that close the gap.
insider, ai-engineering
Apr 27 · 11 min
The Hidden Edges Between Your AI Features: When One Prompt Edit Regresses Three Other Teams
AI features compose through artifacts nobody catalogs — prompt fragments, eval seeds, judge rubrics. When a shared edit lands, three other teams regress and nobody can attribute it. Here's how to draw the graph.
ai-engineering, platform-engineering
Apr 27 · 9 min
Your AI Explainer Doc Is a Runtime Dependency, Not Marketing Copy
When the prompt changes and the help-center article doesn't, your AI feature's trust contract breaks silently — and the prompt repo can predict the gap.
ai-engineering, documentation
Apr 27 · 12 min
Your AI Feature Ramp Is Rolling Out on the Wrong Axis
User-percentage feature flags spread the hard 5% of queries evenly across cohorts, hiding tail regressions until 100%. Ramp by difficulty, token length, query slice, or tool-call depth instead — that is the axis where AI blast radius actually lives.
ai-engineering, progressive-delivery
Apr 27 · 11 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
