Blog

Page 73

12 articles

The Data Quality Tax in LLM Systems: Why Bad Input Hits Differently
Traditional ML degrades gracefully on noisy data. LLMs hallucinate confidently, corrupt vector stores, and propagate errors downstream with apparent authority. Here's how to measure and mitigate the data quality tax.
insider · llm
Apr 18 · 9 min
Dead Reckoning for Long-Running Agents: Knowing Where Your Agent Is Without Stopping It
When an agent runs for hours, knowing where it is—and whether it's still on track—becomes a first-class engineering problem. These are the patterns that solve it.
insider · ai-agents
Apr 18 · 11 min
Decision Provenance in Agentic Systems: Audit Trails That Actually Work
When autonomous agents take consequential actions, having logs is not the same as having accountability. A practical guide to designing decision provenance for production agentic systems — event schemas, ownership handoffs, hallucination attribution, and the compliance requirements that make this non-optional.
insider · agentic-ai
Apr 18 · 13 min
The AI Feature Sunset Playbook: Decommissioning Agents Without Breaking Your Users
Shutting down an AI feature is fundamentally different from deprecating a deterministic API. Here's the engineering playbook for mapping behavioral dependencies, staging sunsets, and avoiding the support ticket avalanche.
insider · ai-engineering
Apr 18 · 10 min
Designing for Partial Completion: When Your Agent Gets 70% Done and Stops
Most agent failure designs assume clean abort or clean success. Real agents hit uncertainty, authorization limits, and resource constraints mid-task. Here's how to design for what actually happens.
insider · ai-engineering
Apr 18 · 10 min
Dev/Prod Parity for AI Apps: The Seven Ways Your Staging Environment Is Lying to You
Staging environments systematically misrepresent how LLM applications behave in production. Here are seven specific failure modes — from prompt cache warmth to silent traffic distribution drift — and the pre-prod checks that surface them.
llmops · production
Apr 18 · 11 min
Distributed Tracing Across Agent Service Boundaries: The Context Propagation Gap
When agents call agents across microservice boundaries, W3C TraceContext breaks down and your traces fragment into disconnected spans. Here's the technical shape of the failure and how to fix it.
insider · observability
Apr 18 · 11 min
Embedding Drift: The Silent Degradation Killing Your Long-Lived RAG System
How mixed embedding models, chunking strategy changes, and preprocessing inconsistencies silently degrade RAG retrieval quality — and what to do about it.
insider · rag
Apr 18 · 10 min
The Embedding Refresh Problem: Running a Vector Store Like a Database Engineer
Over 60% of RAG failures trace back to stale vectors, not bad prompts. How to apply database engineering discipline—CDC, drift detection, zero-downtime model migrations—to keep your vector index in sync with source truth.
rag · vector-search
Apr 18 · 10 min
The EU AI Act Is Now Your Engineering Backlog
The EU AI Act's August 2026 deadline for high-risk AI systems translates directly into concrete engineering tasks: audit trail architecture, data governance pipelines, and human oversight interfaces. Here's what engineers need to build — and in what order.
ai-engineering · compliance
Apr 18 · 12 min
The EU AI Act Features That Silently Trigger High-Risk Compliance — and What You Must Ship Before August 2026
Specific engineering decisions — adding a mood signal to your HR dashboard, routing loan decisions through a model — can silently cross the EU AI Act's high-risk threshold. Here's what triggers classification, and what you must build before August 2026 enforcement.
insider · ai-engineering
Apr 18 · 9 min
Eval Set Decay: Why Your Benchmark Becomes Misleading Six Months After You Build It
Static eval sets are frozen snapshots of user behavior. As real traffic evolves, your benchmark drifts from production reality—here's how to measure decay and keep evals honest.
insider · evaluation
Apr 18 · 10 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
