Blog

Page 86

12 articles

Embedding Drift: The Silent Degradation Killing Your Long-Lived RAG System
How mixed embedding models, chunking strategy changes, and preprocessing inconsistencies silently degrade RAG retrieval quality — and what to do about it.
insider, rag
Apr 18, 10 min
The Embedding Refresh Problem: Running a Vector Store Like a Database Engineer
Over 60% of RAG failures trace back to stale vectors, not bad prompts. How to apply database engineering discipline—CDC, drift detection, zero-downtime model migrations—to keep your vector index in sync with source truth.
rag, vector-search
Apr 18, 10 min
The EU AI Act Is Now Your Engineering Backlog
The EU AI Act's August 2026 deadline for high-risk AI systems translates directly into concrete engineering tasks: audit trail architecture, data governance pipelines, and human oversight interfaces. Here's what engineers need to build — and in what order.
ai-engineering, compliance
Apr 18, 12 min
The EU AI Act Features That Silently Trigger High-Risk Compliance — and What You Must Ship Before August 2026
Specific engineering decisions — adding a mood signal to your HR dashboard, routing loan decisions through a model — can silently cross the EU AI Act's high-risk threshold. Here's what triggers classification, and what you must build before August 2026 enforcement.
insider, ai-engineering
Apr 18, 9 min
Eval Set Decay: Why Your Benchmark Becomes Misleading Six Months After You Build It
Static eval sets are frozen snapshots of user behavior. As real traffic evolves, your benchmark drifts from production reality—here's how to measure decay and keep evals honest.
insider, evaluation
Apr 18, 10 min
Evaluating AI Service Vendors Beyond Your LLM Provider
Most teams scrutinize their LLM provider but trust everything else on vibes. A rigorous framework for evaluating guardrail vendors, embedding providers, observability tools, and fine-tuning platforms—with due diligence criteria that catch business-model risk before it bites you.
ai-engineering, vendor-evaluation
Apr 18, 10 min
Foundation Model Vendor Strategy: What Enterprise SLAs Actually Guarantee
Enterprise teams pick LLM vendors based on benchmarks and demos. Then they hit production and discover what the SLA actually says — which is usually much less than they assumed.
insider, ai-engineering
Apr 18, 12 min
The Evaluation Paradox: How Goodhart's Law Breaks AI Benchmarks
When AI teams optimize for benchmark scores instead of real capabilities, scores climb while quality degrades. Here's how the evaluation paradox works and what structural changes actually make evals resistant to gaming.
insider, ai
Apr 18, 10 min
GraphRAG vs. Vector RAG: The Architecture Decision Teams Make Too Late
Vector RAG hits a mathematical ceiling on relational queries. Here is the migration path from pure vector to hybrid graph-vector retrieval, and the query patterns that reveal you've outgrown dense-only search.
RAG, GraphRAG
Apr 18, 12 min
Hallucination Is Not a Root Cause: A Debugging Methodology for AI in Production
Moving beyond 'the model hallucinated' to systematic root cause analysis: retrieval failure, conflicting context, prompt ambiguity, and knowledge boundary violations each require different fixes.
insider, llm
Apr 18, 10 min
Why Hallucination Rate Is the Wrong Primary Metric for Production LLM Systems
Hallucination rate is easy to measure but weakly correlated with user outcomes. A framework for choosing behavioral metrics that actually reflect whether your AI feature is working.
evaluation, observability
Apr 18, 8 min
The Idempotency Problem in Agentic Tool Calling
Why agent retry logic causes duplicate charges, double-sent emails, and inconsistent state — and how saga patterns, idempotency keys, and structured error signals fix the problem at the architecture level.
insider, ai-engineering
Apr 18, 11 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
