Blog

Page 116

12 articles

Why Your Document Extractor Breaks on the Contracts That Matter Most
Fixed-layout extractors fail on the adversarial diversity of real enterprise documents. Here's the preprocessing pipeline that actually works in production, and the eval methodology that measures quality on the long tail.
insider, document-ai
Apr 16 · 13 min
Enterprise RAG Governance: The Org Chart Behind Your Retrieval Pipeline
40–60% of enterprise RAG deployments fail to reach production. The culprit is almost never the retrieval algorithm—it's governance: no document ownership, no access controls at query time, no PII handling, no freshness enforcement.
insider, rag
Apr 16 · 11 min
Eval Coverage as a Production Metric: Is Your Test Suite Actually Testing What Users Do?
A green eval suite can coexist with silently degraded production quality. Here's how to measure whether your evals actually represent real user intent—and what to do when they don't.
llm, evaluation
Apr 16 · 9 min
Event-Driven Agent Scheduling: Why Cron + REST Calls Fail for Recurring AI Workloads
Cron was built for sysadmin scripts, not autonomous agents. Here's what breaks when you use it for recurring LLM jobs—and the message queue architecture that actually works.
insider, ai engineering
Apr 16 · 11 min
Why Your AI Model Is Always 6 Months Behind: Closing the Feedback Loop
AI models degrade silently because the gap between user failures and model updates spans months. Here's how to instrument implicit signals, run online evaluation, and use fast-path fine-tuning to compress that cycle from quarters to days.
mlops, ai-engineering
Apr 16 · 10 min
The Feedback Loop Trap: Why AI Features Degrade When Users Adapt to Them
Self-induced distribution shift is the silent killer of production AI features. When users adapt their behavior to your AI's outputs, retraining on that adapted data makes the problem worse. Here's how to detect, measure, and break the loop.
ai-engineering, production-ai
Apr 16 · 10 min
Feedback Surfaces That Actually Train Your Model
Thumbs-up/down captures signal from the wrong users at the wrong moment. Here's how to design feedback surfaces that generate high-fidelity training data as a natural byproduct of product use.
ai-engineering, rlhf
Apr 16 · 10 min
Fleet Health for AI Agents: What Single-Agent Observability Gets Wrong at Scale
Scaling from one agent to a thousand exposes fleet-level failure modes that single-agent observability tools miss entirely: version heterogeneity, correlated provider cascades, and token spirals that burn monthly budgets in minutes.
ai-engineering, observability
Apr 16 · 9 min
GraphRAG vs. Vector RAG: When Knowledge Graphs Beat Embeddings
Vector embeddings degrade to zero accuracy on multi-entity queries in compliance and enterprise domains. Here's when knowledge graphs are the right call — and the operational costs you're signing up for.
insider, rag
Apr 16 · 9 min
Where to Put the Human: Placement Theory for AI Approval Gates
The most common HITL mistake isn't skipping human review — it's placing it at the wrong point. A framework for classifying agent actions by risk and inserting approval gates exactly where they prevent irreversible damage.
insider, ai-engineering
Apr 16 · 12 min
When Embeddings Aren't Enough: A Decision Framework for Hybrid Retrieval Architecture
A practical framework for when to combine BM25 with dense embeddings, how to handle metadata filters without killing recall, and when cross-encoder reranking is worth the latency cost.
rag, retrieval
Apr 16 · 11 min
The Insider Threat You Created When You Deployed Enterprise AI
Giving employees AI coding assistants and document search agents also gives compromised insider accounts significantly amplified capability. Here's the threat model and the architectural controls that limit blast radius.
insider, ai
Apr 16 · 10 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
