Blog

Page 79

12 articles

Why Vision Models Ace Benchmarks but Fail on Your Enterprise PDFs
Vision models post impressive benchmark numbers on document understanding, but enterprise teams routinely see silent failures on real PDFs. Here's what breaks and how to build pipelines that survive contact with production documents.
insider · document-ai
Apr 18 · 9 min
Who Owns AI Quality? The Cross-Functional Vacuum That Breaks Production Systems
AI quality failures rarely stem from bad models. They stem from nobody claiming ownership. Here's how to fix the accountability vacuum before it costs you.
ai-engineering · mlops
Apr 18 · 10 min
Agent Identity and Delegated Authorization: OAuth Patterns for Agentic Actions
When an AI agent books a calendar event or sends an email on your behalf, it operates under delegated authority. Here's how to design OAuth scope contracts, rotation lifecycle, revocation triggers, and audit trails for production agentic systems.
ai-engineering · security
Apr 17 · 10 min
Agentic Data Pipelines: Offline Enrichment and Classification at Scale
How AI agents change the design of ETL and batch-enrichment workflows — variable compute per record, confidence thresholds as operational contracts, schema design for downstream consumers, and monitoring patterns that distinguish model uncertainty from data ambiguity.
ai-engineering · data-pipelines
Apr 17 · 9 min
AI-Native API Design: Why REST Breaks When Your Backend Thinks Probabilistically
REST was built for fast, deterministic backends. LLM services are slow, probabilistic, and long-running — and the interface patterns that actually hold up in production look nothing like conventional HTTP API design.
insider · api-design
Apr 17 · 11 min
The AI On-Call Playbook: Incident Response When the Bug Is a Bad Prediction
Traditional runbooks break when the symptom is 'outputs feel wrong.' A practical triage decision tree, escalation criteria, and postmortem format built specifically for AI systems in production.
insider · mlops
Apr 17 · 12 min
The AI Ops Dashboard Nobody Builds Until It's Too Late
Latency and error rate cover less than 20% of the failure space for LLM-powered features. Here are the five production failure modes your APM dashboard silently ignores — and the signal hierarchy that actually catches them.
insider · ai-engineering
Apr 17 · 11 min
Chatbot, Copilot, or Agent: The Taxonomy That Changes Your Architecture
Picking the wrong AI interaction paradigm — chatbot, copilot, or agent — creates architectural debt you can't fix by tuning prompts. A breakdown of the trust models, context-window strategies, and error-recovery requirements that should drive the decision before you write a line of code.
insider · ai-agents
Apr 17 · 10 min
The Cold Start Problem in AI Personalization: Being Useful Before You Have Data
New users have no history, your model has no context, and you're competing against the perception that AI doesn't know them. Here's the engineering playbook for bridging that gap.
insider · ai
Apr 17 · 11 min
Why '92% Accurate' Is Almost Always a Lie
A single accuracy number hides the errors that actually matter. Here's a four-dimension taxonomy — correct, recoverable, harmful, abstained — and a one-page format that gives non-technical stakeholders enough to make the right product, legal, and investment decisions.
insider · ai
Apr 17 · 8 min
The Data Flywheel Is Not Free: Engineering Feedback Loops That Actually Improve Your AI Product
Most teams collect thumbs-up/down and call it a feedback loop. The real infrastructure is implicit signal extraction, weak supervision pipelines, and closed-loop architecture that routes production data back into training without drowning in annotation overhead.
insider · ai-engineering
Apr 17 · 11 min
Data Versioning for AI: The Dataset-Model Coupling Problem Teams Discover Too Late
Why 'the model regressed' usually means 'the upstream data changed' — and the lineage graph patterns that let you trace production degradations to their data cause before wasting a week re-tuning prompts.
mlops · data-engineering
Apr 17 · 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
