Blog

Page 122

12 articles

AI On-Call Psychology: Rebuilding Operator Intuition for Non-Deterministic Alerts
On-call for AI systems breaks standard SRE intuition. A practical taxonomy, rotation design, and training curriculum for operating stochastic production systems without burning out the team or missing real regressions.
insider, ai-engineering
Apr 15 · 11 min
AI Product Metrics That Don't Lie: Behavioral Signals Over Thumbs-Up Scores
Aggregate satisfaction scores and thumbs-up rates hide the cases where AI is confidently wrong. Here's the behavioral signal stack that actually tells you whether your model improvement moved the needle.
ai-engineering, product
Apr 15 · 9 min
The AI Reliability Floor: Why 80% Accurate Is Worse Than No AI at All
There is a reliability floor below which an AI feature actively destroys user trust faster than it can build value. Here is how to find it before shipping.
insider, ai-engineering
Apr 15 · 9 min
The AI Procurement Gap: Why Your Vendor Evaluation Process Can't Handle Probabilistic Systems
Traditional RFPs score features and uptime SLAs that mean nothing for stochastic outputs. The eval-driven assessment, contract clauses, and vendor transparency signals that procurement teams are missing for AI.
insider, ai-procurement
Apr 15 · 11 min
Stop Writing Prompts by Hand: Automated Optimization with DSPy and MIPRO
DSPy and its MIPRO optimizer replace manual prompt engineering with declarative signatures and Bayesian search — producing prompts that outperform hand-written ones by 20–40% on complex tasks. Here's how the system works and when it's worth the overhead.
ai-engineering, llm
Apr 15 · 9 min
Backpressure for LLM Pipelines: Queue Theory Applied to Token-Based Services
How to apply Little's Law, admission control, bulkheads, and token-bucket backpressure to LLM call graphs — and why naive retry logic turns transient provider blips into outages.
insider, llm
Apr 15 · 11 min
The Bias Audit You Keep Skipping: Engineering Demographic Fairness into Your LLM Pipeline
Safety filters and fairness checks are different problems requiring different engineering responses. Output quality disparities across gender, race, and language group won't surface in your guardrails — here's the methodology that catches them before they ship.
insider, llm
Apr 15 · 10 min
The Cognitive Offloading Trap: When Your Team Can't Work Without the AI
Engineering teams that route all knowledge work through AI agents stop practicing the underlying skills. Here's how to recognize unhealthy AI dependency and design deliberate practices that preserve human capability.
ai-engineering, team-dynamics
Apr 15 · 9 min
Compound Failure Modes in AI Pipelines: When Partial Success Isn't Enough
If each stage of your AI pipeline succeeds 95% of the time, a three-step chain succeeds only 86% of the time. The probability math practitioners undercount, correlation effects that make it dramatically worse, and the architectural patterns that prevent multiplicative collapse in production.
ai-engineering, reliability
Apr 15 · 9 min
Context Compression Changes What Your Model Actually Sees
Token pruning and prompt compression can cut LLM inference costs by 3–10x, but they silently change what your model sees. A practical breakdown of the failure modes — lost coreference chains, dropped constraints, tool output hallucination — and how to validate and budget compression safely.
llm, context-management
Apr 15 · 11 min
Continuous Fine-Tuning Without Data Contamination: The Production Pipeline
A production engineering guide to ongoing LLM fine-tuning from user feedback — covering data routing architecture, contamination detection, catastrophic forgetting prevention, and automated safety preservation.
insider, mlops
Apr 15 · 11 min
Contract Tests for Prompts: Stop One Team's Edit From Breaking Another Team's Agent
Prompts are shared APIs without contracts — a consumer-driven testing discipline catches cross-team breaking changes before they hit production agents.
prompt-engineering, contract-testing
Apr 15 · 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
