Code agents produce code that compiles, lints, and looks right but silently does the wrong thing. Here's why the training objective guarantees this, what the data shows, and how to build verification loops that actually catch it.
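A minimal sketch of such a loop, assuming a hypothetical `generate_patch` hook for the model call: a candidate is accepted only when a known-good test suite passes against it, so behavior rather than appearance is the gate.

```python
# Verification-loop sketch: execute generated code against trusted tests
# before accepting it. The file layout and `generate_patch` are illustrative.
import pathlib
import subprocess
import tempfile

def verify(patch_code: str, test_code: str) -> bool:
    with tempfile.TemporaryDirectory() as d:
        root = pathlib.Path(d)
        (root / "impl.py").write_text(patch_code)
        (root / "test_impl.py").write_text(test_code)
        # The gate is pytest's exit code, not whether the diff "looks right".
        result = subprocess.run(
            ["python", "-m", "pytest", "-q", str(root)],
            capture_output=True, timeout=60,
        )
        return result.returncode == 0

def generate_until_verified(generate_patch, test_code: str, max_tries: int = 3):
    for attempt in range(max_tries):
        patch = generate_patch(attempt=attempt)  # hypothetical LLM call
        if verify(patch, test_code):
            return patch
    raise RuntimeError("no candidate passed verification")
```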
A practitioner's methodology for enumerating every external data source that reaches your LLM prompt, risk-scoring each injection surface, and applying the right sanitization pattern without breaking model reasoning.
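A minimal sketch of the risk-scoring step, with invented weights; the structure (how attacker-controllable a source is, multiplied by the privilege of what the tainted text can reach) is the point, not the specific numbers.

```python
# Hypothetical injection-surface inventory with a toy risk score.
from dataclasses import dataclass

@dataclass
class InjectionSurface:
    name: str
    attacker_controlled: float  # 0..1: can an outsider author this content?
    reaches_tools: bool         # does this text precede privileged tool calls?
    sanitized: bool             # is a sanitization pattern already applied?

    @property
    def risk(self) -> float:
        # Invented weights: tool-reaching surfaces score 3x worse, and
        # existing sanitization cuts residual risk to 30%. Tune for your stack.
        score = self.attacker_controlled * (3.0 if self.reaches_tools else 1.0)
        return score * (0.3 if self.sanitized else 1.0)

surfaces = [
    InjectionSurface("user message", 1.0, True, False),
    InjectionSurface("retrieved web page", 0.9, True, False),
    InjectionSurface("internal wiki chunk", 0.2, True, True),
]
for s in sorted(surfaces, key=lambda s: -s.risk):
    print(f"{s.risk:4.2f}  {s.name}")
```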
Eval datasets tell you whether your LLM passes a fixed set of examples. Property-based testing tells you whether it obeys a contract across the entire input space. Here's how to apply it to non-deterministic systems.
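A minimal sketch using the Hypothesis library, testing the contract "never return a price that isn't in the input" across generated documents; `extract_prices` is a hypothetical wrapper, stubbed here with a regex so the sketch runs offline, where a real suite would wrap the model call.

```python
from hypothesis import given, settings, strategies as st

def extract_prices(text: str) -> list[float]:
    """Hypothetical LLM-backed extractor, stubbed with a regex so this
    runs offline; a real suite would call the model here."""
    import re
    return [float(m) for m in re.findall(r"\$(\d+\.\d{2})", text)]

@st.composite
def invoices(draw):
    # Synthetic invoice text with known ground-truth prices.
    prices = draw(st.lists(
        st.decimals(min_value=0, max_value=9999, places=2),
        min_size=0, max_size=5))
    text = "\n".join(f"Item {i}: ${p}" for i, p in enumerate(prices))
    return text, [float(p) for p in prices]

@settings(max_examples=200, deadline=None)  # deadline=None: model calls are slow
@given(invoices())
def test_extractor_never_invents_prices(case):
    text, truth = case
    # Contract: every extracted price appears in the source document.
    assert set(extract_prices(text)) <= set(truth)
```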
Seven hidden coupling points — from prompt syntax and tool calling schemas to embedding spaces and billing models — explain why switching LLM providers takes months, not days. A practical audit framework for managing lock-in deliberately.
Parallel sub-agents silently corrupt shared state in ways that look exactly like model hallucination. Here's how read-modify-write races work in production agent systems, which distributed-systems primitives fix them, and the instrumentation that distinguishes a concurrency bug from a genuine model failure.
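A minimal sketch of one such primitive, optimistic concurrency via compare-and-swap; the in-memory `StateStore` and its method names are illustrative stand-ins for whatever your store offers (Redis WATCH/MULTI, DynamoDB conditional writes, and Postgres row versions all provide the same building block).

```python
# Compare-and-swap sketch: a version check turns a silent lost update
# into a retryable conflict instead of "hallucinated" state.
import threading

class ConflictError(Exception):
    pass

class StateStore:
    """In-memory stand-in for a real versioned store."""
    def __init__(self):
        self._lock = threading.Lock()
        self._value, self._version = {}, 0

    def read(self):
        with self._lock:
            return dict(self._value), self._version

    def compare_and_swap(self, new_value, expected_version):
        with self._lock:
            if self._version != expected_version:
                raise ConflictError(
                    f"expected v{expected_version}, store is at v{self._version}")
            self._value, self._version = new_value, self._version + 1

def update_with_retry(store, mutate, max_attempts=5):
    # Read-modify-write under CAS: on conflict, re-read and re-apply.
    for _ in range(max_attempts):
        value, version = store.read()
        try:
            store.compare_and_swap(mutate(value), version)
            return
        except ConflictError:
            continue  # another sub-agent won the race; retry on fresh state
    raise RuntimeError("persistent write contention")
```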
Request coalescing is a layered architecture (in-flight deduplication, exact caching, and semantic batching) that cuts LLM inference costs by 40–60% without degrading user experience. Here's how to implement it and where it breaks down.
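A minimal sketch of the first layer, in-flight deduplication ("singleflight"), under asyncio; `call_llm` and the key fields are assumptions standing in for your client. Concurrent identical requests share a single upstream model call.

```python
# Singleflight sketch: identical in-flight requests await one shared future.
import asyncio
import hashlib
import json

_inflight: dict[str, asyncio.Future] = {}

def _key(model: str, prompt: str, params: dict) -> str:
    raw = json.dumps([model, prompt, params], sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

async def coalesced_completion(model, prompt, params, call_llm):
    key = _key(model, prompt, params)
    if key in _inflight:                 # identical call already running:
        return await _inflight[key]      # piggyback on its result
    fut = asyncio.get_running_loop().create_future()
    _inflight[key] = fut
    try:
        result = await call_llm(model, prompt, params)
        fut.set_result(result)
        return result
    except Exception as exc:
        fut.set_exception(exc)
        raise
    finally:
        del _inflight[key]               # the exact-match cache layer goes here
```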
The shape of your entity schema directly determines LLM output reliability. Learn how normalization, nesting depth, field ordering, and enum constraints affect hallucination rates — and the refactoring patterns that make prompt-to-output mapping predictable.
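A minimal sketch of the schema-shape refactor in question, expressed as Pydantic models so the JSON Schema handed to the model is explicit; the field names are invented examples, not a prescribed schema.

```python
# Shallow nesting plus closed enums: each field is independently checkable
# and the model cannot invent enum values.
from enum import Enum
from pydantic import BaseModel, Field

class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

class Ticket(BaseModel):
    # Flat, deliberately ordered fields instead of a deeply nested object:
    # the model fills them in sequence, and each maps to one prompt clause.
    title: str
    severity: Severity
    component: str = Field(description="one of the repo's top-level dirs")

print(Ticket.model_json_schema())  # this schema is what constrains the output
```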
Staging environments that 'look like production' mislead more than they inform. Here's how to build simulation environments where agents can take real actions against fake infrastructure — and why the highest-ROI approach is simulating only the tools that can't be undone.
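A minimal sketch of that highest-ROI pattern: a router that lets reversible tools hit real (staging) backends while intercepting and recording the irreversible ones. The tool names and registry shape are invented for illustration.

```python
# Simulate only what can't be undone; everything else takes real actions.
IRREVERSIBLE = {"send_email", "delete_record", "charge_card"}

class SimulatedToolRouter:
    def __init__(self, real_tools: dict):
        self.real_tools = real_tools
        self.side_effect_log = []   # assert on this in your agent tests

    def call(self, name: str, **kwargs):
        if name in IRREVERSIBLE:
            self.side_effect_log.append((name, kwargs))
            return {"status": "ok", "simulated": True}  # plausible fake reply
        return self.real_tools[name](**kwargs)          # real action
```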
Traditional SLIs like latency and error rate miss the dominant failure mode of AI systems — correct execution, wrong answer. A practical framework for semantic SLOs, error budgets at an 85% baseline, and alerting architectures that distinguish real degradation from normal variance.
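A worked sketch of the budget math at the 85% target above; the window sizes and threshold are assumptions. With a 15% budget, a multi-window burn-rate check separates real degradation from run-to-run variance far better than absolute error counts.

```python
# Error-budget math for a semantic SLO with an 85% correctness target.
SLO_TARGET = 0.85                # fraction of responses judged correct
ERROR_BUDGET = 1 - SLO_TARGET    # 15% of responses may be wrong per window

def burn_rate(observed_error_rate: float) -> float:
    """1.0 means burning exactly the budget; >1 exhausts it before the window ends."""
    return observed_error_rate / ERROR_BUDGET

def should_page(err_5m: float, err_1h: float, threshold: float = 2.0) -> bool:
    # Page only when a short AND a long window both burn fast, which
    # filters the normal run-to-run variance of LLM outputs.
    return burn_rate(err_5m) >= threshold and burn_rate(err_1h) >= threshold
```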
How speculative decoding achieves 2–3x lower LLM inference latency by drafting tokens with a small model and verifying them in parallel, plus the draft-model selection math, batch-size tradeoffs, and production pitfalls that determine whether you get a speedup or a slowdown.
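A minimal sketch of a single speculative step using the standard rejection-sampling acceptance rule (Leviathan et al., 2023); the probability hooks are toy stand-ins for real draft and target models, and the bonus token sampled after a fully accepted draft is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(probs):
    return int(rng.choice(len(probs), p=probs))

def speculative_step(prefix, draft_probs, target_probs, k=4):
    # 1. Small draft model proposes k tokens autoregressively (cheap calls).
    drafted, ctx = [], list(prefix)
    for _ in range(k):
        q = draft_probs(ctx)
        tok = sample(q)
        drafted.append((tok, q))
        ctx.append(tok)
    # 2. Large target model scores the drafted positions. (In production
    #    this is ONE batched forward pass; per-position calls here are
    #    only for readability.)
    out = list(prefix)
    for tok, q in drafted:
        p = target_probs(out)
        if rng.random() < min(1.0, p[tok] / q[tok]):   # accept draft token
            out.append(tok)
        else:                                          # reject: resample
            residual = np.maximum(p - q, 0)
            out.append(sample(residual / residual.sum()))
            break                                      # later drafts are invalid
    return out

# Toy usage: both "models" are uniform over a 10-token vocabulary, so every
# draft is accepted and one step emits k tokens for one target pass.
uniform = lambda ctx: np.full(10, 0.1)
print(speculative_step([1, 2], uniform, uniform, k=4))
```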
The choice between stateful and stateless AI features is made early and felt everywhere — in your storage layer, your debugging toolchain, your security posture, and your costs. Here's how to make it deliberately.
Constrained decoding guarantees schema-valid LLM output at the token level, removing retry logic and parsing heuristics from production pipelines — but research shows a 17% creativity cost that demands a clear decision framework.
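A minimal from-scratch sketch of the masking principle: at each step, only tokens that keep the output a valid prefix of the grammar survive. The toy grammar is a JSON boolean over a character vocabulary, and `model_scores` is a hypothetical model hook; production systems (e.g., Outlines-style libraries) compile a full JSON Schema into a token-level automaton, but the mechanism is the same.

```python
import re

VOCAB = list("truefals")            # toy character-level vocabulary
PATTERN = re.compile(r"true|false")

def allowed(prefix: str) -> set[str]:
    # A token survives iff prefix+token is still a prefix of some valid output.
    return {t for t in VOCAB
            if any(full.startswith(prefix + t) for full in ("true", "false"))}

def constrained_decode(model_scores) -> str:
    out = ""
    while not PATTERN.fullmatch(out):
        mask = allowed(out)
        # Pick the model's highest-scoring token among the ALLOWED set only;
        # schema-invalid continuations never get sampled, so no retries.
        best = max(mask, key=lambda t: model_scores(out).get(t, float("-inf")))
        out += best
    return out

# Toy usage: a "model" that slightly prefers 'f' is forced into "false".
print(constrained_decode(lambda prefix: {"f": 1.0, "t": 0.9}))
```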