In-memory conversation history works fine in demos but fails at scale. A breakdown of the tiered storage patterns, compaction strategies, and data model decisions that keep chat sessions reliable in production.
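For illustration, a minimal sketch of the hot-tier side of that design, assuming a bounded in-memory window that spills older turns to any durable store exposing an `append` method (the interface here is hypothetical):

```python
import json
import time


class TieredHistory:
    """Illustrative two-tier store: a bounded in-memory hot tier for the
    active window, spilling older turns to a durable cold store."""

    def __init__(self, cold_store, hot_limit=50):
        self.cold_store = cold_store   # any object with append(session_id, turn)
        self.hot = {}                  # session_id -> list of recent turns
        self.hot_limit = hot_limit

    def append(self, session_id, role, content):
        turn = {"role": role, "content": content, "ts": time.time()}
        turns = self.hot.setdefault(session_id, [])
        turns.append(turn)
        # Spill the oldest turns once the hot tier exceeds its budget.
        while len(turns) > self.hot_limit:
            self.cold_store.append(session_id, json.dumps(turns.pop(0)))
```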
Your infrastructure team optimizes end-to-end generation time. Your users judge responsiveness by when the first token appears. A guide to TTFT — what drives it, how to measure it, and how to design around it.
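A minimal sketch of the measurement itself, assuming any streaming client that yields chunks incrementally (`stream_completion` is a stand-in, not a real library call):

```python
import time


def measure_ttft(stream_completion, prompt):
    """Wall-clock time from request start to the first streamed token.
    `stream_completion` is a stand-in for any client that yields chunks."""
    start = time.monotonic()
    for _chunk in stream_completion(prompt):
        return time.monotonic() - start   # returns on the first chunk
    raise RuntimeError("stream produced no tokens")
```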
RLHF-trained models systematically reverse correct answers when users push back — not because they're confused, but because agreement was rewarded. Here's what that means for production systems and how to defend against it.
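One possible defense, sketched under the assumption that you can re-ask the model in a fresh context: probe whether a post-pushback reversal survives when the pushback isn't present (`ask_model` is a placeholder for your inference call):

```python
def flags_sycophantic_flip(ask_model, question, answer_after_pushback):
    """Illustrative consistency probe: re-ask the original question in a
    fresh context (no pushback in sight) and flag the turn if the model's
    'corrected' answer disagrees with its independent answer."""
    independent = ask_model(question)
    return independent.strip() != answer_after_pushback.strip()
```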
AI agents look impressive in demos but fail at alarming rates in production. Here's the math behind why reliability collapses as task length grows — and what you can actually do about it.
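The usual back-of-envelope model: if each step succeeds independently with probability p, an n-step task succeeds with probability p**n, so per-step reliability compounds against you:

```python
# If each step succeeds independently with probability p, an n-step task
# succeeds with probability p**n. Even high per-step reliability decays fast.
for p in (0.99, 0.95):
    for n in (10, 50, 100):
        print(f"p={p}, steps={n}: task success = {p**n:.1%}")
# p=0.99, steps=50 -> ~60.5%
# p=0.95, steps=50 -> ~7.7%
```

At 99% per-step reliability, a 50-step task already fails roughly four times in ten.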
Most AI products handle context limits with a hard crash. Here's how to design around them — progressive truncation, graceful degradation, and surfacing context pressure as a first-class UI signal.
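A minimal sketch of progressive truncation, assuming a token budget and a tokenizer-backed `count_tokens` callable (both placeholders); the system prompt is pinned and the oldest turns go first:

```python
def truncate_progressively(messages, budget, count_tokens):
    """Drop the oldest non-system messages until the conversation fits.
    `count_tokens` is a placeholder for your tokenizer's counting call."""
    kept = list(messages)
    while sum(count_tokens(m["content"]) for m in kept) > budget:
        # Find the oldest message that isn't the system prompt.
        idx = next((i for i, m in enumerate(kept) if m["role"] != "system"), None)
        if idx is None:
            break   # nothing left to drop; caller must degrade further
        kept.pop(idx)
    return kept
```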
Tool definitions look like API documentation but function as natural-language prompts. Treat the description field as a production prompt asset — and add the lint rules that catch silent regressions.
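A sketch of what such lint rules might look like, assuming tool definitions are dicts with `description` and `parameters` fields; the specific rules and thresholds are illustrative, not a standard:

```python
def lint_tool_definition(tool):
    """Illustrative lint pass over a tool definition dict. The rule set
    and thresholds here are examples, not an established spec."""
    problems = []
    desc = tool.get("description", "")
    if not desc:
        problems.append("missing description")
    elif len(desc) < 30:
        problems.append("description too short to guide the model")
    for name, param in tool.get("parameters", {}).items():
        if not param.get("description"):
            problems.append(f"parameter '{name}' has no description")
    return problems
```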
Most agent escalation flows are cold transfers that abandon all prior context at the boundary. The warm handoff pattern treats agent-human control transfer as a state-packaging problem — structured payloads, mixed-initiative control allocation, and resumption protocols that actually work.
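One way the state package might look, sketched as a dataclass; the field names are assumptions rather than an established schema:

```python
from dataclasses import dataclass, field


@dataclass
class HandoffPacket:
    """Illustrative state package for an agent-to-human transfer.
    Field names are assumptions, not a standard schema."""
    session_id: str
    summary: str                      # agent-written recap of the task so far
    blocking_question: str            # why the agent is escalating
    attempted_actions: list = field(default_factory=list)
    resume_token: str = ""            # lets the agent pick the task back up
```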
Data network effects are harder to compound in LLM products than in traditional ML. Four signals distinguish building a genuine moat from renting capability from Anthropic and adding UI.
A single agent decision to remember something triggers writes to six storage systems simultaneously. Here's what happens when the fifth write fails — and the patterns from database internals that prevent it.
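One applicable pattern is the outbox idea: record intent durably once, fan out idempotent per-store writes, and let a background sweeper retry whatever failed. A minimal sketch, with `outbox` and `stores` as hypothetical interfaces:

```python
import json
import uuid


def remember(fact, outbox, stores):
    """Outbox-style sketch: record intent durably first, then fan out.
    Each store's write must be idempotent (keyed by entry id) so a
    background sweeper can safely retry whatever is still pending."""
    entry_id = str(uuid.uuid4())
    outbox.append(json.dumps({"id": entry_id, "fact": fact}))  # intent first
    failed = []
    for name, store in stores.items():
        try:
            store.write(entry_id, fact)       # idempotent per-store apply
        except Exception:
            failed.append(name)               # retried later, never dropped
    return failed
```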
The classical unit/integration/e2e pyramid assumes cheap, fast, deterministic units. LLM agents break every one of those assumptions. Here's what a testing strategy actually looks like.
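One concrete consequence: exact-match assertions give way to statistical thresholds. A pytest-style sketch, where `run_agent` and its `succeeded` flag are assumed fixtures, and the trial count and threshold are illustrative knobs:

```python
def test_agent_books_meeting(run_agent):
    """Statistical assertion instead of exact-match: run the task several
    times and require a success rate, since agent output is stochastic."""
    trials = 10
    successes = sum(
        run_agent("book a 30-min meeting").succeeded for _ in range(trials)
    )
    assert successes / trials >= 0.8
```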
Human decisions create natural accountability records. Agent decisions don't. Here's what decision attribution architecture actually needs to look like for HIPAA, SOX, and SEC Rule 17a-4.
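A sketch of the minimum such a record might carry, with field names as examples of what an auditor needs to reconstruct a decision rather than a compliance checklist:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """Illustrative append-only attribution record. Fields are examples,
    not legal guidance for any specific regulation."""
    decision_id: str
    timestamp: str            # when the decision was made (ISO 8601)
    model_version: str        # exact model + prompt version in force
    inputs_digest: str        # hash of the context the model saw
    output: str               # what the agent decided
    approver: str | None      # human in the loop, if any
```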
AI agents accumulate excessive permissions silently — each new integration adds 'just one scope' until your agent has write access to production databases it hasn't touched since the pilot. Here's the audit methodology and JIT provisioning pattern to stop it.
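A minimal sketch of the JIT grant pattern, assuming per-task scopes with a TTL instead of standing permissions (class and method names are hypothetical):

```python
import time


class JITGrants:
    """Illustrative just-in-time grants: scopes are issued per task with a
    TTL instead of accumulating as standing permissions."""

    def __init__(self, ttl_seconds=900):
        self.ttl = ttl_seconds
        self.grants = {}   # (agent_id, scope) -> expiry timestamp

    def grant(self, agent_id, scope):
        self.grants[(agent_id, scope)] = time.time() + self.ttl

    def allowed(self, agent_id, scope):
        expiry = self.grants.get((agent_id, scope), 0)
        return time.time() < expiry   # expired grants simply stop working
```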