Blog

Page 100

12 articles

Temperature Governance in Multi-Agent Systems: Why Variance Is a First-Class Budget
Running all your agent components at the same temperature is as wrong as giving them all the same timeout. A guide to per-role sampling policy design that matches output variance to what each pipeline stage actually needs.
insider, ai-engineering
Apr 19 · 11 min
Temporal Context Injection: Making LLMs Actually Know What Day It Is
LLMs have no clock. Every date-sensitive feature you ship is broken by default — unless you engineer temporal context in explicitly. Here's how to do it without destroying your prompt cache.
llm, production-ai
Apr 19 · 11 min
Text-to-SQL in Production: Why Natural Language Queries Fail at the Schema Boundary
Why vendor demos of text-to-SQL work perfectly and production deployments fall apart — and the engineering techniques that actually close the gap.
ai-engineering, llm
Apr 19 · 9 min
The Token Economy of Multi-Turn Tool Use: Why Your Agent Costs 5x More Than You Think
Agent cost estimates built on single-call math are wrong by design. Here's how multi-turn tool use compounds token costs non-linearly — and the specific design levers that keep long-horizon agents economically viable.
insider, ai-engineering
Apr 19 · 10 min
Tokenizer Blindspots That Break Production LLM Systems
Why the '1000 tokens ≈ 750 words' assumption breaks in the cases that matter most: multilingual text, structured outputs, and code-heavy workloads — and the production bugs that follow.
insider, llm
Apr 19 · 10 min
Tool Output Compression: The Injection Decision That Shapes Context Quality
Tool results in AI agent pipelines vary 100× in token density. The strategy you choose for injecting them into context — raw, compressed, or extracted — sets a hard ceiling on your agent's accuracy, cost, and latency at scale.
insider, llm-agents
Apr 19 · 10 min
Upstream Data Quality Is Your AI Agent's Real Bottleneck
Most AI agent failures in production aren't model problems — they're data problems. Here's how to diagnose and fix the upstream data quality issues that no amount of prompt engineering can solve.
ai-agents, data-quality
Apr 19 · 9 min
What Your Vendor's Model Card Doesn't Tell You
Model cards report average benchmark scores. They omit tail behavior, system-prompt interaction effects, cultural blind spots, and the silent regressions that break production systems. Here's what teams are building instead.
insider, llm
Apr 19 · 10 min
Vibe Code at Scale: Managing Technical Debt When AI Writes Most of Your Codebase
AI-generated code looks plausible but harbors systematic defects that compound into crisis-level technical debt by month 12-18. Here are the engineering practices that actually prevent it.
ai-engineering, technical-debt
Apr 19 · 9 min
The Vibe Coding Productivity Plateau: Why AI Speed Gains Reverse After Month Three
93% of developers use AI coding assistants, but productivity gains have stalled at 10%. Here's the compounding failure mode that turns early velocity wins into long-term drag — and the practices that prevent it.
ai-engineering, productivity
Apr 19 · 8 min
When Workflow Engines Beat LLM Agents: A Decision Framework for Deterministic Orchestration
Gartner predicts 40% of agentic AI projects will be canceled by 2027. Before defaulting to an autonomous LLM agent, here is the framework for choosing deterministic orchestrators instead.
agent-architecture, workflow-orchestration
Apr 19 · 9 min
A/B Testing AI Features When the Treatment Is Non-Deterministic
Standard A/B testing breaks down when your treatment is an LLM — outputs vary per call, model updates ship mid-experiment, and 'success' resists clean operationalization. Here are the statistical adjustments and experiment patterns that produce trustworthy results anyway.
experimentation, llm
Apr 18 · 10 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
