Blog

Page 69

12 articles

System Prompt Sprawl: When Your AI Instructions Become a Source of Bugs
As system prompts grow from hundreds to thousands of tokens, internal contradictions accumulate and model behavior becomes unpredictable. Here's how to detect, contain, and restructure before it costs you.
insider, prompt-engineering
Apr 19 · 9 min
Temperature Governance in Multi-Agent Systems: Why Variance Is a First-Class Budget
Running all your agent components at the same temperature is as wrong as giving them all the same timeout. A guide to per-role sampling policy design that matches output variance to what each pipeline stage actually needs.
insider, ai-engineering
Apr 19 · 11 min
Temporal Context Injection: Making LLMs Actually Know What Day It Is
LLMs have no clock. Every date-sensitive feature you ship is broken by default — unless you engineer temporal context in explicitly. Here's how to do it without destroying your prompt cache.
llm, production-ai
Apr 19 · 11 min
Text-to-SQL in Production: Why Natural Language Queries Fail at the Schema Boundary
Why vendor demos of text-to-SQL work perfectly and production deployments fall apart — and the engineering techniques that actually close the gap.
ai-engineering, llm
Apr 19 · 9 min
The Token Economy of Multi-Turn Tool Use: Why Your Agent Costs 5x More Than You Think
Agent cost estimates built on single-call math are wrong by design. Here's how multi-turn tool use compounds token costs non-linearly — and the specific design levers that keep long-horizon agents economically viable.
insider, ai-engineering
Apr 19 · 10 min
Tokenizer Blindspots That Break Production LLM Systems
Why the '1000 tokens ≈ 750 words' assumption breaks in the cases that matter most: multilingual text, structured outputs, and code-heavy workloads — and the production bugs that follow.
insider, llm
Apr 19 · 10 min
Tool Output Compression: The Injection Decision That Shapes Context Quality
Tool results in AI agent pipelines vary 100× in token density. The strategy you choose for injecting them into context — raw, compressed, or extracted — sets a hard ceiling on your agent's accuracy, cost, and latency at scale.
insider, llm-agents
Apr 19 · 10 min
Upstream Data Quality Is Your AI Agent's Real Bottleneck
Most AI agent failures in production aren't model problems — they're data problems. Here's how to diagnose and fix the upstream data quality issues that no amount of prompt engineering can solve.
ai-agents, data-quality
Apr 19 · 9 min
What Your Vendor's Model Card Doesn't Tell You
Model cards report average benchmark scores. They omit tail behavior, system-prompt interaction effects, cultural blind spots, and the silent regressions that break production systems. Here's what teams are building instead.
insider, llm
Apr 19 · 10 min
Vibe Code at Scale: Managing Technical Debt When AI Writes Most of Your Codebase
AI-generated code looks plausible but harbors systematic defects that compound into crisis-level technical debt by months 12 to 18. Here are the engineering practices that actually prevent it.
ai-engineering, technical-debt
Apr 19 · 9 min
The Vibe Coding Productivity Plateau: Why AI Speed Gains Reverse After Month Three
93% of developers use AI coding assistants, but productivity gains have stalled at 10%. Here's the compounding failure mode that turns early velocity wins into long-term drag — and the practices that prevent it.
ai-engineering, productivity
Apr 19 · 8 min
When Workflow Engines Beat LLM Agents: A Decision Framework for Deterministic Orchestration
Gartner predicts 40% of agentic AI projects will be canceled by 2027. Before defaulting to an autonomous LLM agent, here is the framework for choosing deterministic orchestrators instead.
agent-architecture, workflow-orchestration
Apr 19 · 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
