Blog

Page 80

12 articles

Prompt Versioning Done Right: Treating LLM Instructions as Production Software
Most teams treat prompts like config files — until a three-word edit tanks a revenue-generating workflow. Here's the engineering discipline that prevents it.
llm · prompt-engineering
Apr 19 · 8 min
Zero-Shot, Few-Shot, or Chain-of-Thought: A Production Decision Framework
Most teams pick prompting strategies by convention. Here are the evidence-based criteria—task complexity, model scale, token budget, output structure—that predict which approach wins on your specific task.
llm · prompting
Apr 19 · 10 min
RAG Knowledge Base Freshness: The Staleness Problem Teams Solve Last
Chunking and embedding quality dominate RAG architecture discussions, but index freshness silently determines your system's reliability over time. Here's how to detect, measure, and fix it.
insider · rag
Apr 19 · 11 min
RAG Position Bias: Why Chunk Order Changes Your Answers
Retrieval correctness isn't enough — where your chunks appear in the prompt determines which ones the model actually uses. How position bias works in production RAG systems and what to do about it.
insider · rag
Apr 19 · 8 min
Testing the Retrieval-Generation Seam: The Integration Test Gap in RAG Systems
Unit tests for your retriever and generator can both pass while your RAG system silently fails. Here's how to test the seam between them and localize blame when it breaks.
insider · rag
Apr 19 · 11 min
RBAC Is Not Enough for AI Agents: A Practical Authorization Model
Static role-based access control breaks when agents shift permissions mid-task. Here is how to build an authorization model that actually holds: narrow tool scopes, short-lived credentials, ABAC runtime policies, and audit trails anchored to agent identity.
insider · ai-agents
Apr 19 · 11 min
Reasoning Model Economics: When Chain-of-Thought Earns Its Cost
Extended thinking models cost 10–50x more per query. Here's the task taxonomy that tells you when that premium pays off — and the routing architecture that applies it automatically.
insider · llm
Apr 19 · 9 min
The Reranker Gap: Why Most RAG Pipelines Skip the Most Important Layer
Most RAG pipelines stop at vector similarity search and wonder why accuracy plateaus. The reranker is the missing layer — here's what it costs to skip it and how to decide when the tradeoff is worth it.
rag · retrieval
Apr 19 · 8 min
Sequential Tool Call Waterfalls: The Hidden Latency Tax in Agent Loops
Agent frameworks default to sequential tool execution even when calls are logically independent, creating latency cascades identical to the N+1 query problem. Here's how to identify and fix them.
insider · ai-agents
Apr 19 · 10 min
Shadow to Autopilot: A Readiness Framework for AI Feature Autonomy
Moving AI from shadow mode through advisory, co-pilot, and autopilot stages requires explicit quality gates and monitoring, not just organizational courage. Here's the engineering framework.
insider · ai-engineering
Apr 19 · 11 min
The Share-Nothing Agent: Designing AI Agents for Horizontal Scalability
Most AI agents can't scale horizontally because they accumulate implicit state that ties them to a single machine. Here's the architectural discipline that fixes it.
insider · agent-architecture
Apr 19 · 12 min
The Six-Month Cliff: Why Production AI Systems Degrade Without a Single Code Change
Your AI feature shipped green and performed well at launch. Six months later it's quietly 20–40% worse — and your dashboards never flagged it. Here's why this happens and how to stop it.
llm · production
Apr 19 · 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
