Blog

Page 60

12 articles

The Prompt Ownership Problem: When Conway's Law Comes for Your Prompts
Prompts live in four teams at once — authors, evaluators, deployers, and support. When no single role owns the whole loop, Conway's law guarantees silent quality leaks. The RACI gaps, shared-library traps, and steward role that actually keep behavior coherent.
prompt-engineering, ai-governance
Apr 22, 11 min
Your Prompt Is Competing With What the Model Already Knows
Foundation models arrive pre-loaded with strong opinions about your domain. Probe the prior, refute the default, and stop shipping prompts that compete with what the model already believes.
insider, ai-engineering
Apr 22, 11 min
Your RAG Chunker Is a Database Schema Nobody Code-Reviewed
Treat your RAG chunker like preprocessing and every boundary tweak becomes a silent schema migration. Version it, stage it, and own the retrieval eval alongside it.
insider, rag
Apr 22, 11 min
Why Your RAG Citations Are Lying: Post-Hoc Rationalization in Source Attribution
Between 50 and 90 percent of LLM citations do not fully support the claims they are attached to. Here is why post-hoc attribution makes RAG systems quietly untrustworthy, how to measure citation faithfulness with NLI, and the architectural fixes that actually help.
insider, rag
Apr 22, 10 min
Rate Limit Hierarchy Collapse: When Your Agent Loop DoSes Itself
One user's agent fan-out can starve every other user of the same quota. Why flat token buckets collapse under agent workloads, and the four-layer hierarchy that keeps the platform honest.
ai-agents, rate-limiting
Apr 22, 12 min
The Reasoning-Model Tax at Tool Boundaries
Reasoning models win benchmarks but bleed latency and quality at tool-choice steps. A per-step hybrid routing pattern, attribution, and anti-patterns.
insider, ai-agents
Apr 22, 10 min
The Reflection Placebo: Why Plan-Reflect-Replan Loops Return Version One
Single-model reflection loops mostly return the first plan with cosmetic edits while compounding the token bill. Here is how to measure the placebo and what actually produces divergent plans.
ai-agents, llm
Apr 22, 9 min
The Refusal Training Gap: Why Your Model Says No to the Wrong Questions
Refusal in language models is two distinct capabilities that training pipelines conflate, leaving models that block benign requests while confidently fabricating answers to questions they cannot reliably answer.
llm, ai-safety
Apr 22, 10 min
Retry Amplification: How a 2% Tool Error Rate Becomes a 20% Agent Failure
Agent loops turn a 2% tool error rate into a 20% user-visible failure by multiplying retries across steps and SDK layers. Here is the math, the self-DoS pattern, and the retry budget discipline that stops it.
insider, agents
Apr 22, 13 min
The Right-Edge Accuracy Drop: Why the Last 20% of Your Context Window Is a Trap
Filling an LLM's advertised context window wrecks accuracy at the right edge — the failure mode past 'lost in the middle,' with benchmarks, safety margins by task, and prompt fixes.
insider, llm
Apr 22, 11 min
The Rubber-Stamp Collapse: Why AI-Authored PRs Are Hollowing Out Code Review
When most diffs in a repo start life as model output, reviewers anchor on 'looks plausible' and miss the semantic bugs that don't render as syntactic smell. The countermeasures, the disclosure question leadership has to answer, and the incident curve that catches up six months later.
insider, ai-engineering
Apr 22, 10 min
Sampling Bias in Agent Traces: Why Your Debug Dataset Silently Excludes the Failures You Care About
Head-based and uniform-random sampling silently excise the rare catastrophic agent trajectories from your debug corpus. Tail sampling, anomaly-keyed retention, and per-failure-mode reservoirs build a debug dataset that actually contains the failures you need.
observability, agents
Apr 22, 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
