Streaming token-by-token output breaks screen readers in ways most teams never test. Here's why WCAG has no answer for it, and the design patterns that actually work.
Traditional CI/CD infrastructure wasn't designed for non-deterministic software. Here's how to add meaningful deployment gates for LLM-powered features without turning your pipeline into a money-burning eval farm.
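The gate idea can be sketched in a few lines. Everything here is illustrative, not a real framework: `EvalCase`, `gate`, and the threshold are hypothetical names, and the model call is a plain function stub so the gate stays cheap and deterministic in CI.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # cheap, deterministic assertion on the output

def gate(model: Callable[[str], str],
         cases: list[EvalCase],
         min_pass_rate: float) -> bool:
    """Run a small pinned eval set against the candidate model/prompt;
    block the deploy if the pass rate falls below the threshold."""
    passed = sum(1 for c in cases if c.check(model(c.prompt)))
    return passed / len(cases) >= min_pass_rate
```

The cost control is in the shape of the gate: a small, pinned case set with deterministic checks, not an open-ended eval farm, so a prompt change that silently breaks known-good behavior blocks the pipeline without burning money on every commit.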
When you silently update a model or prompt, power users experience real regression even when aggregate metrics improve. Here's how to detect behavioral drift and communicate AI changes without destroying user trust.
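The detection half of that problem can be sketched simply. This is a minimal illustration, assuming you keep a fixed probe set of prompts and stored outputs from the old model; `drift_report` is a hypothetical helper, and the point is that it surfaces *which* probes changed rather than only an aggregate score.

```python
def drift_report(old: dict[str, str], new: dict[str, str]) -> dict:
    """Compare old vs. new model outputs on a fixed probe set.
    Aggregate metrics can improve while individual behaviors regress;
    listing the changed probes exposes what averages hide."""
    changed = [probe for probe in old if new.get(probe) != old[probe]]
    return {"changed": changed, "drift_rate": len(changed) / len(old)}
```

In practice the probes worth pinning are the ones power users depend on, since those are exactly the behaviors an aggregate-metric improvement is allowed to quietly break.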
AI code generation delivers real upfront velocity, but the cost appears downstream, at 3am, when the on-call engineer lacks the mental model to debug code they didn't write and barely reviewed.
The false-positive math that determines whether an AI PR reviewer accelerates or exhausts your team, what issue categories AI reviewers catch reliably vs. miss, and how to measure whether your code review agent is net positive.
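The core of that math fits in one function. The numbers below are illustrative, and `review_burden` is a hypothetical helper; the shape of the calculation is the point: every comment costs triage time, and only the `precision` fraction of them are real issues.

```python
def review_burden(prs_per_week: int,
                  comments_per_pr: float,
                  precision: float,
                  minutes_per_comment: float = 2.0) -> dict:
    """Estimate whether an AI PR reviewer is net positive for a team.
    Noise comments still consume engineer attention at triage time."""
    total = prs_per_week * comments_per_pr
    noise = total * (1 - precision)
    return {
        "real_issues_surfaced": total * precision,
        "noise_comments": noise,
        "weekly_minutes_on_noise": noise * minutes_per_comment,
    }
```

At 50 PRs a week and 10 comments per PR, 60% precision means roughly 200 noise comments, several engineer-hours per week spent dismissing them, which is the hidden denominator in any "issues caught" success metric.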
How AI agents handle bulk code migrations—deprecated APIs, framework upgrades, language version evolution—where the wins are massive, where they create more work than they save, and the verification strategy that keeps the migration safe either way.
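One verification strategy can be sketched as batched apply-and-check. This is a hypothetical skeleton, not a real tool: `apply`, `verify`, and `rollback` are callbacks you would wire to your codemod and test suite, and the invariant is that one bad transform can never poison the whole changeset.

```python
from typing import Callable

def migrate_in_batches(files: list[str],
                       apply: Callable[[list[str]], None],
                       verify: Callable[[], bool],
                       rollback: Callable[[list[str]], None],
                       batch_size: int = 10) -> tuple[list[str], list[str]]:
    """Apply a bulk migration in small batches, verifying after each one.
    A failing batch is rolled back and quarantined for manual review."""
    migrated: list[str] = []
    quarantined: list[str] = []
    for i in range(0, len(files), batch_size):
        batch = files[i:i + batch_size]
        apply(batch)
        if verify():
            migrated += batch
        else:
            rollback(batch)
            quarantined += batch
    return migrated, quarantined
```

Smaller batches localize blame when verification fails; the trade-off is more test-suite runs, which is usually the right price for a migration touching thousands of files.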
Standard SWE leveling frameworks systematically misread AI engineer performance. Here's what actually distinguishes junior from senior when models do most of the coding.
Adding an LLM to every step of your pipeline is the fastest way to make it slower, more expensive, and harder to debug. Here's the decision framework for knowing when AI genuinely helps versus when a lookup table is the right answer.
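The "lookup table first" half of that framework is almost trivially short, which is the argument. A minimal sketch, with hypothetical names: deterministic answers come from the table, and the model only ever sees inputs the table cannot handle.

```python
from typing import Callable

def classify(text: str,
             table: dict[str, str],
             llm: Callable[[str], str]) -> str:
    """Route known inputs through a deterministic lookup;
    fall back to the model only for the long tail."""
    key = text.strip().lower()
    if key in table:   # free, instant, and trivially debuggable
        return table[key]
    return llm(text)   # expensive and non-deterministic: use sparingly
```

Every hit on the table is a request that is faster, cheaper, and reproducible in a debugger; the fraction of traffic falling through to the model is a number worth tracking before deciding the LLM step earns its keep.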
Why accuracy metrics that look fine in offline evals become catastrophic at production volume, how to set SLOs for AI features that account for tail behavior, and the product decision you face when a model is good enough on average yet still wrong millions of times per month.
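The volume argument is one line of arithmetic, with illustrative numbers: an error rate that rounds to "fine" in an offline eval becomes an absolute count of wrong answers at production scale.

```python
def monthly_errors(requests_per_month: int, accuracy: float) -> int:
    """Convert an accuracy rate into an absolute monthly error count.
    99% accurate sounds fine until you multiply by traffic."""
    return round(requests_per_month * (1 - accuracy))

monthly_errors(500_000_000, 0.99)  # 99% accuracy at 500M req/mo -> 5,000,000 errors
```

This is why SLOs for AI features are better framed in absolute error budgets than in percentages: the percentage stays flat while the harm scales linearly with adoption.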
A practical guide for engineers and PMs on how to deprecate LLM-powered features cleanly — covering data lifecycle teardown, behavioral migration testing, user trust dynamics, and communication strategy.
AI-powered features never reach a stable 'done' state — model drift, world drift, and expectation drift create continuous iteration pressure. Here's the engineering and governance infrastructure that makes 'stable but evolving' feel like quality rather than incompleteness.
Teams adopting coding agents see dramatic velocity gains in months one through three. By month twelve, many find that shipping without understanding their own systems no longer works. Here's the failure pattern — and how to avoid it.