Blog

Page 107

12 articles

On-Call for Stochastic Systems: Why Your AI Runbook Needs a Rewrite
Traditional incident response assumes reproducible failures. LLM-powered systems don't. Here's how to rewrite your alerting schema, triage decision tree, and post-mortem template for non-deterministic AI.
insider · ai-engineering
Apr 18 · 10 min
The On-Device LLM Problem Nobody Talks About: Model Update Propagation
Shipping LLMs to edge devices creates a distributed system with no central rollback — version fragmentation, silent capability drift, and artifact ensemble mismatches that don't show up in benchmarks.
insider · edge-ai
Apr 18 · 12 min
On-Device LLM Inference in Production: When Edge Models Are Right and What They Actually Cost
The privacy, latency, and offline case for running LLM inference on iOS, Android, and browser—plus the quality-size tradeoffs, cost math, and the update problem that bites teams six months after ship.
performance · infrastructure
Apr 18 · 10 min
The Orchestration Framework Trap: When LangChain Makes You Slower to Ship
AI orchestration frameworks like LangChain accelerate prototyping but create debugging opacity, versioning brittleness, and leaky abstractions at scale. Here's the decision framework for knowing when to use them and when to drop down a layer.
ai-engineering · langchain
Apr 18 · 8 min
The Over-Tooled Agent Problem: Why More Tools Make Your LLM Dumber
Tool selection accuracy drops to 13% when LLMs face large tool sets. Here's why over-tooling breaks your agents and how to architect around it with routing layers, hierarchical toolsets, and lazy-loading registries.
ai-engineering · agents
Apr 18 · 9 min
The PII Leak in Your RAG Pipeline: Why Your Chatbot Knows Things It Shouldn't
Semantic similarity doesn't respect data-access boundaries. Here's how RAG pipelines expose sensitive records to unauthorized users—and the layered defenses that stop them.
rag · security
Apr 18 · 10 min
The Privacy Architecture of Embeddings: What Your Vector Store Knows About Your Users
Embedding a user's documents creates a novel privacy surface area that traditional databases don't have. Here's how re-identification risks work, where access control breaks down in RAG pipelines, and the architectural patterns that actually fix it.
insider · security
Apr 18 · 10 min
Prompt Archaeology: Recovering Intent from Legacy Prompts Nobody Documented
When you inherit a production prompt with no documentation, how do you figure out what it was supposed to do? A systematic methodology for recovering intent from undocumented prompts — and the documentation format that prevents the next engineer from facing the same problem.
llm · prompt-engineering
Apr 18 · 10 min
The Prompt Debt Spiral: How One-Line Patches Kill Production Prompts
Production prompts accumulate technical debt through incremental patches that compound into contradictory, bloated instructions. Here's how to recognize the spiral and break it before a prompt becomes unmaintainable.
insider · prompt-engineering
Apr 18 · 9 min
The Prompt Governance Problem: Managing Business Logic That Lives Outside Your Codebase
When you have 50+ active prompts across product, ML, and infra teams, you have a distributed systems problem — not a writing problem. Here's the infrastructure that keeps it from becoming a liability.
ai-engineering · llm
Apr 18 · 9 min
Prompt Injection Is a Supply Chain Problem, Not an Input Validation Problem
Per-request sanitization gives teams a false sense of security. As RAG systems index millions of documents and agents consume third-party tool outputs, the real defense requires architecture-level controls: content provenance, trust-tier enforcement, and sandboxed execution.
security · ai-engineering
Apr 18 · 9 min
Prompt Localization Debt: The Silent Quality Tiers Hiding in Your Multilingual AI Product
Why prompts that perform at 91% in English quietly degrade to 72% in Japanese or Arabic — and how to build the evaluation infrastructure that catches these regressions before they reach non-English users.
insider · ai-engineering
Apr 18 · 9 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
