Blog

Page 85

12 articles

The Context Window Cliff: Application-Level Strategies for Long Conversations
What actually happens when your LLM context fills up mid-session, why most frameworks handle it badly, and the summarization, selective retention, and externalization patterns that keep long-lived conversations coherent.
llm, context-management
Apr 18 · 10 min
Continuous Deployment for AI Models: Your Rollback Signal Is Wrong
HTTP error rates can't detect behavioral regression in LLM upgrades. Here's how to run blue/green and canary deployments with behavioral divergence as the real rollback signal.
ai-engineering, mlops
Apr 18 · 10 min
The Conversation Designer's Hidden Role in AI Product Quality
UX writing in system prompts, error messages, and capability disclosures directly shapes model behavior and user trust — in ways most engineering teams never measure.
insider, ai-engineering
Apr 18 · 10 min
Corpus Architecture for RAG: The Indexing Decisions That Determine Quality Before Retrieval Starts
Most RAG failures are diagnosed at query time but caused at index time. A technical guide to the chunk size, overlap, hierarchy, and metadata decisions that silently determine retrieval quality.
insider, rag
Apr 18 · 12 min
Cross-Encoder Reranking in Practice: What Cosine Similarity Misses
Vector ANN search finds semantically adjacent chunks, not necessarily the most useful ones. Layer cross-encoder reranking, MMR, and BM25 hybrid scoring to close the retrieval quality gap—with latency math that tells you when it pays off.
rag, retrieval
Apr 18 · 10 min
The Data Quality Tax in LLM Systems: Why Bad Input Hits Differently
Traditional ML degrades gracefully on noisy data. LLMs hallucinate confidently, corrupt vector stores, and propagate errors downstream with apparent authority. Here's how to measure and mitigate the data quality tax.
insider, llm
Apr 18 · 9 min
Dead Reckoning for Long-Running Agents: Knowing Where Your Agent Is Without Stopping It
When an agent runs for hours, knowing where it is—and whether it's still on track—becomes a first-class engineering problem. These are the patterns that solve it.
insider, ai-agents
Apr 18 · 11 min
Decision Provenance in Agentic Systems: Audit Trails That Actually Work
When autonomous agents take consequential actions, having logs is not the same as having accountability. A practical guide to designing decision provenance for production agentic systems — event schemas, ownership handoffs, hallucination attribution, and the compliance requirements that make this non-optional.
insider, agentic-ai
Apr 18 · 13 min
The AI Feature Sunset Playbook: Decommissioning Agents Without Breaking Your Users
Shutting down an AI feature is fundamentally different from deprecating a deterministic API. Here's the engineering playbook for mapping behavioral dependencies, staging sunsets, and avoiding the support ticket avalanche.
insider, ai-engineering
Apr 18 · 10 min
Designing for Partial Completion: When Your Agent Gets 70% Done and Stops
Most agent failure designs assume clean abort or clean success. Real agents hit uncertainty, authorization limits, and resource constraints mid-task. Here's how to design for what actually happens.
insider, ai-engineering
Apr 18 · 10 min
Dev/Prod Parity for AI Apps: The Seven Ways Your Staging Environment Is Lying to You
Staging environments systematically misrepresent how LLM applications behave in production. Here are seven specific failure modes — from prompt cache warmth to silent traffic distribution drift — and the pre-prod checks that surface them.
llmops, production
Apr 18 · 11 min
Distributed Tracing Across Agent Service Boundaries: The Context Propagation Gap
When agents call agents across microservice boundaries, W3C TraceContext breaks down and your traces fragment into disconnected spans. Here's the technical shape of the failure and how to fix it.
insider, observability
Apr 18 · 11 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
