How SSE, WebSockets, and gRPC streaming fail differently under backpressure, how browser constraints and edge proxies break each of them in production, and the failure-mode profile that should drive your transport choice.
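A minimal sketch of the backpressure signal itself, in plain asyncio and assuming a per-connection bounded queue; `TokenStream`, `on_token`, and `send` are illustrative names, not any framework's API:

```python
import asyncio

class TokenStream:
    def __init__(self, max_buffered: int = 256):
        # Bounded queue: if the consumer (the socket write loop) falls
        # behind, put_nowait raises instead of buffering unboundedly.
        self.queue: asyncio.Queue[str] = asyncio.Queue(maxsize=max_buffered)
        self.dropped = 0

    def on_token(self, token: str) -> None:
        try:
            self.queue.put_nowait(token)
        except asyncio.QueueFull:
            # Backpressure: SSE gives you no application-level ack, so an
            # overflowing buffer is often the only sign the client stalled.
            self.dropped += 1

    async def writer(self, send) -> None:
        # Drain loop; `send` stands in for the transport write
        # (SSE response.write, WebSocket send, gRPC yield).
        while True:
            token = await self.queue.get()
            await send(f"data: {token}\n\n")
```

What you do on overflow (drop tokens, coalesce, or close the connection) is exactly the failure-mode profile that differs across the three transports.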
Why 'pass the full conversation history' fails at p99, and the session store designs, compression strategies, and operational patterns that actually hold up in production.
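A sketch of one such design under simple assumptions: recent turns stay verbatim, older turns fold into a running summary; `summarize` stands in for a cheap LLM call:

```python
from dataclasses import dataclass, field

def summarize(turns: list[str]) -> str:
    # Placeholder: a real system would call an inexpensive model here.
    return "Earlier: " + " | ".join(t[:40] for t in turns)

@dataclass
class SessionStore:
    keep_verbatim: int = 6          # recent turns sent as-is
    summary: str = ""               # compressed older history
    turns: list[str] = field(default_factory=list)

    def add_turn(self, turn: str) -> None:
        self.turns.append(turn)
        if len(self.turns) > self.keep_verbatim:
            # Fold the oldest turn into the running summary instead of
            # letting the prompt grow without bound.
            old = self.turns.pop(0)
            self.summary = summarize([self.summary, old] if self.summary else [old])

    def context(self) -> str:
        parts = ([self.summary] if self.summary else []) + self.turns
        return "\n".join(parts)
```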
JSON mode guarantees your LLM output matches a schema. It does not guarantee the output makes sense. The semantic validation layer catches contradictory fields, impossible date ranges, and domain constraint violations before they silently corrupt your data.
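One way to express that layer, sketched here with Pydantic v2 cross-field validation; the refund fields and constraints are illustrative, not from any particular domain model:

```python
from datetime import date
from pydantic import BaseModel, model_validator

class Refund(BaseModel):
    order_total: float
    refund_amount: float
    order_date: date
    refund_date: date

    @model_validator(mode="after")
    def check_semantics(self) -> "Refund":
        # Domain constraint: can't refund more than was paid.
        if self.refund_amount > self.order_total:
            raise ValueError("refund_amount exceeds order_total")
        # Impossible date range: the refund precedes the order itself.
        if self.refund_date < self.order_date:
            raise ValueError("refund_date before order_date")
        return self
```

Schema validation would accept both of those payloads; the validator is what turns "valid JSON" into "plausible data".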
Constrained decoding guarantees valid JSON but exacts a hidden quality cost. Here's how to measure the tax on your workload and decide when it's worth paying.
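The measurement harness can be this small; the sketch below assumes you supply `generate_free`, `generate_constrained`, and a task-specific `score` from your own pipeline:

```python
def constraint_tax(eval_set, generate_free, generate_constrained, score):
    # Run the same eval set through both decoding modes and compare
    # the task metric; `eval_set` is a list of dicts with a "prompt" key.
    free = [score(ex, generate_free(ex["prompt"])) for ex in eval_set]
    constrained = [score(ex, generate_constrained(ex["prompt"])) for ex in eval_set]
    mean = lambda xs: sum(xs) / len(xs)
    # Positive tax = quality lost to enforcing the schema.
    return mean(free) - mean(constrained)
```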
AI personalization and task-specific fine-tuning hit a cold-start wall when there's no behavioral data. Learn how to generate 500–1,000 high-quality synthetic examples and the failure modes that can silently poison your model.
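A sketch of the seeded-generation loop with a crude duplicate filter (repeats are an early sign of mode collapse poisoning the set); `llm` and the persona and task lists are stand-ins for your own client and domain:

```python
import random

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your completion client")

PERSONAS = ["new user", "power user", "frustrated customer"]
TASKS = ["cancel subscription", "change billing date", "export data"]

def generate(n: int) -> list[str]:
    examples, seen = [], set()
    while len(examples) < n:
        prompt = (
            f"Write one realistic support message from a {random.choice(PERSONAS)} "
            f"who wants to {random.choice(TASKS)}. One message only."
        )
        ex = llm(prompt).strip()
        key = " ".join(ex.lower().split())[:80]  # crude near-duplicate key
        if key in seen:
            continue  # near-duplicates silently narrow the distribution; drop them
        seen.add(key)
        examples.append(ex)
    return examples
```

Seeding across personas and tasks is what pushes the set toward coverage; the filter is the first of several checks you'd want before any of it reaches training.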
Bloated system prompts don't just cost more — they make your model dumber. Here's how to measure prompt obesity and trim without regression.
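One way to measure it is a per-section ablation, sketched below; the section contents and the `run_eval` harness (which returns a quality score for a given system prompt) are placeholders you would swap for your own:

```python
SECTIONS = {
    "role": "You are a helpful assistant for ACME...",
    "style_rules": "Always answer in formal English...",
    "edge_cases": "If the user asks about refunds...",
    "examples": "Example 1: ...\nExample 2: ...",
}

def ablate(run_eval) -> None:
    full_prompt = "\n\n".join(SECTIONS.values())
    baseline = run_eval(full_prompt)
    for name in SECTIONS:
        trimmed = "\n\n".join(v for k, v in SECTIONS.items() if k != name)
        delta = run_eval(trimmed) - baseline
        tokens_saved = len(SECTIONS[name].split())  # rough token proxy
        # A section whose removal costs ~0 quality but saves tokens
        # is prompt obesity; a section whose removal hurts is load-bearing.
        print(f"{name}: quality delta {delta:+.3f}, ~{tokens_saved} tokens saved")
```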
Most enterprise RAG systems only index written documents, missing the tacit knowledge that actually drives decisions. Here's how to build systems that capture what your engineers know before they walk out the door.
LLM temperature controls output variance — and that variance directly shapes user trust, engagement, and behavior. Most teams treat it as a technical default. It isn't.
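A sketch of making that variance observable, assuming a `sample(prompt, temperature)` stand-in for one completion call against your API:

```python
from collections import Counter

def sample(prompt: str, temperature: float) -> str:
    raise NotImplementedError("plug in your completion client")

def variance_profile(prompt: str, temps=(0.0, 0.4, 0.8, 1.2), n=20):
    for t in temps:
        outputs = [sample(prompt, t) for _ in range(n)]
        distinct = len(set(outputs))
        top_share = Counter(outputs).most_common(1)[0][1] / n
        # Distinct-answer count and modal-answer share are crude but
        # observable proxies for what users experience as consistency.
        print(f"T={t}: {distinct}/{n} distinct, top answer {top_share:.0%}")
```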
Text-to-SQL demos are easy; production deployments are not. Schema ambiguity, privilege escalation, and the gap between 80% benchmark scores and production accuracy expose the engineering layer most teams skip.
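A sketch of one piece of that layer: a pre-execution gate over model-generated SQL, here built on sqlglot's parser. The allow-list is illustrative, and the gate complements, not replaces, a read-only database role:

```python
import sqlglot
from sqlglot import exp

ALLOWED_TABLES = {"orders", "customers"}

def check_sql(sql: str) -> str:
    tree = sqlglot.parse_one(sql)
    # Only SELECTs: rejects the UPDATE/DELETE/DROP a model may emit.
    if not isinstance(tree, exp.Select):
        raise ValueError("only SELECT statements are allowed")
    # Privilege check at the application layer, not just in the prompt.
    for table in tree.find_all(exp.Table):
        if table.name not in ALLOWED_TABLES:
            raise ValueError(f"table not allow-listed: {table.name}")
    return sql
```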
Building on external model APIs means rate limits, behavioral drift, and cost shocks are imposed on you. Here's the architecture that survives provider changes, outages, and silent model updates.
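A sketch of the core move, an ordered-fallback wrapper; the provider list, `ProviderError`, and the backoff policy are illustrative assumptions, since real clients raise their own rate-limit and outage exceptions:

```python
import time

class ProviderError(Exception):
    pass

def complete(prompt: str, providers, retries: int = 2) -> str:
    # `providers` is an ordered list of (name, callable) pairs; order
    # encodes cost/quality preference, failover encodes availability.
    last_err = None
    for name, call in providers:
        for attempt in range(retries):
            try:
                return call(prompt)
            except ProviderError as e:
                last_err = e
                time.sleep(2 ** attempt)  # simple exponential backoff
        # This provider exhausted its retries; fail over to the next.
    raise RuntimeError(f"all providers failed: {last_err}")
```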
Treating ASR and OCR output as ground-truth text silently poisons downstream LLM reasoning — and the fix isn't better models, it's keeping confidence scores alive through the pipeline.
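A sketch of the idea, assuming word-level scores from the recognizer; the `Span` shape and threshold are illustrative, not any ASR/OCR SDK's types:

```python
from dataclasses import dataclass

@dataclass
class Span:
    text: str
    confidence: float  # per-word score from ASR or OCR

def to_prompt(spans: list[Span], floor: float = 0.6) -> str:
    # Low-confidence spans are marked, not silently passed through,
    # so the downstream model can hedge instead of confabulating.
    parts = []
    for s in spans:
        parts.append(s.text if s.confidence >= floor else f"[uncertain: {s.text}?]")
    return " ".join(parts)

# Example: OCR read "$1,000" with low confidence; the marker survives
# into the prompt instead of being flattened into ground truth.
print(to_prompt([Span("Total:", 0.98), Span("$1,000", 0.41)]))
```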
When a model update introduces subtly wrong behavior, users adapt their workflows around it. By the time you catch it and roll back, you may have two groups of broken users instead of one.