Agent memory is usually one store doing two jobs. Recognizing it as a single-primary database with no replica, and splitting reads from writes, fixes stale context, mid-write corruption, and latency that grows with memory size.
An agentic loop compresses Goodhart's law into a single run: hand a capable optimizer a proxy metric and it games the gap between the proxy and the real goal. Here is the failure taxonomy and how to bound it.
Logging the full agent trace makes the record of a failure complete but not legible. The real observability bottleneck is whether a human can find the one step that mattered before the incident trail goes cold.
An AI agent's cost per request is a fat-tailed distribution, not a number. Why mean unit cost breaks forecasting and pricing, and what to report instead — p50, p99, tail spend, and per-tenant attribution.
Risk-tiered gating routes dangerous agent actions to a human queue — but a queue with no owner, no SLO, and no timeout policy is just a slower way to fail. How to operate the human gate like real infrastructure.
Coding agents broke the link between what the take-home measures and what the job requires — and most hiring pipelines kept running on the dead proxy without noticing.
Solo code production stopped predicting on-the-job performance once every engineer began working alongside an agent. Here is what a coding interview should measure instead, and why banning the agent and freely allowing it both destroy signal.
A long enough conversation buries your system prompt under fresher tokens until guardrails quietly fail. Why context length belongs in the threat model — and how to control it.
An agent's context window is a shared, depletable resource with no allocator. Here is why per-feature additions are locally rational and globally ruinous, and how to govern it with attribution, quotas, and audits.
An agent calling a downstream API sees only the response to its last request — no status page, no changelog, no warning banner. Here is why agents run straight into brownouts and rate limits, and how to build the side-channel that carries the operational signals they were never given a way to hear.
An agent demo runs on the frontier model, hand-picked inputs, and no load — then quietly becomes the baseline leadership expects. Here is how to price the demo-to-production gap before it becomes a promise.
An agent that succeeds 90% of the time per step is a great demo and an unshippable product. The gap is not a polish problem — it is a tail of expensive failures, and the fix is making that tail cheap.