Dead Reckoning for Long-Running Agents: Knowing Where Your Agent Is Without Stopping It
Before GPS, sailors used dead reckoning: take your last confirmed position, note your speed and heading, and project forward. It works until the accumulated error compounds into something irreversible—a reef you didn't see coming.
Long-running AI agents have exactly this problem. When an agent spends two hours orchestrating API calls, writing documents, and executing multi-step plans, the people running it often have no better visibility than a sailor without instruments. The agent either finishes or it doesn't. The failure mode isn't the crash—it's the silent loop that burns $30 in tokens while appearing to work, or the agent that "successfully" completes the wrong task because its world model drifted an hour into execution.
Production data makes this concrete: agents with undetected loops have been documented repeating the same tool call 58 times before manual intervention. A two-hour runaway at frontier model rates costs $15–40 before anyone notices. And the worst failures aren't the ones that error out—they're the 12–18% of "successful" runs that return plausible-looking wrong answers.
