
Your Clock-in-Prompt Is a Correctness Boundary, Not a Log Field

· 10 min read
Tian Pan
Software Engineer

A scheduling agent booked a customer's onboarding call for Tuesday instead of Wednesday. The investigation took two days. The prompt was fine. The model was fine. The calendar tool was fine. The bug was that the system prompt carried a current_time field stamped an hour earlier, when the request routed through a cached prefix built just before midnight UTC. By the time the agent parsed "tomorrow at 10 AM" and called the booking tool, "tomorrow" referred to a day that was already "today" for the user in Tokyo.

The agent had no way to notice. It had nothing to notice with. LLMs do not have clocks. They have whatever string you handed them in the prompt, and they treat that string as authoritative the same way they treat the user's question as authoritative — which is to say, completely, without skepticism, without a second source to cross-check against.

Most teams know this in the abstract and still treat the timestamp they inject like a log field: something nice to have, rendered into the system prompt for context, nobody's explicit responsibility, nobody's correctness boundary. That framing is wrong. The timestamp is a correctness boundary. Every agent behavior that depends on "now" — scheduling, expiration, retry windows, "recently," "tomorrow," "in five minutes," freshness checks on retrieved documents — runs on whatever your time plumbing produced, and inherits every bug that plumbing has.

The four ways your clock lies to the agent

There are four distinct bugs that produce a wrong "now" inside the model, and they fail in different ways. A team that fixes one and ships often thinks they fixed the whole category.

Timezone default. Your server runs in UTC because that is the sensible server default. Someone wrote datetime.now() in the prompt-assembly code, without a timezone argument, because "the server is UTC, so UTC is fine." For a user in California at 10 PM local, UTC is already tomorrow; an agent asked for "today's schedule" fetches the wrong calendar day. For a user in Tokyo at 7 AM, UTC is still yesterday; "did I complete my workout today" returns yesterday's workout. The bug is deterministic per time-of-day and per user location, which means it reproduces for some users and not others, which means it takes a long time to surface.
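A minimal sketch of the bug and its fix, using Python's standard zoneinfo module (the helper name user_local_now is illustrative, not from the original system):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# The buggy pattern: a naive "now" on a UTC server. At 10 PM Pacific,
# this is already tomorrow's date.
naive_now = datetime.now()  # no tzinfo; implicitly the server's zone

def user_local_now(user_tz: str) -> datetime:
    """Resolve "now" in the user's IANA zone before rendering it
    into the prompt."""
    return datetime.now(timezone.utc).astimezone(ZoneInfo(user_tz))

# 2026-04-24 05:00 UTC is April 24 on the server, but still the
# evening of April 23 in Los Angeles.
utc_instant = datetime(2026, 4, 24, 5, 0, tzinfo=timezone.utc)
la_local = utc_instant.astimezone(ZoneInfo("America/Los_Angeles"))
assert la_local.date().isoformat() == "2026-04-23"
```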

Stale cached prompt. You cache the system prompt for cost reasons. Prompt caching works by matching a prefix; any byte that changes invalidates the cache. A naive implementation rebuilds the system prompt with a fresh timestamp every request, and the cache never hits — so someone moves the timestamp into a stable location "to keep the cache warm," and now the timestamp lives inside the cached prefix, frozen at whatever value it held when the prefix was built. The agent now gets an authoritative-looking current_time that is minutes, hours, or in pathological cases a full day old. There is a real architectural tension here that teams resolve in opposite directions, and neither direction is correct: either you break the cache on every call, or you serve stale time. The right answer is that the stable prefix contains no time at all, and "now" lives in the per-turn user message where it belongs — but that requires someone to notice the tension exists.
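The recommended split can be sketched like this — a time-free prefix that stays byte-stable for the cache, and a fresh per-turn fact in the user message (the prompt wording and build_turn_message helper are hypothetical):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Every byte of this prefix is cache-stable: no timestamp, no date.
STABLE_SYSTEM_PROMPT = (
    "You are a scheduling assistant. "
    "Treat the 'Current time' line in each user message as authoritative."
)

def build_turn_message(user_text: str, user_tz: str) -> str:
    """Prepend a fresh, per-turn time fact to the user message, so
    "now" is regenerated on every request without touching the cached
    prefix."""
    now = datetime.now(timezone.utc).astimezone(ZoneInfo(user_tz))
    return (
        f"Current time: {now.isoformat()} ({user_tz}, {now:%A})\n\n"
        f"{user_text}"
    )
```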

DST transitions. Twice a year, in jurisdictions that still observe it, the local clock moves by an hour. The agent reasoning about "schedule this for tomorrow 3 PM" picks up "tomorrow" from the injected date, adds "15:00," and hands that to a calendar API that interprets 15:00 in whatever timezone the API was told to use. If the agent's clock string was built before the DST transition and the booking fires after, the hour is wrong. The same bug applies to expirations: "this link is valid for 24 hours" stamped at the wrong side of a DST shift expires an hour early or an hour late, and the user whose session died forty minutes before they expected calls support.
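The expiration version of this bug is easy to reproduce with stdlib datetimes. Python's arithmetic on aware datetimes within one zone is wall-clock arithmetic, so "24 hours later" stamped across the 2026 US spring-forward transition elapses only 23 real hours:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

LA = ZoneInfo("America/Los_Angeles")

# Link stamped the evening before the US spring-forward transition
# (2026-03-08: clocks jump from 02:00 to 03:00).
issued = datetime(2026, 3, 7, 20, 0, tzinfo=LA)

# Wall-clock arithmetic: same local time tomorrow, but only 23 real
# hours have elapsed -- the link dies an hour early.
wall_expiry = issued + timedelta(hours=24)

# Absolute arithmetic: convert to UTC, add 24h, convert back.
utc_expiry = (issued.astimezone(timezone.utc)
              + timedelta(hours=24)).astimezone(LA)

assert wall_expiry.hour == 20  # looks like "24 hours later"
assert utc_expiry.hour == 21   # actually 24 elapsed hours
assert utc_expiry - wall_expiry == timedelta(hours=1)
```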

Context staleness in long-running agents. Modern agents are not request-response — they run for minutes, sometimes hours, through many tool calls. The system prompt was assembled at step zero. At step forty, the tool call fires. The current_time value the agent reasons with at step forty is whatever it was at step zero, unless you went out of your way to refresh it. "Retry in 5 minutes" decided at step zero and executed at step forty is now forty-five minutes in the past or forty-five minutes in the future, depending on how the agent interpreted its own instruction. MCP sessions and OAuth tokens expire inside these long runs too, for the same structural reason: "when does this become invalid" was computed against a snapshot of time that no longer applies.
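Two mitigations follow from this, sketched below with illustrative helper names (time_fact, resolve_retry, run_steps are assumptions, not the original system's API): re-inject a fresh "now" at each step boundary, and pin relative instructions like "retry in 5 minutes" to absolute instants at the moment they are decided:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def time_fact(user_tz: str) -> str:
    """A fresh, explicit "now", rendered once per step, not once per run."""
    now = datetime.now(timezone.utc).astimezone(ZoneInfo(user_tz))
    return f"Current time: {now.isoformat()} ({user_tz}, {now:%A})"

def resolve_retry(minutes: int) -> str:
    """Pin "retry in N minutes" to an absolute UTC instant at decision
    time. A relative instruction re-read at step forty is ambiguous; an
    ISO-8601 deadline is not."""
    deadline = datetime.now(timezone.utc) + timedelta(minutes=minutes)
    return deadline.isoformat()

def run_steps(steps, user_tz: str):
    """Hypothetical agent loop: each step's user message carries its
    own "now", so the agent can never drift more than one step stale."""
    transcript = []
    for step in steps:
        transcript.append({"role": "user",
                           "content": f"{time_fact(user_tz)}\n{step}"})
    return transcript
```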

Each of these is its own incident class. "We fixed the timezone bug" is a true statement that does not imply "we fixed the time bug."

Time is a fact, not a decoration

The reframe that matters: treat "now" the same way you treat any other fact the agent is expected to reason about correctly. Facts have provenance, they have a freshness window, they have explicit types, and they appear in the prompt in a location where the agent is expected to actually read them rather than glance past them.

Concretely, this means the time context should be:

  • Explicitly asserted, not buried in preamble. A line that says The current time is 2026-04-23T14:22:00-07:00 (Pacific Daylight Time). The user's local date is Thursday. is a fact the agent can cite. A vague "today is April 23, 2026" stamped in a block of boilerplate is decoration.
  • Timezone-qualified, with both the offset and the named zone. The offset alone (-07:00) does not disambiguate during DST transitions. The IANA zone name does, and it carries the future-dated transition rules with it.
  • Decomposed into the forms the agent will actually use. Agents ask "what day of the week is it" far more often than you'd think, and LLMs are notoriously bad at computing day-of-week from a bare date. Include day-of-week in the injected fact so the model does not have to do modular arithmetic it is poor at.
  • Refreshed at step boundaries, not only at the start of the conversation. Every non-trivial planner step, every tool call that depends on recency, every loop iteration — re-inject a fresh, explicit "now" as part of the step's user message. The cost is a few dozen tokens. The correctness gain is that the agent cannot drift more than one step out of sync.
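The four properties above can be rolled into one small builder. This is a minimal sketch (the time_context name and exact wording are illustrative): an ISO timestamp with offset, the named IANA zone, and a pre-computed day of week the model can cite instead of derive:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def time_context(user_tz: str) -> str:
    """Render "now" as an explicit, citable fact: full ISO-8601
    timestamp with offset, the named IANA zone (which carries the DST
    rules the bare offset lacks), and the day of week spelled out so
    the model never has to compute it from the date."""
    now = datetime.now(timezone.utc).astimezone(ZoneInfo(user_tz))
    return (
        f"The current time is {now.isoformat()} ({user_tz}). "
        f"The user's local date is {now:%A}, {now:%B} {now.day}, {now.year}."
    )
```

Inject the returned line into each per-turn user message rather than the cached system prefix, so refreshing it never invalidates the cache.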