
Agents as Cron Jobs: When Scheduled Triggers Beat Conversational Loops

· 10 min read
Tian Pan
Software Engineer

Most "agents" in production today are background jobs wearing a chat interface. They do not need a user typing into them. They need a trigger, a state file, and a way to resume after the inevitable timeout. The conversational loop — request, tool call, request, tool call, indefinitely — is a demo affordance that quietly became the default execution model, and it is the wrong model for the majority of agentic work that ships.

The decision is not philosophical. It shows up on the bill, in the on-call pager, and in the percentage of runs that finish at all. A conversational loop holds a model session open across many turns, accumulates context, and dies if any link in the chain fails. A scheduled trigger fires at a deterministic boundary, runs to completion or to a checkpoint, and writes its state somewhere durable before exiting. One is a phone call. The other is a job queue. Treating the two as interchangeable is how a $200/month feature becomes a $40,000/month feature without anyone changing the prompt.

The Default Loop Is Inheriting Failure Modes Nobody Tested For

When you wrap a model in a chat-style loop, you inherit the failure surface of a stateful long-lived session. That surface is large and lightly tested in most production code.

The first failure is context rot. Studies in 2025 documented what practitioners had been muttering about for a year: as a conversation grows, accuracy on instructions buried earlier in the window decreases, even within models advertising 200K-token contexts. The early system prompt and the user's actual goal get drowned by tool outputs, retries, and stale plans. The model does not announce that it has forgotten — it just produces shallower reasoning, recommends contradictory patterns within the same response, and silently abandons constraints from turn one.

The second failure is the contradiction pile. When a tool call fails, the failure stays in the context. When the agent tries a different approach, the new attempt sits next to the failed one. By turn forty, the context contains three implementations of the same function, three error messages, and three instructions to abandon each. The model is now reasoning about its own confusion instead of the task.

The third failure is resource exhaustion. A conversational loop with no hard turn cap is one prompt-injection or one bad tool response away from spending an entire monthly token budget in a single afternoon. Security researchers have already named the pattern — agentic resource exhaustion, the infinite-loop attack of the AI era — and the only durable defense is to not have unbounded loops in the first place.
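That defense fits in a few lines once the cap lives in the runtime rather than the prompt. A minimal sketch, where `call_model` and `run_tool` are hypothetical stubs standing in for a real model client and tool dispatcher:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for a model client and a tool dispatcher;
# a real version would call your LLM SDK here.
@dataclass
class Reply:
    done: bool
    text: str = ""
    tool_call: str = ""

def call_model(history):
    # Stub: "finishes" after three tool results, charging 1000 tokens per turn.
    if len(history) >= 4:
        return Reply(done=True, text="digest posted"), 1000
    return Reply(done=False, tool_call="fetch"), 1000

def run_tool(tool_call):
    return f"result of {tool_call}"

MAX_TURNS = 12        # hard cap: the runtime, not the model, ends the loop
MAX_TOKENS = 50_000   # per-run token budget

def run_bounded(task):
    spent = 0
    history = [task]
    for turn in range(MAX_TURNS):
        reply, used = call_model(history)
        spent += used
        if spent > MAX_TOKENS:
            raise RuntimeError(f"token budget exceeded at turn {turn}")
        if reply.done:
            return reply.text
        history.append(run_tool(reply.tool_call))
    raise RuntimeError("turn cap reached without completion")
```

The caps are crude on purpose: a prompt injection can talk the model into looping, but it cannot talk `range(MAX_TURNS)` into looping.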

The fourth failure is liveness. A long-running conversational session is, from an infrastructure perspective, a TCP-style state machine pretending to be stateless. Any network blip, any model rate-limit, any container restart kills the entire conversation. The work done in turns 1–30 is gone unless someone explicitly persisted it.

Most "Agents" Do Not Need to Be Conversations

A useful diagnostic question: who is on the other end of this agent's loop? If the answer is "another piece of software, on a schedule, with no human waiting," the agent should not be a chat. It should be a job.

Look at the actual workloads people are calling agents in 2026:

  • A nightly job that scans yesterday's support tickets, clusters them, and posts a digest to Slack.
  • A scheduled crawler that monitors competitor pricing and writes deltas to a database.
  • A weekly compliance bot that diffs new SaaS contracts against a policy template.
  • An hourly monitor that triages new GitHub issues and assigns labels.
  • A document-enrichment pipeline that runs whenever a file lands in S3.

None of these has a user typing into it. None of them benefits from a long-lived session. Every one of them benefits from being a deterministic, idempotent, retriable job that runs at known boundaries, writes its findings to durable storage, and exits cleanly. The conversational framing is not just unnecessary — it is the source of the reliability and cost problems these systems hit in month two.
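The nightly ticket digest from the list above illustrates the shape. A toy sketch, where `summarize` and `post` are hypothetical stand-ins for the model call and the Slack client; a per-day marker file makes re-runs a no-op:

```python
import datetime
import json
import pathlib

STATE_DIR = pathlib.Path("digest_runs")

def summarize(tickets):
    # Stand-in for the model call that clusters and summarizes tickets.
    return f"{len(tickets)} tickets clustered"

def post(digest):
    pass  # stand-in for the Slack client

def nightly_digest(tickets, today=None):
    today = today or datetime.date.today().isoformat()
    STATE_DIR.mkdir(exist_ok=True)
    marker = STATE_DIR / f"{today}.json"
    if marker.exists():
        # Idempotent: a retried or double-fired run returns the stored result.
        return json.loads(marker.read_text())["digest"]
    digest = summarize(tickets)
    post(digest)
    marker.write_text(json.dumps({"digest": digest}))
    return digest
```

Retriability falls out for free: if the scheduler fires twice, or an operator re-runs yesterday by hand, the marker file absorbs it.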

The instinct to chat-wrap them comes from the SDK, not the workload. Frameworks ship with a conversational loop as the path of least resistance, and teams build their first prototype in that loop because it is what the demo showed. The loop then survives into production, where it costs more, fails more, and is harder to operate than the equivalent batch job would have been.

The Cron-Shaped Alternative

Treating an agent as a cron job means three commitments:

Deterministic triggers. The agent runs because something fires it — a schedule, a webhook, a queue event — not because a user is holding a session open. The trigger is the only entry point. There is no "and then it just keeps running."

Checkpointed state, not in-context state. The agent writes its plan, intermediate findings, and progress to disk or to a workflow runtime (Temporal, Restate, the Microsoft Agent Framework checkpoint API, LangGraph persistence) at every meaningful boundary. The context window is treated as a scratch buffer that can vaporize at any time, not as the source of truth. If the process dies at step seven of ten, step eight reads the state file and resumes; it does not replay the first seven turns into a fresh model session.
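The resume behavior described here is mechanically simple. A toy sketch, assuming each step is a function of the accumulated findings; `STEPS` stands in for whatever the agent's plan contains:

```python
import json
import pathlib

# Ten toy steps; each reads prior findings and appends a new one.
STEPS = [lambda s, i=i: s + [f"finding-{i}"] for i in range(10)]

def run(checkpoint: pathlib.Path):
    # Resume from the checkpoint if one exists; otherwise start fresh.
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())
    else:
        state = {"step": 0, "findings": []}
    for i in range(state["step"], len(STEPS)):
        state["findings"] = STEPS[i](state["findings"])
        state["step"] = i + 1
        # Durable after every step: a crash at step 7 restarts at step 8.
        checkpoint.write_text(json.dumps(state))
    return state
```

The same loop works whether the checkpoint lives in a JSON file, a Postgres row, or a workflow runtime; only the read and write lines change.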

A bounded execution envelope. Every run has a wall-clock timeout, a token budget, and a maximum step count. If the work is not done when the envelope closes, the run checkpoints what it has and exits. The next scheduled fire-up resumes. This is how operating systems have run reliable workloads for fifty years; there is no AI exception.

The economic case is just as concrete. Anthropic and OpenAI both offer batch APIs at a 50% discount on input and output tokens for asynchronous workloads with multi-hour completion windows. Combined with prompt caching, the effective cost of a batch-style agent run can drop more than 90% versus the same logic in a real-time conversational loop. For workloads that genuinely do not need sub-second response — and most "agents" do not — staying on the synchronous tier is paying a premium for a latency budget you are not using.
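A back-of-envelope comparison makes the gap concrete. The per-token price below is an assumed placeholder, not a current rate card; the point is the shape of the curves, quadratic context accumulation versus linear batch work:

```python
# Assumed placeholder price, not any vendor's current rate.
PRICE = 3.00 / 1_000_000   # $ per input token, synchronous tier
BATCH_DISCOUNT = 0.50      # batch APIs: 50% off
items, tokens_per_item = 1_000, 2_000

# Conversational loop: context accumulates, so processing item k
# resends the k-1 items already in the window.
loop_tokens = sum(k * tokens_per_item for k in range(1, items + 1))
loop_cost = loop_tokens * PRICE

# Cron-shaped batch: each item is an independent run at the batch rate.
batch_tokens = items * tokens_per_item
batch_cost = batch_tokens * PRICE * BATCH_DISCOUNT
```

At these toy numbers the loop pays for roughly a billion input tokens where the batch pays for two million, before prompt caching is even applied. The exact ratio depends on your workload; the quadratic term does not.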

What You Inherit When You Confuse the Two

The most expensive bugs are not the obvious crashes. They are the ones that come from running a batch workload through a conversational substrate, or vice versa.

Run a batch workload as a long conversation, and you inherit: unbounded token growth as context accumulates across what should have been independent runs; cross-run contamination, where a stale instruction from yesterday's run leaks into today's; impossible debugging, because every run has a different in-context history; and pricing that scales with the square of session length rather than linearly with work.

Run a conversational workload as a series of detached cron jobs, and you inherit a different set: total loss of conversational coherence, because each fire-up cannot see the prior turn's reasoning; constant re-derivation of context the user already established; and user-facing latency whose floor is now your scheduler's tick rate rather than the model's response time.

The failure mode in each direction looks identical from the outside: "the agent is unreliable." Internally, the diagnosis is opposite. Treating both with the same playbook — adding more tools, fattening the prompt, swapping models — does nothing because the substrate is wrong.

A useful litmus test before you ship: if a human were doing this work, would they sit at a keyboard for the entire duration, or would they kick it off and check back later? If the answer is "kick it off," the implementation should match. The fact that the underlying tool is a chat-completions API is an implementation detail, not a UX mandate.

Designing the Trigger, the State, and the Envelope

Three concrete design choices separate a cron-shaped agent from a conversational one cosplaying as a job.

The trigger should be the only mutator. A scheduled agent runs because cron fired, because a queue had a message, or because a webhook arrived. It does not have a side channel for "the user can also just call it." Every entry point you add is a state-machine branch you have to test, and the entire reliability story relies on runs being independent. If a human really does need to invoke it ad-hoc, route their request through the same queue everyone else uses; do not give them a back door into the same code path.
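The single-entry-point discipline can be sketched directly; `enqueue` and `worker_tick` are illustrative names here, not any framework's API:

```python
import queue

# One queue as the sole entry point. Cron, webhooks, and ad-hoc human
# requests all enqueue the same message shape; nothing calls the agent
# code path directly.
jobs = queue.Queue()

def enqueue(source: str, payload: dict):
    jobs.put({"source": source, "payload": payload})

def worker_tick(handle):
    # The scheduler calls this; it drains one job per invocation.
    job = jobs.get_nowait()
    return handle(job)
```

The `source` field preserves provenance for auditing without creating a second code path: the human's ad-hoc request is just another message.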

The state file is the agent. Treat the model as a stateless function from state -> next_state. The orchestrator reads the state from durable storage, calls the model with only the slice of state relevant to the next step, writes the result back, and exits. This is the pattern Temporal and Restate were designed for, and it is also achievable with a Postgres row, a JSON file, or whatever your team already operates. The point is that the state is named, versioned, and inspectable — you can resume, replay, or audit any run without convincing a long-lived process to cooperate.
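The read, call, write-back, exit cycle reduces to a small orchestrator. A sketch where a plain dict stands in for the Postgres row and `step` stands in for one bounded model call:

```python
# The model as a stateless function state -> next_state. `step` is a
# hypothetical stand-in for one model call; `store` stands in for a
# Postgres row or a JSON file.
def step(state: dict) -> dict:
    done = state["cursor"] + 1 >= len(state["items"])
    return {**state, "cursor": state["cursor"] + 1, "done": done}

def orchestrate(store: dict, key: str) -> dict:
    state = store[key]          # read durable state
    if not state.get("done"):
        state = step(state)     # one bounded model call
        store[key] = state      # write back, then exit
    return state
```

Each trigger advances the run by exactly one step and exits; replaying or auditing a run means reading a row, not reconstructing a session.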

The envelope is non-negotiable. A wall-clock timeout, a token cap, and a maximum step count must be enforced by the orchestrator, not by the model's good intentions. Nothing in the prompt prevents an agent from looping forever; only the runtime can. When the envelope closes, the run writes a checkpoint and exits. The next trigger picks up from the checkpoint. This is the difference between a system you can leave running over a holiday and one that pages someone at 3 a.m. because a tool call hung.
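Enforcing the envelope at the orchestrator is a dozen lines, not a framework feature. A sketch, where `work` stands in for one agent step and `checkpoint` is whatever durable state that step mutates:

```python
import time

# Runtime-enforced envelope: the orchestrator, not the prompt, decides
# when a run ends. `work` performs one step against the checkpoint and
# returns True when the task is complete.
def run_with_envelope(work, checkpoint, wall_clock_s=60.0, max_steps=20):
    deadline = time.monotonic() + wall_clock_s
    for _ in range(max_steps):
        if time.monotonic() > deadline:
            break                # envelope closed: stop cleanly, don't crash
        if work(checkpoint):
            return "complete"
    return "checkpointed"        # the next trigger resumes from the checkpoint
```

A real version would also cap tokens (as the loop sketch earlier does) and wrap each `work` call in its own per-step timeout so a hung tool call cannot eat the whole envelope.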

What Stays Conversational

This is not an argument that chat agents are useless. Some workloads genuinely need the conversational substrate: copilots where a human is iterating in real time, support bots where a customer is waiting on the other end, coding assistants where the user types and corrects. In those cases, the long-lived session is doing real work — it is encoding the human's evolving intent in a way that no checkpoint file can.

The error is the default. Today, most teams reach for the conversational loop because it is the shape the SDK ships in, not because the workload requires it. Reversing the default — treating jobs as jobs unless a human is in the loop, and applying the conversational substrate only when the user's continuous presence is part of the value — is the cheapest reliability and cost improvement available to most agent teams in 2026. It does not require a better model, a new framework, or a larger context window. It requires noticing that the work was never a conversation in the first place.
