Cron & Heartbeat

One-line summary

The cron service manages scheduled jobs (reminders, periodic tasks) and heartbeats. Heartbeats are a hidden token drain: by default, one fires every 10 minutes and sends the full session context just to get "HEARTBEAT_OK" back.

Responsibilities

  • Schedule and execute recurring jobs (cron expressions, interval-based, one-shot)
  • Manage heartbeats — periodic health-check LLM calls on the main session
  • Execute jobs in isolated agent sessions or the main session
  • Handle error backoff, stuck run detection, and missed job catch-up
  • Persist job state and track execution telemetry (including token usage)

Key source files

| File | Lines | Role |
|---|---|---|
| src/cron/service.ts | 57 | Public API: CronService class with start/stop/list/add/update/remove/run/wake |
| src/cron/service/timer.ts | 877 | Timer engine: onTimer(), runDueJobs(), executeJobCore(), heartbeat execution, error backoff |
| src/cron/service/jobs.ts | 634 | Scheduling: computeJobNextRunAtMs(), isJobDue(), schedule parsing, next-run computation |
| src/cron/service/ops.ts | 459 | CRUD operations: add/update/remove/list with validation and state management |
| src/cron/service/state.ts | 141 | State & dependencies: CronServiceDeps interface, CronServiceState, heartbeat functions |
| src/cron/types.ts | 143 | Types: CronJob, CronSchedule (at/every/cron), CronPayload (systemEvent/agentTurn), CronRunTelemetry |
| src/cron/service/store.ts | | Job persistence (JSON file) |
| src/cron/isolated-agent.ts | | Isolated agent execution for scheduled jobs |
| src/cron/delivery.ts | | Delivery plan resolution |
| src/cron/session-reaper.ts | | Session cleanup after job completion |

Data flow

Timer tick cycle

Every MAX_TIMER_DELAY_MS (60s) or at next job due time:

onTimer()

runDueJobs() — check all jobs

For each due job:

executeJobCore()

  ├── sessionTarget = "main":
  │     ├── enqueueSystemEvent() — inject into main session
  │     └── if wakeMode = "now":
  │           ├── runHeartbeatOnce() — full LLM call
  │           ├── Retry while "requests-in-flight" (250ms delay, 2min max)
  │           └── Fallback: requestHeartbeatNow() (async)

  └── sessionTarget = "isolated":
        ├── runIsolatedAgentJob() — separate session
        ├── Track token usage in CronRunTelemetry
        └── Optionally post summary to main session

applyJobResult() — update job state

emit("finished") — with telemetry (model, provider, usage)

armTimer() — schedule next tick
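
The arming rule at the top of this cycle ("every MAX_TIMER_DELAY_MS or at next job due time") can be sketched as below. This is a minimal illustration, not the code in timer.ts: the JobSketch shape and both function names are hypothetical, and only MAX_TIMER_DELAY_MS is taken from this page.

```typescript
// Hypothetical sketch of the timer arming rule; not the actual timer.ts code.
const MAX_TIMER_DELAY_MS = 60_000; // documented cap on the tick interval

// Assumed minimal job shape for illustration only.
interface JobSketch {
  id: string;
  enabled: boolean;
  nextRunAtMs?: number; // undefined when no run is scheduled
}

/** Delay until the next tick: time to the earliest due job, capped at 60s, never negative. */
function computeTimerDelayMs(jobs: JobSketch[], nowMs: number): number {
  let earliest = Infinity;
  for (const job of jobs) {
    if (job.enabled && job.nextRunAtMs !== undefined) {
      earliest = Math.min(earliest, job.nextRunAtMs);
    }
  }
  // With no scheduled jobs, earliest stays Infinity and the cap applies.
  return Math.min(MAX_TIMER_DELAY_MS, Math.max(0, earliest - nowMs));
}

/** Jobs whose next run time has been reached at nowMs. */
function selectDueJobs(jobs: JobSketch[], nowMs: number): JobSketch[] {
  return jobs.filter(
    (job) => job.enabled && job.nextRunAtMs !== undefined && job.nextRunAtMs <= nowMs,
  );
}
```

Capping the delay means the service wakes at least once a minute even when the next job is hours away, which matches the stated purpose of the constant (limiting drift).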

Heartbeat flow (the token-expensive path)

Heartbeat cron job fires (default: every 600 seconds)

executeJobCore() with sessionTarget = "main", wakeMode = "now"

enqueueSystemEvent() — injects heartbeat prompt into main session queue

runHeartbeatOnce()

Full LLM call with:
  - Complete system prompt (~3,000-5,000 tokens)
  - Full session history (40,000-150,000+ tokens)
  - Tool definitions (~2,000-4,000 tokens)
  - Skills catalog (~2,000 tokens)

Agent responds: "HEARTBEAT_OK" (if nothing needs attention)
  OR: Alert text (if something needs user attention)
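
The two possible outcomes of a heartbeat reply can be modelled as a small tagged union. The sketch below assumes an exact match on "HEARTBEAT_OK" after trimming whitespace; how the real code matches the reply is not documented here, and classifyHeartbeatReply and HeartbeatOutcome are hypothetical names.

```typescript
// Hypothetical sketch: classifies a heartbeat reply as OK or as an alert.
// Assumption: the OK reply is exactly "HEARTBEAT_OK" once whitespace is trimmed.
type HeartbeatOutcome =
  | { kind: "ok" }
  | { kind: "alert"; text: string };

const HEARTBEAT_OK_TOKEN = "HEARTBEAT_OK";

function classifyHeartbeatReply(reply: string): HeartbeatOutcome {
  const trimmed = reply.trim();
  if (trimmed === HEARTBEAT_OK_TOKEN) {
    return { kind: "ok" };
  }
  // Anything else is treated as alert text that needs user attention.
  return { kind: "alert", text: trimmed };
}
```

Note that under this assumption an empty reply is classified as an alert with empty text; a real implementation may want to handle that case separately.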

Critical constants

| Constant | Value | Location | Impact |
|---|---|---|---|
| MAX_TIMER_DELAY_MS | 60,000 ms (60s) | timer.ts | Maximum timer interval to prevent drift |
| MIN_REFIRE_GAP_MS | 2,000 ms (2s) | timer.ts | Minimum gap between consecutive fires |
| DEFAULT_JOB_TIMEOUT_MS | 600,000 ms (10 min) | timer.ts | Default execution timeout per job |
| STUCK_RUN_MS | 7,200,000 ms (2h) | timer.ts | Threshold for stuck running markers |
| Default heartbeat interval | 600s (10 min) | config | When cron.heartbeat.enabled: true |
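
To make two of these constants concrete, here is a rough sketch of how a stuck-run check and a refire-gap clamp might look. The function names and the exact comparison (>= rather than >) are assumptions; only the constant names and values come from the table above.

```typescript
// Hypothetical sketch; the constant values are the documented ones,
// the helper functions are illustrative and not taken from timer.ts.
const STUCK_RUN_MS = 7_200_000; // 2h
const MIN_REFIRE_GAP_MS = 2_000; // 2s

/** True when a job has carried a "running" marker for at least STUCK_RUN_MS. */
function isRunStuck(runningSinceMs: number | undefined, nowMs: number): boolean {
  return runningSinceMs !== undefined && nowMs - runningSinceMs >= STUCK_RUN_MS;
}

/** Pushes a proposed next run time out so it is at least MIN_REFIRE_GAP_MS after the last fire. */
function clampRefireAtMs(proposedNextRunAtMs: number, lastFiredAtMs: number): number {
  return Math.max(proposedNextRunAtMs, lastFiredAtMs + MIN_REFIRE_GAP_MS);
}
```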

Error backoff schedule (timer.ts:108-114)

1st error  →  30 seconds
2nd error  →  1 minute
3rd error  →  5 minutes
4th error  →  15 minutes
5th+ error →  60 minutes

After 3 consecutive schedule errors → auto-disable job (MAX_SCHEDULE_ERRORS = 3).
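
The schedule above maps naturally onto a lookup table indexed by the consecutive error count. The following is a sketch under that assumption; the function names are hypothetical and the real code at timer.ts:108-114 may be structured differently.

```typescript
// Sketch of the documented backoff schedule; helper names are hypothetical.
const ERROR_BACKOFF_MS: number[] = [
  30_000,      // 1st error: 30 seconds
  60_000,      // 2nd error: 1 minute
  5 * 60_000,  // 3rd error: 5 minutes
  15 * 60_000, // 4th error: 15 minutes
  60 * 60_000, // 5th+ error: 60 minutes
];
const MAX_SCHEDULE_ERRORS = 3;

/** Backoff delay for the given number of consecutive errors (0 means no backoff). */
function errorBackoffMs(consecutiveErrors: number): number {
  if (consecutiveErrors <= 0) return 0;
  const index = Math.min(consecutiveErrors, ERROR_BACKOFF_MS.length) - 1;
  // The fallback only satisfies strict index checks; index is always in range.
  return ERROR_BACKOFF_MS[index] ?? 60 * 60_000;
}

/** True once a job has accumulated enough consecutive schedule errors to be auto-disabled. */
function shouldAutoDisable(consecutiveScheduleErrors: number): boolean {
  return consecutiveScheduleErrors >= MAX_SCHEDULE_ERRORS;
}
```

Clamping the index makes every error past the fifth reuse the 60-minute delay, so a persistently failing job retries at most once an hour.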

Token optimization impact

Heartbeats are the most cost-inefficient token consumer in OpenClaw:

| Scenario | Input tokens/heartbeat | Cost per heartbeat (at $3/M input) | Daily cost (6/hour) |
|---|---|---|---|
| Light session (5 turns) | ~15,000 | $0.045 | $6.48 |
| Medium session (15 turns) | ~50,000 | $0.15 | $21.60 |
| Heavy session (30+ turns) | ~150,000 | $0.45 | $64.80 |
| Full context (200k) | ~200,000 | $0.60 | $86.40 |
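
The figures in this table follow from simple arithmetic: cost per heartbeat is input tokens divided by one million times the input price, and the daily cost multiplies that by 6 calls/hour over 24 hours (144 calls). A small sketch for recomputing them with other prices or frequencies; the function names are illustrative, and output tokens are ignored because the reply is only about 2 tokens.

```typescript
// Recomputes the table's numbers. Assumes input-token pricing only.
const INPUT_PRICE_PER_MILLION_USD = 3;
const HEARTBEATS_PER_HOUR = 6;

/** Input cost in USD of a single heartbeat call. */
function heartbeatCostUsd(inputTokens: number): number {
  return (inputTokens / 1_000_000) * INPUT_PRICE_PER_MILLION_USD;
}

/** Input cost in USD of a full day of heartbeats (6/hour x 24h = 144 calls). */
function dailyHeartbeatCostUsd(inputTokens: number): number {
  return heartbeatCostUsd(inputTokens) * HEARTBEATS_PER_HOUR * 24;
}

// Medium session from the table: ~50,000 input tokens per heartbeat.
console.log(dailyHeartbeatCostUsd(50_000).toFixed(2)); // prints "21.60"
```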

Why heartbeats are expensive

  1. Full context sent: The heartbeat uses the same code path as a regular message — runHeartbeatOnce() triggers a full LLM call with complete system prompt, session history, tool definitions, and skills catalog
  2. Response is minimal: The agent typically responds with just HEARTBEAT_OK (~2 output tokens) for ~50,000+ input tokens
  3. Grows with session: Heartbeat cost scales linearly with session history length
  4. 6 calls/hour default: Every 10 minutes, burning tokens for a 2-token response

Optimization opportunities

  • Minimal heartbeat context: Send only identity line + "Is anything pending?" instead of full context
  • Adaptive heartbeat frequency: Increase interval when session is idle, decrease when active
  • Skip when session recently active: If user messaged within last N minutes, skip heartbeat
  • Heartbeat-specific prompt mode: Use PromptMode = "none" (15 tokens) instead of "full" (5,000 tokens)
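
Two of these ideas (skipping when the session was recently active, and stretching the interval while idle) are easy to prototype. The sketch below is purely illustrative: none of these helpers exist in the cron source, and the doubling policy, step size, and cap are arbitrary assumptions chosen for the example.

```typescript
// Hypothetical helpers for the "skip when recently active" and
// "adaptive frequency" ideas; nothing here is taken from the codebase.
const BASE_INTERVAL_MS = 600_000; // documented default heartbeat interval (600s)

/** Skip the heartbeat if the user sent a message within the last recentWindowMs. */
function shouldSkipHeartbeat(
  lastUserMessageAtMs: number,
  nowMs: number,
  recentWindowMs: number,
): boolean {
  return nowMs - lastUserMessageAtMs < recentWindowMs;
}

/**
 * Doubles the interval for every full idleStepMs of idle time, up to maxIntervalMs.
 * An active session (idle time below one step) keeps the base interval.
 */
function adaptiveIntervalMs(idleMs: number, idleStepMs: number, maxIntervalMs: number): number {
  const steps = Math.floor(Math.max(0, idleMs) / idleStepMs);
  return Math.min(maxIntervalMs, BASE_INTERVAL_MS * Math.pow(2, steps));
}
```

With a one-hour step and a four-hour cap, a session idle for five hours would heartbeat every 4 hours instead of every 10 minutes, cutting calls from 144/day to 6/day.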

Schedule types

type CronSchedule =
  | { at: string }     // One-shot: ISO datetime or relative ("in 30m")
  | { every: string }  // Interval: "10m", "2h", "1d"
  | { cron: string };  // Cron expression: "0 */6 * * *"
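
As an illustration of the interval form, the sketch below converts an "every" string into milliseconds. It handles only the three units shown in the examples (m, h, d); whether the real parser in jobs.ts accepts other units or compound values is not covered here, and parseEveryMs is a hypothetical name.

```typescript
// Hypothetical parser for interval strings such as "10m", "2h", "1d".
// Assumption: a positive integer followed by exactly one unit letter.
const UNIT_MS: Record<string, number> = {
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function parseEveryMs(every: string): number {
  const match = /^(\d+)([mhd])$/.exec(every.trim());
  if (match === null) {
    throw new Error(`Unsupported interval: ${every}`);
  }
  const amount = Number(match[1] ?? "");
  const unitMs = UNIT_MS[match[2] ?? ""];
  if (unitMs === undefined || !Number.isFinite(amount) || amount <= 0) {
    throw new Error(`Unsupported interval: ${every}`);
  }
  return amount * unitMs;
}
```

For example, parseEveryMs("10m") yields 600000, which is the default heartbeat interval.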

Job payload types

type CronPayload =
  | { systemEvent: string }   // Inject text into main session
  | { agentTurn: string };    // Execute agent turn in isolated session
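
Because the two payload shapes have distinct keys, the `in` operator is enough to tell them apart. A minimal sketch of routing by payload type, using the simplified shapes shown above; resolveTarget and RoutedPayload are hypothetical names, and the actual types in types.ts may carry more fields.

```typescript
// Sketch of payload routing using the simplified shapes from this page.
type CronPayload =
  | { systemEvent: string }
  | { agentTurn: string };

// Hypothetical result type for illustration.
interface RoutedPayload {
  sessionTarget: "main" | "isolated";
  text: string;
}

function resolveTarget(payload: CronPayload): RoutedPayload {
  if ("systemEvent" in payload) {
    // System events are injected into the main session.
    return { sessionTarget: "main", text: payload.systemEvent };
  }
  // Agent turns run in an isolated session.
  return { sessionTarget: "isolated", text: payload.agentTurn };
}
```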

How it connects to other modules

  • Depends on:

    • auto-reply/: runHeartbeatOnce(), isolated agent execution
    • agents/pi-embedded-runner/ — LLM execution for heartbeat and agent turns
    • sessions/ — session targeting, key resolution
    • config/ — heartbeat settings, cron configuration
    • channels/ — delivery to messaging platforms
  • Depended on by:

    • gateway/ — starts/stops cron service with gateway lifecycle
    • System prompt — includes heartbeat prompt instruction section

My blind spots

  • isolated-agent.ts — exact execution flow for isolated agent jobs vs main session
  • delivery.ts — how delivery plans work for multi-channel delivery
  • session-reaper.ts — when and how old cron sessions are cleaned up
  • Whether heartbeat uses prompt caching (Anthropic) — this would dramatically reduce heartbeat cost
  • The exact heartbeat prompt text and whether it's customizable per agent
  • stagger.ts — stagger window logic for avoiding thundering herd

Change frequency

  • timer.ts: Medium — backoff logic, heartbeat retry behavior, timing constants evolve
  • jobs.ts: Medium — scheduling logic changes with new schedule types
  • ops.ts: Medium — CRUD operations expand with new features (pagination, filtering)
  • service.ts: Low — thin wrapper, rarely changes
  • types.ts: Low — type additions are additive and rare