openclaw/src
Gustavo Madeira Santana e19a23520c
fix: unify session maintenance and cron run pruning (#13083)
* fix: prune stale session entries, cap entry count, and rotate sessions.json

The sessions.json file grows unbounded over time. Every heartbeat tick (default: 30m)
triggers multiple full rewrites, and session keys from groups, threads, and DMs
accumulate indefinitely with large embedded objects (skillsSnapshot,
systemPromptReport). At >50MB the synchronous JSON parse blocks the event loop,
causing Telegram webhook timeouts and effectively taking the bot down.

Three mitigations, all running inside saveSessionStoreUnlocked() on every write:

1. Prune stale entries: remove entries with updatedAt older than 30 days
   (configurable via session.maintenance.pruneDays in openclaw.json)

2. Cap entry count: keep only the 500 most recently updated entries
   (configurable via session.maintenance.maxEntries). Entries without updatedAt
   are evicted first.

3. File rotation: if the existing sessions.json exceeds 10MB before a write
   (threshold configurable via session.maintenance.rotateBytes), rename it to
   sessions.json.bak.{timestamp} and keep only the 3 most recent backups.
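
The three steps above can be sketched as pure functions. This is a minimal
illustration, not the code inside saveSessionStoreUnlocked(): the entry shape,
the function names, the choice to leave entries without updatedAt to the cap
step, and the numeric backup timestamp suffix are all assumptions.

```typescript
// Hypothetical entry shape: only updatedAt is named in the commit message.
type SessionEntry = { updatedAt?: number }; // epoch milliseconds
type SessionStore = Record<string, SessionEntry>;

const DAY_MS = 24 * 60 * 60 * 1000;
const BACKUP_PREFIX = "sessions.json.bak.";

// Step 1: drop entries whose updatedAt is older than pruneDays.
// Assumption: entries without updatedAt are kept here and left to the cap step.
function pruneStale(store: SessionStore, pruneDays: number, now: number): SessionStore {
  const cutoff = now - pruneDays * DAY_MS;
  const kept: SessionStore = {};
  for (const [key, entry] of Object.entries(store)) {
    if (entry.updatedAt === undefined || entry.updatedAt >= cutoff) {
      kept[key] = entry;
    }
  }
  return kept;
}

// Step 2: keep only the maxEntries most recently updated entries.
// Entries without updatedAt sort last, so they are evicted first.
function capEntries(store: SessionStore, maxEntries: number): SessionStore {
  const sorted = Object.entries(store).sort(
    ([, a], [, b]) => (b.updatedAt ?? -1) - (a.updatedAt ?? -1),
  );
  return Object.fromEntries(sorted.slice(0, maxEntries));
}

// Step 3 (cleanup half): given the file names in the store directory, return
// the backups to delete so that only the `keep` newest remain.
// Assumption: {timestamp} is a number that sorts chronologically.
function backupsToDelete(fileNames: string[], keep: number): string[] {
  const stamp = (name: string) => Number(name.slice(BACKUP_PREFIX.length));
  return fileNames
    .filter((name) => name.startsWith(BACKUP_PREFIX))
    .sort((a, b) => stamp(b) - stamp(a))
    .slice(keep);
}
```

Keeping the steps free of file I/O makes them easy to test with a fixed `now`
instead of the real clock.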

All three thresholds are configurable under session.maintenance in openclaw.json
with Zod validation. No env vars.
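
As a rough sketch of how the three thresholds and their defaults fit together,
the snippet below merges user settings over the defaults named above. The
hand-written checks stand in for the Zod schema, resolveMaintenance is a
hypothetical helper, and reading 10MB as 10 * 1024 * 1024 bytes is an
assumption.

```typescript
// Keys and default values come from the commit message; the rest is illustrative.
type SessionMaintenance = {
  pruneDays: number;
  maxEntries: number;
  rotateBytes: number;
};

const MAINTENANCE_DEFAULTS: SessionMaintenance = {
  pruneDays: 30,
  maxEntries: 500,
  rotateBytes: 10 * 1024 * 1024, // assumption: 10MB counted in binary units
};

// Hypothetical helper: overlay user settings on the defaults and reject
// values that are not positive finite numbers.
function resolveMaintenance(raw: Partial<SessionMaintenance> = {}): SessionMaintenance {
  const merged: SessionMaintenance = { ...MAINTENANCE_DEFAULTS, ...raw };
  for (const [name, value] of Object.entries(merged)) {
    if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
      throw new RangeError(`session.maintenance.${name} must be a positive number`);
    }
  }
  return merged;
}
```

For example, `resolveMaintenance({ pruneDays: 7 })` would keep the default
maxEntries and rotateBytes while shortening the prune window to a week.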

Existing tests now use Date.now() instead of near-epoch timestamps (1, 2, 3),
which the new pruning would otherwise treat as stale.

27 new tests covering pruning, capping, rotation, and integration scenarios.

* feat: auto-prune expired cron run sessions (#12289)

Add TTL-based reaper for isolated cron run sessions that accumulate
indefinitely in sessions.json.

New config option:
  cron.sessionRetention: string | false  (default: '24h')

The reaper piggybacks on the cron timer tick and throttles itself to sweep
at most once every 5 minutes. It removes session entries whose key matches
the pattern cron:<jobId>:run:<uuid> and whose updatedAt + retention < now.

Design follows the Kubernetes ttlSecondsAfterFinished pattern:
- Sessions are persisted normally (observability/debugging)
- A periodic reaper prunes expired entries
- Configurable retention with sensible default
- Set to false to disable pruning entirely
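
The reaper behaviour described above can be sketched as follows. This is not
the sweepCronRunSessions implementation: createReaper, parseRetention, the
duration grammar (a number followed by s, m, h, or d), and the exact key regex
are assumptions made for illustration.

```typescript
type SessionEntry = { updatedAt?: number }; // epoch milliseconds
type SessionStore = Record<string, SessionEntry>;

// Assumption: keys look exactly like cron:<jobId>:run:<uuid>, with no prefix.
const CRON_RUN_KEY = /^cron:[^:]+:run:[^:]+$/;
const MIN_SWEEP_INTERVAL_MS = 5 * 60 * 1000;
const UNIT_MS: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000, d: 86_400_000 };

// Hypothetical parser for values such as '24h'; false disables pruning.
function parseRetention(value: string | false): number | null {
  if (value === false) {
    return null;
  }
  const match = /^(\d+)([smhd])$/.exec(value.trim());
  if (!match) {
    throw new Error(`invalid retention: ${value}`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Returns a sweep function meant to be called on every timer tick.
// It throttles itself and reports the keys it removed from the store.
function createReaper(retention: string | false = "24h") {
  const retentionMs = parseRetention(retention);
  let lastSweepAt = -Infinity;
  return function sweep(store: SessionStore, now: number): string[] {
    if (retentionMs === null || now - lastSweepAt < MIN_SWEEP_INTERVAL_MS) {
      return [];
    }
    lastSweepAt = now;
    const removed: string[] = [];
    for (const [key, entry] of Object.entries(store)) {
      const expired = entry.updatedAt !== undefined && entry.updatedAt + retentionMs < now;
      if (CRON_RUN_KEY.test(key) && expired) {
        delete store[key];
        removed.push(key);
      }
    }
    return removed;
  };
}
```

Because the throttle lives inside the sweep function, the timer can call it on
every tick without tracking when the last sweep happened.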

Files changed:
- src/config/types.cron.ts: Add sessionRetention to CronConfig
- src/config/zod-schema.ts: Add Zod validation for sessionRetention
- src/cron/session-reaper.ts: New reaper module (sweepCronRunSessions)
- src/cron/session-reaper.test.ts: 12 tests covering all paths
- src/cron/service/state.ts: Add cronConfig/sessionStorePath to deps
- src/cron/service/timer.ts: Wire reaper into onTimer tick
- src/gateway/server-cron.ts: Pass config and session store path to deps

Closes #12289

* fix: sweep cron session stores per agent

* docs: add changelog for session maintenance (#13083) (thanks @skyfallsin, @Glucksberg)

* fix: add warn-only session maintenance mode

* fix: warn-only maintenance defaults to active session

* fix: deliver maintenance warnings to active session

* docs: add session maintenance examples

* fix: accept duration and size maintenance thresholds

* refactor: share cron run session key check

* fix: format issues and replace defaultRuntime.warn with console.warn

---------

Co-authored-by: Pradeep Elankumaran <pradeepe@gmail.com>
Co-authored-by: Glucksberg <markuscontasul@gmail.com>
Co-authored-by: max <40643627+quotentiroler@users.noreply.github.com>
Co-authored-by: quotentiroler <max.nussbaumer@maxhealth.tech>
2026-02-09 20:42:35 -08:00
acp chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
agents fix(memory): default batch embeddings to off 2026-02-09 22:31:58 -06:00
auto-reply fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
browser Deduplicate more 2026-02-09 18:56:58 -08:00
canvas-host fix: use STATE_DIR instead of hardcoded ~/.openclaw for identity and canvas (#4824) 2026-02-07 22:16:59 -05:00
channels improve pre-commit hook 2026-02-09 18:59:42 -08:00
cli fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
commands Update contributing, deduplicate more functions 2026-02-09 19:21:33 -08:00
compat refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
config fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
cron fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
daemon fix(runtime): bump minimum Node.js version to 22.12.0 (#5370) 2026-02-05 13:42:52 -08:00
discord refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
docs Docs: landing page revamp (#8885) 2026-02-04 10:37:14 -05:00
gateway fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
hooks fix: use STATE_DIR instead of hardcoded ~/.openclaw for identity and canvas (#4824) 2026-02-07 22:16:59 -05:00
imessage refactor: unify peer kind to ChatType, rename dm to direct (#11881) 2026-02-09 09:20:52 +09:00
infra fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
line refactor: unify peer kind to ChatType, rename dm to direct (#11881) 2026-02-09 09:20:52 +09:00
link-understanding chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
logging fix: guard resolveUserPath against undefined input (#10176) 2026-02-06 13:16:58 -05:00
macos chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
markdown chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
media refactor: consolidate PNG encoder and safeParseJson utilities (#12457) 2026-02-09 00:21:54 -08:00
media-understanding refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
memory refactor: consolidate duplicate utility functions (#12439) 2026-02-08 23:59:43 -08:00
node-host fix: harden Windows exec allowlist 2026-02-03 09:34:25 -08:00
pairing refactor: consolidate PNG encoder and safeParseJson utilities (#12457) 2026-02-09 00:21:54 -08:00
plugin-sdk Update contributing, deduplicate more functions 2026-02-09 19:21:33 -08:00
plugins refactor: centralize isPlainObject, isRecord, isErrno, isLoopbackHost utilities (#12926) 2026-02-09 17:02:55 -08:00
process fix: skip extension append if command already has one 2026-01-31 20:39:33 -06:00
providers chore: Fix failing test. 2026-02-09 09:58:58 +09:00
routing refactor: unify peer kind to ChatType, rename dm to direct (#11881) 2026-02-09 09:20:52 +09:00
scripts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
security refactor: centralize isPlainObject, isRecord, isErrno, isLoopbackHost utilities (#12926) 2026-02-09 17:02:55 -08:00
sessions fix: unify session maintenance and cron run pruning (#13083) 2026-02-09 20:42:35 -08:00
shared/text chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
signal refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
slack refactor: centralize isPlainObject, isRecord, isErrno, isLoopbackHost utilities (#12926) 2026-02-09 17:02:55 -08:00
telegram refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
terminal fix: error handling in restore failure reporting 2026-02-03 06:22:51 +00:00
test-helpers fix: use STATE_DIR instead of hardcoded ~/.openclaw for identity and canvas (#4824) 2026-02-07 22:16:59 -05:00
test-utils chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
tts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
tui Centralize date/time formatting utilities (#11831) 2026-02-08 04:53:31 -08:00
types fix: update pi packages to 0.51.0, remove bogus type augmentation 2026-02-02 01:52:33 +01:00
utils refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
web fix: preserve original filename for WhatsApp inbound documents (#12691) 2026-02-09 16:56:19 -05:00
whatsapp chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
wizard Deduplicate more 2026-02-09 18:56:58 -08:00
channel-web.barrel.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
channel-web.ts
docker-setup.test.ts refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
entry.ts Centralize date/time formatting utilities (#11831) 2026-02-08 04:53:31 -08:00
extensionAPI.ts chore: Migrate to tsdown, speed up JS bundling by ~10x (thanks @hyf0). 2026-02-03 20:18:16 +09:00
globals.test.ts
globals.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
index.test.ts
index.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
logger.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
logger.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
logging.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
polls.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
polls.ts
runtime.ts CLI: restore terminal state on exit 2026-02-03 06:10:19 +00:00
utils.test.ts fix(paths): structurally resolve home dir to prevent Windows path bugs (#12125) 2026-02-08 20:06:29 -05:00
utils.ts Deduplicate more 2026-02-09 18:56:58 -08:00
version.test.ts fix: CLI harden update restart imports and fix nested bundle version resolution 2026-02-06 00:09:48 -05:00
version.ts fix: CLI harden update restart imports and fix nested bundle version resolution 2026-02-06 00:09:48 -05:00