fix: Increase RUN_TIMEOUT_MS default to 4 hours

Discovery cycles run 1-2h+, so the 2h default was too aggressive. 4h gives
headroom while still catching truly hung processes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
B 2026-02-10 05:53:22 +00:00
parent dae37fded0
commit 65d28ba5f5
2 changed files with 4 additions and 4 deletions

@@ -52,7 +52,7 @@ It reads env vars:
- `TARGET_SCRIPT` (required) — Absolute path to the script to run on trigger
- `REPO_ROOT` (optional) — Working directory for the script (defaults to script's parent dir)
- `MAX_CONCURRENT` (optional) — Max parallel runs (default: `1`)
-- `RUN_TIMEOUT_MS` (optional) — Kill runs older than this in milliseconds (default: `7200000` = 2 hours)
+- `RUN_TIMEOUT_MS` (optional) — Kill runs older than this in milliseconds (default: `14400000` = 4 hours)
**Stale run detection:**
Before accepting a trigger, the server checks if tracked processes are still alive (`kill -0`). Dead processes are reaped automatically. Runs exceeding `RUN_TIMEOUT_MS` are force-killed to free the slot.
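
The stale-run check described above can be sketched roughly as follows. This is an illustration, not the server's actual code: `TrackedRun`, `classifyRun`, and `sweep` are hypothetical names, and the liveness probe and kill action are passed in as functions so the logic stays independent of the runtime. In Node, the probe could plausibly wrap `process.kill(pid, 0)` in a `try`/`catch`, since that call throws when the process does not exist.

```typescript
// Hypothetical shape of a tracked run; the server's real bookkeeping is not shown here.
interface TrackedRun {
  pid: number;
  startedAt: number; // epoch milliseconds
}

type Verdict = "keep" | "reap" | "kill";

// Classify one run: dead processes are reaped, runs older than timeoutMs are killed.
function classifyRun(
  run: TrackedRun,
  now: number,
  timeoutMs: number,
  isAlive: (pid: number) => boolean
): Verdict {
  if (!isAlive(run.pid)) return "reap";
  if (now - run.startedAt > timeoutMs) return "kill";
  return "keep";
}

// Sweep all tracked runs and return the ones that still occupy a slot.
function sweep(
  runs: TrackedRun[],
  now: number,
  timeoutMs: number,
  isAlive: (pid: number) => boolean,
  forceKill: (pid: number) => void
): TrackedRun[] {
  return runs.filter((run) => {
    const verdict = classifyRun(run, now, timeoutMs, isAlive);
    if (verdict === "kill") forceKill(run.pid);
    return verdict === "keep";
  });
}
```

A trigger would then be accepted only if the array returned by `sweep` is shorter than `MAX_CONCURRENT`.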
@@ -263,8 +263,8 @@ cat /.sprite/logs/services/<service-name>.log | grep 'finished'
| Service | Observed cycle time | RUN_TIMEOUT_MS | Rationale |
|---------|-------------------|----------------|-----------|
-| Discovery (improve.sh) | 1-2 hours | `7200000` (2h) | Team cycles with 5+ agents, worktrees, PRs |
-| Refactor (refactor.sh) | TBD | `7200000` (2h) | Start high, tune after data |
+| Discovery (improve.sh) | 15 min (gaps), 1-2h+ (discovery) | `14400000` (4h) | Discovery cycles are open-ended; gap fills are fast |
+| Refactor (refactor.sh) | TBD | `14400000` (4h) | Start high, tune after data |
To override, add to the wrapper script:

@@ -22,7 +22,7 @@ const TRIGGER_SECRET = process.env.TRIGGER_SECRET ?? "";
const TARGET_SCRIPT = process.env.TARGET_SCRIPT ?? "";
const MAX_CONCURRENT = parseInt(process.env.MAX_CONCURRENT ?? "1", 10);
const RUN_TIMEOUT_MS = parseInt(
-  process.env.RUN_TIMEOUT_MS ?? String(2 * 60 * 60 * 1000),
+  process.env.RUN_TIMEOUT_MS ?? String(4 * 60 * 60 * 1000),
10
);
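
One caveat with the `??` fallback: it only applies when the variable is unset, so an empty or non-numeric `RUN_TIMEOUT_MS` reaches `parseInt` and yields `NaN`. A minimal sketch of a stricter parse follows, assuming that falling back to the 4-hour default is the desired behavior in that case. `parseTimeoutMs` and `FOUR_HOURS_MS` are hypothetical names, not part of this commit.

```typescript
// Hypothetical default mirroring the new value in this commit: 4 hours in milliseconds.
const FOUR_HOURS_MS = 4 * 60 * 60 * 1000; // 14400000

// Parse a millisecond timeout from an env-style value.
// Falls back to fallbackMs when the value is unset, empty, non-numeric, or not positive.
function parseTimeoutMs(
  raw: string | undefined,
  fallbackMs: number = FOUR_HOURS_MS
): number {
  if (raw === undefined) return fallbackMs;
  const parsed = parseInt(raw, 10);
  // parseInt returns NaN for input such as "" or "abc".
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallbackMs;
}
```

With a helper like this, the constant could be set as `const RUN_TIMEOUT_MS = parseTimeoutMs(process.env.RUN_TIMEOUT_MS);`.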