spawn

vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-04-30 21:09:29 +00:00

Author	SHA1	Message	Date
A	e1617fdc01	fix(e2e): add /usr/local/bin to openclaw PATH in verify.sh for GCP (#2736) On GCP VMs (running as root), npm installs openclaw to /usr/local/bin instead of ~/.npm-global/bin because the system npm prefix is writable and already in PATH. The E2E verify_openclaw() and related gateway helper functions only explicitly listed ~/.npm-global/bin, ~/.bun/bin, and ~/.local/bin — missing /usr/local/bin when .spawnrc sourcing silently fails in the piped-bash SSH exec context. Add /usr/local/bin explicitly to all openclaw-related PATH exports in verify.sh so the binary check succeeds regardless of .spawnrc state. Fixes #2732 Agent: test-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 21:21:02 -07:00
A	3630c07c70	fix(e2e): add per-agent timeout to prevent silent hangs in E2E runs (#2720) The E2E framework's run_single_agent function had no overall timeout. When provision/verify/input_test steps hung (e.g. cloud_exec blocking on sprite-zeroclaw or digitalocean-opencode), the process would stall indefinitely without writing a .result file, causing silent test failures. Add a per-agent wall-clock timeout (default 1800s, 2400s for junie) that wraps the core provision/verify/input_test logic in a killable subshell. If the timeout expires, the subshell is killed and a "fail" result is written, ensuring E2E batches always complete. Fixes #2714 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 13:16:09 -07:00
A	8fe6450485	fix(e2e): increase provision timeout for junie on hetzner (#2683) * fix(e2e): increase provision timeout for junie on hetzner junie's install takes >720s on Hetzner, exceeding the default PROVISION_TIMEOUT and causing 100% E2E failure for hetzner-junie. Add a per-agent provision timeout mechanism in common.sh via get_provision_timeout(). This checks (in order): 1. PROVISION_TIMEOUT_<agent> env var override 2. Built-in per-agent default (_PROVISION_TIMEOUT_junie=1200) 3. Global PROVISION_TIMEOUT (720s) provision.sh now calls get_provision_timeout() to resolve the effective timeout per agent instead of using the flat global. Fixes #2680 Agent: code-health Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): whitelist-sanitize agent name before eval in get_provision_timeout tr '-' '_' only replaced hyphens, allowing metacharacters like $, backticks, and ; to pass through into eval, enabling shell injection via a crafted agent name. Replace with sed whitelist [A-Za-z0-9_] to strip all unsafe chars. Agent: team-lead Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-16 00:54:03 -07:00
A	0efc4e89f0	fix(security): eliminate single-quote injection risk in verify.sh (#2667) Pass base64-encoded prompts via _ENCODED_PROMPT shell variable assignment at the start of remote command strings instead of interpolating directly into single-quoted decode contexts. This prevents quote-escaping vulnerabilities if INPUT_TEST_PROMPT or the encoding mechanism ever changes to produce characters that break single-quote delimiters. Fixes #2666 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-15 15:10:22 -07:00
A	548f41ed47	fix(e2e): source .bashrc in openclaw verify to resolve binary path on Sprite (#2660) On Sprite VMs, npm's global prefix (from nvm) is writable and in PATH after sourcing .bashrc, so openclaw installs to the nvm bin dir instead of ~/.npm-global/bin. The E2E verify_openclaw() binary check only prepended ~/.npm-global/bin, ~/.bun/bin, and ~/.local/bin — missing the nvm bin path entirely. Source .bashrc (in addition to .spawnrc) before the command -v check so the verify PATH matches the install-time PATH. Applied the same fix to the ensure/restart gateway helpers and the openclaw input test. Fixes #2656 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-15 12:46:37 -07:00
A	333a3928ad	refactor: remove dead verify_setup_* functions from e2e verify.sh (#2647) Remove three dead functions that were defined but never called: - verify_setup_github — checked GitHub CLI auth status - verify_setup_browser — checked Chrome browser install - verify_setup_telegram — checked openclaw Telegram config These were orphaned helpers (never called from verify_agent or anywhere else). All agent-specific checks go through verify_agent() which dispatches to the per-agent verify_*() functions, none of which called these helpers. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-15 04:49:28 -04:00
A	b100aeaa89	fix: check additional junie binary paths in GCP verify (#2613) The @jetbrains/junie-cli postinstall script may download the actual binary to non-standard locations that verify_junie() wasn't checking. Add ~/.junie/bin, /usr/local/bin, and dynamic npm global bin resolution to the PATH search in the binary check. Fixes #2611 Agent: ux-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-14 08:13:51 -04:00
A	8d3f848907	fix(e2e): increase openclaw gateway resilience timeout to 60s (#2587) GCP e2-micro VMs are slow and throttled. When the openclaw gateway is killed during the resilience test, the lock file is held by the dead process for ~5s. This causes the first systemd restart attempt to fail with "lock timeout after 5000ms", requiring a second restart cycle. Timeline on slow VMs: RestartSec(5) + lock-timeout(5) + RestartSec(5) + boot(5) ≈ 20s. The previous 30s window was too tight — the gateway DID recover but just barely missed the polling window on throttled CPUs. Increasing to 60s gives a comfortable 3x margin for all VM types. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-13 13:48:02 -04:00
A	44a6e763cd	fix(zeroclaw): direct binary download from pinned release to fix install timeout (#2554) ZeroClaw's latest GitHub release (v0.1.9a) ships no binary assets. The --prefer-prebuilt bootstrap path hits a 404, falls back to Rust source compilation, and exceeds the 600s install timeout — causing zeroclaw to fail on all clouds (digitalocean, gcp, hetzner, sprite). Fix: replace the bootstrap invocation with a direct curl download from v0.1.7-beta.30 (the last release that ships linux-gnu prebuilt binaries) into ~/.local/bin. This completes in seconds vs ~20 minutes for a source build, and removes the swap-space setup step that was only needed for memory-intensive compilation. Also remove the now-unused ensureSwapSpace function and update the E2E verify check to also look in ~/.local/bin for the zeroclaw binary. -- qa/e2e-tester Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-12 18:48:10 -07:00
Ahmed Abushagur	f683dd857b	feat: add --config and --steps CLI flags for programmatic setup (#2545) * feat: add Telegram and WhatsApp options to OpenClaw setup picker Adds separate "Telegram" and "WhatsApp" checkboxes to the OpenClaw setup screen: - Telegram: prompts for bot token from @BotFather, injects into OpenClaw config via `openclaw config set` - WhatsApp: reminds user to scan QR code via the web dashboard after launch (no CLI setup possible) Updates USER.md with channel-specific guidance when either is selected. Bump CLI version to 0.16.16. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: run WhatsApp QR scan interactively before TUI launch Instead of punting WhatsApp setup to "after launch", runs `openclaw channels login --channel whatsapp` as an interactive SSH session between gateway start and TUI launch. The user scans the QR code with their phone during provisioning setup. Flow: gateway starts → tunnel set up → WhatsApp QR scan → TUI launch Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update WhatsApp hint to reflect pre-TUI QR scanning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add --config and --steps CLI flags for programmatic setup Add --config <path> flag to load spawn options from a JSON config file (model, steps, name, setup data like telegram_bot_token). Add --steps <list> flag for comma-separated setup step control. Both enable the web UI and headless automation to control which setup steps run. Priority order: CLI flags > --config file > env vars > defaults.
- New spawn-config.ts module with valibot validation - OptionalStep extended with dataEnvVar and interactive metadata - validateStepNames() for step name validation with warnings - Telegram setup reads TELEGRAM_BOT_TOKEN env var before prompting - WhatsApp auto-skipped in headless mode with warning - promptSetupOptions() skipped when SPAWN_ENABLED_STEPS already set - E2E verify helpers for github, browser, telegram setup artifacts - QA reference file documenting all agent setup options - Version bump to 0.17.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add --model flag and priority order tests - Add --model <id> CLI flag that sets MODEL_ID env var - --model is extracted before --config so it takes priority - Add config-priority.test.ts with 8 tests verifying: - --model overrides config model - --steps overrides config steps - --steps "" disables all steps - --name overrides config name - Config tokens apply as defaults - Explicit env vars override config tokens - Remove preferences.json from priority order docs (not needed) - Add --model to help text and unknown-flag guidance Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add --model, --config, --steps to README Document config file format, setup steps table, and new CLI flags in the commands table. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address security review feedback - Move null byte check before path resolution (defense-in-depth) - Move agent-setup-options.md from .claude/rules/ to .docs/ (git-ignored) per documentation policy Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve rebase conflicts and deduplicate --model flag extraction Rebase on main introduced a duplicate --model flag extraction block (one from the PR at line 804, one from main at line 941). Consolidated into the single early extraction point with -m shorthand support. Also removed duplicate --model entry from KNOWN_FLAGS set. 
Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-13 00:32:58 +00:00
A	6081c0a17f	feat(qa): telegram soak test on digitalocean + fix bun -e (#2547) - soak.sh: SOAK_CLOUD env var makes cloud configurable (default: sprite) - qa.sh: load TELEGRAM_BOT_TOKEN, TELEGRAM_TEST_CHAT_ID, SOAK_CLOUD from /etc/spawn-qa-auth.env in soak mode - qa.yml: add weekly Monday 3am UTC scheduled soak trigger - fix: bun eval → bun -e across soak.sh, key-request.sh, github-auth.sh (bun eval is not a valid subcommand in bun 1.3.9) - fix: export _TOKEN via env prefix so process.env._TOKEN works in bun -e - docs: update shell-scripts.md rule to say bun -e (not bun eval) Verified: 3/4 Telegram tests pass in smoke test on DigitalOcean (120s wait) getMe ✓ sendMessage ✓ getWebhookInfo ✓; cron test needs full 55-min window. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 19:45:18 -04:00
A	91b66f4b40	fix(e2e): fix input test prompt delivery and agent flags (#2536) Three root-cause bugs in input test functions: 1. Stdin pass-through broken: cloud_exec uses "printf '...' | base64 -d | bash" on the remote, meaning bash reads the script from its own stdin — not the outer process's stdin. "PROMPT=$(base64 -d)" inside the script was reading from the already-consumed pipe, always producing an empty prompt. Fix: embed the base64-encoded prompt directly in the remote command string. Base64 output is [A-Za-z0-9+/=] only — safe to embed in single-quoted strings. 2. Zeroclaw flag wrong: "zeroclaw agent -p" was passing the prompt as --provider (not --prompt). The correct flag for non-interactive single-message mode is "-m"/"--message". 3. Codex model stale: "openai/gpt-5-codex" does not exist on OpenRouter. Updated to "openai/gpt-5.1-codex" which is available. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 13:50:06 -04:00
A	6fda75ccc8	security: validate base64 output in cloud_exec and soak.sh (defense-in-depth) (#2532) Add base64 character validation ([A-Za-z0-9+/=]) before use in SSH command strings for gcp.sh, aws.sh, and hetzner.sh cloud_exec functions -- matching the existing fix in digitalocean.sh (#2528). Also add a validated _encode_b64 helper to soak.sh and use it for all Telegram bot token encoding, preventing corrupted base64 from breaking out of single-quoted SSH command strings. Closes #2527 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 09:32:48 -04:00
A	76399eafd9	security: validate base64 in digitalocean.sh SSH exec (defense-in-depth) (#2528) Add explicit base64 character validation in _digitalocean_exec after encoding the command, matching the existing pattern in provision.sh. This ensures the encoded value contains only [A-Za-z0-9+/=] before embedding it in the SSH command string. Note: #2527 (provision.sh base64 validation) was already fixed in a prior commit — the validation at lines 284-289 already rejects non-base64 characters and empty output. Fixes #2526 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 08:16:48 -04:00
Ahmed Abushagur	5b5e7d4706	test: add cron-triggered Telegram reminder to soak test (#2519) * test: add cron-triggered Telegram reminder to soak test Tests OpenClaw's ability to stay alive and execute scheduled tasks. Installs a one-shot cron on the VM before the 1h soak wait that sends a Telegram message at ~55 min, then verifies the message was sent after the wait completes. Also moves Telegram config injection before the soak wait so the cron can use the bot token immediately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: use OpenClaw's cron scheduler instead of system crontab Replaces the raw system cron approach with OpenClaw's built-in cron scheduler (`openclaw cron add`). This properly tests that OpenClaw's gateway stays alive after 1 hour and can execute scheduled tasks. The test now: 1. Injects Telegram config + schedules an OpenClaw cron job (--at +55min) 2. Waits 1 hour (soak) 3. Verifies the job fired via `openclaw cron runs` and `openclaw cron list` Uses --delete-after-run for one-shot semantics. Verification checks both the run history and the auto-deletion as proof of execution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: verify cron message on Telegram side via forwardMessage Instead of trusting OpenClaw's self-reported cron status, we now verify the message actually exists in the Telegram chat: 1. Extract message_id from OpenClaw's cron execution logs (tries `openclaw cron runs`, then ~/.openclaw/cron/ directory) 2. Call Telegram's forwardMessage API with that message_id 3. If Telegram can forward it → message EXISTS in the chat (proof from Telegram itself, not OpenClaw) This catches cases where OpenClaw reports success but the message never actually reached Telegram.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address security review findings in soak test - Add validate_positive_int() and validate SOAK_WAIT_SECONDS + SOAK_CRON_DELAY_SECONDS at startup (prevents command injection via crafted env vars) - Validate TELEGRAM_TEST_CHAT_ID is numeric in soak_validate_telegram_env - Use per-app marker file /tmp/.spawn-cron-scheduled-${app} to avoid race conditions when multiple soak tests run on the same VM Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 04:49:42 -04:00
A	c5a40b04a6	fix(e2e): add retry-with-backoff for DigitalOcean 422 droplet limit errors (#2520) When provisioning hits a 422 "droplet limit exceeded" response, wait 30s and retry up to 3 times. Makes E2E suite resilient to transient limit hits during parallel batch provisioning. Fixes #2516 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 07:49:47 +00:00
A	85a2289bb0	fix(e2e): dynamically calculate DigitalOcean parallel capacity from account limit (#2518) Previously, _digitalocean_max_parallel() always returned 3, assuming all quota slots were available. When pre-existing droplets occupy slots, the batch-3 parallel runs fail with "droplet limit exceeded" API errors. Now queries /v2/account for the actual droplet_limit and subtracts the current droplet count to compute available capacity. Falls back to 3 if the API is unreachable. -- qa/e2e-tester Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-12 02:50:48 -04:00
A	6ef7dfc99d	fix(e2e): add claude and codex to .spawnrc fallback in provision.sh (#2511) When Sprite (or another cloud) times out during provisioning, provision.sh falls back to constructing .spawnrc manually over SSH. The claude and codex agents were missing from the agent-specific case block, so: - claude: ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN were never written, causing verify_claude's openrouter.ai check to fail - codex: OPENAI_API_KEY and OPENAI_BASE_URL were never written Discovered during E2E run: sprite/claude failed with .spawnrc timeout + missing openrouter.ai in fallback .spawnrc. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 21:40:03 -04:00
A	1d2bf324c4	refactor: replace bun -e with bun eval and require() with ESM imports in shell scripts (#2505) Per shell-scripts.md rules: always use `bun eval` (not `bun -e`) and ESM-only (never `require()`). Fixed in: - sh/shared/key-request.sh: 3 instances of `bun -e` → `bun eval` - sh/e2e/lib/soak.sh: `bun -e` → `bun eval`; `require("fs")`/`require("path")` → named ESM imports from node:fs and node:path Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-11 14:03:46 -07:00
Ahmed Abushagur	330c10fcd2	feat: add Telegram soak test for OpenClaw (--soak mode) (#2492) Add a soak test that provisions OpenClaw on Sprite, waits 1 hour for stabilization, injects a Telegram bot token, and runs integration tests against the Telegram Bot API (getMe, sendMessage, getWebhookInfo). - New: sh/e2e/lib/soak.sh — soak test library with all Telegram-specific logic - Modified: sh/e2e/e2e.sh — add --soak flag to arg parser - Modified: qa.sh — add soak run mode (bypasses Claude, runs e2e.sh directly) - Modified: trigger-server.ts — add "soak" to VALID_REASONS - Modified: qa.yml — add soak to workflow_dispatch options Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: A <258483684+la14-1@users.noreply.github.com>	2026-03-11 05:51:53 -04:00
A	68abbee4df	fix(e2e): fix OPENROUTER_API_KEY fallback and sprite env whitelist (#2491) On QA VMs running Claude Code via OpenRouter, the API key is stored as ANTHROPIC_AUTH_TOKEN. Add a fallback in common.sh so e2e.sh picks up the key from ANTHROPIC_AUTH_TOKEN when ANTHROPIC_BASE_URL points to openrouter.ai and OPENROUTER_API_KEY is unset. Also add SPRITE_NAME and SPRITE_ORG to the headless env var whitelist in provision.sh — these are emitted by _sprite_headless_env() but were missing from the positive whitelist, causing every Sprite provisioning attempt to log errors and silently skip the env vars. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 03:23:46 -04:00
A	e9f8d5ec2d	fix: secure curl header args and provision.sh export whitelist (fixes #2464, fixes #2465) (#2471) - Replace `-H "Authorization: Bearer ..."` curl args with temp curl config files (`-K`) in digitalocean.sh and hetzner.sh e2e drivers, keeping API tokens out of `ps` output - Replace dangerous-var blocklist in provision.sh with a positive whitelist of allowed cloud_headless_env variable names Agent: complexity-hunter Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-10 17:54:32 -07:00
A	1bddd713ea	fix: base64-encode commands in SSH exec to prevent injection (#2448) All four SSH-based cloud drivers (aws, digitalocean, gcp, hetzner) passed the command string directly as an SSH argument, which gets interpreted by the remote shell. While current callers pass trusted E2E test code, this creates a security footgun for future changes. Fix: base64-encode the command locally and decode it on the remote side before piping to bash. The encoded string contains only safe characters [A-Za-z0-9+/=], eliminating any injection vector. Stdin is preserved for callers that pipe data into cloud_exec. Closes #2432, closes #2433, closes #2434, closes #2435 Agent: complexity-hunter Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-10 13:22:33 -04:00
A	47b26deafa	fix: harden Sprite exec against injection via org flags and grep patterns (#2446) - Replace word-split _sprite_org_flags() call sites with _sprite_cmd() helper that uses a proper bash array for the -o flag, eliminating injection risk from org names with spaces or shell metacharacters - Validate _SPRITE_ORG against [A-Za-z0-9_-]+ in _sprite_validate_env - Use grep -qF (fixed-string) instead of grep -q for app name matching to prevent regex metacharacters in names from causing false matches - Use mktemp for _stderr_tmp in _sprite_exec instead of predictable PID-based path (/tmp/sprite-exec-err.$$) to prevent symlink attacks Closes #2436 Agent: complexity-hunter Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-10 10:08:17 -07:00
A	9bf3c216e8	fix: harden provision.sh against command injection in env_b64 and app_name (#2444) - Validate app_name at function entry (alphanumeric, dots, hyphens, underscores only) before it's used in file paths or passed to cloud_exec - Add trap-based cleanup for the temp file used during .spawnrc fallback creation - Add security comments documenting the three-layer defense model: printf %q quoting, base64 encoding, and stdin piping (no interpolation into command strings) The core vulnerability (env_b64 interpolated into the cloud_exec command string) was already fixed in a prior commit that switched to stdin piping. This change adds defense-in-depth and documentation. Fixes #2437, #2441 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-10 10:07:23 -07:00
A	a22fe9010c	fix: safe printf format strings and document e2e source usage (#2445) install.sh: Replace color variable interpolation in printf format strings with %b arguments to prevent format string injection (fixes #2443). common.sh: Use %b for color escapes in logging functions. Document that BASH_SOURCE and source usage in load_cloud_driver is intentional since e2e scripts are filesystem-only, not curl|bash (fixes #2438). Agent: ux-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-10 12:28:45 -04:00
A	3724bb8ba4	fix: address SSH command injection risks in e2e cloud drivers (#2447) Add defense-in-depth validation across all e2e cloud driver scripts: - Validate IP addresses match IPv4 format before use in SSH commands (aws, digitalocean, gcp, hetzner) - Validate SSH username contains only safe characters (gcp) - Validate resource IDs are numeric before interpolating into API URLs (digitalocean droplet IDs, hetzner server IDs) - URL-encode app name in Hetzner API query parameter to prevent query parameter injection - Validate numeric env vars (INPUT_TEST_TIMEOUT, PROVISION_TIMEOUT, INSTALL_WAIT) that get interpolated into remote command strings Fixes #2432, #2433, #2434, #2435, #2442 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-10 12:27:47 -04:00
A	a9cd3b700c	security: escape pkill regex metacharacters in app_name (#2412) * security: escape pkill regex metacharacters in app_name Fixes #2409 - escape regex metacharacters (., [, \, , ^, $) in app_name before using in pkill -f pattern to prevent unintended process termination. Even though app_name is validated against a safe character whitelist, . and - are regex metacharacters that could match broader patterns than intended. Note: #2410 (unquoted regex in bash conditional) was already fixed by a prior commit that refactored the code to use sed instead of [[ =~ BASH_REMATCH ]]. Agent: security-auditor Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> fix: remove dead exec_long functions reintroduced from pre-#2407 code Remove cloud_exec_long dispatcher and all _*_exec_long() functions from common.sh and cloud driver files (aws, digitalocean, gcp, hetzner, sprite). These were explicitly removed as dead code in PR #2407 (commit `c4ae1684`) and must not be reintroduced. Issue #2410 (unquoted regex in bash conditional) is already resolved: the [[ =~ ]] pattern was previously replaced with case/sed parsing. Fixes #2409 Fixes #2410 Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-09 21:23:33 -07:00
A	c4ae16849d	refactor: remove dead cloud_exec_long and _*_exec_long functions (#2407) The cloud_exec_long dispatcher in common.sh and all five cloud-specific _exec_long implementations (aws, digitalocean, gcp, hetzner, sprite) were defined but never called by any code in the e2e test suite. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-09 19:39:53 -07:00
A	705687de17	fix: persist npm-global PATH to .profile/.bash_profile/.bashrc for SSH reconnect (#2399) After SSH reconnect, agent commands (openclaw, codex, kilocode, junie) were not found because PATH was only written to ~/.bashrc, which is not sourced by login shells. Login shells (used by SSH) source ~/.profile or ~/.bash_profile instead. Changes: - Write .spawnrc sourcing to ~/.profile and ~/.bash_profile in addition to ~/.bashrc and ~/.zshrc (orchestrate.ts) - Write npm-global PATH export to ~/.profile and ~/.bash_profile for all npm-installed agents: OpenClaw, Codex, Kilo Code, Junie (agent-setup.ts) - Write Claude Code PATH to ~/.profile and ~/.bash_profile (agent-setup.ts) - Write OpenCode PATH to ~/.profile and ~/.bash_profile (agent-setup.ts) - Extract NPM_GLOBAL_PATH_PERSIST constant to DRY up repeated shell snippets - Fix e2e provision.sh to also write .spawnrc sourcing to login shell configs - Bump CLI version to 0.15.32 Fixes #2394 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-09 16:26:49 -07:00
A	bd1399c861	fix: use mktemp in _sprite_fix_config to prevent race conditions (#2359) Replaces ${cfg}.fix$$ temp pattern with mktemp for guaranteed uniqueness. Both temp file usages in the function are updated. Fixes #2354 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 18:46:48 -07:00
A	8bc5581e62	fix: validate base64 encoding before embedding in remote command (#2360) Adds defense-in-depth check to reject malformed base64 output before it is embedded in the cloud_exec remote command. Fixes #2353 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 21:44:55 -04:00
A	6e81186295	fix: pipe base64 credentials directly to avoid shell variable exposure (#2344) Remove intermediate $env_b64 shell variable that stored base64-encoded credentials. Pipe directly from base64 to cloud_exec, preventing any credential data from appearing in process listings or shell traces. Fixes #2333 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-08 09:26:17 -07:00
A	24e393817f	fix: harden env var parsing and pkill patterns in provision.sh (#2342) - Block dangerous system env vars (PATH, LD_PRELOAD, etc.) before export - Add explicit alphanumeric validation on env var names - Validate app_name is non-empty and safe before pkill -f - Tighten pkill regex from "sprite.exec." to "sprite exec.*" Fixes #2330 #2332 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-08 10:43:28 -04:00
A	de732fa695	fix: prevent command injection in _sprite_exec via stdin piping (#2329) Pipe the command via stdin to bash instead of embedding it in a bash -c string. This eliminates shell injection risk from unquoted cmd parameter, consistent with _sprite_exec_long in the same file and other cloud drivers. Fixes #2327 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-08 06:44:19 -04:00
A	23fea2df21	fix(e2e): add junie agent to E2E test harness (#2314) The junie agent was added in #2300 but the E2E test scripts were not updated. This adds junie to ALL_AGENTS, verify dispatch, input test dispatch, and the provision.sh fallback env configuration. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-08 00:03:32 -05:00
A	51dec6e877	fix: E2E failures - SSH key gen race, hetzner 409, hermes binary path (#2305) Three distinct E2E bugs fixed: 1. SSH key generation race condition: When multiple agents provision in parallel, concurrent processes all call generateSshKey() and race to create ~/.ssh/id_ed25519. ssh-keygen won't overwrite an existing file (prompts on stdin which is "ignore"), causing zeroclaw/codex to fail with "SSH key generation failed". Fix: check if key already exists before generating, and re-check after a failed generation attempt. 2. Hetzner SSH key 409 uniqueness_error: The Hetzner API returns HTTP 409 with "SSH key not unique" when the same key content is registered under a different name. The hetznerApi() function throws on non-2xx before the error-parsing code runs, and the regex /already/ didn't match "not unique". Fix: catch 409 in ensureSshKey() and match against uniqueness_error/not unique/already patterns. 3. Hermes binary not found: The hermes install script (uv tool) creates the actual binary + venv at ~/.hermes/hermes-agent/venv/ with a symlink at ~/.local/bin/hermes. The tarball capture script only captured the symlink + ~/.local/share/, leaving a dangling symlink. Fix: include ~/.hermes/ in capture paths, add venv/bin to verify.sh PATH check, and update hermes launchCmd to include the venv PATH. Fixes #2304 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-07 22:05:44 -05:00
A	ce06492cb7	fix: use exact-line match for INPUT_TEST_MARKER in E2E verify functions (#2293 ) Fixes #2292 Unanchored grep -q would match the marker anywhere in output, including error messages like "Expected SPAWN_E2E_OK but got...". Using grep -qx requires the marker to appear as a complete line, preventing false passes. Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-07 14:40:06 -05:00
A	1740274323	fix: replace base64 interpolation with stdin piping in all cloud exec_long functions (#2290 ) Replace unsafe pattern where base64-encoded commands were interpolated into remote command strings with secure stdin piping — command data now travels as stdin rather than as part of the command string, eliminating injection risk from shell metacharacter interpretation. Affected functions across all 5 cloud drivers: - _hetzner_exec_long - _aws_exec_long - _gcp_exec_long - _digitalocean_exec_long - _sprite_exec_long Fixes #2286 Fixes #2287 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-07 14:09:15 -05:00
A	735e80e376	fix: replace base64 interpolation with stdin piping in verify.sh (Fixes #2283 ) (#2284 ) * fix: replace base64 interpolation with stdin piping in verify.sh (Fixes #2283) Replace unsafe pattern where encoded prompt was interpolated into remote command strings with secure stdin piping — prompt data now travels as stdin rather than as part of the command string, eliminating injection risk. Affected functions: input_test_claude, input_test_codex, input_test_openclaw, input_test_zeroclaw. Agent: security-auditor Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: use cloud_exec (not cloud_exec_long) for stdin piping cloud_exec_long ignores stdin - remote base64 -d would hang. cloud_exec passes cmd to bash -c, which preserves stdin piping. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: restore timeout protection for input tests using cloud_exec Wraps each agent command in `timeout ${INPUT_TEST_TIMEOUT}` on the remote side so tests cannot hang indefinitely after switching from cloud_exec_long to cloud_exec. Updates stale comment referencing cloud_exec_long. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-07 12:41:50 -05:00
A	035e4bf830	Remove Daytona cloud provider from codebase (#2261 ) Simplify the cloud matrix by removing Daytona. All Daytona-specific code, scripts, tests, and configuration have been removed. Daytona has been moved to "Previously Considered" in the Cloud Provider Wishlist (#1183) and can be revived on community demand. Closes #2260 Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-06 18:53:08 -05:00
Ahmed Abushagur	4ac19a375a	fix: capture claude symlink target + verify PATH (#2245 ) * fix: tarball workflow failures (root ownership, swapfile, hermes TTY) - Use sudo mv + chown for tarball in release step (root-owned from capture) - Skip swapfile creation if /swapfile already exists (GitHub Actions runners) - Tolerate hermes setup wizard failure when /dev/tty unavailable in CI Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: capture claude symlink target in tarball + fix verify PATH The claude installer creates a symlink at ~/.local/bin/claude pointing to ~/.local/share/claude/versions/X.Y.Z. The capture script was missing ~/.local/share/claude/, causing a broken symlink in the tarball. Also add ~/.npm-global/bin to the verify PATH check for claude (npm fallback install path). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-06 10:55:09 -05:00
Ahmed Abushagur	c71f01725b	test: unit tests for openclaw gateway resilience config (#2224 ) * test(e2e): add openclaw gateway kill/restart resilience test Verifies that the openclaw gateway auto-restarts after being killed with SIGKILL, validating the systemd Restart=always supervision. The test runs as part of verify_openclaw: 1. Confirms gateway is listening on :18789 2. Kills it with SIGKILL (simulates a hard crash) 3. Waits up to 30s for systemd to auto-restart it 4. Verifies port 18789 comes back online If the gateway isn't running (e.g. non-systemd env), the test is skipped gracefully. On failure, dumps systemd status and gateway logs for diagnostics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Revert "test(e2e): add openclaw gateway kill/restart resilience test" This reverts commit `39b79d5c12`. * test: add unit tests for openclaw gateway resilience config Verifies that startGateway() produces correct systemd and cron configuration for auto-restart after a gateway crash: - Restart=always and RestartSec=5 in the systemd unit - Cron heartbeat checks port 18789 and restarts if dead - Wrapper script sources .spawnrc and execs openclaw gateway - Multiple port-check fallbacks (ss, /dev/tcp, nc) - Non-systemd fallback to setsid/nohup - 300s startup timeout Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(e2e): add openclaw gateway kill/restart resilience test Kills the gateway with SIGKILL during verify_openclaw and verifies systemd Restart=always brings it back within 30s. Skips gracefully on non-systemd environments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-05 19:38:10 -05:00
Ahmed Abushagur	08cf5e6d8a	fix(e2e): DigitalOcean name mismatch and bash 3.2 compat (#2218 ) 1. promptSpawnName() now checks DO_DROPLET_NAME before generating a random name, matching getServerName() behavior. This fixes the e2e harness creating droplets as spawn-XXXX when it expects e2e-digitalocean-AGENT-TIMESTAMP. 2. Replace BASH_REMATCH with sed-based parsing in provision.sh for macOS bash 3.2 compatibility. BASH_REMATCH was returning empty values, causing `export: '=': not a valid identifier`. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-05 13:44:32 -05:00
A	251ddf2967	fix(e2e): pass env_b64 via printf stdin to eliminate interpolation risk (#2159 ) Fixes #2152 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-03 19:34:31 -08:00
A	6423eb51f5	fix(security): validate env values in cloud_headless_env parser (#2146 ) * fix(security): validate env values in cloud_headless_env parser Reject values containing shell metacharacters ($, backtick, ;, &, \|, <, >) to prevent potential command injection if a cloud driver returns malicious output. Fixes #2139 Agent: security-auditor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): replace env value blacklist with whitelist regex The blacklist approach missed dangerous characters like (), quotes, backslash, newlines, {}, and !. Switch to a whitelist that only allows [A-Za-z0-9@%+=:,./_-] — a strict safe set sufficient for env values. Agent: security-auditor Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-03 20:49:45 +00:00
A	ebe8148177	fix(e2e): correct provision.sh fallback .spawnrc env vars for zeroclaw, hermes, kilocode (#2149 ) The fallback .spawnrc construction (used when provision times out before .spawnrc is written) had two bugs: 1. zeroclaw case wrongly included OPENAI_API_KEY and OPENAI_BASE_URL — these are hermes env vars, not zeroclaw's. zeroclaw only needs ZEROCLAW_PROVIDER=openrouter (plus the base OPENROUTER_API_KEY). 2. hermes and kilocode were missing from the case statement entirely. - hermes needs OPENAI_BASE_URL and OPENAI_API_KEY (verify_hermes checks for OPENAI_BASE_URL in .spawnrc) - kilocode needs KILO_PROVIDER_TYPE=openrouter and KILO_OPEN_ROUTER_API_KEY (verify_kilocode checks KILO_PROVIDER_TYPE) Without these fixes, hermes and kilocode would fail verification whenever provisioning timed out before the normal .spawnrc was written. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-03 15:46:57 -05:00
A	83d68c6a37	refactor: Remove dead code and stale references (#2137 ) Remove `cleanup_stale_apps()` in `sh/e2e/lib/cleanup.sh` which was dead code — defined but never called. The E2E orchestrator (`e2e.sh`) invokes `cloud_cleanup_stale` directly on the active cloud driver; the wrapper function and its file served no purpose. Also remove the corresponding `source` call in `e2e.sh`. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-03 11:51:59 -05:00
Ahmed Abushagur	300b330106	fix: address 4 reliability issues across codebase (#2129 ) * fix: address 4 reliability issues across codebase 1. sprite.ts: add --force to destroy command (stdin is "ignore" so interactive prompts would hang until 60s timeout) 2. verify.sh: replace /dev/tcp port checks with ss -tln primary (Debian/Ubuntu bash compiled without /dev/tcp support) 3. verify.sh: make _openclaw_restart_gateway a hard failure instead of log_warn (matching _openclaw_ensure_gateway behavior) 4. agent-setup.ts: add ss -tln port check + "already running" early exit + increase timeout from 120s to 300s (gateway takes ~3min to initialize on AWS medium instances) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: biome format - use consistent double quotes in portCheck Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-03 03:18:44 -05:00
Ahmed Abushagur	4a90abdaa2	fix(e2e): improve openclaw reliability on AWS and other clouds (#2123 ) * fix(e2e): improve openclaw reliability on AWS and other clouds Three changes to make openclaw e2e tests more robust: 1. Increase PROVISION_TIMEOUT from 480s to 720s — AWS cloud-init for "full" tier (Node.js + Bun + build-essential) can exceed 480s, causing the CLI to be killed before .spawnrc is written. 2. Add .spawnrc manual fallback in provision.sh — if the CLI is killed before writing .spawnrc, construct it via SSH using OPENROUTER_API_KEY with agent-specific env vars (openclaw, zeroclaw). 3. Add retry logic to openclaw gateway input test — the gateway can crash with 1006 websocket closure on resource-constrained instances. Now retries once after killing and restarting the gateway process. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(security): fix command injection in e2e provision scripts - Use printf %q and temp file for api_key handling in provision.sh to prevent shell metachar injection (single quotes, backticks, $) - Double-quote env_b64 interpolation in cloud_exec call to prevent word splitting - Replace echo with printf in bashrc append to avoid portability issues - Replace overbroad pkill -f 'openclaw gateway' in verify.sh with PID-targeted kill via lsof/fuser on port 18789 Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-02 23:19:34 -05:00
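
Note on ce06492cb7 (exact-line match for INPUT_TEST_MARKER): the difference between grep -q and grep -qx described in that commit message can be reproduced with a minimal sketch. The marker value, the sample outputs, and the helper function names below are illustrative assumptions and are not taken from verify.sh.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: marker value and sample outputs are assumptions.
marker="SPAWN_E2E_OK"

# An error message that merely mentions the marker.
error_output="Expected SPAWN_E2E_OK but got nothing"
# Agent output where the marker appears as a complete line.
good_output=$'starting agent\nSPAWN_E2E_OK\ndone'

# Unanchored: succeeds if the marker appears anywhere in any line.
matches_anywhere() { printf '%s\n' "$1" | grep -q "$marker"; }

# Anchored with -x: succeeds only if some line equals the marker exactly.
matches_whole_line() { printf '%s\n' "$1" | grep -qx "$marker"; }

report() {
  # $1 = label, $2 = check function, $3 = text to check
  if "$2" "$3"; then echo "$1: pass"; else echo "$1: fail"; fi
}

report "grep -q  on error output" matches_anywhere   "$error_output"
report "grep -qx on error output" matches_whole_line "$error_output"
report "grep -qx on good output"  matches_whole_line "$good_output"
```

The unanchored check reports a pass on the error message, which is the false pass the commit describes; the -x variant rejects it while still accepting output that contains the marker on its own line.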

82 commits