Fixes #2797. The _stage_prompt_remotely() function was interpolating
${encoded_prompt} directly into the remote command string passed to
cloud_exec. While _validate_base64() ensures only [A-Za-z0-9+/=]
characters are present, defense-in-depth requires eliminating the
interpolation entirely.
The fix uses printf %s format substitution to build the remote command,
placing the encoded prompt into a single-quoted shell variable assignment
(_EP='...') on the remote side. Single quotes prevent all shell expansion,
and the base64 charset contains no single quotes, making injection
structurally impossible.
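
For illustration only: the actual fix is shell code that builds the command
with printf, but the invariant can be sketched in TypeScript. The name
buildRemoteAssignment is hypothetical and does not exist in the repository.

```typescript
// Hypothetical sketch of the invariant; the real fix is shell, not TypeScript.
const BASE64_RE = /^[A-Za-z0-9+\/=]+$/;

function buildRemoteAssignment(encodedPrompt: string): string {
  // Reject anything outside the base64 alphabet before it reaches a command string.
  if (!BASE64_RE.test(encodedPrompt)) {
    throw new Error("encoded prompt contains non-base64 characters");
  }
  // Single quotes suppress all shell expansion, and the check above
  // guarantees the payload cannot contain a single quote itself.
  return `_EP='${encodedPrompt}'`;
}
```

The charset check and the quoting are independent layers: either one alone
blocks a payload such as `x'; id; '`.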
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add --fast flag for parallel server boot + setup
Adds `--fast` flag that runs server creation concurrently with API key
prompt, account check, pre-provision hooks, tarball download, and env
config generation. Once SSH is up, uploads tarball and applies config.
--fast implies --beta tarball and --beta images, enabling snapshots
and pre-built tarballs automatically.
Flow without --fast (sequential):
auth → API key → preProvision → size → create → boot → install → configure
Flow with --fast (parallel):
auth → size → [create+boot | API key | preProvision | tarball download | accountCheck]
→ upload tarball → inject env → configure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add --beta parallel as standalone opt-in for parallel setup
--beta parallel enables the parallel orchestration without implying
tarball/images. --fast still implies all three (tarball + images +
parallel).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add validateIdentifier() calls to buildFixScript() and fixSpawn() to
ensure agent keys from spawn history match [a-z0-9_-]+ before using
them to index manifest.agents. This prevents potential prototype
pollution or unexpected behavior from tampered history files.
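
A minimal sketch of the check, assuming validateIdentifier() throws on a
mismatch (its real signature is not shown here). lookupAgent and the Manifest
shape are hypothetical, and the own-property guard is an extra of this sketch,
not something this change describes.

```typescript
const IDENTIFIER_RE = /^[a-z0-9_-]+$/;

// Assumed behavior: throw on anything outside [a-z0-9_-]+.
function validateIdentifier(value: string, label: string): string {
  if (!IDENTIFIER_RE.test(value)) {
    throw new Error(`Invalid ${label}: ${JSON.stringify(value)}`);
  }
  return value;
}

// Hypothetical manifest shape, for illustration only.
interface Manifest {
  agents: Record<string, { name: string }>;
}

function lookupAgent(manifest: Manifest, agentKey: string): { name: string } | undefined {
  const key = validateIdentifier(agentKey, "agent key");
  // Extra guard in this sketch: only return keys the manifest itself defines.
  return Object.prototype.hasOwnProperty.call(manifest.agents, key)
    ? manifest.agents[key]
    : undefined;
}
```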
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Orphaned e2e instances from previously interrupted test runs (e.g. killed
by timeout) remain under the 30-minute max_age threshold and continue to
consume account capacity. This caused DigitalOcean "droplet limit exceeded"
422 errors when re-running the suite within 30 minutes of a failed run.
Add a pre-run stale cleanup call at the start of run_agents_for_cloud (after
credentials are validated, before agents start). This clears leftover e2e-*
instances immediately so they don't block provisioning in the new run.
-- qa/e2e-tester
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the pattern of embedding base64-encoded prompts directly into
remote command strings via shell variable interpolation with a two-step
approach: stage the encoded prompt to a remote temp file first, then
read from that file in the agent command. This eliminates RCE risk if
the prompt source ever becomes user-controlled.
Changes:
- Add _stage_prompt_remotely() helper that writes encoded prompt to
/tmp/.e2e-prompt on the remote host via an isolated cloud_exec call
- input_test_claude(): read prompt from temp file instead of _ENCODED_PROMPT var
- input_test_codex(): same
- input_test_openclaw(): same
- input_test_zeroclaw(): same
- Update _validate_base64() comment to reflect defense-in-depth role
Closes #2788
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Two CLI changes landed after the last version bump (0.23.1) without
incrementing the version:
- d9575acd: fix(cli): exit with code 1 on spawn fix error paths
- 148cc9e7: refactor: extract duplicate waitForSshSnapshotBoot to shared/ssh.ts
The CLI has auto-update enabled — without a version bump, users won't
pick up these fixes on next run.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The old-asset cleanup pipeline `gh release view | grep | while` fails
when grep finds no matches (exit 1) and pipefail is set. This kills
the entire step before gh release upload runs.
Fix: wrap grep in `{ grep ... || true; }` so no-match is not fatal.
This caused all arm64 builds and some x86_64 builds to fail nightly.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cx23 is only available in Helsinki — poor availability. Switch to
cpx22 (AMD, 2 vCPU, 4GB) which is available in nbg1/hel1/sin.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The waitForSshOnly function was identically duplicated in hetzner.ts and
digitalocean.ts. Extract the shared logic into waitForSshSnapshotBoot() in
shared/ssh.ts and replace the duplicate cloud implementations with thin
wrappers that resolve module-local state before delegating.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The nested comprehension `[($agents[] | . as $a) | ...]` is invalid jq.
Use `[$agents[] as $a | $clouds[] as $c | ...]` instead.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmdFix error paths (spawn not found, non-interactive with multiple
servers, picker mismatch) previously returned without setting a
non-zero exit code. Scripts checking $? would incorrectly see success.
Now exits with code 1 on all error paths in cmdFix. fixSpawn() is
unchanged since it is also called from the list picker, where returning
to the picker loop is the correct behavior.
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
tryCatchIf(isFileError) only catches filesystem errors (ENOENT, EACCES),
but JSON.parse throws SyntaxError on corrupted preferences.json. This
is the same bug fixed in 16a2f180 across 4 files, but orchestrate.ts
was missed. A corrupted ~/.spawn/preferences.json would crash the CLI
instead of gracefully falling back to no preferred model.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add explicit validation that encoded_prompt only contains safe base64
characters ([A-Za-z0-9+/=]) in all input_test_* functions in verify.sh.
This makes the safety assumption explicit in code rather than relying
on documentation — if the base64 output ever contains unexpected chars,
the test aborts immediately instead of injecting them into a remote
command string.
Fixes #2775
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Validates LOG_DIR is within /tmp/spawn-e2e.* before deleting it,
preventing catastrophic data loss if LOG_DIR is somehow set to an
unexpected path via TMPDIR manipulation or future refactors.
Fixes #2777
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Replace `for _ in ${VAR}; do count=$((count+1)); done` patterns in e2e.sh
with `printf '%s\n' "${VAR}" | wc -w | tr -d ' '` to count space-separated
list items without relying on unquoted word splitting in loop headers.
The `cloud_count`, `pass_count`, and `fail_count` variables are now computed
using `wc -w`, which is safer and more explicit. The empty-string guard on
the pass/fail counters ensures `wc -w` receives a non-empty input.
Fixes #2776
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
CLI changes:
- Add findSpawnSnapshot() to query Hetzner /images?type=snapshot API
for pre-built spawn-{agent}-* images (matches by description prefix)
- Add waitForSshOnly() for snapshot boots (skips cloud-init polling)
- Update createServer() to accept optional snapshotId — boots from
snapshot instead of ubuntu-24.04, skips cloud-init userdata
- Wire up orchestrator with skipAgentInstall flag
Packer changes:
- Add packer/hetzner.pkr.hcl using hcloud plugin, mirroring the DO
template (tier scripts, agent install, cleanup, manifest)
- Unify packer-snapshots.yml to build both DO and Hetzner in a single
workflow with cloud×agent matrix and per-cloud cleanup steps
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7 agent-specific it() blocks for validateLaunchCmd (all calling .not.toThrow()
on trivially different inputs) collapsed into one data-driven loop. Similarly,
6 individual validatePreLaunchCmd valid-pattern tests collapsed into one loop.
Reduces it() count in security-connection-validation.test.ts from 93 to 81 with
zero change in coverage - every command variant is still exercised.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
tryCatchIf(isFileError) only catches filesystem errors (ENOENT, EACCES),
but JSON.parse throws SyntaxError on corrupted input. Since tryCatchIf
rethrows non-matching errors, a corrupted config file crashes the CLI
instead of returning the intended null/false fallback.
Affected: readCache(), local manifest loader, loadApiToken(),
loadSavedOpenRouterKey(), hasCloudConfigCredentials()
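
The failure mode can be reproduced with stand-in helpers. The implementations
below are assumptions written for this sketch (the real tryCatchIf and
isFileError are not shown here), and catching every error via tryCatch is one
possible fix, not necessarily the one applied.

```typescript
// Stand-in: treats only ENOENT/EACCES as file errors, as described above.
function isFileError(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return code === "ENOENT" || code === "EACCES";
}

// Stand-in: returns null for matching errors and rethrows everything else.
function tryCatchIf<T>(shouldCatch: (err: unknown) => boolean, fn: () => T): T | null {
  try {
    return fn();
  } catch (err) {
    if (shouldCatch(err)) return null;
    throw err;
  }
}

// Stand-in: returns null for any error, including SyntaxError from JSON.parse.
function tryCatch<T>(fn: () => T): T | null {
  try {
    return fn();
  } catch (_err) {
    return null;
  }
}

function parseConfig(text: string): unknown {
  return JSON.parse(text);
}
```

With these stand-ins, `tryCatchIf(isFileError, () => parseConfig("{not json"))`
rethrows the SyntaxError, while `tryCatch(() => parseConfig("{not json"))`
yields the null fallback.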
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
s-2vcpu-4gb is not available in nyc3 (the default E2E region), causing
openclaw provisioning to fail with 422. s-2vcpu-4gb-intel offers the same
specs (2 vCPUs, 4 GB RAM) and is available in all regions including nyc3.
-- qa/e2e-tester
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Hetzner disabled fsn1 (Falkenstein), causing a fatal HTTP 412 error for
all users using the default location. This change:
- Fetches available locations dynamically from GET /locations API
- Falls back to a hardcoded list if the API call fails
- On location-unavailable errors (HTTP 412 resource_unavailable),
prompts the user to pick a different location instead of crashing
- Changes default location from fsn1 to nbg1 (Nuremberg)
- Excludes previously-failed locations from the re-pick list
Closes #2764
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Security Reviewer <security@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
macOS and Linux return identical results for getLocalShell, getWhichCommand,
getInstallScriptUrl, and getInstallCmd. Collapsed the duplicate per-platform
tests into a data-driven loop over ["darwin", "linux"], reducing repetition
while preserving the same coverage. Also added the missing Linux case for
getInstallCmd (was only tested for Windows and macOS).
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Removed the "integration with getScriptFailureGuidance" describe block
from credential-hints.test.ts. All three tests were redundant:
- "always includes setup instructions regardless of env state": tested
for vague "setup instructions" string, already verified by the
"when all required env vars are missing" describe block above.
- "always returns at least one line": pure existence check, already
proven by the "when no authHint is provided" tests which assert exact
length of 1.
- "returns more lines when authHint is provided": tests line-count
implementation detail rather than behavior; behavior is fully covered
by the per-scenario describe blocks.
Test count: 1467 to 1464. Zero regressions. Biome lint: 0 errors.
-- qa/dedup-scanner
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add doGetAll() pagination helper (matching Hetzner's hetznerGetAll pattern)
and use it for all three unpaginated DO API calls:
- ensureSshKey(): /account/keys (was silently truncated at 20 keys)
- createServer(): /account/keys (same issue for SSH key ID collection)
- listServers(): /droplets (was silently truncated at 20 droplets)
Replace fragile `regText.includes('"id"')` string check with proper
`parseJsonObj(regText)?.ssh_key` validation for SSH key registration.
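
A simplified sketch of the pagination loop. It is synchronous and takes an
injected page fetcher so it stays self-contained; a real helper such as
doGetAll() would await HTTP calls. getAllPages, Page, and maxPages are
hypothetical names, not taken from the codebase.

```typescript
// Hypothetical page shape; the real API response format is not modeled here.
interface Page<T> {
  items: T[];
  hasNext: boolean;
}

function getAllPages<T>(fetchPage: (page: number) => Page<T>, maxPages = 100): T[] {
  const all: T[] = [];
  // maxPages is a safety cap so a misbehaving API cannot loop forever.
  for (let page = 1; page <= maxPages; page++) {
    const { items, hasNext } = fetchPage(page);
    all.push(...items);
    if (!hasNext) break;
  }
  return all;
}
```

Reading only the first page is what caused the silent truncation: with 45 keys
and 20 per page, a single request sees fewer than half of them.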
Fixes #2748
Fixes #2749
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
When p.isCancel() detected user cancellation in prompt() and
selectFromList(), the result was silently converted to "" instead of
exiting. This caused infinite retry loops in billing prompts, silent
fallthrough in oauth key entry, and unintended defaults in name prompts.
Now both functions call process.exit(0) on cancel for a clean exit.
Fixes #2745
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
checkForUpdates() previously fetched the latest version from GitHub on
every single CLI invocation, blocking for up to 10s on slow/offline
connections. Now it writes a timestamp to ~/.config/spawn/.update-checked
after a successful check and skips the network call if the cache is
less than 1 hour old.
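
The freshness decision can be sketched as a pure function. The names
shouldCheckForUpdates and UPDATE_CHECK_TTL_MS are hypothetical, and treating a
timestamp from the future as stale is an assumption of this sketch.

```typescript
const UPDATE_CHECK_TTL_MS = 60 * 60 * 1000; // 1 hour

// lastCheckedMs: timestamp read from the cache file, or null if it is missing.
function shouldCheckForUpdates(lastCheckedMs: number | null, nowMs: number): boolean {
  if (lastCheckedMs === null || !Number.isFinite(lastCheckedMs)) return true;
  const ageMs = nowMs - lastCheckedMs;
  // A negative age (clock moved backwards) is treated as stale.
  return ageMs < 0 || ageMs >= UPDATE_CHECK_TTL_MS;
}
```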
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Remove `set -e` from userdata script and add an EXIT trap to guarantee
/root/.cloud-init-complete is written even if apt-get or other setup
steps fail. Add `|| true` to apt-get commands for extra resilience.
Previously, the userdata script used `set -e` causing it to abort on
any command failure before reaching the marker write at the end. This
made waitForCloudInit() always time out with "Cloud-init marker not
found, continuing anyway..." adding ~5 minutes to every Hetzner
provisioning.
Fixes #2739
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
When a GitHub Release contains only one architecture-specific tarball
(e.g., x86_64 only), the download command now checks `uname -m` on
the remote VM and fails with exit 1 if the arch doesn't match. This
prevents installing an x86_64 binary on ARM (or vice versa) and ensures
the orchestrator falls back to live installation.
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
proc.killed is true as soon as kill() is called, not when the process
exits. This meant SIGKILL escalation was always skipped, leaving stuck
processes hanging indefinitely. Remove the faulty guard and always
attempt SIGKILL after the grace period — try/catch handles already-dead
processes.
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, `spawn claude sprite --help` would warn about extra args
and proceed to provision a server. Now trailing help/version flags are
detected and handled correctly in both the default command path and
verb alias path (e.g., `spawn run claude sprite --help`).
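
A rough sketch of the detection, assuming the flag spellings `--help`, `-h`,
and `--version` and that only the final argument is inspected. The function
name detectTrailingFlag is hypothetical; the real parser may differ.

```typescript
// Assumed flag spellings; the CLI may accept others.
const HELP_FLAGS = new Set(["--help", "-h"]);
const VERSION_FLAGS = new Set(["--version"]);

type TrailingFlag = "help" | "version" | null;

function detectTrailingFlag(args: string[]): TrailingFlag {
  const last = args[args.length - 1];
  if (last === undefined) return null;
  if (HELP_FLAGS.has(last)) return "help";
  if (VERSION_FLAGS.has(last)) return "version";
  return null;
}
```

Running this check before provisioning lets both `claude sprite --help` and
`run claude sprite --help` short-circuit to the help output.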
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The regex `configPath.replace(/\/[^/]+$/, "")` only matches forward
slashes, so on Windows (which uses backslashes) it returns the full
path unchanged. `mkdirSync` then creates `digitalocean.json` as a
directory, causing EISDIR on the next write.
Replace with `dirname()` from `node:path` which handles both separators.
Affects digitalocean.ts, hetzner.ts, and aws.ts (oauth.ts already used
dirname correctly).
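
The difference can be reproduced on any platform by using the `win32` flavor
of `node:path` explicitly. The sample path below is made up for illustration.

```typescript
import { win32 } from "node:path";

// Made-up Windows-style config path, for illustration only.
const configPath = "C:\\Users\\me\\.config\\spawn\\digitalocean.json";

// Old approach: the regex only knows "/", so nothing is stripped.
const viaRegex = configPath.replace(/\/[^/]+$/, "");

// New approach: dirname understands the backslash separator.
const viaDirname = win32.dirname(configPath);

console.log(viaRegex === configPath); // true: the path came back unchanged
console.log(viaDirname);              // C:\Users\me\.config\spawn
```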
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: PR Reviewer <pr-reviewer@spawn>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
validatePromptFilePath used path.resolve() which only normalizes the
string but doesn't follow symlinks. An attacker could create a symlink
(e.g., innocent.txt -> ~/.ssh/id_rsa) to bypass sensitive path checks
and exfiltrate credentials. Now uses realpathSync() to canonicalize
the path before pattern matching.
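
A small demonstration of the gap, assuming a POSIX system where creating
symlinks is permitted. It builds a throwaway directory with a stand-in
"secret" file; compareResolution is a hypothetical name and nothing here
touches real credentials.

```typescript
import { mkdtempSync, realpathSync, rmSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, resolve } from "node:path";

function compareResolution(): { resolved: string; canonical: string } {
  const dir = mkdtempSync(join(tmpdir(), "symlink-demo-"));
  try {
    const secret = join(dir, "id_rsa"); // stand-in for a sensitive file
    const link = join(dir, "innocent.txt");
    writeFileSync(secret, "not a real key");
    symlinkSync(secret, link);
    return {
      resolved: resolve(link),       // string normalization only
      canonical: realpathSync(link), // follows the symlink to its target
    };
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}
```

A pattern check against `resolved` sees only `innocent.txt`, while the same
check against `canonical` sees the `id_rsa` target.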
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lightsail can report state=running before assigning a public IP. Continue
polling until both state is running and IP is non-empty, preventing SSH
connection failures from an empty IP address.
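
The readiness condition, sketched as a predicate. The InstanceStatus shape and
the name isReadyForSsh are assumptions of this sketch, not the actual
Lightsail response type.

```typescript
// Hypothetical, simplified view of an instance status.
interface InstanceStatus {
  state: string;
  publicIp?: string | null;
}

// Ready only when the instance is running AND has a non-empty public IP.
function isReadyForSsh(status: InstanceStatus): boolean {
  return (
    status.state === "running" &&
    typeof status.publicIp === "string" &&
    status.publicIp.trim() !== ""
  );
}
```

A polling loop would keep waiting while this returns false, instead of
stopping at the first `running` state.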
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hetzner API defaults to 25 items per page. Users with >25 SSH keys would
hit SSH lockout on server creation because the newly registered key landed
on page 2+ and was omitted from the ssh_keys payload.
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Bun.write does not support the `mode` option, so credential config files
(Hetzner, DigitalOcean, AWS, OpenRouter) were created with 0644 permissions
instead of the intended 0600, exposing API tokens to other local users.
Switch to node:fs writeFileSync which correctly applies file permissions.
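
A minimal sketch of the replacement call, assuming a POSIX filesystem. The
helper name writeSecretFile and the demo file are hypothetical.

```typescript
import { mkdtempSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write a credential file readable only by its owner.
function writeSecretFile(path: string, contents: string): void {
  // Note: `mode` only applies when the file is created, not when it already exists.
  writeFileSync(path, contents, { mode: 0o600 });
}

// Demo in a throwaway directory so the file is always freshly created.
const demoDir = mkdtempSync(join(tmpdir(), "secret-demo-"));
const demoFile = join(demoDir, "token.json");
writeSecretFile(demoFile, JSON.stringify({ token: "example" }));
const demoMode = statSync(demoFile).mode & 0o777; // permission bits only
rmSync(demoDir, { recursive: true, force: true });
```

Because `mode` is ignored for files that already exist, a file previously
created as 0644 would additionally need a chmod to tighten it.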
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Junie only accepts its own shorthand model names (gpt, opus, sonnet, etc.)
and not OpenRouter model IDs. Removing modelEnvVar lets junie handle its
own model routing via the OpenRouter API key instead.
Fixes #2734
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
On GCP VMs (running as root), npm installs openclaw to /usr/local/bin
instead of ~/.npm-global/bin because the system npm prefix is writable
and already in PATH. The E2E verify_openclaw() and related gateway
helper functions only explicitly listed ~/.npm-global/bin, ~/.bun/bin,
and ~/.local/bin — missing /usr/local/bin when .spawnrc sourcing
silently fails in the piped-bash SSH exec context.
Add /usr/local/bin explicitly to all openclaw-related PATH exports in
verify.sh so the binary check succeeds regardless of .spawnrc state.
Fixes #2732
Agent: test-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The bash wrapper scripts (.sh) contain bash syntax that PowerShell
cannot parse. On Windows, download the pre-built JS bundle from
GitHub releases and run it directly via `bun run {cloud}.js {agent}`,
which is exactly what the bash wrapper ultimately does.
Affects both interactive (execScript) and headless (cmdRunHeadless)
code paths. macOS/Linux behavior unchanged.
Closes #2726
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Two instances of the pattern `err && typeof err === "object" && "code" in err`
violated the type-safety rule requiring valibot or shared type-guard utilities
instead of manual multi-level type checks. Replaced with `toRecord(err)` and
`isString()` from @openrouter/spawn-shared for consistent, rule-compliant error
code extraction. Also bumps CLI patch version per cli-version.md.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Add missing 'spawn uninstall' command to the Commands table. The command
exists in packages/cli/src/commands/help.ts (getHelpUsageSection) but was
absent from the README commands table.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Installs a systemd timer + oneshot service that updates the agent binary
and system packages every 6 hours without disrupting running instances.
Agent update safety:
- Binary agents (Go, Rust): Linux keeps old inode in memory; safe to replace
- npm agents: Node.js caches modules at startup; running processes unaffected
- New version takes effect on next restart via the existing restart loop
System update safety:
- Disables Ubuntu's unattended-upgrades to prevent dpkg lock contention
- Uses flock -w 300 on /var/lib/dpkg/lock-frontend before apt operations
- DEBIAN_FRONTEND=noninteractive with --force-confdef/--force-confold
User-facing:
- "Auto-update" option in setup multiselect (default on, user can uncheck)
- Skipped for local cloud and non-systemd systems
- Non-fatal: setup failure doesn't block agent launch
- Logs to /var/log/spawn-auto-update.log
Timer: 15min after boot, then every 6h with 30min random jitter.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace hardcoded "bash" shell references with platform-aware utilities so
spawn works natively from PowerShell on Windows without WSL or Git Bash.
- New shared/shell.ts: isWindows(), getLocalShell(), getInstallScriptUrl(),
getInstallCmd(), getWhichCommand() with platform override for testability
- local/local.ts: use getLocalShell() for runLocal() and interactiveSession()
- commands/run.ts: spawnScript/runScriptHeadless use getLocalShell()
- commands/update.ts: Windows downloads install.ps1, runs via PowerShell
- update-check.ts: Windows auto-update uses install.ps1; "where" replaces "which"
- shared/orchestrate.ts: PowerShell-compatible .spawnrc setup for local Windows
- Remote SSH commands unchanged — remote servers are always Linux
Closes #2726
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* feat(cli): add `spawn uninstall` command
Adds a new `uninstall` subcommand that cleanly reverses the install:
- Removes ~/.local/bin/spawn binary and /usr/local/bin/spawn symlink
- Cleans spawn PATH entries from shell RC files (.bashrc, .zshrc, etc.)
- Removes ~/.cache/spawn/ cache directory
- Optionally removes ~/.spawn/ (history) and ~/.config/spawn/ (keys/config)
- Shows confirmation prompt before any destructive action
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: use start/end markers for shell RC blocks
- Add shared RC_MARKER_START/RC_MARKER_END constants in paths.ts
- Update install.sh to write `# >>> spawn >>>` / `# <<< spawn <<<` block markers
- Update uninstall.ts to remove content between markers (with legacy fallback)
- Addresses review feedback: shared markers make RC entries easier to audit/remove
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: share legacy RC marker from paths.ts
Move the legacy "# Added by spawn installer" string to RC_MARKER_LEGACY
in shared/paths.ts so both install.sh and uninstall.ts reference the
same source of truth for all marker strings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
When a DigitalOcean token expires mid-session (after ensureDoToken succeeds),
API calls like ensureSshKey, createServer, listServers, destroyServer would
crash with "Fatal: DigitalOcean API error 401" because doApi had no recovery
path for 401 responses.
Now doApi detects 401, attempts OAuth browser flow recovery via tryDoOAuth(),
and retries the request with the new token. A re-entrancy guard prevents
infinite loops (doApi → tryDoOAuth → doApi → ...). If OAuth recovery fails,
the original 401 error is thrown as before.
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
testDoToken() used asyncTryCatchIf(isNetworkError, ...) which only caught
network errors. A 401 HTTP response threw a regular Error that escaped the
guard, propagating to main().catch() and printing "Fatal: DigitalOcean API
error 401...". Changed to asyncTryCatch() to catch all errors, returning
false for invalid tokens so ensureDoToken() naturally falls through to
OAuth recovery.
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate 10 single-assertion cmdMatrix tests (5 wide-terminal + 5
narrow-terminal) into 2 comprehensive tests using beforeEach/afterEach for
terminal-width setup. Also fix a pre-existing environment-dependent failure
where HCLOUD_TOKEN being set on the host caused the auth-hint test to see
"ready" instead of "needs".
Changes:
- "grid view (wide terminal)": 5 tests → 1 test (8 fewer cmdMatrix() calls)
- "compact view (narrow terminal)": 5 tests → 1 test (same)
- Fix "should display auth hints" to clear host env vars before asserting
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The E2E framework's run_single_agent function had no overall timeout.
When provision/verify/input_test steps hung (e.g. cloud_exec blocking
on sprite-zeroclaw or digitalocean-opencode), the process would stall
indefinitely without writing a .result file, causing silent test failures.
Add a per-agent wall-clock timeout (default 1800s, 2400s for junie) that
wraps the core provision/verify/input_test logic in a killable subshell.
If the timeout expires, the subshell is killed and a "fail" result is
written, ensuring E2E batches always complete.
Fixes #2714
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>