Commit graph

65 commits

Author SHA1 Message Date
A
1097f055c3
fix(security): add --proto '=https' to all curl executable downloads (#2160)
42 curl calls downloading JS bundles, CLI binaries, and gh CLI tarballs
were missing --proto '=https', allowing protocol downgrade attacks on
hostile networks. PR #2138 fixed bun installer calls; this closes the
remaining gap for executable downloads.

Fixes applied:
- sh/{sprite,aws,gcp,hetzner,daytona,local}/{claude,codex,openclaw,opencode,kilocode,hermes,zeroclaw}.sh (42 files)
- sh/cli/install.sh (cli.js download)
- sh/shared/github-auth.sh (keyring, API, tarball downloads)

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 23:38:03 -05:00
A
251ddf2967
fix(e2e): pass env_b64 via printf stdin to eliminate interpolation risk (#2159)
Fixes #2152

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 19:34:31 -08:00
A
6423eb51f5
fix(security): validate env values in cloud_headless_env parser (#2146)
* fix(security): validate env values in cloud_headless_env parser

Reject values containing shell metacharacters ($, backtick, ;, &, |, <, >)
to prevent potential command injection if a cloud driver returns malicious output.

Fixes #2139

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace env value blacklist with whitelist regex

The blacklist approach missed dangerous characters like (), quotes,
backslash, newlines, {}, and !. Switch to a whitelist that only allows
[A-Za-z0-9@%+=:,./_-] — a strict safe set sufficient for env values.

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 20:49:45 +00:00
A
ebe8148177
fix(e2e): correct provision.sh fallback .spawnrc env vars for zeroclaw, hermes, kilocode (#2149)
The fallback .spawnrc construction (used when provision times out before
.spawnrc is written) had two bugs:

1. zeroclaw case wrongly included OPENAI_API_KEY and OPENAI_BASE_URL —
   these are hermes env vars, not zeroclaw's. zeroclaw only needs
   ZEROCLAW_PROVIDER=openrouter (plus the base OPENROUTER_API_KEY).

2. hermes and kilocode were missing from the case statement entirely.
   - hermes needs OPENAI_BASE_URL and OPENAI_API_KEY (verify_hermes
     checks for OPENAI_BASE_URL in .spawnrc)
   - kilocode needs KILO_PROVIDER_TYPE=openrouter and
     KILO_OPEN_ROUTER_API_KEY (verify_kilocode checks KILO_PROVIDER_TYPE)

Without these fixes, hermes and kilocode would fail verification whenever
provisioning timed out before the normal .spawnrc was written.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 15:46:57 -05:00
A
cfa1ae7a08
fix(security): add --proto '=https' to all curl bun installer calls (#2138)
* fix(security): add --proto '=https' to all curl bun installer calls

Fixes #2134

All _ensure_bun() functions across aws, hetzner, gcp, local, daytona,
and sprite scripts now enforce HTTPS-only downloads via --proto '=https'.
This prevents MITM attacks during bun installation on remote VMs.
DigitalOcean scripts were already correct and are not changed.

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): add --proto '=https' to bun installer in TS files

Address security reviewer feedback: the same MITM vulnerability
existed in 5 TypeScript programmatic provisioning files.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): quote --proto '=https' in su -c curl calls

The aws.ts and gcp.ts files had --proto =https without quotes inside
su -c '...' blocks. Uses double quotes ("=https") to properly nest
inside the single-quoted su -c argument while maintaining protocol
restriction.

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 11:52:54 -08:00
A
83d68c6a37
refactor: Remove dead code and stale references (#2137)
Remove `cleanup_stale_apps()` in `sh/e2e/lib/cleanup.sh` which was dead
code — defined but never called. The E2E orchestrator (`e2e.sh`) invokes
`cloud_cleanup_stale` directly on the active cloud driver; the wrapper
function and its file served no purpose.

Also remove the corresponding `source` call in `e2e.sh`.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-03 11:51:59 -05:00
Ahmed Abushagur
300b330106
fix: address 4 reliability issues across codebase (#2129)
* fix: address 4 reliability issues across codebase

1. sprite.ts: add --force to destroy command (stdin is "ignore" so
   interactive prompts would hang until 60s timeout)

2. verify.sh: replace /dev/tcp port checks with ss -tln primary
   (Debian/Ubuntu bash compiled without /dev/tcp support)

3. verify.sh: make _openclaw_restart_gateway a hard failure instead
   of log_warn (matching _openclaw_ensure_gateway behavior)

4. agent-setup.ts: add ss -tln port check + "already running" early
   exit + increase timeout from 120s to 300s (gateway takes ~3min
   to initialize on AWS medium instances)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: biome format - use consistent double quotes in portCheck

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-03 03:18:44 -05:00
Ahmed Abushagur
4a90abdaa2
fix(e2e): improve openclaw reliability on AWS and other clouds (#2123)
* fix(e2e): improve openclaw reliability on AWS and other clouds

Three changes to make openclaw e2e tests more robust:

1. Increase PROVISION_TIMEOUT from 480s to 720s — AWS cloud-init
   for "full" tier (Node.js + Bun + build-essential) can exceed 480s,
   causing the CLI to be killed before .spawnrc is written.

2. Add .spawnrc manual fallback in provision.sh — if the CLI is killed
   before writing .spawnrc, construct it via SSH using OPENROUTER_API_KEY
   with agent-specific env vars (openclaw, zeroclaw).

3. Add retry logic to openclaw gateway input test — the gateway can
   crash with 1006 websocket closure on resource-constrained instances.
   Now retries once after killing and restarting the gateway process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): fix command injection in e2e provision scripts

- Use printf %q and temp file for api_key handling in provision.sh to
  prevent shell metachar injection (single quotes, backticks, $)
- Double-quote env_b64 interpolation in cloud_exec call to prevent
  word splitting
- Replace echo with printf in bashrc append to avoid portability issues
- Replace overbroad pkill -f 'openclaw gateway' in verify.sh with
  PID-targeted kill via lsof/fuser on port 18789

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-02 23:19:34 -05:00
A
ffe4cf8c9e
refactor: Remove stale shellcheck disable comment from aws/kilocode.sh (#2125)
The SC2154 (referenced but not assigned) comment was leftover from a
prior version of the script. No such external variable is referenced in
the current implementation, making the suppression comment stale.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 20:40:14 -05:00
A
4aaf125a2c
fix(reliability): add curl download error handling to AWS and Hetzner shims (#2122)
14 agent shim scripts in sh/aws/ and sh/hetzner/ were missing error
handlers on the curl command that downloads the JS bundle from GitHub
releases. If the download failed (network issue, 404, etc.), the script
would silently proceed to exec an empty/corrupt file via bun, producing
a confusing error instead of a clear "Failed to download" message.

All other clouds (GCP, Daytona, DigitalOcean, Sprite) already had this
error handling pattern. This brings AWS and Hetzner into consistency.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 18:09:25 -05:00
A
277c4236a3
fix(security): replace eval with direct indirection in load_cloud_driver (#2121)
Removes eval-based function creation pattern in e2e/lib/common.sh.
Uses variable indirection (ACTIVE_CLOUD global + wrapper functions)
instead of eval to reduce attack surface.

Fixes #2118

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 16:50:27 -05:00
A
7cc21e4111
fix(security): quote timeout var and validate numeric in sprite.sh (#2120)
Fixes unquoted ${timeout} in _sprite_exec_long that could allow
command injection if timeout contained shell metacharacters.
Adds numeric validation before use.

Fixes #2117

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 16:47:39 -05:00
Ahmed Abushagur
b35ecdfaff
fix(e2e): drop --timeout flag from openclaw agent command (#2109)
The outer cloud_exec_long already enforces a timeout via
INPUT_TEST_TIMEOUT. The inner --timeout 60 was redundant and could
cause premature kills before the outer timeout expired.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: A <258483684+la14-1@users.noreply.github.com>
2026-03-02 10:57:44 -08:00
Ahmed Abushagur
35badf8d1b
fix(e2e): fail hard when OpenClaw gateway doesn't start (#2111)
The gateway startup was silently swallowed with log_warn, masking
real failures. Now tracks whether the port came up and fails the
test with the gateway log contents if it didn't.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-02 13:51:29 -05:00
Ahmed Abushagur
9242d44cbb
fix(e2e): add --force to sprite destroy in teardown (#2100)
Without --force, sprite destroy prompts for confirmation in
non-interactive E2E mode and silently fails ("Ok, come back later!"),
leaving stale instances running indefinitely.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 00:24:43 -08:00
A
3d5812602c
refactor: convert hermes scripts to thin-wrapper pattern (#2094)
- hetzner/hermes.sh: add thin-shim header comment, blank line after
  _ensure_bun definition, and section comments (Local checkout, Remote)
  to match the canonical pattern used by aws/gcp/sprite/daytona
- digitalocean/hermes.sh: add detailed _run_with_restart comment block
  and inline section comments (Normal exit, SIGTERM, Other failure) to
  match digitalocean/claude.sh

Both scripts now produce identical output to their cloud's reference
script (e.g. aws/hermes.sh, digitalocean/claude.sh) when the agent
name is substituted.

Fixes #2082

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 20:27:59 -08:00
A
b755c6966c
feat: add local/hermes to complete the 7x7 matrix (#2091)
Fixes #2079 — local/hermes was the only remaining missing entry in the
cloud×agent matrix. All 49 entries are now implemented.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 22:04:38 -05:00
A
d713f9650f
feat: add hermes agent to 4 clouds, bump install wait to 600s (#2084)
- Add hermes shim scripts for GCP, Hetzner, DigitalOcean, and Daytona
- Update manifest.json matrix entries from "missing" to "implemented"
- Bump default INSTALL_WAIT from 300s to 600s to fix zeroclaw timeout
  on small VMs where Rust compilation takes 8-12 minutes
- Update cloud READMEs with hermes usage docs
- Bump CLI version to 0.11.18

Co-authored-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 19:31:50 -05:00
A
69d1971abf
fix(security): remove space from token validation charset in key-request.sh (#2074)
API tokens never contain spaces; allowing them risks word splitting
in downstream unquoted uses of these env vars. Updated both the shell
regex in key-request.sh and the corresponding TypeScript regexes in
digitalocean.ts to stay in sync.

Fixes #2072

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 17:10:22 -05:00
A
548cfdf0b1
fix(security): apply base64 exec escaping to remaining 4 cloud drivers (#2067)
PR #2064 fixed _exec_long shell injection for DigitalOcean and Sprite
but missed the same bash -c '${cmd}' pattern in Hetzner, GCP, AWS, and
Daytona. Apply the same base64-encoding fix to all four.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 11:50:33 -08:00
A
862030b776
fix(security): escape cmd args in _exec_long to prevent shell injection (#2064)
Base64-encode the command before embedding it in bash -c to prevent
single-quote breakout in _sprite_exec_long and _digitalocean_exec_long.

Fixes #2063

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 12:42:27 -05:00
A
add83bbd6c
refactor: Remove dead code and stale references (#2062)
Remove orphaned sh/test/fixtures/ directory. These shell fixture files
(_shared_agent_assertions.sh, hetzner/_env.sh, hetzner/_api_assertions.sh,
digitalocean/_env.sh, digitalocean/_api_assertions.sh) were part of a mock
test harness (mock.sh) that was removed from the repository. The fixture
files reference `assert_api_called` and `MOCK_LOG` variables that are never
defined anywhere, confirming they are unreachable dead code.

Scan results:
- Dead code (sh/test/fixtures/): 5 orphaned fixture files removed
- Dead code (sh/shared, packages/cli/src/): none found
- Stale references to non-existent files: none found
- Python usage (python3 -c / python -c): none found
- Duplicate utilities across cloud modules: loadTokenFromConfig pattern
  exists in hetzner/daytona/digitalocean but reads from different cloud-
  specific config paths — cannot be consolidated (confirmed intentional)
- Stale comments: none found beyond those already fixed in prior PRs

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 11:45:24 -05:00
A
3fa0c82c91
refactor: Remove dead code and stale references (#2059)
* refactor: Remove dead code and stale references

Fix stale path comment in sh/shared/key-request.sh that referenced
the wrong location for loadTokenFromConfig (cli/src/ instead of
packages/cli/src/). Also updated wording from "Must match" to "Keep
in sync with" to more accurately describe the relationship.

Scan results (no other issues found):
- Dead code (sh/shared, packages/cli/src): none found
- Stale references to non-existent files: none found
- Python usage (python3 -c / python -c): none found
- Duplicate utilities across cloud modules: none (cloud-specific config
  loading functions share the same pattern but read from different paths
  and cannot be consolidated)
- Stale comments: one stale path in key-request.sh (fixed)

-- qa/code-quality

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: Remove dead code and stale references

Remove duplicate `log_step` function from `sh/shared/github-auth.sh`.
`log_step` was identical to `log_info` (same printf format, same output
stream) and had no semantic distinction. All 6 call sites are updated to
use `log_info` directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-01 08:26:10 -05:00
Ahmed Abushagur
45caf4b96b
fix(sprite): fix all 6 Sprite agent installs for E2E (#2057)
* fix(sprite): fix all 6 Sprite agent installs for E2E

- Use `npm install -g --prefix` instead of `npm config set prefix` to
  avoid creating .npmrc that conflicts with nvm on Sprite VMs
- Fix shell environment setup to only modify .bash_profile (not .bashrc)
  so non-interactive bash -c commands retain PATH config
- Add $HOME/.cargo/bin to PATH for zeroclaw (Sprite has no ~/.cargo/env)
- Add $HOME/.local/bin to PATH config for Sprite shell environment
- Add sprite E2E cloud driver with org detection, config corruption fix,
  direct command embedding (not $1 positional), and retry logic
- Fix provision.sh to kill full process tree after timeout (prevents
  orphaned sprite exec sessions from corrupting config)
- Fix verify.sh zeroclaw check to not rely on ~/.cargo/env existing

Tested: 6/6 Sprite agents pass E2E (claude, codex, openclaw, zeroclaw,
opencode, kilocode). Hermes is not in the Sprite manifest.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: biome format - collapse runSprite call to single line

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-01 07:15:09 -05:00
A
66bd992198
refactor: Remove dead code and stale references (#2055)
Fix stale path comment in sh/shared/key-request.sh that referenced
the wrong location for loadTokenFromConfig (cli/src/ instead of
packages/cli/src/). Also updated wording from "Must match" to "Keep
in sync with" to more accurately describe the relationship.

Scan results (no other issues found):
- Dead code (sh/shared, packages/cli/src): none found
- Stale references to non-existent files: none found
- Python usage (python3 -c / python -c): none found
- Duplicate utilities across cloud modules: none (cloud-specific config
  loading functions share the same pattern but read from different paths
  and cannot be consolidated)
- Stale comments: one stale path in key-request.sh (fixed)

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 04:45:54 -05:00
Ahmed Abushagur
2155a36a1f
fix(e2e): use explicit PATH for hermes, codex, and kilocode binary checks (#2049)
Non-interactive SSH sessions don't source .bashrc or .zshrc, so binaries
installed to ~/.local/bin (hermes via uv) or ~/.npm-global/bin (codex,
kilocode via npm) were not found during verification.

Fix all three verify functions and the codex input test to use explicit
PATH with the known install directories, matching the pattern already
used by openclaw and claude.

Verified: AWS 7/7, Hetzner 6/6 implemented, GCP 6/6 implemented.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 00:07:33 -05:00
Ahmed Abushagur
cd758589c3
fix(e2e): robust DigitalOcean teardown with retry and deletion confirmation (#2046)
The teardown was doing a single DELETE without --max-time, so connection
timeouts caused HTTP 000 and the droplet was never deleted. When running
6 agents in batches of 3, batch 1's stale droplet caused batch 2 to fail
with "will exceed your droplet limit."

Fix:
- Add --max-time 30 to prevent curl hangs
- Retry DELETE up to 3 times on failure
- Poll the API after DELETE to confirm the droplet is actually gone (up to 60s)
- Remove -f flag from curl so %{http_code} is always captured

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 22:09:15 -05:00
Ahmed Abushagur
d8dbf952c2
fix(e2e): fix openclaw input test (PATH, CLI flags, gateway restart) (#2045)
The openclaw e2e input test was failing for three independent reasons:

1. PATH missing ~/.npm-global/bin — openclaw installs via npm with a
   custom prefix, but verify_openclaw and input_test_openclaw didn't
   include that directory in PATH

2. Wrong CLI invocation — used `openclaw -p` which doesn't exist.
   The correct command is `openclaw agent --message "..." --session-id`

3. Gateway not running — the openclaw gateway (port 18789) can die
   between provisioning and verification. Now the input test ensures
   the gateway is running before sending the prompt.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-28 20:41:48 -05:00
A
dc27f456ea
fix(e2e): add sh/aws/hermes.sh and mark aws/hermes as implemented (#2042)
Hermes agent was fully implemented in shared/agent-setup.ts (createAgents
includes hermes with install, envVars, and launchCmd) but the convenience
shell script sh/aws/hermes.sh was missing and the matrix showed "missing".

- Add sh/aws/hermes.sh (matching pattern of all other aws agent scripts)
- Update manifest.json: "aws/hermes" -> "implemented"
- Update sh/aws/README.md with Hermes Agent install command

Discovered during QA E2E sweep: E2E suite lists hermes in ALL_AGENTS and
would attempt to provision it; without the matrix entry and script the
agent was silently untracked as a missing implementation gap.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 20:38:26 -05:00
A
ab08476a63
refactor: Remove dead code and stale references (#2028)
- Add `hermes` to ALL_AGENTS in sh/e2e/lib/common.sh (stale: hermes added to
  manifest.json in #2023 but never added to the e2e agent list)
- Add verify_hermes() and input_test_hermes() to sh/e2e/lib/verify.sh and
  wire them into verify_agent/run_input_test dispatch tables
- Remove dead log_warn() from sh/shared/github-auth.sh (defined but never called)
- Remove dead get_cloud_env_vars() from sh/shared/key-request.sh (no callers outside file)
- Remove dead invalidate_cloud_key() from sh/shared/key-request.sh (no callers anywhere)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-28 12:55:09 -08:00
A
187cd6aafc
feat(agent): add Hermes Agent (Nous Research) (#2023)
Implements Hermes Agent on Sprite cloud. Hermes is a persistent AI
agent by Nous Research with multi-platform messaging (Telegram,
Discord, Slack, CLI), memory across sessions, tool use, and native
OpenRouter support.

- Add hermes agent entry to manifest.json with env config
- Add matrix entries for all 7 clouds (sprite implemented, rest missing)
- Create sh/sprite/hermes.sh thin bash shim

Closes #1952

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 06:53:13 -08:00
A
111512220f
fix(security): replace eval with safe export parser in provision.sh (#2022)
Replace eval "${cloud_env}" with a while-read loop that only
processes lines matching ^export VAR="VALUE" — arbitrary shell
commands in cloud driver output are silently ignored.

Also removes the now-unused cloud_env local variable since the
while-read loop calls cloud_headless_env directly.

Fixes #2019

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 09:43:29 -05:00
A
54c7764d03
fix(security): prevent cmd injection in sprite exec via positional args (#2021)
Replace bash -c "${cmd}" with bash -c '$1' _ "${cmd}" so the
command is passed as a positional argument, not interpolated into
the shell string. Same pattern applied to the timeout wrapper.

Fixes #2018

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 09:42:13 -05:00
Ahmed Abushagur
d5461adc16
feat: SPAWN_CLI_DIR env var to force local source in e2e (#2015)
* feat: SPAWN_CLI_DIR env var to force local source in e2e and shell scripts

When SPAWN_CLI_DIR is set, the entire toolchain uses local TypeScript
source instead of downloading pre-bundled scripts from GitHub releases:

- e2e.sh: auto-sets SPAWN_CLI_DIR to repo root when running locally
- provision.sh: exports SPAWN_CLI_DIR into the headless subshell
- commands.ts: reads local shell scripts instead of fetching from CDN
- All 36 cloud/agent shell scripts: exec local main.ts when set

This enables e2e tests to validate local changes before they're released.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): add path traversal defense to SPAWN_CLI_DIR script loading

Canonicalize the path via realpathSync and verify it stays inside the
resolved CLI directory before reading. Prevents SPAWN_CLI_DIR from
being used to read arbitrary files via ../ traversal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): harden SPAWN_CLI_DIR path traversal defense

- Validate cloud/agent names don't contain '..', '/' or '\' before
  constructing file paths
- Fix root-directory edge case in prefix check by handling trailing
  separator correctly

Agent: pr-maintainer
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-02-28 04:14:36 -05:00
Ahmed Abushagur
c1e605c884
fix(e2e): increase server sizes and install timeouts (#2014)
E2E tests were failing because agent installs didn't complete within
the default 120s timeout, and small VMs ran out of memory during builds.

- INSTALL_WAIT: 120s → 300s (with per-cloud override via cloud_install_wait)
- AWS: nano_3_0 → medium_3_0 (all agents need 4GB for reliable installs)
- DigitalOcean: s-1vcpu-512mb-10gb → s-2vcpu-2gb, cap at 3 parallel
- GCP: e2-medium → e2-standard-2
- Hetzner: cap at 5 parallel (primary IP limit)
- Sprite: 300s install wait (slower exec than SSH)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-28 00:25:36 -08:00
Ahmed Abushagur
c595c90dc4
fix(e2e): prevent multi-cloud name and file collisions (#2013)
When multiple clouds run in parallel, they generate the same app name
(e.g. e2e-claude-TIMESTAMP) and write to the same temp files
(.exit/.stdout/.stderr), causing data corruption.

- Include ACTIVE_CLOUD in make_app_name: e2e-gcp-claude-TIMESTAMP
- Use ${app_name} instead of ${agent} for provision temp files

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-28 03:12:05 -05:00
A
87978b424d
refactor: Remove dead code and stale references (#2010)
Dead code removed:
- `cleanup_stale_apps` function in `sh/e2e/lib/cleanup.sh` — defined but
  never called; `e2e.sh` calls `cloud_cleanup_stale` directly instead
- `generateEnvConfig` and `AgentConfig` re-exports from all 7 cloud-specific
  `agents.ts` modules (aws, hetzner, gcp, digitalocean, daytona, local,
  sprite) — nothing imported these from the cloud modules; they were already
  available via `@openrouter/spawn-shared` and `../shared/agents`

All 1435 tests pass, biome lint is clean (0 errors).

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 00:18:20 -05:00
Ahmed Abushagur
627026a26b
feat(e2e): multi-cloud test suite with cloud driver pattern (#2004)
* feat(e2e): multi-cloud test suite with cloud driver pattern

Scale the E2E test suite from AWS-only to all 6 infrastructure clouds
(aws, hetzner, digitalocean, gcp, daytona, sprite) with parallel
execution support.

Architecture:
- Cloud driver pattern: each cloud implements _cloudname_func() functions
- load_cloud_driver() wires cloud-specific functions to generic names
  (cloud_exec, cloud_teardown, etc.)
- Shared orchestration stays in one place, cloud details are isolated

New files:
- sh/e2e/e2e.sh — unified entry point with --cloud flag
- sh/e2e/lib/clouds/{aws,hetzner,digitalocean,gcp,daytona,sprite}.sh

Refactored:
- common.sh — removed AWS constants, added load_cloud_driver()
- provision.sh — cloud-agnostic via cloud_headless_env/cloud_provision_verify
- verify.sh — replaced aws_ssh with cloud_exec/cloud_exec_long
- teardown.sh/cleanup.sh — delegate to cloud driver functions
- aws-e2e.sh — thin wrapper: exec e2e.sh --cloud aws

Usage:
  e2e.sh --cloud aws                     # Single cloud
  e2e.sh --cloud aws --cloud hetzner     # Multiple clouds in parallel
  e2e.sh --cloud all --parallel 3        # All clouds, 3 agents parallel

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(e2e): prevent subshell EXIT trap inheritance and single-cloud early exit

- Reset EXIT trap in multi-cloud subshells to prevent LOG_DIR deletion
  before the main process reads log files
- Use `|| true` for single-cloud run_agents_for_cloud to prevent set -e
  from skipping the summary on env validation failure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: default to parallel agent provisioning in e2e tests

All agents within a cloud now run in parallel by default instead of
sequentially. Use --sequential to restore the old behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: cap sprite parallelism, 4GB for openclaw, remove stderr suppression

- Sprite: add _sprite_max_parallel (cap 2 concurrent agents) to avoid
  CLI rate limiting that caused all 6 agents to fail
- AWS: use medium_3_0 (4GB) bundle for openclaw which needs more RAM
- Input tests: remove 2>/dev/null from agent commands so failures
  produce visible error output instead of empty responses
- Add cloud_max_parallel to driver interface, respected by e2e.sh

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use bash instead of sh for exec_long across all cloud drivers

Ubuntu's /bin/sh is dash, which doesn't support bash-specific PATH
sourcing from .spawnrc/.cargo/env. This caused codex and zeroclaw
input tests to fail with "command not found" even though verify passed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: codex input test uses positional prompt, not -q flag

codex CLI takes prompt as positional arg: `codex "PROMPT"`.
The -q flag doesn't exist, causing "Usage:" error output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use codex exec -q for non-interactive input test

codex requires `exec` subcommand for non-interactive mode.
Plain `codex PROMPT` expects a TTY (stdin is not a terminal).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: codex exec takes no -q flag, just positional prompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use cx23 instead of deprecated cx22 for Hetzner e2e tests

Hetzner deprecated server type cx22 (ID 104). The default now uses cx23.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-27 19:28:08 -08:00
A
edcdf78ba8
fix(e2e): correct env var names passed to AWS CLI in provision.sh (#1985)
The provision.sh was setting wrong env var names that the TypeScript CLI
does not read:
- AWS_LIGHTSAIL_INSTANCE_NAME → LIGHTSAIL_SERVER_NAME (read by aws.ts:getServerName)
- AWS_REGION → AWS_DEFAULT_REGION (read by aws.ts:authenticate/promptRegion)
- AWS_BUNDLE → LIGHTSAIL_BUNDLE (read by aws.ts:promptBundle)

Without the correct names, each provisioning run created an instance with a
random generated name instead of app_name, causing the post-provision
existence check to fail every time.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 01:47:41 -08:00
A
fe7cf1ba4b
refactor: Remove dead code and stale references (#1987)
Remove stale Fly.io references from shared shell scripts. Fly.io was
removed as a cloud provider (#1979) but comments referencing its
specific token format ("FlyV1 <macaroon>") and container behavior
remained in key-request.sh and github-auth.sh.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 04:19:53 -05:00
A
d8131c3df6
fix(hetzner): update deprecated server types to cx23/cpx22 gen (#1983)
* fix(hetzner): update deprecated cx22/cpx21 server types to cx23/cpx22

Hetzner deprecated the entire cx*2 and cpx*1 server lines on Jan 1, 2026.
New orders fail with "server type is deprecated". Updates to the current
gen3 CX and gen2 CPX lines (cx23, cx33, cx43, cx53, cpx22, cpx32).

Also shows the server type picker by default instead of requiring --custom,
so users can choose their instance size on every deploy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(zeroclaw): append autonomy config instead of overwriting onboard output

zeroclaw onboard generates a complete config with required fields like
default_temperature. Our setup was overwriting that with a partial config
missing required fields, causing a crash loop on startup. Now appends
the security/shell settings instead so onboard's fields are preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix biome formatting in agent-setup.ts

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Agent: pr-maintainer

---------

Co-authored-by: spawn-bot <spawn-bot@openrouter.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-02-27 00:20:31 -08:00
A
7d83486077
refactor: Remove dead code and fix stale references in QA sweep (#1978)
- Replace local rec() helper in hetzner.ts with shared toRecord() from
  @openrouter/spawn-shared, eliminating a duplicate implementation that
  already existed in the shared package with equivalent behavior
- Fix stale comment in key-request.sh referencing non-existent qa.sh

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-27 00:10:16 -05:00
A
d04096a15b
feat!: remove Fly.io cloud provider support (#1979)
* feat!: remove Fly.io cloud provider support

Drop Fly.io as a supported cloud provider. Sprite (which uses Fly.io
infrastructure internally) is retained.

- Delete packages/cli/src/fly/ module, sh/fly/ scripts, fixtures/fly/
- Remove fly cloud entry and 6 fly matrix entries from manifest.json
- Remove fly imports, destroy cases, and connection handlers from commands.ts
- Remove fly-ssh sentinel from security.ts
- Port E2E test suite from Fly.io to AWS Lightsail (fly-e2e.sh → aws-e2e.sh)
- Update README (7 clouds, 42 combinations), CLAUDE.md, and skill prompts
- Clean up fly references in build config, gitignore, icon sources
- Bump CLI version to 0.11.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: restore Docker image build under sh/docker/

Move openclaw Dockerfile from sh/fly/docker/ to sh/docker/ and rename
workflow from fly-docker.yml to docker.yml with updated paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix extra blank lines in commands.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: spawn-bot <spawn-bot@openrouter.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-02-27 00:06:32 -05:00
A
f1c2b4d4e4
refactor: Remove dead code and stale references (#1976)
* refactor: remove dead offerGithubAuth exports from cloud agents.ts files

The per-cloud offerGithubAuth re-exports in each cloud's agents.ts were
never called from outside their own module. The actual GitHub auth
orchestration is handled by shared/orchestrate.ts which calls
offerGithubAuth from shared/agent-setup.ts directly.

Also update stale comment in sh/test/fixtures/_shared_agent_assertions.sh
that referenced mock.sh, a test harness file that no longer exists in
the repository.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: collapse multi-line imports to single-line per biome format

After removing offerGithubAuth exports, the remaining 2-import blocks
should be single-line. Also collapse fly/agents.ts 4-import block and
remove trailing blank line.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-02-26 22:04:33 -05:00
A
e2f40b3d8b
docs(digitalocean): document DO_REGION, DO_DROPLET_SIZE, and --custom flag (#1970)
Add Environment Variables section to sh/digitalocean/README.md with:
- DO_REGION: list of all 10 available regions with default (nyc3)
- DO_DROPLET_SIZE: list of all 6 available sizes with default (s-2vcpu-4gb)
- --custom flag: interactive region + size picker

Fixes #1968

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-26 16:35:47 -08:00
A
b6f021ecf2
fix(security): clarify base64+single-quote pattern in fly_ssh (#1937)
Fixes #1933. The comments incorrectly implied base64 encoding alone
prevents injection. Safety relies on the combination of base64 output
(no single quotes in alphabet) + single-quote wrapping. Made this
explicit in all 7 affected comments.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-25 18:44:51 -05:00
A
b09295131a
fix(security): replace eval with pipe-to-sh in fly_ssh functions (#1928)
Eliminates nested quote eval pattern in favor of direct pipe to sh,
removing potential injection surface in fly_ssh and fly_ssh_long.
Fixes #1927

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 14:50:24 -05:00
A
f9c1568f9c
fix(security): use explicit exports in provision.sh subshell (#1926)
Replace inline env-var prefix pattern (VAR=value command) with explicit
export statements inside the subshell. While the inline prefix is
POSIX-compliant and not a real injection vector, explicit exports are
clearer about intent, eliminate the fragile backslash-continuation chain,
and prevent future copy-paste of the pattern into unsafe contexts.

Fixes #1924

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-25 12:54:58 -05:00
A
cac06b2706
fix(security): base64-encode INPUT_TEST_PROMPT in E2E verify.sh to prevent injection (#1923)
Fixes #1921

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 10:26:18 -05:00
A
638250e744
fix(security): use base64 encoding in fly_ssh to prevent command injection (#1918)
Replace single-quote escaping (which only handled ' but not other shell
metacharacters like $(), backticks, ;, ||, &&, |) with base64 encoding.
Base64 output contains only [A-Za-z0-9+/=] characters, completely
eliminating shell metacharacter injection risks regardless of command
content. Compatible with both GNU coreutils (Linux) and BSD (macOS).

Fixes #1912

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-25 03:25:31 -08:00