Commit graph

1735 commits

Author SHA1 Message Date
A
f1ca7808c4
fix(ux): remove duplicate OAuth browser fallback URL message (#2143)
The DigitalOcean OAuth flow printed two near-identical fallback URL
messages: one manually before calling openBrowser(), and one from
openBrowser() itself. Remove the manual one since openBrowser()
already handles the fallback.

Fixes #2140

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 14:50:30 -05:00
A
91960b5e80
fix: exit process when remote session ends (#2148)
After showing post-session messages, the local process now exits cleanly
instead of requiring an extra Ctrl+C. The root cause was that after main()
resolved, lingering event loop handles (from @clack/prompts stdin listeners,
fetch connections, etc.) prevented Node/Bun from exiting naturally.

The fix adds process.exit(0) on successful main() completion, which covers
all session paths (bash script execution via execScript, SSH reconnection
via cmdConnect, and agent re-entry via cmdEnterAgent).

Fixes #2145

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 11:48:43 -08:00
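
The exit behavior described in 91960b5e80 can be sketched roughly as follows. This is an illustration only: `runThenExit` is a hypothetical helper, and the `exit` callback is injected so the sketch can run without terminating the process (the commit itself adds process.exit(0) directly). The failure path is an assumption; the commit message covers only the success path.

```typescript
// Hypothetical helper: run an async entry point, then exit explicitly so that
// lingering event-loop handles (stdin listeners, open connections) cannot keep
// the process alive after the work is finished.
type Exit = (code: number) => void;

async function runThenExit(main: () => Promise<void>, exit: Exit): Promise<void> {
  try {
    await main();
    exit(0); // success: do not wait for the event loop to drain on its own
  } catch (err) {
    // Assumption: a non-zero exit on failure; not described in the commit.
    console.error(err);
    exit(1);
  }
}

// In a real CLI the second argument would be: (code) => process.exit(code)
```

Injecting the exit function also keeps the pattern testable, since a test can record the exit code instead of ending the test runner.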
A
79aa70c390
test: add coverage for untested ui utility functions (#2135)
* test: add coverage for 6 untested pure utility functions in shared/ui.ts

Adds tests for validateServerName, validateRegionName, validateModelId,
toKebabCase, sanitizeTermValue (security-critical), and jsonEscape.
These exported functions previously had zero test coverage.

Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: apply biome formatting to ui-utils test file

Address formatting review feedback: reformats destructuring import
to match project style.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 11:22:06 -08:00
A
83d68c6a37
refactor: Remove dead code and stale references (#2137)
Remove `cleanup_stale_apps()` in `sh/e2e/lib/cleanup.sh` which was dead
code — defined but never called. The E2E orchestrator (`e2e.sh`) invokes
`cloud_cleanup_stale` directly on the active cloud driver; the wrapper
function and its file served no purpose.

Also remove the corresponding `source` call in `e2e.sh`.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-03 11:51:59 -05:00
A
22a06e3237
test: Remove duplicate and theatrical tests (#2136)
- Remove cmdlast "should not call cmdRun when no history exists" test which
  admitted in its own comment that it could not verify its stated intent and
  simply duplicated the assertion from the previous test in the same describe block.

- Fix always-pass risk in manifest-type-contracts: "Interactive prompts
  structure" and "Config files structure" tests iterated over optional agent
  fields with a bare continue when the field was absent, meaning both tests
  would vacuously pass if no agents had those fields. Added guard assertions
  (expect(length).toBeGreaterThan(0)) matching the pattern used by sibling
  tests in the same file.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 11:50:54 -05:00
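
The vacuous-pass risk fixed in 22a06e3237 might look like the sketch below. The `Agent` shape and the `countCheckedConfigFiles` function are hypothetical; only the idea (skip absent optional fields, then assert that at least one entry was really checked) comes from the commit message.

```typescript
// Hypothetical shape; the real manifest types are not shown in the commit.
type Agent = { name: string; configFiles?: string[] };

// Validates the optional field where present and returns how many agents were
// actually checked, so the caller can assert that the loop did real work.
function countCheckedConfigFiles(agents: Agent[]): number {
  let checked = 0;
  for (const agent of agents) {
    if (!agent.configFiles) continue; // optional field: skip when absent
    for (const file of agent.configFiles) {
      if (file.length === 0) throw new Error(`${agent.name}: empty config file path`);
    }
    checked++;
  }
  return checked;
}

const sampleAgents: Agent[] = [
  { name: "a" },
  { name: "b", configFiles: ["~/.config/b.json"] },
];
const checkedCount = countCheckedConfigFiles(sampleAgents);

// Guard: without this, a manifest where no agent has configFiles would "pass".
if (checkedCount === 0) throw new Error("no agent exercised the configFiles check");
```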
A
8de2c17c99
refactor: Remove dead code and stale references (#2132)
* refactor: Remove dead code and stale references

- Remove unused variables and functions in test files:
  - cmdlast.test.ts: remove unused cmdRunMock and consoleOutput function
  - cmdlist-integration.test.ts: remove unused resolveDisplayName import and consoleErrorOutput function
  - cmd-listing-output.test.ts: remove unused getTerminalWidth import
  - commands-update-download.test.ts: remove unused callIndex variable
  - download-and-failure.test.ts: remove unused callCount variable and unused init parameter
  - manifest-cache-lifecycle.test.ts: remove unused m1 variable
  - manifest-integrity.test.ts: fix unused key in for-loop destructuring
  - manifest-type-contracts.test.ts: fix 9 unused loop variables, remove implicit any let,
    replace while-exec loop with matchAll to resolve noAssignInExpressions error
- Fixes biome lint errors from 22 down to 0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: apply biome format to fix CI check

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-03 09:49:53 -05:00
A
6881719b1a
fix(security): pipe base64 via stdin in daytona uploadFile (#2133)
Eliminates b64 interpolation into the remote shell command string,
providing defense-in-depth alongside existing path validation.

Fixes #2130

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 08:32:40 -05:00
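
A rough sketch of the idea behind 6881719b1a: the payload travels on stdin, so the remote command string never contains the data. `shellQuote` and `buildUpload` are hypothetical names, and the use of `base64 -d` on the remote side is an assumption rather than the actual uploadFile implementation.

```typescript
// Wrap a value in single quotes and escape embedded single quotes (POSIX sh).
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical builder: returns the command and the stdin payload separately.
function buildUpload(remotePath: string, content: string): { command: string; stdin: string } {
  return {
    // The command only names the destination; the data never appears in it.
    command: `base64 -d > ${shellQuote(remotePath)}`,
    // btoa handles Latin-1 text; binary-safe encoding is out of scope here.
    stdin: btoa(content),
  };
}

const upload = buildUpload("/tmp/it's.txt", "hello");
```

Whatever executes `upload.command` remotely would write `upload.stdin` to the process's standard input instead of splicing it into the shell string.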
A
99a0f58937
test: fix always-pass in-memory cache test to assert fetch not called again (#2131)
The "should use fresh disk cache without calling fetch" test only checked
toHaveProperty("agents"), which would pass even if fetch was called again.
Renamed to reflect actual behavior (in-memory cache path) and added
assertions: expect(m2).toBe(m1) and fetch call count unchanged.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 08:30:33 -05:00
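
The stronger assertions from 99a0f58937 can be illustrated with a simplified, synchronous cache. `createManifestLoader` is a hypothetical stand-in (the real loader is presumably asynchronous and fetch-based); the point is that object identity plus an unchanged call count proves the cache path was taken.

```typescript
type Manifest = { agents: Record<string, unknown> };

// Hypothetical in-memory cache around a loader function.
function createManifestLoader(fetchManifest: () => Manifest): () => Manifest {
  let cached: Manifest | undefined;
  return () => {
    if (!cached) cached = fetchManifest(); // only the first call loads
    return cached;
  };
}

let fetchCalls = 0;
const loadManifest = createManifestLoader(() => {
  fetchCalls++;
  return { agents: {} };
});

const m1 = loadManifest();
const m2 = loadManifest();

// Checking only that m2 has an "agents" property would pass even if the
// second call had loaded again. These two checks would not.
const servedFromCache = m2 === m1 && fetchCalls === 1;
```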
Ahmed Abushagur
300b330106
fix: address 4 reliability issues across codebase (#2129)
* fix: address 4 reliability issues across codebase

1. sprite.ts: add --force to destroy command (stdin is "ignore" so
   interactive prompts would hang until 60s timeout)

2. verify.sh: replace /dev/tcp port checks with ss -tln as the primary
   check (bash on Debian/Ubuntu is compiled without /dev/tcp support)

3. verify.sh: make _openclaw_restart_gateway a hard failure instead
   of log_warn (matching _openclaw_ensure_gateway behavior)

4. agent-setup.ts: add ss -tln port check + "already running" early
   exit + increase timeout from 120s to 300s (gateway takes ~3min
   to initialize on AWS medium instances)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: biome format - use consistent double quotes in portCheck

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-03 03:18:44 -05:00
A
c9b8ee5997
refactor: Remove dead code and stale references (#2128)
- sprite/sprite.ts: Replace duplicate saveVmConnection implementation
  with a call to the shared saveVmConnection from history.ts. The local
  version duplicated the mkdir + writeFileSync logic already provided by
  the shared function, just with Sprite-specific hardcoded values.
  Remove now-unused writeFileSync, mkdirSync, and getSpawnDir imports.
- Bump CLI version 0.12.5 → 0.12.6 (patch)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 22:05:38 -08:00
A
8bc0a0291b
test: fix always-pass cache test to assert fetch was not called (#2127)
The "should use disk cache when fresh" test in manifest.test.ts set up
a mock fetch with a comment saying it "should not be called" but never
asserted expect(global.fetch).not.toHaveBeenCalled(). The test passed
whether or not the cache was actually used, providing no signal.

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:04:48 -05:00
Ahmed Abushagur
4a90abdaa2
fix(e2e): improve openclaw reliability on AWS and other clouds (#2123)
* fix(e2e): improve openclaw reliability on AWS and other clouds

Three changes to make openclaw e2e tests more robust:

1. Increase PROVISION_TIMEOUT from 480s to 720s — AWS cloud-init
   for "full" tier (Node.js + Bun + build-essential) can exceed 480s,
   causing the CLI to be killed before .spawnrc is written.

2. Add .spawnrc manual fallback in provision.sh — if the CLI is killed
   before writing .spawnrc, construct it via SSH using OPENROUTER_API_KEY
   with agent-specific env vars (openclaw, zeroclaw).

3. Add retry logic to openclaw gateway input test — the gateway can
   crash with 1006 websocket closure on resource-constrained instances.
   Now retries once after killing and restarting the gateway process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): fix command injection in e2e provision scripts

- Use printf %q and temp file for api_key handling in provision.sh to
  prevent shell metachar injection (single quotes, backticks, $)
- Double-quote env_b64 interpolation in cloud_exec call to prevent
  word splitting
- Replace echo with printf in bashrc append to avoid portability issues
- Replace overbroad pkill -f 'openclaw gateway' in verify.sh with
  PID-targeted kill via lsof/fuser on port 18789

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-02 23:19:34 -05:00
A
7b650a0103
test: Remove duplicate and theatrical tests (#2124)
* test: Remove duplicate and theatrical tests

Remove 18 duplicate tests from run-path-credential-display.test.ts
that repeated coverage already provided by dedicated test files:
- "entity validation for run path" (7 tests) duplicated check-entity.test.ts
- "key resolution for run path" (6 tests) duplicated fuzzy-key-matching.test.ts
- "run-path validation sequence integration" (5 tests) duplicated
  check-entity.test.ts, fuzzy-key-matching.test.ts, and script-failure-guidance.test.ts

Replace the three duplicate describe blocks with a focused 2-test
describe("isRetryableExitCode") block that covers the only unique
assertions in that section. Also remove unused spyOn import and
unused mockExit variable.

Bump version 0.12.4 → 0.12.5.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(fmt): collapse import to single line for biome format compliance

Agent: team-lead
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-02 22:06:38 -05:00
A
ffe4cf8c9e
refactor: Remove stale shellcheck disable comment from aws/kilocode.sh (#2125)
The SC2154 (referenced but not assigned) comment was leftover from a
prior version of the script. No such external variable is referenced in
the current implementation, making the suppression comment stale.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 20:40:14 -05:00
Ahmed Abushagur
f7b3bc91c9
fix: show all polling status on single updating line (#2115)
Add logStepInline/logStepDone helpers to ui.ts and convert all 9
polling loops (DO droplet, DO cloud-init, AWS instance, AWS cloud-init,
Hetzner cloud-init, Daytona SSH, Sprite connectivity, GCP startup,
shared SSH port) from multi-line spam to a single line that updates
in place.

Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-02 18:10:27 -05:00
A
4aaf125a2c
fix(reliability): add curl download error handling to AWS and Hetzner shims (#2122)
14 agent shim scripts in sh/aws/ and sh/hetzner/ were missing error
handlers on the curl command that downloads the JS bundle from GitHub
releases. If the download failed (network issue, 404, etc.), the script
would silently proceed to exec an empty/corrupt file via bun, producing
a confusing error instead of a clear "Failed to download" message.

All other clouds (GCP, Daytona, DigitalOcean, Sprite) already had this
error handling pattern. This brings AWS and Hetzner into consistency.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 18:09:25 -05:00
A
37c1881613
fix(security): validate AWS region immediately after reading from env (#2119)
Adds validateRegionName() check immediately wherever awsRegion is
assigned from environment variables, rather than waiting until
createInstance(). Prevents malicious region values from being used
in SigV4 signing and shell commands.

Fixes #2113

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 13:52:20 -08:00
A
277c4236a3
fix(security): replace eval with direct indirection in load_cloud_driver (#2121)
Removes eval-based function creation pattern in e2e/lib/common.sh.
Uses variable indirection (ACTIVE_CLOUD global + wrapper functions)
instead of eval to reduce attack surface.

Fixes #2118

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 16:50:27 -05:00
A
2a23ebcaf2
fix(security): restrict OAuth auth code regex to alphanumeric only (#2116)
Removes underscore and hyphen from the OAuth authorization code
validation regex, restricting it to alphanumeric characters only.
Defense in depth: if the code is ever used in logging or other
contexts, special characters won't create injection opportunities.

Fixes #2114

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 13:49:12 -08:00
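
A minimal sketch of the tightened validation from 2a23ebcaf2. The exact pattern is an assumption: the commit message says only that underscore and hyphen were removed, and the real regex may also enforce length bounds.

```typescript
// Assumed pattern: one or more ASCII letters or digits, nothing else.
const OAUTH_CODE_PATTERN = /^[A-Za-z0-9]+$/;

function isValidAuthCode(code: string): boolean {
  return OAUTH_CODE_PATTERN.test(code);
}
```

With this pattern, `isValidAuthCode("abc123")` is accepted, while values containing `-`, `_`, whitespace, or an empty string are rejected.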
A
7cc21e4111
fix(security): quote timeout var and validate numeric in sprite.sh (#2120)
Fixes unquoted ${timeout} in _sprite_exec_long that could allow
command injection if timeout contained shell metacharacters.
Adds numeric validation before use.

Fixes #2117

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 16:47:39 -05:00
A
97a92f3d4f
fix(digitalocean): throw on non-2xx in doApi() wrapper (#2112)
* fix(digitalocean): throw on non-2xx in doApi() wrapper

Make doApi() throw on non-2xx responses, matching hetznerApi and daytonaApi.
Five of the seven call sites were silently swallowing 401/403/404/422 errors
by destructuring only the response text and ignoring the status.

Agent: code-health
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix biome formatting in doApi() signature

Function signature needed multi-line format to match biome expectations.

Agent: code-health
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 12:47:00 -08:00
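
The throw-on-non-2xx pattern from 97a92f3d4f, sketched under assumptions: `apiText`, `HttpResponse`, and `Fetcher` are hypothetical names, and the fetch function is injected so the sketch does not depend on a particular runtime. It is not the actual doApi() signature.

```typescript
// Minimal response and fetch shapes, kept local to stay runtime-independent.
interface HttpResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
type Fetcher = (url: string) => Promise<HttpResponse>;

// Resolve with the body on 2xx and throw otherwise, so a caller that only
// reads the text cannot mistake an error response for success.
async function apiText(fetcher: Fetcher, url: string): Promise<string> {
  const resp = await fetcher(url);
  const body = await resp.text();
  if (!resp.ok) {
    throw new Error(`API error ${resp.status}: ${body.slice(0, 200)}`);
  }
  return body;
}
```

Callers that previously ignored the status now have to handle the failure in a try/catch, which makes 401/403/404/422 responses visible.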
A
0e145c2e8a
refactor: Remove dead getState() exports from cloud modules (#2108)
Removed `getState()` from hetzner, gcp, daytona, sprite, and digitalocean
modules. These functions were exported but never called from production code
or tests. The aws module retains its `getState()` which is tested in
custom-flag.test.ts to verify region state mutation.

Also bumps the CLI patch version (0.12.2 → 0.12.3), as required by project rules.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 10:58:48 -08:00
Ahmed Abushagur
b35ecdfaff
fix(e2e): drop --timeout flag from openclaw agent command (#2109)
The outer cloud_exec_long already enforces a timeout via
INPUT_TEST_TIMEOUT. The inner --timeout 60 was redundant and could
cause premature kills before the outer timeout expired.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: A <258483684+la14-1@users.noreply.github.com>
2026-03-02 10:57:44 -08:00
A
9c7fd0c7da
fix: add 30s fetch timeout to all cloud API client wrappers (#2110)
* fix: add 30s AbortSignal.timeout to all cloud API fetch wrappers

All four cloud provider API client wrapper functions (lightsailRest,
hetznerApi, doFetch, daytonaApi) were missing fetch timeouts, while
every other fetch call in the codebase already used AbortSignal.timeout.
A stalled TCP connection to any cloud provider would cause the CLI to
hang indefinitely with no user feedback or recovery path.

Agent: team-lead
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: apply biome formatting to fetch timeout changes

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 13:55:34 -05:00
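
One way to attach the 30s timeout described in 9c7fd0c7da is sketched below. `withTimeout` and `API_TIMEOUT_MS` are hypothetical names; `AbortSignal.timeout()` is the standard API the commit refers to.

```typescript
const API_TIMEOUT_MS = 30_000;

// Returns a copy of the request options with a timeout signal attached.
// AbortSignal.timeout() aborts on its own once the interval has elapsed.
// Note: any signal already present in `init` is replaced.
function withTimeout<T extends object>(
  init: T,
  ms: number = API_TIMEOUT_MS,
): T & { signal: AbortSignal } {
  return { ...init, signal: AbortSignal.timeout(ms) };
}

// Usage with fetch (not executed here):
//   const resp = await fetch(url, withTimeout({ method: "GET" }));
```

When the timeout fires, the pending fetch rejects, so a stalled connection surfaces as an error instead of hanging the CLI.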
Ahmed Abushagur
35badf8d1b
fix(e2e): fail hard when OpenClaw gateway doesn't start (#2111)
The gateway startup was silently swallowed with log_warn, masking
real failures. Now tracks whether the port came up and fails the
test with the gateway log contents if it didn't.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-02 13:51:29 -05:00
A
1c35f648b9
test: Remove conditional always-pass guards in manifest-integrity (#2107)
The script content tests had `if (existsSync(scriptPath))` guards
that silently skipped files if they weren't found on disk. Since
the prior test in the same describe block already asserts that all
implemented entries have script files, these guards were dead code
that could mask failures. Remove them so every sampled script is
unconditionally read and validated.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 11:50:12 -05:00
A
e3229ebe65
fix(digitalocean): throw on non-2xx responses to prevent silent destroy failures (#2106)
The destroyServer function only checked for status 204 (success) and
responses containing a `message` field. Any other non-204 response
(e.g., 403 with no message, 500 with HTML body) fell through to the
success path, logging "Droplet destroyed" and returning normally.

This caused `spawn delete` to mark the droplet as deleted in history
while it was still running and incurring charges.

Now all non-204 responses unconditionally throw, matching the pattern
established in Hetzner (#2105) and Daytona (#2102).

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 09:22:25 -05:00
A
c61a9c6085
fix(hetzner): throw on non-2xx responses to prevent silent failures (#2105)
hetznerApi() was returning raw response text on non-2xx final attempts
instead of throwing, identical to the bug fixed in daytonaApi() by PR #2102.

Impact: a 5xx when fetching SSH keys caused createServer() to receive null
from parseJsonObj(), fall back to an empty SSH key list, and provision a
server the user cannot SSH into -- with no error or warning.

Fix matches the pattern from lightsailRest() (AWS) and PR #2102 (Daytona):
throw with the HTTP status code after retries exhaust on any !resp.ok response.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 05:36:42 -08:00
A
23acf62e1a
refactor: Remove dead code and stale references (#2104)
Update stale comments referencing the monolithic commands.ts (replaced
by commands/ directory in #2095). The shim at commands.ts still exists
for backward compat but all internal code paths now live under commands/.

- Fix test file comments: point checkEntity to commands/shared.ts,
  cmdInteractive to commands/interactive.ts, cmdRun paths to commands/run.ts,
  cmdUpdate to commands/update.ts, cmdCloudInfo to commands/info.ts,
  cmdListClear to commands/list.ts
- Fix guidance-data.ts: update stale extraction comment and circular-dep
  notes to reference commands/run.ts instead of commands.ts
- Fix CLAUDE.md file structure diagram: show commands/ directory and
  note commands.ts is a compatibility shim
- Fix packages/cli/README.md: update directory structure diagram and
  "Adding a New Command" guide to reflect the per-module layout

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 08:32:02 -05:00
A
9a88460b1d
fix(daytona): throw on non-2xx responses to prevent silent destroy failures (#2102)
daytonaApi() returned the raw response body on all final attempts regardless
of HTTP status. destroyServer() checked hasApiError() which only matched 4xx
patterns, so persistent 500/502/503 responses were silently treated as
success — users were told "Sandbox destroyed" when billing continued.

Fix: throw on !resp.ok after retries exhaust, consistent with other cloud
modules (aws, gcp). destroyServer() now uses try/catch. testDaytonaToken()
already had try/catch so the hasApiError() check was redundant.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-02 07:17:30 -05:00
A
afa17d09ff
test: remove Bun.spawnSync subprocess calls from ssh-keys tests (#2101)
* test: remove Bun.spawnSync subprocess calls from ssh-keys tests

Replace Bun.spawnSync calls to ssh-keygen in createFakeKeyPair helper
with plain file writes, and mock Bun.spawnSync via spyOn for all tests
that exercise getKeyType, generateSshKey, and getSshFingerprint.

Cuts test runtime from 1212ms to ~47ms (25x speedup) and brings the
test file into compliance with the CLAUDE.md no-subprocess-spawning
policy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: apply biome formatting to ssh-keys test

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-02 04:25:24 -05:00
Ahmed Abushagur
9242d44cbb
fix(e2e): add --force to sprite destroy in teardown (#2100)
Without --force, sprite destroy prompts for confirmation in
non-interactive E2E mode and silently fails ("Ok, come back later!"),
leaving stale instances running indefinitely.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 00:24:43 -08:00
Ahmed Abushagur
297e7ff21c
fix(sprite): add 60s timeout to destroyServer to prevent hanging (#2096)
The sprite destroy command can hang indefinitely when the Sprite API
is unresponsive. Add a 60s timeout using the existing killWithTimeout
utility (same pattern as runSprite).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Co-authored-by: A <258483684+la14-1@users.noreply.github.com>
2026-03-01 22:37:57 -08:00
A
3911b5bc28
refactor: resolve conflicts — merge packages/shared into packages/cli/src/shared (#2092)
Rebased fix/issue-2083 onto main after commands.ts split (PR #2095).
Key resolutions:
- commands.ts: kept HEAD shim (re-exports from ./commands/index.ts)
- package.json: kept PR version 0.12.0 without @openrouter/spawn-shared dep
- Fixed @openrouter/spawn-shared imports in commands/shared.ts, commands/update.ts,
  and __tests__/orchestrate.test.ts that were added after the PR branched

All 1390 tests pass, biome lint clean.

Agent: pr-maintainer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 22:05:41 -08:00
A
dc489fa652
docs: update __tests__/README.md to reflect current test structure (#2098)
The README was referencing commands.test.ts and integration.test.ts which
no longer exist (split into 20+ specialized files), and incorrectly stated
the test runner was vitest (banned — project uses bun:test). Rewrote to
accurately document all 44 test files with their coverage scope.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 00:04:53 -05:00
A
3d5812602c
refactor: convert hermes scripts to thin-wrapper pattern (#2094)
- hetzner/hermes.sh: add thin-shim header comment, blank line after
  _ensure_bun definition, and section comments (Local checkout, Remote)
  to match the canonical pattern used by aws/gcp/sprite/daytona
- digitalocean/hermes.sh: add detailed _run_with_restart comment block
  and inline section comments (Normal exit, SIGTERM, Other failure) to
  match digitalocean/claude.sh

Both scripts now produce identical output to their cloud's reference
script (e.g. aws/hermes.sh, digitalocean/claude.sh) when the agent
name is substituted.

Fixes #2082

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 20:27:59 -08:00
A
aa7584096d
test: centralize @clack/prompts mock in test-helpers.ts (#2090)
* test: centralize @clack/prompts mock in test-helpers.ts

Adds mockClackPrompts() factory to test-helpers.ts, eliminating ~15-line
duplicate mock.module blocks from 19 test files. When @clack/prompts adds
a new export, only one file needs updating instead of 19.

Fixes #2080

Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* style: fix Biome formatting after merge with main

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 20:27:24 -08:00
A
65b872afa3
refactor: split commands.ts into per-command modules (#2095)
* refactor: split commands.ts into per-command modules

commands.ts (3,522 lines) is split into focused modules under
packages/cli/src/commands/:
- shared.ts: helpers, entity resolution, credentials
- interactive.ts: cmdInteractive, cmdAgentInteractive
- run.ts: cmdRun, cmdRunHeadless
- list.ts: cmdList, cmdLast, cmdListClear
- delete.ts: cmdDelete
- info.ts: cmdMatrix, cmdAgents, cmdClouds, cmdAgentInfo, cmdCloudInfo
- update.ts: cmdUpdate
- help.ts: cmdHelp
- pick.ts: cmdPick
- index.ts: barrel re-export

commands.ts is kept as a 2-line compatibility shim so all existing
imports from "./commands.js" continue to work unchanged.

Fixes #2076

Agent: complexity-hunter
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: apply biome formatting

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 23:24:15 -05:00
A
4802852fac
fix: derive agent lists dynamically in usage messages (#2089)
Six of seven cloud main.ts files had hardcoded agent lists that were
stale (missing hermes, added in #2084). Replace all hardcoded lists
with Object.keys(agents).join(", ") so they stay in sync automatically
when new agents are added.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-01 23:21:15 -05:00
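
The replacement described in 4802852fac amounts to deriving the list from the registry's keys. The registry contents and the `usageLine` function below are hypothetical; only the `Object.keys(agents).join(", ")` expression comes from the commit message.

```typescript
// Hypothetical registry; the real `agents` object lives in each cloud module.
const agents: Record<string, { label: string }> = {
  claude: { label: "Claude" },
  codex: { label: "Codex" },
  hermes: { label: "Hermes" },
};

// The list is derived from the keys, so adding an agent updates the message.
function usageLine(): string {
  return `Available agents: ${Object.keys(agents).join(", ")}`;
}
```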
A
29cb0d4c69
test: add unit tests for shared/orchestrate.ts (#2093)
* test: add unit tests for shared/orchestrate.ts

Add 19 focused tests for runOrchestration covering:
- Cloud lifecycle method ordering (authenticate -> provision -> install -> launch)
- API key acquisition and injection into agent.envVars
- process.exit forwarding of interactiveSession exit codes
- Optional hooks: preProvision (non-fatal on error), configure, preLaunch
- Model selection gating via agent.modelPrompt / modelDefault
- Restart loop wrapping for non-local clouds vs raw passthrough for local
- saveLaunchCmd receives unwrapped command
- prepareStdinForHandoff and offerGithubAuth integration

Fixes #2077

Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* style: fix Biome formatting in orchestrate.test.ts

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 23:19:33 -05:00
A
608e8b4bdd
refactor: eliminate 7 identical agents.ts boilerplate files (#2088)
* refactor: eliminate 7 identical agents.ts boilerplate files

Adds createCloudAgents() factory to shared/agent-setup.ts, reducing
each cloud's agents.ts from 16-line copy-paste to a single call.
Net reduction of 49 lines across 9 files.

Fixes #2078

Agent: complexity-hunter
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore: apply biome formatting

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 22:06:57 -05:00
A
b755c6966c
feat: add local/hermes to complete the 7x7 matrix (#2091)
Fixes #2079 — local/hermes was the only remaining missing entry in the
cloud×agent matrix. All 49 entries are now implemented.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 22:04:38 -05:00
Ahmed Abushagur
6dabe88016
fix(qa): add key preflight, retry loop, and failure issue filing for schedule mode (#2087)
- Request missing API keys via key-server in quality mode (was fixtures-only)
- Retry quality cycle up to 3 times before giving up
- File a GitHub issue with log tail when all retries are exhausted

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-01 20:44:59 -05:00
A
2605c9cb83
refactor: Remove dead code and stale references (#2086)
- Add getSpawnCloudConfigPath(cloud) helper to shared/ui.ts, eliminating
  four identical 3-line getConfigPath() functions across hetzner, daytona,
  digitalocean, and aws cloud modules
- Remove duplicate homedir/join imports from hetzner, daytona, digitalocean,
  and aws now that the shared helper centralizes the path construction
- Update commands.ts hasCloudConfigCredentials to use the shared helper
  and drop its stale homedir import
- Bump CLI to 0.11.24 (patch)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-01 20:43:12 -05:00
A
9043143a5a
test: Remove duplicate and theatrical tests (#2085)
Remove 14 redundant tests across two files:

- check-entity.test.ts: Remove 6 individual "valid entities" tests
  (claude, codex, cline as agents; sprite, hetzner, vultr as clouds) that
  are fully covered by the loop-based "all manifest agents/clouds validate
  correctly" describe blocks which exhaustively test all entities.

- check-entity.test.ts: Remove 6 individual "wrong-type detection" tests
  (3 clouds-as-agents, 3 agents-as-clouds) that are covered by the loop
  tests "should reject every agent key when checked as cloud" and
  "should reject every cloud key when checked as agent".

- cloud-init.test.ts: Consolidate 3 NODE_INSTALL_CMD tests into 1.
  "is a non-empty string" is theatrical (tests the constant exists, not
  what it does). Merge with the two content checks into a single test.

Test count: 1385 → 1371.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 20:41:40 -05:00
A
d713f9650f
feat: add hermes agent to 4 clouds, bump install wait to 600s (#2084)
- Add hermes shim scripts for GCP, Hetzner, DigitalOcean, and Daytona
- Update manifest.json matrix entries from "missing" to "implemented"
- Bump default INSTALL_WAIT from 300s to 600s to fix zeroclaw timeout
  on small VMs where Rust compilation takes 8-12 minutes
- Update cloud READMEs with hermes usage docs
- Bump CLI version to 0.11.18

Co-authored-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 19:31:50 -05:00
A
69d1971abf
fix(security): remove space from token validation charset in key-request.sh (#2074)
API tokens never contain spaces; allowing them risks word splitting
in downstream unquoted uses of these env vars. Updated both the shell
regex in key-request.sh and the corresponding TypeScript regexes in
digitalocean.ts to stay in sync.

Fixes #2072

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 17:10:22 -05:00
A
bb4deaf24c
fix: reset stale cache flag, guard gcloud null, validate DO config (#2073)
- manifest.ts: Reset _staleCache on successful fetch/cache load so
  isStaleCache() doesn't falsely report stale data after reconnecting
- gcp.ts: Replace getGcloudCmd()! with requireGcloudCmd() that throws
  a descriptive error instead of crashing with null dereference
- digitalocean.ts: Replace unvalidated JSON.parse return with
  parseJsonObj() + isString()/isNumber() guards for type safety

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 17:08:38 -05:00
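
The validated-parse approach from bb4deaf24c could look like the sketch below. The helper names match those in the commit message, but these are local stand-ins whose signatures are assumptions, and the `DoConfig` shape is purely illustrative.

```typescript
function isString(v: unknown): v is string {
  return typeof v === "string";
}

function isNumber(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v);
}

// Stand-in: returns a plain object, or null for invalid JSON and non-objects.
function parseJsonObj(text: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(text);
    if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
    return value as Record<string, unknown>;
  } catch {
    return null;
  }
}

// Hypothetical config shape, for illustration only.
type DoConfig = { token: string; expiresAt: number };

function loadConfig(text: string): DoConfig | null {
  const obj = parseJsonObj(text);
  if (!obj) return null;
  const { token, expiresAt } = obj;
  // Narrow each field before use instead of trusting the parsed value.
  if (!isString(token) || !isNumber(expiresAt)) return null;
  return { token, expiresAt };
}
```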
A
b066b3a1ac
test: Remove duplicate and theatrical tests (#2070)
* test: Remove duplicate and theatrical tests

Remove 4 duplicate tests spread across security and command resolution test files:

- security-edge-cases.test.ts: Remove "should accept prompts with dollar signs in
  safe contexts" (duplicate of security.test.ts "should accept dollar signs in
  non-expansion contexts")
- security-edge-cases.test.ts: Remove "should accept prompts with pipe to non-shell
  commands" (duplicate of security.test.ts "should accept prompts with pipes to
  other commands")
- security-edge-cases.test.ts: Remove "should accept prompts with semicolons not
  followed by rm" (duplicate of security-encoding.test.ts "should accept semicolons
  not followed by rm")
- commands-swap-resolve.test.ts: Remove "should not log resolution for already-
  lowercase exact keys" (duplicate of commands-resolve-run.test.ts "should not log
  resolution when exact keys are used" — identical cmdRun("claude", "sprite") call)

No functional behavior changes. Test count: 1389 → 1385.

* fix: remove trailing blank line for biome format

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-01 15:53:27 -05:00
A
dbec3e768c
fix(security): add restrictive file permissions to Sprite saveVmConnection (#2068)
The Sprite saveVmConnection() wrote ~/.spawn/last-connection.json without
restrictive permissions (falling back to the umask-governed defaults, typically
0o644 for the file and 0o755 for the directory), unlike the shared
saveVmConnection() in history.ts, which correctly uses mode 0o700 for the
directory and 0o600 for the file. On multi-user systems this could expose
server names and connection metadata to other users.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 15:44:15 -05:00
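
A sketch of the restrictive-permission write described in dbec3e768c, using Node's fs module. `saveConnection` is a hypothetical function that writes under a temporary directory here; the real code targets ~/.spawn and lives in history.ts.

```typescript
import { mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: owner-only directory (0o700) and file (0o600).
// The modes apply at creation time; existing paths keep their current modes.
function saveConnection(baseDir: string, data: object): string {
  const dir = join(baseDir, ".spawn");
  mkdirSync(dir, { recursive: true, mode: 0o700 });
  const file = join(dir, "last-connection.json");
  writeFileSync(file, JSON.stringify(data), { mode: 0o600 });
  return file;
}

// Demonstration against a throwaway directory instead of the home directory.
const demoBase = mkdtempSync(join(tmpdir(), "spawn-demo-"));
const savedFile = saveConnection(demoBase, { server: "example" });
const fileMode = statSync(savedFile).mode & 0o777;
```

On POSIX systems `fileMode` should come out as 0o600 under a typical umask; Windows reports permission bits differently.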