532 commits

Author SHA1 Message Date
A
59dea5fc09
refactor: remove dead code and stale references (#2908)
- remove `export` from `LocalTarball` interface in `shared/agent-tarball.ts`
  — the type is only used internally as the return type of `downloadTarballLocally`;
  it was never imported from outside the module.

- remove `getTerminalWidth` re-export from `commands/index.ts`
  — `getTerminalWidth` is only called inside `commands/info.ts` itself;
  it was re-exported through the barrel but never imported from there by any consumer or test.

bump CLI version patch: 0.25.18 → 0.25.19

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 19:51:41 +07:00
A
f296544c1c
fix(cli): bump version to 0.25.18 for security fix in #2904 (#2906)
Commit 97b6424 (fix(security): add cmd validation to Sprite
runSprite() and runSpriteSilent()) changed production CLI code without
a corresponding version bump. The CLI has auto-update — without this
bump users won't receive the null-byte injection guard.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-23 18:50:00 +07:00
A
97b6424ebe
fix(security): add cmd validation to Sprite runSprite() and runSpriteSilent() (#2904)
Mirrors the guard already in interactiveSession() and all other clouds.
Null bytes in cmd could truncate commands at the C level.

Fixes #2903

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 17:30:25 +07:00
A
5392ff2d7a
fix: detect and recover from Hetzner primary_ip_limit exceeded error (#2905)
When parallel E2E runs exhaust Hetzner's Primary IP quota, the CLI now
detects the `resource_limit_exceeded` / `primary_ip_limit` error, automatically
cleans up orphaned Primary IPs (unattached to any server), and retries once.
If cleanup doesn't free quota, a clear message guides users to delete stale
resources or request a quota increase.
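
The detect, clean up, retry-once flow described above could be sketched roughly as follows. All function names are hypothetical; only the two error identifiers come from this message, and treating "zero IPs freed" as the point to stop is an assumption.

```typescript
// Hypothetical names throughout; only the two error identifiers are from the commit message.
function isPrimaryIpLimitError(message: string): boolean {
  return message.includes("resource_limit_exceeded") || message.includes("primary_ip_limit");
}

async function createWithIpCleanup<T>(
  create: () => Promise<T>,
  cleanupOrphanedIps: () => Promise<number>, // resolves to the number of IPs freed
): Promise<T> {
  try {
    return await create();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (!isPrimaryIpLimitError(message)) throw err; // unrelated failure: do not retry

    const freed = await cleanupOrphanedIps();
    if (freed === 0) {
      // Assumption: nothing freed means a retry cannot succeed.
      throw new Error(
        "Primary IP quota exhausted: delete stale resources or request a quota increase",
      );
    }
    return await create(); // single retry; a second failure propagates to the caller
  }
}
```

Injecting `create` and `cleanupOrphanedIps` keeps the control flow testable without real API calls.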

Fixes #2902

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 17:26:32 +07:00
A
d2f11bbf06
test: remove duplicate and theatrical tests (#2901)
cmd-pick-cov.test.ts: remove 8 theatrical flag-parsing tests that all hit
the same early-exit code path (no stdin options → exit 1). Each test
passed a different flag combination but all verified only that exit(1) was
thrown — no flag-specific behavior was actually exercised. Keep the one
meaningful test: "exits with error when no options provided".

ssh-cov.test.ts: consolidate 5 single-assertion constant-check tests into
2 tests (one per constant). All 5 previously tested string membership in
SSH_BASE_OPTS / SSH_INTERACTIVE_OPTS in separate it() blocks.

Before: 1868 tests, 4454 expect() calls
After:  1857 tests, 4446 expect() calls (-11 tests, -8 expects)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 16:28:30 +07:00
A
7aba20e327
fix(ux): deduplicate install messages, add newlines to SSH polling, clarify completion messages (#2900)
- Suppress stdout+stderr from `claude install --force` to prevent duplicate
  "successfully installed" messages (was printed up to 4x)
- Make logStepInline fall back to newline-separated output when stderr is not
  a TTY, so SSH port polling status is readable in piped/captured contexts
- Consolidate post-install completion messages into a single clear milestone:
  "Agent setup complete -- {agent} is ready on {cloud}"
- Bump CLI version to 0.25.16
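
A minimal sketch of the TTY fallback from the second bullet, assuming a pure helper that returns the string to write. `renderInlineStep` is a hypothetical name, and the exact escape sequence used by the real `logStepInline` is not stated in this message.

```typescript
// Hypothetical pure helper: returns the exact string a logger would write to stderr.
function renderInlineStep(message: string, isTty: boolean): string {
  if (isTty) {
    // Carriage return plus "erase line" rewrites the current terminal line in place.
    return `\r\x1b[2K${message}`;
  }
  // Pipes and log files cannot rewind, so emit one complete line per update.
  return `${message}\n`;
}
```

A caller would typically pass something like `process.stderr.isTTY === true` as the second argument.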

Fixes #2899

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 15:26:34 +07:00
A
e7e3b327a1
test: remove duplicate saveSpawnRecord describe block (#2896)
The saveSpawnRecord tests in history-trimming.test.ts duplicated the
describe block already in history.test.ts. Moved the two unique test
cases ("no cap" 200-record retention and "assign id when missing") into
history.test.ts and removed the duplicate block from history-trimming.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-23 12:14:49 +07:00
A
f1f2667cb0
fix: skip interactive session in headless mode (#2895)
* fix: skip interactive session in headless mode (#2892)

When SPAWN_HEADLESS=1, the orchestrator now exits with code 0 after
provisioning completes instead of attempting to launch the agent
interactively. This fixes Claude Code (and other agents) failing with
"Input must be provided through stdin or --prompt" when spawned via
`--headless --output json` without a prompt.

The VM is fully provisioned and ready — callers can SSH in or use
`spawn connect` to start the agent manually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: clean up SPAWN_HEADLESS env in test afterEach to prevent leaks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-22 21:38:53 -07:00
A
b0593952df
fix(security): validate cmd parameter in sprite interactiveSession (#2888)
Add empty-string and null-byte validation to sprite's interactiveSession,
matching the guards already present in aws, hetzner, digitalocean, and gcp.
Without this check, a raw cmd string is passed directly to bash -c.
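
The kind of guard described here might look like the sketch below. `assertValidCmd` and its error messages are hypothetical; the real check may differ in name and wording.

```typescript
// Hypothetical helper; the real guard in the CLI may differ in name and messages.
function assertValidCmd(cmd: string): void {
  if (cmd.length === 0) {
    throw new Error("cmd must not be empty");
  }
  // A NUL byte terminates C strings, so anything after it could be dropped
  // silently once the command crosses the process boundary.
  if (cmd.includes("\0")) {
    throw new Error("cmd must not contain null bytes");
  }
}

// Small helper to show the outcome without a test framework.
function didThrow(fn: () => void): boolean {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

console.log(didThrow(() => assertValidCmd("ls -la"))); // false
console.log(didThrow(() => assertValidCmd(""))); // true
```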

Fixes #2881

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:53:28 -07:00
A
da07fd4031
fix(security): prevent command injection in sprite uploadFile (#2889)
Replace shell string interpolation with array-based exec arguments in
uploadFileSprite. Previously, remotePath and tempRemote were interpolated
into a bash -c string (`mkdir -p $(dirname '${normalizedRemote}') && mv
'${tempRemote}' '${normalizedRemote}'`), which is inherently unsafe
even with regex validation.

Now uses two separate sprite exec calls with paths passed as discrete
array arguments after `--`, and computes dirname in TypeScript using
node:path/posix instead of shell command substitution. Also fixes the
mockBunSpawn test helper to return fresh ReadableStream instances per
call, preventing "ReadableStream already used" errors.
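
As an illustration of the array-based approach, the sketch below builds the two remote argument vectors and computes the parent directory locally. `buildMoveArgv` is a hypothetical helper; how the arrays are passed to `sprite exec` after `--` is intentionally left out.

```typescript
import { dirname } from "node:path/posix";

// Hypothetical helper: returns the argv for each remote step as discrete strings,
// so no path is ever spliced into a shell command line.
function buildMoveArgv(
  tempRemote: string,
  remotePath: string,
): { mkdir: string[]; mv: string[] } {
  const parentDir = dirname(remotePath); // computed locally, no $(dirname ...) on the remote
  return {
    mkdir: ["mkdir", "-p", parentDir],
    mv: ["mv", tempRemote, remotePath],
  };
}

// Shell metacharacters stay inert because each path is a single argv entry.
const hostilePath = "/home/user/it's; rm -rf ~/file.txt";
const plan = buildMoveArgv("/tmp/upload-123", hostilePath);
console.log(plan.mkdir[2]); // /home/user/it's; rm -rf ~
```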

Fixes #2880

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:51:51 -07:00
A
0224b56a4d
fix(digitalocean): detect droplet limit before creation, clear error on 422 (#2891)
checkAccountStatus() now queries the account's droplet_limit and
current droplet count. When at capacity it warns interactively and
throws immediately in headless/E2E mode with a clear message instead
of attempting creation and getting a cryptic 422.

Also adds specific detection of droplet limit 422 errors in
createServer() with actionable guidance (limit increase URL).
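
A rough sketch of the pre-creation capacity check. The `DropletQuota` shape and `checkDropletCapacity` are hypothetical; only `droplet_limit` is named in this message.

```typescript
// Hypothetical shape: only `droplet_limit` is named in the commit message.
interface DropletQuota {
  droplet_limit: number;
  dropletCount: number;
}

// Returns a warning for interactive use, null when there is room,
// and throws in headless mode so the run fails before any create call.
function checkDropletCapacity(quota: DropletQuota, headless: boolean): string | null {
  if (quota.dropletCount < quota.droplet_limit) return null;
  const message = `Droplet limit reached (${quota.dropletCount}/${quota.droplet_limit})`;
  if (headless) throw new Error(message); // fail fast instead of hitting a 422 later
  return message;
}
```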

Bump CLI to 0.25.14.

Fixes #2865

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 18:49:17 -07:00
A
83cd6bc6df
test: remove duplicate generateCodeVerifier/generateCodeChallenge tests from oauth-cov (#2885)
These two describe blocks in oauth-cov.test.ts were redundant subsets of the more
comprehensive coverage already in oauth-pkce.test.ts (which includes RFC 7636 test
vectors, uniqueness checks, padding validation, and base64url character checks).

Duplicates found: 1 function pair (generateCodeVerifier + generateCodeChallenge)
Tests removed: 2
Tests rewritten: 0

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 08:43:14 +07:00
A
054a740e5a
refactor: remove stale Packer comment in hetzner.ts (#2878)
The reference to "Hetzner Packer" was removed in #2869.
Updated the comment to accurately describe the snapshot naming convention.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-23 04:14:00 +07:00
A
76afe9546b
test: add missing assertions to no-op smoke tests (#2879)
19 tests across 7 files were calling functions with no expect() calls —
they verified "does not throw" implicitly but provided zero signal on
side effects or return values.

Added assertions to each:
- agent-setup-cov: expect runServer called after graceful failure
- auto-update: expect runServer called on non-fatal SSH error
- aws-cov: assert state.awsRegion set by promptRegion env var paths,
  spawnSync call counts for ensureAwsCli, fetch called for destroyServer
- do-cov: assert SPAWN_NAME_KEBAB preserved on early return,
  fetch NOT called when no token in checkAccountStatus
- gcp-cov: assert spy call counts for authenticate, destroyInstance,
  ensureGcloudCli; spawnSync NOT called when GCP_PROJECT env set;
  fetch NOT called when no project in checkBillingEnabled
- hetzner-cov: assert fetch called for ensureHcloudToken validation
  and for destroyServer REST calls
- ssh-cov: assert connectSpy and bunSpawnSpy called in waitForSsh

All 1925 tests pass. expect() calls increased from 4555 to 4575.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 04:12:18 +07:00
Ahmed Abushagur
baf03ce47b
fix: prevent sprite idle shutdown during agent install (#2874)
The sprite was going idle and shutting down during long npm install
operations because the remote keep-alive script wasn't installed yet
and sprite exec alone doesn't count as activity.

- Add local keep-alive that pings the sprite's public URL every 30s
  from the client machine during provisioning and agent install
- Stop it when the interactive session starts (remote script takes over)
- Add i/o timeout to spriteRetry's transient error regex so connection
  timeouts are retried instead of failing immediately
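
The two mechanisms above could be sketched like this. The helper names are hypothetical, the `ping` callback stands in for the HTTP request to the sprite's public URL, and every regex alternative other than `i/o timeout` is an illustrative assumption.

```typescript
// Hypothetical helper names; only the 30s interval and "i/o timeout" are from the commit message.
const KEEP_ALIVE_INTERVAL_MS = 30_000;

// Starts a periodic ping and returns a function that stops it.
function startKeepAlive(
  ping: () => void,
  intervalMs: number = KEEP_ALIVE_INTERVAL_MS,
): () => void {
  const timer = setInterval(ping, intervalMs);
  return () => clearInterval(timer); // call once the interactive session starts
}

// The alternatives besides "i/o timeout" are assumptions for illustration.
const TRANSIENT_ERROR = /i\/o timeout|connection reset|connection refused/i;

function isTransientError(message: string): boolean {
  return TRANSIENT_ERROR.test(message);
}
```

Returning the stop function keeps the timer private, so the caller cannot forget which handle to clear when the remote keep-alive script takes over.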

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 02:13:07 +07:00
A
87f49eba48
test: remove duplicate and theatrical tests (#2873)
Remove 7 redundant tests that test the same code paths as existing tests:

- history.test.ts: consolidate 4 separate "unrecognized JSON value" tests
  (non-array object, JSON string, null, number) into one data-driven test.
  All 4 hit the identical parseHistoryData "Unrecognized format" branch.

- cmd-link-cov.test.ts: remove "exits with error when no IP provided" —
  duplicate of the same test in cmd-link.test.ts with identical behavior.

- update-check-cov.test.ts: remove "skips in test environment" and "skips
  when SPAWN_NO_UPDATE_CHECK=1" — both already covered in update-check.test.ts.

- orchestrate-cov.test.ts: remove "calls preLaunch when defined" — identical
  to the same test in orchestrate.test.ts (same mock setup, same assertion).
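
The history.test.ts consolidation above follows the usual data-driven pattern, sketched here without a test framework. `parseHistoryShape` is a stand-in that models only the unrecognized-format branch; it is not the real `parseHistoryData`.

```typescript
// Stand-in for the real parser: only the "Unrecognized format" branch is modeled.
function parseHistoryShape(value: unknown): "array" | "unrecognized" {
  return Array.isArray(value) ? "array" : "unrecognized";
}

// One table of inputs replaces four near-identical test bodies.
const unrecognizedCases: Array<{ label: string; value: unknown }> = [
  { label: "non-array object", value: { records: 1 } },
  { label: "JSON string", value: "hello" },
  { label: "null", value: null },
  { label: "number", value: 42 },
];

// Returns the labels of any cases that did NOT hit the expected branch.
function failingLabels(): string[] {
  return unrecognizedCases
    .filter((testCase) => parseHistoryShape(testCase.value) !== "unrecognized")
    .map((testCase) => testCase.label);
}

console.log(failingLabels().length); // 0
```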

All 1866 remaining tests pass. Lint clean.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 20:22:47 +07:00
A
c25594cf09
test: Remove duplicate killWithTimeout tests (#2870)
* test: remove duplicate and theatrical tests

- cmd-fix-cov.test.ts: remove 6 duplicate fixSpawn tests already covered
  in cmd-fix.test.ts; keep only the unique success message assertion
- icon-integrity.test.ts: consolidate 54 per-entity it() blocks into 4
  data-driven tests (same 67 expect() calls, 50 fewer test cases)
- manifest-type-contracts.test.ts: consolidate per-field for-loop it()
  blocks into 3 grouped tests (same 662 expect() calls, 15 fewer cases)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: remove duplicate killWithTimeout tests from ssh-cov.test.ts

The `killWithTimeout additional` describe block in ssh-cov.test.ts
duplicated scenarios already covered in kill-with-timeout.test.ts:
- "sends SIGTERM then SIGKILL" == kill-with-timeout's SIGKILL grace test
- "does nothing when first kill throws" == kill-with-timeout's SIGTERM throw test

Removed the 2 duplicate tests from ssh-cov.test.ts. The dedicated
kill-with-timeout.test.ts file is the canonical location for
killWithTimeout coverage.

---------

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-22 16:47:59 +07:00
A
cc8b6601ec
refactor: remove stale references and add missing entries to test README (#2871)
- remove stale reference to `commands-update-download.test.ts` (renamed to `cmd-update-cov.test.ts`)
- remove stale reference to `picker.test.ts` (renamed to `picker-cov.test.ts`)
- add 25 missing `-cov.test.ts` files that exist on disk but were undocumented

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-22 15:47:58 +07:00
A
7e56e1839b
test: remove duplicate and theatrical tests (#2868)
- cmd-fix-cov.test.ts: remove 6 duplicate fixSpawn tests already covered
  in cmd-fix.test.ts; keep only the unique success message assertion
- icon-integrity.test.ts: consolidate 54 per-entity it() blocks into 4
  data-driven tests (same 67 expect() calls, 50 fewer test cases)
- manifest-type-contracts.test.ts: consolidate per-field for-loop it()
  blocks into 3 grouped tests (same 662 expect() calls, 15 fewer cases)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 12:06:55 +07:00
A
c1363b138c
feat(gcp): default boot disk to 40 GB, configurable via GCP_DISK_SIZE (#2867)
GCP's default 10 GB boot disk is insufficient for coding agents — node_modules,
apt packages, and build caches easily exceed it. Default to 40 GB and allow
override via GCP_DISK_SIZE env var.
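
A minimal sketch of the override logic. `resolveGcpDiskSizeGb` is a hypothetical helper, and falling back to the default on invalid input is an assumption rather than confirmed behavior.

```typescript
const DEFAULT_GCP_DISK_SIZE_GB = 40;

// Hypothetical helper; falling back on invalid input is an assumption.
function resolveGcpDiskSizeGb(env: Record<string, string | undefined>): number {
  const raw = env["GCP_DISK_SIZE"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_GCP_DISK_SIZE_GB;
  const parsed = Number(raw);
  // Accept only positive whole numbers of gigabytes.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_GCP_DISK_SIZE_GB;
}

console.log(resolveGcpDiskSizeGb({})); // 40
console.log(resolveGcpDiskSizeGb({ GCP_DISK_SIZE: "100" })); // 100
```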

Closes #2866

Co-authored-by: Claude <claude@anthropic.com>
2026-03-22 11:21:05 +07:00
A
92f2de4036
test: remove theatrical tests — replace no-op assertions with real signal (#2863)
preflight-credentials.test.ts: all 7 tests had zero expect() calls with
comments like "// No crash = pass". Rewrote to capture logWarn mock calls
from mockClackPrompts() and assert on warning presence and credential names.

sprite-cov.test.ts: 13 out of 23 tests had no expect/rejects calls (just
called functions and discarded results). Added assertions on Bun.spawn call
counts to verify: authenticated paths skip login, unauthenticated paths
trigger login, createSprite reuses vs creates based on list output,
verifySpriteConnectivity calls sprite twice, setupShellEnvironment runs
multiple exec commands.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 08:38:39 +07:00
A
300e2fc221
fix(security): shellQuote cmd in runServer() across all cloud providers (#2862)
Defense-in-depth: explicitly shellQuote(cmd) inside runServer() so the
cmd parameter is always protected by single-quote escaping, regardless
of how the surrounding command string is constructed.

Previously, cmd was interpolated raw into fullCmd before the outer
shellQuote() wrapper. While the outer wrapper did protect it, this
made the safety non-obvious and fragile against future refactors.
The new pattern matches interactiveSession() where cmd gets its own
shellQuote() call.
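
To illustrate the two quoting layers, the sketch below uses the common POSIX single-quote escape. It is assumed, not confirmed, that the CLI's `shellQuote()` behaves this way, and `buildRemoteCommand` with its `cd ~ &&` prefix is purely hypothetical.

```typescript
// Common POSIX single-quote escaping: close the quote, emit \', reopen the quote.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, "'\\''")}'`;
}

// Hypothetical builder: cmd gets its own quoting before the outer wrapper is applied,
// so its safety does not depend on how fullCmd is assembled.
function buildRemoteCommand(cmd: string): string {
  const fullCmd = `cd ~ && bash -c ${shellQuote(cmd)}`;
  return `bash -c ${shellQuote(fullCmd)}`;
}

console.log(shellQuote("it's")); // 'it'\''s'
```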

Fixes #2859

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 14:48:37 -07:00
A
3f12cb9ee8
refactor: remove duplicate docker constants into shared orchestrate module (#2860)
Consolidate DOCKER_CONTAINER_NAME and DOCKER_REGISTRY constants from
gcp/main.ts and hetzner/main.ts into shared/orchestrate.ts. Both files
defined identical values ("spawn-agent" and "ghcr.io/openrouterteam"); they
now import the shared exports instead.

Bumps CLI patch version to 0.25.11.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-21 14:27:21 -07:00
A
d480a7fec4
test: remove duplicate and theatrical tests (#2861)
- manifest.test.ts: remove 4 duplicate loadManifest error/fallback tests
  (HTTP 500 stale-cache, no-cache-HTTP500-throws, invalid-manifest-throws,
  network-error-throws) — all covered more thoroughly by
  manifest-cache-lifecycle.test.ts

- ssh-keys.test.ts: remove 2-key sorting test superseded by ssh-keys-cov.test.ts
  which validates the full 3-way sort order (ED25519 > RSA > ECDSA)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:43:47 +07:00
A
7ab6c693d3
fix: add --beta docker to help output and update description (#2857)
The --beta docker feature (PR #2854) was missing from `spawn help`
output, and its error description said "Hetzner" only but it also
works on GCP.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-21 06:20:35 -07:00
A
2f329684e0
test: remove duplicate and theatrical tests (#2858)
- aws-cov.test.ts: remove aws/BUNDLES (3 tests) and aws/credential-persistence
  (6 tests) — all scenarios already covered by aws.test.ts with stronger
  assertions (>= 5 tiers vs >= 3, pricing format, naming convention, etc.)

- cmd-run-cov.test.ts: remove "cmdRun dry run" and "cmdRun validation" (3 tests)
  — dry-run is covered more thoroughly in cmdrun-happy-path.test.ts;
  validation tests duplicate commands-error-paths.test.ts exactly

- agent-setup-cov.test.ts: remove "agents return non-empty launch commands"
  (weaker duplicate of "all agents have launchCmd") and "agents have configure
  functions" (no expect() calls — theatrical)

Total: 5 tests removed, 162 lines deleted, 0 regressions (1951 pass)

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 19:49:27 +07:00
Ahmed Abushagur
6d2c4746f5
feat: add --beta docker for Hetzner Docker CE app image (#2854)
* feat: add --beta docker for Hetzner Docker CE app image

Uses Hetzner's pre-built docker-ce app image when --beta docker
(or --fast) is active, giving faster boot times similar to DO
marketplace images. Snapshots still take priority when available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: pull and run pre-built agent Docker images on Hetzner

When --beta docker (or --fast) is active, boots Hetzner with docker-ce
app image, then pulls ghcr.io/openrouterteam/spawn-{agent}:latest and
runs it. All runServer commands are routed through docker exec into
the container, and the interactive session uses docker exec -it.
Skips agent install since the agent is pre-baked in the image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add --beta docker support for GCP with Container-Optimized OS

When --beta docker (or --fast) is active on GCP, uses cos-stable
from cos-cloud (Docker pre-installed, read-only OS). Skips cloud-init
startup script (incompatible with COS), pulls the pre-built agent
image from ghcr.io, and routes all commands through docker exec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: correct import path for logInfo/logStep (shared/log.js -> shared/ui.js)

The log.js module does not exist; these functions are exported from ui.ts.
Also merge duplicate ui.js imports per biome organizeImports.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-21 17:10:19 +07:00
A
bfe9fb9808
test: remove duplicate and theatrical tests (#2856)
- Replace 10x `expect(true).toBe(true)` in update-check-cov.test.ts with
  meaningful assertions: skip-condition tests now verify fetch was NOT called,
  fetch-failure tests use `resolves.toBeUndefined()`, backoff edge-case tests
  verify fetch WAS called (proving the skip was bypassed)
- Remove theatrical executor existence check (`typeof executor.execFileSync === "function"`)
  that proved nothing about behavior
- Replace structural `typeof agent.install/envVars/launchCmd === "function"` checks in
  agent-setup-cov.test.ts with assertion that agent names are non-empty strings;
  the downstream tests already prove the functions work by calling them

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-21 15:48:44 +07:00
Ahmed Abushagur
8c7a381375
fix: auto-reconnect on Sprite connection drops (#2855)
Sprite CLI exits with code 1 on "connection closed" (not 255 like SSH).
The reconnect loop now treats exit code 1 on Sprite as a connection
drop, retrying up to 5 times with a 3s delay between attempts.
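
The reconnect rule might be sketched as below. Names are hypothetical; the exit codes, the limit of 5 retries, and the 3s delay come from this message. Whether exit code 255 is also treated as a drop on Sprite is an assumption.

```typescript
// Hypothetical names; the numbers are taken from the commit message.
const MAX_RECONNECTS = 5;
const RECONNECT_DELAY_MS = 3_000;

function isConnectionDrop(cloud: string, exitCode: number): boolean {
  if (exitCode === 255) return true; // SSH's conventional connection-failure code
  return cloud === "sprite" && exitCode === 1; // Sprite CLI on "connection closed"
}

// runSession and sleep are injected so the loop can be exercised without real I/O.
async function runWithReconnect(
  cloud: string,
  runSession: () => Promise<number>,
  sleep: (ms: number) => Promise<void>,
): Promise<number> {
  let exitCode = await runSession();
  for (let retry = 0; retry < MAX_RECONNECTS && isConnectionDrop(cloud, exitCode); retry++) {
    await sleep(RECONNECT_DELAY_MS);
    exitCode = await runSession();
  }
  return exitCode;
}
```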

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:13:14 +07:00
A
a3e0dbd4dd
test: remove duplicate and theatrical tests (#2853)
- Remove `digitalocean/findSpawnSnapshot` describe from do-cov.test.ts
  (3 basic tests) — fully superseded by do-snapshot.test.ts (7 thorough
  tests covering name filtering, invalid IDs, network failure, etc.)

- Remove `setupAutoUpdate` describe from agent-setup-cov.test.ts
  (2 shallow tests checking only "systemd" string presence) — fully
  superseded by auto-update.test.ts which verifies exact systemd unit
  content, base64-encoded scripts, timer schedules, and error handling

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 12:24:00 +07:00
Ahmed Abushagur
26332afa56
fix: prevent silent exit in --fast mode on Sprite (#2852)
In fast mode, Promise.allSettled runs server boot, OAuth, and tarball
download concurrently. When all operations complete — especially after
Bun.serve.stop(true) in the OAuth flow removes its event loop handle —
the event loop can appear empty before the await continuation starts
new I/O operations. This causes Bun to exit silently with code 0,
dropping the user back to their shell after "Successfully obtained
OpenRouter API key via OAuth!" with no error.

Fix: keep a dummy setInterval handle alive during the fast-mode
concurrent section so the event loop never drains prematurely.
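
A minimal sketch of that fix, assuming a small wrapper around the concurrent section. `withEventLoopHandle` and `fastModeSketch` are hypothetical names, and the three promises are placeholders for server boot, OAuth, and tarball download.

```typescript
// Hypothetical wrapper: holds a timer so the runtime always sees a pending handle.
async function withEventLoopHandle<T>(work: () => Promise<T>): Promise<T> {
  const keepAlive = setInterval(() => {}, 60_000); // no-op; exists only as a handle
  try {
    return await work();
  } finally {
    clearInterval(keepAlive); // release it so the process can exit normally later
  }
}

// Placeholder tasks standing in for server boot, OAuth, and tarball download.
async function fastModeSketch(): Promise<string[]> {
  const results = await withEventLoopHandle(() =>
    Promise.allSettled([
      Promise.resolve("server booted"),
      Promise.resolve("oauth done"),
      Promise.reject(new Error("tarball failed")),
    ]),
  );
  return results.map((result) => result.status);
}
```

Clearing the interval in `finally` matters: a handle left behind would keep the process alive after the work is done.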

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:51:02 -07:00
A
acfc31027b
test: delete theatrical unicode-cov.test.ts (#2848)
Fixes #2847

Removes 273 lines of false-confidence tests that copy-paste
shouldForceAscii() logic inline 9x with zero imports from
unicode-detect.ts. Every test passed even if the real source
was deleted — a theatrical test is worse than no test.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 19:29:14 -07:00
A
84e78a0274
fix(test): prevent flaky timeout in checkBillingEnabled test (#2845)
The test assumed _state.project would be empty, but module-level state
persists across tests due to import caching. Prior resolveProject tests
set _state.project, so checkBillingEnabled would attempt a real
gcloudSync call and time out at 5s. Mock spawnSync to handle both cases.

Agent: pr-maintainer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-20 18:38:48 -07:00
A
a7690f8400
test: remove duplicate and theatrical tests (#2846)
- history-cov.test.ts: remove duplicate filterHistory ordering test and
  no-cap saveSpawnRecord test — both are already covered more thoroughly
  in history-trimming.test.ts

- unicode-cov.test.ts: remove theatrical pattern where each test
  re-implemented shouldForceAscii as an inline lambda (testing an inline
  copy instead of the real function). Consolidate into a single shared
  helper that mirrors the actual module logic, tested once per scenario.

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-20 17:57:40 -07:00
A
62e5918078
fix(security): wrap runServer SSH commands with shellQuote in DO and Hetzner (#2843)
DigitalOcean and Hetzner runServer() passed the command string directly
to SSH without shell-quoting, allowing metacharacters (;, |, $(), etc.)
to be interpreted by the remote shell. AWS and GCP already used
`bash -c ${shellQuote(fullCmd)}` — this applies the same pattern to the
two affected modules.

Fixes #2836

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-20 17:34:43 -07:00
A
ffb4cbeb11
fix(security): prevent path traversal in uploadFile/downloadFile across all cloud providers (#2844)
Check for ".." path traversal in the raw input BEFORE normalize() strips
it, fixing CWE-22 where crafted paths like "/tmp/../../etc/passwd"
normalized to "/etc/passwd" and bypassed the post-normalize ".." check.

Extracts a shared validateRemotePath() into shared/ssh.ts and replaces
the duplicated inline validation in all 5 providers (DigitalOcean,
Hetzner, GCP, AWS, Sprite) plus agent-setup.ts.
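
The ordering issue can be shown in a few lines. This is only a sketch: the real `validateRemotePath()` in shared/ssh.ts may apply further checks, and the error message here is made up.

```typescript
import { normalize } from "node:path/posix";

// Sketch only: rejects ".." segments in the RAW input, then normalizes.
function validateRemotePath(rawPath: string): string {
  // Checking after normalize() would be too late: the ".." segments are already resolved.
  if (rawPath.split("/").includes("..")) {
    throw new Error(`Path traversal rejected: ${rawPath}`);
  }
  return normalize(rawPath);
}

// Why the post-normalize check was bypassed:
console.log(normalize("/tmp/../../etc/passwd")); // /etc/passwd
```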

Fixes #2835

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-20 16:48:58 -07:00
A
b9e326d649
fix: use base64 encoding for GITHUB_TOKEN to prevent injection (#2840)
* fix: use base64 encoding for GITHUB_TOKEN to prevent injection

Aligns GITHUB_TOKEN handling with the existing base64 pattern used for
OPENROUTER_API_KEY in orchestrate.ts, eliminating the single-quote
escaping vulnerability.

Fixes #2834

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: apply shellQuote to base64-encoded GITHUB_TOKEN

Address security review feedback: wrap the base64-encoded token in
shellQuote() for defense-in-depth, preventing any theoretical shell
metacharacter escape from the interpolated value.
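
A sketch of the base64-plus-shellQuote idea. `buildGithubTokenExport` and the exact remote command are hypothetical, `shellQuote` is the common POSIX single-quote escape rather than the CLI's confirmed implementation, and an ASCII token plus a remote `base64 -d` are assumed.

```typescript
// Common POSIX single-quote escaping (assumed, not the CLI's confirmed implementation).
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, "'\\''")}'`;
}

// Hypothetical builder; assumes an ASCII token, since btoa() only accepts Latin-1 input.
function buildGithubTokenExport(token: string): string {
  // Base64 output uses only A-Z, a-z, 0-9, +, / and =, so the raw token
  // never appears in the command line and cannot break out of the quotes.
  const encoded = btoa(token);
  return `export GITHUB_TOKEN="$(printf %s ${shellQuote(encoded)} | base64 -d)"`;
}
```

Quoting the encoded value is redundant for well-formed base64, which is the point of defense-in-depth: the command stays safe even if the encoding step changes later.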

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 16:46:49 -07:00
A
3551995aa1
refactor: remove dead code and stale references (#2832)
Deduplicate identical mockBunSpawn helper that was copy-pasted across
five test files (aws-cov, gcp-cov, do-cov, hetzner-cov, sprite-cov).
Centralise it in test-helpers.ts and import from there instead.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 14:12:20 -07:00
A
5e263dd12f
test: remove duplicate and theatrical tests (#2831)
Remove 10 duplicate test cases from cmd-list-cov.test.ts and
cmd-run-cov.test.ts that were already covered by dedicated test files:

- buildRecordLabel (3 tests) — duplicated from cmdlast.test.ts
- buildRecordSubtitle (3 tests) — duplicated from cmdlast.test.ts
- cmdListClear (2 tests) — weaker duplicates of clear-history.test.ts
- cmdLast (1 test) — duplicated from cmdlast.test.ts
- cmdRun detectAndFixSwappedArgs (1 test) — duplicated from
  commands-swap-resolve.test.ts which has 10 thorough swap tests

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-20 13:47:17 -07:00
A
32525f5dd7
test: remove duplicate and theatrical tests (#2830)
- delete manifest-cov.test.ts: it duplicated stripDangerousKeys,
  agentKeys/cloudKeys/matrixStatus/countImplemented from manifest.test.ts;
  unique tests (isStaleCache, getCacheAge, richer loadManifest edge cases)
  consolidated into manifest.test.ts
- remove sprite/interactiveSession from sprite-cov.test.ts: superseded by
  sprite-keep-alive.test.ts which tests actual script content
- remove sprite/installSpriteKeepAlive from sprite-cov.test.ts: superseded
  by sprite-keep-alive.test.ts
- remove startGateway from agent-setup-cov.test.ts: superseded by
  gateway-resilience.test.ts which checks systemd config, cron, and port-wait

all 2050 tests pass

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-20 09:51:57 -07:00
A
1dc9c04eeb
fix: standardize ESM import extensions across 35 production files (#2827)
Add .js extensions to 124 relative imports that were missing them.
The codebase is "type": "module" (ESM) and the dominant pattern already
used .js extensions, but 35 files had a mix of extensionless and .js
imports — sometimes within the same file. Standardize to .js everywhere.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 08:51:40 -07:00
A
f4e2cd80a4
fix(ux): add spawn link to help output and --fast to KNOWN_FLAGS (#2828)
spawn link is a fully implemented command (440 lines) that was
completely missing from `spawn help`. Users had no way to discover
it through the CLI's self-documentation.

Also adds --fast to the KNOWN_FLAGS set for consistency — it was
accepted by the CLI but not registered in the flag validation set.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 08:49:26 -07:00
Ahmed Abushagur
21c0e1511c
fix: remove 100-entry history cap — keep all records (#2819)
The MAX_HISTORY_ENTRIES=100 cap silently archived records when you
spawned more than 100 times, making older active servers vanish from
`spawn list`. The cap was solving a non-problem — 1000 records is ~500KB.

Removed:
- MAX_HISTORY_ENTRIES constant and trimming logic
- archiveRecords() and readExistingArchive() (no longer needed)
- Smart trim tests (history-trimming.test.ts rewritten to test ordering only)

Existing archive files (~/.spawn/history-YYYY-MM-DD.json) are still
readable by recoverFromArchives() for corruption recovery.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 06:32:08 -07:00
A
a7cebd4054
test: remove duplicate and theatrical tests (#2826)
- delete commands-update-download.test.ts (7 tests): superseded by
  cmd-update-cov.test.ts which has 13 tests with better fallback URL
  coverage and uses clack mocks properly

- remove saveSpawnRecord id generation describe from history-cov.test.ts
  (1 test): superseded by history-spawn-id.test.ts which has 3 more
  thorough tests covering the same scenario

- remove 4 describe blocks from cmd-run-cov.test.ts (18 tests):
  getSignalGuidance, getScriptFailureGuidance, getScriptFailureGuidance
  additional, and getSignalGuidance additional are all covered more
  thoroughly by the dedicated script-failure-guidance.test.ts; the
  "additional" blocks were theatrical (only checked joined.length > 0)

- delete picker.test.ts and merge its 8 parsePickerInput tests into
  picker-cov.test.ts to eliminate duplicate describe name collision

2063 -> 2036 tests (-27), 0 failures

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-20 06:11:57 -07:00
A
8f24067336
test: remove duplicate and theatrical tests (#2820)
Remove thin duplicate test blocks that were redundant with more comprehensive
coverage elsewhere:

- ui-cov.test.ts: drop shellQuote (4 tests → gcp-shellquote.test.ts has 11),
  jsonEscape (1 test → ui-utils.test.ts has 4), toKebabCase (2 tests →
  ui-utils.test.ts has 5), sanitizeTermValue (2 tests → ui-utils.test.ts has
  6), withRetry (3 tests → with-retry-result.test.ts has 8)
- agent-setup-cov.test.ts: drop wrapSshCall (5 tests → with-retry-result.test.ts
  has 7 plus integration tests)
- run-path-credential-display.test.ts: drop isRetryableExitCode (2 tests →
  cmd-run-cov.test.ts has 5)
- history-cov.test.ts: drop generateSpawnId (2 tests → history-spawn-id.test.ts
  has 2 with UUID format check) and clearHistory (2 tests →
  clear-history.test.ts has extensive coverage)
- cmd-list-cov.test.ts: drop formatRelativeTime (9 tests →
  commands-exported-utils.test.ts has 10 with an extra boundary case)

All 2063 tests pass, biome lint clean.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-20 05:00:22 -07:00
A
9ddf8b67b0
fix(ux): remove -n short flag from spawn link --name to prevent silent conflict (#2822)
The top-level arg parser in index.ts:820 claims -n for --dry-run before
any subcommand sees it. Running `spawn link 1.2.3.4 -n my-server` silently
drops the intended name value — the user gets no error, the spawn is
registered without the name they specified.

Removing -n from link's --name extractFlag call eliminates the conflict.
The --name long form is unaffected and documented in the usage string.

Also updates cmd-link-cov.test.ts to use --name in the short-flags test.
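
A minimal sketch of the conflict described above, under stated assumptions: `stripGlobalFlags` and `extractFlagSketch` are hypothetical stand-ins, not the actual code in index.ts or the real `extractFlag` helper, whose signatures are not shown in this log. It only illustrates how a top-level pass that consumes `-n` as a boolean leaves the subcommand with no flag to attach the value to.

```typescript
// Hypothetical top-level pass: consumes global boolean flags before dispatch.
function stripGlobalFlags(argv: string[]): { dryRun: boolean; rest: string[] } {
  let dryRun = false;
  const rest: string[] = [];
  for (const arg of argv) {
    if (arg === "-n" || arg === "--dry-run") {
      dryRun = true; // the flag is swallowed here and never reaches the subcommand
      continue;
    }
    rest.push(arg);
  }
  return { dryRun, rest };
}

// Hypothetical subcommand helper: returns the value following the first matching flag.
function extractFlagSketch(args: string[], names: string[]): string | undefined {
  const i = args.findIndex((a) => names.includes(a));
  return i >= 0 ? args[i + 1] : undefined;
}

// Short form: "-n" is taken as --dry-run, so the name is silently lost.
const shortForm = stripGlobalFlags(["link", "1.2.3.4", "-n", "my-server"]);
const shortName = extractFlagSketch(shortForm.rest, ["--name", "-n"]);

// Long form: "--name" passes through the top-level pass untouched.
const longForm = stripGlobalFlags(["link", "1.2.3.4", "--name", "my-server"]);
const longName = extractFlagSketch(longForm.rest, ["--name"]);

console.log({ shortName, dryRun: shortForm.dryRun, longName });
```

In this sketch the short form yields no name and turns on dry-run, while the long form returns the intended value. That is why dropping the short alias removes the ambiguity.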

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 04:01:00 -07:00
A
24bdf664ab
fix(types): resolve TypeScript strict mode errors in production code (#2824)
Fix 24 TypeScript strict mode errors across 7 production files:

- interactive.ts: guard against undefined `val` in validate callback
- list.ts: use already-narrowed `conn` variable instead of `selected.connection`
- run.ts: widen `buildCloudLines` defaults param to `Record<string, unknown>`
- digitalocean.ts: use `toRecord()` to safely drill into nested API responses;
  capture narrowed `oauthCode` in const for async closure
- history.ts: backfill missing record IDs via `backfillRecordIds()` helper;
  use `v.safeParse` output directly to get properly typed records
- index.ts: use `Manifest` type for `showUnknownCommandError` parameter
- orchestrate.ts: capture narrowed `tunnel` and `getConnectionInfo` in const
  variables before async closures
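
The "capture in const" pattern mentioned for digitalocean.ts and orchestrate.ts can be sketched as follows. This is an illustration only: `Options` and `makeReporter` are hypothetical names, not code from the repository. It assumes the usual TypeScript rule that narrowing of a property access does not carry into a function expression, since the property could change before the closure runs.

```typescript
interface Options {
  tunnel?: { port: number };
}

// Hypothetical example function, not taken from orchestrate.ts.
function makeReporter(opts: Options): () => Promise<string> {
  if (!opts.tunnel) {
    return async () => "no tunnel";
  }
  // Inside the closure below, `opts.tunnel` would again be typed as possibly
  // undefined, so strict mode rejects `opts.tunnel.port` there.
  // Copying the narrowed value into a const keeps the narrowed type.
  const tunnel = opts.tunnel;
  return async () => `tunnel on port ${tunnel.port}`;
}

const report = makeReporter({ tunnel: { port: 8080 } });
const emptyReport = makeReporter({});
```

Calling `report()` resolves to a string built from the captured value, independent of any later change to `opts.tunnel`.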

Fixes #2821

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-20 03:17:04 -07:00
Ahmed Abushagur
54820f0bea
fix: add file locking to history writes + backfill missing record IDs (#2817)
History records were being silently lost when concurrent spawn processes
did load→modify→save simultaneously (last writer wins, first record
vanishes). This explains records disappearing from `spawn list`.

Changes:
- Add mkdir-based advisory file locking (withHistoryLock) around all
  write operations: saveSpawnRecord, saveLaunchCmd, saveMetadata,
  markRecordDeleted, removeRecord, updateRecordIp, updateRecordConnection
- Stale lock detection (>30s) prevents deadlocks from crashed processes
- Backfill IDs on legacy records without them during loadHistory()
- Validate archive records during merge (readExistingArchive)
- Limit archive recovery scan to 30 most recent files
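
For readers unfamiliar with the technique, mkdir-based advisory locking with stale detection can be sketched roughly like this. It is not the actual `withHistoryLock` implementation, which is not shown in this log: `withLockSketch`, `isStale`, `sleepSync`, the retry count, and the 20 ms back-off are all assumptions. Only the 30-second stale threshold comes from the commit message.

```typescript
import { existsSync, mkdirSync, rmdirSync, statSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const STALE_LOCK_MS = 30_000; // ">30s" threshold from the commit message

// Blocking sleep so the sketch stays synchronous (assumed back-off strategy).
function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

function isStale(lockDir: string): boolean {
  try {
    return Date.now() - statSync(lockDir).mtimeMs > STALE_LOCK_MS;
  } catch {
    return false; // lock vanished between mkdir and stat; the caller retries
  }
}

// Hypothetical helper: runs fn while holding a directory-based lock.
function withLockSketch<T>(lockDir: string, fn: () => T, maxAttempts = 50): T {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      mkdirSync(lockDir); // atomic: fails with EEXIST if the lock is held
    } catch (err) {
      if ((err as { code?: string }).code !== "EEXIST") throw err;
      if (isStale(lockDir)) {
        try {
          rmdirSync(lockDir); // break a lock left behind by a crashed process
        } catch {
          // another process removed it first
        }
      } else {
        sleepSync(20);
      }
      continue;
    }
    try {
      return fn(); // the load, modify, save cycle would run here
    } finally {
      rmdirSync(lockDir); // always release, even if fn throws
    }
  }
  throw new Error(`could not acquire lock: ${lockDir}`);
}

const demoLock = join(tmpdir(), `history-lock-sketch-${process.pid}`);
const result = withLockSketch(demoLock, () => "saved");
console.log(result, existsSync(demoLock));
```

The lock is advisory: it only protects writers that go through the same helper, which is why the commit wraps every write operation rather than a single one.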

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:48:58 -07:00
A
0ea0d5bb61
test: add coverage for retryOrQuit and skipCloudInit auto-detection (#2810)
Both functions were added in recent commits but had zero test coverage:
- retryOrQuit (ed127cf): non-interactive mode now verified to throw
- skipCloudInit (2280550): 4 cases verify correct tier/cloud/mode conditions

1468 tests pass, 0 failures.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 23:45:04 -07:00
A
69b6f8aa66
fix(test): fix 7 failing tests — GCP mock gaps and sandbox pollution (#2816)
- GCP coverage tests (6 failures): getServerIp, listServers, and
  authenticate tests did not mock the `which gcloud` spawnSync call
  inside requireGcloudCmd(), causing "gcloud CLI not found" errors.
  Add mockSpawnSyncWithGcloud/mockWhichGcloud helpers that satisfy
  the gcloud discovery call before the test-specific mock.

- Sandbox guardrail test (1 failure): cmd-uninstall-cov deletes
  ~/.spawn and other sandbox directories but never re-creates them.
  Since Bun runs test files in the same process, the fs-sandbox
  test then fails. Add afterEach restoration of sandbox dirs.

- Add coverageThreshold to bunfig.toml with correct syntax
  (coverageThreshold under [test], not [test.coverage])

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-19 23:43:13 -07:00