Commit graph

497 commits

Author SHA1 Message Date
A
ffb4cbeb11
fix(security): prevent path traversal in uploadFile/downloadFile across all cloud providers (#2844)
Check for ".." path traversal in the raw input BEFORE normalize() strips
it, fixing CWE-22 where crafted paths like "/tmp/../../etc/passwd"
normalized to "/etc/passwd" and bypassed the post-normalize ".." check.

Extracts a shared validateRemotePath() into shared/ssh.ts and replaces
the duplicated inline validation in all 5 providers (DigitalOcean,
Hetzner, GCP, AWS, Sprite) plus agent-setup.ts.
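
A minimal sketch of the ordering fix described above, not the project's actual code. The signature of `validateRemotePath()` is an assumption, and `normalizePosix()` is a simplified, hypothetical stand-in for a real path normalizer. The point it illustrates is that the ".." check must run on the raw input, because normalization resolves ".." segments away.

```typescript
// Simplified stand-in for a POSIX path normalizer (hypothetical helper).
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue; // drop empty and "." segments
    if (seg === "..") out.pop(); // ".." is resolved here and disappears
    else out.push(seg);
  }
  return (p.startsWith("/") ? "/" : "") + out.join("/");
}

// Assumed signature: returns the normalized path or throws on traversal.
function validateRemotePath(raw: string): string {
  // Inspect the RAW input first; after normalizing there is no ".." left to find.
  if (raw.split("/").includes("..")) {
    throw new Error(`path traversal rejected: ${raw}`);
  }
  return normalizePosix(raw);
}
```

With a check placed after normalization, "/tmp/../../etc/passwd" would already have become "/etc/passwd" and would pass; with the check on the raw input it is rejected.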

Fixes #2835

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-20 16:48:58 -07:00
A
b9e326d649
fix: use base64 encoding for GITHUB_TOKEN to prevent injection (#2840)
* fix: use base64 encoding for GITHUB_TOKEN to prevent injection

Aligns GITHUB_TOKEN handling with the existing base64 pattern used for
OPENROUTER_API_KEY in orchestrate.ts, eliminating the single-quote
escaping vulnerability.

Fixes #2834

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: apply shellQuote to base64-encoded GITHUB_TOKEN

Address security review feedback: wrap the base64-encoded token in
shellQuote() for defense-in-depth, preventing any theoretical shell
metacharacter escape from the interpolated value.
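
A hedged sketch of the base64-plus-quoting pattern the two commits above describe. The `shellQuote()` body, the `buildTokenExport()` helper, and the exact remote command are assumptions for illustration; the real code in orchestrate.ts may look different.

```typescript
// Hypothetical stand-in for the project's shellQuote(): wrap in single quotes
// and escape any embedded single quote.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build a shell line that decodes the token remotely.
// The base64 alphabet (A-Z, a-z, 0-9, +, /, =) contains no shell
// metacharacters; shellQuote() is kept as defense-in-depth.
function buildTokenExport(token: string): string {
  const encoded = btoa(token); // assumes an ASCII token
  return `export GITHUB_TOKEN="$(printf %s ${shellQuote(encoded)} | base64 -d)"`;
}
```

A token containing quotes or semicolons never appears verbatim in the generated command; only its encoded form does.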

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 16:46:49 -07:00
A
3551995aa1
refactor: remove dead code and stale references (#2832)
Deduplicate identical mockBunSpawn helper that was copy-pasted across
five test files (aws-cov, gcp-cov, do-cov, hetzner-cov, sprite-cov).
Centralise it in test-helpers.ts and import from there instead.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 14:12:20 -07:00
A
5e263dd12f
test: remove duplicate and theatrical tests (#2831)
Remove 10 duplicate test cases from cmd-list-cov.test.ts and
cmd-run-cov.test.ts that were already covered by dedicated test files:

- buildRecordLabel (3 tests) — duplicated from cmdlast.test.ts
- buildRecordSubtitle (3 tests) — duplicated from cmdlast.test.ts
- cmdListClear (2 tests) — weaker duplicates of clear-history.test.ts
- cmdLast (1 test) — duplicated from cmdlast.test.ts
- cmdRun detectAndFixSwappedArgs (1 test) — duplicated from
  commands-swap-resolve.test.ts which has 10 thorough swap tests

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-20 13:47:17 -07:00
A
32525f5dd7
test: remove duplicate and theatrical tests (#2830)
- delete manifest-cov.test.ts: it duplicated stripDangerousKeys,
  agentKeys/cloudKeys/matrixStatus/countImplemented from manifest.test.ts;
  unique tests (isStaleCache, getCacheAge, richer loadManifest edge cases)
  consolidated into manifest.test.ts
- remove sprite/interactiveSession from sprite-cov.test.ts: superseded by
  sprite-keep-alive.test.ts which tests actual script content
- remove sprite/installSpriteKeepAlive from sprite-cov.test.ts: superseded
  by sprite-keep-alive.test.ts
- remove startGateway from agent-setup-cov.test.ts: superseded by
  gateway-resilience.test.ts which checks systemd config, cron, and port-wait

all 2050 tests pass

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-20 09:51:57 -07:00
A
1dc9c04eeb
fix: standardize ESM import extensions across 35 production files (#2827)
Add .js extensions to 124 relative imports that were missing them.
The codebase is "type": "module" (ESM) and the dominant pattern already
used .js extensions, but 35 files had a mix of extensionless and .js
imports — sometimes within the same file. Standardize to .js everywhere.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 08:51:40 -07:00
A
f4e2cd80a4
fix(ux): add spawn link to help output and --fast to KNOWN_FLAGS (#2828)
spawn link is a fully implemented command (440 lines) that was
completely missing from `spawn help`. Users had no way to discover
it through the CLI's self-documentation.

Also adds --fast to the KNOWN_FLAGS set for consistency — it was
accepted by the CLI but not registered in the flag validation set.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 08:49:26 -07:00
Ahmed Abushagur
21c0e1511c
fix: remove 100-entry history cap — keep all records (#2819)
The MAX_HISTORY_ENTRIES=100 cap silently archived records when you
spawned more than 100 times, making older active servers vanish from
`spawn list`. The cap was solving a non-problem — 1000 records is ~500KB.

Removed:
- MAX_HISTORY_ENTRIES constant and trimming logic
- archiveRecords() and readExistingArchive() (no longer needed)
- Smart trim tests (history-trimming.test.ts rewritten to test ordering only)

Existing archive files (~/.spawn/history-YYYY-MM-DD.json) are still
readable by recoverFromArchives() for corruption recovery.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 06:32:08 -07:00
A
a7cebd4054
test: remove duplicate and theatrical tests (#2826)
- delete commands-update-download.test.ts (7 tests): superseded by
  cmd-update-cov.test.ts which has 13 tests with better fallback URL
  coverage and uses clack mocks properly

- remove saveSpawnRecord id generation describe from history-cov.test.ts
  (1 test): superseded by history-spawn-id.test.ts which has 3 more
  thorough tests covering the same scenario

- remove 4 describe blocks from cmd-run-cov.test.ts (18 tests):
  getSignalGuidance, getScriptFailureGuidance, getScriptFailureGuidance
  additional, and getSignalGuidance additional are all covered more
  thoroughly by the dedicated script-failure-guidance.test.ts; the
  "additional" blocks were theatrical (only checked joined.length > 0)

- delete picker.test.ts and merge its 8 parsePickerInput tests into
  picker-cov.test.ts to eliminate duplicate describe name collision

2063 -> 2036 tests (-27), 0 failures

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-20 06:11:57 -07:00
A
8f24067336
test: remove duplicate and theatrical tests (#2820)
Remove thin duplicate test blocks that were redundant with more comprehensive
coverage elsewhere:

- ui-cov.test.ts: drop shellQuote (4 tests → gcp-shellquote.test.ts has 11),
  jsonEscape (1 test → ui-utils.test.ts has 4), toKebabCase (2 tests →
  ui-utils.test.ts has 5), sanitizeTermValue (2 tests → ui-utils.test.ts has
  6), withRetry (3 tests → with-retry-result.test.ts has 8)
- agent-setup-cov.test.ts: drop wrapSshCall (5 tests → with-retry-result.test.ts
  has 7 plus integration tests)
- run-path-credential-display.test.ts: drop isRetryableExitCode (2 tests →
  cmd-run-cov.test.ts has 5)
- history-cov.test.ts: drop generateSpawnId (2 tests → history-spawn-id.test.ts
  has 2 with UUID format check) and clearHistory (2 tests →
  clear-history.test.ts has extensive coverage)
- cmd-list-cov.test.ts: drop formatRelativeTime (9 tests →
  commands-exported-utils.test.ts has 10 with an extra boundary case)

All 2063 tests pass, biome lint clean.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-20 05:00:22 -07:00
A
9ddf8b67b0
fix(ux): remove -n short flag from spawn link --name to prevent silent conflict (#2822)
The top-level arg parser in index.ts:820 claims -n for --dry-run before
any subcommand sees it. Running `spawn link 1.2.3.4 -n my-server` silently
drops the intended name value — the user gets no error, the spawn is
registered without the name they specified.

Removing -n from link's --name extractFlag call eliminates the conflict.
The --name long form is unaffected and documented in the usage string.

Also updates cmd-link-cov.test.ts to use --name in the short-flags test.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 04:01:00 -07:00
A
24bdf664ab
fix(types): resolve TypeScript strict mode errors in production code (#2824)
Fix 24 TypeScript strict mode errors across 7 production files:

- interactive.ts: guard against undefined `val` in validate callback
- list.ts: use already-narrowed `conn` variable instead of `selected.connection`
- run.ts: widen `buildCloudLines` defaults param to `Record<string, unknown>`
- digitalocean.ts: use `toRecord()` to safely drill into nested API responses;
  capture narrowed `oauthCode` in const for async closure
- history.ts: backfill missing record IDs via `backfillRecordIds()` helper;
  use `v.safeParse` output directly to get properly typed records
- index.ts: use `Manifest` type for `showUnknownCommandError` parameter
- orchestrate.ts: capture narrowed `tunnel` and `getConnectionInfo` in const
  variables before async closures

Fixes #2821

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-20 03:17:04 -07:00
Ahmed Abushagur
54820f0bea
fix: add file locking to history writes + backfill missing record IDs (#2817)
History records were being silently lost when concurrent spawn processes
did load→modify→save simultaneously (last writer wins, first record
vanishes). This explains records disappearing from `spawn list`.

Changes:
- Add mkdir-based advisory file locking (withHistoryLock) around all
  write operations: saveSpawnRecord, saveLaunchCmd, saveMetadata,
  markRecordDeleted, removeRecord, updateRecordIp, updateRecordConnection
- Stale lock detection (>30s) prevents deadlocks from crashed processes
- Backfill IDs on legacy records without them during loadHistory()
- Validate archive records during merge (readExistingArchive)
- Limit archive recovery scan to 30 most recent files
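
The mkdir-based advisory lock with stale detection listed above can be sketched roughly as follows. This is an illustration under assumptions, not the actual `withHistoryLock`: the names `tryAcquireLock`, `releaseLock`, and `withLock` are hypothetical, and this version fails fast where a real implementation would probably wait and retry.

```typescript
import { mkdirSync, rmdirSync, statSync } from "node:fs";

const STALE_MS = 30_000; // locks older than 30s are treated as abandoned

// mkdir is atomic: exactly one process can create the directory.
function tryAcquireLock(lockDir: string, now: number = Date.now()): boolean {
  try {
    mkdirSync(lockDir);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code !== "EEXIST") throw err;
  }
  try {
    if (now - statSync(lockDir).mtimeMs <= STALE_MS) return false; // live lock
    rmdirSync(lockDir); // reclaim a lock left behind by a crashed process
    mkdirSync(lockDir);
    return true;
  } catch {
    return false; // lost a race with another process; caller may retry
  }
}

function releaseLock(lockDir: string): void {
  rmdirSync(lockDir);
}

// Run fn while holding the lock; always release, even if fn throws.
function withLock<T>(lockDir: string, fn: () => T): T {
  if (!tryAcquireLock(lockDir)) throw new Error(`lock busy: ${lockDir}`);
  try {
    return fn();
  } finally {
    releaseLock(lockDir);
  }
}
```

Wrapping each load, modify, save sequence in such a lock is what prevents the last-writer-wins loss described in this commit.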

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:48:58 -07:00
A
0ea0d5bb61
test: add coverage for retryOrQuit and skipCloudInit auto-detection (#2810)
Both functions were added in recent commits but had zero test coverage:
- retryOrQuit (ed127cf): non-interactive mode now verified to throw
- skipCloudInit (2280550): 4 cases verify correct tier/cloud/mode conditions

1468 tests pass, 0 failures.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 23:45:04 -07:00
A
69b6f8aa66
fix(test): fix 7 failing tests — GCP mock gaps and sandbox pollution (#2816)
- GCP coverage tests (6 failures): getServerIp, listServers, and
  authenticate tests did not mock the `which gcloud` spawnSync call
  inside requireGcloudCmd(), causing "gcloud CLI not found" errors.
  Add mockSpawnSyncWithGcloud/mockWhichGcloud helpers that satisfy
  the gcloud discovery call before the test-specific mock.

- Sandbox guardrail test (1 failure): cmd-uninstall-cov deletes
  ~/.spawn and other sandbox directories but never re-creates them.
  Since Bun runs test files in the same process, the fs-sandbox
  test then fails. Add afterEach restoration of sandbox dirs.

- Add coverageThreshold to bunfig.toml with correct syntax
  (coverageThreshold under [test], not [test.coverage])

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-19 23:43:13 -07:00
A
18c7834d24
fix: restore packages/cli/bunfig.toml for preload when running from subdir (#2813)
The pre-merge hook and `cd packages/cli && bun test` need a local
bunfig.toml so the preload path resolves correctly for the sandbox.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:57:03 -07:00
A
9ae3525030
feat: enforce CI coverage thresholds + colocate billing guidance (#2811)
- Move bunfig.toml to repo root with valid coverageThreshold syntax
  (line=80%, function=0 to avoid per-file false positives)
- Add --coverage flag to CI test step
- Delete packages/cli/bunfig.toml (superseded by root config)
- Add tests for packages/shared (type-guards, parse, result)
- Colocate billing config into each cloud directory (aws/billing.ts,
  gcp/billing.ts, hetzner/billing.ts, digitalocean/billing.ts)
- Refactor billing-guidance.ts: BillingConfig interface replaces
  cloud-string-keyed Record maps
- Bump CLI version to 0.25.1

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-19 22:52:45 -07:00
Ahmed Abushagur
aa4b2a23d6
feat: auto-reconnect on SSH drops during interactive session (#2806)
When SSH exits with code 255 (connection dropped/timed out), retry up
to 5 times with 3s delay between attempts. Clean exits (0), Ctrl+C
(130), and agent crashes exit immediately without retrying.

Only applies to remote clouds — local sessions skip reconnect logic.
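
A minimal sketch of the retry policy described above, assuming a hypothetical `runSession` callback that resolves with the SSH exit code. The function and constant names are illustrative and not taken from the codebase.

```typescript
const MAX_RECONNECTS = 5;
const RECONNECT_DELAY_MS = 3_000;
const SSH_CONNECTION_LOST = 255; // ssh exit code for a dropped or timed-out connection

// Pure decision: reconnect only on 255, only for remote clouds, and only
// while retries remain. Exit codes 0 and 130 (Ctrl+C) return immediately.
function shouldReconnect(exitCode: number, retriesSoFar: number, isLocal: boolean): boolean {
  if (isLocal) return false;
  return exitCode === SSH_CONNECTION_LOST && retriesSoFar < MAX_RECONNECTS;
}

// Hypothetical driver loop around an interactive session.
async function runWithReconnect(
  runSession: () => Promise<number>,
  isLocal = false,
  delayMs = RECONNECT_DELAY_MS,
): Promise<number> {
  let retries = 0;
  for (;;) {
    const code = await runSession();
    if (!shouldReconnect(code, retries, isLocal)) return code;
    retries++;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
}
```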

Signed-off-by: L <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-19 22:28:10 -07:00
A
72c3f23364
test: add comprehensive code coverage tests (#2802)
* test: add comprehensive coverage tests (67% → 85% lines)

Add 27 new test files with ~565 tests covering all major modules:

Shared modules:
- ui-cov: logging, prompts, validation, shellQuote, withRetry, loadApiToken
- ssh-cov: spawnInteractive, killWithTimeout, startSshTunnel, waitForSsh
- ssh-keys-cov: generateSshKey edge cases, key sorting, fingerprint
- oauth-cov: PKCE flow, code verifier/challenge, key management
- orchestrate-cov: provisioning flow, enabled steps, model preferences
- agent-setup-cov: wrapSshCall, createCloudAgents, GitHub auth

Commands:
- connect, status, uninstall, pick, delete, update, fix, interactive
- link, run, list (with formatRelativeTime, filters, actions)

Cloud providers:
- aws, gcp, digitalocean, hetzner, sprite (auth, CRUD, SSH ops)

Remaining:
- picker, unicode-detect, history, manifest, update-check

Also fixes:
- do-payment-warning.test.ts: use spyOn instead of mock.module for
  shared/ui to prevent cross-test contamination
- preflight-credentials.test.ts: resilient to @clack/prompts mock
  replacement by other test files

Coverage: 74% → 90% functions, 67% → 85% lines
Tests: 1467 → 2032, 0 failures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: expand coverage tests for commands, oauth, orchestrate, and link

Add 65+ new tests across 7 test files:
- cmd-list-cov: handleRecordAction branches (rerun, fix, no-connection),
  resolveListFilters with cloud filter, footer and empty message paths
- cmd-run-cov: showDryRunPreview edge cases, getScriptFailureGuidance
  for all exit codes, getSignalGuidance, cmdRun validation
- cmd-pick-cov: flag edge cases (missing values, multiple flags)
- cmd-link-cov: IP generation, detection spinner, invalid IP
- cmd-fix-cov: additional fix paths
- oauth-cov: non-standard key confirmation, null config handling
- orchestrate-cov: tunnel support, checkAccountReady, tarball,
  SPAWN_NAME, preLaunch, restart loop, step validation

Coverage: 90.50% functions, 85.13% lines (2097 tests, 0 failures)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add coverage thresholds (80% lines, 90% functions)

Configure bun test coverage thresholds in bunfig.toml to enforce
minimum coverage levels and prevent regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:24:54 -07:00
A
646faf66e2
test: remove duplicate config_files test in manifest-type-contracts (#2809)
Consolidated two overlapping describe blocks that both iterated over the
same config_files data:

- 'Agent optional field types' had a test checking config_files keys were
  strings with length > 0
- 'Config files structure' had a separate describe checking the same keys
  match a path regex and values are non-null objects

Merged into a single test within 'Agent optional field types' that checks
all constraints: key is string, key is non-empty, key matches path regex
(/[/~./]), and value is a non-null object. Removed the now-redundant
'Config files structure' describe block.

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-19 22:05:41 -07:00
Ahmed Abushagur
ed127cf592
feat: never-give-up resilience layer (#2807)
* feat: never-give-up resilience layer — retry every failure instead of exiting

Add retryOrQuit() helper to shared/ui.ts that prompts "Try again? (Y/n)"
after any recoverable failure. Wrap all fatal exit points with retry loops:

- Cloud auth (Hetzner, DigitalOcean, AWS, GCP): retry after 3 failed tokens
- API key acquisition: retry after 3 failed OAuth+manual attempts
- Server creation: retry on any createServer failure (both fast & sequential)
- SSH readiness: retry on waitForReady timeout
- Agent install: retry on install failure
- Pre-launch hooks: retry on preLaunch failure

Non-interactive mode (SPAWN_NON_INTERACTIVE=1) still throws immediately.
Ctrl+C at any retry prompt exits cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(e2e): add AI-driven interactive test harness

Add --interactive mode to the E2E test framework. Instead of running spawn
in headless mode (SPAWN_NON_INTERACTIVE=1), this spawns the CLI in a real
PTY and uses Claude Haiku to respond to prompts like a human user would.

New files:
- sh/e2e/interactive-harness.ts — Bun script that drives the PTY + AI loop
- sh/e2e/lib/interactive.sh — Bash integration with the E2E framework

Usage:
  e2e.sh --cloud hetzner claude --interactive

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(qa): wire interactive E2E into scheduled QA pipeline

- Add `e2e-interactive` option to workflow_dispatch in qa.yml
- Add `e2e-interactive` run mode to qa.sh (loads cloud creds + ANTHROPIC_API_KEY)
- Runs `e2e.sh --cloud hetzner claude --interactive` directly (no Claude Code needed)
- Defaults to hetzner (cheapest), overridable via E2E_INTERACTIVE_CLOUD/AGENT env vars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(qa): schedule interactive E2E daily at 6am UTC

Runs one agent (claude) on one cloud (hetzner) with AI-driven prompts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): offset soak cron to avoid GitHub Actions schedule dedup

GitHub Actions deduplicates overlapping cron schedules into one run,
making `github.event.schedule` unpredictable. The soak test at `0 3 * * 1`
was getting absorbed by the `0 */4 * * *` quality sweep and never firing
as reason=soak.

Move soak to `30 1 * * 1` (Monday 1:30am UTC) — safely between the
0am and 4am quality sweep slots. Interactive E2E at `0 6 * * *` is
already safe (between the 4am and 8am slots).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): add e2e-interactive to trigger server valid reasons

The trigger server validates reason query params against an allowlist.
Without this, the `e2e-interactive` dispatch returns 400.

Also note: `soak` is already in VALID_REASONS in the repo but the running
service on the QA VM is stale — needs a restart to pick up both soak and
e2e-interactive reasons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:33:22 -07:00
Ahmed Abushagur
2280550c18
perf: skip cloud-init for minimal-tier agents with tarballs/snapshots (#2804)
* perf: skip cloud-init for minimal-tier agents with tarballs/snapshots

Ubuntu 24.04 base images already have curl + git, so minimal-tier
agents (claude, opencode, zeroclaw, hermes) don't need the cloud-init
package install step when using tarballs or snapshots.

Adds skipCloudInit flag to CloudOrchestrator — set automatically when
(tarball || snapshot) && tier === "minimal". Each cloud's waitForReady
checks this flag and calls waitForSshOnly instead of waitForCloudInit.

Saves ~30-60s on minimal-tier agent deploys with --fast or --beta tarball.
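
The condition stated above, `(tarball || snapshot) && tier === "minimal"`, could be expressed like this. The `DeployOptions` shape and both function names are assumptions for illustration only.

```typescript
// Hypothetical option shape; field names are illustrative.
interface DeployOptions {
  tarball: boolean;
  snapshot: boolean;
  tier: string;
}

// Mirrors the condition quoted in the commit message.
function shouldSkipCloudInit(opts: DeployOptions): boolean {
  return (opts.tarball || opts.snapshot) && opts.tier === "minimal";
}

// Which readiness wait a cloud's waitForReady might delegate to.
function readinessStrategy(opts: DeployOptions): "waitForSshOnly" | "waitForCloudInit" {
  return shouldSkipCloudInit(opts) ? "waitForSshOnly" : "waitForCloudInit";
}
```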

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add --fast mode and updated beta features to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove timing table from README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-19 16:14:49 -07:00
A
1d0349cc23
test: add SPAWN_FAST fast-mode coverage to orchestrate (#2801)
Add 6 test cases verifying the Promise.allSettled parallel orchestration
path introduced in #2796. Tests cover: happy path, server boot failure
propagation, API key failure propagation, tarball fallback to
agent.install, local cloud exclusion from fast mode, and non-fatal
preProvision/checkAccountReady failures.
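
A rough sketch of how fatal and non-fatal failures might be separated after `Promise.allSettled`, in the spirit of the cases listed above. The step names, the `fatal` set, and both helper functions are hypothetical; the real orchestration code is not shown in this log.

```typescript
// Inspect settled results: throw on the first fatal failure, and return the
// names of non-fatal steps that failed so the caller can warn about them.
function checkSettled(
  names: string[],
  fatal: Set<string>,
  results: PromiseSettledResult<unknown>[],
): string[] {
  const warnings: string[] = [];
  for (let i = 0; i < results.length; i++) {
    const r = results[i];
    const name = names[i] ?? `step-${i}`;
    if (!r || r.status !== "rejected") continue;
    if (fatal.has(name)) throw new Error(`${name} failed: ${String(r.reason)}`);
    warnings.push(name);
  }
  return warnings;
}

// Run all steps concurrently; allSettled waits for every step to finish.
async function runParallel(
  steps: Record<string, () => Promise<unknown>>,
  fatal: Set<string>,
): Promise<string[]> {
  const entries = Object.entries(steps);
  const results = await Promise.allSettled(entries.map(([, run]) => run()));
  return checkSettled(entries.map(([name]) => name), fatal, results);
}
```

Using `allSettled` rather than `Promise.all` means one rejected step does not hide the outcome of the others, which is what makes per-step fatal and non-fatal handling possible.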

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 13:16:02 -07:00
Ahmed Abushagur
5efbcf9ee7
feat: add --fast flag for parallel server boot + setup (#2796)
* feat: add --fast flag for parallel server boot + setup

Adds `--fast` flag that runs server creation concurrently with API key
prompt, account check, pre-provision hooks, tarball download, and env
config generation. Once SSH is up, uploads tarball and applies config.

--fast implies --beta tarball and --beta images, enabling snapshots
and pre-built tarballs automatically.

Flow without --fast (sequential):
  auth → API key → preProvision → size → create → boot → install → configure

Flow with --fast (parallel):
  auth → size → [create+boot | API key | preProvision | tarball download | accountCheck]
              → upload tarball → inject env → configure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add --beta parallel as standalone opt-in for parallel setup

--beta parallel enables the parallel orchestration without implying
tarball/images. --fast still implies all three (tarball + images +
parallel).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 10:26:54 -07:00
A
6772ed1cd7
fix(cli): validate agentKey in buildFixScript and fixSpawn before manifest lookup (#2792)
Add validateIdentifier() calls to buildFixScript() and fixSpawn() to
ensure agent keys from spawn history match [a-z0-9_-]+ before using
them to index manifest.agents. This prevents potential prototype
pollution or unexpected behavior from tampered history files.
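
A minimal sketch of validating a key before using it as an index, assuming a hypothetical `validateIdentifier()` signature and a toy `agents` map. Note that a name such as `__proto__` still matches `[a-z0-9_-]+`, so this sketch also uses an own-property check; that extra check is an addition for illustration and is not described in the commit.

```typescript
const IDENTIFIER_RE = /^[a-z0-9_-]+$/;

// Assumed signature: throws on anything outside [a-z0-9_-]+.
function validateIdentifier(value: string, label: string): string {
  if (!IDENTIFIER_RE.test(value)) {
    throw new Error(`invalid ${label}: ${JSON.stringify(value)}`);
  }
  return value;
}

// Toy manifest for illustration only.
const agents: Record<string, { name: string }> = {
  claude: { name: "Claude" },
};

function lookupAgent(key: string): { name: string } | undefined {
  validateIdentifier(key, "agent key");
  // Own-property check so inherited keys never resolve to an agent.
  return Object.prototype.hasOwnProperty.call(agents, key) ? agents[key] : undefined;
}
```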

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-19 06:36:06 -07:00
A
787087144c
fix(cli): bump version to 0.23.2 for missed patch releases (#2787)
Two CLI changes landed after the last version bump (0.23.1) without
incrementing the version:
- d9575acd: fix(cli): exit with code 1 on spawn fix error paths
- 148cc9e7: refactor: extract duplicate waitForSshSnapshotBoot to shared/ssh.ts

The CLI has auto-update enabled — without a version bump, users won't
pick up these fixes on next run.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 01:00:10 -07:00
A
148cc9e7ee
refactor: extract duplicate waitForSshSnapshotBoot to shared/ssh.ts (#2783)
The waitForSshOnly function was identically duplicated in hetzner.ts and
digitalocean.ts. Extract the shared logic into waitForSshSnapshotBoot() in
shared/ssh.ts and replace the duplicate cloud implementations with thin
wrappers that resolve module-local state before delegating.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-18 22:10:25 -07:00
A
d9575acd43
fix(cli): exit with code 1 on spawn fix error paths (#2781)
cmdFix error paths (spawn not found, non-interactive with multiple
servers, picker mismatch) previously returned without setting a
non-zero exit code. Scripts checking $? would incorrectly see success.

Now exits with code 1 on all error paths in cmdFix. fixSpawn() is
unchanged since it is also called from the list picker where returning
to loop is correct behavior.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 20:43:31 -07:00
A
15a62a9ad0
fix(cli): use tryCatch for JSON.parse in loadPreferredModel (#2782)
tryCatchIf(isFileError) only catches filesystem errors (ENOENT, EACCES),
but JSON.parse throws SyntaxError on corrupted preferences.json. This
was the same bug fixed in 16a2f180 across 4 files, but orchestrate.ts
was missed. A corrupted ~/.spawn/preferences.json would crash the CLI
instead of gracefully falling back to no preferred model.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 20:15:17 -07:00
Ahmed Abushagur
7289f3ef36
feat(hetzner): add snapshot support + Packer image builds (#2774)
CLI changes:
- Add findSpawnSnapshot() to query Hetzner /images?type=snapshot API
  for pre-built spawn-{agent}-* images (matches by description prefix)
- Add waitForSshOnly() for snapshot boots (skips cloud-init polling)
- Update createServer() to accept optional snapshotId — boots from
  snapshot instead of ubuntu-24.04, skips cloud-init userdata
- Wire up orchestrator with skipAgentInstall flag

Packer changes:
- Add packer/hetzner.pkr.hcl using hcloud plugin, mirroring the DO
  template (tier scripts, agent install, cleanup, manifest)
- Unify packer-snapshots.yml to build both DO and Hetzner in a single
  workflow with cloud×agent matrix and per-cloud cleanup steps

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 16:46:48 -07:00
A
04eb54b409
test: consolidate repetitive validateLaunchCmd and validatePreLaunchCmd valid-input tests (#2771)
7 agent-specific it() blocks for validateLaunchCmd (all calling .not.toThrow()
on trivially different inputs) collapsed into one data-driven loop. Similarly,
6 individual validatePreLaunchCmd valid-pattern tests collapsed into one loop.

Reduces it() count in security-connection-validation.test.ts from 93 to 81 with
zero change in coverage - every command variant is still exercised.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 14:16:38 -07:00
A
16a2f1807c
fix(cli): use tryCatch instead of tryCatchIf for JSON.parse callsites (#2770)
tryCatchIf(isFileError) only catches filesystem errors (ENOENT, EACCES),
but JSON.parse throws SyntaxError on corrupted input. Since tryCatchIf
rethrows non-matching errors, a corrupted config file crashes the CLI
instead of returning the intended null/false fallback.
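
The difference between the two helpers can be sketched as below. The `Result` shape and the signatures of `tryCatch`, `tryCatchIf`, and `isFileError` are assumptions inferred from the description above, not the project's actual definitions; `readJsonOrNull` is a hypothetical caller.

```typescript
type Result<T> = { ok: true; data: T } | { ok: false; error: unknown };

// Catches every error and returns it as a value.
function tryCatch<T>(fn: () => T): Result<T> {
  try {
    return { ok: true, data: fn() };
  } catch (error) {
    return { ok: false, error };
  }
}

// Catches only errors matching the predicate; everything else is rethrown.
function tryCatchIf<T>(pred: (e: unknown) => boolean, fn: () => T): Result<T> {
  try {
    return { ok: true, data: fn() };
  } catch (error) {
    if (pred(error)) return { ok: false, error };
    throw error;
  }
}

function isFileError(e: unknown): boolean {
  const code = (e as { code?: unknown } | null)?.code;
  return code === "ENOENT" || code === "EACCES";
}

// Corrupted JSON falls back to null instead of crashing the caller.
function readJsonOrNull(text: string): unknown {
  const r = tryCatch(() => JSON.parse(text));
  return r.ok ? r.data : null;
}
```

A `SyntaxError` from `JSON.parse` carries no `ENOENT` or `EACCES` code, so `tryCatchIf(isFileError, ...)` rethrows it, which is the crash these commits fix.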

Affected: readCache(), local manifest loader, loadApiToken(),
loadSavedOpenRouterKey(), hasCloudConfigCredentials()

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-18 12:54:41 -07:00
A
fc98700a24
fix(digitalocean): use s-2vcpu-4gb-intel for openclaw to support nyc3 region (#2769)
s-2vcpu-4gb is not available in nyc3 (the default E2E region), causing
openclaw provisioning to fail with 422. s-2vcpu-4gb-intel offers the same
specs (2 vCPUs, 4 GB RAM) and is available in all regions including nyc3.

-- qa/e2e-tester

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-18 11:26:19 -07:00
A
b46524887d
feat(hetzner): fetch locations from API, re-prompt on unavailable location (#2766)
Hetzner disabled fsn1 (Falkenstein), causing a fatal HTTP 412 error for
all users using the default location. This change:

- Fetches available locations dynamically from GET /locations API
- Falls back to a hardcoded list if the API call fails
- On location-unavailable errors (HTTP 412 resource_unavailable),
  prompts the user to pick a different location instead of crashing
- Changes default location from fsn1 to nbg1 (Nuremberg)
- Excludes previously-failed locations from the re-pick list

Closes #2764

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Security Reviewer <security@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-18 10:39:42 -07:00
A
1ad385117e
test: consolidate redundant platform tests in shell.test.ts (#2767)
macOS and Linux return identical results for getLocalShell, getWhichCommand,
getInstallScriptUrl, and getInstallCmd. Collapsed the duplicate per-platform
tests into a data-driven loop over ["darwin", "linux"], reducing repetition
while preserving the same coverage. Also added the missing Linux case for
getInstallCmd (was only tested for Windows and macOS).

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 10:28:09 -07:00
A
4e31e8dd4c
docs(tests): document 5 undocumented test files in README (#2762)
Added missing entries to packages/cli/src/__tests__/README.md for:
- auto-update.test.ts — setupAutoUpdate systemd service unit generation
- kill-with-timeout.test.ts — killWithTimeout SIGKILL grace period logic
- shell.test.ts — platform-aware shell detection utilities
- digitalocean-token.test.ts — DigitalOcean token storage and API helpers
- hetzner-pagination.test.ts — Hetzner API multi-page pagination

All 1467 tests pass. No code changes.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-18 07:01:19 -07:00
A
47b8bd30cc
test: remove duplicate and theatrical tests (#2763)
Removed the "integration with getScriptFailureGuidance" describe block
from credential-hints.test.ts. All three tests were redundant:

- "always includes setup instructions regardless of env state": tested
  only for a vague "setup instructions" string, which is already verified
  by the "when all required env vars are missing" describe block above.

- "always returns at least one line": pure existence check, already
  proven by the "when no authHint is provided" tests which assert exact
  length of 1.

- "returns more lines when authHint is provided": tests line-count
  implementation detail rather than behavior; behavior is fully covered
  by the per-scenario describe blocks.

Test count drops from 1467 to 1464. Zero regressions. Biome lint: 0 errors.

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 06:37:40 -07:00
A
af300ba248
fix(digitalocean): paginate SSH keys/droplets and harden key registration check (#2758)
Add doGetAll() pagination helper (matching Hetzner's hetznerGetAll pattern)
and use it for all three unpaginated DO API calls:
- ensureSshKey(): /account/keys (was silently truncated at 20 keys)
- createServer(): /account/keys (same issue for SSH key ID collection)
- listServers(): /droplets (was silently truncated at 20 droplets)

Replace fragile `regText.includes('"id"')` string check with proper
`parseJsonObj(regText)?.ssh_key` validation for SSH key registration.
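
A pagination helper of this kind could look roughly like the sketch below. It is not the repository's doGetAll(): the name `getAllPages`, the `Page` and `FetchPage` types, and the injected fetcher are hypothetical, and it assumes the DigitalOcean list responses signal further pages through `links.pages.next`.

```typescript
// Hypothetical shape of one page of a list response.
// `links.pages.next` is assumed to be present only when another page exists.
interface Page {
  [key: string]: unknown;
  links?: { pages?: { next?: string } };
}

// Injected so the sketch runs without network access; a real helper would
// wrap fetch() and add the API token.
type FetchPage = (path: string) => Promise<Page>;

// Collects every item stored under `key` across all pages.
async function getAllPages<T>(
  fetchPage: FetchPage,
  path: string,
  key: string,
  perPage = 200,
  maxPages = 50, // safety cap so a misbehaving API cannot loop forever
): Promise<T[]> {
  const items: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const sep = path.includes("?") ? "&" : "?";
    const body = await fetchPage(`${path}${sep}page=${page}&per_page=${perPage}`);
    const chunk = body[key];
    if (Array.isArray(chunk)) items.push(...(chunk as T[]));
    if (!body.links?.pages?.next) break; // no further pages
  }
  return items;
}
```

Passing the fetcher in keeps the loop testable with a fake API, and the page cap bounds the number of requests if the `next` link never disappears.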

Fixes #2748
Fixes #2749

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 01:18:06 -07:00
A
d4774fdc8e
fix(sprite): append to ~/.bash_profile and gate exec zsh on interactive shells (#2756)
- Use >> instead of > to append to ~/.bash_profile (preserves existing config)
- Gate exec zsh on interactive shells: [[ $- == *i* ]] && exec /usr/bin/zsh -l
- Bump CLI version to 0.21.7

Fixes #2740

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 23:55:33 -07:00
A
75c75d42d4
fix(ui): propagate Ctrl+C/Esc cancellation instead of returning empty string (#2757)
When p.isCancel() detected user cancellation in prompt() and
selectFromList(), the result was silently converted to "" instead of
exiting. This caused infinite retry loops in billing prompts, silent
fallthrough in oauth key entry, and unintended defaults in name prompts.

Now both functions call process.exit(0) on cancel for a clean exit.

Fixes #2745

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 23:54:32 -07:00
A
fef312cd47
fix(update): cache successful update checks for 1 hour (#2755)
checkForUpdates() previously fetched the latest version from GitHub on
every single CLI invocation, blocking for up to 10s on slow/offline
connections. Now it writes a timestamp to ~/.config/spawn/.update-checked
after a successful check and skips the network call if the cache is
less than 1 hour old.
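
The timestamp cache described above can be sketched as follows. The function names `isCheckFresh` and `markChecked` are hypothetical, the marker is assumed to hold a plain epoch-milliseconds value, and the demo writes to a throwaway temp directory rather than ~/.config/spawn.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

const ONE_HOUR_MS = 60 * 60 * 1000;

// True when the marker file holds a timestamp younger than maxAgeMs.
function isCheckFresh(markerPath: string, now: number, maxAgeMs = ONE_HOUR_MS): boolean {
  try {
    const last = Number(readFileSync(markerPath, "utf8").trim());
    const age = now - last;
    return Number.isFinite(last) && age >= 0 && age < maxAgeMs;
  } catch {
    return false; // missing or unreadable marker: treat as stale
  }
}

// Record a successful check. Calling this only after the network request
// succeeded means a failed check is retried on the next invocation.
function markChecked(markerPath: string, now: number): void {
  mkdirSync(dirname(markerPath), { recursive: true });
  writeFileSync(markerPath, String(now));
}

// Demo in a throwaway directory.
const marker = join(mkdtempSync(join(tmpdir(), "update-demo-")), "spawn", ".update-checked");
const t0 = 1_700_000_000_000;
const freshBefore = isCheckFresh(marker, t0);                    // false: no marker yet
markChecked(marker, t0);
const freshSoonAfter = isCheckFresh(marker, t0 + 59 * 60_000);   // true: 59 minutes old
const freshLater = isCheckFresh(marker, t0 + 61 * 60_000);       // false: 61 minutes old
```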

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-17 23:08:05 -07:00
A
133b94939e
fix(hetzner): ensure cloud-init marker is always written despite early exit (#2747)
Remove `set -e` from userdata script and add an EXIT trap to guarantee
/root/.cloud-init-complete is written even if apt-get or other setup
steps fail. Add `|| true` to apt-get commands for extra resilience.

Previously, the userdata script used `set -e`, which caused it to abort
on any command failure before reaching the marker write at the end. This
made waitForCloudInit() always time out with "Cloud-init marker not
found, continuing anyway...", adding ~5 minutes to every Hetzner
provisioning run.

Fixes #2739

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 23:02:16 -07:00
A
1b978c03ce
fix(tarball): validate VM architecture when only one arch asset exists (#2753)
When a GitHub Release contains only one architecture-specific tarball
(e.g., x86_64 only), the download command now checks `uname -m` on
the remote VM and fails with exit 1 if the arch doesn't match. This
prevents installing an x86_64 binary on ARM (or vice versa) and ensures
the orchestrator falls back to live installation.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-17 22:59:04 -07:00
A
035ee3ca63
fix(ssh): always escalate to SIGKILL in killWithTimeout (#2752)
proc.killed is true as soon as kill() is called, not when the process
exits. This meant SIGKILL escalation was always skipped, leaving stuck
processes hanging indefinitely. Remove the faulty guard and always
attempt SIGKILL after the grace period — try/catch handles already-dead
processes.
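
A minimal sketch of the unconditional escalation, assuming only that the process object exposes `kill(signal)`. The `Killable` interface, the signature, and the default grace period are hypothetical and may differ from the real killWithTimeout.

```typescript
// Minimal structural type so the sketch works with any process-like object.
interface Killable {
  kill(signal?: string): unknown;
}

async function killWithTimeout(proc: Killable, graceMs = 5000): Promise<void> {
  proc.kill("SIGTERM");
  await new Promise<void>((resolve) => setTimeout(resolve, graceMs));
  // Deliberately no `if (!proc.killed)` guard here: `killed` only records
  // that a signal was sent, not that the process has exited.
  try {
    proc.kill("SIGKILL");
  } catch {
    // The process already exited during the grace period; nothing to do.
  }
}
```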

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 05:54:38 +00:00
A
a557fb1002
fix(cli): handle --help and --version flags after positional args (#2750)
Previously, `spawn claude sprite --help` would warn about extra args
and proceed to provision a server. Now trailing help/version flags are
detected and handled correctly in both the default command path and
verb alias path (e.g., `spawn run claude sprite --help`).
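
One way to detect such trailing flags is a scan over the full argument list before any provisioning starts. The sketch below is hypothetical (`detectTrailingFlag` is not a name from the codebase) and assumes `--` ends flag parsing.

```typescript
// Returns which informational flag appears anywhere in the arguments,
// or null when neither is present.
function detectTrailingFlag(args: readonly string[]): "help" | "version" | null {
  for (const arg of args) {
    if (arg === "--") break; // assumption: everything after `--` is positional
    if (arg === "--help") return "help";
    if (arg === "--version") return "version";
  }
  return null;
}

const afterPositionals = detectTrailingFlag(["claude", "sprite", "--help"]); // "help"
const noFlag = detectTrailingFlag(["claude", "sprite"]);                     // null
```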

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:29:48 -07:00
Ahmed Abushagur
39f62b8c75
fix(windows): use dirname() instead of unix-only regex for config paths (#2738)
The regex `configPath.replace(/\/[^/]+$/, "")` only matches forward
slashes, so on Windows (which uses backslashes) it returns the full
path unchanged. `mkdirSync` then creates `digitalocean.json` as a
directory, causing EISDIR on the next write.

Replace with `dirname()` from `node:path` which handles both separators.
Affects digitalocean.ts, hetzner.ts, and aws.ts (oauth.ts already used
dirname correctly).
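
The difference can be reproduced on any platform by calling the `win32` and `posix` variants of `node:path` explicitly. The paths below are made-up examples.

```typescript
import { posix, win32 } from "node:path";

// The old approach: strip the last forward-slash-delimited segment.
const stripLastSegment = (p: string): string => p.replace(/\/[^/]+$/, "");

const winPath = "C:\\Users\\me\\.config\\spawn\\digitalocean.json";
const unixPath = "/home/me/.config/spawn/digitalocean.json";

const regexOnWindows = stripLastSegment(winPath); // unchanged: no "/" to match
const dirnameOnWindows = win32.dirname(winPath);  // C:\Users\me\.config\spawn
const dirnameOnUnix = posix.dirname(unixPath);    // /home/me/.config/spawn
```

The plain `dirname` import resolves to the `win32` implementation on Windows and to `posix` elsewhere, so call sites do not need to pick a variant themselves.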

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: PR Reviewer <pr-reviewer@spawn>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-17 22:22:30 -07:00
A
800c446ca4
fix(security): resolve symlinks in prompt file validation to prevent bypass (#2744)
validatePromptFilePath used path.resolve() which only normalizes the
string but doesn't follow symlinks. An attacker could create a symlink
(e.g., innocent.txt -> ~/.ssh/id_rsa) to bypass sensitive path checks
and exfiltrate credentials. Now uses realpathSync() to canonicalize
the path before pattern matching.
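
The bypass and the fix can be illustrated in a throwaway directory. This is a sketch, not the real validatePromptFilePath: `looksSensitive` and `isAllowedPromptFile` are hypothetical, and the only "sensitive" pattern checked is a `.ssh` path segment.

```typescript
import { mkdirSync, mkdtempSync, realpathSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, resolve, sep } from "node:path";

// Hypothetical check: reject anything that lives under a `.ssh` directory.
function looksSensitive(p: string): boolean {
  return p.split(sep).includes(".ssh");
}

function isAllowedPromptFile(p: string): boolean {
  // realpathSync follows symlinks (and throws if the file does not exist);
  // resolve() would only normalize the string.
  return !looksSensitive(realpathSync(p));
}

// Demo: innocent.txt is a symlink to a file under .ssh.
const root = mkdtempSync(join(tmpdir(), "prompt-demo-"));
mkdirSync(join(root, ".ssh"));
const secret = join(root, ".ssh", "id_rsa");
writeFileSync(secret, "not a real key");
const link = join(root, "innocent.txt");
symlinkSync(secret, link);

const naiveVerdict = !looksSensitive(resolve(link)); // true: the symlink slips through
const realVerdict = isAllowedPromptFile(link);       // false: the target is under .ssh
```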

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:21:11 -07:00
A
1e190924bf
fix(aws): wait for public IP before returning from waitForInstance (#2746)
Lightsail can report state=running before assigning a public IP. Continue
polling until both state is running and IP is non-empty, preventing SSH
connection failures from an empty IP address.
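
The polling condition can be sketched like this. The `InstanceStatus` shape, the function name, and the interval and attempt defaults are assumptions for illustration; the real waitForInstance queries Lightsail directly.

```typescript
interface InstanceStatus {
  state: string;
  publicIp: string; // empty until an address has been assigned
}

// Polls until the instance is running AND has a public IP, then returns the IP.
async function waitForRunningWithIp(
  getStatus: () => Promise<InstanceStatus>,
  { intervalMs = 5000, maxAttempts = 60 } = {},
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { state, publicIp } = await getStatus();
    // Checking state alone is not enough: "running" can arrive before the IP.
    if (state === "running" && publicIp !== "") return publicIp;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  throw new Error("instance did not report a running state with a public IP in time");
}
```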

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:16:57 -07:00
A
1ac7b9a0d1
fix(hetzner): paginate SSH key and server list API calls to prevent truncation at 25 items (#2741)
The Hetzner API defaults to 25 items per page. Users with >25 SSH keys
would hit SSH lockout on server creation because the newly registered key
landed on page 2+ and was omitted from the ssh_keys payload.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-17 22:11:45 -07:00
A
f35696434a
fix(security): use writeFileSync for credential files — Bun.write ignores mode option (#2742)
Bun.write does not support the `mode` option, so credential config files
(Hetzner, DigitalOcean, AWS, OpenRouter) were created with 0644 permissions
instead of the intended 0600, exposing API tokens to other local users.

Switch to node:fs writeFileSync which correctly applies file permissions.
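
A minimal sketch of writing a credential file with owner-only permissions on a POSIX system. The helper name `writeCredentialFile` and the demo path are hypothetical, and the extra chmodSync call is an addition for illustration that the commit does not mention.

```typescript
import { chmodSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

function writeCredentialFile(path: string, contents: string): void {
  // `mode` is applied only when the file is created...
  writeFileSync(path, contents, { mode: 0o600 });
  // ...so tighten permissions explicitly in case the file already existed.
  chmodSync(path, 0o600);
}

// Demo in a throwaway directory.
const demoPath = join(mkdtempSync(join(tmpdir(), "cred-demo-")), "hetzner.json");
writeCredentialFile(demoPath, JSON.stringify({ token: "placeholder" }));
const demoMode = statSync(demoPath).mode & 0o777; // permission bits only
```

On POSIX systems `demoMode` comes out as 0o600, meaning the file is readable and writable by the owner only.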

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 22:09:36 -07:00