Five undefined variable references across three cloud modules caused
billing retry paths to silently fail:
- digitalocean: doToken, doDropletId, doServerIp → _state.token/dropletId/serverIp
- gcp: gcpProject → _state.project
- aws: instanceName → _state.instanceName
As a result, checkAccountStatus() and checkBillingEnabled() always
returned early, and billing retry saves used wrong or undefined values.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove `export` from `verifyOpenrouterKey` in shared/oauth.ts (only used internally)
- Remove `export` from `tcpCheck` in shared/ssh.ts (only used internally)
- Fix stale comment in commands/index.ts referencing non-existent `./commands.js`
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Tests were failing because getActiveServers() found real history
records in ~/.spawn/history.json, causing an extra p.select() call
that shifted the mock prompt index and made manifest.agents[agent]
resolve to undefined.
Set SPAWN_HOME to an isolated directory in beforeEach so tests
always see an empty history regardless of host state.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: update cloud picker prompt to "Pick your cloud"
The previous "Where should your agent run?" was vague. Simplify to
"Pick your cloud (type to filter)" for clarity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use "Select a cloud" for cloud picker prompt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picks UX improvements from #2321: simplifies cloud descriptions
to plain language, adds account/payment requirements upfront so users
know what they need before starting.
Fixes #2323
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: reorder auth flow and persist OpenRouter API key across retries
Two onboarding issues reported by users:
1. After DigitalOcean OAuth, the message said "OpenRouter authentication
in 5s..." but then a GitHub CLI prompt appeared first. Fix: move API
key acquisition immediately after cloud auth, before preProvision
hooks (which include the GitHub prompt). Remove the misleading 5s
delay message.
2. On retry after billing failure, DigitalOcean token was remembered but
the OpenRouter API key was lost (only stored in process.env). Fix:
persist the key to ~/.config/spawn/openrouter.json and load it on
subsequent runs, matching how cloud tokens are already persisted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add mode 0o700 to config dir and await saveOpenRouterKey
- Add mode: 0o700 to mkdirSync in saveOpenRouterKey to match other cloud
modules (aws, hetzner, digitalocean) and prevent directory permission leak
- Add missing await on saveOpenRouterKey(manualKey) to ensure manual API
keys persist to disk before the function returns
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
The runServerCapture function was defined in aws, hetzner, gcp, and
digitalocean modules but never called anywhere in the codebase. All
cloud modules use runServer (which streams to stderr) and the
CloudRunner interface only requires runServer, not runServerCapture.
Bump CLI version 0.15.14 → 0.15.15.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
New users don't know which SSH key to pick. Just use all discovered
keys silently (ed25519 sorted first). If none exist, generate one.
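A minimal sketch of the ordering step, assuming key paths are discovered
as plain strings and that a filename containing `ed25519` identifies the
key type. `orderSshKeys` is a hypothetical name, not taken from the
codebase.

```typescript
// Hypothetical helper: put ed25519 keys ahead of all other discovered keys.
// Assumes the key type can be read from the filename (e.g. id_ed25519).
function orderSshKeys(paths: string[]): string[] {
  const rank = (p: string): number => (p.includes("ed25519") ? 0 : 1);
  // Array.prototype.sort is stable, so keys of equal rank keep discovery order.
  return paths.slice().sort((a, b) => rank(a) - rank(b));
}
```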
Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
New users don't know what LLM models are, so prompting them to pick one
with no context is confusing, and openrouter/auto can route to weak
models. Remove the interactive model prompt entirely; agents use their
modelDefault silently (or the MODEL_ID env var for power users).
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Detect billing-related server creation errors, open the cloud's billing
page in the browser, and prompt the user to retry after adding a payment
method. Adds pre-flight account checks for DigitalOcean (account status)
and GCP (billing enabled).
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Auto-detect GitHub credentials (GITHUB_TOKEN env var or `gh auth token`)
instead of interactively asking users. Rename promptGithubAuth → detectGithubAuth.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(cli): show connect-or-create menu when existing spawns are present
When the user runs `spawn` with no arguments and has active servers in
history, display a top-level menu before jumping into the create flow:
    What would you like to do?
    ❯ Connect to existing server
      Create a new server
Selecting "Connect to existing server" opens the same interactive picker
as `spawn list` (activeServerPicker). Selecting "Create a new server" or
having no existing spawns continues with the current create flow, so
there is no behaviour change for first-time users.
Fixes #2308
Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* chore(cli): bump version to 0.15.14
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove tests that verify JavaScript language semantics rather than
application logic. These tests would pass even if the source code
were deleted:
- 18 isValidManifest tests (JS truthiness of null, 0, false, "", [])
- 7 matrixStatus edge cases (Object property lookup with hyphens,
underscores, empty strings, long keys)
- 5 agentKeys/cloudKeys ordering tests (Object.keys insertion order,
an ES2015 spec guarantee)
- 3 countImplemented tests (for-loop over 1000 items, single entry,
non-standard statuses)
Kept 17 tests that exercise real application behavior: cache corruption
recovery, HTTP error fallback, in-memory cache, fallback chains, and
countImplemented case-sensitivity.
Closes #2315
Agent: test-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Interactive picker: add blank separator line between entries so label
and subtitle are visually grouped (not blending into adjacent entries)
- Non-interactive table: wrap subtitle in pc.dim() for better contrast
with the bold entry name
- Update pickerHeight to account for added separator lines
Fixes #2309
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Three distinct E2E bugs fixed:
1. SSH key generation race condition: When multiple agents provision in
parallel, concurrent processes all call generateSshKey() and race to
create ~/.ssh/id_ed25519. ssh-keygen won't overwrite an existing file
(prompts on stdin which is "ignore"), causing zeroclaw/codex to fail
with "SSH key generation failed". Fix: check if key already exists
before generating, and re-check after a failed generation attempt.
2. Hetzner SSH key 409 uniqueness_error: The Hetzner API returns HTTP 409
with "SSH key not unique" when the same key content is registered under
a different name. The hetznerApi() function throws on non-2xx before
the error-parsing code runs, and the regex /already/ didn't match
"not unique". Fix: catch 409 in ensureSshKey() and match against
uniqueness_error/not unique/already patterns.
3. Hermes binary not found: The hermes install script (uv tool) creates
the actual binary + venv at ~/.hermes/hermes-agent/venv/ with a symlink
at ~/.local/bin/hermes. The tarball capture script only captured the
symlink + ~/.local/share/, leaving a dangling symlink. Fix: include
~/.hermes/ in capture paths, add venv/bin to verify.sh PATH check,
and update hermes launchCmd to include the venv PATH.
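Fixes 1 and 2 can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code in the repository:
`ensureKeyPresent` and `isDuplicateKeyResponse` are hypothetical names,
and the file check and ssh-keygen call are injected as plain callbacks.

```typescript
// Fix 1 (sketch): tolerate losing the key-generation race.
// `keyExists` and `generate` are injected stand-ins for the real file check
// and ssh-keygen call; both names are hypothetical.
function ensureKeyPresent(keyExists: () => boolean, generate: () => boolean): boolean {
  if (keyExists()) return true; // a parallel process already created the key
  if (generate()) return true; // this process created it
  // Generation failed: re-check, since another process may have won the race.
  return keyExists();
}

// Fix 2 (sketch): treat an HTTP 409 whose body matches any of the patterns
// named in the commit as "key already registered" rather than a hard failure.
function isDuplicateKeyResponse(status: number, body: string): boolean {
  return status === 409 && /uniqueness_error|not unique|already/.test(body);
}
```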
Fixes #2304
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Tests for getScriptFailureGuidance were failing when cloud credential
env vars (HCLOUD_TOKEN, DO_API_TOKEN) were set in the environment.
The tests expected these vars to appear as "missing" in the output,
but the setup only unset OPENROUTER_API_KEY. Now both the cloud-specific
var and OPENROUTER_API_KEY are saved/unset before each test.
Bump CLI version to 0.15.11.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
The Phase 2 SSH handshake loop in waitForSsh spawns SSH processes
without a per-process timeout. ConnectTimeout=10 only covers TCP
connect — if sshd accepts the connection but stalls during key
exchange or authentication, the process hangs indefinitely. This
causes the entire spawn command to freeze with no way to recover.
Add a 30s killWithTimeout guard to each probe, matching the pattern
already used in every cloud-specific runServer/uploadFile function.
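The guard can be pictured as a race between the process exit and a
timer. The sketch below is an assumption about the general shape only:
the real killWithTimeout signature is not shown here, and `waitOrKill`
and `KillableProcess` are hypothetical names.

```typescript
// Minimal process shape assumed by this sketch: an exit promise plus kill().
interface KillableProcess {
  exited: Promise<number>;
  kill(): void;
}

// Wait for the process to exit, or kill it once timeoutMs has elapsed.
async function waitOrKill(proc: KillableProcess, timeoutMs: number): Promise<number | "timeout"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => {
      proc.kill(); // stop the stalled probe so the caller can retry
      resolve("timeout");
    }, timeoutMs);
  });
  try {
    return await Promise.race([proc.exited, timedOut]);
  } finally {
    // Always clear the timer so a fast exit does not leave it pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

With the 30s budget described above, a probe would be awaited as
`waitOrKill(proc, 30_000)`.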
-- refactor/code-health
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
All four SSH-based uploadFile functions (Hetzner, DO, AWS, GCP) used
`await proc.exited` on SCP subprocesses without any timeout guard.
If SCP hangs due to a network issue, the CLI hangs indefinitely.
This adds the same killWithTimeout pattern already used by runServer
and runServerCapture in these same files: a 120-second timeout that
kills the SCP process if it stalls.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
De-export interfaces, types, and constants that are only used within
their own module files. These were exported but never imported by any
other module or test file, unnecessarily widening the public API surface.
Affected symbols:
- aws: AwsState, Region, REGIONS, AGENT_BUNDLE_DEFAULTS
- digitalocean: DigitalOceanState, DropletSize, DROPLET_SIZES, DoRegion, DO_REGIONS
- gcp: GcpState, MachineTypeTier, MACHINE_TYPES, ZoneOption, ZONES
- hetzner: HetznerState, ServerTypeTier, SERVER_TYPES, LocationOption, LOCATIONS
- sprite: SpriteState
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
The PKCE migration TODO referenced closed issue #2041. The TODO
itself is still valid (DigitalOcean still doesn't support PKCE),
so keep the migration checklist but drop the issue number.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
* refactor: remove commands.ts compatibility shim and fix stale references
- Delete packages/cli/src/commands.ts shim file (only re-exported commands/index.ts)
- Update index.ts to import directly from ./commands/index.js
- Update 24 test files to import from ../commands/index.js
- Fix stale CLAUDE.md reference to commands.ts
- Fix stale QA prompt references to commands.ts and wrong line numbers
- Bump CLI version to 0.15.8
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: remove stale references to deleted commands.ts compatibility shim
---------
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The v0 fallback path in loadHistory() returned the raw parsed JSON array
directly without validating individual elements. This could cause
TypeErrors (e.g. r.agent.toLowerCase() on undefined) in callers like
getActiveServers and filterHistory when corrupted entries exist.
Now filters each element through v.safeParse(SpawnRecordSchema, el),
matching the validation the v1 path already performs.
Fixes #2277
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: claude snapshot build — remove npm fallback from install command
The native install (curl | bash) succeeds but exits non-zero due to a
PATH warning. The || fallback then runs `npm install`, but npm is not
installed on the "minimal" tier, so the build exits with 127.
Fix: replace npm fallback with binary existence check (same pattern
as hermes agent). If install exits non-zero but ~/.local/bin/claude
exists, the build succeeds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: snapshot cleanup and lookup — use name prefix instead of tags
DO Packer builder `tags` only apply to the temporary build droplet,
not the resulting snapshot image. Both the workflow cleanup step and
the CLI's findSpawnSnapshot() were querying by `tag_name` which
returned nothing — old snapshots piled up and the CLI couldn't find
existing snapshots.
Fix: filter by snapshot name prefix (`spawn-{agent}-`) instead of
tags, in both the workflow and the CLI. Remove misleading `tags`
from the Packer template. Add test cases for name-prefix filtering.
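A rough sketch of name-prefix matching, for illustration only. The
`SnapshotInfo` shape and `pickSnapshotByPrefix` are hypothetical, and
returning the newest match is an assumption of this sketch rather than
documented behaviour of findSpawnSnapshot().

```typescript
// Assumed minimal shape of a snapshot entry; only `name` matters for matching.
interface SnapshotInfo {
  id: string;
  name: string;
  createdAt: string; // ISO 8601 timestamp (assumed)
}

// Hypothetical helper: keep snapshots named `spawn-{agent}-...` and return
// the newest one, or undefined when nothing matches.
function pickSnapshotByPrefix(snapshots: SnapshotInfo[], agent: string): SnapshotInfo | undefined {
  const prefix = `spawn-${agent}-`;
  return snapshots
    .filter((s) => s.name.startsWith(prefix))
    // ISO 8601 timestamps in the same format sort correctly as plain strings.
    .sort((a, b) => (a.createdAt < b.createdAt ? 1 : a.createdAt > b.createdAt ? -1 : 0))[0];
}
```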
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Remove 5 unused reset*State() exports (aws, hetzner, gcp, digitalocean,
sprite) that were never called anywhere in the codebase. Convert their
associated _state variables from let to const since they are no longer
reassigned.
Remove stale Daytona references in status.ts (comment and IP check)
left over after Daytona cloud provider removal in #2261.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Restores the nightly Packer snapshot build pipeline (reverted in #2205)
that pre-bakes agent images as DigitalOcean snapshots. When a snapshot
exists on the user's account, droplet boot skips cloud-init and tarball
install entirely — cutting provisioning from ~10min to ~2min.
- Add `packer/digitalocean.pkr.hcl` HCL2 template with multi-region
distribution, apt-lock wait, and snapshot marker
- Add `.github/workflows/packer-snapshots.yml` nightly build with
matrix strategy, auto-cleanup of old snapshots, and injection-safe
env var handling
- Add `findSpawnSnapshot()` to query DO API for pre-built snapshots
- Add `waitForSshOnly()` for snapshot boots (skip cloud-init wait)
- Modify `createServer()` to accept optional `snapshotId` param
- Wire snapshot detection in DO `main.ts` orchestrator
- Add `skipAgentInstall` to `CloudOrchestrator` interface to skip
tarball + install steps when booting from snapshot
- Add 5 unit tests for snapshot lookup (happy path, empty, error,
invalid ID, network failure)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The status command (PR #2254) added --prune and --json flags but did not
register them in KNOWN_FLAGS. This caused the CLI to reject them with
"Unknown flag" errors before the command could even dispatch.
Bump CLI version 0.15.4 -> 0.15.5.
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Simplify the cloud matrix by removing Daytona. All Daytona-specific code,
scripts, tests, and configuration have been removed. Daytona has been moved
to "Previously Considered" in the Cloud Provider Wishlist (#1183) and can
be revived on community demand.
Closes #2260
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #2249
The overly broad `>>? word` pattern and generic doubled-operator check
were blocking legitimate natural-language developer prompts like:
- "Fix the merge conflict >> registration flow"
- "Run tests && deploy if they pass"
Root cause: `validatePrompt` is called before the prompt is set as the
`SPAWN_PROMPT` env var. Inside double-quoted shell arguments, `>>` and
`&&` are not interpreted as shell operators, so blocking them provided
no real security benefit while creating confusing UX rejections.
Changes:
- Remove `/>>?\s*[a-zA-Z_]\w{2,}/` pattern (false-positive on >> in English)
- Remove generic `hasDoubledOperators` check (false-positive on && in English)
- Keep all targeted patterns: $(cmd), backticks, ${var}, | bash/sh,
; rm -rf, fd redirections, heredoc, process substitution, path redirects
- Update tests: split broad && / || tests into "commands" vs "natural language"
- Add tests asserting all issue #2249 example prompts are now accepted
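To illustrate the "targeted patterns only" approach, here is a small
sketch covering a subset of the kept patterns. The regular expressions
and the `findBlockedPattern` name are assumptions made for this example;
they are not the patterns used by `validatePrompt`.

```typescript
// Illustrative subset of targeted patterns; the exact regexes are assumed.
const BLOCKED_PATTERNS: { pattern: RegExp; reason: string }[] = [
  { pattern: /\$\(/, reason: "command substitution $(...)" },
  { pattern: /`/, reason: "backtick command substitution" },
  { pattern: /\$\{/, reason: "variable expansion ${...}" },
  { pattern: /\|\s*(bash|sh)\b/, reason: "pipe to shell" },
  { pattern: /;\s*rm\s+-rf\b/, reason: "chained rm -rf" },
];

// Return the reason for the first blocked pattern found, or null if the
// prompt is acceptable. Plain `>>` and `&&` are deliberately not checked.
function findBlockedPattern(prompt: string): string | null {
  for (const { pattern, reason } of BLOCKED_PATTERNS) {
    if (pattern.test(prompt)) return reason;
  }
  return null;
}
```

Under these assumed patterns, both example prompts from the issue pass,
while something like `echo $(whoami)` is still rejected.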
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixes #2252
history.json now uses a versioned envelope:
{ "version": 1, "records": [...] }
This creates a migration escape hatch for future SpawnRecord shape changes.
loadHistory() transparently reads both v0 (bare array) and v1 formats,
automatically migrating v0 files on next write. All write operations now
use writeHistory() to stamp the current schema version consistently.
Validation uses valibot schemas (VMConnectionSchema, SpawnRecordSchema,
HistoryFileV1Schema) so the structure is verified and typed without `as`
casts. Updated all affected tests to check data.records instead of data.
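A simplified sketch of reading both formats. It uses a hand-written type
guard as a stand-in for the valibot schemas, and the two-field
`SpawnRecord` shape, `readRecords`, and `toEnvelope` are assumptions for
illustration (the real record has more fields).

```typescript
// Assumed minimal record shape; the real SpawnRecordSchema has more fields.
interface SpawnRecord {
  agent: string;
  cloud: string;
}

// Plain type guard standing in for v.safeParse(SpawnRecordSchema, el).
function isSpawnRecord(el: unknown): el is SpawnRecord {
  if (typeof el !== "object" || el === null) return false;
  return "agent" in el && typeof el.agent === "string" && "cloud" in el && typeof el.cloud === "string";
}

// Hypothetical reader: accept v0 (bare array) or v1 ({ version: 1, records }).
// Invalid elements are dropped in both formats.
function readRecords(parsed: unknown): SpawnRecord[] {
  if (Array.isArray(parsed)) return parsed.filter(isSpawnRecord); // v0
  if (typeof parsed === "object" && parsed !== null) {
    if ("version" in parsed && parsed.version === 1 && "records" in parsed && Array.isArray(parsed.records)) {
      return parsed.records.filter(isSpawnRecord); // v1
    }
  }
  return []; // unknown shape: treated as empty history (assumption)
}

// Writes always use the envelope, which is what migrates v0 files over time.
function toEnvelope(records: SpawnRecord[]): { version: 1; records: SpawnRecord[] } {
  return { version: 1, records };
}
```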
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the `spawn status` command requested in #2253. The command:
- Reads active (non-deleted) cloud servers from history
- Queries Hetzner and DigitalOcean REST APIs in parallel using saved tokens
- Shows a live-state table: ID, Agent, Cloud, IP, State, Since
- States: running (green), stopped (yellow), gone (dim), unknown (dim)
- --prune flag marks gone servers as deleted in history
- --json flag outputs machine-readable JSON for scripting
- `spawn ps` is an alias for `spawn status`
Other clouds (AWS, GCP, Sprite, Daytona) require CLI auth flows that cannot
run non-interactively; they report "unknown" with a helpful hint.
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Each cloud module (aws, daytona, digitalocean, gcp, hetzner, sprite) previously
stored per-operation state in bare module-level `let` variables, making them
process-global singletons. This is safe for single-cloud CLI invocations today
but creates latent bugs for multi-cloud orchestration and test isolation.
Replace scattered `let` globals with a single typed `_state` object per module:
- `AwsState` / `resetAwsState()` — 8 fields including `selectedBundle`
- `DaytonaState` / `resetDaytonaState()` — 5 fields
- `DigitalOceanState` / `resetDigitalOceanState()` — 3 fields
- `GcpState` / `resetGcpState()` — 5 fields
- `HetznerState` / `resetHetznerState()` — 3 fields
- `SpriteState` / `resetSpriteState()` — 2 fields
Each module exports a `resetXxxState()` function for test isolation. No function
signatures or existing exports were changed.
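As an illustration, the DigitalOcean variant might look roughly like the
sketch below. The field names follow the digitalocean module's
`_state.token`, `_state.dropletId`, and `_state.serverIp`; the field
types and empty-string defaults are assumptions.

```typescript
// Field names follow the digitalocean module; types and defaults are assumed.
interface DigitalOceanState {
  token: string;
  dropletId: string;
  serverIp: string;
}

const initialState = (): DigitalOceanState => ({ token: "", dropletId: "", serverIp: "" });

// One typed object replaces three bare module-level `let` variables.
let _state: DigitalOceanState = initialState();

// Test-isolation hook: restore a clean state between test cases.
function resetDigitalOceanState(): void {
  _state = initialState();
}
```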
Fixes #2251
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: ARM tarball builds + arch-aware download
- Add ARM64 matrix entries for native binary agents (zeroclaw, opencode,
hermes, claude) in agent-tarballs.yml workflow
- Update agent-tarball.ts to detect remote VM arch via uname -m and
download the correct tarball (x86_64 or arm64)
- Change release strategy to support multiple arch assets per tag
- Document ARM build requirements in discovery.md for future agents
- Bump CLI version to 0.15.2
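A minimal sketch of the arch mapping, assuming the remote `uname -m`
output is already available as a string. `tarballArchFromUname` is a
hypothetical name, and returning null for unknown machines (so the
caller can fall back to a live install) is an assumption.

```typescript
type TarballArch = "x86_64" | "arm64";

// Map `uname -m` output to the tarball architecture label.
// Linux reports 64-bit ARM as "aarch64"; some systems report "arm64".
function tarballArchFromUname(unameOutput: string): TarballArch | null {
  const machine = unameOutput.trim();
  if (machine === "x86_64" || machine === "amd64") return "x86_64";
  if (machine === "aarch64" || machine === "arm64") return "arm64";
  return null; // unknown architecture: no pre-built tarball to pick
}
```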
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use sudo for tarball extraction on non-root SSH clouds
On AWS Lightsail, SSH connects as 'ubuntu' (not root), but tarballs
extract to /root/. Without sudo, tar fails with "Permission denied".
Conditionally use sudo when not running as root (id -u != 0).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Docker delivery is superseded by the tarball approach (#2232) which is
faster (curl|tar ~5-15s vs Docker install ~30s + pull ~60s) and works
on every cloud without Docker as a dependency.
- Remove tryInstallFromDocker, withDockerInstall, DOCKER_IMAGE_PREFIX
- Remove dockerImage and slowInstall from AgentConfig
- Remove Docker cloud-init from DigitalOcean
- Unwrap openclaw and zeroclaw to direct install (tarball is tried
first in orchestrate.ts, these are the fallback)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Remove parseJsonRaw from packages/shared — exported but never imported
- Remove dead re-exports from agent-setup.ts (AgentConfig type, generateEnvConfig)
that no consumer imports (all callers use the original modules directly)
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
- Replace require() calls with ESM import in history-spawn-id.test.ts
(require() violates ESM-only rule per shell-scripts.md)
- Fix stale parseJsonRaw reference in test README (cli parse.ts does
not export parseJsonRaw; only packages/shared does)
- Add 5 missing test file entries to test README
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Remove two tests from "sequential saves at the boundary" that were
exact duplicates of tests in the "MAX_HISTORY_ENTRIES trimming" section:
- "99 to 100 entries" duplicated "should keep all entries when at exactly 100"
- "100 to 101 entries" duplicated "should trim to 100 when adding entry that exceeds the limit"
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: pre-built agent tarballs on GitHub Releases for fast install
Adds a nightly GitHub Actions workflow that builds and uploads agent
tarballs to rolling GitHub Releases. During provisioning, the CLI now
attempts to download and extract a tarball before falling back to live
install. Priority chain: snapshot > tarball > live install.
- New workflow: .github/workflows/agent-tarballs.yml
- New capture script: packer/scripts/capture-agent.sh
- New module: packages/cli/src/shared/agent-tarball.ts
- Orchestrate tries tarball first on non-local clouds
- Skip tarball when using DO snapshot (skipTarball flag)
- Tests for tarball install + orchestration integration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use global.fetch mock pattern and address security review
- Use `global.fetch = mock(...)` instead of `spyOn(globalThis, "fetch")`
to match codebase convention and fix CI mock interception
- Add URL validation regex to reject shell metacharacters (CRITICAL)
- Add agent name validation in workflow input (MEDIUM)
- Add `jq has()` check before executing install commands (CRITICAL)
- Use `tar -T` instead of unquoted word-splitting in capture-agent.sh (MEDIUM)
- Resolve merge conflicts with upstream/main (keep Docker fields, adapt
to simplified DO flow, bump version to 0.15.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use globalThis.fetch for testability in CI
Bun's native fetch binding doesn't go through global.fetch property
lookup, so global.fetch = mock(...) doesn't intercept it. Using
globalThis.fetch explicitly ensures the mock interception works.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add missing packer dependencies and harden install command safety
- Add packer/agents.json (agent tier + install command definitions)
- Add packer/scripts/tier-{minimal,node,bun,full}.sh (dependency scripts)
- Add basic command safety check rejecting suspicious patterns
- Document packer/agents.json as a trust boundary requiring PR review
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): fix npm prefix mismatch, add apt-get update, cleanup
- Add apt-get update -y before apt-get install in all tier scripts
- Add --prefix ~/.npm-global to npm install commands in agents.json
so installed packages land where capture-agent.sh expects them
- Rename misleading MARKER_DIR → MARKER_FILE in capture-agent.sh
- Remove stale comment referencing packer snapshots in workflow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): detect empty agent installs in capture script
The "no files found" check was dead code — the marker file is always
created before filtering, so FILTERED_FILE always had at least one
entry. Now we count non-marker entries to catch cases where the agent
install silently fails and no actual files are on disk.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): use bare fetch() for Bun mock compatibility in CI
In Bun, global.fetch = mock(...) overrides bare fetch() calls but NOT
globalThis.fetch() calls. Every other source file in the codebase uses
bare fetch() and their mocks work fine in CI. Switch to match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): use dependency injection for fetch in tests
Bun's global.fetch mock doesn't reliably intercept bare fetch() calls
across all Bun versions in CI. Instead of fighting the runtime, accept
an optional fetchFn parameter (defaults to fetch) and pass mock fetch
directly in tests.
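The injection pattern can be sketched as below. `tarballAvailable` and
the narrowed `FetchFn` type are hypothetical; only the idea of an
optional fetchFn parameter defaulting to fetch comes from this commit.

```typescript
// Narrow function type covering only what this sketch needs from fetch.
type FetchFn = (url: string) => Promise<{ ok: boolean }>;

// Hypothetical probe: report whether a tarball URL is downloadable.
// fetchFn defaults to the global fetch; tests pass their own implementation.
async function tarballAvailable(url: string, fetchFn: FetchFn = fetch): Promise<boolean> {
  try {
    const res = await fetchFn(url);
    return res.ok;
  } catch {
    return false; // network failure is reported as "not available"
  }
}
```

A test can then call `tarballAvailable(url, async () => ({ ok: true }))`
without touching any global, which sidesteps runtime-specific mock
behaviour entirely.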
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): bypass mock.module bleed in agent-tarball tests
orchestrate.test.ts uses mock.module("../shared/agent-tarball", ...)
which is process-global in Bun and bleeds into agent-tarball.test.ts.
Import via URL (import.meta.url resolution) to bypass the specifier-
based mock matching and get the real module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): eliminate mock.module bleed between test files
Bun's mock.module is process-global — orchestrate.test.ts mocking
agent-tarball poisoned agent-tarball.test.ts (the mock function
ignored the fetchFn parameter and always returned false).
Fix: make tryTarballInstall injectable via OrchestrationOptions.
orchestrate.test.ts passes the mock directly via options instead
of using mock.module. agent-tarball.test.ts imports the real module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tests): mock Bun.which in credential priority tests
Tests assumed no cloud CLIs were installed, but machines with hcloud/
doctl would get "CLI installed" hint overrides, failing the assertion.
Spy on Bun.which to return null so tests are environment-independent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: fix import ordering after rebase
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: add curl domain allowlist and expand command blocklist
Addresses security review findings:
- Add domain allowlist for curl/wget targets (claude.ai, opencode.ai,
raw.githubusercontent.com, registry.npmjs.org, crates.io, github.com)
- Expand suspicious command blocklist (python -c, perl -e, ruby -e, dd, /dev/)
- Document 4-layer security model in workflow comments
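For illustration, an allowlist check over the domains listed above could
look like the sketch below. The actual check lives in the workflow and
may be implemented differently; `isAllowedDownloadUrl` is a hypothetical
name, and exact-hostname matching (no subdomains) is an assumption.

```typescript
// Domains listed in the commit message.
const ALLOWED_DOWNLOAD_HOSTS = new Set([
  "claude.ai",
  "opencode.ai",
  "raw.githubusercontent.com",
  "registry.npmjs.org",
  "crates.io",
  "github.com",
]);

// Hypothetical check: exact hostname match only, so look-alike hosts such as
// github.com.evil.example are rejected.
function isAllowedDownloadUrl(rawUrl: string): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname;
  } catch {
    return false; // not a parseable absolute URL
  }
  return ALLOWED_DOWNLOAD_HOSTS.has(host);
}
```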
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: add rm -rf to command blocklist
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Remove sh/e2e/aws-e2e.sh: dead backwards-compat wrapper with no
references (superseded by unified e2e.sh --cloud aws)
- Remove getStatusDescription from commands/shared.ts: defined and
tested but never called in production code
- Remove parseJsonRaw from packages/cli/src/shared/parse.ts: zero
production usages (still available in packages/shared if needed)
- Update corresponding test files to remove dead code tests
- Bump CLI version to 0.14.4
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: add unique spawn IDs to prevent history record corruption
History records were matched by heuristic ("most recent record for this
cloud without a connection"), which caused saveVmConnection and
saveLaunchCmd to overwrite the wrong record during concurrent or failed
spawns.
Fix: every SpawnRecord now has a unique `id` (UUID). All history
operations (saveVmConnection, saveLaunchCmd, removeRecord,
markRecordDeleted, mergeLastConnection) match by id when available,
falling back to the old heuristic for pre-migration records.
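The lookup order can be sketched as follows. `IdentifiedRecord` and
`findTargetIndex` are hypothetical names, the record shape is reduced to
the fields the lookup needs, and returning -1 when a given id is not
found (rather than guessing) is an assumption of this sketch.

```typescript
// Reduced record shape for this sketch; `id` is absent on legacy records.
interface IdentifiedRecord {
  id?: string;
  cloud: string;
  connection?: { ip: string };
}

// Find the record to update. Assumes records are stored oldest-first.
function findTargetIndex(records: IdentifiedRecord[], cloud: string, spawnId?: string): number {
  if (spawnId !== undefined) {
    // Preferred path: exact match on the unique spawn id.
    return records.findIndex((r) => r.id === spawnId);
  }
  // Legacy heuristic: most recent record for this cloud without a connection.
  for (let i = records.length - 1; i >= 0; i--) {
    const r = records[i];
    if (r !== undefined && r.cloud === cloud && r.connection === undefined) return i;
  }
  return -1;
}
```

With two concurrent spawns on the same cloud, the id path picks the
intended record, whereas the heuristic alone would always pick the later
one.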
The orchestrator (TS path) now creates the history record AFTER server
creation succeeds, not before — so failed provisions don't leave orphan
entries.
Also adds "Remove from history" option to the spawn ls action picker,
restoring the ability to soft-delete entries without destroying the VM.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add 18 unit tests for spawn ID history behavior
Tests cover:
- generateSpawnId returns unique UUIDs
- saveSpawnRecord auto-generates id when not provided
- saveVmConnection matches by spawnId (not heuristic)
- saveVmConnection does not cross-contaminate concurrent spawns
- saveVmConnection falls back to heuristic without spawnId
- saveLaunchCmd matches by spawnId (not heuristic)
- saveLaunchCmd falls back without spawnId
- removeRecord matches by id, not by timestamp+agent+cloud
- removeRecord handles duplicate timestamps correctly
- removeRecord falls back for legacy records without id
- markRecordDeleted targets correct record by id
- mergeLastConnection uses spawn_id from last-connection.json
- mergeLastConnection falls back to heuristic without spawn_id
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: enable biome import sorting with grouped imports
Adds organizeImports to biome assist config with groups:
1. Type imports
2. Node built-ins
3. Third-party packages
4. @openrouter/* packages
5. Aliases
Auto-fixed import order and lint issues across all TypeScript files,
including .claude/skills/ and packages/cli/src/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: Remove duplicate and theatrical tests
- Remove duplicate countImplemented empty-matrix test from
manifest-cache-lifecycle.test.ts (already covered in manifest.test.ts)
- Remove duplicate agentKeys/cloudKeys empty-manifest test from
manifest-cache-lifecycle.test.ts (already covered in manifest.test.ts)
- Consolidate gateway-resilience.test.ts from 9 identical startGateway()
invocations into 3 grouped tests, reducing redundant async setup overhead
while keeping the same assertion coverage (18 expects)
- Move stderrSpy.mockRestore() from each it() into afterEach() in
gateway-resilience.test.ts
-- qa/dedup-scanner
* test: Remove dead guards after expect(parsed.success).toBe(true) in icon-integrity
Replace v.safeParse + success-check + dead-return guard pattern with v.parse,
which throws on invalid input and removes 20 redundant expect() calls and 5
unreachable return statements across agent and cloud icon tests.
-- qa/dedup-scanner
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The reconnect path in connect.ts (cmdConnect and cmdEnterAgent) was
missing SSH key identity file opts (-i flags). Every cloud provider's
interactiveSession includes getSshKeyOpts(await ensureSshKeys()) but
the reconnect path omitted them, causing "Permission denied" failures
for users with non-default SSH key paths.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The isOAuthConfigured() function always returned true unconditionally,
making the two !isOAuthConfigured() guards in tryRefreshDoToken() and
tryDoOAuth() unreachable dead code. Remove the function and inline the
always-true behavior by dropping the dead branches entirely.
Bump CLI patch version to 0.14.1.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* test(e2e): add openclaw gateway kill/restart resilience test
Verifies that the openclaw gateway auto-restarts after being killed
with SIGKILL, validating the systemd Restart=always supervision.
The test runs as part of verify_openclaw:
1. Confirms gateway is listening on :18789
2. Kills it with SIGKILL (simulates a hard crash)
3. Waits up to 30s for systemd to auto-restart it
4. Verifies port 18789 comes back online
If the gateway isn't running (e.g. non-systemd env), the test is
skipped gracefully. On failure, dumps systemd status and gateway
logs for diagnostics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Revert "test(e2e): add openclaw gateway kill/restart resilience test"
This reverts commit 39b79d5c12.
* test: add unit tests for openclaw gateway resilience config
Verifies that startGateway() produces correct systemd and cron
configuration for auto-restart after a gateway crash:
- Restart=always and RestartSec=5 in the systemd unit
- Cron heartbeat checks port 18789 and restarts if dead
- Wrapper script sources .spawnrc and execs openclaw gateway
- Multiple port-check fallbacks (ss, /dev/tcp, nc)
- Non-systemd fallback to setsid/nohup
- 300s startup timeout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(e2e): add openclaw gateway kill/restart resilience test
Kills the gateway with SIGKILL during verify_openclaw and verifies
systemd Restart=always brings it back within 30s. Skips gracefully
on non-systemd environments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* feat(digitalocean): use Docker marketplace image for agent deployments
Use DigitalOcean's Docker marketplace image (docker-20-04) instead of
plain Ubuntu + installing Docker via cloud-init. Docker is pre-installed
so cloud-init only needs to `docker pull` the agent image.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use docker-22-04 marketplace image (Ubuntu 22.04)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* revert: back to docker-20-04 marketplace image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(digitalocean): use Docker marketplace image with SSH/UFW setup
The docker-20-04 marketplace image has Docker pre-installed but our
user_data replaces its default first-boot script. Add UFW allow for
SSH + sshd restart at the top of cloud-init to restore SSH access.
Skip Docker installation when using the marketplace image since it's
already available.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove SSH ForceCommand block from marketplace image
DO marketplace images ship with an SSH ForceCommand that blocks login
with "Please wait..." until the image's first-boot script removes it.
Since our user_data replaces that first-boot script, we must strip the
ForceCommand ourselves before sshd restarts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): don't provide user_data to Docker marketplace image
The Docker marketplace image (docker-20-04) has its own first-boot
process that removes the SSH ForceCommand and configures UFW. Providing
user_data conflicts with this and prevents SSH from ever becoming
accessible.
Instead, boot without user_data and run all setup (package install,
Node/bun, docker pull) via SSH after the marketplace image completes
its own initialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): use docker-22-04 marketplace image slug
The Docker marketplace image is based on Ubuntu 22.04, not 20.04.
docker-20-04 was causing SSH timeouts due to a deprecated first-boot process.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): revert to docker-20-04 slug (is actually Ubuntu 22.04)
DO API confirms docker-20-04 is the correct slug — it maps to
"Docker on Ubuntu 22.04". docker-22-04 is not a valid slug.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): use ubuntu + cloud-init Docker install instead of marketplace image
The Docker marketplace image (docker-20-04) has a slow first-boot
process (~90-180s before SSH opens). Using ubuntu-24-04-x64 with
Docker installed via cloud-init (get.docker.com) is faster end-to-end
because SSH opens in ~30-60s and Docker installs in parallel.
Cloud-init now installs Docker and starts docker pull in background
when an agentName is provided. tryInstallFromDocker() checks if the
image is ready at install time.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: wait for in-progress docker pull before extraction
The docker pull started during cloud-init runs in background (&).
If tryInstallFromDocker() runs before the pull completes, it falls
back to normal install unnecessarily. Now waits for any in-progress
docker pull process to finish before checking image availability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use nohup for background docker pull in cloud-init
The docker pull was backgrounded with bare & in the cloud-init script.
When the script exits after touching .cloud-init-complete, the
background process receives SIGHUP and gets killed. Using nohup
prevents this so the pull survives the script exit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* debug: add diagnostic output to tryInstallFromDocker
Temporary debug logging to diagnose why docker pull isn't available.
Also increased timeout from 60s to 120s.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: optimize provisioning — Docker only for slow agents, reorder cloud-init
- Only ZeroClaw (slow Rust build) gets Docker image extraction via
withDockerInstall + slowInstall flag
- Fast agents (claude, codex, openclaw, opencode, kilocode, hermes)
skip Docker entirely — their native install is faster than Docker overhead
- Reorder cloud-init: Docker install first, pull in background, then
apt-get/node/bun run in parallel with the pull
- Remove debug output from tryInstallFromDocker()
- Version bump to 0.14.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: poll for Docker image availability instead of relying on pgrep
The docker CLI process exits while dockerd continues pulling layers
internally. pgrep-based wait exited early, then the image check failed.
Now polls `docker images -q` every 5s for up to 5min until the image
actually appears. Also increases SSH timeout to 600s to match.
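The polling approach can be sketched generically as below. This is an
illustration, not the actual implementation: pollUntil is a hypothetical
helper, and the check passed to it would wrap the `docker images -q` call.

```typescript
// Call `check` repeatedly, waiting `intervalMs` between attempts, until it
// resolves to true or the next wait would run past `timeoutMs`.
async function pollUntil(
  check: () => Promise<boolean>,
  intervalMs: number,
  timeoutMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) {
      return true;
    }
    if (Date.now() + intervalMs > deadline) {
      return false; // give up: another wait would exceed the timeout
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage matching the numbers above (imageExists is hypothetical):
//   await pollUntil(imageExists, 5_000, 300_000);
```

Checking for the image itself, rather than for a client process, keeps the
wait tied to the condition the install step depends on.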
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: clear pre-existing zeroclaw config before onboard
Docker image extraction copies ~/.zeroclaw/config.toml from the image,
and that file already contains a [security] section. setupZeroclawConfig
then appends another [security] section, causing a TOML duplicate key error.
Fix: rm the old config before zeroclaw onboard generates a fresh one.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: re-add Docker image extraction for OpenClaw
OpenClaw benefits from Docker pre-pull since npm install is slower
than docker cp extraction. Add slowInstall + withDockerInstall back.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: sed zeroclaw config in-place instead of appending duplicate sections
zeroclaw onboard already generates [security] and [shell] sections.
Appending duplicate sections causes TOML parse errors. Now uses sed
to modify existing values in-place, with fallback to append if the
sections don't exist.
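The edit-in-place-else-append logic, sketched in TypeScript rather than the
sed used by the actual change. setTomlValue is a hypothetical helper, and
it handles only simple `key = value` lines under a `[section]` header:

```typescript
// Set `key = value` inside `[section]`. Edits the existing line when the
// section and key exist, inserts the key when only the section exists, and
// appends a new section as a fallback.
function setTomlValue(
  toml: string,
  section: string,
  key: string,
  value: string,
): string {
  const header = `[${section}]`;
  const lines = toml.split("\n");
  const start = lines.findIndex((line) => line.trim() === header);

  if (start === -1) {
    // Fallback: the section is missing, so appending cannot duplicate it.
    const prefix = toml.length === 0 || toml.endsWith("\n") ? toml : toml + "\n";
    return `${prefix}${header}\n${key} = ${value}\n`;
  }

  // The section ends at the next header line or at end of file.
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if ((lines[i] ?? "").trim().startsWith("[")) {
      end = i;
      break;
    }
  }

  for (let i = start + 1; i < end; i++) {
    const line = lines[i] ?? "";
    const eq = line.indexOf("=");
    if (eq !== -1 && line.slice(0, eq).trim() === key) {
      lines[i] = `${key} = ${value}`; // in-place edit
      return lines.join("\n");
    }
  }

  // Section exists but the key does not: add it right after the header.
  lines.splice(start + 1, 0, `${key} = ${value}`);
  return lines.join("\n");
}
```

Either way the output contains each section header at most once, which is
what avoids the duplicate-section parse error.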
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Adds cross-cloud flags for specifying zone/region and instance size
directly from the command line instead of env vars:
  spawn claude gcp --zone us-east1-b --size e2-standard-4
  spawn claude digitalocean --region lon1 --size s-4vcpu-8gb
  spawn claude hetzner --zone ash --size cx32
Each flag maps to the appropriate cloud-specific env var:
  --zone/--region       → GCP_ZONE, DO_REGION, HETZNER_LOCATION, AWS_DEFAULT_REGION
  --size/--machine-type → GCP_MACHINE_TYPE, DO_DROPLET_SIZE, HETZNER_SERVER_TYPE, LIGHTSAIL_BUNDLE
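One way the mapping could look, as a sketch only. The env var names come
from the list above; the cloud keys (notably "aws"), the flagsToEnv helper,
and the simple "flag value" argument parsing are assumptions:

```typescript
type Cloud = "gcp" | "digitalocean" | "hetzner" | "aws";

// Env var receiving the --zone / --region value for each cloud.
const ZONE_ENV: Record<Cloud, string> = {
  gcp: "GCP_ZONE",
  digitalocean: "DO_REGION",
  hetzner: "HETZNER_LOCATION",
  aws: "AWS_DEFAULT_REGION",
};

// Env var receiving the --size / --machine-type value for each cloud.
const SIZE_ENV: Record<Cloud, string> = {
  gcp: "GCP_MACHINE_TYPE",
  digitalocean: "DO_DROPLET_SIZE",
  hetzner: "HETZNER_SERVER_TYPE",
  aws: "LIGHTSAIL_BUNDLE",
};

// Translate "--flag value" pairs into the env vars for the chosen cloud.
function flagsToEnv(cloud: Cloud, args: string[]): Record<string, string> {
  const env: Record<string, string> = {};
  for (let i = 0; i < args.length - 1; i++) {
    const flag = args[i];
    const value = args[i + 1];
    if (value === undefined) {
      continue;
    }
    if (flag === "--zone" || flag === "--region") {
      env[ZONE_ENV[cloud]] = value;
      i++; // skip the consumed value
    } else if (flag === "--size" || flag === "--machine-type") {
      env[SIZE_ENV[cloud]] = value;
      i++; // skip the consumed value
    }
  }
  return env;
}
```

Treating --zone and --region as aliases lets the same command shape work on
every cloud regardless of which term that provider uses.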
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mockFailedFetch function in test-helpers.ts was never imported or
used by any test file. Removed to reduce dead code.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>