Instead of telling users to pipe through `spawn list | cat` to view their
spawn history, render the history table inline when no active connections
exist. The `| cat` workaround was needed because non-interactive mode skips
the picker; now interactive mode falls through to renderListTable directly,
consistent with what `spawn list | cat` was already doing.
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
AWS and GCP both include $HOME/.npm-global/bin and $HOME/.claude/local/bin in the
PATH exported before running remote commands. Hetzner and DO were missing these two
entries, causing "command not found" errors for Claude Code and npm-global packages
on those clouds.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
status.ts passed server_id from history directly into Hetzner/DO API
URLs without calling validateServerIdentifier(). Both delete.ts and
connect.ts validate first; status.ts was the only gap. A tampered
~/.spawn/history.json could craft a server_id with path traversal
characters (e.g. "../v2/account") causing the Bearer token to be
sent to an unintended API endpoint (SSRF via URL path manipulation).
Fix: call validateServerIdentifier() after extracting serverId,
returning "unknown" gracefully on failure.
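As a sketch of the pattern (validateServerIdentifier() exists in the codebase, but this regex, the record shape, and the URL are illustrative assumptions, not the real implementation):

```typescript
// Hypothetical sketch: reject any identifier that could rewrite the URL path.
function validateServerIdentifier(id: string): boolean {
  // Cloud server IDs are plain alphanumerics/dashes; "../", slashes, and
  // encoded characters would let a tampered history entry redirect the call.
  return /^[A-Za-z0-9-]+$/.test(id);
}

function serverStatusUrl(record: { server_id?: unknown }): string | null {
  const serverId = typeof record.server_id === "string" ? record.server_id : "";
  if (!validateServerIdentifier(serverId)) {
    return null; // caller reports "unknown" instead of hitting the API
  }
  return `https://api.example.invalid/v1/servers/${serverId}`;
}
```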
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds sprite-keep-running support so sprites stay alive during long
agent sessions instead of shutting down due to inactivity.
- Add installSpriteKeepAlive() to sprite/sprite.ts: downloads and
installs the sprite-keep-running script (~/.local/bin) on the sprite
during setup. Non-fatal: logs a warning if download fails so
deployment still proceeds.
- Modify interactiveSession() to wrap the session command in a temp
script (base64-encoded to handle multi-line restart loops) and exec
it via sprite-keep-running if available, with plain bash fallback.
- Call installSpriteKeepAlive() in sprite/main.ts createServer() step
after setupShellEnvironment(), applying to all Sprite agents.
- Add sprite-keep-alive.test.ts: 11 unit tests covering download URL,
install path, error resilience, session script structure, and
keep-alive wrapper inclusion.
Fixes #2424

Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Move all filesystem path helpers (getUserHome, getSpawnDir, getHistoryPath,
getSpawnCloudConfigPath, getCacheDir, getCacheFile, getUpdateFailedPath,
getSshDir, getTmpDir) into a single shared/paths.ts module. This eliminates
scattered homedir()/process.env.HOME patterns across 8+ files and provides
a single import source for all path resolution.
- Create packages/cli/src/shared/paths.ts with 9 exported functions
- Update 17 source files to import from paths.ts
- Add re-exports in ui.ts and history.ts for backward compatibility
- Remove direct homedir() imports from gcp, sprite, local, ssh-keys, etc.
- Add comprehensive unit tests in paths.test.ts
- Bump CLI version to 0.15.34
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bun's os.homedir() reads from getpwuid() and ignores runtime changes to
process.env.HOME. Named imports capture the native function binding, so
patching os.homedir on the default export doesn't propagate. This caused
all test files using homedir() to write .spawn-test-* dirs to the real
home directory instead of the preload sandbox.
Add getUserHome() helper to shared/ui.ts that prefers process.env.HOME,
replace all direct homedir() calls in production and test code.
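A minimal sketch of the helper (the real one lives in shared/ui.ts; this version is illustrative):

```typescript
import { homedir } from "node:os";

// Prefer process.env.HOME so a test preload can redirect writes into a
// sandbox; Bun's native homedir() reads getpwuid() and ignores runtime
// changes to the environment.
function getUserHome(): string {
  return process.env.HOME || homedir();
}
```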
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidates duplicate server naming logic from 5 cloud modules into shared utilities in src/shared/ui.ts. No behavioral changes - purely structural refactor.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* refactor: restore @openrouter/spawn-shared workspace package
Restore packages/shared/ as canonical location for parse.ts, result.ts,
and type-guards.ts. CLI shared files become thin re-exports, preserving
all existing import paths. SPA imports switch from fragile relative paths
to the workspace package.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sort exports in shared package barrel to satisfy biome
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sort SPA imports to satisfy biome organizeImports
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The orchestrate test suite called runOrchestration (which internally
calls saveSpawnRecord) without setting SPAWN_HOME to a temp directory.
Every test run wrote ~20 fake records into the user's real history,
eventually filling it with 100 connectionless "testagent" entries
and wiping all real spawn history.
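The isolation fix can be sketched like this, assuming the code resolves its data directory from a SPAWN_HOME env var (an assumption about this codebase; helper names are illustrative):

```typescript
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

let savedHome: string | undefined;
let sandbox = "";

// Point SPAWN_HOME at a fresh temp dir so saveSpawnRecord writes there.
function setupSandbox(): void {
  savedHome = process.env.SPAWN_HOME;
  sandbox = mkdtempSync(join(tmpdir(), "spawn-test-"));
  process.env.SPAWN_HOME = sandbox;
}

// Delete the temp dir and restore the original env var.
function teardownSandbox(): void {
  rmSync(sandbox, { recursive: true, force: true });
  if (savedHome === undefined) delete process.env.SPAWN_HOME;
  else process.env.SPAWN_HOME = savedHome;
}
```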
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: graceful recovery from corrupted history.json
- Atomic writes (write to .tmp, rename into place) to prevent corruption
- Backup corrupted files with .corrupt suffix before discarding
- Per-record salvaging: if some v1 records are malformed, keep the valid ones
- Archive recovery: when history.json is corrupted, try loading from archives
- Stderr warnings when corruption is detected or records are recovered
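The atomic-write piece can be sketched as follows (illustrative names; the real history.ts code differs):

```typescript
import { readFileSync, renameSync, writeFileSync } from "node:fs";

// Write the full payload to a sibling .tmp file, then rename it into place.
// rename() within one filesystem is atomic, so readers never observe a
// half-written history.json.
function atomicWriteJson(path: string, value: unknown): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, JSON.stringify(value, null, 2));
  renameSync(tmp, path); // swap in the complete file in one step
}

// Companion read with graceful failure, mirroring the recovery behavior.
function readJsonSafe(path: string): unknown {
  try {
    return JSON.parse(readFileSync(path, "utf8"));
  } catch {
    return null; // corrupted or missing: caller falls back to recovery
  }
}
```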
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: replace try/catch with Result tryCatch wrapper in history.ts
Add tryCatch() to shared/result.ts and use it throughout history.ts to
eliminate all 7 try/catch blocks. Errors are now handled via Result
pattern matching instead of exception control flow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: A <258483684+la14-1@users.noreply.github.com>
The TTY key loop treated explicit user cancellation (ESC/Ctrl-C) the same
as a TTY failure — both called fallback() which renders a numbered-list
picker. Now the key loop distinguishes between the two: cancel() exits
cleanly, fallback() is only used when /dev/tty is unavailable.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The DO default size was s-2vcpu-4gb, which isn't available in nyc3 and
caused 422 errors. Changed to s-2vcpu-2gb to match manifest.json. Also
aligned the Hetzner default location from nbg1 to fsn1 to match the manifest.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
On headless VMs there's no Chrome extension to attach to. Setting
defaultProfile to "openclaw" tells OpenClaw to launch and manage
the browser itself via CDP instead of waiting for an extension relay.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The `_maxAttempts` parameter in both Hetzner and DigitalOcean's
`waitForCloudInit()` was silently ignored — loop bounds and early-exit
checks were hardcoded. Rename to `maxAttempts` and use it consistently,
matching the AWS/GCP implementations.
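The corrected loop looks roughly like this (checkCloudInit is a stand-in for the real per-cloud probe; the signature is illustrative):

```typescript
// Poll until cloud-init reports done, honoring maxAttempts instead of a
// hardcoded bound, with an early exit on success.
async function waitForCloudInit(
  checkCloudInit: () => Promise<boolean>,
  maxAttempts = 30,
  delayMs = 2000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await checkCloudInit()) return true; // early exit on success
    if (attempt < maxAttempts) await new Promise((r) => setTimeout(r, delayMs));
  }
  return false; // caller decides whether to fail the provision
}
```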
Fixes #2378
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- wget is not available on many cloud VMs; use curl instead
- Remove 2>/dev/null from dpkg/apt so install errors are visible
- Capture /usr/bin/google-chrome-stable in tarball (actual .deb binary name)
- Use curl in packer/agents.json tarball build too
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* fix: use Google Chrome .deb instead of Playwright for OpenClaw browser
Snap Chromium on Ubuntu 24.04 fails because AppArmor confinement blocks
CDP control. OpenClaw's own docs recommend installing Google Chrome via
.deb package which bypasses snap entirely.
Also adds browser.noSandbox and browser.executablePath to the OpenClaw
config so the browser tool works out of the box on Linux VMs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove unnecessary confirmation prompt when OAuth fails
If OAuth didn't complete, the user obviously wants to paste a key.
The "Paste your API key manually? (Y/n)" prompt was pointless friction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove unnecessary "Continue anyway?" credential confirmation
If the user selected a cloud, they obviously want to continue.
The warning + setup guidance is sufficient — no need to block on a confirm.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: move Chrome install to configure step so it runs after tarball
The tarball path skips agent.install() entirely, so Chrome never got
installed. Moving it to configure() (setupOpenclawConfig) ensures it
always runs regardless of install method.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: bundle Google Chrome in openclaw tarball
Add Chrome .deb install to openclaw's tarball build so it ships
pre-installed. Capture /usr/bin/google-chrome and /opt/google/chrome/
in the tarball. Add dl.google.com to the workflow domain allowlist.
The configure() step still has a fallback install with idempotency
check (command -v google-chrome) for non-tarball installs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use openclaw config set for browser setup + correct binary name
- Use `google-chrome-stable` (actual .deb binary name) not `google-chrome`
- Set browser config via `openclaw config set` CLI (the supported way)
instead of writing JSON directly which wasn't being picked up
- Remove browser section from JSON config to avoid conflicts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Ubuntu 24.04 replaced chromium-browser with a snap redirect that fails
on cloud VMs without snapd. Playwright's bundled Chromium is
self-contained (~170MB), works headless, and has no snap dependency.
Installed as a non-fatal post-install step — if it fails, the agent
still works but without browser capabilities.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The startup script temp file was cleaned up immediately after the first
gcloud call, but the billing retry path re-used the same args array
referencing that file. This meant billing retries always failed with a
file-not-found error. Move cleanup to a try/finally block that runs
after all retry paths. Also add randomness and mode 0o600 to the temp
file path.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Move the PkgVersionSchema (v.object({ version: v.string() })) from its
duplicate definitions in commands/shared.ts and update-check.ts into the
shared parse module. Both consumers now import from the single source.
Bump CLI version to 0.15.22.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Replace 4 inline `err instanceof Error ? err.message : String(err)`
patterns in aws.ts, digitalocean.ts, and hetzner.ts with the shared
getErrorMessage() helper. The shared helper uses duck-typing which is
more robust across realms/prototypes than instanceof checks.
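A sketch of the duck-typed helper (the shared implementation's exact behavior may differ):

```typescript
// Duck-typing survives errors thrown across realms or with foreign
// prototypes, where `err instanceof Error` would return false.
function getErrorMessage(err: unknown): string {
  if (typeof err === "object" && err !== null && "message" in err) {
    const msg = (err as { message: unknown }).message;
    if (typeof msg === "string") return msg;
  }
  return String(err);
}
```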
Export OAUTH_CSS from shared/oauth.ts and import it in
digitalocean/digitalocean.ts instead of duplicating the 250+ char
CSS string.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
hasMessage was exported from shared/type-guards.ts but never imported
outside of its own test file. getErrorMessage already covers the
message-extraction use case. Remove the dead function and its tests.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Moves getErrorMessage to zero-dep shared module, eliminating 13 inline
copies and 2 hasMessage variant sites across the codebase.
Fixes #2341
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
PR #2335 fixed this bug in digitalocean.ts, gcp.ts, and aws.ts but
missed hetzner.ts. The billing retry block assigned serverId/serverIp
to undefined local variables (hetznerServerId, hetznerServerIp) instead
of _state.serverId / _state.serverIp, so the retry always threw
"Server creation failed" even when the API call succeeded. This also
adds the missing saveVmConnection() call in the retry success path so
the VM is recorded in spawn history.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: update cloud picker prompt to "Pick your cloud"
The previous "Where should your agent run?" was vague. Simplify to
"Pick your cloud (type to filter)" for clarity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use "Select a cloud" for cloud picker prompt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: reorder auth flow and persist OpenRouter API key across retries
Two onboarding issues reported by users:
1. After DigitalOcean OAuth, the message said "OpenRouter authentication
in 5s..." but then a GitHub CLI prompt appeared first. Fix: move API
key acquisition immediately after cloud auth, before preProvision
hooks (which include the GitHub prompt). Remove the misleading 5s
delay message.
2. On retry after billing failure, DigitalOcean token was remembered but
the OpenRouter API key was lost (only stored in process.env). Fix:
persist the key to ~/.config/spawn/openrouter.json and load it on
subsequent runs, matching how cloud tokens are already persisted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add mode 0o700 to config dir and await saveOpenRouterKey
- Add mode: 0o700 to mkdirSync in saveOpenRouterKey to match other cloud
modules (aws, hetzner, digitalocean) and prevent directory permission leak
- Add missing await on saveOpenRouterKey(manualKey) to ensure manual API
keys persist to disk before the function returns
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Detect billing-related server creation errors, open the cloud's billing
page in the browser, and prompt the user to retry after adding a payment
method. Adds pre-flight account checks for DigitalOcean (account status)
and GCP (billing enabled).
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* feat(cli): show connect-or-create menu when existing spawns are present
When the user runs `spawn` with no arguments and has active servers in
history, display a top-level menu before jumping into the create flow:
What would you like to do?
❯ Connect to existing server
Create a new server
Selecting "Connect to existing server" opens the same interactive picker
as `spawn list` (activeServerPicker). Selecting "Create a new server" or
having no existing spawns continues with the current create flow, so
there is no behaviour change for first-time users.
Fixes #2308
Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* chore(cli): bump version to 0.15.14
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Interactive picker: add blank separator line between entries so label
and subtitle are visually grouped (not blending into adjacent entries)
- Non-interactive table: wrap subtitle in pc.dim() for better contrast
with the bold entry name
- Update pickerHeight to account for added separator lines
Fixes #2309
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Three distinct E2E bugs fixed:
1. SSH key generation race condition: When multiple agents provision in
parallel, concurrent processes all call generateSshKey() and race to
create ~/.ssh/id_ed25519. ssh-keygen won't overwrite an existing file
(prompts on stdin which is "ignore"), causing zeroclaw/codex to fail
with "SSH key generation failed". Fix: check if key already exists
before generating, and re-check after a failed generation attempt.
2. Hetzner SSH key 409 uniqueness_error: The Hetzner API returns HTTP 409
with "SSH key not unique" when the same key content is registered under
a different name. The hetznerApi() function throws on non-2xx before
the error-parsing code runs, and the regex /already/ didn't match
"not unique". Fix: catch 409 in ensureSshKey() and match against
uniqueness_error/not unique/already patterns.
3. Hermes binary not found: The hermes install script (uv tool) creates
the actual binary + venv at ~/.hermes/hermes-agent/venv/ with a symlink
at ~/.local/bin/hermes. The tarball capture script only captured the
symlink + ~/.local/share/, leaving a dangling symlink. Fix: include
~/.hermes/ in capture paths, add venv/bin to verify.sh PATH check,
and update hermes launchCmd to include the venv PATH.
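The race-tolerant check from fix 1 can be sketched like this (generateKey stands in for the real ssh-keygen invocation; names are illustrative):

```typescript
import { existsSync } from "node:fs";

// Check for an existing key before generating, and re-check after a failed
// attempt: a concurrent process may have created the key between our check
// and ssh-keygen refusing to overwrite it.
async function ensureSshKey(
  keyPath: string,
  generateKey: () => Promise<boolean>,
): Promise<boolean> {
  if (existsSync(keyPath)) return true; // another process already won the race
  const ok = await generateKey();
  return ok || existsSync(keyPath);
}
```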
Fixes #2304
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Tests for getScriptFailureGuidance were failing when cloud credential
env vars (HCLOUD_TOKEN, DO_API_TOKEN) were set in the environment.
The tests expected these vars to appear as "missing" in the output,
but only unset OPENROUTER_API_KEY. Now both the cloud-specific var
and OPENROUTER_API_KEY are saved/unset before each test.
Bump CLI version to 0.15.11.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
The Phase 2 SSH handshake loop in waitForSsh spawns SSH processes
without a per-process timeout. ConnectTimeout=10 only covers TCP
connect — if sshd accepts the connection but stalls during key
exchange or authentication, the process hangs indefinitely. This
causes the entire spawn command to freeze with no way to recover.
Add a 30s killWithTimeout guard to each probe, matching the pattern
already used in every cloud-specific runServer/uploadFile function.
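The guard pattern looks roughly like this, assuming a Bun-style subprocess with an `exited` promise and a `kill()` method (the interface here is a stand-in, not the project's actual helper):

```typescript
interface ProcLike {
  exited: Promise<number>;
  kill(): void;
}

// Race the process exit against a timer; on timeout, kill the process so a
// stalled SSH handshake can't hang the whole command.
async function killWithTimeout(proc: ProcLike, timeoutMs: number): Promise<number> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      proc.kill();
      reject(new Error(`process timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([proc.exited, timeout]);
  } finally {
    clearTimeout(timer); // don't leave a live timer after a normal exit
  }
}
```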
-- refactor/code-health
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
All four SSH-based uploadFile functions (Hetzner, DO, AWS, GCP) used
`await proc.exited` on SCP subprocesses without any timeout guard.
If SCP hangs due to a network issue, the CLI hangs indefinitely.
This adds the same killWithTimeout pattern already used by runServer
and runServerCapture in these same files: a 120-second timeout that
kills the SCP process if it stalls.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The v0 fallback path in loadHistory() returned the raw parsed JSON
array without validating individual elements. This could cause
TypeErrors (e.g. r.agent.toLowerCase() on undefined) in callers like
getActiveServers and filterHistory when corrupted entries exist.
Now filters each element through v.safeParse(SpawnRecordSchema, el),
matching the validation the v1 path already performs.
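The filtering step can be sketched as follows, with a hand-rolled guard standing in for v.safeParse(SpawnRecordSchema, el) from the real code (the record fields shown are assumptions):

```typescript
interface SpawnRecord {
  agent: string;
  cloud: string;
}

// Stand-in for the valibot schema check: keep only well-formed elements.
function isSpawnRecord(el: unknown): el is SpawnRecord {
  return (
    typeof el === "object" &&
    el !== null &&
    typeof (el as SpawnRecord).agent === "string" &&
    typeof (el as SpawnRecord).cloud === "string"
  );
}

function loadV0History(raw: unknown): SpawnRecord[] {
  if (!Array.isArray(raw)) return [];
  // Drop malformed entries instead of letting them surface later as
  // TypeErrors in callers like getActiveServers and filterHistory.
  return raw.filter(isSpawnRecord);
}
```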
Fixes #2277
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: claude snapshot build — remove npm fallback from install command
The native install (curl | bash) succeeds but exits non-zero due to a
PATH warning. The || fallback then tries `npm install` which doesn't
exist on the "minimal" tier → exit 127.
Fix: replace npm fallback with binary existence check (same pattern
as hermes agent). If install exits non-zero but ~/.local/bin/claude
exists, the build succeeds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: snapshot cleanup and lookup — use name prefix instead of tags
DO Packer builder `tags` only apply to the temporary build droplet,
not the resulting snapshot image. Both the workflow cleanup step and
the CLI's findSpawnSnapshot() were querying by `tag_name` which
returned nothing — old snapshots piled up and the CLI couldn't find
existing snapshots.
Fix: filter by snapshot name prefix (`spawn-{agent}-`) instead of
tags, in both the workflow and the CLI. Remove misleading `tags`
from the Packer template. Add test cases for name-prefix filtering.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: remove packages/shared, deduplicate with packages/cli/src/shared
packages/shared duplicated packages/cli/src/shared (parse.ts, result.ts,
type-guards.ts) with the CLI never importing from the shared package.
The only consumer was .claude/skills/setup-spa, which now imports directly
from packages/cli/src/shared via relative paths.
- Delete packages/shared entirely
- Update setup-spa imports to use relative paths to CLI shared
- Remove @openrouter/spawn-shared workspace dependency from setup-spa
- Update CLAUDE.md and type-safety.md references
Agent: complexity-hunter
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: remove packages/shared from lint workflow, fix import sorting
The Biome Lint CI step referenced packages/shared/src/ which no longer
exists after this PR removes the package. Also fix import ordering in
setup-spa files to satisfy Biome's organizeImports rule.
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: address Devin review — update stale packages/shared references
- Update type-safety.md line 67: packages/shared/src/parse.ts → packages/cli/src/shared/parse.ts
- Update install.ps1 sparse-checkout: remove packages/shared reference
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The status command (PR #2254) added --prune and --json flags but did not
register them in KNOWN_FLAGS. This caused the CLI to reject them with
"Unknown flag" errors before the command could even dispatch.
Bump CLI version 0.15.4 -> 0.15.5.
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Simplify the cloud matrix by removing Daytona. All Daytona-specific code,
scripts, tests, and configuration have been removed. Daytona has been moved
to "Previously Considered" in the Cloud Provider Wishlist (#1183) and can
be revived on community demand.
Closes #2260
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #2249
The overly broad `>>? word` pattern and generic doubled-operator check
were blocking legitimate natural-language developer prompts like:
- "Fix the merge conflict >> registration flow"
- "Run tests && deploy if they pass"
Root cause: `validatePrompt` is called before the prompt is set as the
`SPAWN_PROMPT` env var. Inside double-quoted shell arguments, `>>` and
`&&` are not interpreted as shell operators, so blocking them provided
no real security benefit while creating confusing UX rejections.
Changes:
- Remove `/>>?\s*[a-zA-Z_]\w{2,}/` pattern (false-positive on >> in English)
- Remove generic `hasDoubledOperators` check (false-positive on && in English)
- Keep all targeted patterns: $(cmd), backticks, ${var}, | bash/sh,
; rm -rf, fd redirections, heredoc, process substitution, path redirects
- Update tests: split broad && / || tests into "commands" vs "natural language"
- Add tests asserting all issue #2249 example prompts are now accepted
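An illustrative sketch of the resulting check: the targeted patterns stay, the broad `>>`/`&&` checks are gone (the real pattern list is longer than shown here):

```typescript
// Targeted injection patterns only; plain >> and && in natural language pass.
const BLOCKED_PATTERNS: RegExp[] = [
  /\$\([^)]*\)/,        // $(cmd) command substitution
  /`[^`]*`/,            // backtick substitution
  /\|\s*(?:bash|sh)\b/, // piping into a shell
  /;\s*rm\s+-rf\b/,     // chained destructive command
];

function validatePrompt(prompt: string): boolean {
  return !BLOCKED_PATTERNS.some((p) => p.test(prompt));
}
```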
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* feat: ARM tarball builds + arch-aware download
- Add ARM64 matrix entries for native binary agents (zeroclaw, opencode,
hermes, claude) in agent-tarballs.yml workflow
- Update agent-tarball.ts to detect remote VM arch via uname -m and
download the correct tarball (x86_64 or arm64)
- Change release strategy to support multiple arch assets per tag
- Document ARM build requirements in discovery.md for future agents
- Bump CLI version to 0.15.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use sudo for tarball extraction on non-root SSH clouds
On AWS Lightsail, SSH connects as 'ubuntu' (not root), but tarballs
extract to /root/. Without sudo, tar fails with "Permission denied".
Conditionally use sudo when not running as root (id -u != 0).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Docker delivery is superseded by the tarball approach (#2232) which is
faster (curl|tar ~5-15s vs Docker install ~30s + pull ~60s) and works
on every cloud without Docker as a dependency.
- Remove tryInstallFromDocker, withDockerInstall, DOCKER_IMAGE_PREFIX
- Remove dockerImage and slowInstall from AgentConfig
- Remove Docker cloud-init from DigitalOcean
- Unwrap openclaw and zeroclaw to direct install (tarball is tried
first in orchestrate.ts, these are the fallback)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: pre-built agent tarballs on GitHub Releases for fast install
Adds a nightly GitHub Actions workflow that builds and uploads agent
tarballs to rolling GitHub Releases. During provisioning, the CLI now
attempts to download and extract a tarball before falling back to live
install. Priority chain: snapshot > tarball > live install.
- New workflow: .github/workflows/agent-tarballs.yml
- New capture script: packer/scripts/capture-agent.sh
- New module: packages/cli/src/shared/agent-tarball.ts
- Orchestrate tries tarball first on non-local clouds
- Skip tarball when using DO snapshot (skipTarball flag)
- Tests for tarball install + orchestration integration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use global.fetch mock pattern and address security review
- Use `global.fetch = mock(...)` instead of `spyOn(globalThis, "fetch")`
to match codebase convention and fix CI mock interception
- Add URL validation regex to reject shell metacharacters (CRITICAL)
- Add agent name validation in workflow input (MEDIUM)
- Add `jq has()` check before executing install commands (CRITICAL)
- Use `tar -T` instead of unquoted word-splitting in capture-agent.sh (MEDIUM)
- Resolve merge conflicts with upstream/main (keep Docker fields, adapt
to simplified DO flow, bump version to 0.15.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use globalThis.fetch for testability in CI
Bun's native fetch binding doesn't go through global.fetch property
lookup, so global.fetch = mock(...) doesn't intercept it. Using
globalThis.fetch explicitly ensures the mock interception works.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add missing packer dependencies and harden install command safety
- Add packer/agents.json (agent tier + install command definitions)
- Add packer/scripts/tier-{minimal,node,bun,full}.sh (dependency scripts)
- Add basic command safety check rejecting suspicious patterns
- Document packer/agents.json as a trust boundary requiring PR review
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): fix npm prefix mismatch, add apt-get update, cleanup
- Add apt-get update -y before apt-get install in all tier scripts
- Add --prefix ~/.npm-global to npm install commands in agents.json
so installed packages land where capture-agent.sh expects them
- Rename misleading MARKER_DIR → MARKER_FILE in capture-agent.sh
- Remove stale comment referencing packer snapshots in workflow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): detect empty agent installs in capture script
The "no files found" check was dead code — the marker file is always
created before filtering, so FILTERED_FILE always had at least one
entry. Now we count non-marker entries to catch cases where the agent
install silently fails and no actual files are on disk.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): use bare fetch() for Bun mock compatibility in CI
In Bun, global.fetch = mock(...) overrides bare fetch() calls but NOT
globalThis.fetch() calls. Every other source file in the codebase uses
bare fetch() and their mocks work fine in CI. Switch to match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): use dependency injection for fetch in tests
Bun's global.fetch mock doesn't reliably intercept bare fetch() calls
across all Bun versions in CI. Instead of fighting the runtime, accept
an optional fetchFn parameter (defaults to fetch) and pass mock fetch
directly in tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): bypass mock.module bleed in agent-tarball tests
orchestrate.test.ts uses mock.module("../shared/agent-tarball", ...)
which is process-global in Bun and bleeds into agent-tarball.test.ts.
Import via URL (import.meta.url resolution) to bypass the specifier-
based mock matching and get the real module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): eliminate mock.module bleed between test files
Bun's mock.module is process-global — orchestrate.test.ts mocking
agent-tarball poisoned agent-tarball.test.ts (the mock function
ignored the fetchFn parameter and always returned false).
Fix: make tryTarballInstall injectable via OrchestrationOptions.
orchestrate.test.ts passes the mock directly via options instead
of using mock.module. agent-tarball.test.ts imports the real module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tests): mock Bun.which in credential priority tests
Tests assumed no cloud CLIs were installed, but machines with hcloud/
doctl would get "CLI installed" hint overrides, failing the assertion.
Spy on Bun.which to return null so tests are environment-independent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: fix import ordering after rebase
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: add curl domain allowlist and expand command blocklist
Addresses security review findings:
- Add domain allowlist for curl/wget targets (claude.ai, opencode.ai,
raw.githubusercontent.com, registry.npmjs.org, crates.io, github.com)
- Expand suspicious command blocklist (python -c, perl -e, ruby -e, dd, /dev/)
- Document 4-layer security model in workflow comments
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: add rm -rf to command blocklist
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Remove sh/e2e/aws-e2e.sh: dead backwards-compat wrapper with no
references (superseded by unified e2e.sh --cloud aws)
- Remove getStatusDescription from commands/shared.ts: defined and
tested but never called in production code
- Remove parseJsonRaw from packages/cli/src/shared/parse.ts: zero
production usages (still available in packages/shared if needed)
- Update corresponding test files to remove dead code tests
- Bump CLI version to 0.14.4
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: add unique spawn IDs to prevent history record corruption
History records were matched by heuristic ("most recent record for this
cloud without a connection"), which caused saveVmConnection and
saveLaunchCmd to overwrite the wrong record during concurrent or failed
spawns.
Fix: every SpawnRecord now has a unique `id` (UUID). All history
operations (saveVmConnection, saveLaunchCmd, removeRecord,
markRecordDeleted, mergeLastConnection) match by id when available,
falling back to the old heuristic for pre-migration records.
The orchestrator (TS path) now creates the history record AFTER server
creation succeeds, not before — so failed provisions don't leave orphan
entries.
Also adds "Remove from history" option to the spawn ls action picker,
restoring the ability to soft-delete entries without destroying the VM.
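The match-by-id-with-heuristic-fallback described above can be sketched roughly like this. Field and function names are illustrative (the real SpawnRecord shape in history.json may differ); the sketch assumes records are appended in chronological order.

```typescript
import { randomUUID } from "node:crypto";

// Illustrative record shape; the real history.json fields may differ.
interface SpawnRecord {
  id?: string; // absent on pre-migration records
  cloud: string;
  connection?: string;
}

const generateSpawnId = (): string => randomUUID();

// Match by id when available, falling back to the old heuristic:
// "most recent record for this cloud without a connection".
function findRecord(
  records: SpawnRecord[],
  cloud: string,
  spawnId?: string,
): SpawnRecord | undefined {
  if (spawnId) return records.find((r) => r.id === spawnId);
  // Records append chronologically, so the last match is the most recent.
  return [...records].reverse().find((r) => r.cloud === cloud && !r.connection);
}
```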
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add 18 unit tests for spawn ID history behavior
Tests cover:
- generateSpawnId returns unique UUIDs
- saveSpawnRecord auto-generates id when not provided
- saveVmConnection matches by spawnId (not heuristic)
- saveVmConnection does not cross-contaminate concurrent spawns
- saveVmConnection falls back to heuristic without spawnId
- saveLaunchCmd matches by spawnId (not heuristic)
- saveLaunchCmd falls back without spawnId
- removeRecord matches by id, not by timestamp+agent+cloud
- removeRecord handles duplicate timestamps correctly
- removeRecord falls back for legacy records without id
- markRecordDeleted targets correct record by id
- mergeLastConnection uses spawn_id from last-connection.json
- mergeLastConnection falls back to heuristic without spawn_id
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: enable biome import sorting with grouped imports
Adds organizeImports to biome assist config with groups:
1. Type imports
2. Node built-ins
3. Third-party packages
4. @openrouter/* packages
5. Aliases
Auto-fixed import order and lint issues across all TypeScript files,
including .claude/skills/ and packages/cli/src/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The reconnect path in connect.ts (cmdConnect and cmdEnterAgent) was
missing SSH key identity file opts (-i flags). Every cloud provider's
interactiveSession includes getSshKeyOpts(await ensureSshKeys()) but
the reconnect path omitted them, causing "Permission denied" failures
for users with non-default SSH key paths.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The isOAuthConfigured() function unconditionally returned true,
making the two !isOAuthConfigured() guards in tryRefreshDoToken() and
tryDoOAuth() unreachable dead code. Remove the function and inline the
always-true behavior by dropping the dead branches entirely.
Bump CLI patch version to 0.14.1.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* feat(digitalocean): use Docker marketplace image for agent deployments
Use DigitalOcean's Docker marketplace image (docker-20-04) instead of
plain Ubuntu + installing Docker via cloud-init. Docker is pre-installed
so cloud-init only needs to `docker pull` the agent image.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use docker-22-04 marketplace image (Ubuntu 22.04)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* revert: back to docker-20-04 marketplace image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(digitalocean): use Docker marketplace image with SSH/UFW setup
The docker-20-04 marketplace image has Docker pre-installed but our
user_data replaces its default first-boot script. Add UFW allow for
SSH + sshd restart at the top of cloud-init to restore SSH access.
Skip Docker installation when using the marketplace image since it's
already available.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove SSH ForceCommand block from marketplace image
DO marketplace images ship with an SSH ForceCommand that blocks login
with "Please wait..." until the image's first-boot script removes it.
Since our user_data replaces that first-boot script, we must strip the
ForceCommand ourselves before sshd restarts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): don't provide user_data to Docker marketplace image
The Docker marketplace image (docker-20-04) has its own first-boot
process that removes the SSH ForceCommand and configures UFW. Providing
user_data conflicts with this and prevents SSH from ever becoming
accessible.
Instead, boot without user_data and run all setup (package install,
Node/bun, docker pull) via SSH after the marketplace image completes
its own initialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): use docker-22-04 marketplace image slug
The Docker marketplace image is Ubuntu 22.04 based, not 20.04.
docker-20-04 was causing SSH timeouts due to a deprecated first-boot process.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): revert to docker-20-04 slug (it is actually Ubuntu 22.04)
DO API confirms docker-20-04 is the correct slug — it maps to
"Docker on Ubuntu 22.04". docker-22-04 is not a valid slug.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(digitalocean): use ubuntu + cloud-init Docker install instead of marketplace image
The Docker marketplace image (docker-20-04) has a slow first-boot
process (~90-180s before SSH opens). Using ubuntu-24-04-x64 with
Docker installed via cloud-init (get.docker.com) is faster end-to-end
because SSH opens in ~30-60s and Docker installs in parallel.
Cloud-init now installs Docker and starts docker pull in background
when an agentName is provided. tryInstallFromDocker() checks if the
image is ready at install time.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: wait for in-progress docker pull before extraction
The docker pull started during cloud-init runs in background (&).
If tryInstallFromDocker() runs before the pull completes, it falls
back to normal install unnecessarily. Now waits for any in-progress
docker pull process to finish before checking image availability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use nohup for background docker pull in cloud-init
The docker pull was backgrounded with bare & in the cloud-init script.
When the script exits after touching .cloud-init-complete, the
background process receives SIGHUP and gets killed. Using nohup
prevents this so the pull survives the script exit.
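The key line of the generated cloud-init script might look as follows. The function name, log path, and marker file are assumptions for illustration; the point is that nohup detaches the pull from the script's process group so it survives the SIGHUP sent when the cloud-init shell exits.

```typescript
// Hypothetical fragment of the user_data generator: kick off the agent
// image pull in the background without tying it to the script's lifetime.
function dockerPullSnippet(image: string): string {
  return [
    "# pull the agent image without blocking first boot;",
    "# nohup keeps the pull alive after this script exits",
    `nohup docker pull ${image} > /var/log/agent-pull.log 2>&1 &`,
    "touch /root/.cloud-init-complete",
  ].join("\n");
}
```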
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* debug: add diagnostic output to tryInstallFromDocker
Temporary debug logging to diagnose why the docker-pulled image isn't available.
Also increased timeout from 60s to 120s.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: optimize provisioning — Docker only for slow agents, reorder cloud-init
- Only ZeroClaw (slow Rust build) gets Docker image extraction via
withDockerInstall + slowInstall flag
- Fast agents (claude, codex, openclaw, opencode, kilocode, hermes)
skip Docker entirely — their native install is faster than Docker overhead
- Reorder cloud-init: Docker install first, pull in background, then
apt-get/node/bun run in parallel with the pull
- Remove debug output from tryInstallFromDocker()
- Version bump to 0.14.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: poll for Docker image availability instead of relying on pgrep
The docker CLI process exits while dockerd continues pulling layers
internally. pgrep-based wait exited early, then the image check failed.
Now polls `docker images -q` every 5s for up to 5min until the image
actually appears. Also increases SSH timeout to 600s to match.
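The polling fix reduces to a generic wait helper along these lines. The helper name is illustrative; `check` stands in for running `docker images -q <image>` over SSH and testing for non-empty output.

```typescript
// Poll a condition every `intervalMs` until it holds or `timeoutMs` elapses.
// Unlike the pgrep approach, this asks dockerd whether the image exists,
// so it doesn't matter that the docker CLI process already exited.
async function pollUntil(
  check: () => Promise<boolean>,
  intervalMs = 5_000,
  timeoutMs = 300_000, // 5 min, matching the commit
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return true;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  return false;
}
```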
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: clear pre-existing zeroclaw config before onboard
Docker image extraction copies ~/.zeroclaw/config.toml from the image,
which already contains [security]. Then setupZeroclawConfig appends
another [security] section → TOML duplicate key error.
Fix: rm the old config before zeroclaw onboard generates a fresh one.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: re-add Docker image extraction for OpenClaw
OpenClaw benefits from Docker pre-pull since npm install is slower
than docker cp extraction. Add slowInstall + withDockerInstall back.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: sed zeroclaw config in-place instead of appending duplicate sections
zeroclaw onboard already generates [security] and [shell] sections.
Appending duplicate sections causes TOML parse errors. Now uses sed
to modify existing values in-place, with fallback to append if the
sections don't exist.
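The update-or-append behavior, sketched in TS over the config text for illustration (the real fix shells out to sed on the VM, and this sketch ignores section scoping for brevity):

```typescript
// Update an existing `key = value` line in-place; append a new section
// only when the key is absent, so [security]/[shell] never duplicate.
function setTomlValue(
  config: string,
  section: string,
  key: string,
  value: string,
): string {
  const line = `${key} = ${value}`;
  const re = new RegExp(`^${key}\\s*=.*$`, "m");
  if (re.test(config)) return config.replace(re, line);
  return `${config}\n[${section}]\n${line}\n`;
}
```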
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Adds cross-cloud flags for specifying zone/region and instance size
directly from the command line instead of env vars:
spawn claude gcp --zone us-east1-b --size e2-standard-4
spawn claude digitalocean --region lon1 --size s-4vcpu-8gb
spawn claude hetzner --zone ash --size cx32
Each flag maps to the appropriate cloud-specific env var:
--zone/--region → GCP_ZONE, DO_REGION, HETZNER_LOCATION, AWS_DEFAULT_REGION
--size/--machine-type → GCP_MACHINE_TYPE, DO_DROPLET_SIZE, HETZNER_SERVER_TYPE, LIGHTSAIL_BUNDLE
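The mapping above can be sketched as a lookup table (helper and table names are assumptions, not the actual implementation):

```typescript
// Flag-to-env-var table per cloud, as described in the commit body.
const FLAG_ENV_MAP: Record<string, Record<string, string>> = {
  zone: {
    gcp: "GCP_ZONE",
    digitalocean: "DO_REGION",
    hetzner: "HETZNER_LOCATION",
    aws: "AWS_DEFAULT_REGION",
  },
  size: {
    gcp: "GCP_MACHINE_TYPE",
    digitalocean: "DO_DROPLET_SIZE",
    hetzner: "HETZNER_SERVER_TYPE",
    aws: "LIGHTSAIL_BUNDLE",
  },
};

// Translate a cross-cloud flag into the cloud-specific env var.
function applyCloudFlag(
  env: Record<string, string>,
  cloud: string,
  flag: "zone" | "size",
  value: string,
): void {
  const envVar = FLAG_ENV_MAP[flag][cloud];
  if (envVar) env[envVar] = value;
}
```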
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The hasTrustDialogAccepted entry was at the top level of .claude.json
but Claude Code expects it nested under "projects": { "/root": { ... } }.
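The before/after shape of the fix, sketched as object literals:

```typescript
// Top-level flag: present in the file but ignored by Claude Code.
const before = { hasTrustDialogAccepted: true };

// Nested under projects["/root"]: the shape Claude Code actually reads.
const after = {
  projects: {
    "/root": { hasTrustDialogAccepted: true },
  },
};
```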
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. promptSpawnName() now checks DO_DROPLET_NAME before generating a
random name, matching getServerName() behavior. This fixes the e2e
harness creating droplets as spawn-XXXX when it expects
e2e-digitalocean-AGENT-TIMESTAMP.
2. Replace BASH_REMATCH with sed-based parsing in provision.sh for
macOS bash 3.2 compatibility. BASH_REMATCH was returning empty
values, causing `export: '=': not a valid identifier`.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>