Commit graph

163 commits

Author SHA1 Message Date
A
a26d27f139
style: enforce biome format across codebase, add CI check (#1794)
Run `biome format --write` on all 98 source files (38 needed fixes).
The main change: object literals and long argument lists are now expanded
onto separate lines per Biome's `"expand": "always"` setting, making
code much easier to scan on narrow screens.

Add `biome format` check step to CI lint workflow so formatting
regressions are caught on every PR.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 23:32:12 -08:00
A
86cae8ee32
feat: add SSH key discovery & selection across all providers (#1792)
All 4 providers (Hetzner, DO, AWS, GCP) hardcoded ~/.ssh/id_ed25519 and
duplicated key generation logic. Users with id_rsa or custom-named keys
got unwanted new keys generated. This adds a shared ssh-keys module that:

- Scans ~/.ssh/ for all valid key pairs (matching pub + private files)
- With 0 keys: generates id_ed25519 (same as before)
- With 1 key: uses it silently
- With 2+ keys: prompts multiselect (all selected by default)
- Caches the result at module level for the session
- Centralizes getSshFingerprint() (was duplicated in Hetzner + DO)
- All providers now pass -i flags for selected keys to SSH commands

Net -152 lines of duplicated code across providers.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 23:22:50 -08:00
A
b802dfbc16
refactor: extract saveLaunchCmd to history.ts (#1789)
Eliminates copy-paste of saveLaunchCmd across 8 cloud provider files.
The local/local.ts copy had already diverged (using Bun.write() instead
of writeFileSync()), confirming the maintenance risk.

Fixes #1786

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:11:14 -08:00
A
ed7ebedde4
fix: clean up stdin/TTY state before interactive session handoff (#1790)
After provisioning, @clack/prompts and readline leave stdin with stale
listeners, raw mode, and buffered input. This causes flaky keyboard input
in the interactive SSH session. Add prepareStdinForHandoff() that closes
the shared readline, removes all stdin listeners, resets raw mode, and
pauses stdin before launching the child process.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 01:56:49 -05:00
A
3a554a5ada
fix: replace instanceof Error with hasMessage() duck-typing in SSH retry paths (#1785)
wrapSshCall (agent-setup.ts) and spriteRetry (sprite.ts) used `instanceof
Error` to extract error messages — an anti-pattern explicitly avoided
throughout the rest of the codebase (consistent with comments in index.ts,
commands.ts, manifest.ts, etc.). When errors cross module or bundling
boundaries, instanceof returns false even for real Error objects, causing
err.message to fall back to String(err) and producing `[object Object]` in
the retry logs. Uses `hasMessage()` from shared/type-guards for consistent
duck-typed narrowing.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 00:57:03 -05:00
A
fa34d29b7e
fix: explicitly pass SSH identity file for DigitalOcean connections (#1784)
DigitalOcean SSH was failing with "Permission denied (publickey)" because
the SSH client was not explicitly told which identity file to use. When
users have multiple SSH keys or an SSH agent with different keys loaded,
SSH may try the wrong key first and fail — especially with BatchMode=yes
which suppresses interactive fallbacks.

The fix adds `-i ~/.ssh/id_ed25519` to SSH_OPTS (matching AWS's approach)
and passes sshKeyPath to the shared waitForSsh utility, ensuring the
correct key is always used for both the handshake wait and all subsequent
SSH/SCP commands.

Fixes #1783

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 00:11:59 -05:00
A
3988ffe90e
fix: check exitCode in openBrowser() and reinit readline after @clack/prompts (#1782)
openBrowser() never checked the exitCode from Bun.spawnSync, so it silently
returned success even when the browser command failed (headless VMs, no
DISPLAY). Now checks exitCode and always shows the URL as fallback.

selectFromList() uses @clack/prompts which creates/destroys its own readline
on stdin. After it finishes, the shared readline in ui.ts can be corrupted
(Bun #1707). Now explicitly closes and nulls the shared readline after
@clack/prompts returns so the next prompt() call gets a fresh one.

Fixes #1770

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:21:06 -05:00
A
16c8a2b90b
fix: use getSpawnDir()/getConnectionPath() in all cloud providers (#1774)
Fixes #1769

All 8 cloud providers hard-coded `${process.env.HOME}/.spawn` for
connection data, bypassing the SPAWN_HOME env var support in history.ts.
Replaced all 16 occurrences with getSpawnDir() and getConnectionPath().

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 19:27:21 -08:00
A
0843c5e708
feat: shared SSH wait utility with TCP pre-check and stderr capture (#1779)
Replace 5 duplicated SSH wait implementations (AWS, DO, Hetzner, GCP,
Sprite) with a shared two-phase utility in cli/src/shared/ssh.ts:

- Phase 1: cheap TCP probe (2s intervals) until port 22 opens
- Phase 2: full SSH handshake (3s intervals) with stderr capture
- Adds BatchMode=yes to prevent interactive prompt hangs
- Removes ~220 lines of duplicated sleep/SSH_OPTS/waitForSsh code

Daytona (token auth) and Fly (WireGuard) left unchanged — too different.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 19:17:09 -08:00
A
b62dc1af33
feat: ban as type assertions, add runtime schema validation with valibot (#1775)
* fix: resolve all biome lint warnings across the codebase

- Replace all noExplicitAny with proper types (unknown, Record<string, unknown>)
- Fix useBlockStatements in picker.ts (braceless if)
- Fix useNumberNamespace in picker.ts (parseInt → Number.parseInt)
- Codebase now passes biome lint with 0 errors and 0 warnings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: ban `as` type assertions, add runtime schema validation with valibot

Replace all ~170 unsafe `as` type assertions across the entire codebase
(production + tests) with runtime-validated alternatives:

- Add GritQL biome plugin (`no-type-assertion.grit`) that bans all `as`
  casts except `as const`
- Add valibot for schema-validated JSON parsing (`parseJsonWith`)
- Add shared utilities: `parse.ts` (schema parsing), `type-guards.ts`
- Replace `as` casts in all 5 cloud modules (aws, daytona, hetzner,
  digitalocean, fly) with valibot schemas + type guards
- Replace `as` casts in shared modules (manifest, update-check, oauth,
  commands, history, ui)
- Replace `as any` in all 26 test files with proper `new Response()`
  mocks and typed variables
- Add 13 tests for parseJsonWith/parseJsonRaw
- Add "Embrace Bold Changes" culture rule to CLAUDE.md
- Bump version 0.6.19 → 0.7.0

1859 tests pass, 0 lint errors across 95 files, bundle +6KB from valibot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: move GritQL plugin into cli/lint/ directory

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 18:50:53 -08:00
A
f0a70b66a1
feat: multi-line layout for ls/delete — name first, then agent · cloud · time (#1777)
Entries in `spawn ls` and `spawn delete` now display as two lines:
  - Line 1: spawn name (bold)
  - Line 2: Agent · Cloud · relative time

Removes SSH connection info and prompt previews from the list display
to keep it clean and scannable.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 18:33:06 -08:00
A
2413db6ade
fix: truncate picker lines to terminal width to prevent redraw corruption (#1772)
Long labels (e.g. "Claude Code on GCP Compute Engine -- spawn-trial-000-ahmed")
wrap to multiple rows, but the redraw logic uses a fixed line count to cursor-up.
This causes old content to pile up on every arrow-key press.

Query terminal width via `stty size` and truncate all lines to fit within
a single row, with a 1-char margin to prevent auto-wrap edge cases.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 17:22:46 -08:00
A
8112276121
feat: add delete sub-menu (destroy/remove) and spawn kill alias (#1765)
Pressing `d` in the server picker now shows a sub-menu:
- Destroy server: hard delete (destroys cloud VM + marks deleted)
- Remove from history: soft delete (removes entry, no cloud API call)
- Cancel: go back to picker

Also adds `kill` as an alias for `spawn delete`.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 15:23:49 -08:00
A
545ddafe4a
fix: extract flags module to fix KNOWN_FLAGS drift in tests (#1757)
KNOWN_FLAGS in unknown-flags.test.ts was copy-pasted from index.ts and
was missing the --name flag, causing silent test gaps. Extract
KNOWN_FLAGS, findUnknownFlag, and expandEqualsFlags into a new flags.ts
module so tests import the real source of truth.

- Create cli/src/flags.ts with KNOWN_FLAGS, findUnknownFlag, expandEqualsFlags
- Update index.ts to import from flags.ts (checkUnknownFlags now uses findUnknownFlag)
- Update unknown-flags.test.ts to import from flags.ts instead of copy-pasting
- Add tests for --name flag, KNOWN_FLAGS completeness, and expandEqualsFlags
- Bump CLI version to 0.6.15

Fixes #1744

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 18:10:07 -05:00
A
f3a2b85b5b
fix: always confirm cloud resource name with user, even when SPAWN_NAME is set (#1758)
When the CLI collects a display name (SPAWN_NAME), each cloud now shows
the kebab-case derivative as the default in the resource name prompt
instead of silently accepting it. Users can hit Enter to accept or type
an override. Non-interactive mode still skips the prompt.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 14:25:34 -08:00
A
7b021fb1f5
fix: set TERM and use login shell for interactive SSH sessions (#1754)
SSH interactive sessions ran the agent command in a non-login,
non-interactive shell — .bashrc/.profile weren't sourced and TERM
wasn't always set, making the shell feel broken (no colors, bad
line editing, missing env).

Fix for all 6 SSH-based clouds (DO, Hetzner, AWS, GCP, Fly, Daytona):
- Forward local TERM (default xterm-256color) to the remote
- Use `exec bash -l -c` for a proper login shell

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 14:14:13 -08:00
A
3e5cd2d076
fix: spawn fails with bun not found after install (#1748)
* fix: add ~/.bun/bin to shell rc files so spawn finds bun after install

The install script was only adding ~/.local/bin to shell profile files
(bashrc/zshrc/bash_profile), but not ~/.bun/bin. Since the spawn binary
uses #!/usr/bin/env bun as its shebang, bun must be in PATH for spawn
to work. After exec $SHELL, only dirs in rc files are available.

Now ensure_in_path() patches shell rc files for both ~/.local/bin (for
spawn) and ~/.bun/bin (for bun), and correctly checks both when deciding
whether to show "Run spawn" vs "exec $SHELL" instructions.

Fixes #1747

Agent: code-health
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: quote dir in fish_add_path to prevent command injection

Address security review feedback on PR #1748 — unquoted ${dir} in
fish command string could allow injection if HOME/BUN_INSTALL env
vars contain metacharacters.

Agent: code-health
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 13:41:27 -08:00
A
57d4ee7eeb
fix: drop apt nodejs/npm, install Node 22 directly via n (#1746)
apt-get install nodejs npm pulls in hundreds of node-* packages
(libhwasan, node-jsonify, node-eslint-utils, etc.) adding 60-90s
to cloud-init. We immediately replace it with Node 22 via n anyway.

Fix: bootstrap n directly from curl and install Node 22 in one step.
No apt nodejs/npm needed.

Before: apt install nodejs npm → npm install -g n → n 22 (slow)
After:  curl n | bash -s install 22 (fast, no apt bloat)

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 12:40:22 -08:00
A
9d3728fd8d
fix: add build-essential to node cloud-init tier (#1743)
* fix: add build-essential to node cloud-init tier

The "node" tier (used by claude, codex, kilocode) was missing
build-essential. Native npm packages that compile C/C++ addons
fail without it. The "full" tier had it but no agent uses "full".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: upgrade openclaw to full cloud-init tier

Openclaw needs the most dependencies (build-essential, nodejs, npm,
bun) but was on the "bun" tier which only installed curl/unzip/git/zsh.
Switch to "full" which includes everything.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 12:27:11 -08:00
A
30f3758902
fix: set HOME=/root in cloud-init userdata to prevent unbound variable (#1741)
DigitalOcean's cloud-init environment doesn't set HOME. Combined
with set -e, any $HOME or ~ reference (bun install, .bashrc writes)
fails with "HOME: unbound variable" and cloud-init silently aborts.

Fixed in both DigitalOcean and Hetzner (same pattern). AWS doesn't
use set -e so is unaffected.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 12:13:56 -08:00
A
dc21fa223b
fix: cloud-init streaming script bash syntax error (#1737)
.join("; ") produced invalid bash: &; after background command,
do; after for, then; after if. Use newline-joined string instead.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 12:05:54 -08:00
A
b50b27141c
fix: stream cloud-init output instead of blind-polling on DigitalOcean (#1734)
Replace 60×5s blind poll loop ("Cloud-init in progress N/60") with
real-time streaming of /var/log/cloud-init-output.log via tail -f
over SSH. Users now see every apt-get, curl, and error as it happens.

Background checker exits as soon as .cloud-init-complete marker
appears. 5min timeout. Brief 30s fallback poll if streaming fails.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 11:58:29 -08:00
A
ac5e8495b1
feat: customize cloud-init per agent to fix boot timeouts (#1733)
Agents declare their dependency tier (minimal/node/bun/full), and
cloud-init only installs what's needed. Lightweight agents like
OpenCode and ZeroClaw skip Node.js upgrade, Bun install, and
build-essential — saving 60-90s on boot and eliminating the
DigitalOcean cloud-init timeout.

- Add CloudInitTier type + cloudInitTier field to AgentConfig
- Add shared/cloud-init.ts: tier-to-packages mapping
- Update all 6 clouds (DO, Hetzner, AWS, GCP, Fly, Daytona)
- Bump CLI version to 0.6.8

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 11:43:45 -08:00
A
738ad18fee
fix: add 5s delay between DigitalOcean and OpenRouter OAuth flows (#1727)
When both OAuth flows open browser tabs back-to-back, the user may
reactively close the second tab thinking it's a duplicate. Add a 5-second
pause with a message after DO OAuth completes, only when browser auth
was actually used (skipped for env var / saved token paths).

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 11:27:14 -08:00
A
f2010ce3bd
fix: add account:read scope to DigitalOcean OAuth flow (#1724)
OAuth token validation calls GET /v2/account which requires the
account:read scope. Without it, the token exchange succeeds but
validation fails with 403, falling through to manual token entry.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 11:14:26 -08:00
A
e527d79815
feat: DigitalOcean OAuth2 flow for automatic token provisioning (#1716)
* feat: add DigitalOcean OAuth2 flow for automatic token provisioning

Implements the OAuth2 authorization code flow for DigitalOcean as an
alternative to manual API token entry. The flow mirrors the existing
OpenRouter OAuth pattern using Bun.serve() for the local callback.

Changes:
- Add tryDoOAuth() with local Bun.serve callback, CSRF state, and
  code-for-token exchange via DO's /v1/oauth/token endpoint
- Add tryRefreshDoToken() for refreshing expired tokens without
  re-authorization
- Extend config persistence with refresh_token, expires_at, auth_method
- Modify ensureDoToken() flow: env var -> saved config (with refresh) ->
  OAuth browser flow -> manual paste fallback
- OAuth is gated on DO_OAUTH_CLIENT_ID and DO_OAUTH_CLIENT_SECRET env vars
- Add 37 tests covering config persistence, CSRF generation, code
  validation, token expiry, URL construction, and feature toggle
- Bump CLI version to 0.6.5

Closes #1715

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: hardcode DO OAuth credentials, remove env var gate

Embed client_id and client_secret as constants (same pattern as gh CLI,
doctl, gcloud). OAuth is now always available — no env vars needed.
Public CLI clients cannot keep secrets confidential; security comes from
the authorization code flow itself (user consent, localhost redirect,
CSRF state, single-use codes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add droplet:delete scope for spawn delete support

The spawn CLI's destroyServer() calls DELETE /droplets/{id} which
requires the droplet:delete scope. All its required sub-scopes
(droplet:read, regions:read, sizes:read, actions:read, image:read)
were already present.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 13:49:53 -05:00
A
9cb265d820
refactor: remove all cloud bash libs, convert AWS to JS bundle fallback (#1714)
All clouds now use TypeScript. Convert the last holdout (AWS) from bash
lib fallback to the JS bundle download pattern, then delete all remaining
cloud bash libs and clean up stale test code.

- Convert 6 AWS agent scripts to JS bundle fallback (matching hetzner)
- Delete aws/lib/common.sh and hetzner/lib/common.sh
- Delete orphaned test/fixtures/ovh/
- Stub out dead functions in test/e2e.sh that sourced deleted libs
- Delete 3 test files that only tested cloud bash libs
- Remove dead describe blocks from 3 remaining test files
- Bump CLI version 0.6.3 → 0.6.4

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 10:13:36 -08:00
A
21f7e7683f
refactor: deduplicate remaining 6 clouds into shared agent-setup pattern (#1704)
Convert gcp, daytona, digitalocean, hetzner, sprite, and local clouds
to use shared/agent-setup.ts and shared/orchestrate.ts, matching the
pattern established by AWS and Fly. Each cloud's agents.ts is now a
~26-line thin wrapper; each main.ts uses runOrchestration().

- Delete gcp/lib/common.sh (406 lines of dead bash code)
- Delete cli/src/fly/oauth.ts and cli/src/fly/ui.ts re-export wrappers
- Fix all fly/oauth and fly/ui imports to use shared/ directly
- Update test thresholds for reduced bash cloud count
- Bump CLI version to 0.6.3

Net reduction: ~2,850 lines removed.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 09:20:34 -08:00
A
eac5713ef0
refactor: deduplicate AWS/Fly agent setup into shared modules (#1700)
Extract ~800 lines of duplicated agent helpers and orchestration logic
from aws/agents.ts and fly/agents.ts into shared modules:

- shared/agent-setup.ts: CloudRunner interface, installAgent,
  uploadConfigFile, installClaudeCode, setupClaudeCodeConfig,
  GitHub auth, config helpers, createAgents(), resolveAgent()
- shared/orchestrate.ts: CloudOrchestrator interface + 12-step
  runOrchestration() pipeline
- shared/agents.ts: AgentConfig type + generateEnvConfig (single source)

Each cloud becomes a thin wrapper (~25-60 lines) that constructs a
CloudRunner/CloudOrchestrator from its provider-specific functions.

Also fixes pre-existing test breakage (aws.test.ts imported renamed
exports LIGHTSAIL_BUNDLES/BundleTier → BUNDLES/Bundle) and removes
dead aws/lib/common.sh reference from test/e2e.sh.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 08:40:21 -08:00
A
55df28137d
feat: convert gcp/ cloud provider from Bash to TypeScript (#1694)
Security review approved. All issues resolved.
2026-02-22 08:51:50 -05:00
A
850327c29d
feat: convert aws/ cloud provider from Bash to TypeScript (#1693)
Migrates AWS Lightsail from 609-line bash (aws/lib/common.sh) to TypeScript,
following the established Fly.io/local provider patterns. Type safety eliminates
SigV4 signing bugs, @clack/prompts provides interactive bundle/region pickers,
and error handling is explicit.

- cli/src/aws/aws.ts — Core: AWS CLI wrapper, SigV4 REST API, auth, provisioning, SSH
- cli/src/aws/agents.ts — Agent configs and install helpers
- cli/src/aws/main.ts — Orchestrator
- aws/*.sh — Converted to thin bun shims with bash fallback (curl|bash compatible)
- cli/package.json — Version bump to 0.6.0

Fixes #1675

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 07:50:54 -05:00
A
435d9125d5
feat: convert local/ cloud provider from Bash to TypeScript (#1688)
Creates cli/src/local/{main,local,agents}.ts following the Fly.io
pattern. All 6 agent .sh files replaced with thin bun shims.
Extracts shared oauth.ts and ui.ts to cli/src/shared/ for reuse
across cloud providers. Updates fly/ to re-export from shared.

Fixes #1681

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 10:49:40 +00:00
A
0f4df7be71
feat: pre-built Docker image for OpenClaw on Fly.io (#1686)
Eliminates the slow waitForCloudInit() + bun install phase by booting
a pre-built image with Node.js, bun, and openclaw already installed.
The image is rebuilt daily via GitHub Actions to pick up new releases.

Other agents are unaffected — they still use ubuntu:24.04 + cloud-init.

Co-authored-by: spawn-bot <spawn-bot@openrouter.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 02:50:46 -05:00
A
4cec25c6b7
fix: pass spawn name through cmdRun and headless flows (#1674)
cmdRun (spawn <agent> <cloud>) was not collecting or passing the spawn
name, so SPAWN_NAME was never set in the script environment and the
history record lacked a name. cmdRunHeadless had the same gap.

- Add promptSpawnName() call to cmdRun and pass result to execScript
- Wire spawnName through HeadlessOptions to runBashHeadless
- Add --name CLI flag to set SPAWN_NAME from the command line
- Skip interactive name prompt when SPAWN_NAME is already set
- Bump CLI to 0.5.33

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 21:52:52 -08:00
A
461d945212
chore: bump CLI version to 0.5.32 (#1666)
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 20:30:14 -08:00
A
760fa22dea
fix: bump fly default VM to 4GB, add 10GB volume, hide keepalive dots (#1654)
- Default VM memory: 1024MB → 4096MB (all agents except ZeroClaw
  which stays at 1024MB). Prevents OOM kills during native installs.
- Attach a 10GB persistent volume at /data with a /root/work symlink
  so agents have enough storage to clone repos and work.
  Configurable via FLY_VOLUME_SIZE env var.
- Keepalive: changed dots to spaces so they're invisible in terminal.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 17:56:02 -08:00
A
8650ad15d8
feat: interactive VM size and volume prompts for Fly.io (#1655)
Users can now choose VM size (1x/2x/4x shared CPU tiers) and opt into
persistent volumes during provisioning instead of getting hardcoded defaults.
FLY_VM_MEMORY env var still works for CI/headless mode.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 17:19:09 -08:00
A
b43d3f1b70
fix: combine gateway start + port wait into single SSH session (#1642)
The old flow opened up to 60+ separate fly ssh console sessions to
poll port 18789 after starting the gateway daemon. Each session opens
a new WireGuard tunnel which is slow and flaky.

Now: one SSH session starts the daemon, then polls the port in-band
with a simple for loop. Output from the loop also serves as a
keepalive for flyctl.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 16:16:08 -08:00
A
09d9f597ac
fix: use openclaw curl installer to prevent fly ssh hang (#1640)
bun install -g openclaw spawns child processes that keep stdout/stderr
FDs open, preventing fly ssh console from detecting EOF. Replace with
the official curl installer (--no-onboard) which handles Node detection
and cleanup without leaving orphan processes on the pipe.

See: https://docs.openclaw.ai/install

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 15:58:05 -08:00
A
42f2b66b55
fix: keep stdin pipe open during fly ssh to prevent session teardown (#1638)
flyctl tears down the WireGuard transport when stdin closes ("session
forcibly closed; the remote process may still be running"). This
killed long-running commands like `bun install -g openclaw`.

Instead of calling stdin.end() immediately, keep the pipe open for
the duration of the command and close it after the process exits.
The pipe still prevents interactive prompts from hanging (no data
flows through it), but flyctl no longer interprets the closed fd
as a signal to kill the session.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 15:44:14 -08:00
A
b4be9b9d2f
fix: use pipe+close for fly ssh stdin instead of ignore (#1632)
fly ssh console with stdin as /dev/null ("ignore") can cause the
connection to hang — flyctl doesn't get a clean EOF signal to know
when to close the transport. Switch to "pipe" and immediately call
stdin.end() so flyctl receives a proper EOF.

Applied to runServer, runServerCapture, and uploadFile.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 15:12:44 -08:00
A
246351874e
fix: skip unnecessary apt recommends (python3) during fly VM setup (#1629)
git on Ubuntu 24.04 pulls in python3 via recommended packages. Use
--no-install-recommends to install only direct dependencies. Also
added ca-certificates explicitly since it's needed for HTTPS but
won't be auto-pulled without recommends.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 15:02:55 -08:00
A
cbd8c87a6d
fix: prevent terminal hang during fly agent install + fatal preLaunch (#1628)
Two fixes:

1. runServer() was inheriting stdin, so commands like `claude install`
   that try to read input would hang the terminal indefinitely. Changed
   stdin to "ignore" (/dev/null) for non-interactive remote commands.

2. preLaunch failures (e.g. OpenClaw gateway) were silently swallowed,
   dropping users into a broken TUI with no gateway. Now preLaunch
   errors propagate — users get a clear error instead of a mystery hang.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 14:53:07 -08:00
A
cbecb9cbea
fix: suppress interactive dpkg prompts during fly VM setup (#1626)
tzdata (pulled in as a Node.js dependency) tries to run
dpkg-reconfigure interactively, which fails on headless Fly
machines. Set DEBIAN_FRONTEND=noninteractive so apt silently
accepts defaults.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 14:35:27 -08:00
A
0f59f0e844
fix: fly machine wait timeout exceeds API max of 60s (#1619)
The Fly Machines API enforces a [1s, 1m0s] range on
WaitMachineRequest.Timeout. We were passing 90s, which caused an
invalid_argument error and prevented machines from starting.

Lower the default to 60s (the API maximum) and retry up to 3 times
so slow-starting machines still have a full 3-minute window.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 14:00:13 -08:00
A
d69c4f0f02
fix: reject non-2xx responses in Fly.io token validation (#1614)
testFlyToken() fallback to /v1/user accepted 404 plain text responses
because hasError() only checks for JSON "error"/"errors" keys. Adding
resp.ok check ensures non-2xx responses are correctly rejected.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 13:47:40 -08:00
A
eff99caefe
fix: apply default spawn name when user presses Enter without typing (#1605)
promptSpawnName() used `placeholder` (visual hint only) without `defaultValue`,
so pressing Enter returned an empty string instead of applying the placeholder.
Now generates a unique default like `spawn-a3f2` with a random suffix to avoid
Fly.io global name collisions.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 11:46:59 -08:00
A
3570caa840
fix: accept localhost and hostnames in validateConnectionIP (#1531)
validateConnectionIP rejected "localhost" (written by local cloud) and
hostnames like "ssh.app.daytona.io" (written by Daytona), causing
mergeLastConnection to silently discard connection data. This broke
spawn list and spawn delete for these providers.

- Add "localhost" to CONNECTION_SENTINELS
- Add HOSTNAME_PATTERN for valid multi-label DNS hostnames
- Update tests: localhost now valid, add hostname acceptance/rejection tests

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 11:49:23 -05:00
L
eea43adcad
fix: re-exec with new binary after auto-update for all invocations (#1526)
Two bugs in reExecWithArgs():

1. args.length === 0 early exit:
   Running bare `spawn` (interactive picker) after an auto-update would
   print "Run your spawn command again" and exit, requiring the user to
   manually re-invoke. Now always re-exec so the new flow triggers
   immediately.

2. process.argv[1] stale binary path:
   If the installer places the updated binary in a different directory than
   the currently running binary (e.g. old: ~/.local/bin, new: /usr/local/bin),
   re-exec would run the old stale binary. Fix: add findUpdatedBinary() which
   resolves via `which spawn` (PATH lookup) first, falling back to
   process.argv[1] only if which fails.

Bump CLI version 0.5.17 → 0.5.18.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 10:26:02 -05:00
A
7b6d6eed3b
fix: replace hardcoded history path in security.ts error messages (#1520)
* fix: replace hardcoded ~/.spawn/history.json path in security.ts error messages

Error messages in security validation functions (validateConnectionIP,
validateUsername, validateServerIdentifier, validateMetadataValue) hardcoded
~/.spawn/history.json as the fix path. This is wrong when SPAWN_HOME is set,
directing users to a nonexistent file. Replace all 9 occurrences with
'spawn list --clear' which works regardless of SPAWN_HOME and is simpler
than manually editing JSON.

Agent: ux-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: bump cli version to 0.5.17

Required by CLAUDE.md: any change to cli/ needs a version bump.
PR #1520 changes security.ts error messages (cli/ change).

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 08:37:01 -05:00