Commit graph

561 commits

Author SHA1 Message Date
A
ac7fa14c61
fix: cache AWS credentials to avoid re-prompting on retry (#1841) (#1852)
After a successful interactive credential entry, credentials are now
saved to ~/.config/spawn/aws.json (chmod 600). On the next run, cached
credentials are loaded and validated before prompting again.

Supports --reauth flag / SPAWN_REAUTH=1 to force fresh credential entry.

Fixes #1841

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-24 00:53:34 -05:00
A
a96a396e79
fix: add Lightsail activation prerequisite to docs and error messages (#1850)
- Adds a Prerequisites section to sh/aws/README.md (updated path after
  shell scripts reorganization in #1843) with the Lightsail activation
  URL as the first step
- Surfaces the Lightsail activation URL in createInstance error hints
  for both CLI and REST paths so users get actionable guidance on failure
- Bumps CLI to 0.9.1

Fixes #1838

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-24 00:53:28 -05:00
A
c487ea215f
refactor: move test fixtures to root /fixtures directory (#1849)
* refactor: move test fixtures to root /fixtures directory

Moves test/fixtures/ → fixtures/ at the repo root for easier
discoverability. Updates all references in CLAUDE.md and QA
prompt files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: move install.ps1 to sh/cli/ and fixtures to root

- Moves cli/install.ps1 → sh/cli/install.ps1 (consistent with install.sh)
- Moves test/fixtures/ → fixtures/ at the repo root
- Updates all references in README, CLAUDE.md, QA prompts, and the ps1 itself

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 21:38:18 -08:00
A
b84adfb74e
refactor: move all shell scripts to /sh directory (#1843)
Reorganizes the project so all shell scripts live under a dedicated
/sh directory, enabling the OpenRouter rewrite URL to point at /sh/
instead of the repository root.

Moves:
- cli/install.sh → sh/cli/install.sh
- shared/*.sh → sh/shared/*.sh
- {cloud}/{agent}.sh → sh/{cloud}/{agent}.sh (48 scripts)
- {cloud}/README.md → sh/{cloud}/README.md
- e2e/*.sh → sh/e2e/*.sh
- test/macos-compat.sh → sh/test/macos-compat.sh
- test/fixtures/**/*.sh → sh/test/fixtures/**/*.sh

Updates all references:
- RAW_BASE path construction in commands.ts, update-check.ts
- GitHub auth URL in agent-setup.ts
- Self-referencing URLs in install.sh, github-auth.sh
- CI workflow paths in lint.yml, cli-release.yml
- Test file paths in install-script-validation, manifest-integrity
- Documentation in README.md, cli/README.md, CLAUDE.md
- QA scripts in .claude/skills/

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 21:14:54 -08:00
A
8812f693c0
fix: switch ZeroClaw install to bootstrap.sh with --prefer-prebuilt (#1836)
The pinned scripts/install.sh is deprecated and does `git clone --depth 1`
of the latest ZeroClaw main branch, pulling in commit 63f485e which added
leak_detector.rs with Rust 2021 edition string literal errors.

Fix by switching to scripts/bootstrap.sh (the canonical installer) and
adding --prefer-prebuilt so ZeroClaw installs from a pre-built release
binary instead of compiling from source. The v0.1.6 release binary was
compiled before the problematic code was merged.

Fixes #1829

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-24 00:08:49 -05:00
A
31c6bec2f6
fix: route spawn <cloud> to cloud info in interactive TTY mode (#1834)
* fix: route `spawn <cloud>` to cloud info in interactive TTY mode

When a user runs `spawn digitalocean` in interactive mode, the CLI was
treating "digitalocean" as an agent name, producing "Unknown agent: digitalocean".
This broke the error message in credential checks which tells users to
"Run spawn <cloud> for setup instructions."

Fix: Before routing a single argument to `cmdAgentInteractive`, check if
it resolves to a cloud name and route to `cmdCloudInfo` instead. This
matches documented behavior (`spawn <cloud>` = show available agents).

Fixes #1830

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: expand stdout/stderr mock objects to pass Biome format check

The node:child_process mock had stdout and stderr property values
on single lines, which Biome's formatter requires to be expanded
to multi-line object style.

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-24 00:08:43 -05:00
A
8394386cd4
fix: install openclaw with npm instead of bun to fix gateway plugin loading (#1833)
* fix: install openclaw with npm instead of bun to fix gateway plugin loading

The OpenClaw gateway daemon runs on Node.js, but openclaw was being
installed via `bun install -g`. Bun and Node use incompatible module
resolution strategies, causing channel plugins (Telegram, Discord, etc.)
to silently fail to load at gateway startup.

Switch both install paths to `npm install -g openclaw` so the daemon's
Node runtime can resolve its dependencies correctly.

Fixes #1828

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: apply biome formatting to commands-update-download.test.ts

The file was added in #1831 without passing Biome's format check.
Auto-formatted with `bunx @biomejs/biome format --write`.

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: use bun install/runtime for openclaw instead of npm

npm install is broken on target VMs. Switch all openclaw install
commands back to bun and remove npm prefix from gateway PATH.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: spawn-bot <spawn-bot@openrouter.ai>
2026-02-24 00:08:40 -05:00
A
693e18bfef
test: mock node:child_process to fix flaky cmdUpdate timeout (#1831)
The "should handle update failure gracefully" test triggered a real
execSync("curl -fsSL .../install.sh | bash") via performUpdate() when
the mocked remote version differed from the current version. In isolation
this completes in ~5s, but under full-suite concurrency (52 files, 1897
tests) network contention caused it to timeout at 58267ms — far exceeding
the 5000ms limit. This also violated CLAUDE.md: "no subprocess spawning".

Also mock spawn() used by spawnBash() for cmdRun tests, firing the "close"
event immediately (exit code 0) so Promise-based callers resolve without
hanging. Result: 1897 pass, 0 fail, full suite runs in 3.15s.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 23:19:39 -05:00
A
6a5e0c5161
feat: SPA — Spawn's Personal Agent (#1825)
* feat: add Slack issue bot for #proj-spawn

Socket Mode bot that listens for @mentions in a configured Slack channel,
files GitHub issues via `gh` CLI, and syncs thread replies as issue comments.
State persisted to ~/.config/spawn/slack-issues.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: rewrite Spawnis to pipe threads into Claude Code sessions

- @mention triggers Claude Code with full thread as prompt
- Subsequent thread replies in tracked threads auto-trigger new runs
- System prompt focuses on GitHub issue management via `gh` CLI
- Streams Claude Code responses back to Slack in real-time
- Bot resolves own user ID at startup to skip self-messages
- Adds slack-manifest.yml for one-click Slack app creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: lowercase display name to spawnis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: rename to SPA — Spawn Processes Autonomously

Display name: spa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: SPA — Spawn's Personal Agent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: lowercase app name to Spa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add biome config and fix lint/format to match CLI rules

Adds local biome.json mirroring cli/biome.json rules (minus GritQL
plugins). Fixes all useBlockStatements errors and applies expand:always
formatting to match the project style.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: share biome config via root biome.json + extends

Move shared linter rules, formatter, and JS formatter settings to a
root-level biome.json. Both cli/ and .claude/skills/slack-bot/ extend
from it — CLI adds its GritQL plugins and test overrides, slack-bot
just overrides includes and disables VCS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: move GritQL lint rules to repo root lint/

Move no-type-assertion.grit and no-typeof-string-number.grit from
cli/lint/ to lint/ at the repo root. Both cli/ and slack-bot share
the no-type-assertion rule; cli/ additionally uses no-typeof-string-number.

Plugin paths live in each child biome.json (not root) because biome
resolves plugin paths relative to the consumer config, not the extended
config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: move shared biome.json into lint/

All shared lint config now lives under lint/:
  lint/biome.json
  lint/no-type-assertion.grit
  lint/no-typeof-string-number.grit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add no-banner-comments lint rule, fix slack-bot

GritQL can't match comments (they're trivia in biome's CST), so this
is a Bun script at lint/no-banner-comments.ts that catches decorative
// --------- separator blocks and suggests /** Section */ or #region.

Replace all 9 banner blocks in slack-bot.ts with /** */ headers.

Usage: bun run lint/no-banner-comments.ts [files...]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use // #region instead of /** */ section headers

Switch slack-bot.ts to // #region / // #endregion for all section
markers (collapsible in most editors). Update no-banner-comments lint
script to recommend #region as the preferred style.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: replace lint script with PostToolUse hook for banner comments

Move banner comment detection into the existing PostToolUse hook on
Write|Edit in .claude/settings.json. Runs inline on every .ts file
edit — no separate bun script needed. Delete lint/no-banner-comments.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: simplify Slack manifest description

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: rename skill from slack-bot to setup-spa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use named import for @slack/bolt App class

Bun resolves `import App from "@slack/bolt"` as the App constructor
directly, not a module with a `.default` property. Switch to named
import `{ App }` and remove all `.default` usage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add --verbose flag required by stream-json output format

Claude Code requires --verbose when using --output-format=stream-json
with --print. Also fix systemd PATH to include ~/.local/bin for claude.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: stream all Claude Code events to Slack (tools, results, text)

Replace text-only streaming with full event parsing:
- Tool use: shows 🛠️ *ToolName*
- Tool result: shows truncated output in code block
- Text delta: accumulates as before
- Errors: shows  prefix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: enforce issue title templates in system prompt

Add mandatory bracket prefix format matching the repo's issue templates:
[Bug]:, [CLI]:, [Agent]:, [Cloud]:, [Team]:. Also instructs Claude to
apply matching labels (bug + pending-review, cli + enhancement, etc.).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: reference issue templates at runtime instead of hardcoding

Tell Claude to read .github/ISSUE_TEMPLATE/ for the correct title
prefix, labels, and fields rather than hardcoding them in the prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 19:52:14 -08:00
A
18a7f7e96a
fix: improve AWS Lightsail error message for new accounts (#1826)
* fix: improve AWS Lightsail error message for new accounts

Fixes #1824

Agent: ux-engineer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix biome formatting in AWS Lightsail error message

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 22:05:50 -05:00
A
0bca426980
security: validate RAW_BASE immediately before curl|bash invocation (#1822)
Fixes #1819

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 19:31:53 -05:00
A
a4e805d83b
fix: restore gateway port wait and batch openclaw setup into fewer SSH sessions (#1820)
The previous refactor (b43d3f1) deleted the gateway port wait entirely,
causing the TUI to launch before the gateway was listening on port 18789.

Changes:
- startGateway() now starts the daemon AND polls port 18789 in the same
  SSH session (up to 60s), using /dev/tcp with nc fallback.
- New setupOpenclawBatched() combines install verification + env var
  setup + openclaw config into a single SSH session (was 6 separate
  SSH calls, now 2 total for the whole openclaw flow).
- New optional `setup` hook on AgentConfig lets agents opt into the
  batched path; other agents are unaffected.

Co-authored-by: spawn-bot <spawn-bot@openrouter.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:46:19 -05:00
A
f97079a35d
fix: use correct default server type in Hetzner createServer fallback (#1818)
The hardcoded fallback "cx23" is a stale Hetzner server type that no
longer exists in their API. Replace it with DEFAULT_SERVER_TYPE ("cx22")
which is consistent with the SERVER_TYPES array and promptServerType().

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 16:55:22 -05:00
A
aa88e70488
fix: add concurrency guard and workflow_dispatch to CLI release (#1812)
The race condition: two PRs merged 3 seconds apart both triggered the
CLI Release workflow. The second run (v0.7.12) finished last and
overwrote the release with a stale binary, even though the repo HEAD
was at v0.8.0.

- Add concurrency group so concurrent releases cancel the older one
- Add workflow_dispatch trigger for manual re-runs

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 13:52:17 -05:00
A
c76d930046
fix: prevent GCP delete from hanging on interactive project prompt (#1813)
When deleting a GCP instance, resolveProject() could trigger an
interactive prompt ("Use project?") that collides with the deletion
spinner, causing the command to hang indefinitely. This happened when
instance metadata was missing the project (pre-ee653ca instances) or
when GCP_PROJECT was set to an empty string.

Fix: run resolveProject() in non-interactive mode during deletion so it
auto-accepts the gcloud config default. Also fail fast instead of
showing an interactive picker when no project is available.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 13:52:16 -05:00
A
463c7a2efb
feat: add --custom flag for machine type/region selection (#1810)
* feat: add --custom flag for interactive machine type/region selection

By default, all clouds now skip size/region prompts and use sensible
defaults for faster provisioning. The --custom flag enables interactive
pickers on all clouds, unifying the previously inconsistent behavior
where some clouds always prompted and others never did.

- AWS: promptRegion/promptBundle gated on SPAWN_CUSTOM
- GCP: promptMachineType/promptZone gated on SPAWN_CUSTOM
- Fly: promptVmOptions gated on SPAWN_CUSTOM
- Hetzner: new promptServerType/promptLocation with type/location arrays
- DigitalOcean: new promptDropletSize/promptDoRegion with size/region arrays
- Daytona: new promptSandboxSize with cpu/memory/disk presets
- Sprite: no change (managed platform, no meaningful size options)
- --custom + --headless is an error (incompatible modes)
- Version bump to 0.8.0 (new feature)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: fix biome format violations in --custom flag code

Auto-format object literals in arrays (expand to multi-line), wrap
long console.error line, and expand inline array in test assertion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 12:29:51 -05:00
A
b51b5aa11e
feat: configure zeroclaw for full autonomy in spawned VMs (#1811)
Adds ~/.zeroclaw/config.toml with autonomy settings (equivalent to
Claude Code's dangerouslySkipPermissions) so zeroclaw runs without
approval prompts inside sandbox VMs.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 12:29:48 -05:00
A
ee653ca4a6
fix: persist GCP zone/project in connection metadata for deletion (#1809)
GCP delete was re-prompting for project/zone because saveVmConnection
didn't save metadata. Now createInstance passes zone and project as
metadata, and mergeLastConnection reads it back into history.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 08:17:44 -08:00
A
d961df7f61
fix: wrap fly ssh console -C command in bash -c for session resume (#1806) (#1807)
When resuming a Fly.io session via 'spawn list' → Enter, cmdEnterAgent
passed the stored launch command directly to 'fly ssh console -C'. The
stored command uses shell builtins (source), operators (;), and export
statements that require a bash interpreter. Without a shell wrapper,
fly exec'd the command directly (no shell), causing failures.

The fix wraps the command in `bash -c '...'`, matching the pattern already
used by interactiveSession(), runServer(), runServerCapture(), and
uploadFile() — all of which consistently wrap fly -C arguments in bash.

Fixes #1806

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 10:22:37 -05:00
A
493ca7a1cd
fix: use SSH_INTERACTIVE_OPTS in interactiveSession() for hetzner/do/aws/gcp (#1805)
BatchMode=yes in SSH_BASE_OPTS actively blocks TTY prompts in interactive
sessions. Four cloud providers (Hetzner, DigitalOcean, AWS, GCP) were
using SSH_BASE_OPTS in their interactiveSession() functions despite
SSH_INTERACTIVE_OPTS being purpose-built for this (added in PR #1795).
This also adds Compression=yes, IPQoS=lowdelay, StrictHostKeyChecking=accept-new,
and the -t flag (already included), aligning with the commands.ts reconnect path.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:25:26 -05:00
A
9aa41bfa67
fix: reject non-ASCII filenames in install.sh download validation (#1802)
Fixes #1800 - explicit ASCII check blocks Unicode lookalike bypass

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 08:35:24 -05:00
A
180b19d9f4
fix: reduce SSH interactive lag (GSSAPIAuthentication + TCPKeepAlive) (#1795)
* fix: reduce SSH interactive lag with GSSAPIAuthentication=no and TCPKeepAlive=no

GSSAPIAuthentication causes latency on every SSH interaction when
the server doesn't support Kerberos (i.e. always for our VMs).
TCPKeepAlive is redundant with ServerAliveInterval and can cause
retransmission issues through NAT/firewalls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use SSH_INTERACTIVE_OPTS for all interactive sessions

The reconnect (cmdConnect) and agent launch (cmdEnterAgent) paths
were using bare SSH with only StrictHostKeyChecking, missing all
performance flags. Now they use SSH_INTERACTIVE_OPTS which includes:

- GSSAPIAuthentication=no (skip Kerberos timeout)
- TCPKeepAlive=no (avoid NAT retransmission issues)
- ServerAliveInterval=15 (encrypted keepalives)
- Compression=yes (reduce latency on slow/distant links)
- IPQoS=lowdelay (mark packets for low-latency treatment)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 03:20:49 -05:00
A
61fc36c557
refactor: enforce isString/isNumber type guards via GritQL lint rule (#1796)
Add `lint/no-typeof-string-number.grit` plugin that bans raw
`typeof x === "string"` and `typeof x === "number"` checks. All
occurrences replaced with `isString(x)` / `isNumber(x)` from
`shared/type-guards.ts`.

This makes narrowing patterns consistent and scannable — every
type check uses the same vocabulary project-wide.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 03:20:42 -05:00
A
a26d27f139
style: enforce biome format across codebase, add CI check (#1794)
Run `biome format --write` on all 98 source files (38 needed fixes).
The main change: object literals and long argument lists are now expanded
onto separate lines per Biome's `"expand": "always"` setting, making
code much easier to scan on narrow screens.

Add `biome format` check step to CI lint workflow so formatting
regressions are caught on every PR.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 23:32:12 -08:00
A
86cae8ee32
feat: add SSH key discovery & selection across all providers (#1792)
All 4 providers (Hetzner, DO, AWS, GCP) hardcoded ~/.ssh/id_ed25519 and
duplicated key generation logic. Users with id_rsa or custom-named keys
got unwanted new keys generated. This adds a shared ssh-keys module that:

- Scans ~/.ssh/ for all valid key pairs (matching pub + private files)
- With 0 keys: generates id_ed25519 (same as before)
- With 1 key: uses it silently
- With 2+ keys: prompts multiselect (all selected by default)
- Caches the result at module level for the session
- Centralizes getSshFingerprint() (was duplicated in Hetzner + DO)
- All providers now pass -i flags for selected keys to SSH commands

Net -152 lines of duplicated code across providers.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 23:22:50 -08:00
A
b802dfbc16
refactor: extract saveLaunchCmd to history.ts (#1789)
Eliminates copy-paste of saveLaunchCmd across 8 cloud provider files.
The local/local.ts copy had already diverged (using Bun.write() instead
of writeFileSync()), confirming the maintenance risk.

Fixes #1786

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:11:14 -08:00
A
ed7ebedde4
fix: clean up stdin/TTY state before interactive session handoff (#1790)
After provisioning, @clack/prompts and readline leave stdin with stale
listeners, raw mode, and buffered input. This causes flaky keyboard input
in the interactive SSH session. Add prepareStdinForHandoff() that closes
the shared readline, removes all stdin listeners, resets raw mode, and
pauses stdin before launching the child process.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 01:56:49 -05:00
A
4a45a2c9c1
refactor: extract saveVmConnection to history.ts (#1788)
Eliminates copy-paste of saveVmConnection across 6 cloud provider files.
Fixes #1787

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 01:56:48 -05:00
A
3a554a5ada
fix: replace instanceof Error with hasMessage() duck-typing in SSH retry paths (#1785)
wrapSshCall (agent-setup.ts) and spriteRetry (sprite.ts) used `instanceof
Error` to extract error messages — an anti-pattern explicitly avoided
throughout the rest of the codebase (consistent with comments in index.ts,
commands.ts, manifest.ts, etc.). When errors cross module or bundling
boundaries, instanceof returns false even for real Error objects, causing
err.message to fall back to String(err) and producing `[object Object]` in
the retry logs. Uses `hasMessage()` from shared/type-guards for consistent
duck-typed narrowing.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 00:57:03 -05:00
A
fa34d29b7e
fix: explicitly pass SSH identity file for DigitalOcean connections (#1784)
DigitalOcean SSH was failing with "Permission denied (publickey)" because
the SSH client was not explicitly told which identity file to use. When
users have multiple SSH keys or an SSH agent with different keys loaded,
SSH may try the wrong key first and fail — especially with BatchMode=yes
which suppresses interactive fallbacks.

The fix adds `-i ~/.ssh/id_ed25519` to SSH_OPTS (matching AWS's approach)
and passes sshKeyPath to the shared waitForSsh utility, ensuring the
correct key is always used for both the handshake wait and all subsequent
SSH/SCP commands.

Fixes #1783

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 00:11:59 -05:00
A
3988ffe90e
fix: check exitCode in openBrowser() and reinit readline after @clack/prompts (#1782)
openBrowser() never checked the exitCode from Bun.spawnSync, so it silently
returned success even when the browser command failed (headless VMs, no
DISPLAY). Now checks exitCode and always shows the URL as fallback.

selectFromList() uses @clack/prompts which creates/destroys its own readline
on stdin. After it finishes, the shared readline in ui.ts can be corrupted
(Bun #1707). Now explicitly closes and nulls the shared readline after
@clack/prompts returns so the next prompt() call gets a fresh one.

Fixes #1770

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:21:06 -05:00
A
abe1f33318
fix: use sentinel values in Daytona saveVmConnection (#1778)
Daytona was writing raw sshHost/sshToken as ip/user in last-connection.json.
history.ts:mergeLastConnection() calls validateUsername() on the user field,
rejecting SSH tokens (>32 chars) and deleting the connection file. This meant
spawn list/delete/resume never showed Daytona sandboxes.

Replace with the "daytona-sandbox" sentinel (already in CONNECTION_SENTINELS
in security.ts:31 and checked by all relevant handlers in commands.ts) — the
same pattern Fly.io and Sprite use for their provider-managed SSH.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 19:41:51 -08:00
A
16c8a2b90b
fix: use getSpawnDir()/getConnectionPath() in all cloud providers (#1774)
Fixes #1769

All 8 cloud providers hard-coded `${process.env.HOME}/.spawn` for
connection data, bypassing the SPAWN_HOME env var support in history.ts.
Replaced all 16 occurrences with getSpawnDir() and getConnectionPath().

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 19:27:21 -08:00
A
ef2748069f
fix: use child_process.spawn for interactive sessions to fix TTY passthrough (#1780)
Bun.spawn() doesn't properly restore TTY state after @clack/prompts
manipulates stdin raw mode during provisioning. This causes laggy/broken
keyboard input in SSH sessions launched via `spawn run`. Node's
child_process.spawn() with stdio: "inherit" does a clean FD handoff,
matching the already-working pattern in runInteractiveCommand() used by
`spawn ls` resume.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 19:22:17 -08:00
A
0843c5e708
feat: shared SSH wait utility with TCP pre-check and stderr capture (#1779)
Replace 5 duplicated SSH wait implementations (AWS, DO, Hetzner, GCP,
Sprite) with a shared two-phase utility in cli/src/shared/ssh.ts:

- Phase 1: cheap TCP probe (2s intervals) until port 22 opens
- Phase 2: full SSH handshake (3s intervals) with stderr capture
- Adds BatchMode=yes to prevent interactive prompt hangs
- Removes ~220 lines of duplicated sleep/SSH_OPTS/waitForSsh code

Daytona (token auth) and Fly (WireGuard) left unchanged — too different.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 19:17:09 -08:00
A
b62dc1af33
feat: ban as type assertions, add runtime schema validation with valibot (#1775)
* fix: resolve all biome lint warnings across the codebase

- Replace all noExplicitAny with proper types (unknown, Record<string, unknown>)
- Fix useBlockStatements in picker.ts (braceless if)
- Fix useNumberNamespace in picker.ts (parseInt → Number.parseInt)
- Codebase now passes biome lint with 0 errors and 0 warnings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: ban `as` type assertions, add runtime schema validation with valibot

Replace all ~170 unsafe `as` type assertions across the entire codebase
(production + tests) with runtime-validated alternatives:

- Add GritQL biome plugin (`no-type-assertion.grit`) that bans all `as`
  casts except `as const`
- Add valibot for schema-validated JSON parsing (`parseJsonWith`)
- Add shared utilities: `parse.ts` (schema parsing), `type-guards.ts`
- Replace `as` casts in all 5 cloud modules (aws, daytona, hetzner,
  digitalocean, fly) with valibot schemas + type guards
- Replace `as` casts in shared modules (manifest, update-check, oauth,
  commands, history, ui)
- Replace `as any` in all 26 test files with proper `new Response()`
  mocks and typed variables
- Add 13 tests for parseJsonWith/parseJsonRaw
- Add "Embrace Bold Changes" culture rule to CLAUDE.md
- Bump version 0.6.19 → 0.7.0

1859 tests pass, 0 lint errors across 95 files, bundle +6KB from valibot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: move GritQL plugin into cli/lint/ directory

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 18:50:53 -08:00
A
f0a70b66a1
feat: multi-line layout for ls/delete — name first, then agent · cloud · time (#1777)
Entries in `spawn ls` and `spawn delete` now display as two lines:
  - Line 1: spawn name (bold)
  - Line 2: Agent · Cloud · relative time

Removes SSH connection info and prompt previews from the list display
to keep it clean and scannable.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 18:33:06 -08:00
A
b67c6a30e1
fix: use minimal cloud-init tier for Claude Code (#1776)
The `installClaudeCode()` SSH step already handles Node.js and Claude Code
installation with retries and fallbacks, making the cloud-init Node/npm
install redundant. Switch to "minimal" so cloud-init only installs
curl/unzip/git/ca-certificates — finishing faster and eliminating the
duplicate install path.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 18:18:43 -08:00
A
15d769dfa6
fix: remove git dependency from install script to avoid macOS Xcode CLT trigger (#1773)
macOS ships a /usr/bin/git shim that triggers a ~1.5GB Xcode CLT download
when invoked. The install script's `command -v git` check was fooled by
this shim, causing the script to hang or silently fail on fresh macOS.

Removes the git clone path entirely — the curl-based download is fast,
reliable, and has zero external dependencies beyond curl and bun.

Closes #1768

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 17:39:54 -08:00
A
ec210e37af
fix: Result monad for retry logic — prevent duplicate server creation (#1771)
* fix: Result monad for retry logic — prevent duplicate server creation

SSH exit 255 after an interactive session caused runWithRetries to retry
the entire bash script, creating duplicate servers. The old withRetry
also blindly retried all errors including timeouts where the remote
command may have already completed.

Introduces a Result<T> monad (Ok/Err) so callers explicitly signal
whether a failure is retryable (return Err) or fatal (throw). Adds
wrapSshCall() that classifies SSH errors: transient connection failures
are retryable, timeouts are not. Removes retry loop from the top-level
script runner entirely since it spans server creation + interactive
session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: mandate draft-PR-first workflow for all changes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add biome lint to CI and pre-commit hook, fix lint violations

- Add Biome lint job to .github/workflows/lint.yml
- Add TypeScript lint check to .githooks/pre-commit
- Fix useBlockStatements violations in ui.ts and tests
- Add biome lint to CLAUDE.md "After Each Change" checklist

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: rename Result.value to Result.data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: clean up stale pre-commit hook

- Remove dead check for deleted functions (write_oauth_response_file,
  create_oauth_response_html) — they no longer exist in the codebase
- Fix early exit skipping Biome lint when no .sh files are staged
- Replace echo -e with printf (the hook was using the pattern it bans)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve biome lint errors blocking CI

- Fix useImportType: import { type Result } → import type { Result }
- Fix noUnusedImports: remove unused KNOWN_FLAGS import
- Fix noUnusedTemplateLiteral: template literal → string literal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 20:39:42 -05:00
A
2413db6ade
fix: truncate picker lines to terminal width to prevent redraw corruption (#1772)
Long labels (e.g. "Claude Code on GCP Compute Engine -- spawn-trial-000-ahmed")
wrap to multiple rows, but the redraw logic uses a fixed line count to cursor-up.
This causes old content to pile up on every arrow-key press.

Query terminal width via `stty size` and truncate all lines to fit within
a single row, with a 1-char margin to prevent auto-wrap edge cases.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 17:22:46 -08:00
A
8112276121
feat: add delete sub-menu (destroy/remove) and spawn kill alias (#1765)
Pressing `d` in the server picker now shows a sub-menu:
- Destroy server: hard delete (destroys cloud VM + marks deleted)
- Remove from history: soft delete (removes entry, no cloud API call)
- Cancel: go back to picker

Also adds `kill` as an alias for `spawn delete`.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 15:23:49 -08:00
A
97992dc6a2
feat: add retry logic for failure-prone orchestration operations (#1764)
Agent installation, config upload, env setup, and agent configuration
can all fail transiently due to network flakiness or SSH instability
on fresh VMs. Add a shared withRetry() helper and wrap these operations
with 2-attempt retries to improve reliability without over-engineering.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 15:20:16 -08:00
A
63bce1bd04
security: sanitize TERM env var in interactiveSession to prevent shell injection (#1763)
All 6 cloud providers interpolated process.env.TERM directly into shell
commands without validation. A malicious TERM value (e.g., containing
$(cmd)) would execute on the remote server, potentially exfiltrating
OPENROUTER_API_KEY and other credentials.

Add sanitizeTermValue() allowlist (alphanumeric, dots, hyphens, underscores)
to cli/src/shared/ui.ts and apply it in all interactiveSession functions.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 18:11:09 -05:00
A
c958d3d41b
feat: unify list/delete commands with inline delete picker (#1762)
Both `spawn list` and `spawn delete` now share a single interactive
picker (`activeServerPicker`) backed by `getActiveServers()`. Pressing
`d` in the picker triggers inline delete-and-refresh without leaving
the list. Failed deletions now mark entries as deleted so users aren't
stuck with phantom servers they can't clear.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 18:10:49 -05:00
A
986a6ff371
fix: add remote path validation to GCP uploadFile (missing vs all other providers) (#1760)
All 6 other cloud providers (Fly, Hetzner, DigitalOcean, AWS, Sprite, Daytona)
validate remotePath with an allowlist regex before passing it to scp. GCP's
uploadFile had no validation at all, breaking the defense-in-depth pattern.

Adds the same allowlist check (^[a-zA-Z0-9/_.~$-]+$) plus dotdot check.
The regex includes $ to allow $HOME prefix paths used by agent-setup.ts.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 18:10:28 -05:00
A
545ddafe4a
fix: extract flags module to fix KNOWN_FLAGS drift in tests (#1757)
KNOWN_FLAGS in unknown-flags.test.ts was copy-pasted from index.ts and
was missing the --name flag, causing silent test gaps. Extract
KNOWN_FLAGS, findUnknownFlag, and expandEqualsFlags into a new flags.ts
module so tests import the real source of truth.

- Create cli/src/flags.ts with KNOWN_FLAGS, findUnknownFlag, expandEqualsFlags
- Update index.ts to import from flags.ts (checkUnknownFlags now uses findUnknownFlag)
- Update unknown-flags.test.ts to import from flags.ts instead of copy-pasting
- Add tests for --name flag, KNOWN_FLAGS completeness, and expandEqualsFlags
- Bump CLI version to 0.6.15

Fixes #1744

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 18:10:07 -05:00
A
7e7d4aa3d7
fix: add SSH keepalives, increase cloud-init patience, simplify openclaw launch (#1761)
- Add ServerAliveInterval=15 + ServerAliveCountMax=3 to SSH_OPTS on all
  clouds (DO, Hetzner, AWS, GCP) to prevent silent TCP drops during long
  idle periods (e.g. waiting on slow LLM API calls). Daytona already had
  these.
- Increase DigitalOcean cloud-init fallback poll from 6×5s (30s) to
  20×5s (100s) so full-tier installs (build-essential + bun + node)
  have time to finish when the streaming tail path fails.
- Replace `source ~/.zshrc` with explicit PATH export in openclaw launch
  command to avoid side effects from zshrc inside bash -l.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 14:54:35 -08:00
A
fdd6a9b6c3
chore: harden biome lint rules and auto-fix codebase (#1759)
* chore: harden biome lint rules and auto-fix codebase

Add strict biome rules for better TypeScript code quality:
- useBlockStatements: enforce braces on all control flow
- useConst: prefer const over let
- useNodejsImportProtocol: require node: prefix for builtins
- noUnusedImports/Variables: error (warn in tests)
- noExplicitAny: warn in source, off in tests
- noDoubleEquals, noAssignInExpressions, noFallthroughSwitchClause
- useNumberNamespace (Number.isNaN over isNaN)
- noImplicitAnyLet, noInferrableTypes, noUselessElse

Auto-fixed 55 files. Tests relaxed for any/unused patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: enable biome formatter with expand: always for brace newlines

Enable biome formatter with:
- expand: "always" — braces on their own lines
- indentStyle: space, indentWidth: 2
- lineWidth: 120
- arrowParentheses: always
- trailingCommas: all
- semicolons: always

82 files reformatted. All 1819 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 14:37:47 -08:00
A
f3a2b85b5b
fix: always confirm cloud resource name with user, even when SPAWN_NAME is set (#1758)
When the CLI collects a display name (SPAWN_NAME), each cloud now shows
the kebab-case derivative as the default in the resource name prompt
instead of silently accepting it. Users can hit Enter to accept or type
an override. Non-interactive mode still skips the prompt.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-22 14:25:34 -08:00