Commit graph

66 commits

Author SHA1 Message Date
A
5e0144b645
fix(zeroclaw): remove broken zeroclaw agent (repo 404) (#3107)
* fix(zeroclaw): remove broken zeroclaw agent (repo 404)

The zeroclaw-labs/zeroclaw GitHub repository returns 404 — all installs
fail. Remove zeroclaw entirely from the matrix: agent definition,
setup code, shell scripts, e2e tests, packer config, skill files,
and documentation.

Fixes #3102

Agent: code-health
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(zeroclaw): remove stale zeroclaw reference from discovery.md ARM agents list

Addresses security review on PR #3107 — the last remaining zeroclaw
reference in .claude/rules/discovery.md is now removed.

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(zeroclaw): remove remaining stale zeroclaw references from CI/packer

Remove zeroclaw from:
- .github/workflows/agent-tarballs.yml ARM build matrix
- .github/workflows/docker.yml agent matrix
- packer/digitalocean.pkr.hcl comment
- sh/e2e/e2e.sh comment

Addresses all 5 stale references flagged in security review of PR #3107.

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-30 15:35:40 -07:00
A
e44705d925
fix(ux): reduce SSH wait verbosity and clarify agent handoff (#3056)
- Replace repeated 'SSH port closed (N/36)' with periodic updates every 5 attempts
- Add clear 'Provisioning complete. Connecting...' line before agent attach
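
The throttling described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the interval of 5 and the total of 36 come from the commit message, while the function names and the choice to always log the first and last attempt are assumptions.

```typescript
// Hypothetical helper: decide whether a polling attempt should print a status line.
// Assumption: the first and last attempts are always logged so the user sees
// both the start of the wait and the final try.
function shouldLogSshAttempt(attempt: number, maxAttempts = 36, every = 5): boolean {
  if (attempt === 1 || attempt === maxAttempts) return true;
  return attempt % every === 0;
}

// Collect the status lines a full wait would print (instead of one per attempt).
function sshWaitMessages(maxAttempts = 36): string[] {
  const lines: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (shouldLogSshAttempt(attempt, maxAttempts)) {
      lines.push(`SSH port closed (${attempt}/${maxAttempts})`);
    }
  }
  return lines;
}
```

Under these assumptions a full 36-attempt wait prints 9 lines rather than 36.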

Fixes #3053

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-27 15:22:46 +07:00
A
4ac4a7e0cf
feat: recursive spawn tree passback (#3023)
* feat: pull child spawn history back to parent for `spawn tree`

When the interactive session ends (or headless mode completes), the
parent downloads the child VM's history.json and merges records into
local history. Before downloading, it runs `spawn pull-history` on the
child, which recursively pulls from all grandchildren — so the full
tree collapses up to the root regardless of depth.

Changes:
- Add getParentFields() — sets parent_id/depth on saveSpawnRecord calls
- Add pullChildHistory() — downloads + merges child history after session
- Add `spawn pull-history` command for recursive SSH-based history pull
- Add 11 tests for parseAndMergeChildHistory
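
A rough sketch of the merge step, assuming records are deduplicated by `id`. The `parent_id` and `depth` fields are named in the commit; the record shape, the function name `mergeChildRecords`, and the dedup rule are assumptions made for illustration and may differ from parseAndMergeChildHistory.

```typescript
// Assumed minimal record shape; the real SpawnRecord has more fields.
interface SpawnRecord {
  id: string;
  parent_id?: string;
  depth?: number;
}

// Merge records pulled from a child VM into the parent's local history.
// Records whose id is already present locally are skipped, so pulling the
// same child twice does not create duplicates.
function mergeChildRecords(local: SpawnRecord[], child: SpawnRecord[]): SpawnRecord[] {
  const seen = new Set(local.map((record) => record.id));
  const merged = [...local];
  for (const record of child) {
    if (seen.has(record.id)) continue;
    seen.add(record.id);
    merged.push(record);
  }
  return merged;
}
```

Because each child runs the same pull against its own children first, applying this merge at every level is what lets the full tree collapse up to the root.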

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: trigger CI recompute

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): validate user/ip params before SSH exec in pull-history

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): use shared validators for SSH params in pull-history and delete

Replace inline regex checks in pull-history.ts with validateUsername()
and validateConnectionIP() from security.ts, matching the pattern used
across connect.ts, fix.ts, and link.ts. Also add the same validation
to delete.ts:pullChildHistory which had no SSH parameter validation.

orchestrate.ts uses the runner abstraction (not raw user@ip), so its
SSH params come from the cloud provider, not untrusted history records.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-26 15:21:50 -07:00
A
2dd87c986d
feat(cli): add star-the-repo nudge after successful spawns (#3025)
Shows a non-intrusive "Enjoying Spawn? Star us on GitHub!" message
to returning users (2+ successful spawns) after a successful spawn
session completes. Shown at most once per 30 days.

- New `maybeShowStarPrompt()` in `shared/star-prompt.ts`
- Tracks `starPromptShownAt` in `~/.config/spawn/preferences.json`
- Called after `execScript()` returns success in cmdRun, cmdInteractive,
  and cmdAgentInteractive (skipped in headless mode)
- The `execScript()` return type changed from `void` to `boolean`
  to indicate whether the script ran successfully
- Added 7 unit tests covering all gate conditions
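
The gate conditions listed above might look roughly like this. The `starPromptShownAt` field name is from the commit; storing it as epoch milliseconds, the `successfulSpawns` counter, and the function name are assumptions, so treat this as a sketch rather than the contents of `shared/star-prompt.ts`.

```typescript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

interface StarPromptState {
  successfulSpawns: number;      // assumed counter of past successful spawns
  starPromptShownAt?: number;    // assumed to be epoch milliseconds
  headless: boolean;
  lastRunSucceeded: boolean;
}

// Returns true only for returning users, after a successful interactive run,
// and at most once per 30 days.
function shouldShowStarPrompt(state: StarPromptState, now: number): boolean {
  if (state.headless || !state.lastRunSucceeded) return false;
  if (state.successfulSpawns < 2) return false;
  if (state.starPromptShownAt === undefined) return true;
  return now - state.starPromptShownAt >= THIRTY_DAYS_MS;
}
```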

Fixes #3020

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-27 03:15:12 +07:00
A
405dbc6ba6
refactor: use getSpawnCloudConfigPath(), remove dead _cloudName param (#3010) (#3012)
Replace hand-constructed openrouter.json path with getSpawnCloudConfigPath("openrouter")
for single-source-of-truth path resolution. Remove unused _cloudName parameter since
the function delegates ALL cloud credentials unconditionally.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 19:26:09 +07:00
A
fd36ff0e3d
fix(security): add base64 validation guards in orchestrate.ts (fixes #3006) (#3007)
Add /^[A-Za-z0-9+/=]+$/ validation after each .toString("base64") call
in delegateCloudCredentials() and injectEnvVars(), consistent with the
pattern established in agent-setup.ts by #2988.
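
As an illustration of the guard, here is a small sketch using the regex quoted above. The helper name `assertBase64` and the error message are hypothetical; the actual checks in orchestrate.ts may be written inline.

```typescript
// Pattern from the commit message: only the standard base64 alphabet plus padding.
const BASE64_PATTERN = /^[A-Za-z0-9+/=]+$/;

// Hypothetical helper: refuse to interpolate anything into a shell command
// unless it consists purely of base64 characters.
function assertBase64(encoded: string): string {
  if (!BASE64_PATTERN.test(encoded)) {
    throw new Error("Unexpected characters in base64 payload");
  }
  return encoded;
}
```

The check is defensive: a correct encoder never produces other characters, so a failure here signals that something upstream is not what the caller thinks it is.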

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 18:25:40 +07:00
A
52d06c4cb5
fix: resolve ANSI spinner corruption and garbled output (#3001) (#3003)
* fix(ux): replace download spinner with stderr logging, reset terminal before SSH handoff

Fixes two UX issues from live E2E session (#3001):

1. Download spinner (p.spinner from @clack/prompts) wrote ANSI escape codes
   to stdout. When stdout is captured (E2E harness, piped output), these
   sequences appeared as raw text rather than rendered colors. Replace
   p.spinner() in downloadScriptWithFallback and downloadBundle with
   logStep/logInfo/logError from shared/ui.ts, which write to stderr and
   correctly check isTTY before emitting ANSI codes.

2. Garbled output at start of interactive session (overlapping status lines
   from the remote agent's TUI) may be caused by residual ANSI state from
   @clack/prompts (hidden cursor, active color attributes). Emit
   ESC[?25h ESC[0m to stderr before prepareStdinForHandoff() to explicitly
   restore cursor visibility and reset all attributes before the SSH session
   takes over.

Agent: issue-fixer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve ANSI spinner corruption and garbled output in interactive mode (#3001)

Three root causes fixed:

1. Spinner wrote to stdout while all other CLI status output goes to stderr,
   causing ANSI escape sequence interleaving and corruption when both streams
   are merged on a terminal. Redirected all p.spinner() calls to process.stderr.

2. unicode-detect.ts (which sets TERM=linux for SSH sessions to force ASCII
   fallback) was only imported in commands/shared.ts but not in shared/ui.ts.
   Cloud module entry points (hetzner/main.ts, etc.) that import shared/ui.ts
   loaded @clack/prompts without the TERM override, causing Unicode spinner
   frames in environments that can't render them.

3. After an interactive SSH session ends, the remote agent's TUI (e.g. Claude
   Code) may leave the terminal in raw mode with altered attributes. Added
   terminal reset (ANSI attribute reset + stty sane) after spawnInteractive()
   returns to prevent garbled post-session output.

Agent: ux-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 15:28:32 +07:00
Ahmed Abushagur
7fe36b8aa0
fix: delegate ALL cloud credentials, not just the current cloud (#2994)
delegateCloudCredentials only copied the current cloud's config file
(e.g. sprite.json when spawning on Sprite). Child VMs couldn't spawn
on other clouds because their tokens weren't forwarded.

Now iterates all known clouds and copies every credential file that
exists locally, so the agent can spawn children on any cloud.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 19:02:42 -07:00
A
a7f3e9da82
refactor: remove dead code and stale references (#2996)
- Remove `export` from `getTerminalWidth` in commands/info.ts — only
  used internally, not exported from commands/index.ts barrel
- Remove `export` from `makeDockerExec` in shared/orchestrate.ts — only
  used internally by `makeDockerRunner`, no external callers
- Bump CLI version to 0.26.6

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-26 08:40:42 +07:00
Ahmed Abushagur
90dde882d0
fix: installSpawnCli fails on Sprite — bun shim doesn't work (#2993)
Sprite has a bun shim at /.sprite/bin/bun that delegates to
$HOME/.bun/bin/bun, but that binary doesn't exist on fresh VMs.
`command -v bun` returns true (finds the shim) so the install script
skips bun installation, then bun fails when actually invoked.

Fixed in two places:
- installSpawnCli: source shell profiles, test `bun --version` (not
  just existence), and install bun fresh if it doesn't work
- install.sh: replace `command -v bun` with `bun --version` to detect
  broken shims
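
The difference between the two checks can be sketched with an injected command runner. `RunCommand`, `bunOnPath`, and `bunNeedsInstall` are hypothetical names for illustration; the real fix lives in shell inside installSpawnCli and install.sh.

```typescript
// Hypothetical runner: executes a shell command and returns its exit code.
type RunCommand = (command: string) => number;

// Old check: only proves that something named `bun` is on PATH (possibly a shim).
function bunOnPath(run: RunCommand): boolean {
  return run("command -v bun") === 0;
}

// New check: proves the binary behind the name actually executes.
function bunNeedsInstall(run: RunCommand): boolean {
  return run("bun --version") !== 0;
}
```

On a fresh Sprite VM as described above, `bunOnPath` would report true while `bunNeedsInstall` also reports true, which is exactly the case the old check missed.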

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 07:36:12 +07:00
Ahmed Abushagur
b47d6bbe1d
fix: embed skill content instead of reading from disk (#2992)
* fix: spawn step skipped when no explicit --steps passed

The spawn skill injection condition used `enabledSteps?.has("spawn")`
which is falsy when enabledSteps is undefined (no --steps flag). Now
checks the recursive beta flag directly and falls through when no
explicit steps are selected, matching how auto-update works.
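
A small sketch of the before and after conditions. The function names are hypothetical, and the assumption that an explicit `--steps` list without "spawn" still disables the step is inferred from the message, not confirmed by it.

```typescript
// Old condition: undefined?.has("spawn") evaluates to undefined, which is falsy,
// so the step was skipped whenever --steps was not passed.
function oldShouldInject(enabledSteps?: Set<string>): boolean {
  return Boolean(enabledSteps?.has("spawn"));
}

// New condition (sketch): gate on the beta flag, and only filter by step name
// when the user explicitly selected steps.
function shouldInjectSpawnSkill(recursiveBeta: boolean, enabledSteps?: Set<string>): boolean {
  if (!recursiveBeta) return false;
  if (enabledSteps === undefined) return true;
  return enabledSteps.has("spawn");
}
```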

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: embed skill content in spawn-skill.ts instead of reading from disk

The skills/ directory exists in the repo but isn't bundled when the CLI
is installed via npm. readSkillContent() couldn't find the files at
runtime, causing "No spawn skill file for agent" on every deploy.

Fixed by embedding all skill content directly as string constants in the
module. Removed fs-based getSkillsDir/readSkillContent/getSpawnSkillSourceFile
in favor of a single AGENT_SKILLS config map with inline content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 06:16:52 +07:00
Ahmed Abushagur
17817533a4
feat: skill injection — teach agents how to use spawn on recursive VMs (#2989)
When `--beta recursive` is active, a new "Spawn CLI" setup step injects
agent-native instruction files teaching each agent how to use the `spawn`
CLI to create child VMs. Skill files live in `skills/` at the repo root
and use each agent's native format (YAML frontmatter for Claude/Codex/
OpenClaw, plain markdown for others, append mode for Hermes).

- Add `skills/` directory with 8 agent-specific skill files
- Add `spawn-skill.ts` module with path mapping, file reading, and injection
- Register "spawn" as a conditional setup step gated by `--beta recursive`
- Wire `injectSpawnSkill()` into orchestrate.ts postInstall flow
- Add 52 tests covering path mapping, append mode, file existence, injection
- Bump CLI version to 0.26.0 (minor: new feature)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 15:32:20 -07:00
A
7194058c64
fix(security): add input validation to makeDockerExec (#2987)
Adds non-empty guard to makeDockerExec to make the security boundary
explicit and prevent silent misuse with empty commands.

Fixes #2985

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 15:17:53 -07:00
Ahmed Abushagur
b0674550c6
feat: recursive spawn (--beta recursive) (#2978)
* feat: add recursive spawn (--beta recursive)

Enables VMs to spawn child VMs. When --beta recursive is active:
- Injects SPAWN_PARENT_ID, SPAWN_DEPTH, SPAWN_BETA=recursive into .spawnrc
- Installs spawn CLI on the VM via install.sh
- Delegates cloud + OpenRouter credentials to the VM
- Tracks parent_id and depth on SpawnRecord for tree relationships
- Adds `spawn tree` command for full recursive tree view
- Adds `spawn history export` for pulling child history via SSH
- Adds `spawn list --json` and `spawn list --flat` flags
- Adds tree rendering in `spawn list` when parent-child relationships exist
- Adds cascade delete support in delete.ts
- Adds mergeChildHistory() for backward-pass history sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add recursive spawn to README

Add --beta recursive to beta features table, new commands
(spawn tree, spawn history export, spawn list --flat/--json)
to commands table, and a dedicated Recursive Spawn section
with usage examples for tree view and cascade delete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add cmdTree coverage tests to fix mock test CI

The CI coverage threshold (90% functions, 80% lines) was failing
because tree.ts had 0% coverage. Added tests that exercise cmdTree
with empty history, tree rendering, JSON output, flat records,
and deleted/depth labels. tree.ts now has 100% coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(security): validate cloudName and use valibot in pullChildHistory

- Add cloudName validation against ^[a-z0-9-]+$ to prevent
  command injection in delegateCloudCredentials
- Export SpawnRecordSchema from history.ts and replace loose
  type guard with valibot schema validation in pullChildHistory
- Resolve merge conflicts with main (include both docker and
  recursive beta features)
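
A sketch of the cloudName check, using the pattern quoted above. The helper name `credentialFileName` is hypothetical; the `<cloud>.json` naming follows the sprite.json and openrouter.json files mentioned elsewhere in this log.

```typescript
// Pattern from the commit message: lowercase letters, digits, and hyphens only.
const CLOUD_NAME_PATTERN = /^[a-z0-9-]+$/;

// Hypothetical helper: validate the name before it is used to build a file
// name that later ends up inside a remote shell command.
function credentialFileName(cloudName: string): string {
  if (!CLOUD_NAME_PATTERN.test(cloudName)) {
    throw new Error(`Invalid cloud name: ${JSON.stringify(cloudName)}`);
  }
  return `${cloudName}.json`;
}
```

An allowlist pattern like this rejects quotes, semicolons, slashes, and spaces in one step, so there is no need to enumerate dangerous characters.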

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* test: add installSpawnCli and delegateCloudCredentials coverage

Export and test installSpawnCli (success + timeout failure paths)
and delegateCloudCredentials (no creds, with creds, write failure,
mkdir failure paths) to improve orchestrate.ts function coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: gritQL rule false positives and delete.ts coverage

- use TsAsExpression() AST node instead of backtick pattern to avoid
  matching import aliases as type assertions
- export and test findDescendants() and pullChildHistory() to bring
  delete.ts line coverage above the 35% threshold
- add 8 new tests for descendant finding and history pull edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: A <258483684+la14-1@users.noreply.github.com>
2026-03-25 10:42:09 -07:00
Ahmed Abushagur
53189b80a2
fix: remove docker from --fast and fix docker cp into container (#2976)
* fix: remove docker from --fast and fix docker cp into container

Two fixes for --beta docker:

1. Remove "docker" from --fast beta features — --fast was auto-enabling
   --beta docker, pulling ghcr images that hang the session.
   Users must now opt in explicitly with --beta docker.

2. Fix uploadFile in docker mode — .spawnrc was uploaded to the host
   but never copied into the container. Add docker cp after SCP upload
   so env vars and configs reach the agent inside the container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: keep docker in --fast beta features

The docker cp fix resolves the hang — no need to remove docker from
--fast. The issue was missing file copy into the container, not the
docker mode itself.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: extract makeDockerRunner helper, fix uploadFile into container

Add makeDockerRunner() that wraps a CloudRunner so all commands and
file uploads target the Docker container. Replaces inline lambdas in
hetzner/main.ts and gcp/main.ts with a clean one-liner.

The key fix: uploadFile now docker cp's files into the container after
SCP — previously .spawnrc (API keys, env vars) only landed on the host,
so the agent inside the container had no config and hung.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(security): shellQuote remotePath in docker cp command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 14:52:05 +07:00
Ahmed Abushagur
a551cb2401
fix: remove local tarball download path (#2970)
* fix: remove local tarball download, use remote-only tarball install

The local-download-then-SCP-upload path was unnecessary complexity —
downloading a tarball to the user's machine just to re-upload it to the
VM is wasteful. The VM downloads directly from GitHub instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: force zeroclaw native runtime to prevent Docker container hang

ZeroClaw auto-detects Docker and launches in a container (pulling
ghcr.io/openrouterteam/spawn-zeroclaw), which hangs the interactive
session. Force native mode via ZEROCLAW_RUNTIME=native env var and
adapter = "native" in config.toml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable openclaw Docker sandbox to prevent container hang

Same issue as zeroclaw — openclaw auto-detects Docker and runs agents
in containers, hanging the interactive session. Disable via
agents.defaults.sandbox.mode = off in config and fallback JSON.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable codex Docker sandbox to prevent container hang

Codex CLI also auto-detects Docker for sandboxing. Set
sandbox_mode = "danger-full-access" in config.toml — the VM itself
provides isolation, Docker sandboxing just causes hangs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 21:42:31 -07:00
A
650708e30d
refactor: remove dead code and stale references (#2966)
Extract duplicate dockerExec helper from gcp/main.ts and hetzner/main.ts
into shared makeDockerExec() in orchestrate.ts. Both local functions were
identical — wrapping commands with docker exec using DOCKER_CONTAINER_NAME
and shellQuote.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 14:24:33 -07:00
A
d3889519bc
fix(e2e): fix --fast mode for native binary agents on Sprite (#2965)
Add 180s timeout to uploadFileSprite to prevent indefinite hangs during
tarball uploads. Without a timeout, large tarballs or stalled Sprite
connections block the entire provisioning pipeline past the 720s E2E
provision timeout, causing agent binary not-found failures for openclaw,
zeroclaw, and codex.

Also skip the redundant remote tarball download fallback when a local
tarball was already downloaded but its upload/extract failed -- the
remote download would face the same extraction issues. This saves ~150s
in the fallback chain, leaving enough time for the live install to
complete within the provision timeout.

Fixes #2960

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 02:59:11 +07:00
A
7aba20e327
fix(ux): deduplicate install messages, add newlines to SSH polling, clarify completion messages (#2900)
- Suppress stdout+stderr from `claude install --force` to prevent duplicate
  "successfully installed" messages (was printed up to 4x)
- Make logStepInline fall back to newline-separated output when stderr is not
  a TTY, so SSH port polling status is readable in piped/captured contexts
- Consolidate post-install completion messages into a single clear milestone:
  "Agent setup complete -- {agent} is ready on {cloud}"
- Bump CLI version to 0.25.16

Fixes #2899

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 15:26:34 +07:00
A
f1f2667cb0
fix: skip interactive session in headless mode (#2895)
* fix: skip interactive session in headless mode (#2892)

When SPAWN_HEADLESS=1, the orchestrator now exits with code 0 after
provisioning completes instead of attempting to launch the agent
interactively. This fixes Claude Code (and other agents) failing with
"Input must be provided through stdin or --prompt" when spawned via
`--headless --output json` without a prompt.

The VM is fully provisioned and ready — callers can SSH in or use
`spawn connect` to start the agent manually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: clean up SPAWN_HEADLESS env in test afterEach to prevent leaks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-22 21:38:53 -07:00
A
3f12cb9ee8
refactor: consolidate duplicate docker constants into shared orchestrate module (#2860)
Consolidate DOCKER_CONTAINER_NAME and DOCKER_REGISTRY constants from
gcp/main.ts and hetzner/main.ts into shared/orchestrate.ts. Both files
defined identical values ("spawn-agent" and "ghcr.io/openrouterteam"); they
now import the shared exports instead.

Bumps CLI patch version to 0.25.11.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-21 14:27:21 -07:00
Ahmed Abushagur
8c7a381375
fix: auto-reconnect on Sprite connection drops (#2855)
Sprite CLI exits with code 1 on "connection closed" (not 255 like SSH).
The reconnect loop now treats exit code 1 on Sprite as a connection
drop, retrying up to 5 times with a 3s delay between attempts.
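
The exit-code handling can be sketched as a pure decision function. The codes (255 for SSH, 1 for Sprite), the limit of 5, and the 3s delay are from the commit messages; the function names, the `"sprite"` cloud identifier, and treating 255 as a drop on every cloud are assumptions.

```typescript
const MAX_RECONNECTS = 5;
const RECONNECT_DELAY_MS = 3000;

// A connection drop is exit code 255 (SSH) or, on Sprite only, exit code 1.
function isConnectionDrop(exitCode: number, cloud: string): boolean {
  if (exitCode === 255) return true;
  return cloud === "sprite" && exitCode === 1;
}

// Returns the delay before the next reconnect attempt, or null when the
// session should end (clean exit, non-drop failure, or retries exhausted).
function nextReconnectDelay(exitCode: number, cloud: string, attemptsSoFar: number): number | null {
  if (!isConnectionDrop(exitCode, cloud)) return null;
  if (attemptsSoFar >= MAX_RECONNECTS) return null;
  return RECONNECT_DELAY_MS;
}
```

Keeping the Sprite rule scoped to Sprite matters because exit code 1 on other clouds usually means the remote command itself failed, which should not trigger a reconnect.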

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:13:14 +07:00
Ahmed Abushagur
26332afa56
fix: prevent silent exit in --fast mode on Sprite (#2852)
In fast mode, Promise.allSettled runs server boot, OAuth, and tarball
download concurrently. When all operations complete — especially after
Bun.serve.stop(true) in the OAuth flow removes its event loop handle —
the event loop can appear empty before the await continuation starts
new I/O operations. This causes Bun to exit silently with code 0,
dropping the user back to their shell after "Successfully obtained
OpenRouter API key via OAuth!" with no error.

Fix: keep a dummy setInterval handle alive during the fast-mode
concurrent section so the event loop never drains prematurely.
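
A minimal sketch of the keep-alive idea, assuming a wrapper function. The name `withKeepAlive` and the one-second interval are illustrative choices, not details taken from the PR.

```typescript
// Run `work` while holding a timer handle open. A pending interval counts as an
// active handle, so the runtime does not see an empty event loop in the gap
// between one batch of I/O finishing and the next one starting.
async function withKeepAlive<T>(work: () => Promise<T>): Promise<T> {
  const handle = setInterval(() => {}, 1000);
  try {
    return await work();
  } finally {
    // Always release the handle, otherwise the process would never exit.
    clearInterval(handle);
  }
}
```

A caller would wrap the concurrent section, for example `withKeepAlive(() => Promise.allSettled([...]))`, so the handle lives exactly as long as the section does.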

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:51:02 -07:00
A
1dc9c04eeb
fix: standardize ESM import extensions across 35 production files (#2827)
Add .js extensions to 124 relative imports that were missing them.
The codebase is "type": "module" (ESM) and the dominant pattern already
used .js extensions, but 35 files had a mix of extensionless and .js
imports — sometimes within the same file. Standardize to .js everywhere.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 08:51:40 -07:00
A
24bdf664ab
fix(types): resolve TypeScript strict mode errors in production code (#2824)
Fix 24 TypeScript strict mode errors across 7 production files:

- interactive.ts: guard against undefined `val` in validate callback
- list.ts: use already-narrowed `conn` variable instead of `selected.connection`
- run.ts: widen `buildCloudLines` defaults param to `Record<string, unknown>`
- digitalocean.ts: use `toRecord()` to safely drill into nested API responses;
  capture narrowed `oauthCode` in const for async closure
- history.ts: backfill missing record IDs via `backfillRecordIds()` helper;
  use `v.safeParse` output directly to get properly typed records
- index.ts: use `Manifest` type for `showUnknownCommandError` parameter
- orchestrate.ts: capture narrowed `tunnel` and `getConnectionInfo` in const
  variables before async closures
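
The "capture narrowed value in a const" fixes follow a common strict-mode pattern, sketched here with hypothetical types. The `Tunnel` and `Options` shapes and the function name are assumptions; the closure is synchronous for brevity, and the same reasoning applies to async closures.

```typescript
// Hypothetical shapes for illustration only.
interface Tunnel {
  port: number;
}
interface Options {
  tunnel: Tunnel | undefined;
}

function makePortGetter(options: Options): () => number {
  if (!options.tunnel) {
    return () => -1;
  }
  // Writing `() => options.tunnel.port` here fails under strict mode: inside the
  // closure the property is typed `Tunnel | undefined` again, because it could
  // be reassigned before the closure runs. A const keeps the narrowed type.
  const tunnel = options.tunnel;
  return () => tunnel.port;
}
```

The const also pins the value itself, so a later reassignment of `options.tunnel` cannot change what the closure sees.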

Fixes #2821

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-20 03:17:04 -07:00
Ahmed Abushagur
aa4b2a23d6
feat: auto-reconnect on SSH drops during interactive session (#2806)
When SSH exits with code 255 (connection dropped/timed out), retry up
to 5 times with 3s delay between attempts. Clean exits (0), Ctrl+C
(130), and agent crashes exit immediately without retrying.

Only applies to remote clouds — local sessions skip reconnect logic.

Signed-off-by: L <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-19 22:28:10 -07:00
Ahmed Abushagur
ed127cf592
feat: never-give-up resilience layer (#2807)
* feat: never-give-up resilience layer — retry every failure instead of exiting

Add retryOrQuit() helper to shared/ui.ts that prompts "Try again? (Y/n)"
after any recoverable failure. Wrap all fatal exit points with retry loops:

- Cloud auth (Hetzner, DigitalOcean, AWS, GCP): retry after 3 failed tokens
- API key acquisition: retry after 3 failed OAuth+manual attempts
- Server creation: retry on any createServer failure (both fast & sequential)
- SSH readiness: retry on waitForReady timeout
- Agent install: retry on install failure
- Pre-launch hooks: retry on preLaunch failure

Non-interactive mode (SPAWN_NON_INTERACTIVE=1) still throws immediately.
Ctrl+C at any retry prompt exits cleanly.
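
The retry wrapper can be sketched as below. This is a simplified, synchronous stand-in and not the retryOrQuit() from shared/ui.ts: the name `runWithRetry`, the injected `askRetry` callback, and rethrowing the error when the user declines are all assumptions.

```typescript
// Keep retrying `operation` for as long as the user agrees to try again.
// In non-interactive mode the first failure is rethrown immediately.
function runWithRetry<T>(
  operation: () => T,
  askRetry: (message: string) => boolean, // stand-in for the "Try again? (Y/n)" prompt
  nonInteractive: boolean,
): T {
  for (;;) {
    try {
      return operation();
    } catch (error) {
      if (nonInteractive) throw error;
      if (!askRetry("Try again? (Y/n)")) throw error;
    }
  }
}
```

There is deliberately no attempt limit in this sketch: per the commit title, the user decides when to give up.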

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(e2e): add AI-driven interactive test harness

Add --interactive mode to the E2E test framework. Instead of running spawn
in headless mode (SPAWN_NON_INTERACTIVE=1), this spawns the CLI in a real
PTY and uses Claude Haiku to respond to prompts like a human user would.

New files:
- sh/e2e/interactive-harness.ts — Bun script that drives the PTY + AI loop
- sh/e2e/lib/interactive.sh — Bash integration with the E2E framework

Usage:
  e2e.sh --cloud hetzner claude --interactive

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(qa): wire interactive E2E into scheduled QA pipeline

- Add `e2e-interactive` option to workflow_dispatch in qa.yml
- Add `e2e-interactive` run mode to qa.sh (loads cloud creds + ANTHROPIC_API_KEY)
- Runs `e2e.sh --cloud hetzner claude --interactive` directly (no Claude Code needed)
- Defaults to hetzner (cheapest), overridable via E2E_INTERACTIVE_CLOUD/AGENT env vars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(qa): schedule interactive E2E daily at 6am UTC

Runs one agent (claude) on one cloud (hetzner) with AI-driven prompts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): offset soak cron to avoid GitHub Actions schedule dedup

GitHub Actions deduplicates overlapping cron schedules into one run,
making `github.event.schedule` unpredictable. The soak test at `0 3 * * 1`
was getting absorbed by the `0 */4 * * *` quality sweep and never firing
as reason=soak.

Move soak to `30 1 * * 1` (Monday 1:30am UTC) — safely between the
0am and 4am quality sweep slots. Interactive E2E at `0 6 * * *` is
already safe (between the 4am and 8am slots).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): add e2e-interactive to trigger server valid reasons

The trigger server validates reason query params against an allowlist.
Without this, the `e2e-interactive` dispatch returns 400.

Also note: `soak` is already in VALID_REASONS in the repo but the running
service on the QA VM is stale — needs a restart to pick up both soak and
e2e-interactive reasons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:33:22 -07:00
Ahmed Abushagur
2280550c18
perf: skip cloud-init for minimal-tier agents with tarballs/snapshots (#2804)
* perf: skip cloud-init for minimal-tier agents with tarballs/snapshots

Ubuntu 24.04 base images already have curl + git, so minimal-tier
agents (claude, opencode, zeroclaw, hermes) don't need the cloud-init
package install step when using tarballs or snapshots.

Adds skipCloudInit flag to CloudOrchestrator — set automatically when
(tarball || snapshot) && tier === "minimal". Each cloud's waitForReady
checks this flag and calls waitForSshOnly instead of waitForCloudInit.

Saves ~30-60s on minimal-tier agent deploys with --fast or --beta tarball.
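
The flag logic stated above, written out as a sketch. The condition and the two wait function names are from the commit message; the option shape, the `"standard"` placeholder tier, and the helper names are assumptions.

```typescript
// "standard" is a placeholder for any non-minimal tier.
type Tier = "minimal" | "standard";

interface DeployOptions {
  tarball: boolean;
  snapshot: boolean;
  tier: Tier;
}

// Condition from the commit message: (tarball || snapshot) && tier === "minimal".
function shouldSkipCloudInit(options: DeployOptions): boolean {
  return (options.tarball || options.snapshot) && options.tier === "minimal";
}

// Which readiness check a cloud's waitForReady would pick for a given flag value.
function readinessCheck(skipCloudInit: boolean): "waitForSshOnly" | "waitForCloudInit" {
  return skipCloudInit ? "waitForSshOnly" : "waitForCloudInit";
}
```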

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add --fast mode and updated beta features to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove timing table from README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-19 16:14:49 -07:00
Ahmed Abushagur
5efbcf9ee7
feat: add --fast flag for parallel server boot + setup (#2796)
* feat: add --fast flag for parallel server boot + setup

Adds `--fast` flag that runs server creation concurrently with API key
prompt, account check, pre-provision hooks, tarball download, and env
config generation. Once SSH is up, uploads tarball and applies config.

--fast implies --beta tarball and --beta images, enabling snapshots
and pre-built tarballs automatically.

Flow without --fast (sequential):
  auth → API key → preProvision → size → create → boot → install → configure

Flow with --fast (parallel):
  auth → size → [create+boot | API key | preProvision | tarball download | accountCheck]
              → upload tarball → inject env → configure
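
The bracketed stage in the parallel flow can be sketched with Promise.allSettled, which a later commit in this log names as the primitive used in fast mode. The `Task` type and the `runConcurrently` helper are hypothetical, and how the real orchestrator reacts to individual failures is not shown here.

```typescript
// Hypothetical task wrapper: a label plus the async work to run.
type Task = { name: string; run: () => Promise<void> };

// Start every task at once and wait for all of them to finish, whether they
// succeed or fail. Returns the names of the tasks that failed.
async function runConcurrently(tasks: Task[]): Promise<string[]> {
  const outcomes = await Promise.allSettled(tasks.map((task) => task.run()));
  return tasks
    .filter((_, index) => outcomes[index]?.status === "rejected")
    .map((task) => task.name);
}
```

Unlike Promise.all, allSettled does not reject on the first failure, so a failed tarball download would not abandon a server that is still booting.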

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add --beta parallel as standalone opt-in for parallel setup

--beta parallel enables the parallel orchestration without implying
tarball/images. --fast still implies all three (tarball + images +
parallel).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 10:26:54 -07:00
A
15a62a9ad0
fix(cli): use tryCatch for JSON.parse in loadPreferredModel (#2782)
tryCatchIf(isFileError) only catches filesystem errors (ENOENT, EACCES),
but JSON.parse throws SyntaxError on corrupted preferences.json. This
was the same bug fixed in 16a2f180 across 4 files, but orchestrate.ts
was missed. A corrupted ~/.spawn/preferences.json would crash the CLI
instead of gracefully falling back to no preferred model.
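
The failure mode can be illustrated without the repo's tryCatch helpers, whose signatures are not shown here. The function below is a hypothetical stand-in that handles the parse error explicitly.

```typescript
// Parse JSON text, returning undefined for corrupted input instead of throwing.
// JSON.parse signals malformed input with SyntaxError, which is not a
// filesystem error, so a guard that only matches ENOENT or EACCES lets it escape.
function parseJsonOrUndefined(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch (error) {
    if (error instanceof SyntaxError) return undefined;
    throw error;
  }
}
```

A caller such as loadPreferredModel can then treat `undefined` the same way as a missing file and fall back to no preferred model.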

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 20:15:17 -07:00
Ahmed Abushagur
6e92cc832b
feat: add systemd auto-update service for agents on cloud VMs (#2728)
Installs a systemd timer + oneshot service that updates the agent binary
and system packages every 6 hours without disrupting running instances.

Agent update safety:
- Binary agents (Go, Rust): a running process keeps its old inode open, so replacing the file on disk is safe
- npm agents: Node.js caches modules at startup; running processes unaffected
- New version takes effect on next restart via the existing restart loop

System update safety:
- Disables Ubuntu's unattended-upgrades to prevent dpkg lock contention
- Uses flock -w 300 on /var/lib/dpkg/lock-frontend before apt operations
- DEBIAN_FRONTEND=noninteractive with --force-confdef/--force-confold

User-facing:
- "Auto-update" option in setup multiselect (default on, user can uncheck)
- Skipped for local cloud and non-systemd systems
- Non-fatal: setup failure doesn't block agent launch
- Logs to /var/log/spawn-auto-update.log

Timer: 15min after boot, then every 6h with 30min random jitter.
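
One way the schedule above might be expressed as a timer unit, rendered here from TypeScript. `OnBootSec`, `OnUnitActiveSec`, and `RandomizedDelaySec` are standard systemd timer directives; the mapping of the commit's schedule onto them, the function name, and the description text are assumptions.

```typescript
// Renders a systemd timer unit for the schedule described above.
// Only the three timing values come from the commit; the rest is illustrative.
function renderAutoUpdateTimer(): string {
  return [
    "[Unit]",
    "Description=Periodic agent and system update (illustrative)",
    "",
    "[Timer]",
    "OnBootSec=15min", // first run 15 minutes after boot
    "OnUnitActiveSec=6h", // then every 6 hours after the last activation
    "RandomizedDelaySec=30min", // up to 30 minutes of random jitter
    "",
    "[Install]",
    "WantedBy=timers.target",
    "",
  ].join("\n");
}
```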

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:34:12 -07:00
Ahmed Abushagur
66b16d8651
feat: add Windows PowerShell support — remove bash dependency for local execution (#2727)
Replace hardcoded "bash" shell references with platform-aware utilities so
spawn works natively from PowerShell on Windows without WSL or Git Bash.

- New shared/shell.ts: isWindows(), getLocalShell(), getInstallScriptUrl(),
  getInstallCmd(), getWhichCommand() with platform override for testability
- local/local.ts: use getLocalShell() for runLocal() and interactiveSession()
- commands/run.ts: spawnScript/runScriptHeadless use getLocalShell()
- commands/update.ts: Windows downloads install.ps1, runs via PowerShell
- update-check.ts: Windows auto-update uses install.ps1; "where" replaces "which"
- shared/orchestrate.ts: PowerShell-compatible .spawnrc setup for local Windows
- Remote SSH commands unchanged — remote servers are always Linux
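
A rough sketch of what platform-aware helpers with a test override might look like. The function names mirror the list above, but the bodies, the `Platform` type, and the exact PowerShell arguments are assumptions, not the contents of `shared/shell.ts`.

```typescript
// Hypothetical sketch; the platform is passed explicitly so tests can override it.
type Platform = "win32" | "linux" | "darwin";

function isWindows(platform: Platform): boolean {
  return platform === "win32";
}

// Returns the shell executable plus the arguments that precede a command string.
function getLocalShell(platform: Platform): { cmd: string; args: string[] } {
  return isWindows(platform)
    ? { cmd: "powershell", args: ["-NoProfile", "-Command"] }
    : { cmd: "bash", args: ["-c"] };
}

// "where" is the Windows counterpart of "which".
function getWhichCommand(platform: Platform): string {
  return isWindows(platform) ? "where" : "which";
}
```

Taking the platform as a parameter keeps the helpers pure, so both branches can be tested on any host.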

Closes #2726

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-17 16:35:23 -07:00
A
0ea2692e1e
fix(github-auth): always run gh setup when user explicitly opts in (#2674)
When the user selects the GitHub CLI step in setup options (interactive
prompt or --steps github), offerGithubAuth() was silently returning early
if no local gh token was found by detectGithubAuth(). This made the step
unreachable for users without gh installed locally — exactly the ones who
need remote setup most.

Fix: accept an `explicitlyRequested` parameter in offerGithubAuth(). When
true, skip the githubAuthRequested guard and always run the remote install.
The orchestrator passes enabledSteps?.has("github") as this flag.

detectGithubAuth() still auto-enables the step when a local token exists
(convenience forwarding), but can no longer block a user-explicit request.

Fixes #2672

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-15 22:19:38 -07:00
Ahmed Abushagur
0a7a95ec3c
feat: add custom model selection to all agents (#2659)
Move "Custom model" from OpenClaw-specific to common setup steps so
every agent shows it in the setup menu. Add modelEnvVar to agents that
support model override via environment variable:

- Kilo Code: KILOCODE_MODEL
- ZeroClaw: ZEROCLAW_MODEL
- Hermes: LLM_MODEL
- Junie: JUNIE_MODEL

When a custom model is selected, the env var is injected into .spawnrc
alongside the other agent env vars. OpenClaw continues to use its
existing configure() path. Claude and Codex don't have modelEnvVar
since they handle model routing differently.
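
The env var injection could be sketched like this. The variable names are taken from the list above; the agent keys, the lookup table, and `withCustomModel` are hypothetical.

```typescript
// Env var names come from the commit; the agent keys are assumed identifiers.
const MODEL_ENV_VARS: Record<string, string | undefined> = {
  kilocode: "KILOCODE_MODEL",
  zeroclaw: "ZEROCLAW_MODEL",
  hermes: "LLM_MODEL",
  junie: "JUNIE_MODEL",
};

// Returns a copy of env with the agent's model variable set, or env unchanged
// when the agent has no modelEnvVar or no custom model was chosen.
function withCustomModel(
  env: Record<string, string>,
  agent: string,
  model?: string,
): Record<string, string> {
  const envVar = MODEL_ENV_VARS[agent];
  return envVar && model ? { ...env, [envVar]: model } : env;
}
```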

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 12:44:48 -07:00
A
f7c23de716
feat: add downloadFile to CloudRunner + local OpenClaw config merge (#2636)
* feat: add downloadFile to CloudRunner + local OpenClaw config merge

Add `downloadFile(remotePath, localPath)` to the CloudRunner interface
and implement it across all 6 cloud providers (Hetzner, AWS, GCP,
DigitalOcean, Sprite, Local) — mirroring the existing `uploadFile` with
reversed SCP direction.

Replace the OpenClaw config write with a download → deep-merge → upload
flow so config merging happens in our own linted TypeScript instead of
a remote script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: move isPlainObject and deepMerge to shared utils

Extract `isPlainObject` to `shared/type-guards.ts` and `deepMerge` to
`shared/parse.ts` so they're reusable across the codebase.
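
A minimal sketch of the two helpers, assuming the usual semantics: nested plain objects merge recursively, everything else (including arrays) is replaced by the override. The actual code in the shared package may differ.

```typescript
function isPlainObject(val: unknown): val is Record<string, unknown> {
  return val !== null && typeof val === "object" && !Array.isArray(val);
}

// Returns a new object; neither input is mutated.
function deepMerge(
  base: Record<string, unknown>,
  override: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...base };
  for (const key of Object.keys(override)) {
    if (key === "__proto__") continue; // avoid rewriting the result's prototype
    const existing = out[key];
    const incoming = override[key];
    out[key] =
      isPlainObject(existing) && isPlainObject(incoming)
        ? deepMerge(existing, incoming)
        : incoming;
  }
  return out;
}
```

In the download → deep-merge → upload flow, `base` would be the config fetched from the VM and `override` the values spawn wants to set, so keys the user added remotely survive.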

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: promote isPlainObject to shared package, use across codebase

Move `isPlainObject` from cli/type-guards.ts into
@openrouter/spawn-shared so it can be used everywhere. Replace
inline `val !== null && typeof val === "object" && !Array.isArray(val)`
checks in:

- shared/type-guards.ts (toRecord, toObjectArray)
- shared/parse.ts (parseJsonObj)
- cli/manifest.ts (isValidManifest)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: remove type-guards re-export, import directly from spawn-shared

Delete `packages/cli/src/shared/type-guards.ts` (was just a re-export
barrel). All 35 consuming files now import `getErrorMessage`, `isString`,
`isNumber`, `isPlainObject`, `toRecord`, etc. directly from
`@openrouter/spawn-shared`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 15:47:32 -07:00
Ahmed Abushagur
d435963dbc
fix: remove WhatsApp from setup, nothing pre-selected by default (#2626)
WhatsApp setup is too complex for normal users (QR scan + separate
device + pairing). Remove it from the setup options entirely.

Also change multiselect defaults to nothing pre-selected — let users
opt in to what they want instead of pre-selecting for them.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 14:10:28 -07:00
A
c878e5b5d8
feat: persist tunnel metadata so spawn ls can re-establish dashboard proxy (#2620)
When an agent has an SSH tunnel (e.g., OpenClaw dashboard), store the
tunnel remote port and browser URL template in connection.metadata at
spawn time. On reconnect via `spawn ls` → "Enter agent", re-establish
the SSH tunnel and open the dashboard automatically.

- Add saveMetadata() to history.ts for merging key-value pairs into records
- Store tunnel_remote_port and tunnel_browser_url_template in orchestrate.ts
- Re-establish tunnel in cmdEnterAgent (connect.ts) when metadata is present

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 15:43:13 -04:00
Ahmed Abushagur
9f51244cb2
fix: messaging UX — silence doctor, fix groupPolicy, drop early WhatsApp pairing (#2607)
* fix: messaging UX — silence doctor, fix groupPolicy, remove early WhatsApp pairing

- Set groupPolicy to "open" for both Telegram and WhatsApp (was
  "allowlist" with empty allowFrom, causing doctor warnings)
- Suppress doctor warning spam by redirecting openclaw config set
  stdout to /dev/null
- Remove WhatsApp pairing prompt (appeared immediately after QR scan
  before user could message the bot — now just tells them the command)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: improve Telegram/WhatsApp pairing instructions

Add step-by-step instructions for Telegram pairing so users know to
search for their bot in Telegram and message it. Improve WhatsApp
post-link instructions to explain how contacts pair.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: pre-select Telegram in setup options as recommended channel

Telegram has the smoothest setup UX (bot token + pairing code) compared
to WhatsApp (QR scan + separate device). Pre-select it alongside Chrome
in the multiselect and label it as "recommended" in the hint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 03:46:19 -04:00
Ahmed Abushagur
ca5fe851cd
fix: proper Telegram/WhatsApp channel setup using config + pairing (#2605)
Telegram is a built-in channel, not a plugin. Replace broken
`openclaw plugins enable telegram` (OOM) and `openclaw channels add`
(doesn't exist) with proper setup:

- Write channel config (botToken, dmPolicy: pairing, groups) directly
  into the atomic JSON config file during setup
- After gateway starts, prompt user to pair via
  `openclaw pairing approve <channel> <CODE>`
- WhatsApp: QR scan via `openclaw channels login`, then pairing
- Bump version to 0.17.16

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 02:21:02 -04:00
Ahmed Abushagur
b3f221f5bd
fix: use openclaw onboard for channel setup (#2598)
* fix: set telegram groupPolicy to open during channel setup

OpenClaw defaults groupPolicy to "allowlist" with an empty groupAllowFrom,
which silently drops all group messages. Set it to "open" after adding the
Telegram channel so group messages work out of the box.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use OpenClaw config file for Telegram setup instead of broken CLI commands

Telegram is a built-in channel in OpenClaw, not a plugin. The previous
approach used `openclaw plugins enable telegram` (caused OOM on 2GB) and
`openclaw channels add --channel telegram` (command doesn't exist).

Now writes Telegram config (botToken, enabled, groupPolicy) directly into
the atomic JSON config file during setup. Also sets groupPolicy to "open"
so group messages work out of the box instead of being silently dropped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use openclaw onboard for channel setup instead of manual config

OpenClaw has a built-in `openclaw onboard` command that interactively
guides users through Telegram/WhatsApp channel setup. Use that instead
of manually prompting for tokens and writing config ourselves.

- Remove custom Telegram token prompt from agent-setup.ts
- Remove broken `openclaw channels add` and `openclaw plugins enable`
- Run `openclaw onboard` after gateway starts for channel setup
- Base config (API key, gateway, model) still written atomically

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:45:16 -04:00
Ahmed Abushagur
06bbbcb2a4
fix: move channel setup to after gateway starts (#2590)
* fix: move Telegram/WhatsApp channel setup to after gateway starts

OpenClaw's `channels add` and `channels login` commands require a running
gateway. Previously, Telegram token configuration ran in setupOpenclawConfig
(pre-gateway) using `openclaw config set`, causing the gateway to hang on
startup when a token was present for a disabled-by-default plugin.

Now:
- Plugin enables stay in setupOpenclawConfig (pre-gateway)
- Channel config (token add, QR login) runs in orchestrate.ts step 11c
  after the gateway is up, using `openclaw channels add/login`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* security: use shellQuote instead of jsonEscape for Telegram token

jsonEscape uses JSON.stringify, which produces double-quoted strings. The
shell still expands $(...), backticks, and $VAR inside double quotes, which
creates a command injection vector. shellQuote wraps the value in single
quotes, preventing shell interpretation.
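
The two quoting strategies can be compared side by side. Both functions are assumed implementations written for this example: `jsonEscape` as a thin `JSON.stringify` wrapper (as the commit describes) and `shellQuote` as standard POSIX single-quote escaping.

```typescript
// Assumed implementations; the real helpers may differ in detail.
function jsonEscape(s: string): string {
  return JSON.stringify(s);
}

// Wrap in single quotes; an embedded single quote becomes '\'' (close the
// quote, emit an escaped quote, reopen the quote).
function shellQuote(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// A hostile "token" containing a command substitution.
const hostileToken = "123:abc$(touch /tmp/pwned)";

// Double quotes: a POSIX shell would still run the $(...) substitution.
const unsafeArg = jsonEscape(hostileToken);

// Single quotes: the shell treats every character literally.
const safeArg = shellQuote(hostileToken);
```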

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: fix biome export ordering in interactive.ts and manifest.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 13:47:50 -07:00
Ahmed Abushagur
f683dd857b
feat: add --config and --steps CLI flags for programmatic setup (#2545)
* feat: add Telegram and WhatsApp options to OpenClaw setup picker

Adds separate "Telegram" and "WhatsApp" checkboxes to the OpenClaw
setup screen:

- Telegram: prompts for bot token from @BotFather, injects into
  OpenClaw config via `openclaw config set`
- WhatsApp: reminds user to scan QR code via the web dashboard
  after launch (no CLI setup possible)

Updates USER.md with channel-specific guidance when either is selected.

Bump CLI version to 0.16.16.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: run WhatsApp QR scan interactively before TUI launch

Instead of punting WhatsApp setup to "after launch", runs
`openclaw channels login --channel whatsapp` as an interactive SSH
session between gateway start and TUI launch. The user scans the
QR code with their phone during provisioning setup.

Flow: gateway starts → tunnel set up → WhatsApp QR scan → TUI launch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update WhatsApp hint to reflect pre-TUI QR scanning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add --config and --steps CLI flags for programmatic setup

Add --config <path> flag to load spawn options from a JSON config file
(model, steps, name, setup data like telegram_bot_token). Add --steps
<list> flag for comma-separated setup step control. Both enable the
web UI and headless automation to control which setup steps run.

Priority order: CLI flags > --config file > env vars > defaults.

- New spawn-config.ts module with valibot validation
- OptionalStep extended with dataEnvVar and interactive metadata
- validateStepNames() for step name validation with warnings
- Telegram setup reads TELEGRAM_BOT_TOKEN env var before prompting
- WhatsApp auto-skipped in headless mode with warning
- promptSetupOptions() skipped when SPAWN_ENABLED_STEPS already set
- E2E verify helpers for github, browser, telegram setup artifacts
- QA reference file documenting all agent setup options
- Version bump to 0.17.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add --model flag and priority order tests

- Add --model <id> CLI flag that sets MODEL_ID env var
- --model is extracted before --config so it takes priority
- Add config-priority.test.ts with 8 tests verifying:
  - --model overrides config model
  - --steps overrides config steps
  - --steps "" disables all steps
  - --name overrides config name
  - Config tokens apply as defaults
  - Explicit env vars override config tokens
- Remove preferences.json from priority order docs (not needed)
- Add --model to help text and unknown-flag guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add --model, --config, --steps to README

Document config file format, setup steps table, and new CLI flags
in the commands table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review feedback

- Move null byte check before path resolution (defense-in-depth)
- Move agent-setup-options.md from .claude/rules/ to .docs/ (git-ignored)
  per documentation policy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve rebase conflicts and deduplicate --model flag extraction

Rebase on main introduced a duplicate --model flag extraction block
(one from the PR at line 804, one from main at line 941). Consolidated
into the single early extraction point with -m shorthand support.
Also removed duplicate --model entry from KNOWN_FLAGS set.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-13 00:32:58 +00:00
Ahmed Abushagur
d2d71b17ef
feat: add --model flag and preferences file for LLM model override (#2543)
Adds --model / -m CLI flag to override the agent's default LLM model:
  spawn codex gcp --model openai/gpt-5.3-codex

Also supports persistent per-agent model preferences via config file at
~/.config/spawn/preferences.json:
  { "models": { "codex": "openai/gpt-5.3-codex" } }

Priority: --model flag > preferences file > agent default.
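
The priority chain amounts to a short fallback expression. `resolveModel` and the `Preferences` type are hypothetical names; the preferences shape follows the JSON shown above.

```typescript
interface Preferences {
  models?: Record<string, string>;
}

// --model flag > preferences file > agent default
function resolveModel(
  agent: string,
  flag: string | undefined,
  prefs: Preferences,
  agentDefault: string | undefined,
): string | undefined {
  return flag ?? prefs.models?.[agent] ?? agentDefault;
}
```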

This enables a future web UI to pass model selection via CLI args when
invoking spawn programmatically to provision machines.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 18:47:09 -04:00
A
5031d84e6c
refactor: eliminate process-global mock.module() pollution in tests (#2490)
Replace mock.module() calls with dependency injection to prevent
cross-file test pollution in Bun's shared worker process. Changes:

- orchestrate.ts: add getApiKey to OrchestrationOptions
- billing-guidance.ts: add injectable BillingGuidanceDeps parameter
- delete.ts: add optional deleteHandler parameter to confirmAndDelete
- update.ts: add UpdateOptions with injectable runUpdate function
- sprite.ts: add optional spawnFn parameter to interactiveSession
- Remove unnecessary oauth mocks from junie-agent and do-snapshot tests

Only @clack/prompts mock (shared via test-helpers.ts) and
do-payment-warning.test.ts (safe spread pattern) remain.
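
The injection pattern might look like the following for the delete case. The signature is a guess: only the optional `deleteHandler` parameter is named in the commit, and `DeleteHandler`, `realDelete`, and the `confirmed` argument are placeholders.

```typescript
type DeleteHandler = (id: string) => Promise<boolean>;

// Placeholder for the production handler, which would call a cloud API.
const realDelete: DeleteHandler = async (id) => {
  throw new Error(`placeholder: would call the cloud API to delete ${id}`);
};

// Tests pass their own handler instead of replacing the module with
// mock.module(), so nothing leaks into other test files in the same process.
async function confirmAndDelete(
  id: string,
  confirmed: boolean,
  deleteHandler: DeleteHandler = realDelete,
): Promise<boolean> {
  if (!confirmed) return false;
  return deleteHandler(id);
}
```

Because the default argument preserves production behaviour, existing callers need no changes.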

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-10 23:57:57 -07:00
A
9a1dad7fcb
feat: gate tarball install behind --beta=tarball flag (#2482)
* feat: gate tarball install behind --beta=tarball flag

Tarball install is not yet reliable enough to be the default.
Move it behind an opt-in --beta=tarball flag so users can test it
explicitly while live install remains the default path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: support multiple --beta flags (repeatable)

Parse all --beta flags from args in a loop, collecting them into a
comma-separated SPAWN_BETA env var. Consumers check for their feature
with Set.has() so multiple beta features can be active simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: replace for(;;) loop with extractAllFlagValues helper

Cleaner approach: a dedicated helper mutates args in place and returns
all values for a repeatable flag, replacing the infinite loop pattern.
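
A sketch of such a helper, assuming it accepts both `--beta value` and `--beta=value`. This is a hypothetical reimplementation; the real helper's signature is not shown in this log.

```typescript
// Removes every occurrence of a repeatable flag from args (in place)
// and returns the collected values in order.
function extractAllFlagValues(args: string[], flag: string): string[] {
  const values: string[] = [];
  let i = 0;
  while (i < args.length) {
    const arg = args[i];
    if (arg === flag && i + 1 < args.length) {
      values.push(args[i + 1]);
      args.splice(i, 2); // drop the flag and its value; do not advance
    } else if (arg.startsWith(flag + "=")) {
      values.push(arg.slice(flag.length + 1));
      args.splice(i, 1);
    } else {
      i++;
    }
  }
  return values;
}

const exampleArgs = ["claude", "--beta", "tarball", "gcp", "--beta=images"];
// Consumers check for their feature with Set.has().
const betaFeatures = new Set(extractAllFlagValues(exampleArgs, "--beta"));
```

After the call, `exampleArgs` holds only the positional arguments, so later parsing does not need to know about `--beta` at all.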

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-10 21:24:51 -07:00
A
a7a2032584
refactor: replace ~50 try/catch blocks with Result helpers across 20 files (#2479)
Convert catch-all, catch-swallow, catch-return-fallback, and catch-classify
patterns to use tryCatch/asyncTryCatch/unwrapOr from @openrouter/spawn-shared.

Files changed: aws.ts, hetzner.ts, digitalocean.ts, gcp.ts, run.ts, delete.ts,
shared.ts, ssh.ts, agent-setup.ts, orchestrate.ts, ui.ts, index.ts,
update-check.ts, update.ts, status.ts, picker.ts, interactive.ts, list.ts,
pick.ts, ssh-keys.ts, billing-guidance.ts, oauth.ts, sprite.ts

Preserved all try/finally-only blocks, security-validation-exit blocks,
billing/classify blocks, spinner cleanup, and top-level handleError blocks.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
2026-03-10 19:26:41 -07:00
A
3fd17e3d1d
refactor: replace indiscriminate try/catch with guarded Result helpers (#2477)
Add tryCatchIf/asyncTryCatchIf with error predicates (isFileError,
isNetworkError, isOperationalError) so operational errors are handled
explicitly while programming bugs (TypeError, ReferenceError) propagate
and crash visibly instead of being silently swallowed.

Transforms ~40 try/catch blocks across 14 files:
- File I/O (manifest cache, config loading, history) → tryCatchIf(isFileError)
- Network/fetch (API calls, version checks, OAuth) → asyncTryCatchIf(isNetworkError)
- SSH/subprocess (agent setup, tunnel) → asyncTryCatchIf(isOperationalError)
- API retry loops (DO, Hetzner) → guard retries with isNetworkError

Intentionally keeps ~85 try/catch blocks as-is (cleanup/finally, retry
loops, user-facing error handlers, catch-classify-rethrow patterns).

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-10 18:55:07 -07:00
A
b3938144b7
fix: validate model ID before shell interpolation (fixes #2460) (#2472)
Add validateModelId() to reject model IDs containing shell metacharacters.
The validation is applied in orchestrate.ts immediately after resolving
MODEL_ID from env/agent defaults, before the value reaches any agent
configure function or runServer call. Invalid model IDs are dropped to
undefined with a warning.
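
A check of this kind could be written as an allow-list. Note the swap: the commit describes rejecting shell metacharacters, whereas this sketch accepts only a fixed character set, which is an assumption about what valid IDs look like. `sanitizeModelId` is a hypothetical wrapper.

```typescript
// Assumed allow-list: letters, digits, and the separators seen in IDs
// such as "openai/gpt-5.3-codex".
const MODEL_ID_PATTERN = /^[A-Za-z0-9][A-Za-z0-9._:\/-]*$/;

function validateModelId(id: string): boolean {
  return MODEL_ID_PATTERN.test(id);
}

// Drops an invalid ID to undefined with a warning, as the commit describes.
function sanitizeModelId(
  id: string | undefined,
  warn: (msg: string) => void,
): string | undefined {
  if (id === undefined || validateModelId(id)) return id;
  warn("Ignoring model ID with unsupported characters");
  return undefined;
}
```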

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-10 20:31:32 -04:00
Ahmed Abushagur
d82dea811d
feat: unified arrow-key selection + setup checkboxes (#2459)
* feat: unified arrow-key selection + setup checkboxes

Replace p.autocomplete (type-ahead) with p.select (arrow-key navigation)
for agent and cloud selection. Add p.multiselect checkboxes for optional
post-provision setup steps (GitHub CLI, Chrome browser), all ON by default.

Three fast prompts: agent → cloud → setup options. Defaults: OpenClaw,
first cloud with credentials, all steps enabled.

Key changes:
- interactive.ts: p.autocomplete → p.select with initialValue defaults
- interactive.ts: promptSetupOptions() with p.multiselect, exported for reuse
- run.ts: wire setup options into cmdRun direct path
- agents.ts: OptionalStep type, getAgentOptionalSteps() static metadata
- orchestrate.ts: read SPAWN_ENABLED_STEPS env var, gate GitHub auth + configure
- agent-setup.ts: gate Chrome install with enabledSteps in setupOpenclawConfig
- Version bump 0.15.40 → 0.16.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: mirror tarball files to $HOME for non-root SSH users (GCP, AWS)

Tarballs are built with absolute /root/ paths, but GCP and AWS Lightsail
SSH as a regular user whose $HOME is /home/<user>/. After extraction,
binaries like `claude` end up at /root/.claude/local/bin/ but the
launchCmd looks in $HOME/.claude/local/bin/ — causing "command not found".

Add a post-extraction step that copies /root/ dotfiles to $HOME/ when
the SSH user isn't root. This fixes `spawn claude gcp` failing with
exit code 127 after tarball install.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: A <258483684+la14-1@users.noreply.github.com>
2026-03-10 14:19:08 -07:00
Ahmed Abushagur
c77ca106d2
feat: ssh tunnel + browser auto-open for OpenClaw web dashboard (#2452)
OpenClaw runs a web dashboard on port 18791 of the remote VM. This
change SSH-tunnels that port to localhost and auto-opens the browser,
giving users a web UI with zero CLI knowledge needed.

- Add TunnelConfig to AgentConfig interface (agents.ts)
- Add startSshTunnel function with port-finding logic (ssh.ts)
- Capture gateway token in closure so the same token is used for both
  the remote config and the browser URL (agent-setup.ts)
- Wire tunnel into orchestration pipeline between preLaunch and
  interactiveSession (orchestrate.ts)
- Add getConnectionInfo to CloudOrchestrator interface and implement
  in all SSH-based clouds (DO, Hetzner, AWS, GCP)
- Local: opens browser directly at localhost:18791
- Sprite: gracefully skipped (no standard SSH)
- Add USER.md bootstrap to guide OpenClaw users to web dashboard

Closes #2449
Supersedes #2418

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-10 14:25:43 -04:00