Commit graph

109 commits

Author SHA1 Message Date
Ahmed Abushagur
b5d174a472
fix: pin Codex to 0.94.0 + wire_api=chat for multi-turn stability (#1518)
* fix: switch Codex wire_api from "responses" to "chat" for multi-turn stability

The Responses API format causes "Invalid Responses API request" errors on
the second turn and beyond — conversation history items round-trip through
OpenRouter with null content fields and missing IDs that fail validation.
Chat Completions format is fully supported and avoids this issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pin Codex to 0.94.0 + wire_api=chat for multi-turn stability

OpenRouter's Responses API proxy drops required fields (id, content) from
conversation-history items on multi-turn requests, causing "Invalid
Responses API request" at input[6]+. Codex >=0.97.0 removed wire_api=chat
support (openai/codex#10157), so we pin to 0.94.0 — the last release where
Chat Completions format still works.

Tracking: https://github.com/openai/codex/issues/12114
TODO: unpin once OpenRouter /responses handles round-trip correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 04:49:35 -05:00
A
b094accb93
fix: save connection info for all Sprite agents in cloud_provision (#1510)
5 of 6 Sprite agent scripts silently skipped saving connection info
for 'spawn list', because only sprite/claude.sh defined the
agent_save_connection hook. All other clouds save connection info in
their create_server() equivalent; move save_vm_connection into
cloud_provision() in sprite/lib/common.sh to match that pattern and
cover all agents uniformly. Remove now-redundant agent_save_connection
from sprite/claude.sh.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 01:51:25 -05:00
L
d5690a8b11
feat: spawn name prompt + kebab resource naming across all clouds (#1507)
* feat: add spawn name prompt and project confirmation to GCP flow

Ask for spawn name upfront (before auth), derive kebab-case default for
VM naming, and confirm the current GCP project before using it.

New interaction order:
  1. Spawn name: "My Dev Box" → kebab "my-dev-box" exported as
     GCP_INSTANCE_NAME_KEBAB
  2. gcloud auth + project confirm: "Current project: X  Keep? [Y/n]"
     If no → project picker shown
  3. SSH key
  4. Machine type picker (existing)
  5. Zone picker (existing)
  6. Instance name prompt: "Instance name [my-dev-box]: "
     User can press Enter to accept or type a custom name

New functions:
  _to_kebab_case()         — lowercases, replaces non-alnum with hyphens
  _gcp_prompt_spawn_name() — prompts for display name, exports kebab default;
                             honours SPAWN_NAME env var set by CLI (--name flag)

Modified:
  _gcp_resolve_project()  — adds Y/n confirmation when project already set
  get_server_name()       — shows kebab default in prompt, accepts Enter
  cloud_authenticate()    — calls _gcp_prompt_spawn_name first

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat: add spawn name prompt to all clouds via shared/common.sh

Move _to_kebab_case() and prompt_spawn_name() to shared/common.sh so all
clouds get upfront spawn name prompting and kebab-based resource naming.

shared/common.sh:
  + _to_kebab_case()    — "My Dev Box" → "my-dev-box"
  + prompt_spawn_name() — asks for display name, exports SPAWN_NAME_DISPLAY
                          and SPAWN_NAME_KEBAB; skips if already set;
                          honours SPAWN_NAME env var from CLI --name flag
  ~ get_resource_name() — replaces silent SPAWN_NAME fallback with a visible
                          prefilled default: "Enter server name [my-dev-box]: "

Per-cloud changes (cloud_authenticate gains prompt_spawn_name first):
  hetzner, fly, aws, daytona, digitalocean, sprite — one-line change each

gcp/lib/common.sh:
  - Remove _to_kebab_case()        (now in shared)
  - Remove _gcp_prompt_spawn_name() (now in shared as prompt_spawn_name)
  ~ cloud_authenticate: _gcp_prompt_spawn_name → prompt_spawn_name
  ~ get_server_name: simplified back to get_validated_server_name
    (shared get_resource_name now shows the kebab default in the prompt)

Result — every cloud shows this flow upfront:
  Spawn name (e.g. "My Dev Box"): My Claude Box
  ℹ Resource name: my-claude-box
  ...
  Enter server name [my-claude-box]: ⏎

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix: use "Use project '...'?" instead of "Keep this project?" in GCP prompt

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 22:22:59 -08:00
A
c3d251100b
fix: inline temp file cleanup in setup_shell_environment to preserve EXIT trap (#1489)
Replace both the trap-clobbering `trap 'rm -f ...' EXIT` calls and the
inline `rm -f` approach with `track_temp_file()` from shared/common.sh.
This registers temp files with the centralized cleanup handler that is
already set up on EXIT/INT/TERM, so:
- Temp files are cleaned up even on interrupt (not just success path)
- The calling script's EXIT trap is never clobbered
- _sprite_retry wrappers are preserved for transient error recovery

Agent: pr-maintainer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:30:48 +00:00
Ahmed Abushagur
9e2f84adf0
fix: use native OpenRouter model_provider for Codex CLI config (#1490)
Codex CLI's OPENAI_BASE_URL env var approach causes "Invalid Responses
API request" errors because OpenRouter doesn't fully support the
Responses API wire format via base URL override. Switch all 8 codex
scripts to use ~/.codex/config.toml with model_provider="openrouter"
which uses the native OpenRouter integration.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 18:47:40 -05:00
Ahmed Abushagur
4d32923d5f
fix: add retry logic for transient Sprite API errors (#1487)
Sprite API calls intermittently fail with TLS handshake timeouts and
connection resets. Add _sprite_retry() wrapper that retries up to 3
times with 3s delay on transient errors.

Wrapped calls: sprite create, sprite exec (run_sprite), sprite exec
-file (upload_file_sprite, setup_shell_environment uploads).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 17:49:29 -05:00
A
b29cf4a75d
fix: sync cloud READMEs with current agent list (#1486)
READMEs across all 8 clouds still referenced 5 removed agents
(NanoClaw, Cline, gptme, Plandex, Continue) and were missing
ZeroClaw. Users following these docs got 404 errors.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 17:47:57 -05:00
Ahmed Abushagur
a063fe61cd
fix: sprite npm PATH resolution and gateway timeout (#1484)
* fix: sprite npm PATH resolution and gateway timeout

Sprites use nvm-managed node, so npm global bin is at
/.sprite/languages/node/nvm/.../bin/ which isn't in default PATH.
Dynamically resolve $(npm prefix -g)/bin in install, launch, and
gateway commands for all sprite agents.

Also increase openclaw gateway timeout from 30s to 60s — gateway
starts slowly on sprites but TUI connects once ready.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add opencode bin dir to PATH in sprite launch command

OpenCode installs to $HOME/.opencode/bin/ which isn't in the sprite's
default PATH or the npm prefix path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 16:49:52 -05:00
Ahmed Abushagur
3b1f87e656
fix: pass -o org flag to all sprite CLI commands (#1479)
* fix: pass -o org flag to all sprite CLI commands

sprite create/exec/list/destroy fail with "authentication failed" when
the org isn't passed explicitly. Detect the selected org after login and
thread it through all sprite commands via _sprite_org_flags().

Also fix ensure_sprite_authenticated to fail loudly instead of
swallowing errors with || true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: sprite scripts fail when zsh is not available

setup_shell_environment overwrites .bashrc with `exec zsh`, but sprites
don't have zsh installed. This breaks PATH and causes all agent launch
commands that source .zshrc to fail.

- Only switch to zsh if it's actually available on the sprite
- Replace `source ~/.zshrc` with explicit PATH in all sprite agent
  launch commands (openclaw, opencode, codex, kilocode)
- Fix start_openclaw_gateway to use explicit PATH instead of .zshrc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: openclaw not found on sprite — bashrc corruption from prior runs

On reused sprites, .bashrc still has `exec /usr/bin/zsh -l` from a prior
run. Sourcing it in the install command causes `&&` to short-circuit, so
`bun install -g openclaw` never runs.

- Clean up stale `exec zsh` lines from .bashrc at start of
  setup_shell_environment (fixes reused sprites)
- Use explicit PATH in openclaw install command instead of relying on
  .bashrc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use npm instead of bun for openclaw install on sprite

bun 1.3.9 on sprites fails with "connection closed" during dependency
resolution. Other sprite agents (codex, kilocode) already use npm
successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: openclaw install — npm+bun fallback, verify binary exists

Try npm first (more reliable on sprites), fall back to bun, then verify
the binary is actually in PATH before continuing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: persist npm global bin path to .spawnrc on sprites

npm installs openclaw successfully but its global bin dir isn't in the
sprite's default PATH. Detect the npm bin path after install, write it
to .spawnrc so gateway and launch commands (which source .spawnrc) find
the binary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 15:47:47 -05:00
L
a67d83ed38
feat: reorder agents and remove NanoClaw (#1477)
* feat: add ZeroClaw agent (14.9k stars, native OpenRouter support)

Add ZeroClaw — a Rust-based autonomous AI assistant framework by
Harvard/MIT/Sundai.Club communities — across all 8 clouds.

Scripts: local, hetzner, digitalocean, fly, aws, gcp, daytona, sprite
Install: bootstrap.sh with --install-rust + --install-system-deps
Config:  zeroclaw onboard --provider openrouter (via agent_configure)
Env:     OPENROUTER_API_KEY + ZEROCLAW_PROVIDER=openrouter (native support)
Launch:  zeroclaw agent

Note: ZeroClaw compiles from Rust source (~5-10 min build time).
A build-time warning is shown to set expectations.

Also update test/mock-curl-script.sh to stub zeroclaw install URLs and
add zeroclaw to mock agent binaries in test/mock.sh.

Bump CLI version 0.5.8 → 0.5.9.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat: reorder agents and remove NanoClaw

New agent order: claude → openclaw → zeroclaw → codex → opencode → kilocode

- Remove NanoClaw (8 scripts + manifest entry + matrix entries + README row)
- Reorder manifest.json agents section to match new order
- Reorder matrix entries by cloud (local/hetzner/fly/aws/daytona/digitalocean/gcp/sprite)
  with agents in new order within each cloud block
- Update README matrix table row order
- Update test/mock.sh mock agent binary list to match
- Bump CLI version 0.5.9 → 0.5.10

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 11:39:03 -08:00
L
f7458952b0
feat: remove Cline, gptme, Plandex, and Continue agents (#1475)
Delete 32 agent scripts ({cloud}/{cline,gptme,plandex,continue}.sh across
8 clouds), remove the 4 agents from manifest.json with all their matrix
entries, update README matrix rows, remove stale mock agent binaries and
plandex.ai URL patterns from test harness, update CLI help examples to use
remaining agents, and bump version 0.5.7 → 0.5.8.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 11:12:46 -08:00
A
5612cda40b
feat: remove Aider, Goose, Open Interpreter, Gemini CLI, Amazon Q from matrix (#1472)
These 5 agents are being dropped from the Spawn matrix. This removes
45 agent scripts across 9 clouds, cleans the manifest, test fixtures,
READMEs, CLI source, and shared library comments.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 12:31:00 -05:00
A
d8785c3d0b
security: fix command injection in cline auth via remote env var expansion (#1473)
All 9 cline.sh scripts embedded OPENROUTER_API_KEY directly into the
cloud_run command string, allowing shell metacharacter injection on the
remote server. Fix by escaping the dollar sign (\${OPENROUTER_API_KEY})
so the variable is expanded on the remote machine where it's already
set via agent_env_vars()/generate_env_config, not locally before being
passed to cloud_run.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 12:25:16 -05:00
Ahmed Abushagur
8ee54d01a8
fix: harden agent reliability + security across all clouds (#1468)
* docs: add spawn delete command to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden openclaw across all clouds — validation, reliability, performance

Fixes multiple issues causing openclaw to break on most clouds:

Bugs fixed:
- Double-prefixed model ID (openrouter/openrouter/auto) in config generation
- AWS gateway starting without env vars (missing .zshrc source)
- DigitalOcean sourcing .spawnrc instead of .zshrc for gateway
- Destructive rm -rf ~/.openclaw on re-runs (now mkdir -p)

Validation added:
- API key checked against OpenRouter /auth/key endpoint with re-prompt on failure
- Model ID verified against OpenRouter model list with re-prompt loop
- openrouter/auto and openrouter/free bypass model check

Reliability improvements:
- Standardized gateway launch with </dev/null & disown across all 9 clouds
- Gateway log auto-displayed on startup timeout for diagnostics
- 2GB swap added to cloud-init to prevent OOM on small VMs
- Portable install timeout (10 min) with macOS gtimeout fallback

Performance:
- Reordered spawn_agent: OAuth runs while VM provisions (saves 30-60s)
- Fly.io: bumped to 2GB RAM + 2 shared CPUs for openclaw
- Fly.io: tries bun first (faster), falls back to npm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip sudo in gh install when running as root (Fly.io containers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — skip validation in tests, quote escaped cmd, escape model_id

- verify_openrouter_key and verify_openrouter_model skip network calls when
  SPAWN_SKIP_API_VALIDATION, BUN_ENV=test, or NODE_ENV=test is set
- install_agent timeout wrapper now quotes the escaped command for defense in depth
- model_id in openclaw JSON now uses json_escape() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove double-escaping in install_agent that broke shell operators

install_agent() was wrapping commands with printf '%q' + bash -c before
passing them to the run callback. But run callbacks (run_server, run_sprite,
ssh_run_server) already handle escaping for remote transport. The double-
escaping turned && || > | into literal characters, causing 'source' to
treat the entire command as a single filename.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use local github-auth.sh instead of curling from main

When running from a local checkout, base64-encode the local
github-auth.sh and send it inline to the remote machine. This
ensures fixes (like the sudo skip for root) take effect immediately
without waiting for a merge to main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle github-auth errors gracefully instead of terminating

GitHub CLI setup is optional — failures should not abort the spawn
session. Guard both run_callback calls in offer_github_auth with
|| log_warn so the script continues even if gh install fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use GOOGLE_GEMINI_BASE_URL to route Gemini CLI through OpenRouter

Gemini CLI ignores OPENAI_BASE_URL — it uses GEMINI_API_KEY to talk
directly to Google's API. The OpenRouter key is not a valid Google
API key, so all requests fail with "API key not valid".

Use GOOGLE_GEMINI_BASE_URL to redirect Gemini CLI to OpenRouter's
endpoint. Fixes all 9 cloud gemini scripts + manifest.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: guard optional spawn_agent hooks so failures don't kill the session

With set -eo pipefail, any unguarded failure terminates the script.
Several optional operations in spawn_agent were unguarded:

- agent_configure: config file uploads (agent works with defaults)
- agent_save_connection: convenience JSON for spawn list
- agent_pre_launch: gateway daemons, startup hooks
- agent_pre_provision: pre-provision prompts
- .spawnrc shell hooks: hooking env vars into .bashrc/.zshrc

These now log warnings and continue instead of aborting. Critical
steps (cloud_authenticate, agent_install, cloud_provision) still
exit on failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: audit and fix env vars, escaping, and error handling across all agents

Audit findings from 3 parallel agents, fixes applied:

**Env vars (4 agents fixed across 9 clouds each = 36 scripts):**
- Amazon Q: remove fake OPENAI_* vars (Q uses AWS auth, can't use OpenRouter)
- Cline: replace OPENAI_* env vars with `cline auth -p openrouter` command
- Open Interpreter: drop OPENAI_* vars, use only OPENROUTER_API_KEY (native support via --model flag)
- NanoClaw: add ANTHROPIC_BASE_URL to .env file (was missing, requests went to Anthropic directly)

**Escaping:**
- execute_agent_non_interactive: replace printf '%q' with single-quote wrapping to avoid double-escaping on Fly.io

**Manifest updated** for amazonq, cline, interpreter entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use setsid to detach openclaw gateway daemon from SSH sessions

The gateway daemon launch (`nohup openclaw gateway ... & disown`) hangs
on all clouds because SSH/exec channels wait for child FDs to close.
setsid creates a new session, fully detaching the daemon so the channel
can close immediately. Falls back to nohup where setsid is unavailable.

Consolidates the daemon launch into a shared start_openclaw_gateway()
function used by all 9 cloud scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: configure npm global prefix for non-root clouds (AWS, GCP, OVH)

AWS Lightsail, GCP, and OVH SSH as non-root users (ubuntu/login user),
so `npm install -g` fails with EACCES on /usr/local/lib/node_modules/.

Fix: configure npm prefix to ~/.npm-global during cloud-init/setup and
add ~/.npm-global/bin to the SSH PATH prefix so agent install commands
find globally-installed npm binaries without sudo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove broken OpenRouter routing from Gemini CLI scripts

Gemini CLI uses Google's native API format (/v1beta/models/:streamGenerateContent),
not the OpenAI-compatible format (/v1/chat/completions). No base URL override can
bridge this — the request formats are fundamentally incompatible. Same situation
as Amazon Q (uses vendor-specific auth/API).

Removed GEMINI_API_KEY and GOOGLE_GEMINI_BASE_URL from all 9 scripts + manifest.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-install AWS CLI and gcloud SDK when missing

Instead of printing manual install instructions and exiting, both CLIs
now auto-install:

- AWS: downloads official .pkg (macOS) or .zip (Linux) installer
- GCP: uses brew cask on macOS, Google's tarball installer on Linux

Falls back to manual instructions if auto-install fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: nanoclaw — install Docker on Linux, fix hardcoded /root/ path

Two issues broke NanoClaw on all clouds:

1. .env upload hardcoded /root/nanoclaw/.env — fails on non-root clouds
   (AWS=ubuntu, GCP=user, OVH=ubuntu). Now uses upload_config_file with
   $HOME which expands on the remote side.

2. NanoClaw requires a container runtime. On Linux it uses Docker, but
   Docker was never installed. Added Docker install via get.docker.com
   to all cloud scripts (with sudo where SSH user is non-root).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review findings from PR #1463

- Reject symlinked github-auth.sh before base64-encoding (falls back to remote URL)
- Hide API key from process list using curl -K - instead of -H in verify_openrouter_key

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: quote OPENROUTER_API_KEY in cline auth to prevent command injection

Unquoted variable in `cline auth -p openrouter -k ${OPENROUTER_API_KEY}`
allows shell metacharacters in the key to execute arbitrary commands on
the remote server. Wrapping in escaped double quotes prevents expansion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 08:36:24 -05:00
Ahmed Abushagur
be904cbe1c
fix: install_agent double-escaping + github-auth reliability (#1460)
* docs: add spawn delete command to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden openclaw across all clouds — validation, reliability, performance

Fixes multiple issues causing openclaw to break on most clouds:

Bugs fixed:
- Double-prefixed model ID (openrouter/openrouter/auto) in config generation
- AWS gateway starting without env vars (missing .zshrc source)
- DigitalOcean sourcing .spawnrc instead of .zshrc for gateway
- Destructive rm -rf ~/.openclaw on re-runs (now mkdir -p)

Validation added:
- API key checked against OpenRouter /auth/key endpoint with re-prompt on failure
- Model ID verified against OpenRouter model list with re-prompt loop
- openrouter/auto and openrouter/free bypass model check

Reliability improvements:
- Standardized gateway launch with </dev/null & disown across all 9 clouds
- Gateway log auto-displayed on startup timeout for diagnostics
- 2GB swap added to cloud-init to prevent OOM on small VMs
- Portable install timeout (10 min) with macOS gtimeout fallback

Performance:
- Reordered spawn_agent: OAuth runs while VM provisions (saves 30-60s)
- Fly.io: bumped to 2GB RAM + 2 shared CPUs for openclaw
- Fly.io: tries bun first (faster), falls back to npm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip sudo in gh install when running as root (Fly.io containers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — skip validation in tests, quote escaped cmd, escape model_id

- verify_openrouter_key and verify_openrouter_model skip network calls when
  SPAWN_SKIP_API_VALIDATION, BUN_ENV=test, or NODE_ENV=test is set
- install_agent timeout wrapper now quotes the escaped command for defense in depth
- model_id in openclaw JSON now uses json_escape() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove double-escaping in install_agent that broke shell operators

install_agent() was wrapping commands with printf '%q' + bash -c before
passing them to the run callback. But run callbacks (run_server, run_sprite,
ssh_run_server) already handle escaping for remote transport. The double-
escaping turned && || > | into literal characters, causing 'source' to
treat the entire command as a single filename.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use local github-auth.sh instead of curling from main

When running from a local checkout, base64-encode the local
github-auth.sh and send it inline to the remote machine. This
ensures fixes (like the sudo skip for root) take effect immediately
without waiting for a merge to main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle github-auth errors gracefully instead of terminating

GitHub CLI setup is optional — failures should not abort the spawn
session. Guard both run_callback calls in offer_github_auth with
|| log_warn so the script continues even if gh install fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use GOOGLE_GEMINI_BASE_URL to route Gemini CLI through OpenRouter

Gemini CLI ignores OPENAI_BASE_URL — it uses GEMINI_API_KEY to talk
directly to Google's API. The OpenRouter key is not a valid Google
API key, so all requests fail with "API key not valid".

Use GOOGLE_GEMINI_BASE_URL to redirect Gemini CLI to OpenRouter's
endpoint. Fixes all 9 cloud gemini scripts + manifest.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: guard optional spawn_agent hooks so failures don't kill the session

With set -eo pipefail, any unguarded failure terminates the script.
Several optional operations in spawn_agent were unguarded:

- agent_configure: config file uploads (agent works with defaults)
- agent_save_connection: convenience JSON for spawn list
- agent_pre_launch: gateway daemons, startup hooks
- agent_pre_provision: pre-provision prompts
- .spawnrc shell hooks: hooking env vars into .bashrc/.zshrc

These now log warnings and continue instead of aborting. Critical
steps (cloud_authenticate, agent_install, cloud_provision) still
exit on failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:21:55 -05:00
Ahmed Abushagur
159ad49fec
fix: harden openclaw across all clouds (#1456)
* docs: add spawn delete command to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden openclaw across all clouds — validation, reliability, performance

Fixes multiple issues causing openclaw to break on most clouds:

Bugs fixed:
- Double-prefixed model ID (openrouter/openrouter/auto) in config generation
- AWS gateway starting without env vars (missing .zshrc source)
- DigitalOcean sourcing .spawnrc instead of .zshrc for gateway
- Destructive rm -rf ~/.openclaw on re-runs (now mkdir -p)

Validation added:
- API key checked against OpenRouter /auth/key endpoint with re-prompt on failure
- Model ID verified against OpenRouter model list with re-prompt loop
- openrouter/auto and openrouter/free bypass model check

Reliability improvements:
- Standardized gateway launch with </dev/null & disown across all 9 clouds
- Gateway log auto-displayed on startup timeout for diagnostics
- 2GB swap added to cloud-init to prevent OOM on small VMs
- Portable install timeout (10 min) with macOS gtimeout fallback

Performance:
- Reordered spawn_agent: OAuth runs while VM provisions (saves 30-60s)
- Fly.io: bumped to 2GB RAM + 2 shared CPUs for openclaw
- Fly.io: tries bun first (faster), falls back to npm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip sudo in gh install when running as root (Fly.io containers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — skip validation in tests, quote escaped cmd, escape model_id

- verify_openrouter_key and verify_openrouter_model skip network calls when
  SPAWN_SKIP_API_VALIDATION, BUN_ENV=test, or NODE_ENV=test is set
- install_agent timeout wrapper now quotes the escaped command for defense in depth
- model_id in openclaw JSON now uses json_escape() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove double-escaping in install_agent that broke shell operators

install_agent() was wrapping commands with printf '%q' + bash -c before
passing them to the run callback. But run callbacks (run_server, run_sprite,
ssh_run_server) already handle escaping for remote transport. The double-
escaping turned && || > | into literal characters, causing 'source' to
treat the entire command as a single filename.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:25:48 +00:00
A
f3ffb6caed
fix: broken error message in multi-creds validation, predictable temp path (#1442)
1. _multi_creds_validate referenced undefined help_url variable, causing
   empty "Get new credentials from: " error messages when OVH credential
   validation fails. Added help_url as parameter and pass it from caller.

2. _spawn_inject_env_vars (used by 130+ agent scripts via spawn_agent)
   uploaded credentials to static /tmp/env_config path. The older
   inject_env_vars_ssh/inject_env_vars_cb functions document this as a
   symlink attack vector and use randomized paths. Fixed to match.

3. Removed dead inject_env_vars_fly and inject_env_vars_sprite functions
   (all agent scripts now use spawn_agent -> _spawn_inject_env_vars).

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 07:51:28 -05:00
Ahmed Abushagur
db4aaa0c73
fix: prevent SSH hangs, fix command escaping, pin Python 3.12 for aider (#1439)
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).

Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
  stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
  SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
  no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
  $escaped_cmd that prevented the remote shell from consuming escapes,
  breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
  where it escaped shell operators into literal characters since daytona exec
  has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.

Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 04:23:15 -05:00
Ahmed Abushagur
d9e6d058e0
fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds (#1436)
aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 03:21:59 -05:00
Ahmed Abushagur
963144ecbd
fix: use pipx to install Aider across all clouds (#1429)
Ubuntu 24.04 blocks system-wide pip installs (PEP 668 externally-managed-
environment). Switch all aider.sh scripts from `pip install aider-chat`
to `python3 -m pip install pipx && pipx install aider-chat`, which
installs into an isolated virtualenv and works on all target distros.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:16:39 -08:00
A
f412fb69bc
ux: wait for OpenClaw gateway to be ready before launching TUI (#1385)
Fixes #1354 - users experienced a ~30s delay with "gateway not connected"
errors when trying to use OpenClaw immediately after launch.

Root cause: gateway takes time to bind to port 18789, but TUI launched
after only 2 seconds.

Solution: Add wait_for_openclaw_gateway() helper that polls the gateway
port (max 30s) before launching TUI, ensuring immediate usability.

Changes:
- shared/common.sh: Add wait_for_openclaw_gateway() function
- All openclaw.sh scripts (10 files): Replace sleep 2 with gateway readiness check

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 03:49:53 -05:00
A
d2866b2976
ux: standardize destroy_server() wrapper for OVH and Sprite (#1345)
Adds destroy_server() wrapper functions to OVH and Sprite cloud libraries
to match the standardized function name used by 8 other clouds.

Before:
- OVH used destroy_ovh_instance()
- Sprite had no destroy function
- Cross-cloud scripts couldn't use a uniform destroy_server() call

After:
- OVH: destroy_server() wraps destroy_ovh_instance()
- Sprite: destroy_server() wraps "sprite destroy <name>" CLI command
- Cross-cloud scripts can now call destroy_server() uniformly

This fixes the blocker for PR #1217 which hardcodes destroy_server() calls
that would silently fail for OVH and Sprite users.

Fixes #1178

Agent: ux-engineer

Co-authored-by: spawn-bot <bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 20:24:40 -05:00
A
654352bed0
security: fix predictable temp file path in sprite upload_file_sprite (#1330)
Replace PID-based temp path with cryptographically random generation
to prevent symlink attacks on remote servers.

Severity: MEDIUM
Finding: sprite/lib/common.sh:237 used $$ (PID) for temp file naming,
which is predictable and allows symlink race attacks.

Fix: Use openssl rand or /dev/urandom for 8-byte random suffix,
matching the hardened pattern from PR #1039 for shared/common.sh.

Related: #763 (security batch tracking issue)

Agent: security-auditor

Co-authored-by: spawn-bot <bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 20:22:22 -05:00
Ahmed Abushagur
758b575658
feat: add server lifecycle management (reconnect + delete) (#1363)
Wire up connection tracking across all 10 clouds so users can reconnect
to and delete previously spawned servers via `spawn list` and `spawn delete`.

Phase 1 - Connection tracking:
- Extend save_vm_connection() with cloud and metadata params
- Add save_vm_connection to create_server() in all cloud libs
- Extend VMConnection with cloud, deleted, deleted_at, metadata fields

Phase 2 - Delete via interactive picker:
- Add "Delete this server" option to spawn list picker
- Build delete scripts that reuse each cloud's destroy_server()
- Confirmation UX with spinner feedback
- Soft-delete marking in history (deleted records show [deleted])

Phase 3 - Standalone delete command:
- spawn delete (aliases: rm, destroy) with interactive picker
- Filter support: spawn delete -a <agent> -c <cloud>

Also improves reconnect hints for Fly (fly ssh console) and
Daytona (daytona ssh) connections.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 17:06:49 -08:00
L
55e6b2e88e
fix: use ~/.spawnrc for env vars instead of inlining into .bashrc (#1362)
Ubuntu's default .bashrc has an interactive-shell guard that exits
early in non-interactive contexts. When SSH runs a command string
(ssh -t user@host -- "cmd"), the shell is non-interactive, so
env vars appended to .bashrc are never loaded — causing Claude Code
to start without OpenRouter credentials and get rejected.

Fix: write env vars to ~/.spawnrc and have .bashrc/.zshrc source it.
Launch commands source ~/.spawnrc directly, bypassing the guard.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 17:05:17 -08:00
A
ec81c74594
refactor: introduce cloud adapter + spawn_agent runner system (#1340)
Eliminate ~70% boilerplate across 149 agent scripts by introducing a
standard cloud_* adapter interface and spawn_agent orchestration runner.

Each cloud's lib/common.sh now exports 7 adapter functions (cloud_authenticate,
cloud_provision, cloud_wait_ready, cloud_run, cloud_upload, cloud_interactive,
cloud_label) that wrap cloud-specific operations behind a uniform interface.

Agent scripts define hooks (agent_install, agent_env_vars, agent_launch_cmd,
etc.) and call `spawn_agent "Agent Name"` — the runner handles the full
deployment flow: auth → provision → wait → install → API key → env → config → launch.

- shared/common.sh: add spawn_agent(), _fn_exists(), _spawn_inject_env_vars()
- 10 cloud lib/common.sh files: add cloud_* adapter functions
- 149 agent scripts: rewrite to hook pattern (~40-80 lines → ~20-35 lines)
- test/run.sh: update 2 sprite test patterns for new adapter paths
- Net reduction: ~4,300 lines (2,257 added, 6,563 removed)

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 16:25:44 -08:00
A
392fbb7049
fix: source .bashrc in launch commands so env vars are available (#1325)
Env vars (OPENROUTER_API_KEY, ANTHROPIC_BASE_URL, etc.) are written to
~/.bashrc by inject_env_vars_* functions, but launch commands only
exported PATH inline — they never sourced .bashrc. This meant Claude
started without API keys.

Previously `source ~/.bashrc` was removed because fnm's eval corrupted
PATH. fnm has been completely removed from the codebase, so it's now
safe to source .bashrc again.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 15:00:24 -05:00
A
bcb59eb925
fix: stop sourcing rc files in launch command — fnm env destroys PATH (#1261)
Root cause: the launch command did `source ~/.bashrc; source ~/.zshrc; claude`.
The .zshrc contains `eval "$(fnm env)"` which outputs PATH with literal
"$PATH" in quotes instead of expanding it, destroying the entire PATH.

Confirmed via debugging:
- `ssh -t ... 'export PATH=...; which claude'` → works (/root/.bun/bin/claude)
- `ssh -t ... 'export PATH=...; source ~/.zshrc; which claude'` → "command not found"
- `source ~/.zshrc; echo $PATH` → `"/run/user/0/fnm_multishells/...":"$PATH"` (broken)

Fix:
- Remove `source ~/.bashrc` and `source ~/.zshrc` from ALL launch commands
- ssh -t creates a pseudo-terminal, so bash auto-sources .bashrc for env vars
- Explicit PATH export is all we need for finding the claude binary
- Remove fnm eval snippet from _finalize_claude_install (it poisoned rc files)
- Also: clean up stale ~/.bash_profile, fix cloud-init PATH, move node
  install after bun attempt

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 01:06:55 -08:00
A
3030b1d036
fix: revert .profile writes, use explicit PATH in launch commands (#1260)
Stop writing env vars to ~/.profile and ~/.bash_profile — only write to
.bashrc and .zshrc. The .profile approach caused issues because login
shells source it inconsistently across distros, and creating .bash_profile
makes bash -l skip .profile entirely.

Replace `bash -lc claude` launch commands with explicit PATH export +
source pattern across all cloud providers. This ensures claude is found
regardless of shell initialization quirks.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 00:43:49 -08:00
A
46e6f46008
fix: stop creating ~/.bash_profile — was destroying system PATH (#1258)
On Ubuntu/Debian, ~/.bash_profile doesn't exist by default. When bash
starts as a login shell (bash -l), it sources the FIRST file it finds
from: ~/.bash_profile, ~/.bash_login, ~/.profile. Since only ~/.profile
exists, that's what gets sourced — and ~/.profile sets up the standard
PATH (/usr/bin, /bin, etc.) and sources ~/.bashrc.

Our inject_env_vars_* functions and _finalize_claude_install were writing
to ~/.bash_profile and ~/.zprofile (either via touch+append or via
for-loop over all rc files). Creating ~/.bash_profile caused bash -l to
source it INSTEAD of ~/.profile, completely losing the standard PATH
setup. After deployment, even basic commands like `ls` would fail.

Fix: Only write to ~/.profile, ~/.bashrc, ~/.zshrc across all clouds
(shared, fly, sprite). These are the standard files that work correctly
on all Linux distros without breaking the shell initialization chain.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 00:27:28 -08:00
A
99b21e2797
fix: write env config to all shell startup files including .bash_profile (#1251)
Root cause: bash -l sources the FIRST of ~/.bash_profile, ~/.bash_login,
~/.profile. If ~/.bash_profile exists (e.g. from cloud-init), ~/.profile
is never read and our claude PATH exports are invisible.

Additionally, .bashrc has a non-interactive guard that skips exports when
sourced from non-interactive shells like `ssh host "cmd"` or `bash -lc`.

Fix: write env config and PATH entries to ALL shell startup files:
~/.profile, ~/.bash_profile, ~/.bashrc, ~/.zshrc, ~/.zprofile.
This ensures both login and interactive shells on any platform find claude.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 00:04:36 -08:00
A
dac4c62d6c
fix: try bun before npm for Claude Code install, fix PATH in launch (#1249)
Two fixes:
1. Swap fallback order from curl → npm → bun to curl → bun → npm.
   Bun is faster and typically pre-installed. Use `bun i -g`.

2. Fix "claude: command not found" at launch. The default .bashrc has
   a non-interactive guard (`case $- in *i*) ;; *) return;; esac`)
   that skips PATH exports when sourced from SSH command strings.
   Fix: write env config to ~/.profile (always sourced by login shells)
   in addition to .bashrc/.zshrc, and launch with `bash -lc claude`
   which starts a login shell that sources ~/.profile.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 23:44:02 -08:00
A
db06ff84e0
fix: run claude install --force and persist fnm PATH to shell configs (#1245)
After installing Claude Code (via any method), run `claude install --force`
to set up shell integration, then ensure fnm bootstrap is persisted to both
.bashrc and .zshrc so interactive sessions can find node.

Also simplify all launch commands across 9 clouds: instead of hardcoding
PATH entries that may miss fnm, source the rc files which now contain all
the necessary PATH entries from both inject_env_vars and _finalize_claude_install.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 23:34:09 -08:00
A
6357e0b2d1
fix: ask GitHub CLI setup before provisioning, not after (#1243)
Previously offer_github_auth prompted interactively inside inject_env_vars_*,
which runs after the server is already provisioned. This means the user sits
through provisioning before being asked a simple yes/no question.

Split into two phases:
- prompt_github_auth: asks the question early (before create_server)
- offer_github_auth: executes the install later (after server is up),
  using the stored answer without re-prompting

Falls back to interactive prompt if prompt_github_auth was never called,
so non-claude scripts and older clouds keep working unchanged.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 23:20:59 -08:00
A
d0847986f8
fix: use shared install_claude_code across all clouds with fnm PATH fix (#1242)
All cloud claude.sh scripts had inline curl-only installs with no fallback.
When the curl installer failed (transient outage, rate limit), installation
failed with no recovery. Additionally, fnm-installed Node.js was invisible
to subsequent SSH sessions because each SSH command runs in a non-interactive
shell that doesn't source .bashrc/.zshrc.

Changes:
- Migrate 8 cloud scripts to use shared install_claude_code (curl → npm → bun)
- Move _ensure_node_runtime before npm/bun install attempts (not after)
- Add fnm paths to claude_path so node is discoverable across SSH sessions
- Prefix npm/bun install commands with claude_path for PATH visibility
- Update test assertion to match new install_claude_code behavior

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 23:16:23 -08:00
L
fffb3591c4
feat: wire shared/github-auth.sh into all agent flows (#1216)
* feat: wire shared/github-auth.sh into all agent flows

Add offer_github_auth() to shared/common.sh and call it from the
inject_env_vars_* functions so all agent flows automatically offer
GitHub CLI setup after env var injection — no per-script changes needed.

Changes:
- shared/common.sh: add offer_github_auth() function, call it from
  inject_env_vars_ssh() and inject_env_vars_local()
- sprite/lib/common.sh: call offer_github_auth() from
  inject_env_vars_sprite()
- OVH is covered automatically (inject_env_vars_ovh delegates to
  inject_env_vars_ssh)

Behavior:
- Prompts "Set up GitHub CLI (gh) on this machine? (y/N):"
- Defaults to No (non-blocking for users who don't need it)
- Skippable via SPAWN_SKIP_GITHUB_AUTH=1 env var for CI/automation
- Uses safe_read for curl|bash compatibility
- Downloads and runs shared/github-auth.sh on the remote VM

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: add shared agent setup helpers, deduplicate hetzner scripts (#1236)

Add 5 composable helper functions to shared/common.sh (install_agent,
verify_agent, get_or_prompt_api_key, inject_env_vars_cb, launch_session)
that use the same callback pattern as offer_github_auth and
setup_claude_code_config. Refactor all 15 hetzner agent scripts to use
them, reducing total line count from 868 to 579 (-33%).

Phase 1 of multi-phase rollout — remaining clouds to follow.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 23:00:53 -08:00
L
d8ac64863d
fix: inject env vars into both .bashrc and .zshrc, fix PATH across all clouds (#1213)
API keys and env vars were only written to .zshrc, so SSH sessions using
bash couldn't find credentials. Also fixes incorrect ~/.claude/local/bin
PATH (claude installs to ~/.local/bin) and syncs interactive_session PATH
with cloud-init PATH across all 9 clouds.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 17:30:40 -08:00
A
49c8c4f60b
feat: add VM reconnect functionality to spawn list (#1175)
* feat: add VM reconnect functionality to spawn list (#1144)

Implements ability to reconnect to previously spawned VMs instead of
always creating new instances. Changes include:

- Add VMConnection interface to track IP, user, and server metadata
- Add save_vm_connection() bash function for scripts to persist connection info
- Modify spawn list to show connection status and offer reconnect option
- Support both SSH (cloud providers) and sprite console reconnection
- Update digitalocean/claude.sh and sprite/claude.sh as reference implementations

Fixes #1144

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* improve: add helpful error message when VM reconnect fails

Show user-friendly message suggesting to spawn a new VM if
reconnection fails, rather than just showing raw SSH error.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 00:16:53 -05:00
A
a79dd226ac
fix(security): prevent API key injection in koyeb/nanoclaw.sh and harden remote temp file permissions (#1049)
koyeb/nanoclaw.sh embedded the API key directly in a run_server command
string using single quotes. If the key contained a single quote, it could
break out and enable command injection. Replaced with the safe mktemp +
upload_file pattern used by all other nanoclaw scripts.

Also added chmod 600 before mv on remote /tmp/nanoclaw_env in 8 nanoclaw
scripts to restrict permissions on the credential file during transfer.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 23:37:25 -05:00
A
f586e19790
fix(security): replace unquoted heredocs with printf to prevent shell expansion in API keys (#1031)
Unquoted `<< EOF` heredocs in nanoclaw .env file creation cause shell
expansion of the API key value. If an API key contains `$`, backticks,
or `\`, the value is silently corrupted or could trigger command
execution. Replace with `printf '%s'` which safely writes the value
without interpretation.

Also fix unquoted variable expansion in upload_config_file's mv command
and the github-codespaces/openclaw.sh config heredoc.

Fixes 34 scripts across all cloud providers.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 19:41:10 -05:00
A
a0f6b335a4
fix: harden upload_file path validation with strict allowlist regex across 10 clouds (#993)
Replace fragile blocklist validation and printf '%q' escaping in upload_file()
with strict allowlist regex [a-zA-Z0-9/_.~-]+ across all non-SSH cloud providers.
For codesandbox, additionally migrate from shell command interpolation to SDK
filesystem API via environment variables, eliminating the injection surface entirely.

Affected clouds: codesandbox, daytona, e2b, fly, koyeb, modal, northflank,
railway, render, sprite

Fixes #989

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 12:20:40 -08:00
A
0f60a2b082
fix: add actionable guidance to agent installation failures across 126 scripts (#966)
Add log_install_failed helper to shared/common.sh that provides
structured troubleshooting for agent install failures: possible causes,
SSH debug command (when server IP available), manual install command,
and re-run suggestion. Also improve SSH key registration error message.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 10:14:03 -08:00
L
6633873ccc
refactor: replace Python with jq in Hetzner lib, fix /lab → /labs URLs (#827)
Hetzner lib: replace all Python JSON parsing with jq. Uses the
/datacenters API as the authoritative source for server type
availability (server_types.available), cross-referenced with
/server_types for specs and pricing. jq is auto-installed if missing.

URLs: update openrouter.ai/lab/spawn → openrouter.ai/labs/spawn
across all READMEs and CLI source.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:14:11 -08:00
A
5a1037d92c
fix: replace ((var++)) with var=$((var + 1)) for macOS bash 3.x compat (#769)
((var++)) returns exit code 1 when the variable is 0 (falsy), which
causes set -e to terminate the script. Replace all instances with
the safe var=$((var + 1)) pattern in sprite/lib/common.sh and
test/run.sh.

Fixes #762

Agent: community-coordinator

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 16:45:51 -08:00
A
6c7ced54dd
fix: replace log_warn with log_step/log_info for non-warning messages (#604)
Agent: ux-engineer

Many shell scripts misused log_warn (yellow) for normal progress/status
messages, making routine operations appear alarming. This fixes 59 files:

- Progress messages -> log_step (cyan): "Injecting environment variables...",
  "Attaching volume...", "Powering on instance...", "Retrieving server IP...",
  "Terminating sandbox/server...", "Creating datacenter...", "Importing SSH key...",
  "Deleting service/app...", "Modal not authenticated. Running setup..."
- Informational notices -> log_info (green): WhatsApp QR code authentication
  notices (30 nanoclaw scripts), codespace delete hints (14 scripts),
  "Appending environment variables to ~/.zshrc..." (6 local scripts),
  credential prompt hints, package update skipped, app reuse notices

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 03:24:30 -08:00
A
0835b35a36
fix: use log_step (cyan) for progress messages instead of log_warn (yellow) (#534)
~1500 progress messages across 481 files were using log_warn (yellow)
for normal status updates like "Installing...", "Setting up...",
"Creating server...", etc. This made users think something was wrong
when everything was proceeding normally.

Changes:
- Replace log_warn with log_step for all progress/status messages
- Keep log_warn only for actual warnings (errors, remediation hints)
- Remove emoji from 3 sprite completion messages

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
2026-02-11 14:37:43 -08:00
A
d9037fad32
fix: improve error messages and UX consistency across CLI and shell scripts (#466)
- Clarify download error messages: distinguish HTTP errors from network errors
  with specific status codes in the message
- Add actionable next steps to OAuth timeout: re-run command or set key manually
- Standardize error help labels to "How to fix:" across CLI and shell scripts
  (was inconsistently "What to do:", "Troubleshooting:", or missing)
- Add API method/endpoint context to retry failure messages so users know
  which API call failed
- Make verify_agent_installed error cases mutually exclusive: first for
  PATH/installation issues, second for runtime/dependency issues

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 07:46:56 -08:00
A
10a40ca574
fix: add log_step for progress messages, fix misleading prompt error (#440)
- Add log_step() function (cyan) for status/progress messages
- Convert misused log_warn calls to log_step in shared/common.sh
  (14 instances: SSH key gen, agent verification, waiting, configuring)
- Convert representative cloud scripts: hetzner, digitalocean, sprite
- Fix misleading validatePrompt error that suggested --prompt-file as a
  workaround when it has the same validation

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 04:28:17 -08:00
A
81bab47a74
fix: Escape API keys in continue.sh JSON configs to prevent injection (#374)
Replace vulnerable heredoc patterns across 27 continue.sh scripts with
setup_continue_config() helper that uses json_escape() + upload_config_file()
to safely handle API keys containing special characters like quotes or braces.

Also fix _save_token_to_config() in shared/common.sh which had the same
unescaped heredoc vulnerability for local token storage.

Relates to #104

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 00:13:19 -08:00
Ahmed Abushagur
8b9f9a0e5a
QA-Bot setup (#335)
* feat: testing

* feat: auto-fix dead apis

* fix: mock works

* feat: new fixtures

* fix: more clouds tested

* fix: dry run fix

* fix: civo valid size

* fix: civo result wait

* feat: fixtures

* feat: per cloud agent
2026-02-10 19:51:07 -08:00