39 commits

Author SHA1 Message Date
A
46cadfd5e1
fix: add --no-install-recommends to all apt calls across clouds (#1631)
Same fix as the fly PR (#1629) applied to all bash-based clouds.
Without --no-install-recommends, `git` pulls in python3 (~50MB)
via recommended packages on Ubuntu 24.04.

Affected files:
- gcp/lib/common.sh — cloud-init userdata
- aws/lib/common.sh — cloud-init userdata + unzip install
- daytona/lib/common.sh — sandbox base tools
- shared/common.sh — jq install + Node.js fallback
- shared/github-auth.sh — gh CLI install

Also added DEBIAN_FRONTEND=noninteractive and ca-certificates
where missing.
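
A minimal sketch of the install pattern this commit describes. The helper name `apt_install_cmd` and the package list are hypothetical, and the command is printed instead of executed so the sketch is safe to run anywhere:

```shell
# Hypothetical helper: builds the apt-get invocation described above.
apt_install_cmd() {
  # DEBIAN_FRONTEND=noninteractive suppresses prompts;
  # --no-install-recommends skips recommended (non-required) packages.
  printf 'DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends %s\n' "$*"
}

apt_install_cmd ca-certificates git jq
```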

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 18:12:19 -05:00
A
b055c0a285
fix: gcp and digitalocean destroy_server silently swallow errors (#1615)
GCP's destroy_server redirected both stdout and stderr to /dev/null
without checking the exit code, so deletion failures were invisible
to users. DigitalOcean's destroy_server never checked the API response
for error payloads, always reporting success.

Both bugs could leave cloud instances running (and charging money)
while telling users they were destroyed. Same class of bug fixed for
AWS in PR #1606.
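
The shape of such a fix might look like the sketch below. `fake_delete_instance` and `destroy_server_checked` are hypothetical stand-ins, not the repository's functions; the point is only that output is captured and the exit code is tested instead of being sent to /dev/null:

```shell
# Hypothetical stand-in for the cloud CLI call; it always fails so the error path is visible.
fake_delete_instance() {
  echo "ERROR: instance not found" >&2
  return 1
}

destroy_server_checked() {
  local name="$1" output
  # Capture stderr together with stdout, and branch on the exit code.
  if ! output=$(fake_delete_instance "$name" 2>&1); then
    echo "Failed to destroy ${name}: ${output}" >&2
    return 1
  fi
  echo "Destroyed ${name}"
}

destroy_server_checked my-dev-box || echo "destroy reported failure"
```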

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 17:08:53 -05:00
A
aa0b182f71
fix: validate GCP_USERNAME before assignment to prevent injection (#1537)
* fix: validate GCP_USERNAME before assignment to prevent injection

Assign logname output to _username first, validate against
^[a-zA-Z0-9_-]+$ regex, then assign to GCP_USERNAME. This
ensures the validated value is what gets used in su commands.
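
The validate-then-assign order described above can be sketched as follows. The function name `validate_username` is an assumption for illustration; the regex is the one quoted in this message:

```shell
validate_username() {
  local candidate="$1"
  local pattern='^[a-zA-Z0-9_-]+$'
  # Succeeds only for non-empty strings of letters, digits, underscore, and hyphen.
  [[ "$candidate" =~ $pattern ]]
}

# Assign to a temporary first, validate, and only then export the real variable.
_username="$(logname 2>/dev/null || echo "${USER:-}")"
if validate_username "$_username"; then
  GCP_USERNAME="$_username"
else
  echo "Refusing unsafe username" >&2
fi
```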

Fixes #1536

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: validate whoami output in gcp/lib/common.sh main script

Apply same validation pattern to line 27 as was applied in cloud-init.
Assigns whoami output to temp var, validates against alphanumeric pattern,
then assigns to GCP_USERNAME only after validation passes.

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 20:38:34 -05:00
L
d5690a8b11
feat: spawn name prompt + kebab resource naming across all clouds (#1507)
* feat: add spawn name prompt and project confirmation to GCP flow

Ask for spawn name upfront (before auth), derive kebab-case default for
VM naming, and confirm the current GCP project before using it.

New interaction order:
  1. Spawn name: "My Dev Box" → kebab "my-dev-box" exported as
     GCP_INSTANCE_NAME_KEBAB
  2. gcloud auth + project confirm: "Current project: X  Keep? [Y/n]"
     If no → project picker shown
  3. SSH key
  4. Machine type picker (existing)
  5. Zone picker (existing)
  6. Instance name prompt: "Instance name [my-dev-box]: "
     User can press Enter to accept or type a custom name

New functions:
  _to_kebab_case()         — lowercases, replaces non-alnum with hyphens
  _gcp_prompt_spawn_name() — prompts for display name, exports kebab default;
                             honours SPAWN_NAME env var set by CLI (--name flag)

Modified:
  _gcp_resolve_project()  — adds Y/n confirmation when project already set
  get_server_name()       — shows kebab default in prompt, accepts Enter
  cloud_authenticate()    — calls _gcp_prompt_spawn_name first

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat: add spawn name prompt to all clouds via shared/common.sh

Move _to_kebab_case() and prompt_spawn_name() to shared/common.sh so all
clouds get upfront spawn name prompting and kebab-based resource naming.

shared/common.sh:
  + _to_kebab_case()    — "My Dev Box" → "my-dev-box"
  + prompt_spawn_name() — asks for display name, exports SPAWN_NAME_DISPLAY
                          and SPAWN_NAME_KEBAB; skips if already set;
                          honours SPAWN_NAME env var from CLI --name flag
  ~ get_resource_name() — replaces silent SPAWN_NAME fallback with a visible
                          prefilled default: "Enter server name [my-dev-box]: "
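
For illustration, a kebab-case conversion along these lines could look like the sketch below. It is not the repository's `_to_kebab_case()`; collapsing runs of separators and trimming leading and trailing hyphens are assumptions beyond what the message states:

```shell
to_kebab_case() {
  # Lowercase, turn each run of non-alphanumerics into one hyphen, trim edge hyphens.
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

to_kebab_case "My Dev Box"   # prints my-dev-box
```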

Per-cloud changes (cloud_authenticate gains prompt_spawn_name first):
  hetzner, fly, aws, daytona, digitalocean, sprite — one-line change each

gcp/lib/common.sh:
  - Remove _to_kebab_case()        (now in shared)
  - Remove _gcp_prompt_spawn_name() (now in shared as prompt_spawn_name)
  ~ cloud_authenticate: _gcp_prompt_spawn_name → prompt_spawn_name
  ~ get_server_name: simplified back to get_validated_server_name
    (shared get_resource_name now shows the kebab default in the prompt)

Result — every cloud shows this flow upfront:
  Spawn name (e.g. "My Dev Box"): My Claude Box
  ℹ Resource name: my-claude-box
  ...
  Enter server name [my-claude-box]: ⏎

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix: use "Use project '...'?" instead of "Keep this project?" in GCP prompt

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 22:22:59 -08:00
L
ff261f3544
feat: add spawn pick to shared _display_and_select (Hetzner + all clouds) (#1505)
* feat: add spawn pick to _display_and_select in shared/common.sh

All clouds using interactive_pick (Hetzner, DigitalOcean, AWS, fly, etc.)
now get the arrow-key picker UI when the user runs via `spawn`.

Placement: between fzf (rarely installed) and numbered list (plain fallback).
Priority: fzf > spawn pick > numbered list.

Pipe-delimited items "id|field2|field3..." are converted to tab-delimited
"id\tid\tfield2 · field3 · ..." so spawn pick displays:
  > cx22  2 vCPU · 4.0 GB RAM · 40 GB disk · shared · $ 0.0057/hr
  > fsn1  Falkenstein · DE

The --default flag uses default_id when set, otherwise default_value,
so the correct item is pre-selected when the picker opens.
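
The pipe-to-tab conversion described above can be sketched with awk. The function name `to_pick_format` is hypothetical, and the actual implementation in shared/common.sh may differ:

```shell
to_pick_format() {
  # Input:  id|field2|field3...
  # Output: id<TAB>id<TAB>field2 · field3 · ...
  awk -F'|' '{
    label = ""
    for (i = 2; i <= NF; i++) label = label (i > 2 ? " · " : "") $i
    printf "%s\t%s\t%s\n", $1, $1, label
  }'
}

printf 'fsn1|Falkenstein|DE\n' | to_pick_format
```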

No 2>/dev/tty redirect (avoids the zsh 'file exists' failure that broke
the GCP picker; spawn pick opens /dev/tty internally via fs.openSync).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* refactor: replace custom _gcp_interactive_pick with shared interactive_pick

- Remove _gcp_interactive_pick (60 lines of custom picker logic)
- Convert option functions to pipe-delimited format (id|detail)
  to match what interactive_pick / _display_and_select expect
- Replace _gcp_pick_{machine_type,zone,project} with direct
  interactive_pick calls — same pattern as Hetzner
- _gcp_project_options: awk now outputs id|name instead of id\tid\tname

GCP now gets fzf → spawn pick → numbered list for free via the
shared helper, with no cloud-specific picker code.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 21:59:00 -08:00
L
015446eee8
fix: remove 2>/dev/tty from spawn pick call in GCP picker (#1504)
The 2>/dev/tty redirect caused spawn pick to exit 1 on zsh/macOS
with 'file exists: /dev/tty', silently breaking the picker and
always falling through to the numbered-list fallback.

spawn pick renders its arrow-key UI by opening /dev/tty directly
via fs.openSync() — it never uses stderr for the UI — so the
redirect served no purpose and only caused failures.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 21:44:01 -08:00
A
e2d6aa1444
fix: use json_escape in save_vm_connection to prevent malformed JSON (#1470)
save_vm_connection built JSON via direct string interpolation, which
produces malformed output if any value contains quotes, backslashes,
or other JSON-special characters. This breaks spawn list/delete/history.

Changes:
- Use json_escape for all string fields in save_vm_connection
- Use json_escape for GCP zone/project metadata values
- Switch AWS, GCP, Daytona get_server_name to get_validated_server_name
  for consistency with Hetzner, DigitalOcean, Fly, OVH
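
To show why direct interpolation breaks, here is a minimal escaping sketch. `json_escape_sketch` is a hypothetical stand-in, not the repository's `json_escape`; it handles only backslash, double quote, newline, and tab, and it returns the value with surrounding quotes, which is an assumption about the calling convention:

```shell
json_escape_sketch() {
  local s="$1"
  s=${s//\\/\\\\}     # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}     # double quote
  s=${s//$'\n'/\\n}   # newline
  s=${s//$'\t'/\\t}   # tab
  printf '"%s"' "$s"
}

printf '{"name": %s}\n' "$(json_escape_sketch 'My "Dev" Box')"
```

With plain interpolation the inner quotes would end the JSON string early; with escaping, the line printed is `{"name": "My \"Dev\" Box"}`.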

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 16:23:27 +00:00
Ahmed Abushagur
8ee54d01a8
fix: harden agent reliability + security across all clouds (#1468)
* docs: add spawn delete command to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden openclaw across all clouds — validation, reliability, performance

Fixes multiple issues causing openclaw to break on most clouds:

Bugs fixed:
- Double-prefixed model ID (openrouter/openrouter/auto) in config generation
- AWS gateway starting without env vars (missing .zshrc source)
- DigitalOcean sourcing .spawnrc instead of .zshrc for gateway
- Destructive rm -rf ~/.openclaw on re-runs (now mkdir -p)

Validation added:
- API key checked against OpenRouter /auth/key endpoint with re-prompt on failure
- Model ID verified against OpenRouter model list with re-prompt loop
- openrouter/auto and openrouter/free bypass model check

Reliability improvements:
- Standardized gateway launch with </dev/null & disown across all 9 clouds
- Gateway log auto-displayed on startup timeout for diagnostics
- 2GB swap added to cloud-init to prevent OOM on small VMs
- Portable install timeout (10 min) with macOS gtimeout fallback

Performance:
- Reordered spawn_agent: OAuth runs while VM provisions (saves 30-60s)
- Fly.io: bumped to 2GB RAM + 2 shared CPUs for openclaw
- Fly.io: tries bun first (faster), falls back to npm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip sudo in gh install when running as root (Fly.io containers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — skip validation in tests, quote escaped cmd, escape model_id

- verify_openrouter_key and verify_openrouter_model skip network calls when
  SPAWN_SKIP_API_VALIDATION, BUN_ENV=test, or NODE_ENV=test is set
- install_agent timeout wrapper now quotes the escaped command for defense in depth
- model_id in openclaw JSON now uses json_escape() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove double-escaping in install_agent that broke shell operators

install_agent() was wrapping commands with printf '%q' + bash -c before
passing them to the run callback. But run callbacks (run_server, run_sprite,
ssh_run_server) already handle escaping for remote transport. The double-
escaping turned && || > | into literal characters, causing 'source' to
treat the entire command as a single filename.
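
The effect can be reproduced locally. In this sketch `bash -c` stands in for the single layer of shell parsing on the remote side; the command string is illustrative:

```shell
cmd='cd /tmp && echo ok'

# One level of shell parsing: && acts as an operator.
bash -c "$cmd"

# printf '%q' escapes the spaces and ampersands, so after one level of parsing
# the whole string is a single word, which the shell treats as one command name.
escaped=$(printf '%q' "$cmd")
bash -c "$escaped" 2>/dev/null || echo "escaped form failed to run"
```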

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use local github-auth.sh instead of curling from main

When running from a local checkout, base64-encode the local
github-auth.sh and send it inline to the remote machine. This
ensures fixes (like the sudo skip for root) take effect immediately
without waiting for a merge to main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle github-auth errors gracefully instead of terminating

GitHub CLI setup is optional — failures should not abort the spawn
session. Guard both run_callback calls in offer_github_auth with
|| log_warn so the script continues even if gh install fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use GOOGLE_GEMINI_BASE_URL to route Gemini CLI through OpenRouter

Gemini CLI ignores OPENAI_BASE_URL — it uses GEMINI_API_KEY to talk
directly to Google's API. The OpenRouter key is not a valid Google
API key, so all requests fail with "API key not valid".

Use GOOGLE_GEMINI_BASE_URL to redirect Gemini CLI to OpenRouter's
endpoint. Fixes all 9 cloud gemini scripts + manifest.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: guard optional spawn_agent hooks so failures don't kill the session

With set -eo pipefail, any unguarded failure terminates the script.
Several optional operations in spawn_agent were unguarded:

- agent_configure: config file uploads (agent works with defaults)
- agent_save_connection: convenience JSON for spawn list
- agent_pre_launch: gateway daemons, startup hooks
- agent_pre_provision: pre-provision prompts
- .spawnrc shell hooks: hooking env vars into .bashrc/.zshrc

These now log warnings and continue instead of aborting. Critical
steps (cloud_authenticate, agent_install, cloud_provision) still
exit on failure.
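
A minimal sketch of the guard pattern, assuming a `log_warn` helper that writes to stderr. `optional_step` is a hypothetical hook that always fails:

```shell
set -eo pipefail

log_warn() { echo "WARN: $*" >&2; }

# Hypothetical optional hook that fails.
optional_step() { return 1; }

# Without the || guard, set -e would terminate the script on this line.
optional_step || log_warn "optional step failed, continuing"

echo "session continues"
```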

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: audit and fix env vars, escaping, and error handling across all agents

Audit findings from 3 parallel agents, fixes applied:

**Env vars (4 agents fixed across 9 clouds each = 36 scripts):**
- Amazon Q: remove fake OPENAI_* vars (Q uses AWS auth, can't use OpenRouter)
- Cline: replace OPENAI_* env vars with `cline auth -p openrouter` command
- Open Interpreter: drop OPENAI_* vars, use only OPENROUTER_API_KEY (native support via --model flag)
- NanoClaw: add ANTHROPIC_BASE_URL to .env file (was missing, requests went to Anthropic directly)

**Escaping:**
- execute_agent_non_interactive: replace printf '%q' with single-quote wrapping to avoid double-escaping on Fly.io

**Manifest updated** for amazonq, cline, interpreter entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use setsid to detach openclaw gateway daemon from SSH sessions

The gateway daemon launch (`nohup openclaw gateway ... & disown`) hangs
on all clouds because SSH/exec channels wait for child FDs to close.
setsid creates a new session, fully detaching the daemon so the channel
can close immediately. Falls back to nohup where setsid is unavailable.
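
A sketch of the detach-with-fallback idea. `start_detached` is a hypothetical helper, not the `start_openclaw_gateway()` function mentioned below, and the log file argument is an assumption. Redirecting stdin, stdout, and stderr is what releases the file descriptors the channel would otherwise wait on:

```shell
start_detached() {
  local log_file="$1"; shift
  # Prefer setsid (new session); fall back to nohup where it is unavailable.
  if command -v setsid >/dev/null 2>&1; then
    setsid "$@" </dev/null >>"$log_file" 2>&1 &
  else
    nohup "$@" </dev/null >>"$log_file" 2>&1 &
  fi
}

log_file=$(mktemp)
start_detached "$log_file" sh -c 'echo started'
```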

Consolidates the daemon launch into a shared start_openclaw_gateway()
function used by all 9 cloud scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: configure npm global prefix for non-root clouds (AWS, GCP, OVH)

AWS Lightsail, GCP, and OVH SSH as non-root users (ubuntu/login user),
so `npm install -g` fails with EACCES on /usr/local/lib/node_modules/.

Fix: configure npm prefix to ~/.npm-global during cloud-init/setup and
add ~/.npm-global/bin to the SSH PATH prefix so agent install commands
find globally-installed npm binaries without sudo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove broken OpenRouter routing from Gemini CLI scripts

Gemini CLI uses Google's native API format (/v1beta/models/:streamGenerateContent),
not the OpenAI-compatible format (/v1/chat/completions). No base URL override can
bridge this — the request formats are fundamentally incompatible. Same situation
as Amazon Q (uses vendor-specific auth/API).

Removed GEMINI_API_KEY and GOOGLE_GEMINI_BASE_URL from all 9 scripts + manifest.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-install AWS CLI and gcloud SDK when missing

Instead of printing manual install instructions and exiting, both CLIs
now auto-install:

- AWS: downloads official .pkg (macOS) or .zip (Linux) installer
- GCP: uses brew cask on macOS, Google's tarball installer on Linux

Falls back to manual instructions if auto-install fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: nanoclaw — install Docker on Linux, fix hardcoded /root/ path

Two issues broke NanoClaw on all clouds:

1. .env upload hardcoded /root/nanoclaw/.env — fails on non-root clouds
   (AWS=ubuntu, GCP=user, OVH=ubuntu). Now uses upload_config_file with
   $HOME which expands on the remote side.

2. NanoClaw requires a container runtime. On Linux it uses Docker, but
   Docker was never installed. Added Docker install via get.docker.com
   to all cloud scripts (with sudo where SSH user is non-root).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review findings from PR #1463

- Reject symlinked github-auth.sh before base64-encoding (falls back to remote URL)
- Hide API key from process list using curl -K - instead of -H in verify_openrouter_key

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: quote OPENROUTER_API_KEY in cline auth to prevent command injection

Unquoted variable in `cline auth -p openrouter -k ${OPENROUTER_API_KEY}`
allows shell metacharacters in the key to execute arbitrary commands on
the remote server. Wrapping in escaped double quotes prevents expansion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 08:36:24 -05:00
A
fdf7a675b3
security: validate GCP username before su to prevent command injection (#1451)
Fixes command injection vulnerability in cloud-init where unquoted
$(logname 2>/dev/null || echo "$USER") could allow shell metacharacters
to be interpreted with root privileges.

Fixes #1450

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 23:20:27 -05:00
A
7e2a7bca1e
security: replace eval with indirect expansion in GCP picker (#1448)
Agent: security-auditor
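
This commit has no body, so the exact code it changed is not shown here. As a general illustration of the technique named in the title, and assuming the picker reads a variable whose name is held in another variable:

```shell
GCP_ZONE="us-central1-a"
var_name="GCP_ZONE"

# eval re-parses the string as code, so a crafted var_name could run commands:
#   eval "value=\$$var_name"
# Indirect expansion only reads the variable whose name is stored in var_name.
value="${!var_name}"
echo "$value"
```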

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 15:17:31 -05:00
A
e4bf4d86a4
feat: add spawn pick command and interactive GCP project/zone/machine-type pickers (#1443)
- New cli/src/picker.ts: modular picker module with pickToTTY() that renders
  an arrow-key UI directly to /dev/tty, so it works even when stdout is captured
  by a bash $() command substitution and stdin is piped with options.

- New spawn pick subcommand: reads options from stdin as tab-separated lines
  (value\tLabel\tHint), shows clack-style picker via /dev/tty, writes selected
  value to stdout.  Falls back to a numbered list when no TTY is available.

  Usage from bash:
    zone=$(printf 'us-central1-a\tIowa\nus-east1-b\tVirginia\n' \
             | spawn pick --prompt "Select zone" --default "us-central1-a")

- gcp/lib/common.sh: interactive project, zone, and machine-type pickers for
  all GCP agent scripts.  Each picker respects env var overrides (GCP_PROJECT,
  GCP_ZONE, GCP_MACHINE_TYPE) and skips the prompt when already set.  Uses
  spawn pick for a nice arrow-key UI when available; falls back to
  _display_and_select (fzf or numbered list) from shared/common.sh.

  - _gcp_machine_type_options(): curated list of 8 popular instance types
  - _gcp_zone_options(): 12 curated zones across US / EU / APAC / AU
  - _gcp_project_options(): live list via gcloud projects list
  - _gcp_pick_{machine_type,zone,project}(): picker wrappers
  - _gcp_resolve_project(): now prompts interactively instead of erroring when
    no project is configured
  - create_server(): now calls pickers before provisioning instead of silently
    using defaults

- cli version bump 0.5.2 to 0.5.3

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 11:30:52 -05:00
Ahmed Abushagur
f2795a6d84
fix: Node.js v22 upgrade, aider uv install, SSH & cloud reliability (#1440)
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).

Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
  stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
  SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
  no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
  $escaped_cmd that prevented the remote shell from consuming escapes,
  breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
  where it escaped shell operators into literal characters since daytona exec
  has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds
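
The stdin issue in the first bullet can be shown without a server. In this sketch `cat` stands in for ssh, since both read stdin when nothing else is attached; here the symptom is a loop that ends early instead of a hang, but the cause (a child consuming the caller's stdin) is the same. All names are hypothetical:

```shell
count_iterations() {
  # Runs the given command once per input line and prints how many
  # iterations completed.
  local n=0
  while read -r _host; do
    "$@"
    n=$((n + 1))
  done <<'EOF'
host-a
host-b
host-c
EOF
  echo "$n"
}

drain_stdin()    { cat >/dev/null; }              # swallows the remaining lines
drain_dev_null() { cat >/dev/null </dev/null; }   # leaves the loop's stdin alone

count_iterations drain_stdin      # prints 1
count_iterations drain_dev_null   # prints 3
```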

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.

Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prepend ~/.local/bin to PATH in ssh_run_server

After uv installs to ~/.local/bin, the current shell session doesn't
have it in PATH, causing "uv: command not found" on DigitalOcean and
all other SSH-based clouds (Hetzner, AWS, GCP, OVH).

Fly.io's run_server already prepends this PATH — now the shared
ssh_run_server does the same, fixing all SSH-based clouds at once.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add Node.js to cloud-init for all cloud providers

npm-based agents (codex, kilocode, etc.) fail with "npm: command not
found" because Node.js isn't installed during cloud-init. Fly.io was
the only provider installing Node.js (in wait_for_cloud_init).

Now all cloud-init scripts install Node.js v22 LTS from nodesource,
matching Fly.io's setup. Also adds ~/.local/bin to PATH in AWS and
GCP cloud-init (was already in shared/DigitalOcean/Hetzner).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use apt packages for nodejs/npm instead of nodesource

The nodesource setup script (setup_22.x) runs its own apt-get update
and repository configuration, nearly doubling cloud-init time and
causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm
in its default repos — just add them to the packages list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add timeouts and better error handling to Daytona CLI commands

Daytona CLI commands (login, list, create) can hang indefinitely when
the API is slow or unreachable. This causes:
- "Failed to create sandbox: timeout" with no recovery
- Token validation timeouts misreported as "invalid token"
- Users re-entering valid tokens that also time out

Fixes:
- Wrap all daytona CLI calls with timeout (30s for auth, 120s for create)
- Detect timeout errors separately from auth errors
- Show actionable "try again / check status" messages for timeouts
- Add nodejs/npm to Daytona wait_for_cloud_init
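
A sketch of separating timeouts from other failures, assuming GNU coreutils `timeout`, which exits with status 124 when the limit is hit. The wrapper name and message are hypothetical, and `sleep` stands in for a slow CLI call:

```shell
run_with_timeout() {
  local seconds="$1"; shift
  local rc=0
  timeout "$seconds" "$@" || rc=$?
  # GNU timeout exits with 124 when the time limit was reached.
  if [ "$rc" -eq 124 ]; then
    echo "Timed out after ${seconds}s; try again or check service status" >&2
  fi
  return "$rc"
}

run_with_timeout 1 sleep 5 || echo "exit code: $?"
```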

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: set DAYTONA_API_URL to Daytona Cloud by default

The Daytona CLI may default to connecting to a local self-hosted
server instead of Daytona Cloud. Without DAYTONA_API_URL set to
https://app.daytona.io/api, every CLI command (login, list, create)
hangs trying to reach a non-existent local server and times out.

The SDK documents this as the default, but the CLI doesn't always
pick it up — now we export it explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing

n installs Node.js v22 to /usr/local/bin/node but apt's v18 at
/usr/bin/node can shadow it in non-interactive SSH sessions. After
n 22, symlink the new binaries over the apt ones so v22 is always
resolved. Also fix hcloud CLI token extraction for new TOML format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review, add curl timeouts to trigger workflows

- Fix ssh_run_server command injection concern: use single-quoted
  path_prefix so $HOME/$PATH expand remotely, not locally
- Add --connect-timeout 15 --max-time 30 to trigger workflows to
  prevent 5-min hangs when server streams responses
- Handle 409 (dedup) as success — expected when cron fires every 15min
  but cycles take 35min
- Reduce workflow timeout-minutes from 5 to 2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 06:54:07 -05:00
A
7187ef1cbf
security: fix unsafe command substitution in GCP cloud-init script (#1413)
Replace nested command substitution $(echo "$(whoami)") with $USER
environment variable to prevent potential command injection attacks.

The nested substitution was vulnerable because:
- whoami could be aliased or PATH-manipulated in compromised environments
- Running as root in cloud-init amplified the security impact
- Double nesting was unnecessary complexity

Using $USER is safer because:
- It's a shell variable, not command execution
- No subprocess spawning or PATH resolution
- Simpler and more reliable

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 12:54:06 -05:00
A
6be328c314
fix: auto-run gcloud auth login on expired GCP tokens (#1371)
Instead of telling users to run `gcloud auth login` manually, just
run it automatically when auth check fails or instance creation hits
a reauthentication error, then retry.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 19:34:54 -08:00
Ahmed Abushagur
758b575658
feat: add server lifecycle management (reconnect + delete) (#1363)
Wire up connection tracking across all 10 clouds so users can reconnect
to and delete previously spawned servers via `spawn list` and `spawn delete`.

Phase 1 - Connection tracking:
- Extend save_vm_connection() with cloud and metadata params
- Add save_vm_connection to create_server() in all cloud libs
- Extend VMConnection with cloud, deleted, deleted_at, metadata fields

Phase 2 - Delete via interactive picker:
- Add "Delete this server" option to spawn list picker
- Build delete scripts that reuse each cloud's destroy_server()
- Confirmation UX with spinner feedback
- Soft-delete marking in history (deleted records show [deleted])

Phase 3 - Standalone delete command:
- spawn delete (aliases: rm, destroy) with interactive picker
- Filter support: spawn delete -a <agent> -c <cloud>

Also improves reconnect hints for Fly (fly ssh console) and
Daytona (daytona ssh) connections.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 17:06:49 -08:00
A
ec81c74594
refactor: introduce cloud adapter + spawn_agent runner system (#1340)
Eliminate ~70% boilerplate across 149 agent scripts by introducing a
standard cloud_* adapter interface and spawn_agent orchestration runner.

Each cloud's lib/common.sh now exports 7 adapter functions (cloud_authenticate,
cloud_provision, cloud_wait_ready, cloud_run, cloud_upload, cloud_interactive,
cloud_label) that wrap cloud-specific operations behind a uniform interface.

Agent scripts define hooks (agent_install, agent_env_vars, agent_launch_cmd,
etc.) and call `spawn_agent "Agent Name"` — the runner handles the full
deployment flow: auth → provision → wait → install → API key → env → config → launch.

- shared/common.sh: add spawn_agent(), _fn_exists(), _spawn_inject_env_vars()
- 10 cloud lib/common.sh files: add cloud_* adapter functions
- 149 agent scripts: rewrite to hook pattern (~40-80 lines → ~20-35 lines)
- test/run.sh: update 2 sprite test patterns for new adapter paths
- Net reduction: ~4,300 lines (2,257 added, 6,563 removed)

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 16:25:44 -08:00
A
e4a3b60c90
ux: improve GCP CLI installation and setup guidance (#1149) (#1156)
Enhanced error messages in gcp/lib/common.sh to provide comprehensive,
platform-specific guidance for users getting started with GCP:

Changes:
- Added detailed install instructions for macOS (Homebrew), Ubuntu/Debian,
  and Fedora/RHEL platforms
- Included post-installation steps (auth and project configuration)
- Added links to create GCP project and enable Compute Engine API
- Improved error message structure with clear "How to fix" sections
- Maintained color-coded output for better readability

This addresses the issue where users felt the GCP flow "doesn't guide
me toward finding the right resource". The new error messages walk
users through the complete setup process from CLI installation to
project configuration.

Agent: ux-engineer

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 20:39:41 -05:00
A
577206bc1b
fix(ux): add post-session summary to Alibaba Cloud and GCP (#1052)
Both clouds had custom `interactive_session` functions that called
`ssh` directly, bypassing the shared `ssh_interactive_session` which
shows the post-session server-still-running warning. Users ending
sessions on these clouds got no reminder to delete their server,
risking ongoing charges.

Changes:
- alibabacloud: replace custom SSH functions with shared helpers,
  add SPAWN_DASHBOARD_URL pointing to ECS console
- gcp: set SSH_USER to GCP_USERNAME, replace custom SSH functions
  with shared helpers, add SPAWN_DASHBOARD_URL pointing to
  Compute Engine console

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 00:29:43 -05:00
A
8852bcfc36
refactor: decompose GCP ensure_gcloud and create_server into focused helpers (#964)
ensure_gcloud (36 lines -> 5 lines + 3 helpers):
- _gcp_check_cli_installed: verify gcloud CLI presence
- _gcp_check_auth: verify active authenticated account
- _gcp_resolve_project: resolve and export GCP_PROJECT

create_server (65 lines -> 37 lines + 4 helpers):
- _gcp_write_startup_script: write cloud-init to tracked temp file
- _gcp_invoke_create: run gcloud compute instances create
- _gcp_handle_create_error: actionable error guidance
- _gcp_get_instance_ip: fetch and export instance external IP

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 10:19:19 -08:00
A
f3d2384392
refactor: decompose GCP and Cherry create_server functions (#965)
GCP create_server was 64 lines (largest function across all cloud libs).
Cherry create_server was 54 lines. Both are now under 30 lines each
by extracting focused helpers:

GCP (64 -> 25 lines):
- _gcp_prepare_instance_files: startup script + SSH key temp files
- _gcp_run_create: gcloud command execution with error diagnostics
- _gcp_get_instance_ip: IP extraction from instance describe

Cherry (54 -> 27 lines):
- _cherry_build_server_body: JSON payload construction
- _cherry_submit_create: API call with error handling

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 10:15:47 -08:00
A
3c3c697ea5
fix: json_escape SSH key names and fix GCP metadata injection (#958)
SSH key registration in 11 cloud providers used unescaped key_name
directly in JSON request bodies. If the hostname (used to generate
key names) contained JSON-special characters like double-quotes, it
could break out of the JSON string and inject arbitrary JSON fields.

Fix: use json_escape for key_name in all providers, matching the
pattern already used by Scaleway.

Also fix GCP create_server which embedded the startup script inline
in --metadata with comma delimiters. Commas in the script could break
metadata parsing or inject additional metadata keys. Fix: use
--metadata-from-file for the startup script.

Affected providers: Hetzner, DigitalOcean, Vultr, BinaryLane,
Hostinger, Contabo, Cherry, HOSTKEY, Civo, Linode, Genesis Cloud, GCP.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 09:03:35 -08:00
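
The key-name escaping described in 3c3c697ea5 can be illustrated with a minimal sketch. The repository's actual `json_escape` is not shown in this log, so `json_escape_sketch` below is a hypothetical stand-in. It assumes the helper returns a quoted JSON string and only handles the most common special characters.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for json_escape; the real helper may behave differently.
json_escape_sketch() {
  local s="$1"
  s=${s//\\/\\\\}        # escape backslashes first
  s=${s//\"/\\\"}        # then double quotes
  s=${s//$'\n'/\\n}      # newline
  s=${s//$'\r'/\\r}      # carriage return
  s=${s//$'\t'/\\t}      # tab
  printf '"%s"' "${s}"
}

# A hostname-derived key name that tries to break out of the JSON string.
key_name='evil","admin":"true'
body=$(printf '{"name":%s}' "$(json_escape_sketch "${key_name}")")
echo "${body}"
```

With the quotes escaped, the whole value stays inside one JSON string, so no extra fields can be injected.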
A
fd80f1992c
fix: improve error messages for GCP, AWS Lightsail, Cherry, and Oracle (#957)
- GCP: capture gcloud stderr on failure, add common issues guidance,
  use _log_diagnostic for ensure_gcloud errors
- AWS Lightsail: add common issues for create_server failure,
  use _log_diagnostic for ensure_aws_cli errors,
  improve instance timeout message with actionable steps
- Cherry Servers: use extract_api_error_message instead of raw response
  dump, add common issues for server creation failure
- Oracle Cloud: capture OCI CLI stderr on instance launch failure,
  add common issues for VCN, subnet, and instance creation errors

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 09:00:10 -08:00
A
0835b35a36
fix: use log_step (cyan) for progress messages instead of log_warn (yellow) (#534)
~1500 progress messages across 481 files were using log_warn (yellow)
for normal status updates like "Installing...", "Setting up...",
"Creating server...", etc. This made users think something was wrong
when everything was proceeding normally.

Changes:
- Replace log_warn with log_step for all progress/status messages
- Keep log_warn only for actual warnings (errors, remediation hints)
- Remove emoji from 3 sprite completion messages

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
2026-02-11 14:37:43 -08:00
A
b0f924b511
fix: Prevent Python/shell injection via env vars and triple-quote strings (#102)
- Fix triple-quote injection in SSH keys (Scaleway, UpCloud), userdata
  (BinaryLane), init scripts (Civo, Kamatera), and GraphQL queries
  (RunPod) by passing data via stdin/json_escape instead of inline
  string interpolation
- Add input validation for all cloud provider env vars (region, type,
  plan, etc.) using validate_region_name/validate_resource_name to block
  shell metacharacters before they reach Python string interpolation
- Validate Modal image name as Python identifier to prevent code injection
- Validate numeric env vars (RAM, GPU count, disk size) across all providers

Affects: 19 cloud provider lib/common.sh files
Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 10:22:39 -08:00
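
As a rough illustration of the allowlist validation described in b0f924b511: the real `validate_region_name` and `validate_resource_name` are not shown in this log, so `validate_name_sketch` and its pattern are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical validator: accept only letters, digits, underscore and hyphen.
validate_name_sketch() {
  local value="$1"
  local pattern='^[a-zA-Z0-9_-]+$'
  [[ "${value}" =~ ${pattern} ]]
}

check() {
  if validate_name_sketch "$1"; then echo "accepted"; else echo "rejected"; fi
}

plain=$(check "us-central1-a")
quoted=$(check 'us"""; import os #')   # triple-quote breakout attempt
substituted=$(check '$(id)')           # shell command substitution attempt
echo "plain=${plain} quoted=${quoted} substituted=${substituted}"
```

An allowlist rejects metacharacters by construction, which is easier to reason about than enumerating every dangerous character.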
Sprite
359777d855 refactor: Document and fix GCP variable export pattern
Improved the variable export pattern in gcp/lib/common.sh:
- Added SC2034 disable with clear documentation in create_server()
- Made server_ip a local variable before export
- Added SC2154 suppressions with documentation in all 10 GCP agent scripts

This eliminates shellcheck warnings while maintaining the existing
export behavior that makes GCP_SERVER_IP, GCP_INSTANCE_NAME_ACTUAL,
and GCP_ZONE available to calling scripts.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 03:40:12 +00:00
Sprite
55ef42c82e refactor: add shellcheck disables for intentional SSH_OPTS word splitting
Add SC2086 disable comments to interactive_session() functions in
GCP, Hetzner, and DigitalOcean providers. SSH_OPTS is intentionally
unquoted to allow word splitting for multiple SSH options.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 02:56:08 +00:00
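
The intentional word splitting noted in 55ef42c82e can be seen in a small sketch. The `SSH_OPTS` value here is an assumed example, not the one defined in shared/common.sh, and `count_args` is a hypothetical helper standing in for `ssh`.

```shell
#!/usr/bin/env bash
# Assumed example value; the real SSH_OPTS lives in shared/common.sh.
SSH_OPTS="-o StrictHostKeyChecking=no -o ConnectTimeout=10"

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the string is split on whitespace into separate options.
# shellcheck disable=SC2086
unquoted=$(count_args ${SSH_OPTS})

# Quoted: the whole string arrives as a single argument, which ssh would reject.
quoted=$(count_args "${SSH_OPTS}")

echo "unquoted=${unquoted} quoted=${quoted}"
```

A bash array would avoid the suppression altogether, but it would also change how the variable is shared between files.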
Sprite
cabdbc37ba refactor: add pipefail to error handling flags
Changed 65 agent scripts from `set -e` to `set -eo pipefail` to ensure
errors in piped commands are properly caught. This prevents silent
failures when an earlier stage of a pipeline such as `curl | bash` fails.

Files updated across all cloud providers:
- aws-lightsail: 10 scripts
- digitalocean: 3 scripts
- e2b: 10 scripts
- gcp: 10 scripts
- hetzner: 3 scripts
- lambda: 10 scripts
- linode: 3 scripts
- modal: 10 scripts
- sprite: 3 scripts
- vultr: 3 scripts

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 02:34:45 +00:00
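
The difference made by cabdbc37ba can be reproduced with a short sketch. Each case runs in its own `bash -c` process so the options do not leak into the calling shell. The `false | cat` pipeline is only a stand-in for a failing download piped into another command.

```shell
#!/usr/bin/env bash
# Without pipefail the pipeline's status is that of the last command (cat),
# so set -e does not trigger and the script carries on.
without=$(bash -c 'set -e; false | cat; echo reached' 2>/dev/null || true)

# With pipefail the failing first stage makes the pipeline fail,
# and set -e stops the script before the echo.
with=$(bash -c 'set -eo pipefail; false | cat; echo reached' 2>/dev/null || true)

echo "without pipefail: '${without}'"
echo "with pipefail:    '${with}'"
```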
Sprite
6171fa73a9 refactor: use generic_ssh_wait in GCP wait_for_cloud_init
- Call generic_ssh_wait() to establish SSH first
- Simplifies cloud-init waiting logic
- Reduces code duplication

Score: 27 (Impact: 6, Confidence: 9, Risk: 2)
2026-02-08 02:00:49 +00:00
Sprite
f6a16da0ab refactor: cache GCP username at file level
- Add GCP_USERNAME=$(whoami) at file scope
- Replace 6 duplicate whoami subprocess calls
- Eliminates redundant process invocations

Score: 40 (Impact: 4, Confidence: 10, Risk: 1)
2026-02-08 02:00:31 +00:00
Sprite
60524118de refactor: use generic_ssh_wait in AWS Lightsail wait_for_cloud_init
- Call generic_ssh_wait() with ubuntu user to establish SSH first
- Simplifies cloud-init waiting logic
- Reduces code duplication

Score: 16 (Impact: 4, Confidence: 8, Risk: 2)
2026-02-08 01:59:59 +00:00
Sprite
2b8806cc60 refactor: add braces to variable references in gcp/lib/common.sh
Fixed all 57 SC2250 shellcheck warnings by adding braces to variable
references. This improves code consistency and follows shellcheck
best practices.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:46:16 +00:00
Sprite
f883ff42e8 refactor: check exit codes directly instead of using $?
Fixed SC2181 warnings by refactoring indirect exit code checks
($? -ne 0) to direct command checks (if ! command). This improves
readability and follows shellcheck best practices.

Changes:
- aws-lightsail/lib/common.sh: Refactored aws lightsail create-instances check
- gcp/lib/common.sh: Refactored gcloud compute instances create check

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:29:56 +00:00
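
A minimal sketch of the SC2181 refactor in f883ff42e8. `create_instance` is a hypothetical stand-in for the `gcloud` or `aws` call, which is not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a provisioning command that fails.
create_instance() {
  echo "quota exceeded" >&2
  return 1
}

# Indirect form flagged by SC2181:
#   create_instance
#   if [ $? -ne 0 ]; then ...
# Direct form: test the command itself.
provision() {
  if ! create_instance 2>/dev/null; then
    echo "create failed"
    return 1
  fi
  echo "created"
}

result=$(provision) || true
echo "${result}"
```

Testing the command directly also removes the risk of another statement slipping in between the command and the `$?` check.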
Sprite
16b9b1eda1 refactor: fix SC2155 shellcheck warnings in GCP and AWS Lightsail libraries
Apply mechanical fix to split local variable declarations from command
substitution assignments. This prevents masking return values and aligns
with shellcheck best practices.

Files fixed:
- gcp/lib/common.sh - 9 warnings fixed
- aws-lightsail/lib/common.sh - 3 warnings fixed

Pattern applied: `local var=$(cmd)` → `local var; var=$(cmd)`

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:21:12 +00:00
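
The masking that SC2155 warns about, and that 16b9b1eda1 fixes, can be shown in a few lines. `fetch_ip` is a hypothetical command that always fails with status 3.

```shell
#!/usr/bin/env bash
fetch_ip() { return 3; }   # hypothetical failing command

masked() {
  # 'local' itself succeeds, so the failure of fetch_ip is hidden.
  local ip=$(fetch_ip)
  echo "$?"
}

split() {
  # Declaring first keeps the exit status of the command substitution.
  local ip
  ip=$(fetch_ip)
  echo "$?"
}

masked_status=$(masked)
split_status=$(split)
echo "masked=${masked_status} split=${split_status}"
```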
Sprite
0b4fe29026 refactor: fix SC2154 warnings for SSH_OPTS in provider libraries
Added shellcheck directive comments before first SSH_OPTS usage in:
- aws-lightsail/lib/common.sh
- gcp/lib/common.sh
- lambda/lib/common.sh
- vultr/lib/common.sh
- linode/lib/common.sh
- hetzner/lib/common.sh
- digitalocean/lib/common.sh

SSH_OPTS is defined in shared/common.sh but shellcheck can't detect
cross-file variable definitions, so we suppress the warning with
an explanatory comment.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:20:06 +00:00
Sprite
0ad6680f1f refactor: extract duplicate get_server_name logic to shared function
- Add get_resource_name() to shared/common.sh
  - Generic function for env-var-or-prompt pattern
  - Uses indirect expansion ${!var} for dynamic env vars
  - Preserves exact behavior: env check → prompt → error

- Update 9 cloud providers to use shared function:
  - aws-lightsail: LIGHTSAIL_SERVER_NAME
  - digitalocean: DO_DROPLET_NAME (with validation)
  - gcp: GCP_INSTANCE_NAME
  - hetzner: HETZNER_SERVER_NAME (with validation)
  - linode: LINODE_SERVER_NAME (with validation)
  - sprite: SPRITE_NAME (with validation)
  - vultr: VULTR_SERVER_NAME (with validation)
  - e2b: E2B_SANDBOX_NAME
  - modal: MODAL_SANDBOX_NAME

- Reduces code duplication: ~120 lines → ~25 lines
- Maintains backward compatibility (env vars, prompts, errors unchanged)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:16:20 +00:00
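
The env-var-or-prompt pattern from 0ad6680f1f might look roughly like the sketch below. The real `get_resource_name` in shared/common.sh is not shown in this log, so the function name, arguments, and error text here are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: the real get_resource_name may take different arguments.
get_resource_name_sketch() {
  local env_var_name="$1"
  local prompt_text="$2"
  # Indirect expansion: read the variable whose name is stored in env_var_name.
  local value="${!env_var_name:-}"

  if [[ -z "${value}" ]]; then
    printf '%s' "${prompt_text}" >&2
    IFS= read -r value || true
  fi
  if [[ -z "${value}" ]]; then
    echo "Error: a name is required" >&2
    return 1
  fi
  printf '%s\n' "${value}"
}

# Env var set: no prompt is needed.
GCP_INSTANCE_NAME="demo-vm"
from_env=$(get_resource_name_sketch GCP_INSTANCE_NAME "Instance name: " </dev/null)

# Env var unset: the value is read from stdin instead.
unset GCP_INSTANCE_NAME
from_prompt=$(echo "typed-vm" | get_resource_name_sketch GCP_INSTANCE_NAME "Instance name: " 2>/dev/null)

echo "from_env=${from_env} from_prompt=${from_prompt}"
```

The prompt goes to stderr so that stdout carries only the resolved name and can be captured by the caller.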
Sprite
a9818001da refactor: migrate 5 cloud libraries to shared/common.sh pattern
Migrated aws-lightsail, e2b, gcp, lambda, and modal libraries to use
the shared library pattern established in earlier refactoring rounds.

Changes:
- Added shared/common.sh sourcing block (local-first, remote-fallback)
- Added bash safety flags (set -eo pipefail) to all 5 libraries
- Removed duplicate provider-agnostic functions:
  - Logging (log_info, log_warn, log_error)
  - Input handling (safe_read)
  - OAuth flow (try_oauth_flow, get_openrouter_api_key_oauth)
  - SSH helpers (generate_ssh_key_if_missing, etc.)
- Retained cloud-specific functions:
  - API wrappers and CLI integration
  - Server provisioning and lifecycle management
  - Cloud-specific validation and configuration

Impact:
- Net reduction: 460 lines (545 removed, 85 added)
- Eliminates code duplication across 5 providers
- Improves consistency and maintainability
- All libraries now follow the same architectural pattern

Task: #3 (score 56 - highest priority)
Also addresses Task #4 (bash safety flags)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:08:34 +00:00
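
A local-first, remote-fallback sourcing block of the kind mentioned in a9818001da could be sketched as follows. The function name, the path, and the URL are placeholders, not the repository's actual values, and only the local branch is exercised in the demo.

```shell
#!/usr/bin/env bash
# Sketch only: path and URL are supplied by the caller and are placeholders here.
source_shared_sketch() {
  local local_path="$1"
  local remote_url="$2"
  local body

  if [[ -f "${local_path}" ]]; then
    # shellcheck disable=SC1090
    source "${local_path}"
    return
  fi
  # Fallback: fetch the library and evaluate it only if the download succeeded.
  body=$(curl -fsSL "${remote_url}") || return 1
  eval "${body}"
}

# Demo with a throwaway local file standing in for shared/common.sh.
tmp_dir=$(mktemp -d)
printf 'shared_loaded() { echo "from-local"; }\n' > "${tmp_dir}/common.sh"
source_shared_sketch "${tmp_dir}/common.sh" "https://example.invalid/shared/common.sh"
loaded_from=$(shared_loaded)
rm -rf "${tmp_dir}"
echo "${loaded_from}"
```

Capturing the download before `eval` means a failed fetch returns an error instead of silently evaluating an empty string.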
L
591066cd53
Use ${VAR:-} for all optional env var checks (#28)
Protects against 'unbound variable' errors even if set -u is
re-enabled or inherited. Every [[ -n "$UPPER_VAR" ]] pattern now
uses [[ -n "${UPPER_VAR:-}" ]] to safely default to empty.

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 16:28:12 -08:00
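
The effect of the `${VAR:-}` form in 591066cd53 can be checked with a short sketch. `EXAMPLE_TOKEN` is a hypothetical optional variable, and each case runs in a separate `bash -c` process so that `set -u` does not affect the calling shell.

```shell
#!/usr/bin/env bash
# EXAMPLE_TOKEN is a hypothetical optional variable; make sure it is unset.
unset EXAMPLE_TOKEN

# Bare expansion under set -u aborts the shell with "unbound variable".
bare=$(bash -c 'set -u; [[ -n "$EXAMPLE_TOKEN" ]] && echo set || echo empty' 2>/dev/null || echo "aborted")

# The :- form substitutes an empty default, so the test simply evaluates to false.
safe=$(bash -c 'set -u; if [[ -n "${EXAMPLE_TOKEN:-}" ]]; then echo set; else echo empty; fi')

echo "bare=${bare} safe=${safe}"
```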
L
6ac59e6bb3
Fix OAuth server for macOS bash 3.x (#24)
Three issues broke the OAuth callback server on macOS:

1. echo -e doesn't work in bash 3.x — \r\n appears as literal text
   in the HTTP response, browser gets malformed headers.
   Fix: pre-write response with printf to a file before the subshell.

2. local variables inside ( ... ) & subshell — undefined behavior in
   bash 3.x since subshells aren't function scope.
   Fix: use plain variables in subshells.

3. ((elapsed++)) when elapsed=0 evaluates to 0 and so returns a non-zero
   exit status, which makes set -e kill the script on the first iteration
   of the timeout loop.
   Fix: use elapsed=$((elapsed + 1)) instead.

Also simplified nc_listen detection to only check for BusyBox
(the -p flag check could misfire on macOS nc).

Applied to all 10 lib/common.sh files.

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 14:21:47 -08:00
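
Issue 3 in 6ac59e6bb3 is easy to reproduce in isolation. The sketch below covers only the counter increment, not the OAuth server itself, and runs each variant in its own `bash -c` process.

```shell
#!/usr/bin/env bash
# Post-increment yields the old value (0), so (( )) returns status 1
# and set -e terminates the script before the echo.
first=$(bash -c 'set -e; elapsed=0; ((elapsed++)); echo "elapsed=${elapsed}"' || echo "exited")

# A plain assignment always returns status 0, whatever the value.
second=$(bash -c 'set -e; elapsed=0; elapsed=$((elapsed + 1)); echo "elapsed=${elapsed}"')

echo "first=${first} second=${second}"
```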
L
85938a7b0d
Add GCP Compute Engine as eighth cloud provider with all 10 agents (#19)
GCP instances via gcloud CLI with startup-script for provisioning.
Uses current username for SSH (not root/ubuntu).
- gcp/lib/common.sh: gcloud wrapper, SSH key handling, instance lifecycle
- All 10 agents: claude, openclaw, nanoclaw, aider, goose, codex, interpreter, gemini, amazonq, cline

Matrix now 10 agents x 8 clouds = 80/80 implemented.

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 12:06:37 -08:00