Replace fly/lib/common.sh (741 lines of bash) with a TypeScript
implementation using Bun runtime. The fly/ provider was the most
complex bash code in the project — recent fixes (#1597, #1599, #1600)
highlight the pain of debugging HTTP calls, JSON parsing, and multi-step
auth flows in shell.
New TypeScript modules:
- fly/lib/ui.ts — logging, prompts, validation (zero deps)
- fly/lib/fly.ts — API client (fetch), auth chain, org listing, provisioning
- fly/lib/oauth.ts — OpenRouter OAuth via Bun.serve(), key management
- fly/lib/agents.ts — typed agent configs for all 6 agents
- fly/main.ts — orchestrator entry point
Agent .sh files become thin shims (~30 lines) that install bun if needed,
download TS sources for curl|bash execution, and delegate to main.ts.
Test coverage:
- 44 TypeScript unit tests (bun test) for pure logic
- 4 fly failure-mode tests (mock.sh) for error scenarios
- All existing test suites pass (110 run.sh, 76 mock.sh)
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mock bun shim was broken on CI (ubuntu-latest, no real bun):
- Only passed $2 to node, dropping -- field default args needed by _fly_json
- Didn't strip TypeScript annotations (: any[], as any) that node can't parse
Fixes:
- shift 2 to preserve extra args, forward them to both real bun and node
- sed -E strips TS type annotations before passing to node --input-type=module
- All fly tests now pass under the node-only CI fallback path
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves sub-issues #1569, #1570, #1576, #1577, #1578, #1580.
#1569 — /wait endpoint replaces polling loop:
_fly_wait_for_machine_start now uses GET /apps/{app}/machines/{id}/wait
?state=started&timeout=90. One blocking API call instead of 30 polls.
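A minimal sketch of how the URL for that single blocking call could be built. The base URL default and the helper name fly_wait_url are assumptions for illustration; the real _fly_wait_for_machine_start is not reproduced here.

```bash
# Assumption: base URL of the Fly Machines API; override via FLY_API_BASE.
FLY_API_BASE="${FLY_API_BASE:-https://api.machines.dev/v1}"

# Hypothetical helper: prints the /wait URL for one machine.
# Args: app name, machine ID, target state (default: started),
#       timeout in seconds (default: 90).
fly_wait_url() {
  local app="$1" machine_id="$2" state="${3:-started}" timeout="${4:-90}"
  printf '%s/apps/%s/machines/%s/wait?state=%s&timeout=%s\n' \
    "$FLY_API_BASE" "$app" "$machine_id" "$state" "$timeout"
}
```

The printed URL is then fetched with one authenticated GET, which returns once the machine reaches the requested state or the timeout expires.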
#1570 — fly machine exec replaces fly ssh console for run_server:
run_server uses 'fly machine exec MACHINE_ID --app APP -- bash -c cmd'
(direct API, no WireGuard tunnel) when FLY_MACHINE_ID is set. Falls
back to 'fly ssh console -C' for environments without a machine ID.
#1576 — App name collision loop capped at 5 retries:
Prevents infinite re-prompt. Suggests FLY_APP_NAME env var after 5
failed attempts.
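A sketch of a capped retry loop of this kind. The function name, the callback argument, and reading candidate names from stdin are all hypothetical; only the limit of 5 and the FLY_APP_NAME hint come from the change described above.

```bash
# Hypothetical helper. Reads candidate app names from stdin (one per line)
# and prints the first one that the given check command does not report as
# taken. Stops after 5 attempts instead of prompting forever.
pick_app_name() {
  local is_taken="$1" max_attempts=5 attempt=0 name
  while [ "$attempt" -lt "$max_attempts" ] && read -r name; do
    attempt=$((attempt + 1))
    if ! "$is_taken" "$name"; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  echo "No free app name after $attempt attempts; set FLY_APP_NAME to choose one up front" >&2
  return 1
}
```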
#1577 — destroy_server errors are now reported:
All fly_api calls check for error responses. Reports failed machine
deletions and exits non-zero on app deletion failure instead of
always logging "destroyed" regardless of outcome.
#1578 — bun replaced with python3 for all JSON parsing:
_fly_json_get, _fly_build_machine_body, _fly_list_orgs, destroy_server,
list_servers all use python3 -c now. python3 is universally available;
bun was only available after cloud-init completed on the target machine.
#1580 — upload_file uses stdin pipe instead of base64 string injection:
'fly machine exec ... -- bash -c "cat > path" < local_file' streams
file content directly. Eliminates the command-length/injection risk of
embedding base64 content in a shell argument string.
test/mock.sh: add 'fly machine exec' case to the fly CLI mock.
test/fixtures/fly/_env.sh: add FLY_MACHINE_ID to test env.
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* Revert "fix: handle raw m2. macaroon tokens from Fly.io CLI Sessions API (#1552)"
This reverts commit 9fc59ded1c.
* Revert "fix: replace bun -e with python3 in fly/lib/common.sh to fix 18 mock test failures (#1553)"
This reverts commit 328e6a6da4.
* fix: bun passthrough mock + restore Bun JSON parsing in fly/lib
Reverts PR #1553 (which replaced Bun with Python to fix tests)
and instead fixes the root cause: the test/mock.sh bun mock was a dumb
no-op that discarded all output, causing _fly_json_get() to return empty
string and every fly script to fail with "Failed to extract machine ID".
test/mock.sh — smart bun mock:
- `bun -e "..."` (inline eval, used for JSON processing) → delegates to
the real bun binary so _fly_json_get() / _fly_build_machine_body()
actually produce correct output during tests
- All other bun invocations (install, run, etc.) → logged no-op as before
fly/lib/common.sh:
- Restores Bun-based _fly_json_get(), _fly_build_machine_body(),
destroy_server machine-ID extraction, and list_servers table formatter
- Re-applies m2. macaroon token fix from #1552 (which was lost when
#1553 reverted the whole file):
_sanitize_fly_token now wraps raw m2.* tokens as "FlyV1 m2.*" so
CLI Sessions OAuth tokens are sent with the correct auth header
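The token wrapping can be sketched as follows. This is an illustration only: the function name is hypothetical and the real _sanitize_fly_token may handle more cases than shown.

```bash
# Hypothetical sketch: prefix raw macaroon tokens with the FlyV1 scheme.
sanitize_fly_token_sketch() {
  local token="$1"
  case "$token" in
    m2.*) printf 'FlyV1 %s\n' "$token" ;;  # raw macaroon: add the scheme prefix
    *)    printf '%s\n' "$token" ;;        # anything else passes through unchanged
  esac
}
```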
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* test: add node fallback to bun mock for CI environments
CI (GitHub Actions ubuntu-latest) has node but not bun, so the bun
passthrough mock silently returns empty string, causing _fly_json_get
to fail and 18 Fly.io tests to break. Add a fallback chain:
real bun -> node (with Bun.stdin.text() polyfill) -> exit 0.
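A simplified sketch of that fallback chain. The function names are hypothetical, and the Bun.stdin.text() polyfill used on the node path is omitted here.

```bash
# Prints the first available runtime: bun, then node, then "none".
pick_js_runtime() {
  if command -v bun >/dev/null 2>&1; then
    echo bun
  elif command -v node >/dev/null 2>&1; then
    echo node
  else
    echo none
  fi
}

# Hypothetical mock body: evaluate inline JS when a runtime exists,
# otherwise succeed without doing anything.
mock_bun_eval() {
  local script="$1"
  case "$(pick_js_runtime)" in
    bun)  bun -e "$script" ;;
    node) node -e "$script" ;;
    none) return 0 ;;
  esac
}
```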
Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The heredoc overrode piped stdin, so $response never reached python3.
sys.stdin.read() got empty input, making API error detection silently
fail during live fixture recording. Pass data via environment variables
instead.
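The failure mode and the fix can be reproduced in a few lines. In this sketch cat and sh stand in for python3, and the function names and the PAYLOAD variable are hypothetical.

```bash
# A pipe and a heredoc both target stdin; the heredoc wins, so the piped
# data never reaches the reader.
read_with_heredoc() {
  printf '%s' "$1" | cat <<'EOF'
from-heredoc
EOF
}

# Passing the payload through the environment leaves stdin free to carry
# the script itself.
read_with_env() {
  PAYLOAD="$1" sh <<'EOF'
printf '%s\n' "$PAYLOAD"
EOF
}
```

Calling read_with_heredoc with any argument prints only the heredoc text, while read_with_env prints the argument it was given.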
Agent: test-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The function only had a success branch — when temp files were leaked,
it silently returned without incrementing FAILED or printing output.
Add the missing else branch so leaked temp files are detected.
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The MC002 regex matched both `echo -e` and `echo -n`, but only
`echo -e` is non-portable on macOS bash 3.2. `echo -n` works fine
as a bash builtin. This caused 3 false positive errors (all TTY
probe patterns using `echo -n "" > /dev/tty`) making the linter
exit non-zero incorrectly.
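One way to express the narrowed rule, as a sketch. The pattern below is an assumption; the linter's actual MC002 regex is not shown in this message.

```bash
# Hypothetical check: succeeds only when the line uses echo with a flag
# group containing "e" (for example -e or -ne). A plain "echo -n" does not match.
flags_echo_e() {
  grep -Eq '(^|[^[:alnum:]_])echo[[:space:]]+-[a-zA-Z]*e' <<<"$1"
}
```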
Agent: test-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
printf -v was introduced in bash 4.0 but macOS ships bash 3.2.
_update_retry_interval() in shared/common.sh used printf -v and is called
from generic_ssh_wait and _cloud_api_retry_loop — meaning ALL SSH
connectivity checks and cloud API retries would fail on macOS with:
"printf: -v: invalid option"
Changes:
- shared/common.sh: replace printf -v with eval in _update_retry_interval()
- shared/common.sh: remove dead code in calculate_retry_backoff() where
next_interval was computed but never used
- shared/key-request.sh: same printf -v fix
- test/macos-compat.sh: add MC013 rule to catch printf -v in future
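A sketch of assigning to a caller-named variable with eval instead of printf -v. The helper name is hypothetical and the code in shared/common.sh may differ; the name check is included because eval executes whatever it is given.

```bash
# Hypothetical helper: set the variable named by $1 to the value $2.
set_var_by_name() {
  local __name="$1" __value="$2"
  # Reject empty names and anything that is not a plain identifier.
  case "$__name" in
    ''|[!a-zA-Z_]*|*[!a-zA-Z0-9_]*) return 1 ;;
  esac
  # Only the validated name is interpolated; the value is expanded by eval
  # inside double quotes, so its contents are never parsed as code.
  eval "$__name=\"\$__value\""
}
```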
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Prevent cloud provider API tokens from being visible in ps aux output by passing Authorization headers via curl's -K - (config from stdin) instead of command-line arguments.
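A minimal sketch of the stdin-config approach. The helper name and the endpoint in the usage comment are placeholders, and the sketch assumes the token contains no double quotes or backslashes, which would need escaping in curl's config syntax.

```bash
# Hypothetical helper: emits a curl config line carrying the Authorization
# header. printf is a shell builtin, so the token does not show up as a
# process argument.
auth_config() {
  printf 'header = "Authorization: Bearer %s"\n' "$1"
}

# Usage (not executed here; the URL is a placeholder):
#   auth_config "$API_TOKEN" | curl -sS -K - "https://api.example.com/v1/servers"
```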
- Add SPAWN_SKIP_API_VALIDATION=1 and SPAWN_SKIP_GITHUB_AUTH=1 to
sprite test environment so verify_openrouter_key() doesn't make real
HTTP calls with the fake test key (which gets 401, clears the key,
and falls into OAuth — causing all sprite assertions to fail)
- Update agent iteration lists from stale "claude openclaw nanoclaw" to
current "claude openclaw codex opencode kilocode zeroclaw"
- Remove dead nanoclaw case from _assert_agent_specific
- Remove 5 dead agent cases (nanoclaw, cline, gptme, plandex, continue)
from _shared_agent_assertions.sh, add zeroclaw
Result: 108 passed, 0 failed (was: 48 passed, 18 failed)
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The json_escape fallback (used when python3 is unavailable) only escaped
backslashes and double quotes, producing invalid JSON when input contained
newlines, tabs, or carriage returns. This could cause JSON injection in
API request bodies sent to cloud providers (Hetzner, DigitalOcean, Fly.io)
and corrupt credential config files.
Add escaping for \n, \r, and \t in the fallback path. The python3 primary
path (json.dumps) was already correct.
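A sketch of a pure-bash fallback with the extra escapes. The function name is hypothetical, the output is assumed to include the surrounding quotes, and other control characters are still left unescaped in this sketch.

```bash
# Hypothetical fallback: prints $1 as a quoted JSON string.
json_escape_fallback() {
  local s="$1"
  s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}      # double quote
  s=${s//$'\n'/\\n}    # newline
  s=${s//$'\r'/\\r}    # carriage return
  s=${s//$'\t'/\\t}    # tab
  printf '"%s"\n' "$s"
}
```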
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Why: `set -eo pipefail` + `output=$(shellcheck ...)` on line 659 of
test/run.sh causes immediate exit when shellcheck finds any warning,
preventing the entire shell test suite from running. 53 CLI tests also
fail due to stale assertions after agents/clouds were removed in recent
PRs.
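The early exit and the guard against it can be reproduced in isolation. In this sketch noisy_linter is a stand-in for shellcheck and run_suite is a hypothetical wrapper, not code from test/run.sh.

```bash
# Stand-in for shellcheck: prints a finding and exits non-zero.
noisy_linter() {
  echo "warning: example finding"
  return 1
}

# Runs in a subshell so the strict options do not leak into the caller.
run_suite() (
  set -eo pipefail
  # Without "|| true", errexit would end the run on this assignment.
  output=$(noisy_linter) || true
  printf 'captured: %s\n' "$output"
  echo "suite continues"
)
```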
Fixes:
- test/run.sh:659 — add `|| true` to shellcheck command substitution so
shell test suite runs to completion even when scripts have warnings
- manifest-real-data.test.ts — lower agent count min from 10→5,
matrix count min from 80→40 (now 6 agents, 48 matrix entries)
- agent-env-injection-contract.test.ts — lower script count min
from 70→40 (now 47 implemented scripts)
- script-conventions.test.ts — same script count fix (70→40)
- cloud-lib-source-chain.test.ts — lower cloud lib min from 9→8
(OVH removed, now 8 clouds)
- commands-credential-display-internals.test.ts — add missing
@clack/prompts mock (tests call p.log.error but never mocked it)
- commands-exported-helpers-edges.test.ts — fix environment-dependent
assertion: only check credential-based hintOverrides, not
CLI-installed ones (sprite CLI is installed in CI/dev)
- agent-config-setup.test.ts — fix stale model ID assertion
("openrouter/anthropic/..." → "anthropic/...") and stale mkdir
command ("rm -rf && mkdir" → "mkdir -p")
- agent-info-quickstart.test.ts — remove sprite from singleAuthManifest
fixture (sprite CLI installed causes sprite to be prioritized over
hetzner, breaking 4 tests); update count assertions for single cloud
Agent: team-lead
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Why: test/record.sh used local -n (bash 4.3+ namerefs) which crashes
on macOS's default bash 3.2, breaking contributor workflow for recording
API fixtures. Fixes #1480.
Inlines the _export_env_vars_from_fields helper directly into
_load_multi_config_from_file, eliminating the nameref dependency while
preserving the security validation of env var names.
Agent: team-lead
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Delete 32 agent scripts ({cloud}/{cline,gptme,plandex,continue}.sh across
8 clouds), remove the 4 agents from manifest.json with all their matrix
entries, update README matrix rows, remove stale mock agent binaries and
plandex.ai URL patterns from test harness, update CLI help examples to use
remaining agents, and bump version 0.5.7 → 0.5.8.
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Remove OVH as a cloud provider: delete ovh/ directory (lib + 11 agent
scripts), remove from manifest.json clouds and all ovh/* matrix entries,
update README matrix table, remove OVH destroy case in CLI commands,
and clean up all test harness references (mock.sh, mock-curl-script.sh,
record.sh, e2e.sh, cloud-lib-api-surface.test.ts, test-infra-sync.test.ts)
- Make featured_cloud an array (string[]) so agents can recommend multiple
clouds; update manifest.ts type, all 10 manifest.json values, and the
prioritizeCloudsByCredentials() comparison in commands.ts
- Sandbox OAuth in subprocess tests: add OPENROUTER_API_KEY=sk-or-test-fake
to the default env in cli-entry-edge-cases.test.ts and
cmdrun-resolution.test.ts so get_or_prompt_api_key() never triggers the
real OAuth browser flow during test runs
- Fix upload-file-security.test.ts SSH cloud count (5→4) after OVH removal
- Bump CLI version 0.5.6 → 0.5.7
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
These 5 agents are being dropped from the Spawn matrix. This removes
45 agent scripts across 9 clouds, cleans the manifest, test fixtures,
READMEs, CLI source, and shared library comments.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. _multi_creds_validate referenced undefined help_url variable, causing
empty "Get new credentials from: " error messages when OVH credential
validation fails. Added help_url as parameter and pass it from caller.
2. _spawn_inject_env_vars (used by 130+ agent scripts via spawn_agent)
uploaded credentials to static /tmp/env_config path. The older
inject_env_vars_ssh/inject_env_vars_cb functions document this as a
symlink attack vector and use randomized paths. Fixed to match.
3. Removed dead inject_env_vars_fly and inject_env_vars_sprite functions
(all agent scripts now use spawn_agent -> _spawn_inject_env_vars).
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds
aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.
Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.
Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test: update aider/gptme/interpreter assertions from pip to uv
The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).
Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: trigger test run for uv assertion fix
* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds
- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
$escaped_cmd that prevented the remote shell from consuming escapes,
breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
where it escaped shell operators into literal characters since daytona exec
has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add --pty to fly ssh console for interactive sessions
fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.
Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Oracle Cloud is removed as a supported provider. Each agent now has a
`featured_cloud` field in manifest.json that controls cloud sort order
in the CLI picker — featured clouds appear after credential-detected
clouds but before CLI-installed ones, with a "recommended" hint.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add QA upgrade — macOS compat linter, per-agent mock assertions
Layer 1: macOS compat linter (test/macos-compat.sh)
- 12 rules (MC001–MC012) catching bash 3.2 incompatibilities
- Detects: base64 -w0 file args, non-portable echo flags, source <(),
((var++)), read -d, nounset flag, sed -i, date %N, local -n,
declare -A, ${var,,}, and |&
- Added to CI lint.yml in warn-only mode for burn-in
- Integrated as Phase 0.5 in qa-dry-run.sh
Layer 2: Per-agent mock assertions
- test/fixtures/_shared_agent_assertions.sh with install checks
for all 15 agents (claude, openclaw, aider, goose, etc.)
- Integrated into test/mock.sh via _run_agent_assertions()
Also includes branch fixes:
- Fix base64 -w0 to use stdin redirect (aws, daytona, fly)
- Fix fly/openclaw to use npm install instead of broken curl|bash
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add E2E test harness and integrate into QA pipeline
Add test/e2e.sh — a full E2E test harness that provisions real servers,
installs agents, and verifies setup across all clouds. Features:
- Smoke test (one canary agent per cloud) and full matrix modes
- Credential auto-detection for 8 clouds
- Per-cloud preflight validation (sequential) then parallel agent tests
- Stale server cleanup, timing history, cross-cloud comparison
- Auto-fix and optimization phases via Claude agents
- macOS bash 3.2 compatible
Integrate E2E as Phase 5 in both qa-cycle.sh and qa-dry-run.sh:
- Runs after mock tests pass, gated on cloud credentials
- Phase 5b auto-fixes failures using per-agent worktree branches
- Parses results and includes in QA summary
Also fixes:
- shared/common.sh: honour SPAWN_NON_INTERACTIVE=1 in safe_read()
- aws/lib/common.sh: fix SSH key import (use cat instead of base64,
handle race condition on concurrent imports)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
test/run.sh (3 failures fixed):
- Export TEST_DIR so sprite mock tracks create→list state across processes
- Add sleep mock to avoid 30s polling loops in ensure_sprite_exists
- Add timeout/gtimeout, python3 pass-through mocks for host protection
- Set HOME to fake home for isolation, create fake home directory structure
- Clean up /tmp/spawn_* temp files in cleanup trap
test/mock.sh (29 failures fixed):
- Fix fly mock to detect "echo ok" in fly ssh console -C arguments
(including printf %q escaped form) so _fly_wait_for_ssh() succeeds
- Add timeout/gtimeout pass-through mocks to prevent system calls
- Add python3 delegate mock for JSON parsing in shared/common.sh
- Clean up /tmp/spawn_* temp files in cleanup trap
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #1409
The bash sandbox test now verifies that test runs don't create or
modify agent-specific directories and configuration files:
- Checks that ~/.openclaw, ~/.sprite, and ~/.claude directories are
not created by test runs
- Verifies ~/.claude.json and ~/.claude/settings.json are not modified
during tests (using mtime comparison to handle pre-existing files)
- Skips checks for directories/files that existed before tests ran to
avoid false positives in development environments
This ensures tests remain properly sandboxed and don't pollute the
production environment with agent artifacts.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixes #1403
**Changes:**
1. **test/run.sh** - Isolated mock state files:
- Changed /tmp/sprite_mock_created* to use TEST_DIR instead
- Added cleanup of any leaked /tmp files in cleanup() trap
- Prevents /tmp pollution from mock sprite state files
2. **test/record.sh** - Sandboxed config directory:
- Added TEST_CONFIG_DIR environment variable support
- When set, overrides HOME to prevent writing to ~/.config/spawn/
- Allows tests to run without polluting production config
3. **test/qa-dry-run.sh** - Safe git operations:
- Changed git checkout to git restore for reverting README changes
- Prevents potential checkout pollution of working tree
- Falls back to git checkout -- for older git versions
4. **test/test-sandbox.sh** - New verification test:
- Verifies no /tmp pollution after test/run.sh
- Verifies production config not modified
- Verifies mock.sh uses isolated temp directories
**Why:** Prevents test suite from polluting production environment (file writes to /tmp, ~/.config/spawn/, git state mutations).
Agent: test-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Fly.io had zero test coverage — every bug fixed this session (stale
tokens, FlyV1 auth, name-taken failures, SSH hangs, PATH issues) went
undetected. This adds the full mock test infrastructure:
- test/fixtures/fly/ — env vars, API assertions, fixture JSONs for
app creation, machine creation, and token validation endpoints
- test/mock-curl-script.sh — URL stripping for api.machines.dev,
body validation for machine creation, synthetic status responses,
app creation POST handler, state tracking
- test/mock.sh — mock fly/flyctl CLI binary (ssh console, auth token),
URL stripping, required field validation, base64 mock
- test/record.sh — Fly.io REST endpoints now recordable, live
create+delete cycle, error detection, auth var mapping
All 15 agent scripts (aider, claude, openclaw, etc.) are automatically
discovered and tested: 75 passed, 0 failed.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: auto-run gcloud auth login on expired GCP tokens
Instead of telling users to run `gcloud auth login` manually, just
run it automatically when auth check fails or instance creation hits
a reauthentication error, then retry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: prioritize clouds with CLI installed + hcloud CLI integration
When selecting a cloud provider, clouds are now sorted in 3 tiers:
1. Credentials detected (env vars set) — top priority
2. CLI installed (e.g., gcloud, hcloud, aws) — middle priority
3. Neither — default order
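The three tiers can be sketched as a stable sort. The CLI itself implements this in TypeScript; the shell version below, its function names, and its "name has_creds has_cli" input format are assumptions made for illustration.

```bash
# Prints the tier for one cloud: 1 = credentials, 2 = CLI installed, 3 = neither.
cloud_tier() {
  local has_creds="$1" has_cli="$2"
  if [ "$has_creds" = "yes" ]; then
    echo 1
  elif [ "$has_cli" = "yes" ]; then
    echo 2
  else
    echo 3
  fi
}

# Reads "name has_creds has_cli" lines and prints the names ordered by tier,
# keeping the input order within each tier.
sort_clouds() {
  local n=0 name creds cli
  while read -r name creds cli; do
    n=$((n + 1))
    printf '%s %s %s\n' "$(cloud_tier "$creds" "$cli")" "$n" "$name"
  done | sort -k1,1n -k2,2n | awk '{print $3}'
}
```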
Also adds hcloud CLI-first support for Hetzner operations (server
create/delete/list, SSH key management, auth) with automatic fallback
to the existing REST API when hcloud is not available.
Closes #1370
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: rename aws-lightsail to aws across the project
Simplifies the cloud key from "aws-lightsail" to "aws" — AWS should
have a single entry regardless of the underlying service used.
Renames the directory, updates manifest.json matrix keys, CLI map,
test fixtures, README, and all agent scripts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All cloud claude.sh scripts had inline curl-only installs with no fallback.
When the curl installer failed (transient outage, rate limit), installation
failed with no recovery. Additionally, fnm-installed Node.js was invisible
to subsequent SSH sessions because each SSH command runs in a non-interactive
shell that doesn't source .bashrc/.zshrc.
Changes:
- Migrate 8 cloud scripts to use shared install_claude_code (curl → npm → bun)
- Move _ensure_node_runtime before npm/bun install attempts (not after)
- Add fnm paths to claude_path so node is discoverable across SSH sessions
- Prefix npm/bun install commands with claude_path for PATH visibility
- Update test assertion to match new install_claude_code behavior
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reduce from 41 cloud providers to 10 (9 + local) curated for launch:
- local (free), oracle (free tier), hetzner (~€3.29/mo), ovh (~€3.50/mo),
fly (free tier), aws-lightsail ($3.50/mo), daytona (pay-per-second),
digitalocean ($4/mo), gcp ($7.11/mo), sprite (Fly.io VMs)
Changes:
- Remove 30 cloud directories, test fixtures, and provider-specific tests
- Slim manifest.json from 600 to 150 matrix entries, sorted by price
- Update CLAUDE.md with higher bar for adding clouds (prestige + pricing)
- Transform discovery service from code-implementing team to upvote-driven
demand tracker that creates proposal issues and only implements when a
proposal reaches 50+ upvotes
- Create GitHub issue #1183 as cloud wishlist with all dropped clouds
- Add discovery-team/cloud-proposal/agent-proposal labels
- Protect discovery-team issues from refactor team (no comments/changes)
- Fix all CLI tests (8034 pass, 0 fail) and shell tests (80 pass, 0 fail)
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mock test assertion was checking for GET /sshkeys but the actual
Scaleway API endpoint is /ssh-keys (with a hyphen), causing all 15
scaleway agent tests to fail the "fetches SSH keys" check.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract pattern-matching logic in _strip_api_base() into separate helper functions (_strip_gcore_endpoint, _strip_scaleway_endpoint), replacing the 36-line function body with organized cases that delegate to the extracted handlers.
Refactor ensure_api_token_with_provider() in shared/common.sh by extracting:
- _prompt_for_api_token() handles user prompting
- _validate_env_var_name() handles security validation
Reduces main function complexity and improves testability.
Agent: complexity-hunter
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Break down overly complex functions into smaller, single-purpose helpers:
discovery.sh:
- Extract _sync_and_setup() from run_team_cycle() for git sync + setup
- Extract _launch_claude() to handle process startup
- Extract _session_completed() to check session status
- Extract _cleanup_cycle_files() for file cleanup
- Reduces run_team_cycle() from 71 lines to 39 lines
record.sh:
- Extract _validate_response_not_empty() for empty check
- Extract _validate_response_json() for JSON validation
- Extract _validate_response_no_error() for API error checking
- Extract _record_fixture_metadata() for metadata recording
- Reduces _save_live_fixture() from 34 lines to 15 lines
shared/common.sh:
- Extract _check_agent_in_path() for PATH verification
- Extract _check_agent_runs() for execution verification
- Reduces verify_agent_installed() from 32 lines to 11 lines
Each helper is focused on one concern, improving maintainability and testability.
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Extract 60+ line nested case statement in _validate_body() into
dedicated _get_required_fields() function using cloud:endpoint pattern
matching. Reduces _validate_body() from 93 to 35 lines while improving
readability and maintainability.
Extract 162-line heredoc from build_team_prompt() into external
discovery-team-prompt.txt template file. Reduces function to 6 lines,
making discovery.sh more maintainable.
All 80 bash tests pass. No functionality change.
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Adds missing test infrastructure functions that were previously only in
mock-curl-script.sh but required by test-infra-sync.test.ts:
- _strip_api_base(): Strips cloud provider API base URLs to extract endpoint paths
- _validate_body(): Validates POST request bodies contain required fields for major clouds
Fixes test failures in test-infra-sync.test.ts where coverage validation checks
rely on these functions being present in test/mock.sh.
Agent: test-engineer
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Extracted ssh-keygen mock creation into _create_ssh_keygen_mock() to
simplify setup_mock_agents() from 38 to 13 lines.
Extracted validation and response handling in test/record.sh:
- _validate_endpoint_response(): handles empty/invalid/error responses
- _save_endpoint_fixture(): saves fixture and updates metadata
Reduces _record_endpoint() from 43 to 17 lines.
Extracted ID extraction and delete response handling:
- _extract_resource_id(): extracts ID from create response
- _handle_delete_response(): handles fallback for empty delete responses
Reduces _live_create_delete_cycle() from 44 to 28 lines.
All 79 tests pass.
Agent: complexity-hunter
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Extracted the large 270-line embedded mock curl script from the
setup_mock_curl() function into a separate file (mock-curl-script.sh).
This reduces setup_mock_curl() from 270 lines to 6 lines, improving
readability and maintainability.
The refactoring:
- Creates test/mock-curl-script.sh with all mock curl implementation
- Simplifies setup_mock_curl() to copy the external script
- Maintains identical functionality (all tests pass)
- Makes the mock curl logic easier to understand and modify
Agent: complexity-hunter
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Added _api_assertions.sh fixtures for binarylane, genesiscloud, hyperstack, kamatera, latitude, ovh, scaleway, and upcloud to enable comprehensive mock test coverage. Updated _validate_body() in test/mock.sh to validate POST request bodies for all cloud providers, ensuring payload correctness. Fixed syntax error in gcore validation (!! to ;;).
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Extracted helper functions to reduce cyclomatic complexity:
test/mock.sh:
- Extract _wait_with_timeout() from run_script_with_timeout() (reduced from 32→17 lines)
- Extract _setup_test_env() and _record_categorized_result() from run_test() (reduced from 50→26 lines)
test/record.sh:
- Refactor has_api_error() to use lambda dict for cloud-specific checks (improved readability, same logic)
- Extract _format_env_var_display() from list_clouds() to eliminate nested loop (reduced from 48→32 lines)
All functions maintain identical behavior and pass syntax validation.
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Extract assertion tracking and fixture detection logic in mock.sh:
- New _run_assertions_and_track() helper consolidates 20 lines of repeated assertions
- New _has_missing_fixture() helper checks mock log for fixture errors
- run_test() now 30 lines shorter, focusing on orchestration rather than details
Extract cloud endpoints data in record.sh:
- Replace 132-line case statement with data-driven approach
- Each cloud's endpoints now live in _ENDPOINTS_{cloud} variable
- get_endpoints() function reduced to 3 lines, delegates to variable lookup
Benefits:
- Reduced cognitive load: test logic separated from data
- Easier to add new clouds: just add _ENDPOINTS_* variable
- Better maintainability: centralized endpoint definitions
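A sketch of the variable-lookup approach. The endpoint values below are invented placeholders, and the sketch assumes cloud names that are valid in a variable name (no hyphens).

```bash
# Placeholder data: one variable per cloud (values are illustrative only).
_ENDPOINTS_hetzner="servers ssh_keys"
_ENDPOINTS_fly="apps machines"

# Looks up _ENDPOINTS_<cloud> via indirect expansion; prints an empty line
# for an unknown cloud.
get_endpoints() {
  local var="_ENDPOINTS_$1"
  printf '%s\n' "${!var:-}"
}
```

Adding a cloud then means adding one more _ENDPOINTS_* assignment, with no change to get_endpoints.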
Tests: All 80 tests pass with fixtures enabled.
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The Gcore PR (#1079) introduced `!!` instead of `;;` as case statement
terminators in 4 places, causing a syntax error on line 542 that breaks
all fixture recording.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
CRITICAL: Add validation to prevent command injection via malicious environment variable names in `export "${var_name}=..."` patterns.
Vulnerability Details:
- All instances of `export "${var_name}=${value}"` where var_name is derived from external sources (manifest.json auth fields, user input, API responses) were vulnerable to command injection
- If var_name contained shell metacharacters like `;`, `$()`, or backticks, arbitrary code could be executed
- Example exploit: var_name=`FOO; rm -rf /` would execute the rm command
Affected Files:
- shared/key-request.sh: _try_load_env_var() - var_name from manifest.json
- shared/common.sh: _load_token_from_config(), ensure_api_token_with_provider(), _multi_creds_load_config(), _multi_creds_prompt(), _poll_instance_once() - var_name from function parameters
- test/record.sh: _load_multi_config_from_file(), _try_load_cloud_config(), _prompt_cloud_creds_interactive() - var_name from test fixtures
Fix Applied:
- Added regex validation before all export statements: `^[A-Z_][A-Z0-9_]*$`
- This allowlist enforces the conventional uppercase form of POSIX environment variable names (uppercase letters, digits, and underscores only; must start with an uppercase letter or underscore)
- Returns error if validation fails, preventing injection
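A minimal sketch of the validate-then-export pattern, using the regex quoted above. The helper names _validate_env_var_name and _safe_export are hypothetical; the actual functions in shared/common.sh may inline the check instead:

```bash
#!/usr/bin/env bash
# Hypothetical helper: accept only names matching ^[A-Z_][A-Z0-9_]*$.
_validate_env_var_name() {
    local name="$1"
    # Force the C locale so [A-Z] cannot match lowercase letters under locale collation.
    local LC_ALL=C
    if [[ ! "$name" =~ ^[A-Z_][A-Z0-9_]*$ ]]; then
        printf 'Invalid environment variable name: %q\n' "$name" >&2
        return 1
    fi
}

# Hypothetical wrapper: export only after the name passes validation.
_safe_export() {
    local name="$1" value="$2"
    _validate_env_var_name "$name" || return 1
    export "${name}=${value}"
}
```

For example, `_safe_export HCLOUD_TOKEN "$token"` succeeds, while `_safe_export 'FOO; rm -rf /' x` returns 1 without calling export.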
Impact:
- While current usage passes hardcoded env var names (e.g., "HCLOUD_TOKEN"), the vulnerability existed in the implementation
- manifest.json is currently trusted, but defense-in-depth prevents supply chain attacks or accidental malformed entries
- Test infrastructure was also vulnerable to malicious fixture data
Agent: security-auditor
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Add ServerSpace (serverspace.io) as a new cloud provider with global
locations (EU, US, Asia). Uses REST API with X-API-KEY auth and async
task-based server creation with polling.
- serverspace/lib/common.sh: Full provider library with API wrapper,
SSH key management, server provisioning with cloud-init, task polling
- serverspace/claude.sh: Claude Code agent deployment
- serverspace/aider.sh: Aider agent deployment
- serverspace/goose.sh: Goose agent deployment
- manifest.json: Cloud definition + 15 matrix entries (3 implemented)
- test/mock.sh: URL stripping, body validation, synthetic responses
- test/record.sh: Endpoints, auth, API calls, error detection
- test/fixtures/serverspace/: Mock fixtures for all API endpoints
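The task polling mentioned above can be sketched as a generic retry loop. This is an illustration only: _poll_until is a hypothetical helper, and the real serverspace/lib/common.sh would pass a check that queries the provider's task status, which is not shown here:

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a check command until it succeeds or attempts run out.
# Usage: _poll_until MAX_ATTEMPTS DELAY_SECONDS CHECK_COMMAND [ARGS...]
_poll_until() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # The check runs in the current shell, so it may update caller state.
        if "$@"; then
            return 0
        fi
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    # All attempts exhausted without the check succeeding.
    return 1
}
```

A caller would wrap the task-status request in a function that returns 0 once the task is complete, then invoke something like `_poll_until 60 5 that_function`.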
Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- test/mock.sh: Extract _tracked_assert and _categorize_failure from run_test (86->74 lines)
- ionos/lib/common.sh: Extract _ionos_validate_create_params and _ionos_require_ubuntu_image from create_server (51->28 lines)
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Civo tests failed because networks.json, disk_images.json, and
correctly-named sshkeys.json fixtures were missing. Hetzner tests
failed because datacenters.json was missing (needed for server type
validation). Scaleway tests failed because SCW_DEFAULT_PROJECT_ID
was missing from env, images.json had no Ubuntu images, and
create_server.json fixture was absent.
Also adds Civo and Scaleway to mock's _synthetic_active_response
for instance polling, and fixes Scaleway account API URL stripping.
Results: 435 passed, 0 failed, 1 skipped (previously 270/165/1).
Agent: pr-maintainer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(security): harden weak crypto fallbacks, key validation, and temp paths
- CSRF state generation: fail instead of using predictable date+$RANDOM
fallback when openssl and /dev/urandom are unavailable (OAuth CSRF bypass)
- Kamatera password: fail instead of using predictable date-based password
when no secure random source available
- key-server validKeyVal: enforce 8-512 char limits and ASCII-only check
to block malformed/oversized values (Fixes #969)
- upload_config_file: use mktemp-derived randomness for remote temp paths
instead of predictable $RANDOM (symlink attack on remote server)
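A minimal sketch of the fail-closed behavior described for CSRF state generation. The function name _generate_csrf_state and the 16-byte length are assumptions for illustration; the point is that the final branch returns an error rather than falling back to date+$RANDOM:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print 32 hex characters from a secure source, or fail.
_generate_csrf_state() {
    if command -v openssl >/dev/null 2>&1; then
        # 16 random bytes, hex-encoded.
        openssl rand -hex 16
    elif [ -r /dev/urandom ]; then
        # Same amount of entropy read directly from the kernel CSPRNG.
        od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
        printf '\n'
    else
        # No predictable fallback: refuse to produce a guessable state value.
        printf 'No secure random source available\n' >&2
        return 1
    fi
}
```

Callers are expected to check the exit status, for example `state="$(_generate_csrf_state)" || return 1`, so the OAuth flow aborts instead of continuing with a weak value.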
Agent: security-auditor
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): update assertions for upload_config_file mktemp-derived paths
The upload_config_file function now uses mktemp-derived basenames
(spawn_config_tmp.XXX) instead of the original filename for remote temp
paths. Update test/run.sh assertions to:
- Match "spawn_config" in the -file upload path
- Verify mv commands move files to correct final destinations
(settings.json, .claude.json)
Addresses reviewer feedback on PR #1039.
Agent: pr-maintainer
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>