vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-05-09 19:49:58 +00:00

Author	SHA1	Message	Date
A	f3ffb6caed	fix: broken error message in multi-creds validation, predictable temp path (#1442 ) 1. _multi_creds_validate referenced undefined help_url variable, causing empty "Get new credentials from: " error messages when OVH credential validation fails. Added help_url as parameter and pass it from caller. 2. _spawn_inject_env_vars (used by 130+ agent scripts via spawn_agent) uploaded credentials to static /tmp/env_config path. The older inject_env_vars_ssh/inject_env_vars_cb functions document this as a symlink attack vector and use randomized paths. Fixed to match. 3. Removed dead inject_env_vars_fly and inject_env_vars_sprite functions (all agent scripts now use spawn_agent -> _spawn_inject_env_vars). Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 07:51:28 -05:00
Ahmed Abushagur	db4aaa0c73	fix: prevent SSH hangs, fix command escaping, pin Python 3.12 for aider (#1439 ) * fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds aider-chat on Python 3.13 fails with `ImportError: cannot import name '_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved — those releases have no Python 3.13 binary wheels, so the C extension is missing at runtime. Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>` and single quotes get mangled by `printf '%q'` in run_server before the command reaches the remote machine) with `--upgrade`, which forces all transitive deps including Pillow to their latest compatible versions. Also adds a plain-text echo before the install so users see progress instead of a silent hang during the 2-4 minute install. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: update aider/gptme/interpreter assertions from pip to uv The install method for aider, gptme, and open-interpreter was changed from pip to `uv tool install` across all clouds. The mock test assertions still checked for the old `pip.install.` patterns, causing 9 failures (3 agents × 3 clouds). Update patterns to match the actual `uv tool install` commands now used in all cloud scripts. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger test run for uv assertion fix * fix: prevent SSH hangs, restore stderr, fix command escaping across clouds - Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH stdin theft causing sequential install/verify/configure steps to hang - Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default SSH_OPTS so long-running installs don't silently drop on flaky networks - Remove 2>/dev/null from Fly.io run_server so remote command errors are no longer silently swallowed (--quiet flag still suppresses flyctl noise) - Fix Fly.io printf '%q' double-quoting: remove extra quotes around $escaped_cmd that prevented the remote shell from consuming escapes, breaking && \|\| \| operators in commands - Remove broken printf '%q' from Daytona run_server and interactive_session where it escaped shell operators into literal characters since daytona exec has no intermediate shell layer - Pin aider to --python 3.12 instead of --with audioop-lts across all clouds Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add --pty to fly ssh console for interactive sessions fly ssh console -C does not allocate a pseudo-terminal by default, causing interactive TUI agents (aider, claude) to fail with "Input is not a terminal (fd=0)" or completely unresponsive input. Adding --pty forces PTY allocation, matching how other clouds handle interactive sessions (SSH uses -t, Sprite uses -tty). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-18 04:23:15 -05:00
A	8a4f5873f9	feat: remove Oracle Cloud, add featured_cloud per agent (#1430 ) Oracle Cloud is removed as a supported provider. Each agent now has a `featured_cloud` field in manifest.json that controls cloud sort order in the CLI picker — featured clouds appear after credential-detected clouds but before CLI-installed ones, with a "recommended" hint. Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-17 22:52:41 -08:00
Ahmed Abushagur	633ce8eaac	feat: upgrade default server sizes, fix Fly.io agent installs, improve E2E tests (#1428 ) - Upgrade default VM sizes across clouds for better agent performance: - Hetzner: cpx11 → cx23 (with cx22 fallback support for deprecated types) - DigitalOcean: s-2vcpu-2gb → s-2vcpu-4gb - Daytona: 2048MB → 4096MB memory - Oracle: VM.Standard.E2.1.Micro → VM.Standard.A1.Flex - OVH: d2-2 → d2-4 - Fix Fly.io agent failures: - Add Node.js + build-essential to wait_for_cloud_init (fixes npm-based agents) - Prepend PATH in interactive_session (fixes "source not found" errors) - Fix openclaw installs across clouds: use explicit PATH export instead of source - Fix DigitalOcean token validation (check "uuid" not "id") - Fix AWS cloud-init: chown .bashrc/.zshrc to ubuntu user - Improve Hetzner fallback: add "cheapest available" as last-resort fallback - Upgrade E2E tests: per-combo auto-fix, credential collection, robustness fixes Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 22:17:08 -08:00
Ahmed Abushagur	22b6a402f4	feat: E2E test harness, QA pipeline integration, macOS compat linter (#1425 ) * feat: add QA upgrade — macOS compat linter, per-agent mock assertions Layer 1: macOS compat linter (test/macos-compat.sh) - 12 rules (MC001–MC012) catching bash 3.2 incompatibilities - Detects: base64 -w0 file args, non-portable echo flags, source <(), ((var++)), read -d, nounset flag, sed -i, date %N, local -n, declare -A, ${var,,}, and \|& - Added to CI lint.yml in warn-only mode for burn-in - Integrated as Phase 0.5 in qa-dry-run.sh Layer 2: Per-agent mock assertions - test/fixtures/_shared_agent_assertions.sh with install checks for all 15 agents (claude, openclaw, aider, goose, etc.) - Integrated into test/mock.sh via _run_agent_assertions() Also includes branch fixes: - Fix base64 -w0 to use stdin redirect (aws, daytona, fly) - Fix fly/openclaw to use npm install instead of broken curl\|bash Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add E2E test harness and integrate into QA pipeline Add test/e2e.sh — a full E2E test harness that provisions real servers, installs agents, and verifies setup across all clouds. 
Features: - Smoke test (one canary agent per cloud) and full matrix modes - Credential auto-detection for 8 clouds - Per-cloud preflight validation (sequential) then parallel agent tests - Stale server cleanup, timing history, cross-cloud comparison - Auto-fix and optimization phases via Claude agents - macOS bash 3.2 compatible Integrate E2E as Phase 5 in both qa-cycle.sh and qa-dry-run.sh: - Runs after mock tests pass, gated on cloud credentials - Phase 5b auto-fixes failures using per-agent worktree branches - Parses results and includes in QA summary Also fixes: - shared/common.sh: honour SPAWN_NON_INTERACTIVE=1 in safe_read() - aws/lib/common.sh: fix SSH key import (use cat instead of base64, handle race condition on concurrent imports) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 20:41:07 -05:00
A	af1a2014fa	fix: resolve 32 test failures in run.sh and mock.sh (#1419 ) test/run.sh (3 failures fixed): - Export TEST_DIR so sprite mock tracks create→list state across processes - Add sleep mock to avoid 30s polling loops in ensure_sprite_exists - Add timeout/gtimeout, python3 pass-through mocks for host protection - Set HOME to fake home for isolation, create fake home directory structure - Clean up /tmp/spawn_* temp files in cleanup trap test/mock.sh (29 failures fixed): - Fix fly mock to detect "echo ok" in fly ssh console -C arguments (including printf %q escaped form) so _fly_wait_for_ssh() succeeds - Add timeout/gtimeout pass-through mocks to prevent system calls - Add python3 delegate mock for JSON parsing in shared/common.sh - Clean up /tmp/spawn_* temp files in cleanup trap Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-17 11:49:28 -08:00
A	e52e290b25	fix: enhance sandbox test to detect agent directory residue (#1417 ) Fixes #1409 The bash sandbox test now verifies that test runs don't create or modify agent-specific directories and configuration files: - Checks that ~/.openclaw, ~/.sprite, and ~/.claude directories are not created by test runs - Verifies ~/.claude.json and ~/.claude/settings.json are not modified during tests (using mtime comparison to handle pre-existing files) - Skips checks for directories/files that existed before tests ran to avoid false positives in development environments This ensures tests remain properly sandboxed and don't pollute the production environment with agent artifacts. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 12:52:24 -05:00
A	1a1c06e038	test: sandbox bash tests to prevent production env pollution (#1404 ) Fixes #1403 Changes: 1. test/run.sh - Isolated mock state files: - Changed /tmp/sprite_mock_created* to use TEST_DIR instead - Added cleanup of any leaked /tmp files in cleanup() trap - Prevents /tmp pollution from mock sprite state files 2. test/record.sh - Sandboxed config directory: - Added TEST_CONFIG_DIR environment variable support - When set, overrides HOME to prevent writing to ~/.config/spawn/ - Allows tests to run without polluting production config 3. test/qa-dry-run.sh - Safe git operations: - Changed git checkout to git restore for reverting README changes - Prevents potential checkout pollution of working tree - Falls back to git checkout -- for older git versions 4. test/test-sandbox.sh - New verification test: - Verifies no /tmp pollution after test/run.sh - Verifies production config not modified - Verifies mock.sh uses isolated temp directories Why: Prevents test suite from polluting production environment (file writes to /tmp, ~/.config/spawn/, git state mutations). Agent: test-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 11:26:17 -05:00
Ahmed Abushagur	a9d0ee9863	test: add mock test coverage for all 15 Fly.io agent scripts (#1390 ) Fly.io had zero test coverage — every bug fixed this session (stale tokens, FlyV1 auth, name-taken failures, SSH hangs, PATH issues) went undetected. This adds the full mock test infrastructure: - test/fixtures/fly/ — env vars, API assertions, fixture JSONs for app creation, machine creation, and token validation endpoints - test/mock-curl-script.sh — URL stripping for api.machines.dev, body validation for machine creation, synthetic status responses, app creation POST handler, state tracking - test/mock.sh — mock fly/flyctl CLI binary (ssh console, auth token), URL stripping, required field validation, base64 mock - test/record.sh — Fly.io REST endpoints now recordable, live create+delete cycle, error detection, auth var mapping All 15 agent scripts (aider, claude, openclaw, etc.) are automatically discovered and tested: 75 passed, 0 failed. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 05:52:26 -05:00
A	c4eccbd72f	feat: prioritize clouds with CLI installed + hcloud CLI integration (#1375 ) * fix: auto-run gcloud auth login on expired GCP tokens Instead of telling users to run `gcloud auth login` manually, just run it automatically when auth check fails or instance creation hits a reauthentication error, then retry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: prioritize clouds with CLI installed + hcloud CLI integration When selecting a cloud provider, clouds are now sorted in 3 tiers: 1. Credentials detected (env vars set) — top priority 2. CLI installed (e.g., gcloud, hcloud, aws) — middle priority 3. Neither — default order Also adds hcloud CLI-first support for Hetzner operations (server create/delete/list, SSH key management, auth) with automatic fallback to the existing REST API when hcloud is not available. Closes #1370 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: rename aws-lightsail to aws across the project Simplifies the cloud key from "aws-lightsail" to "aws" — AWS should have a single entry regardless of the underlying service used. Renames the directory, updates manifest.json matrix keys, CLI map, test fixtures, README, and all agent scripts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-16 20:12:35 -08:00
A	da30c7f5d3	security: replace eval with native indirect expansion in test/record.sh (#1351 ) Replaces fragile eval-based indirect variable expansion with bash's native ${!var} syntax. This eliminates potential command injection risks and improves code clarity. Changes: - Line 139: eval "local val=\${...}" → local val="${!env_var:-}" - Line 168: eval "local current_val=\${...}" → local current_val="${!env_var:-}" - Line 215: eval "[[ -n \${...} ]]" → [[ -n "${!env_var:-}" ]] - Line 223: eval "[[ -n \${...} ]]" → [[ -n "${!env_var:-}" ]] - Line 246: eval "local val=\${...}" → local val="${!env_var:-}" - Line 276: eval "local current=\${...}" → local current="${!var_name:-}" Security impact: Removes eval usage that could theoretically allow command injection if env var names were ever user-controlled (currently not the case, but pattern is fragile). Fixes part of issue #763 (MEDIUM: Indirect variable expansion via eval) Agent: security-auditor Co-authored-by: spawn-bot <bot@openrouter.ai> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-16 20:25:48 -05:00
A	5f39b035c6	refactor: extract credential loading helpers to reduce complexity in test/record.sh (#1348 ) Split credential loading logic into focused helper functions: - _export_env_vars_from_fields: Extract array export logic (16 lines) - _load_single_token_config: Extract single-token loading (14 lines) Changes: - try_load_config reduced from 39 to 28 lines (28% reduction) - _load_multi_config_from_file reduced from 38 to 26 lines (32% reduction) - Eliminated duplicate env var validation logic - Improved readability with clear separation of concerns All 80 tests passing. No functional changes. Agent: complexity-hunter Co-authored-by: spawn-bot <bot@openrouter.ai> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-16 20:23:49 -05:00
A	ec81c74594	refactor: introduce cloud adapter + spawn_agent runner system (#1340 ) Eliminate ~70% boilerplate across 149 agent scripts by introducing a standard cloud_* adapter interface and spawn_agent orchestration runner. Each cloud's lib/common.sh now exports 7 adapter functions (cloud_authenticate, cloud_provision, cloud_wait_ready, cloud_run, cloud_upload, cloud_interactive, cloud_label) that wrap cloud-specific operations behind a uniform interface. Agent scripts define hooks (agent_install, agent_env_vars, agent_launch_cmd, etc.) and call `spawn_agent "Agent Name"` — the runner handles the full deployment flow: auth → provision → wait → install → API key → env → config → launch. - shared/common.sh: add spawn_agent(), _fn_exists(), _spawn_inject_env_vars() - 10 cloud lib/common.sh files: add cloud_* adapter functions - 149 agent scripts: rewrite to hook pattern (~40-80 lines → ~20-35 lines) - test/run.sh: update 2 sprite test patterns for new adapter paths - Net reduction: ~4,300 lines (2,257 added, 6,563 removed) Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-16 16:25:44 -08:00
A	d0847986f8	fix: use shared install_claude_code across all clouds with fnm PATH fix (#1242 ) All cloud claude.sh scripts had inline curl-only installs with no fallback. When the curl installer failed (transient outage, rate limit), installation failed with no recovery. Additionally, fnm-installed Node.js was invisible to subsequent SSH sessions because each SSH command runs in a non-interactive shell that doesn't source .bashrc/.zshrc. Changes: - Migrate 8 cloud scripts to use shared install_claude_code (curl → npm → bun) - Move _ensure_node_runtime before npm/bun install attempts (not after) - Add fnm paths to claude_path so node is discoverable across SSH sessions - Prefix npm/bun install commands with claude_path for PATH visibility - Update test assertion to match new install_claude_code behavior Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-15 23:16:23 -08:00
A	3db288c3dd	feat: trim to 9 curated launch clouds, upvote-driven discovery (#1184 ) Reduce from 41 cloud providers to 10 (9 + local) curated for launch: - local (free), oracle (free tier), hetzner (~€3.29/mo), ovh (~€3.50/mo), fly (free tier), aws-lightsail ($3.50/mo), daytona (pay-per-second), digitalocean ($4/mo), gcp ($7.11/mo), sprite (Fly.io VMs) Changes: - Remove 30 cloud directories, test fixtures, and provider-specific tests - Slim manifest.json from 600 to 150 matrix entries, sorted by price - Update CLAUDE.md with higher bar for adding clouds (prestige + pricing) - Transform discovery service from code-implementing team to upvote-driven demand tracker that creates proposal issues and only implements when a proposal reaches 50+ upvotes - Create GitHub issue #1183 as cloud wishlist with all dropped clouds - Add discovery-team/cloud-proposal/agent-proposal labels - Protect discovery-team issues from refactor team (no comments/changes) - Fix all CLI tests (8034 pass, 0 fail) and shell tests (80 pass, 0 fail) Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-15 00:19:39 -08:00
A	1cb9f5a5cb	fix: correct scaleway SSH key assertion endpoint (/sshkeys → /ssh-keys) (#1140 ) The mock test assertion was checking for GET /sshkeys but the actual Scaleway API endpoint is /ssh-keys (with a hyphen), causing all 15 scaleway agent tests to fail the "fetches SSH keys" check. Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-14 18:20:07 -05:00
A	11eff028a1	refactor: reduce complexity in shared/common.sh and test/mock.sh (#1128 ) Extract pattern-matching logic in _strip_api_base() into separate helper functions (_strip_gcore_endpoint, _strip_scaleway_endpoint) to reduce function complexity from 36 lines to organized cases with extracted handlers. Refactor ensure_api_token_with_provider() in shared/common.sh by extracting: - _prompt_for_api_token() handles user prompting - _validate_env_var_name() handles security validation Reduces main function complexity and improves testability. Agent: complexity-hunter Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 16:24:41 -05:00
A	c6d42e6f07	refactor: reduce complexity in discovery.sh, record.sh, and common.sh (#1123 ) Break down overly complex functions into smaller, single-purpose helpers: discovery.sh: - Extract _sync_and_setup() from run_team_cycle() for git sync + setup - Extract _launch_claude() to handle process startup - Extract _session_completed() to check session status - Extract _cleanup_cycle_files() for file cleanup - Reduces run_team_cycle() from 71 lines to 39 lines record.sh: - Extract _validate_response_not_empty() for empty check - Extract _validate_response_json() for JSON validation - Extract _validate_response_no_error() for API error checking - Extract _record_fixture_metadata() for metadata recording - Reduces _save_live_fixture() from 34 lines to 15 lines shared/common.sh: - Extract _check_agent_in_path() for PATH verification - Extract _check_agent_runs() for execution verification - Reduces verify_agent_installed() from 32 lines to 11 lines Each helper is focused on one concern, improving maintainability and testability. Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 15:44:05 -05:00
A	5e3060616c	refactor: reduce complexity in test/mock.sh and discovery.sh (#1119 ) Extract 60+ line nested case statement in _validate_body() into dedicated _get_required_fields() function using cloud:endpoint pattern matching. Reduces _validate_body() from 93 to 35 lines while improving readability and maintainability. Extract 162-line heredoc from build_team_prompt() into external discovery-team-prompt.txt template file. Reduces function to 6 lines, making discovery.sh more maintainable. All 80 bash tests pass. No functionality change. Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-14 14:11:36 -05:00
A	5b66b6e979	test: add _strip_api_base() and _validate_body() functions to test/mock.sh (#1118 ) Adds missing test infrastructure functions that were previously only in mock-curl-script.sh but required by test-infra-sync.test.ts: - _strip_api_base(): Strips cloud provider API base URLs to extract endpoint paths - _validate_body(): Validates POST request bodies contain required fields for major clouds Fixes test failures in test-infra-sync.test.ts where coverage validation checks rely on these functions being present in test/mock.sh. Agent: test-engineer Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 13:24:18 -05:00
A	7408c525c7	refactor: reduce complexity in test/mock.sh and test/record.sh (#1116 ) Extracted ssh-keygen mock creation into _create_ssh_keygen_mock() to simplify setup_mock_agents() from 38 to 13 lines. Extracted validation and response handling in test/record.sh: - _validate_endpoint_response(): handles empty/invalid/error responses - _save_endpoint_fixture(): saves fixture and updates metadata Reduces _record_endpoint() from 43 to 17 lines. Extracted ID extraction and delete response handling: - _extract_resource_id(): extracts ID from create response - _handle_delete_response(): handles fallback for empty delete responses Reduces _live_create_delete_cycle() from 44 to 28 lines. All 79 tests pass. Agent: complexity-hunter Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 13:11:42 -05:00
A	2f75c5b695	refactor: reduce complexity in test/mock.sh by extracting embedded script (#1112 ) Extracted the large 270-line embedded mock curl script from the setup_mock_curl() function into a separate file (mock-curl-script.sh). This reduces setup_mock_curl() from 270 lines to 6 lines, improving readability and maintainability. The refactoring: - Creates test/mock-curl-script.sh with all mock curl implementation - Simplifies setup_mock_curl() to copy the external script - Maintains identical functionality (all tests pass) - Makes the mock curl logic easier to understand and modify Agent: complexity-hunter Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 12:43:59 -05:00
A	0d494d044e	test: add missing API assertion fixtures and body validation for 8 cloud providers (#1107 ) Added _api_assertions.sh fixtures for binarylane, genesiscloud, hyperstack, kamatera, latitude, ovh, scaleway, and upcloud to enable comprehensive mock test coverage. Updated _validate_body() in test/mock.sh to validate POST request bodies for all cloud providers, ensuring payload correctness. Fixed syntax error in gcore validation (!! to ;;). Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-14 11:46:49 -05:00
A	0f3ca5e052	refactor: reduce complexity in test/mock.sh and test/record.sh (#1102 ) Extracted helper functions to reduce cyclomatic complexity: test/mock.sh: - Extract _wait_with_timeout() from run_script_with_timeout() (reduced from 32→17 lines) - Extract _setup_test_env() and _record_categorized_result() from run_test() (reduced from 50→26 lines) test/record.sh: - Refactor has_api_error() to use lambda dict for cloud-specific checks (improved readability, same logic) - Extract _format_env_var_display() from list_clouds() to eliminate nested loop (reduced from 48→32 lines) All functions maintain identical behavior and pass syntax validation. Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 10:43:19 -05:00
A	6647f7ca05	refactor: reduce complexity in test/mock.sh and test/record.sh (#1096 ) Extract assertion tracking and fixture detection logic in mock.sh: - New _run_assertions_and_track() helper consolidates 20 lines of repeated assertions - New _has_missing_fixture() helper checks mock log for fixture errors - run_test() now 30 lines shorter, focusing on orchestration rather than details Extract cloud endpoints data in record.sh: - Replace 132-line case statement with data-driven approach - Each cloud's endpoints now live in _ENDPOINTS_{cloud} variable - get_endpoints() function reduced to 3 lines, delegates to variable lookup Benefits: - Reduced cognitive load: test logic separated from data - Easier to add new clouds: just add _ENDPOINTS_* variable - Better maintainability: centralized endpoint definitions Tests: All 80 tests pass with fixtures enabled. Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 07:12:54 -05:00
Ahmed Abushagur	27825c6f3c	fix: replace `!!` with `;;` in gcore case branches in record.sh (#1089 ) The Gcore PR (#1079) introduced `!!` instead of `;;` as case statement terminators in 4 places, causing a syntax error on line 542 that breaks all fixture recording. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 04:15:09 -05:00
A	f3ee7e271a	security: Fix command injection vulnerability in env var exports (#1086 ) CRITICAL: Add validation to prevent command injection via malicious environment variable names in `export "${var_name}=..."` patterns. Vulnerability Details: - All instances of `export "${var_name}=${value}"` where var_name is derived from external sources (manifest.json auth fields, user input, API responses) were vulnerable to command injection - If var_name contained shell metacharacters like `;`, `$()`, or backticks, arbitrary code could be executed - Example exploit: var_name=`FOO; rm -rf /` would execute the rm command Affected Files: - shared/key-request.sh: _try_load_env_var() - var_name from manifest.json - shared/common.sh: _load_token_from_config(), ensure_api_token_with_provider(), _multi_creds_load_config(), _multi_creds_prompt(), _poll_instance_once() - var_name from function parameters - test/record.sh: _load_multi_config_from_file(), _try_load_cloud_config(), _prompt_cloud_creds_interactive() - var_name from test fixtures Fix Applied: - Added regex validation before all export statements: `^[A-Z_][A-Z0-9_]*$` - This allowlist enforces standard POSIX environment variable naming (uppercase letters, digits, underscores only, must start with letter or underscore) - Returns error if validation fails, preventing injection Impact: - While current usage passes hardcoded env var names (e.g., "HCLOUD_TOKEN"), the vulnerability existed in the implementation - manifest.json is currently trusted, but defense-in-depth prevents supply chain attacks or accidental malformed entries - Test infrastructure was also vulnerable to malicious fixture data Agent: security-auditor Co-authored-by: Spawn Refactor Service <refactor@spawn.service> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 04:01:25 -05:00
A	514bc7abc9	feat: add Gcore cloud provider with 3 agent scripts (#1079 ) Add Gcore (gcore.com) as a new cloud provider supporting global edge cloud instances via REST API with hourly billing. Implements full test infrastructure including mock fixtures, URL stripping, body validation, and live recording support. - gcore/lib/common.sh: Cloud library with apikey auth, project auto-detection - gcore/claude.sh, aider.sh, goose.sh: Agent deployment scripts - manifest.json: Cloud definition + 15 matrix entries (3 implemented, 12 missing) - test/mock.sh: URL stripping for Gcore path-parameter API, body validation, synthetic responses - test/record.sh: Endpoints, auth, API caller, error detection, live cycle - test/fixtures/gcore/: 8 fixture files for mock testing Co-authored-by: OpenRouter Bot <noreply@openrouter.ai> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 00:19:25 -08:00
A	4cda0e35f2	feat: add ServerSpace cloud provider with 3 agent scripts (#1080 ) Add ServerSpace (serverspace.io) as a new cloud provider with global locations (EU, US, Asia). Uses REST API with X-API-KEY auth and async task-based server creation with polling. - serverspace/lib/common.sh: Full provider library with API wrapper, SSH key management, server provisioning with cloud-init, task polling - serverspace/claude.sh: Claude Code agent deployment - serverspace/aider.sh: Aider agent deployment - serverspace/goose.sh: Goose agent deployment - manifest.json: Cloud definition + 15 matrix entries (3 implemented) - test/mock.sh: URL stripping, body validation, synthetic responses - test/record.sh: Endpoints, auth, API calls, error detection - test/fixtures/serverspace/: Mock fixtures for all API endpoints Co-authored-by: OpenRouter Bot <noreply@openrouter.ai> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 02:47:07 -05:00
A	5b0358bcd1	refactor: extract helpers to reduce complexity in run_test and ionos create_server (#1060 ) - test/mock.sh: Extract _tracked_assert and _categorize_failure from run_test (86->74 lines) - ionos/lib/common.sh: Extract _ionos_validate_create_params and _ionos_require_ubuntu_image from create_server (51->28 lines) Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-14 01:49:33 -05:00
A	cf16a8b55b	fix(test): add missing mock fixtures for Civo, Hetzner, and Scaleway (#1050 ) Civo tests failed because networks.json, disk_images.json, and correctly-named sshkeys.json fixtures were missing. Hetzner tests failed because datacenters.json was missing (needed for server type validation). Scaleway tests failed because SCW_DEFAULT_PROJECT_ID was missing from env, images.json had no Ubuntu images, and create_server.json fixture was absent. Also adds Civo and Scaleway to mock's _synthetic_active_response for instance polling, and fixes Scaleway account API URL stripping. Results: 435 passed, 0 failed, 1 skipped (previously 270/165/1). Agent: pr-maintainer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 23:37:20 -05:00
A	44b9a5bdff	fix(security): harden weak crypto fallbacks, key validation, and temp paths (#1039 ) * fix(security): harden weak crypto fallbacks, key validation, and temp paths - CSRF state generation: fail instead of using predictable date+$RANDOM fallback when openssl and /dev/urandom are unavailable (OAuth CSRF bypass) - Kamatera password: fail instead of using predictable date-based password when no secure random source available - key-server validKeyVal: enforce 8-512 char limits and ASCII-only check to block malformed/oversized values (Fixes #969) - upload_config_file: use mktemp-derived randomness for remote temp paths instead of predictable $RANDOM (symlink attack on remote server) Agent: security-auditor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(test): update assertions for upload_config_file mktemp-derived paths The upload_config_file function now uses mktemp-derived basenames (spawn_config_tmp.XXX) instead of the original filename for remote temp paths. Update test/run.sh assertions to: - Match "spawn_config" in the -file upload path - Verify mv commands move files to correct final destinations (settings.json, .claude.json) Addresses reviewer feedback on PR #1039. Agent: pr-maintainer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 21:43:37 -05:00
Ahmed Abushagur	c6d0cb218e	improve: make QA bot more effective with structured failures and verification (#1034 ) 5 improvements to the QA cycle: 1. Fix agents now get structured failure context — categorized failures (exit_code, missing_api_call, missing_env, no_fixture) instead of raw 500-line test output, plus a passing agent for comparison 2. Fix agent changes are verified before committing — re-runs mock tests after the agent finishes and only commits if results actually improved, discarding bad fixes that would create noise PRs 3. Test results now include failure categories — mock.sh records cloud/agent:fail:reason instead of just cloud/agent:fail, enabling smarter failure routing 4. Mock curl logs NO_FIXTURE warnings when no fixture matches a GET request, surfacing false-confidence gaps where tests pass with synthetic fallback data 5. Phase 3 (code fix) failures now escalate to GitHub issues after 3 consecutive cycles, matching the Phase 1 escalation pattern Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 20:07:54 -05:00
A	ea5d462f4f	refactor: decompose multi-credential config handling in test/record.sh (#1004 ) Extract _get_multi_cred_spec, _load_multi_config_from_file, and _save_multi_config_to_file helpers to eliminate duplicated per-cloud config blocks in try_load_config, save_config, has_credentials, prompt_credentials, and list_clouds. The cloud-to-credential mapping (OVH, UpCloud, Kamatera, AtlanticNet, CloudSigma) is now defined once in _get_multi_cred_spec and consumed by all five functions, making it trivial to add new multi-credential clouds. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 13:34:37 -08:00
A	2a66805b33	feat: Add Webdock provider support (#1001 ) Implements Webdock cloud provider with full API integration: - webdock/lib/common.sh with REST API primitives - claude.sh, cline.sh, aider.sh agent scripts - Test coverage in test/record.sh and test/mock.sh - manifest.json updated with cloud entry and matrix - README.md with usage documentation Webdock offers affordable European VPS (€2.15/month starting) with full REST API, SSH access, and developer-friendly features. Agent: cloud-scout-1 Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 13:24:06 -08:00
Ahmed Abushagur	1d9a2dbad1	perf: run cloud tests and recordings in parallel (#982 ) Both mock.sh and record.sh now run each cloud's tests/recordings concurrently as background jobs instead of sequentially. Results are aggregated after all clouds finish. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:44:57 -08:00
Ahmed Abushagur	d501b5eb1d	fix: CI test summary uses NO_COLOR instead of sed hack (#985 ) * fix: strip ANSI colors before grepping test summary The mock test output uses ANSI escape codes for colored ✓/✗/━━━ characters, so the grep in the Post summary step couldn't match them. Strip colors with sed first. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use NO_COLOR standard instead of sed to strip ANSI codes mock.sh now respects the NO_COLOR env var (https://no-color.org/). CI sets NO_COLOR=1 so grep matches ✓/✗/━━━ cleanly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:26:41 -08:00
A	1124af265c	feat: add CloudSigma cloud provider (#860 ) * feat: add CloudSigma cloud provider Add CloudSigma as a new cloud provider with API-first architecture: - Create cloudsigma/lib/common.sh with HTTP Basic Auth support - Implement cloudsigma/claude.sh and cloudsigma/aider.sh agent scripts - Add CloudSigma to manifest.json (38th cloud provider) - Add matrix entries for all 15 agents (2 implemented, 13 missing) - Update test/record.sh with CloudSigma endpoints and auth handling - Update test/mock.sh with URL-stripping for CloudSigma API - Add cloudsigma/README.md with usage documentation CloudSigma features: - API v2.0 with HTTP Basic Auth (email:password) - Regions: ZRH (Zurich), WDC (Washington DC), LVS (Las Vegas) - Granular resource control (CPU/RAM/Disk independently configurable) - Ubuntu 24.04 cloned from public library drives - SSH access via cloudsigma user - Pay-as-you-go pricing starting at ~$14/month Agent: cloud-scout Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: address security review comments for CloudSigma provider - [CRITICAL] Fix command injection in credential saving: use sys.argv instead of raw shell interpolation in Python strings - [CRITICAL] Fix shell injection in create_cloudsigma_drive: pass name and size via sys.argv instead of inline interpolation - [CRITICAL] Fix shell injection in SSH key fingerprint lookups: pass fingerprint via sys.argv - [HIGH] Replace hardcoded VNC password with random generation via openssl rand -hex 8 - [MEDIUM] Fix config file path injection: pass via sys.argv Agent: pr-maintainer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 06:50:25 -08:00
A	fba986abea	feat: add HOSTKEY cloud provider (#909 ) Add HOSTKEY (https://hostkey.com/) as a new cloud provider to the spawn matrix. HOSTKEY offers affordable VPS hosting starting from €1/month with hourly billing, making it suitable for running AI agents that use remote API inference. Changes: - Created hostkey/lib/common.sh with HOSTKEY API wrappers - Implemented hostkey/claude.sh (Claude Code agent) - Implemented hostkey/openclaw.sh (OpenClaw agent) - Added HOSTKEY to manifest.json clouds section - Added matrix entries for all 15 agents (2 implemented, 13 missing) - Updated test/record.sh with HOSTKEY test infrastructure - Updated test/mock.sh with HOSTKEY URL handling - Created hostkey/README.md with usage instructions Data centers: Amsterdam, Frankfurt, Helsinki, Reykjavik, Istanbul, New York Agent: cloud-scout Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 05:08:26 -08:00
A	fc34a640bd	feat: add Atlantic.Net cloud provider (#883 ) Add Atlantic.Net Cloud as a new cloud provider with REST API support. Starting at $4-8/mo for budget VPS instances with SSH access. Implementation: - Created atlanticnet/lib/common.sh with HMAC-SHA256 API auth - Implemented 3 agent scripts: claude.sh, aider.sh, openclaw.sh - Updated manifest.json with cloud entry and 15 matrix entries - Added test coverage in test/record.sh and test/mock.sh - Created atlanticnet/README.md with usage docs API authentication uses timestamp + random GUID signed with private key. Defaults: G2.2GB plan, ubuntu-24.04_64bit image, USEAST2 location. Agent: cloud-scout-1 Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 03:07:22 -08:00
A	89ffe4802e	refactor: extract mock test env config and API assertions into per-cloud fixture files (#803 ) Reduces setup_env_for_cloud (84 lines -> 8 lines) and assert_cloud_api_calls (32 lines -> 9 lines) in test/mock.sh by moving cloud-specific data into per-cloud _env.sh and _api_assertions.sh files in test/fixtures/. Adding a new cloud's test config now only requires creating two small files in the fixtures directory instead of editing case branches in mock.sh. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 02:16:11 -08:00
A	be903f0089	feat: add CodeSandbox cloud provider (#857 ) Add CodeSandbox as a new sandbox cloud provider for running AI agents. CodeSandbox features: - Firecracker microVMs with ~2 second start times - SDK/CLI-based exec (no SSH) - Free tier: 40 hours/month on Build plan - Secure isolated environments Implementation: - Created codesandbox/lib/common.sh with SDK wrapper functions - Implemented 3 agent scripts: claude, aider, openclaw - Added CodeSandbox to manifest.json clouds - Created matrix entries (3 implemented, 12 missing) - Updated test/record.sh to list as non-recordable CLI cloud - Added codesandbox/README.md with usage instructions The implementation follows the existing pattern from e2b and modal, using Node.js SDK (@codesandbox/sdk) for sandbox lifecycle management. Agent: cloud-scout Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 02:09:31 -08:00
A	c7bbe8bc3b	refactor: extract generic _live_create_delete_cycle in test/record.sh (#818 ) The 5 per-cloud live recording functions (_live_hetzner, _live_digitalocean, _live_vultr, _live_linode, _live_civo) each duplicated 50-65 lines of identical create->save->extract-id->delete->save logic. Extract a generic _live_create_delete_cycle helper that handles the shared flow, with per-cloud body builder functions providing only the cloud-specific parts. Reduces test/record.sh by 112 lines (1016 -> 904) while preserving all behavior including cloud-specific delete delays and empty-response fallbacks. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:52:51 -08:00
A	cb1005ab31	refactor: extract helpers from run_script_test and run_shellcheck in test/run.sh (#776 ) Split run_script_test (61 lines -> 25 lines) into focused helpers: - _assert_sprite_common_commands: standard command lifecycle assertions - _assert_agent_specific: per-agent install assertions - _assert_no_temp_leaks: temp file cleanup check Split run_shellcheck (57 lines -> 12 lines) into: - _discover_shell_scripts: dynamic script discovery across cloud dirs - _run_shellcheck_on_scripts: per-script shellcheck execution and reporting Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-12 17:19:32 -08:00
A	4b0d25ca39	fix: prevent Python code injection via unescaped variables in inline Python (#771 ) Use sys.argv to pass shell values to inline Python instead of direct string interpolation, preventing single-quote injection attacks across cloud lib common.sh files and test/record.sh. Also fix eval injection in test/record.sh try_load_config() by replacing eval of Python-generated export statements with safe tab-separated parsing and direct variable assignment. Fixes #759 Fixes #760 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:47:13 -08:00
A	5a1037d92c	fix: replace ((var++)) with var=$((var + 1)) for macOS bash 3.x compat (#769 ) ((var++)) returns exit code 1 when the variable is 0 (falsy), which causes set -e to terminate the script. Replace all instances with the safe var=$((var + 1)) pattern in sprite/lib/common.sh and test/run.sh. Fixes #762 Agent: community-coordinator Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:45:51 -08:00
A	4e33cc39cd	fix: address medium security findings from #753 (#755 ) - Replace `echo -e` with `printf` in cli/install.sh for macOS bash 3.x compat - Remove `-u` (nounset) from test/run.sh; use `${VAR:-}` pattern instead - Replace `source <(curl ...)` with `eval "$(curl ...)"` in test/run.sh for curl|bash compat - Add .gitignore patterns for sensitive files (.env, .pem, .key, credentials) Refs #753 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:48:52 -08:00
A	cec1806128	refactor: improve readability of config setup and shellcheck discovery (#744 ) - Replace hardcoded 4-cloud script list in run_shellcheck with dynamic discovery that covers all 21 clouds automatically - Convert 3 inline JSON templates (setup_claude_code_config, setup_openclaw_config, setup_continue_config) from single-line printf to readable heredocs while preserving json_escape security Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-12 15:19:11 -08:00
A	35997c8ae5	refactor: extract helpers from run_test() in test/mock.sh (#713 ) Break down the 150-line run_test() function into focused helpers: - run_script_with_timeout(): script execution with env vars and timeout - show_failure_output(): display last 20 lines on failure - assert_error_scenario(): handle error scenario assertions - assert_cloud_api_calls(): cloud-specific API call assertions - record_test_result(): write pass/fail to RESULTS_FILE run_test() is now 57 lines (62% reduction), each helper is under 35 lines. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:01:49 -08:00
A	ea943d1583	refactor: decompose 287-line setup_mock_curl into named helpers (#718 ) The mock curl heredoc script was a monolithic 287-line function with inline arg parsing, error injection, URL routing, body validation, fixture lookup, and state tracking all in one flow. Extract 10 focused helper functions within the heredoc: - _parse_args: curl argument parsing - _maybe_inject_error: MOCK_ERROR_SCENARIO handling - _handle_special_urls: install scripts, OpenRouter, spawn repo - _strip_api_base: URL-to-endpoint mapping for 14 cloud APIs - _check_fields / _validate_body: POST body validation - _try_fixture: fixture file lookup - _synthetic_active_response: cloud-specific GET-by-ID responses - _respond_get / _respond_post: METHOD-based response routing - _track_state: creation/deletion state tracking The main logic is now a 26-line sequence of named function calls, making the mock's control flow immediately readable. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:01:41 -08:00
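
Note on commit 5a1037d92c above: the `((var++))` failure it describes can be reproduced with a minimal sketch like the one below. The variable names and the `bash -c` child shells are illustrative assumptions, not code taken from sprite/lib/common.sh or test/run.sh.

```shell
#!/usr/bin/env bash
# Sketch of the set -e pitfall from commit 5a1037d92c; all names are hypothetical.

# ((count++)) evaluates to the old value (0), so the command returns a
# non-zero status and set -e stops the child shell before the echo.
unsafe_out=$(bash -c 'set -e; count=0; ((count++)); echo "reached"' || true)

# The replacement is a plain assignment, which returns status 0.
safe_out=$(bash -c 'set -e; count=0; count=$((count + 1)); echo "reached count=$count"')

echo "unsafe: '${unsafe_out}'"
echo "safe:   '${safe_out}'"
```

Only the second form reaches its `echo`, which is why the commit swaps the increment for `var=$((var + 1))`.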

70 commits