56 commits

Author SHA1 Message Date
A
3db288c3dd
feat: trim to 9 curated launch clouds, upvote-driven discovery (#1184)
Reduce from 41 cloud providers to 10 (9 + local) curated for launch:
- local (free), oracle (free tier), hetzner (~€3.29/mo), ovh (~€3.50/mo),
  fly (free tier), aws-lightsail ($3.50/mo), daytona (pay-per-second),
  digitalocean ($4/mo), gcp ($7.11/mo), sprite (Fly.io VMs)

Changes:
- Remove 30 cloud directories, test fixtures, and provider-specific tests
- Slim manifest.json from 600 to 150 matrix entries, sorted by price
- Update CLAUDE.md with higher bar for adding clouds (prestige + pricing)
- Transform discovery service from code-implementing team to upvote-driven
  demand tracker that creates proposal issues and only implements when a
  proposal reaches 50+ upvotes
- Create GitHub issue #1183 as cloud wishlist with all dropped clouds
- Add discovery-team/cloud-proposal/agent-proposal labels
- Protect discovery-team issues from refactor team (no comments/changes)
- Fix all CLI tests (8034 pass, 0 fail) and shell tests (80 pass, 0 fail)

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-15 00:19:39 -08:00
A
1cb9f5a5cb
fix: correct scaleway SSH key assertion endpoint (/sshkeys → /ssh-keys) (#1140)
The mock test assertion was checking for GET /sshkeys but the actual
Scaleway API endpoint is /ssh-keys (with a hyphen), causing all 15
scaleway agent tests to fail the "fetches SSH keys" check.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 18:20:07 -05:00
A
11eff028a1
refactor: reduce complexity in shared/common.sh and test/mock.sh (#1128)
Extract the pattern-matching logic in _strip_api_base() into separate helper functions (_strip_gcore_endpoint, _strip_scaleway_endpoint), replacing the 36-line function body with short case branches that delegate to the extracted handlers.

Refactor ensure_api_token_with_provider() in shared/common.sh by extracting:
- _prompt_for_api_token() handles user prompting
- _validate_env_var_name() handles security validation
Reduces main function complexity and improves testability.

Agent: complexity-hunter

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 16:24:41 -05:00
A
c6d42e6f07
refactor: reduce complexity in discovery.sh, record.sh, and common.sh (#1123)
Break down overly complex functions into smaller, single-purpose helpers:

discovery.sh:
  - Extract _sync_and_setup() from run_team_cycle() for git sync + setup
  - Extract _launch_claude() to handle process startup
  - Extract _session_completed() to check session status
  - Extract _cleanup_cycle_files() for file cleanup
  - Reduces run_team_cycle() from 71 lines to 39 lines

record.sh:
  - Extract _validate_response_not_empty() for empty check
  - Extract _validate_response_json() for JSON validation
  - Extract _validate_response_no_error() for API error checking
  - Extract _record_fixture_metadata() for metadata recording
  - Reduces _save_live_fixture() from 34 lines to 15 lines

shared/common.sh:
  - Extract _check_agent_in_path() for PATH verification
  - Extract _check_agent_runs() for execution verification
  - Reduces verify_agent_installed() from 32 lines to 11 lines

Each helper is focused on one concern, improving maintainability and testability.

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 15:44:05 -05:00
A
5e3060616c
refactor: reduce complexity in test/mock.sh and discovery.sh (#1119)
Extract 60+ line nested case statement in _validate_body() into
dedicated _get_required_fields() function using cloud:endpoint pattern
matching. Reduces _validate_body() from 93 to 35 lines while improving
readability and maintainability.

Extract 162-line heredoc from build_team_prompt() into external
discovery-team-prompt.txt template file. Reduces function to 6 lines,
making discovery.sh more maintainable.

All 80 bash tests pass. No functionality change.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 14:11:36 -05:00
A
5b66b6e979
test: add _strip_api_base() and _validate_body() functions to test/mock.sh (#1118)
Adds missing test infrastructure functions that were previously only in
mock-curl-script.sh but required by test-infra-sync.test.ts:
- _strip_api_base(): Strips cloud provider API base URLs to extract endpoint paths
- _validate_body(): Validates POST request bodies contain required fields for major clouds

Fixes test failures in test-infra-sync.test.ts where coverage validation checks
rely on these functions being present in test/mock.sh.

Agent: test-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 13:24:18 -05:00
A
7408c525c7
refactor: reduce complexity in test/mock.sh and test/record.sh (#1116)
Extracted ssh-keygen mock creation into _create_ssh_keygen_mock() to
simplify setup_mock_agents() from 38 to 13 lines.

Extracted validation and response handling in test/record.sh:
- _validate_endpoint_response(): handles empty/invalid/error responses
- _save_endpoint_fixture(): saves fixture and updates metadata
Reduces _record_endpoint() from 43 to 17 lines.

Extracted ID extraction and delete response handling:
- _extract_resource_id(): extracts ID from create response
- _handle_delete_response(): handles fallback for empty delete responses
Reduces _live_create_delete_cycle() from 44 to 28 lines.

All 79 tests pass.

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 13:11:42 -05:00
A
2f75c5b695
refactor: reduce complexity in test/mock.sh by extracting embedded script (#1112)
Extracted the large 270-line embedded mock curl script from the
setup_mock_curl() function into a separate file (mock-curl-script.sh).
This reduces setup_mock_curl() from 270 lines to 6 lines, improving
readability and maintainability.

The refactoring:
- Creates test/mock-curl-script.sh with all mock curl implementation
- Simplifies setup_mock_curl() to copy the external script
- Maintains identical functionality (all tests pass)
- Makes the mock curl logic easier to understand and modify

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 12:43:59 -05:00
A
0d494d044e
test: add missing API assertion fixtures and body validation for 8 cloud providers (#1107)
Added _api_assertions.sh fixtures for binarylane, genesiscloud, hyperstack, kamatera, latitude, ovh, scaleway, and upcloud to enable comprehensive mock test coverage. Updated _validate_body() in test/mock.sh to validate POST request bodies for all cloud providers, ensuring payload correctness. Fixed syntax error in gcore validation (!! to ;;).

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 11:46:49 -05:00
A
0f3ca5e052
refactor: reduce complexity in test/mock.sh and test/record.sh (#1102)
Extracted helper functions to reduce cyclomatic complexity:

test/mock.sh:
- Extract _wait_with_timeout() from run_script_with_timeout() (reduced from 32→17 lines)
- Extract _setup_test_env() and _record_categorized_result() from run_test() (reduced from 50→26 lines)

test/record.sh:
- Refactor has_api_error() to use lambda dict for cloud-specific checks (improved readability, same logic)
- Extract _format_env_var_display() from list_clouds() to eliminate nested loop (reduced from 48→32 lines)

All functions maintain identical behavior and pass syntax validation.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 10:43:19 -05:00
A
6647f7ca05
refactor: reduce complexity in test/mock.sh and test/record.sh (#1096)
Extract assertion tracking and fixture detection logic in mock.sh:
- New _run_assertions_and_track() helper consolidates 20 lines of repeated assertions
- New _has_missing_fixture() helper checks mock log for fixture errors
- run_test() now 30 lines shorter, focusing on orchestration rather than details

Extract cloud endpoints data in record.sh:
- Replace 132-line case statement with data-driven approach
- Each cloud's endpoints now live in _ENDPOINTS_{cloud} variable
- get_endpoints() function reduced to 3 lines, delegates to variable lookup
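
A minimal sketch of the data-driven lookup described above, assuming bash indirect expansion is used to resolve the `_ENDPOINTS_{cloud}` variable. The endpoint values are made up for illustration and the function body is not taken from record.sh:

```shell
#!/usr/bin/env bash
# Illustrative endpoint tables; the entries are invented for this sketch.
_ENDPOINTS_hetzner="servers ssh_keys datacenters"
_ENDPOINTS_digitalocean="droplets account/keys"

# Resolve the variable named _ENDPOINTS_<cloud> via indirect expansion.
get_endpoints() {
    local var="_ENDPOINTS_${1//-/_}"   # dashes are not valid in variable names
    printf '%s\n' "${!var:-}"          # empty line when the cloud is unknown
}

for endpoint in $(get_endpoints hetzner); do
    echo "would record: ${endpoint}"
done
```

With this shape, adding a cloud means adding one variable, which matches the benefit the commit message claims.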

Benefits:
- Reduced cognitive load: test logic separated from data
- Easier to add new clouds: just add _ENDPOINTS_* variable
- Better maintainability: centralized endpoint definitions

Tests: All 80 tests pass with fixtures enabled.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 07:12:54 -05:00
Ahmed Abushagur
27825c6f3c
fix: replace !! with ;; in gcore case branches in record.sh (#1089)
The Gcore PR (#1079) introduced `!!` instead of `;;` as case statement
terminators in 4 places, causing a syntax error on line 542 that breaks
all fixture recording.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 04:15:09 -05:00
A
f3ee7e271a
security: Fix command injection vulnerability in env var exports (#1086)
CRITICAL: Add validation to prevent command injection via malicious environment variable names in `export "${var_name}=..."` patterns.

Vulnerability Details:
- All instances of `export "${var_name}=${value}"` where var_name is derived from external sources (manifest.json auth fields, user input, API responses) were vulnerable to command injection
- If var_name contained shell metacharacters like `;`, `$()`, or backticks, arbitrary code could be executed
- Example exploit: var_name=`FOO; rm -rf /` would execute the rm command

Affected Files:
- shared/key-request.sh: _try_load_env_var() - var_name from manifest.json
- shared/common.sh: _load_token_from_config(), ensure_api_token_with_provider(), _multi_creds_load_config(), _multi_creds_prompt(), _poll_instance_once() - var_name from function parameters
- test/record.sh: _load_multi_config_from_file(), _try_load_cloud_config(), _prompt_cloud_creds_interactive() - var_name from test fixtures

Fix Applied:
- Added regex validation before all export statements: `^[A-Z_][A-Z0-9_]*$`
- This allowlist enforces conventional POSIX-style environment variable naming (uppercase letters, digits, and underscores only; must start with an uppercase letter or underscore)
- Returns error if validation fails, preventing injection
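
The validation step can be sketched roughly as follows. The helper names `is_safe_env_name` and `safe_export` are hypothetical and are not the functions touched by this commit; only the regex comes from the message above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: accept only names matching ^[A-Z_][A-Z0-9_]*$.
is_safe_env_name() {
    local LC_ALL=C                       # keep [A-Z] from matching locale-specific letters
    local pattern='^[A-Z_][A-Z0-9_]*$'
    [[ "$1" =~ $pattern ]]
}

# Hypothetical wrapper: validate first, export only on success.
safe_export() {
    local var_name="$1" value="$2"
    if ! is_safe_env_name "$var_name"; then
        echo "refusing to export invalid variable name: ${var_name}" >&2
        return 1
    fi
    export "${var_name}=${value}"
}

safe_export "HCLOUD_TOKEN" "example-token" && echo "exported HCLOUD_TOKEN"
safe_export 'FOO; rm -rf /' "x" 2>/dev/null || echo "rejected malicious name"
```

Keeping the check in one helper means every export site can share the same allowlist instead of repeating the regex.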

Impact:
- While current usage passes hardcoded env var names (e.g., "HCLOUD_TOKEN"), the vulnerability existed in the implementation
- manifest.json is currently trusted, but defense-in-depth prevents supply chain attacks or accidental malformed entries
- Test infrastructure was also vulnerable to malicious fixture data

Agent: security-auditor

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 04:01:25 -05:00
A
514bc7abc9
feat: add Gcore cloud provider with 3 agent scripts (#1079)
Add Gcore (gcore.com) as a new cloud provider supporting global edge
cloud instances via REST API with hourly billing. Implements full test
infrastructure including mock fixtures, URL stripping, body validation,
and live recording support.

- gcore/lib/common.sh: Cloud library with apikey auth, project auto-detection
- gcore/claude.sh, aider.sh, goose.sh: Agent deployment scripts
- manifest.json: Cloud definition + 15 matrix entries (3 implemented, 12 missing)
- test/mock.sh: URL stripping for Gcore path-parameter API, body validation, synthetic responses
- test/record.sh: Endpoints, auth, API caller, error detection, live cycle
- test/fixtures/gcore/: 8 fixture files for mock testing

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 00:19:25 -08:00
A
4cda0e35f2
feat: add ServerSpace cloud provider with 3 agent scripts (#1080)
Add ServerSpace (serverspace.io) as a new cloud provider with global
locations (EU, US, Asia). Uses REST API with X-API-KEY auth and async
task-based server creation with polling.

- serverspace/lib/common.sh: Full provider library with API wrapper,
  SSH key management, server provisioning with cloud-init, task polling
- serverspace/claude.sh: Claude Code agent deployment
- serverspace/aider.sh: Aider agent deployment
- serverspace/goose.sh: Goose agent deployment
- manifest.json: Cloud definition + 15 matrix entries (3 implemented)
- test/mock.sh: URL stripping, body validation, synthetic responses
- test/record.sh: Endpoints, auth, API calls, error detection
- test/fixtures/serverspace/: Mock fixtures for all API endpoints

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:47:07 -05:00
A
5b0358bcd1
refactor: extract helpers to reduce complexity in run_test and ionos create_server (#1060)
- test/mock.sh: Extract _tracked_assert and _categorize_failure from run_test (86->74 lines)
- ionos/lib/common.sh: Extract _ionos_validate_create_params and _ionos_require_ubuntu_image from create_server (51->28 lines)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 01:49:33 -05:00
A
cf16a8b55b
fix(test): add missing mock fixtures for Civo, Hetzner, and Scaleway (#1050)
Civo tests failed because networks.json, disk_images.json, and
correctly-named sshkeys.json fixtures were missing. Hetzner tests
failed because datacenters.json was missing (needed for server type
validation). Scaleway tests failed because SCW_DEFAULT_PROJECT_ID
was missing from env, images.json had no Ubuntu images, and
create_server.json fixture was absent.

Also adds Civo and Scaleway to mock's _synthetic_active_response
for instance polling, and fixes Scaleway account API URL stripping.

Results: 435 passed, 0 failed, 1 skipped (previously 270/165/1).

Agent: pr-maintainer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 23:37:20 -05:00
A
44b9a5bdff
fix(security): harden weak crypto fallbacks, key validation, and temp paths (#1039)
* fix(security): harden weak crypto fallbacks, key validation, and temp paths

- CSRF state generation: fail instead of using predictable date+$RANDOM
  fallback when openssl and /dev/urandom are unavailable (OAuth CSRF bypass)
- Kamatera password: fail instead of using predictable date-based password
  when no secure random source available
- key-server validKeyVal: enforce 8-512 char limits and ASCII-only check
  to block malformed/oversized values (Fixes #969)
- upload_config_file: use mktemp-derived randomness for remote temp paths
  instead of predictable $RANDOM (symlink attack on remote server)
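
A sketch of the fail-closed pattern described in the first two bullets, assuming a 32-character hex token is wanted. The function name `generate_state_token` is hypothetical and the real helpers may differ:

```shell
#!/usr/bin/env bash
# Hypothetical generator: print 32 hex characters from a secure source,
# or fail instead of falling back to date/$RANDOM.
generate_state_token() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 16
    elif [ -r /dev/urandom ]; then
        od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
        echo
    else
        echo "no secure random source available; refusing to continue" >&2
        return 1
    fi
}

if state="$(generate_state_token)"; then
    echo "state token length: ${#state}"
else
    echo "aborting: cannot generate a secure state token" >&2
fi
```

The important part is the final branch: the caller gets a non-zero status rather than a guessable value.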

Agent: security-auditor
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): update assertions for upload_config_file mktemp-derived paths

The upload_config_file function now uses mktemp-derived basenames
(spawn_config_tmp.XXX) instead of the original filename for remote temp
paths. Update test/run.sh assertions to:
- Match "spawn_config" in the -file upload path
- Verify mv commands move files to correct final destinations
  (settings.json, .claude.json)

Addresses reviewer feedback on PR #1039.

Agent: pr-maintainer
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 21:43:37 -05:00
Ahmed Abushagur
c6d0cb218e
improve: make QA bot more effective with structured failures and verification (#1034)
5 improvements to the QA cycle:

1. Fix agents now get structured failure context — categorized failures
   (exit_code, missing_api_call, missing_env, no_fixture) instead of
   raw 500-line test output, plus a passing agent for comparison

2. Fix agent changes are verified before committing — re-runs mock tests
   after the agent finishes and only commits if results actually improved,
   discarding bad fixes that would create noise PRs

3. Test results now include failure categories — mock.sh records
   cloud/agent:fail:reason instead of just cloud/agent:fail, enabling
   smarter failure routing

4. Mock curl logs NO_FIXTURE warnings when no fixture matches a GET
   request, surfacing false-confidence gaps where tests pass with
   synthetic fallback data

5. Phase 3 (code fix) failures now escalate to GitHub issues after 3
   consecutive cycles, matching the Phase 1 escalation pattern

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 20:07:54 -05:00
A
ea5d462f4f
refactor: decompose multi-credential config handling in test/record.sh (#1004)
Extract _get_multi_cred_spec, _load_multi_config_from_file, and
_save_multi_config_to_file helpers to eliminate duplicated per-cloud
config blocks in try_load_config, save_config, has_credentials,
prompt_credentials, and list_clouds.

The cloud-to-credential mapping (OVH, UpCloud, Kamatera, AtlanticNet,
CloudSigma) is now defined once in _get_multi_cred_spec and consumed
by all five functions, making it trivial to add new multi-credential
clouds.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 13:34:37 -08:00
A
2a66805b33
feat: Add Webdock provider support (#1001)
Implements Webdock cloud provider with full API integration:
- webdock/lib/common.sh with REST API primitives
- claude.sh, cline.sh, aider.sh agent scripts
- Test coverage in test/record.sh and test/mock.sh
- manifest.json updated with cloud entry and matrix
- README.md with usage documentation

Webdock offers affordable European VPS (€2.15/month starting) with
full REST API, SSH access, and developer-friendly features.

Agent: cloud-scout-1

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 13:24:06 -08:00
Ahmed Abushagur
1d9a2dbad1
perf: run cloud tests and recordings in parallel (#982)
Both mock.sh and record.sh now run each cloud's tests/recordings
concurrently as background jobs instead of sequentially.
Results are aggregated after all clouds finish.
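
The background-job pattern can be sketched like this, assuming each cloud writes its result to its own file so that aggregation can happen after `wait`. The `run_cloud` stand-in and the result file format are invented for illustration:

```shell
#!/usr/bin/env bash
# Stand-in for a per-cloud test run; writes one result line to its own file.
run_cloud() {
    local cloud="$1" out_dir="$2"
    echo "${cloud}:pass" > "${out_dir}/${cloud}.result"
}

results_dir="$(mktemp -d)"
clouds="hetzner digitalocean scaleway"

# Launch one background job per cloud, then block until all have finished.
for cloud in $clouds; do
    run_cloud "$cloud" "$results_dir" &
done
wait

# Aggregate only after every job is done, so counts are not racy.
passed=0
for file in "$results_dir"/*.result; do
    if grep -q ':pass$' "$file"; then
        passed=$((passed + 1))
    fi
done
echo "passed: ${passed}"
rm -rf "$results_dir"
```

Writing to separate files avoids concurrent appends to a single shared results file.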

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:44:57 -08:00
Ahmed Abushagur
d501b5eb1d
fix: CI test summary uses NO_COLOR instead of sed hack (#985)
* fix: strip ANSI colors before grepping test summary

The mock test output uses ANSI escape codes for colored ✓/✗/━━━
characters, so the grep in the Post summary step couldn't match
them. Strip colors with sed first.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use NO_COLOR standard instead of sed to strip ANSI codes

mock.sh now respects the NO_COLOR env var (https://no-color.org/).
CI sets NO_COLOR=1 so grep matches ✓/✗/━━━ cleanly.
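
A minimal sketch of honoring NO_COLOR in a bash test runner, assuming colors are held in variables that are emptied when NO_COLOR is set to a non-empty value. The function and variable names are hypothetical, not those in mock.sh:

```shell
#!/usr/bin/env bash
# Hypothetical color setup: emit ANSI codes only when NO_COLOR is unset or empty.
setup_colors() {
    if [ -n "${NO_COLOR:-}" ]; then
        GREEN="" RED="" RESET=""
    else
        GREEN=$'\033[0;32m' RED=$'\033[0;31m' RESET=$'\033[0m'
    fi
}

# $1 = "pass" or "fail", $2 = test name
report() {
    if [ "$1" = "pass" ]; then
        printf '%s✓%s %s\n' "$GREEN" "$RESET" "$2"
    else
        printf '%s✗%s %s\n' "$RED" "$RESET" "$2"
    fi
}

NO_COLOR=1 setup_colors
report pass "hetzner/claude"
```

With the escape codes gone, a plain `grep '✓'` over the output matches without any sed preprocessing.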

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:26:41 -08:00
A
1124af265c
feat: add CloudSigma cloud provider (#860)
* feat: add CloudSigma cloud provider

Add CloudSigma as a new cloud provider with API-first architecture:

- Create cloudsigma/lib/common.sh with HTTP Basic Auth support
- Implement cloudsigma/claude.sh and cloudsigma/aider.sh agent scripts
- Add CloudSigma to manifest.json (38th cloud provider)
- Add matrix entries for all 15 agents (2 implemented, 13 missing)
- Update test/record.sh with CloudSigma endpoints and auth handling
- Update test/mock.sh with URL-stripping for CloudSigma API
- Add cloudsigma/README.md with usage documentation

CloudSigma features:
- API v2.0 with HTTP Basic Auth (email:password)
- Regions: ZRH (Zurich), WDC (Washington DC), LVS (Las Vegas)
- Granular resource control (CPU/RAM/Disk independently configurable)
- Ubuntu 24.04 cloned from public library drives
- SSH access via cloudsigma user
- Pay-as-you-go pricing starting at ~$14/month

Agent: cloud-scout

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: address security review comments for CloudSigma provider

- [CRITICAL] Fix command injection in credential saving: use sys.argv
  instead of raw shell interpolation in Python strings
- [CRITICAL] Fix shell injection in create_cloudsigma_drive: pass name
  and size via sys.argv instead of inline interpolation
- [CRITICAL] Fix shell injection in SSH key fingerprint lookups: pass
  fingerprint via sys.argv
- [HIGH] Replace hardcoded VNC password with random generation via
  openssl rand -hex 8
- [MEDIUM] Fix config file path injection: pass via sys.argv

Agent: pr-maintainer
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 06:50:25 -08:00
A
fba986abea
feat: add HOSTKEY cloud provider (#909)
Add HOSTKEY (https://hostkey.com/) as a new cloud provider to the spawn
matrix. HOSTKEY offers affordable VPS hosting starting from €1/month with
hourly billing, making it suitable for running AI agents that use remote
API inference.

Changes:
- Created hostkey/lib/common.sh with HOSTKEY API wrappers
- Implemented hostkey/claude.sh (Claude Code agent)
- Implemented hostkey/openclaw.sh (OpenClaw agent)
- Added HOSTKEY to manifest.json clouds section
- Added matrix entries for all 15 agents (2 implemented, 13 missing)
- Updated test/record.sh with HOSTKEY test infrastructure
- Updated test/mock.sh with HOSTKEY URL handling
- Created hostkey/README.md with usage instructions

Data centers: Amsterdam, Frankfurt, Helsinki, Reykjavik, Istanbul, New York

Agent: cloud-scout

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 05:08:26 -08:00
A
fc34a640bd
feat: add Atlantic.Net cloud provider (#883)
Add Atlantic.Net Cloud as a new cloud provider with REST API support.
Starting at $4-8/mo for budget VPS instances with SSH access.

Implementation:
- Created atlanticnet/lib/common.sh with HMAC-SHA256 API auth
- Implemented 3 agent scripts: claude.sh, aider.sh, openclaw.sh
- Updated manifest.json with cloud entry and 15 matrix entries
- Added test coverage in test/record.sh and test/mock.sh
- Created atlanticnet/README.md with usage docs

API authentication uses timestamp + random GUID signed with private key.
Defaults: G2.2GB plan, ubuntu-24.04_64bit image, USEAST2 location.

Agent: cloud-scout-1

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 03:07:22 -08:00
A
89ffe4802e
refactor: extract mock test env config and API assertions into per-cloud fixture files (#803)
Reduces setup_env_for_cloud (84 lines -> 8 lines) and assert_cloud_api_calls
(32 lines -> 9 lines) in test/mock.sh by moving cloud-specific data into
per-cloud _env.sh and _api_assertions.sh files in test/fixtures/.

Adding a new cloud's test config now only requires creating two small files
in the fixtures directory instead of editing case branches in mock.sh.
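
As a rough illustration of the per-cloud fixture approach, the sketch below sources an `_env.sh` file by cloud name. The directory layout (`<fixtures>/<cloud>/_env.sh`), the file contents, and this function body are assumptions for the example, not the actual code in test/mock.sh:

```shell
#!/usr/bin/env bash
# Build a throwaway fixtures tree so the sketch is self-contained.
fixtures_dir="$(mktemp -d)"
mkdir -p "${fixtures_dir}/hetzner"
printf 'export HCLOUD_TOKEN="test-token"\n' > "${fixtures_dir}/hetzner/_env.sh"

# Assumed shape: source the cloud's _env.sh if it exists, else report failure.
setup_env_for_cloud() {
    local env_file="${fixtures_dir}/$1/_env.sh"
    [ -f "$env_file" ] || return 1
    # shellcheck source=/dev/null
    . "$env_file"
}

if setup_env_for_cloud hetzner; then
    echo "HCLOUD_TOKEN=${HCLOUD_TOKEN}"
fi
setup_env_for_cloud unknown-cloud || echo "no fixture env for unknown-cloud"
rm -rf "$fixtures_dir"
```

The function itself stays the same size no matter how many clouds exist, since the per-cloud data lives in the files.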

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:16:11 -08:00
A
be903f0089
feat: add CodeSandbox cloud provider (#857)
Add CodeSandbox as a new sandbox cloud provider for running AI agents.

CodeSandbox features:
- Firecracker microVMs with ~2 second start times
- SDK/CLI-based exec (no SSH)
- Free tier: 40 hours/month on Build plan
- Secure isolated environments

Implementation:
- Created codesandbox/lib/common.sh with SDK wrapper functions
- Implemented 3 agent scripts: claude, aider, openclaw
- Added CodeSandbox to manifest.json clouds
- Created matrix entries (3 implemented, 12 missing)
- Updated test/record.sh to list as non-recordable CLI cloud
- Added codesandbox/README.md with usage instructions

The implementation follows the existing pattern from e2b and modal,
using Node.js SDK (@codesandbox/sdk) for sandbox lifecycle management.

Agent: cloud-scout

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:09:31 -08:00
A
c7bbe8bc3b
refactor: extract generic _live_create_delete_cycle in test/record.sh (#818)
The 5 per-cloud live recording functions (_live_hetzner, _live_digitalocean,
_live_vultr, _live_linode, _live_civo) each duplicated 50-65 lines of
identical create->save->extract-id->delete->save logic. Extract a generic
_live_create_delete_cycle helper that handles the shared flow, with per-cloud
body builder functions providing only the cloud-specific parts.

Reduces test/record.sh by 112 lines (1016 -> 904) while preserving all
behavior including cloud-specific delete delays and empty-response fallbacks.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:51 -08:00
A
cb1005ab31
refactor: extract helpers from run_script_test and run_shellcheck in test/run.sh (#776)
Split run_script_test (61 lines -> 25 lines) into focused helpers:
- _assert_sprite_common_commands: standard command lifecycle assertions
- _assert_agent_specific: per-agent install assertions
- _assert_no_temp_leaks: temp file cleanup check

Split run_shellcheck (57 lines -> 12 lines) into:
- _discover_shell_scripts: dynamic script discovery across cloud dirs
- _run_shellcheck_on_scripts: per-script shellcheck execution and reporting

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-12 17:19:32 -08:00
A
4b0d25ca39
fix: prevent Python code injection via unescaped variables in inline Python (#771)
Use sys.argv to pass shell values to inline Python instead of direct
string interpolation, preventing single-quote injection attacks across
cloud lib common.sh files and test/record.sh.

Also fix eval injection in test/record.sh try_load_config() by replacing
eval of Python-generated export statements with safe tab-separated
parsing and direct variable assignment.
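
The eval-free loading described above might look roughly like this. It assumes the generator emits one `NAME<TAB>value` pair per line; the function name `load_tsv_exports` and the sample data are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical loader: reads "NAME<TAB>value" lines from stdin and exports them
# without ever passing the input through eval.
load_tsv_exports() {
    local LC_ALL=C
    local name value
    local pattern='^[A-Z_][A-Z0-9_]*$'
    while IFS=$'\t' read -r name value; do
        # Skip anything that is not a plain variable name.
        [[ "$name" =~ $pattern ]] || continue
        export "${name}=${value}"
    done
}

# Simulated generator output; the second value would be dangerous under eval.
load_tsv_exports < <(printf '%s\t%s\n' \
    "DEMO_TOKEN" "abc123" \
    "DEMO_NOTE" '$(echo pwned); `id`')

echo "DEMO_TOKEN=${DEMO_TOKEN}"
echo "DEMO_NOTE=${DEMO_NOTE}"
```

Because the value is only ever assigned, never executed, shell metacharacters in it stay literal.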

Fixes #759
Fixes #760

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 16:47:13 -08:00
A
5a1037d92c
fix: replace ((var++)) with var=$((var + 1)) for macOS bash 3.x compat (#769)
((var++)) returns exit code 1 when the variable is 0 (falsy), which
causes set -e to terminate the script. Replace all instances with
the safe var=$((var + 1)) pattern in sprite/lib/common.sh and
test/run.sh.
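
The difference between the two forms can be reproduced with a short sketch. Each case runs in a child bash process so that an abort under `set -e` does not end the demonstration itself:

```shell
#!/usr/bin/env bash
# Post-increment evaluates to the OLD value (0), so the arithmetic command
# exits with status 1 and `set -e` stops the child before the echo.
unsafe_status=0
bash -c 'set -e; count=0; ((count++)); echo "count=$count"' || unsafe_status=$?

# A plain assignment with arithmetic expansion exits 0, so the echo runs.
safe_status=0
bash -c 'set -e; count=0; count=$((count + 1)); echo "count=$count"' || safe_status=$?

echo "unsafe exit status: ${unsafe_status}"
echo "safe exit status: ${safe_status}"
```

Only the second child prints its counter; the first exits silently, which is why the failure is easy to miss.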

Fixes #762

Agent: community-coordinator

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 16:45:51 -08:00
A
4e33cc39cd
fix: address medium security findings from #753 (#755)
- Replace `echo -e` with `printf` in cli/install.sh for macOS bash 3.x compat
- Remove `-u` (nounset) from test/run.sh — use `${VAR:-}` pattern instead
- Replace `source <(curl ...)` with `eval "$(curl ...)"` in test/run.sh for curl|bash compat
- Add .gitignore patterns for sensitive files (.env, *.pem, *.key, credentials)

Refs #753

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:48:52 -08:00
A
cec1806128
refactor: improve readability of config setup and shellcheck discovery (#744)
- Replace hardcoded 4-cloud script list in run_shellcheck with dynamic
  discovery that covers all 21 clouds automatically
- Convert 3 inline JSON templates (setup_claude_code_config,
  setup_openclaw_config, setup_continue_config) from single-line printf
  to readable heredocs while preserving json_escape security

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-12 15:19:11 -08:00
A
35997c8ae5
refactor: extract helpers from run_test() in test/mock.sh (#713)
Break down the 150-line run_test() function into focused helpers:
- run_script_with_timeout(): script execution with env vars and timeout
- show_failure_output(): display last 20 lines on failure
- assert_error_scenario(): handle error scenario assertions
- assert_cloud_api_calls(): cloud-specific API call assertions
- record_test_result(): write pass/fail to RESULTS_FILE

run_test() is now 57 lines (62% reduction), each helper is under 35 lines.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:01:49 -08:00
A
ea943d1583
refactor: decompose 287-line setup_mock_curl into named helpers (#718)
The mock curl heredoc script was a monolithic 287-line function with
inline arg parsing, error injection, URL routing, body validation,
fixture lookup, and state tracking all in one flow.

Extract 10 focused helper functions within the heredoc:
- _parse_args: curl argument parsing
- _maybe_inject_error: MOCK_ERROR_SCENARIO handling
- _handle_special_urls: install scripts, OpenRouter, spawn repo
- _strip_api_base: URL-to-endpoint mapping for 14 cloud APIs
- _check_fields / _validate_body: POST body validation
- _try_fixture: fixture file lookup
- _synthetic_active_response: cloud-specific GET-by-ID responses
- _respond_get / _respond_post: METHOD-based response routing
- _track_state: creation/deletion state tracking

The main logic is now a 26-line sequence of named function calls,
making the mock's control flow immediately readable.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:01:41 -08:00
A
9a851b36b6
refactor: extract assert_equals/assert_match helpers in test/run.sh (#727)
Replace 36 inline if/else assertion blocks across 9 test functions with
calls to two new reusable helpers (assert_equals, assert_match). Reduces
test/run.sh by 126 lines (794 -> 668) while keeping all 79 tests passing.
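
For illustration, reusable assertion helpers of this kind could be sketched as below. The argument order, counters, and output format are assumptions and may not match the helpers in test/run.sh:

```shell
#!/usr/bin/env bash
PASSED=0
FAILED=0

# Assumed signature: assert_equals <actual> <expected> <message>
assert_equals() {
    if [ "$1" = "$2" ]; then
        PASSED=$((PASSED + 1))
        echo "  ok: $3"
    else
        FAILED=$((FAILED + 1))
        echo "  FAIL: $3 (expected '$2', got '$1')"
    fi
}

# Assumed signature: assert_match <actual> <extended-regex> <message>
assert_match() {
    if printf '%s\n' "$1" | grep -Eq -- "$2"; then
        PASSED=$((PASSED + 1))
        echo "  ok: $3"
    else
        FAILED=$((FAILED + 1))
        echo "  FAIL: $3 ('$1' does not match /$2/)"
    fi
}

assert_equals "$((2 + 2))" "4" "arithmetic works"
assert_match "spawn_config_tmp.Ab3" '^spawn_config' "temp name has expected prefix"
echo "passed=${PASSED} failed=${FAILED}"
```

Each call replaces one inline if/else block, which is where the line savings in the commit would come from.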

Key functions reduced:
- _test_open_browser: 53 -> 36 lines (-32%)
- _test_ssh_key_utils: 48 -> 26 lines (-46%)
- _test_cloud_init: 41 -> 22 lines (-46%)
- _test_oauth_functions: 39 -> 23 lines (-41%)
- _test_ssh_wait: 33 -> 21 lines (-36%)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:00:59 -08:00
A
d5d7da0833
refactor: decompose setup_mock_agents and record_cloud into helpers (#722)
- Extract _create_logging_mock and _create_silent_mock from setup_mock_agents
  (test/mock.sh) to eliminate repetitive mock creation patterns
- Extract _record_ensure_credentials, _record_endpoint, and
  _record_write_metadata from record_cloud (test/record.sh) to separate
  credential checking, API recording, and metadata writing concerns

Pure refactoring — no behavior changes.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:00:56 -08:00
Ahmed Abushagur
1ad2371a25
feat: qa bot and emails (#565)
2026-02-11 20:19:45 -08:00
A
385a8a9b56
refactor: split 3 large test functions in test/run.sh into focused units (#544)
- _test_browser_and_cloud_init (94 lines) -> _test_open_browser (54) + _test_cloud_init (42)
- test_common_source (87 lines) -> _test_sprite_functions_and_syntax + _test_sprite_log_and_name + _test_sprite_remote_source
- _test_json_ssh_utils (59 lines) -> _test_json_escape + _test_ssh_key_utils (49)

All 75 tests pass. No behavioral changes.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 15:34:06 -08:00
A
7ba3559773
refactor: extract helpers from test_shared_common to reduce complexity (#511)
Break the 415-line test_shared_common() function in test/run.sh into
7 focused sub-functions grouped by feature:
- _test_model_validation (validate_model_id tests)
- _test_json_ssh_utils (json_escape, SSH key ops)
- _test_syntax_and_logging (syntax check, logging functions)
- _test_browser_and_cloud_init (open_browser, cloud-init, connectivity)
- _test_oauth_functions (wait_for_oauth_code, cleanup_oauth_session)
- _test_ssh_wait (generic_ssh_wait success/failure)
- _test_input_and_server_validation (safe_read, validate_server_name)

Also add assert_common_succeeds and assert_common_fails helpers to
eliminate repeated test boilerplate for simple pass/fail assertions.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-11 12:53:40 -08:00
A
88fa9e48e6
fix: prevent shell/Python injection in env var and credential handling (#443)
- binarylane/continue.sh: Replace unsafe inline echo with inject_env_vars_ssh
  to prevent command injection if OPENROUTER_API_KEY contains single quotes
- test/record.sh: Pass credential values via sys.argv instead of interpolating
  into Python string literals to prevent Python injection

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 04:50:34 -08:00
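
Illustrative sketch for #443 above: the test/record.sh fix passes credential values through sys.argv rather than pasting them into Python source. The function below shows that general technique only. The name write_credential_json and the JSON output are hypothetical and not taken from the repository.

```shell
# Hypothetical helper, not the code in test/record.sh.
# The Python program is a fixed single-quoted string, so the shell never
# interpolates anything into it. The credential arrives as sys.argv[2] and
# is treated purely as data, whatever quotes or code it contains.
write_credential_json() {
    local name="$1" value="$2"
    python3 -c 'import json, sys; print(json.dumps({sys.argv[1]: sys.argv[2]}))' \
        "${name}" "${value}"
}
```

The unsafe form this replaces would be something like `python3 -c "print('${value}')"`, where a value containing a single quote ends the string literal and the rest is executed as Python.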
A
1576577ed8
feat: Add RamNode cloud provider with OpenStack API support (#408)
Add RamNode budget VPS cloud provider ($0.006/hr) with full OpenStack API integration.

Implementation:
- ramnode/lib/common.sh: OpenStack Keystone v3 auth + Compute API wrapper
- ramnode/claude.sh, ramnode/aider.sh, ramnode/goose.sh: 3 agent scripts
- manifest.json: Added ramnode cloud entry + 15 matrix entries (3 implemented)
- ramnode/README.md: Complete documentation
- test/record.sh: Live cycle testing for RamNode (_live_ramnode function)
- test/mock.sh: URL stripping for Identity/Compute/Network APIs

Technical details:
- Auth: RAMNODE_USERNAME + RAMNODE_PASSWORD + RAMNODE_PROJECT_ID
- APIs: Identity (5000/v3), Compute (8774/v2.1), Network (9696/v2.0)
- Token-based authentication (X-Auth-Token header)
- Server provisioning with cloud-init via base64-encoded userdata
- SSH key management via OpenStack keypairs API

Agent: cloud-scout-1

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-02-11 01:36:02 -08:00
A
5181f28704
feat: Add local cloud provider for running agents on local machine (#381)
Adds a "local" cloud provider that installs and runs agents directly on the
user's machine without any cloud provisioning. This is useful for local
development and testing.

- local/lib/common.sh: Cloud lib with local execution functions
- local/claude.sh: Claude Code agent script
- local/openclaw.sh: OpenClaw agent script
- local/nanoclaw.sh: NanoClaw agent script
- manifest.json: Added local cloud + matrix entries
- test/: Updated record.sh and mock.sh for local cloud support

Fixes #378

Agent: issue-fixer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 00:50:05 -08:00
Ahmed Abushagur
8b9f9a0e5a
QA-Bot setup (#335)
* feat: testing

* feat: auto-fix dead apis

* fix: mock works

* feat: new fixtures

* fix: more clouds tested

* fix: dry run fix

* fix: civo valid size

* fix: civo result wait

* feat: fixtures

* feat: per cloud agent
2026-02-10 19:51:07 -08:00
Sprite
b2e2462f0d
fix: Poll for sprite provisioning instead of blind sleep
ensure_sprite_exists() now polls `sprite list` until the sprite
appears (up to 30s) instead of a fixed sleep. This eliminates the
spurious "sprite not found" errors that appeared while the sprite
was still provisioning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 00:52:35 +00:00
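
Illustrative sketch for b2e2462f0d above: the commit replaces a fixed sleep with polling up to a deadline. The generic helper below shows that pattern. The name wait_until and its signature are assumptions, and the real ensure_sprite_exists() may be structured differently.

```shell
# Hypothetical generic poller, not the code in the sprite library.
# wait_until TIMEOUT_SECS INTERVAL_SECS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds (returns 0) or roughly
# TIMEOUT_SECS have been spent sleeping (returns 1).
# Both durations must be whole numbers of seconds.
wait_until() {
    local timeout="$1" interval="$2"
    shift 2
    local elapsed=0
    while ! "$@"; do
        if [ "${elapsed}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep "${interval}"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

A caller matching the commit's description might pass a check function that greps the output of `sprite list` for the sprite name, with a 30-second timeout. The exact output format of `sprite list` is not given in the log, so that check is left out here.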
A
bbbe815035
refactor: Security fixes, complexity reduction, and UX improvements (#58)
Security:
- Fix command injection in modal/lib/common.sh (run_server, upload_file, interactive_session)
- Fix command injection in fly/lib/common.sh (run_server, upload_file, interactive_session)
- All container providers now use printf '%q' for proper shell escaping

Complexity:
- Extract _api_should_retry_on_error() helper in shared/common.sh (-19 lines)
- Refactor scaleway_api and upcloud_api to use shared retry helper (-24 lines)
- Extract _save_fly_token() helper in fly/lib/common.sh (-11 lines)
- Extract validateAndGetAgent() in commands.ts, reducing cmdRun/cmdAgentInfo duplication
- Refactor cmdList column width calculation to use calculateColumnWidth()

UX:
- Add actionable next steps to error messages in shared/common.sh
- Improve CLI bash fallback error messages with guidance (spawn.sh)
- Add OAuth progress indicator during browser authentication wait
- Show invalid model ID value and link to openrouter.ai/models
- Add troubleshooting steps for agent installation failures

Tests:
- Update test assertions in test/run.sh to match refactored patterns
- All tests passing: 74 TypeScript + 75 bash = 149 total, 0 failures

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 17:09:27 -08:00
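
Illustrative sketch for #58 above: the security fix quotes arguments with printf '%q' before they are embedded in a command string for a remote shell. The helper below demonstrates that technique in isolation. The name build_remote_cmd is hypothetical and does not appear in modal/lib/common.sh or fly/lib/common.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper; requires bash, since %q is a bash printf extension
# and its output is meant to be parsed by bash on the receiving side.
# build_remote_cmd ARG...
# Prints a single string in which every argument is shell-escaped, so that
# evaluating the string reproduces the original argument list exactly.
build_remote_cmd() {
    local quoted
    printf -v quoted '%q ' "$@"
    printf '%s' "${quoted% }"   # drop the one trailing separator space
}
```

A value such as `key'; touch /tmp/pwned; echo '` then reaches the remote command as one literal argument instead of terminating a quoted string and running `touch`.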
Sprite
3ae83aa867
fix: Fix 4 failing claude.sh tests
Root causes:
- `clear` command fails with exit 1 when TERM is not set (test env has
  no terminal), crashing the script due to set -e. Guard with || true.
- Test patterns for Claude settings/state uploads used old temp file
  naming convention (/tmp/claude_settings, /tmp/claude_global) that no
  longer matches the paths generated by upload_config_file +
  upload_file_sprite (/tmp/*settings.json, /tmp/*.claude.json).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 18:26:53 +00:00
Sprite
8302cafbd7
Remove stale tests and fix echo -e in test harness
Remove tests for deleted nc_listen and create_oauth_response_html
functions. Replace echo -e with printf for macOS bash 3.x compat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 05:13:37 +00:00
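
Illustrative sketch for 8302cafbd7 above: echo -e is handled differently across shells and shell modes, while printf behaves consistently. The logger below shows one way colored output can be written with printf alone. The names RED, NC, and log_error are assumptions and not necessarily those used in the test harness.

```shell
# Hypothetical logger, not the code in the test harness.
# The color codes are stored as backslash escapes and expanded by %b,
# so no reliance on echo -e is needed.
RED='\033[0;31m'
NC='\033[0m'

# log_error MESSAGE
# Prints MESSAGE in red. The message itself goes through %s, so any
# backslashes or percent signs in it are printed literally.
log_error() {
    printf '%b%s%b\n' "${RED}" "$1" "${NC}"
}
```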
Sprite
ce0f2ce7fb
refactor: Add default case to script-specific assertions
Added default '*) ' case to handle agents without specific assertions,
resolving SC2249 info warning and improving code clarity.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 03:56:29 +00:00