vrr/spawn

Mirror of https://github.com/OpenRouterTeam/spawn.git, synced 2026-04-28 20:09:34 +00:00

Author	SHA1	Message	Date
A	b0f9f4e7af	refactor(e2e): normalize unused-arg comments in headless_env functions (#3113) GCP, Sprite, and DigitalOcean had commented-out code `# local agent="$2"` in their `_headless_env` functions. Hetzner already used the cleaner style `# $2 = agent (unused but part of the interface)`. Normalize to match. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-31 03:51:07 +07:00
A	499eb494c6	fix(security): use StrictHostKeyChecking=accept-new in all SSH connections (#3037) Replace StrictHostKeyChecking=no with accept-new across all E2E cloud drivers (aws, gcp, digitalocean, hetzner), the shared SSH_BASE_OPTS constant, and pull-history.ts. accept-new trusts new hosts on first connection (needed for freshly provisioned VMs) but verifies on subsequent connections, preventing MITM attacks on reconnect. Fixes #3031 Agent: style-reviewer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-26 18:04:40 -07:00
A	aafdb8655f	fix(security): pipe encoded commands via stdin in GCP/AWS exec functions (#3036) Replace shell interpolation of base64-encoded commands in SSH invocations with stdin piping. Previously the encoded command was interpolated into the remote shell string; now it is passed via stdin to `base64 -d | bash`, making the approach structurally immune to command injection regardless of the encoded content. Fixes #3029 Fixes #3022 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-27 06:11:50 +07:00
A	defca448b0	fix(e2e): load GCP_ZONE from ~/.config/spawn/gcp.json in E2E driver (#3017) The GCP E2E cloud driver defaulted to us-central1-a when GCP_ZONE was not set in the environment. The QA VM stores zone config in ~/.config/spawn/gcp.json (alongside GCP_PROJECT) but _gcp_validate_env only read GCP_PROJECT from the environment; it never loaded GCP_ZONE. This caused E2E failures when us-central1-a had insufficient resources: 3 agents (openclaw, opencode, kilocode) failed with "SSH port never opened" because GCP couldn't provision instances in that zone. Fix: load both GCP_PROJECT and GCP_ZONE from the config file in _gcp_validate_env when they are not already set in the environment, matching how key-request.sh loads GCP_PROJECT for provisioning. Verified: all 3 previously failing agents now pass on europe-west1-b. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 01:27:46 +07:00
A	8fef58845c	fix(e2e): use aggressive cleanup threshold (5 min) for pre-run to prevent quota exhaustion (#2798) The pre-run stale cleanup (added in #2789) used the same 30-minute max_age as the post-run cleanup. Orphaned instances from recently-failed runs (< 30 min old) were not cleaned, causing quota exhaustion on DigitalOcean and other clouds. Pre-run cleanup now uses _CLEANUP_MAX_AGE=300 (5 min) to aggressively reclaim orphaned e2e instances before provisioning new ones. Post-run cleanup retains the 30-minute default. All 5 cloud drivers respect the override. Fixes #2793 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 11:23:55 -07:00
A	6fda75ccc8	security: validate base64 output in cloud_exec and soak.sh (defense-in-depth) (#2532) Add base64 character validation ([A-Za-z0-9+/=]) before use in SSH command strings for gcp.sh, aws.sh, and hetzner.sh cloud_exec functions -- matching the existing fix in digitalocean.sh (#2528). Also add a validated _encode_b64 helper to soak.sh and use it for all Telegram bot token encoding, preventing corrupted base64 from breaking out of single-quoted SSH command strings. Closes #2527 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 09:32:48 -04:00
A	1bddd713ea	fix: base64-encode commands in SSH exec to prevent injection (#2448) All four SSH-based cloud drivers (aws, digitalocean, gcp, hetzner) passed the command string directly as an SSH argument, which gets interpreted by the remote shell. While current callers pass trusted E2E test code, this creates a security footgun for future changes. Fix: base64-encode the command locally and decode it on the remote side before piping to bash. The encoded string contains only safe characters [A-Za-z0-9+/=], eliminating any injection vector. Stdin is preserved for callers that pipe data into cloud_exec. Closes #2432, closes #2433, closes #2434, closes #2435 Agent: complexity-hunter Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-10 13:22:33 -04:00
A	3724bb8ba4	fix: address SSH command injection risks in e2e cloud drivers (#2447) Add defense-in-depth validation across all e2e cloud driver scripts: - Validate IP addresses match IPv4 format before use in SSH commands (aws, digitalocean, gcp, hetzner) - Validate SSH username contains only safe characters (gcp) - Validate resource IDs are numeric before interpolating into API URLs (digitalocean droplet IDs, hetzner server IDs) - URL-encode app name in Hetzner API query parameter to prevent query parameter injection - Validate numeric env vars (INPUT_TEST_TIMEOUT, PROVISION_TIMEOUT, INSTALL_WAIT) that get interpolated into remote command strings Fixes #2432, #2433, #2434, #2435, #2442 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-10 12:27:47 -04:00
A	c4ae16849d	refactor: remove dead cloud_exec_long and _*_exec_long functions (#2407) The cloud_exec_long dispatcher in common.sh and all five cloud-specific _exec_long implementations (aws, digitalocean, gcp, hetzner, sprite) were defined but never called by any code in the e2e test suite. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-09 19:39:53 -07:00
A	1740274323	fix: replace base64 interpolation with stdin piping in all cloud exec_long functions (#2290) Replace unsafe pattern where base64-encoded commands were interpolated into remote command strings with secure stdin piping: command data now travels as stdin rather than as part of the command string, eliminating injection risk from shell metacharacter interpretation. Affected functions across all 5 cloud drivers: - _hetzner_exec_long - _aws_exec_long - _gcp_exec_long - _digitalocean_exec_long - _sprite_exec_long Fixes #2286 Fixes #2287 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-07 14:09:15 -05:00
A	548cfdf0b1	fix(security): apply base64 exec escaping to remaining 4 cloud drivers (#2067) PR #2064 fixed _exec_long shell injection for DigitalOcean and Sprite but missed the same bash -c '${cmd}' pattern in Hetzner, GCP, AWS, and Daytona. Apply the same base64-encoding fix to all four. Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-01 11:50:33 -08:00
Ahmed Abushagur	c1e605c884	fix(e2e): increase server sizes and install timeouts (#2014) E2E tests were failing because agent installs didn't complete within the default 120s timeout, and small VMs ran out of memory during builds. - INSTALL_WAIT: 120s → 300s (with per-cloud override via cloud_install_wait) - AWS: nano_3_0 → medium_3_0 (all agents need 4GB for reliable installs) - DigitalOcean: s-1vcpu-512mb-10gb → s-2vcpu-2gb, cap at 3 parallel - GCP: e2-medium → e2-standard-2 - Hetzner: cap at 5 parallel (primary IP limit) - Sprite: 300s install wait (slower exec than SSH) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-02-28 00:25:36 -08:00
Ahmed Abushagur	627026a26b	feat(e2e): multi-cloud test suite with cloud driver pattern (#2004) * feat(e2e): multi-cloud test suite with cloud driver pattern Scale the E2E test suite from AWS-only to all 6 infrastructure clouds (aws, hetzner, digitalocean, gcp, daytona, sprite) with parallel execution support. Architecture: - Cloud driver pattern: each cloud implements _cloudname_func() functions - load_cloud_driver() wires cloud-specific functions to generic names (cloud_exec, cloud_teardown, etc.) - Shared orchestration stays in one place, cloud details are isolated New files: - sh/e2e/e2e.sh: unified entry point with --cloud flag - sh/e2e/lib/clouds/{aws,hetzner,digitalocean,gcp,daytona,sprite}.sh Refactored: - common.sh: removed AWS constants, added load_cloud_driver() - provision.sh: cloud-agnostic via cloud_headless_env/cloud_provision_verify - verify.sh: replaced aws_ssh with cloud_exec/cloud_exec_long - teardown.sh/cleanup.sh: delegate to cloud driver functions - aws-e2e.sh: thin wrapper: exec e2e.sh --cloud aws Usage: e2e.sh --cloud aws # Single cloud e2e.sh --cloud aws --cloud hetzner # Multiple clouds in parallel e2e.sh --cloud all --parallel 3 # All clouds, 3 agents parallel Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(e2e): prevent subshell EXIT trap inheritance and single-cloud early exit - Reset EXIT trap in multi-cloud subshells to prevent LOG_DIR deletion before the main process reads log files - Use `|| true` for single-cloud run_agents_for_cloud to prevent set -e from skipping the summary on env validation failure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: default to parallel agent provisioning in e2e tests All agents within a cloud now run in parallel by default instead of sequentially. Use --sequential to restore the old behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: cap sprite parallelism, 4GB for openclaw, remove stderr suppression - Sprite: add _sprite_max_parallel (cap 2 concurrent agents) to avoid CLI rate limiting that caused all 6 agents to fail - AWS: use medium_3_0 (4GB) bundle for openclaw which needs more RAM - Input tests: remove 2>/dev/null from agent commands so failures produce visible error output instead of empty responses - Add cloud_max_parallel to driver interface, respected by e2e.sh Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use bash instead of sh for exec_long across all cloud drivers Ubuntu's /bin/sh is dash, which doesn't support bash-specific PATH sourcing from .spawnrc/.cargo/env. This caused codex and zeroclaw input tests to fail with "command not found" even though verify passed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: codex input test uses positional prompt, not -q flag codex CLI takes prompt as positional arg: `codex "PROMPT"`. The -q flag doesn't exist, causing "Usage:" error output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use codex exec -q for non-interactive input test codex requires `exec` subcommand for non-interactive mode. Plain `codex PROMPT` expects a TTY (stdin is not a terminal). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: codex exec takes no -q flag, just positional prompt Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use cx23 instead of deprecated cx22 for Hetzner e2e tests Hetzner deprecated server type cx22 (ID 104). The default now uses cx23. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-02-27 19:28:08 -08:00
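Commits #2448, #2532, and #3036 above describe three steps: base64-encode the command locally, check that the encoded text contains only [A-Za-z0-9+/=], and send it over stdin to `base64 -d | bash` instead of interpolating it into the remote command string. The driver code itself is not part of this log, so the following is only a minimal sketch of that idea. The function names `encode_cmd`, `remote_shell`, and `run_encoded` are hypothetical, and `remote_shell` runs a local `bash -c` as a stand-in for the SSH call so the sketch can be tried without a VM.

```bash
#!/usr/bin/env bash
# Sketch only: all function names here are hypothetical, not the repo's.

# Encode a command as single-line base64 and reject anything outside
# the base64 alphabet [A-Za-z0-9+/=] (also rejects an empty result).
encode_cmd() {
  encoded=$(printf '%s' "$1" | base64 | tr -d '\n')
  case "$encoded" in
    ''|*[!A-Za-z0-9+/=]*)
      echo "refusing to send: invalid base64 output" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$encoded"
}

# Stand-in for the remote side. A real driver would run something like
#   ssh <options> user@host "$1"
# here; a local bash keeps the sketch self-contained.
remote_shell() {
  bash -c "$1"
}

# The remote command string is a fixed literal. The user-supplied command
# travels only as data on stdin, so it is never parsed as part of that string.
run_encoded() {
  encoded=$(encode_cmd "$1") || return 1
  printf '%s\n' "$encoded" | remote_shell 'base64 -d | bash'
}

# Example call: metacharacters in the command reach the inner bash intact.
run_encoded 'echo "semicolons; and $(echo subshells) survive"'
```

One consequence of this sketch is that stdin is occupied by the encoded command, so a caller cannot also pipe data through `run_encoded`; the earlier interpolation approach in #2448 notes that it kept stdin free for that purpose.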

13 commits
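The cloud driver pattern introduced in #2004 (each cloud implements `_cloudname_func()` functions, and `load_cloud_driver()` wires them to generic names such as `cloud_exec` and `cloud_teardown`) can be illustrated with a small sketch. This is not the repository's implementation, which is not shown in this log: the two toy drivers and their output strings are invented for illustration, and a real loader would presumably source a per-cloud file rather than rely on functions defined inline.

```bash
#!/usr/bin/env bash
# Sketch only: toy drivers with made-up behaviour, following the
# _cloudname_func naming described in the commit message.

_aws_exec()         { echo "aws exec: $*"; }
_aws_teardown()     { echo "aws teardown: $*"; }
_hetzner_exec()     { echo "hetzner exec: $*"; }
_hetzner_teardown() { echo "hetzner teardown: $*"; }

# Bind the generic cloud_* names to one cloud's functions.
load_cloud_driver() {
  cloud=$1
  # Only plain lowercase names are accepted, since the name is used in eval.
  case "$cloud" in
    ''|*[!a-z0-9_]*)
      echo "invalid cloud name: $cloud" >&2
      return 1
      ;;
  esac
  for fn in exec teardown; do
    # Defines e.g.: cloud_exec() { _aws_exec "$@"; }
    eval "cloud_${fn}() { _${cloud}_${fn} \"\$@\"; }"
  done
}

# Shared orchestration calls only the generic names.
load_cloud_driver aws
cloud_exec uname -a
load_cloud_driver hetzner
cloud_teardown test-instance
```

The point of the indirection is that orchestration code never mentions a specific cloud; switching clouds is a single `load_cloud_driver` call.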