spawn

vrr/spawn

Mirror of https://github.com/OpenRouterTeam/spawn.git, synced 2026-05-08 01:51:14 +00:00

Author	SHA1	Message	Date
A	122b59e4da	test: add 99 tests for post-session summary and SPAWN_DASHBOARD_URL convention (#1040 ) Cover the _show_post_session_summary function and updated ssh_interactive_session integration from PR #1037. Tests verify: - Summary warns user their server is still running with IP - Dashboard URL shown when SPAWN_DASHBOARD_URL is set - Generic message when no dashboard URL is available - Reconnect command uses correct SSH_USER and IP - SSH exit code preserved through the summary display - All 25 SSH-based cloud providers set SPAWN_DASHBOARD_URL - SPAWN_DASHBOARD_URL uses HTTPS and is defined before usage - Detects custom interactive_session implementations missing summary (alibabacloud flagged as known gap) Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 20:46:40 -05:00
A	01c91798d6	refactor(alibabacloud): simplify create_server by extracting helpers (#1035 ) - Extract `_aliyun_json_list_first` helper for flat JSON lists (unlike `_aliyun_json_field` which handles lists of dicts) - Extract `_aliyun_extract_instance_id` to replace inline Python parser - Extract `_ensure_network_infrastructure` to consolidate VPC/vSwitch/SG setup - Use `_log_diagnostic` for structured error reporting (consistent with patterns in shared/common.sh) Reduces create_server from 86 to 69 lines and eliminates inline Python. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 20:12:40 -05:00
A	beceb69962	test: add 151 tests for key-server security-critical logic (#1036 ) Add comprehensive test coverage for the key-server (.claude/skills/setup-agent-team/key-server.ts), which previously had zero tests despite containing security-critical logic: - validKeyVal: API key validation (control chars, shell metacharacters, length limits) - 37 tests - SAFE_PROVIDER_RE: path traversal prevention in provider names - 21 tests - UUID_RE: batch ID format validation - 12 tests - signHmac/verifyHmac: HMAC signing and verification for signed URLs - 17 tests - isAuthed: timing-safe Bearer token auth - 9 tests - rateCheck: rate limiting logic - 8 tests - esc: HTML escaping for XSS prevention - 13 tests - cleanup: data store batch expiry logic - 9 tests - Key submission validation flow - 6 tests - Route matching, security headers, backward compat - 19 tests Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 20:11:35 -05:00
Ahmed Abushagur	c6d0cb218e	improve: make QA bot more effective with structured failures and verification (#1034 ) 5 improvements to the QA cycle: 1. Fix agents now get structured failure context — categorized failures (exit_code, missing_api_call, missing_env, no_fixture) instead of raw 500-line test output, plus a passing agent for comparison 2. Fix agent changes are verified before committing — re-runs mock tests after the agent finishes and only commits if results actually improved, discarding bad fixes that would create noise PRs 3. Test results now include failure categories — mock.sh records cloud/agent:fail:reason instead of just cloud/agent:fail, enabling smarter failure routing 4. Mock curl logs NO_FIXTURE warnings when no fixture matches a GET request, surfacing false-confidence gaps where tests pass with synthetic fallback data 5. Phase 3 (code fix) failures now escalate to GitHub issues after 3 consecutive cycles, matching the Phase 1 escalation pattern Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 20:07:54 -05:00
A	f121b60d80	fix(ux): show post-session summary with server status and reconnect info (#1037 ) After an interactive SSH session ends, users are now shown: - A warning that their server is still running (and may incur charges) - A link to the cloud provider's dashboard to manage/delete it - The SSH command to reconnect This prevents users from unknowingly leaving servers running after exiting their agent session. Covers all 25 SSH-based cloud providers. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 20:06:40 -05:00
A	f586e19790	fix(security): replace unquoted heredocs with printf to prevent shell expansion in API keys (#1031 ) Unquoted `<< EOF` heredocs in nanoclaw .env file creation cause shell expansion of the API key value. If an API key contains `$`, backticks, or `\`, the value is silently corrupted or could trigger command execution. Replace with `printf '%s'` which safely writes the value without interpretation. Also fix unquoted variable expansion in upload_config_file's mv command and the github-codespaces/openclaw.sh config heredoc. Fixes 34 scripts across all cloud providers. Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 19:41:10 -05:00
A	e452ea8944	fix(security): validate branch names and cloud names in qa-cycle.sh (#1033 ) Add validate_branch_name() and validate_cloud_name() to qa-cycle.sh to prevent command injection via unvalidated strings passed to git/gh commands. Cloud names parsed from test/record.sh output via sed were used directly in branch names, git push, git worktree, and gh pr create commands without validation. Fixes #1028 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 19:40:14 -05:00
A	8c052aceb8	test: add 239 tests for CloudSigma provider patterns and conventions (#1030 ) Validates CloudSigma's unique architecture: region-based API URLs, HTTP Basic Auth (email + password), drive cloning workflow, python3 JSON construction, SSRF-preventing region validation, and SSH with 'cloudsigma' user. Covers lib/common.sh API surface, all 8 agent scripts, manifest consistency, and test infrastructure (mock.sh + record.sh). Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 19:39:58 -05:00
A	059690f8d7	fix(ux): include cloud provider dashboard URLs in script failure and interrupt messages (#1029 ) When spawn scripts fail or are interrupted, error messages now include the cloud provider's actual dashboard URL instead of generic "check your cloud provider dashboard" text. This helps users quickly navigate to their provider to check server status, clean up orphaned resources, or debug provisioning failures. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 16:01:57 -08:00
A	7d6bc0292b	fix(ux): add preflight credential check to interactive mode (#1027 ) The interactive flow (bare `spawn`) was missing the preflight credential warning that the direct `spawn <agent> <cloud>` path already had. Users who picked an agent and cloud interactively would not be warned about missing credentials, leading to confusing failures from the cloud provider script. Now both paths warn about missing credentials before launching. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:52:03 -05:00
A	334e10ead2	refactor: decompose ensure_aliyun_credentials and extract _aliyun_instance_public_ip (#1026 ) Extract _aliyun_load_or_prompt_credentials and _aliyun_configure_cli from the 68-line ensure_aliyun_credentials function, reducing it to 16 lines. Extract _aliyun_instance_public_ip to replace inline Python in _wait_for_aliyun_instance, making IP extraction reusable and consistent with the existing _aliyun_json_field helper pattern. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:31:39 -05:00
A	b6a07e3c60	fix: prevent sensitive file exfiltration via --prompt-file flag (#1024 ) Add path validation to --prompt-file to block reading sensitive files (SSH keys, cloud credentials, .env files, etc.) whose contents would be sent to remote agents. Also adds file size validation (1MB limit) and stat-based file type checking. Fixes #991 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:30:05 -05:00
A	b76f04cd78	fix(ux): show cloud count and credential readiness in interactive agent picker (#1025 ) When users run `spawn` interactively, the agent picker now shows how many clouds each agent supports and how many have credentials ready. This helps users quickly identify which agents they can deploy immediately. Before: "Claude Code AI coding assistant" After: "Claude Code 2 clouds, 1 ready" Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 18:29:25 -05:00
A	46b760cf2b	test: add 364 tests for Oracle Cloud Infrastructure provider patterns (#1023 ) Covers OCI CLI dependency management, VCN networking decomposition (VCN -> IGW -> route -> security rules -> subnet), instance creation with flex shape handling, cloud-init userdata, SSH delegation, server destruction, availability domain handling, and all 15 agent scripts following correct provisioning flow. Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:07:41 -05:00
A	aafe3d1ce4	fix: eliminate duplicate Loading manifest spinner in agent/cloud info (#1021 ) When running `spawn claude` or `spawn hetzner`, the "Loading manifest..." spinner appeared twice: once in showInfoOrError() and again in cmdAgentInfo/cmdCloudInfo via validateAndGetEntity(). Pass the pre-loaded manifest to avoid the redundant load and spinner flash. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:07:08 -05:00
A	415df93ea0	refactor: decompose latitude and contabo create_server into focused helpers (#1022 ) Extract validation, error handling, and response parsing from create_server into dedicated helpers following the pattern from PR #1016. Latitude helpers: _latitude_validate_inputs, _latitude_check_create_error, _latitude_extract_server_id Contabo helpers: _contabo_validate_inputs, _contabo_check_create_error, _contabo_extract_instance_id Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:05:18 -05:00
A	5412b9891f	fix: validate ALIYUN_IMAGE_ID and fix HOSTKEY input validation ordering (#1019 ) - Add validate_resource_name check for ALIYUN_IMAGE_ID env var in alibabacloud create_server, consistent with other providers (Contabo, Webdock) that validate user-controllable image identifiers - Move HOSTKEY location validation before _pick_instance_preset call, which uses the location in an API request — validates input before use rather than after Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:03:34 -05:00
A	41afc76537	test: add 157 tests for Alibaba Cloud provider patterns and conventions (#1020 ) Alibaba Cloud was added in commit `0d9307a` with zero test coverage. This adds comprehensive tests covering: - lib/common.sh API surface (required + provider-specific functions) - CLI installation and credential handling - SSH key management (DescribeKeyPairs, ImportKeyPair) - Server lifecycle (VPC, vSwitch, SecurityGroup, RunInstances) - Network infrastructure setup (CIDR ranges, availability zones) - Instance polling behavior - Security conventions (input validation, safe JSON parsing, macOS compat) - Agent script patterns (claude.sh, codex.sh, gemini.sh) - OpenRouter env var injection via SSH - Manifest consistency checks Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:00:55 -05:00
A	f1e7939188	refactor: decompose alibabacloud create_server into focused helpers (#1018 ) Extract _ensure_vpc, _ensure_vswitch, _aliyun_json_field, and _aliyun_json_top_field from the 182-line create_server function. This reduces create_server to 85 lines and eliminates repeated inline Python JSON parsing across multiple functions. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 17:54:47 -05:00
A	8a5d03995b	fix: validate provider name in invalidate_cloud_key and improve key validation (#1017 ) - Add regex validation (^[a-z0-9][a-z0-9._-]{0,63}$) to invalidate_cloud_key() in shared/key-request.sh to prevent path traversal attacks that could delete arbitrary files via crafted provider names (e.g., ../../etc/important) - Improve validKeyVal() in key-server.ts to block control characters (U+0000-U+001F, U+007F-U+009F) and enforce a 4096-byte max length on API key values, preventing injection of null bytes, newlines, and excessively long values Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 14:43:44 -08:00
A	388770126f	refactor: decompose webdock create_server and koyeb ensure_koyeb_cli into focused helpers (#1016 ) webdock/lib/common.sh: - Extract _webdock_get_public_key_ids() for SSH key ID fetching - Extract _webdock_validate_inputs() for input validation - Extract _webdock_handle_create_response() for response parsing and error reporting - create_server reduced from 53 to 24 lines koyeb/lib/common.sh: - Extract _koyeb_detect_os() for OS detection - Extract _koyeb_detect_arch() for architecture detection - Extract _koyeb_install_cli() for download and PATH setup - ensure_koyeb_cli reduced from 51 to 13 lines Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 14:22:24 -08:00
A	beec9ab8a3	fix: show signal names instead of 'code null' when scripts are killed (#1014 ) When a spawn script is killed by a signal (SIGKILL, SIGTERM, SIGHUP, etc.), Node.js returns exit code null. Previously this produced the confusing message "Script exited with code null". Now detects the actual signal and shows signal-specific guidance: OOM suggestions for SIGKILL, terminal reconnection tips for SIGHUP, spot instance warnings for SIGTERM. Fixes #1011 Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 14:12:43 -08:00
A	a260dce642	test: add 133 tests for Webdock provider patterns and conventions (#1015 ) Webdock was added in PR #1001 with zero dedicated test coverage. This adds comprehensive tests validating: - lib/common.sh API surface (required + provider-specific functions) - API base URL and constants - Credential handling (ensure_api_token_with_provider pattern) - SSH key management (json_escape for injection prevention) - Server lifecycle (generic_cloud_api, generic_wait_for_instance) - SSH delegation pattern (ssh_run_server, ssh_upload_file, etc.) - Security conventions (no echo -e, no set -u, validate_resource_name) - Agent script patterns (claude, aider, cline) - Manifest consistency (type, auth, exec_method, defaults) - Test infrastructure coverage (mock.sh and record.sh entries) Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 14:11:47 -08:00
A	0d9307a907	feat: Add Alibaba Cloud provider support (#1002 ) Adds Alibaba Cloud (Aliyun) ECS provider with 3 initial agent implementations. Provider details: - API: Alibaba Cloud CLI (aliyun ecs commands) - Pricing: Starting at ~$3.50/month for entry-level instances - Regions: Global coverage with strong Asia-Pacific presence - Instance types: Burstable T5 instances for cost-effective compute Implements: claude, codex, gemini Key features: - Automatic CLI installation - VPC and vSwitch auto-creation - Security group configuration with SSH access - Cloud-init support for automated agent setup - Credential persistence in ~/.config/spawn/alibabacloud.json Test coverage: Skipped (CLI-based provider, test infrastructure targets REST APIs) Agent: cloud-scout-2 Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 13:57:22 -08:00
A	69d08e6b1d	fix: improve CLI UX with clearer credential status and help docs (#1012 ) - Change 'auth: TOKEN' to 'needs TOKEN' with yellow highlight in spawn clouds - Always show legend footer explaining ready/needs indicators - Add --clear hint to spawn list footer - Show --version/-v and --help/-h aliases in help text - Document SPAWN_UNICODE=1 env var in help - Include HTTP status code in update fetch errors - Bump version to next patch Fixes #1010 Agent: issue-fixer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 13:53:53 -08:00
A	bd28508fd8	test: add 40 tests for decomposed shared/common.sh helpers (#1009 ) Tests cover the recently decomposed helper functions from PR #976 (cmdAgentInfo, generic_wait_for_instance) to ensure the refactored helpers maintain correct behavior. Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-13 13:42:53 -08:00
A	ea5d462f4f	refactor: decompose multi-credential config handling in test/record.sh (#1004 ) Extract _get_multi_cred_spec, _load_multi_config_from_file, and _save_multi_config_to_file helpers to eliminate duplicated per-cloud config blocks in try_load_config, save_config, has_credentials, prompt_credentials, and list_clouds. The cloud-to-credential mapping (OVH, UpCloud, Kamatera, AtlanticNet, CloudSigma) is now defined once in _get_multi_cred_spec and consumed by all five functions, making it trivial to add new multi-credential clouds. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 13:34:37 -08:00
A	b0ebaa94bb	refactor: decompose load_cloud_keys_from_config into focused helpers (#1007 ) Extract three helpers from the 82-line, 14-conditional function: - _parse_cloud_auths: extract cloud auth specs from manifest.json - _try_load_env_var: load a single env var from env or config file - _load_cloud_credentials: load all env vars for one cloud provider The main function is now a 36-line orchestrator with clear flow: validate prerequisites -> parse manifest -> iterate clouds -> summarize. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-13 13:29:53 -08:00
A	d3919cafda	test: add 55 tests for agent info quick-start display (#1005 ) Cover printAgentQuickStart (commands.ts) which has zero test coverage: - Single-auth and multi-auth cloud credential display - URL hint placement (only on first auth var) - All/partial/no credentials detection ("ready to go" vs export lines) - No-auth cloud (auth="none") handling - Agent info header, install line, available clouds listing - Credential prioritization in cloud ordering - Grouped cloud type display and credential indicators - Pure logic replica tests for quick-start computation Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 13:25:26 -08:00
A	2a66805b33	feat: Add Webdock provider support (#1001 ) Implements Webdock cloud provider with full API integration: - webdock/lib/common.sh with REST API primitives - claude.sh, cline.sh, aider.sh agent scripts - Test coverage in test/record.sh and test/mock.sh - manifest.json updated with cloud entry and matrix - README.md with usage documentation Webdock offers affordable European VPS (€2.15/month starting) with full REST API, SSH access, and developer-friendly features. Agent: cloud-scout-1 Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 13:24:06 -08:00
A	ac56e5454a	fix: improve CLI UX with better error messages and help text (#1003 ) - Show list-specific flags (-a, -c, --clear) in unknown flag error - Add specific error for empty prompt files instead of generic validation - Document SPAWN_UNICODE=1 env var in help text and troubleshooting - Show filter/clear hints in interactive list picker Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-13 13:04:48 -08:00
A	1a2cec6b81	feat: Add CloudSigma support for 6 agents (#998 ) Implements CloudSigma matrix entries for openclaw, nanoclaw, interpreter, continue, gemini, and codex. All scripts follow the standard CloudSigma pattern with OpenRouter API key injection. Agent: gap-filler Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 12:57:49 -08:00
A	d2fbd325b0	refactor: decompose fly get_server_name and oracle _setup_vcn_networking (#1000 ) - fly/lib/common.sh: Replace 23-line get_server_name() that duplicated env-var-check, prompt, and validation logic with a one-line call to the shared get_validated_server_name helper, matching all other cloud providers. - oracle/lib/common.sh: Break _setup_vcn_networking (48 lines, 3 distinct responsibilities) into focused helpers: - _create_internet_gateway: creates the IGW resource - _add_default_route: configures the route table - _add_ssh_security_rules: opens SSH port in the security list The orchestrator _setup_vcn_networking now delegates to these three helpers. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-13 12:57:11 -08:00
A	edc5993b61	feat: Add HOSTKEY support for nanoclaw, aider, goose, cline, continue (#999 ) Implements 5 missing HOSTKEY matrix entries: - hostkey/nanoclaw - hostkey/aider - hostkey/goose - hostkey/cline - hostkey/continue All scripts follow the standard pattern: 1. Authenticate with HOSTKEY 2. Create server instance 3. Install agent 4. Configure OpenRouter API key injection 5. Launch interactive session Agent: gap-filler Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 12:40:29 -08:00
A	892d53397f	fix: add --delete-branch to all gh pr close commands (#997 ) Ensures closing a PR also deletes its remote branch, consistent with how gh pr merge already uses --delete-branch. Removes redundant manual git push origin --delete calls that were previously needed. Fixes #942 Agent: pr-maintainer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 12:32:43 -08:00
A	b39a691b16	fix: validate SLACK_WEBHOOK format to prevent command injection (#996 ) SLACK_WEBHOOK was embedded directly in heredocs at three locations, allowing potential command injection if the env var contained shell metacharacters. Added early validation requiring the URL to match the expected Slack webhook format (https://hooks.slack.com/...). Also stopped leaking the full webhook URL into prompt text. Fixes #992 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-13 12:28:09 -08:00
A	583d2a63fc	test: add 38 tests validating test infrastructure stays in sync with manifest (#995 ) Validates that test/mock.sh and test/record.sh stay in sync with manifest.json. When a new cloud provider is added, CLAUDE.md mandates updating both files with endpoint mappings, auth env vars, and API dispatchers. These tests catch configuration drift automatically: - ALL_RECORDABLE_CLOUDS completeness and no duplicates - get_endpoints(), get_auth_env_var(), call_api() coverage parity - _strip_api_base() URL patterns match fixture directories - Fixture directories have required _env.sh and _metadata.json - Auth env vars in record.sh match manifest auth fields - Shell script conventions (shebang, set -eo pipefail, no echo -e) - Test infrastructure conventions (NO_COLOR, cleanup traps, counters) Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 12:21:53 -08:00
A	a0f6b335a4	fix: harden upload_file path validation with strict allowlist regex across 10 clouds (#993 ) Replace fragile blocklist validation and printf '%q' escaping in upload_file() with strict allowlist regex [a-zA-Z0-9/_.~-]+ across all non-SSH cloud providers. For codesandbox, additionally migrate from shell command interpolation to SDK filesystem API via environment variables, eliminating the injection surface entirely. Affected clouds: codesandbox, daytona, e2b, fly, koyeb, modal, northflank, railway, render, sprite Fixes #989 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 12:20:40 -08:00
A	67424c4bdc	refactor: decompose ensure_jq and ensure_gh_cli into focused helpers (#994 ) Extract platform-specific install logic from monolithic installer functions into small, focused helpers. Both functions had nested OS/package-manager cascades (depth 3-4) that made the control flow hard to follow. ensure_jq (shared/common.sh): - Extract _install_jq_brew, _install_jq_apt, _install_jq_dnf, _install_jq_apk - Extract _report_jq_not_found for the fallthrough error message - Main function becomes a clean dispatcher + verification ensure_gh_cli + _install_gh_binary (shared/github-auth.sh): - Extract _install_gh_brew, _install_gh_apt, _install_gh_dnf - Extract _detect_gh_platform, _fetch_gh_latest_version, _download_and_install_gh - _install_gh_binary drops from 71 to 12 lines as a clean orchestrator - ensure_gh_cli drops from 57 to 29 lines No behavior changes. All tests pass, bash -n passes. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 12:14:56 -08:00
Ahmed Abushagur	e720e15c9b	fix: give QA fix agents full mock test output instead of 10-line snippets (#988 ) Previously, Phase 3 fix agents only got the last 10 lines grepped from the log file per failing script. This was often insufficient to diagnose the root cause. Now runs `bash test/mock.sh {cloud}` per failing cloud and feeds the complete output to the fix agent. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:59:59 -08:00
A	fff84779dc	refactor: decompose cmdAgentInfo and generic_wait_for_instance into focused helpers (#976 ) Extract printAgentQuickStart from cmdAgentInfo (63 -> 43 lines), paralleling the existing printCloudQuickStart pattern. Extract _poll_instance_once and _report_instance_timeout from generic_wait_for_instance (52 -> 20 lines), eliminating duplicated elapsed/sleep/increment code in the polling loop. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-13 11:48:07 -08:00
Ahmed Abushagur	1d9a2dbad1	perf: run cloud tests and recordings in parallel (#982 ) Both mock.sh and record.sh now run each cloud's tests/recordings concurrently as background jobs instead of sequentially. Results are aggregated after all clouds finish. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:44:57 -08:00
Ahmed Abushagur	4e3f77f9bb	feat: track consecutive fixture recording failures and auto-escalate (#986 ) When a cloud's fixture recording fails 3+ consecutive QA cycles, the system now auto-creates a GitHub issue flagging the persistent failure. This catches stale API keys, changed endpoints, and other silent regressions that would otherwise go unnoticed. - Persistent tracker at .docs/qa-record-failures.json (git-ignored) - Counter increments on failure, resets on success - Deduplicates: skips issue creation if one already exists for that cloud Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:42:17 -08:00
Ahmed Abushagur	353f20d53a	docs: expand test infrastructure instructions for discovery bot (#987 ) The bot was under-updating test/mock.sh when adding new clouds because the prompt only mentioned URL stripping. Now lists all 4 required mock.sh functions and all 5 required record.sh functions explicitly. Also adds a "Mock Test Infrastructure" reference table to CLAUDE.md so both human contributors and bots know exactly what to update. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:41:25 -08:00
A	1d4e5b874c	fix: add defense-in-depth for SPAWN_HOME path validation and manifest JSON sanitization (#984 ) - Validate SPAWN_HOME is an absolute path, reject relative paths to prevent unintended file writes (addresses #980) - Resolve SPAWN_HOME to canonical form to collapse .. segments - Strip __proto__, constructor, and prototype keys from parsed manifest JSON to prevent prototype pollution (addresses #979) - Apply sanitization to all manifest ingestion paths (GitHub fetch, disk cache, local dev manifest) - Add 12 tests covering path validation and JSON sanitization Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 11:37:10 -08:00
A	16b9132c7c	fix: add confirmation to history clear and improve UX details (#983 ) - Add interactive confirmation prompt before clearing spawn history (spawn list --clear) to prevent accidental data loss - Show total prompt length in dry-run preview when prompt exceeds 100 characters, so users can verify the correct prompt was loaded - Add "Rerun previous" suggestion to non-interactive terminal fallback - Show "(shown first)" hint when clouds with credentials are detected in interactive picker, so users understand the sort order - Add repository URL to spawn version output for discoverability Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-13 11:31:01 -08:00
Ahmed Abushagur	d501b5eb1d	fix: CI test summary uses NO_COLOR instead of sed hack (#985 ) * fix: strip ANSI colors before grepping test summary The mock test output uses ANSI escape codes for colored ✓/✗/━━━ characters, so the grep in the Post summary step couldn't match them. Strip colors with sed first. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use NO_COLOR standard instead of sed to strip ANSI codes mock.sh now respects the NO_COLOR env var (https://no-color.org/). CI sets NO_COLOR=1 so grep matches ✓/✗/━━━ cleanly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:26:41 -08:00
Ahmed Abushagur	0ed8a29004	fix: stop QA cycle from auto-merging PRs, only create them (#981 ) The QA cycle was auto-merging stale QA PRs that were mergeable. Now it only closes stale ones — merging is left for human review. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:19:24 -08:00
A	3b4444f292	fix: show all auth vars in agent info quick start for multi-credential clouds (#975 ) The `spawn <agent>` quick start section was only showing the first auth env var when the best available cloud requires multiple credentials (e.g., UpCloud with UPCLOUD_USERNAME + UPCLOUD_PASSWORD). This left users confused about what other credentials they needed. Now iterates over all auth vars, consistent with `spawn <cloud>` info. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-13 11:18:28 -08:00
A	c7dcbaa5af	fix: validate TARGET_SCRIPT against allowlist in trigger-server (#974 ) Add startup validation for the TARGET_SCRIPT env var to prevent arbitrary script execution. The validation: - Requires .sh extension - Checks the file exists - Resolves symlinks and relative paths via realpathSync - Verifies the real path is inside the allowed skill directory Fixes #970 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 11:14:40 -08:00

1018 commits