1045 commits

Author SHA1 Message Date
A
8e4def50a7
feat: Add Open Interpreter on HOSTKEY (#1072)
Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
2026-02-14 02:20:50 -05:00
A
eb1c7d4fd7
security: fix unsafe variable expansion in discovery.sh prompt heredocs (#1070)
Replace split heredoc + echo pattern in build_team_prompt() with a
single quoted heredoc using MATRIX_SUMMARY_PLACEHOLDER, substituted
safely via python3 (consistent with WORKTREE_BASE_PLACEHOLDER pattern).

Also fixes:
- build_single_prompt(): unquoted <<EOF with ${cloud}/${agent} replaced
  with printf '%s' for safe string insertion
- get_matrix_summary(), count_gaps(), build_single_prompt(): ${MANIFEST}
  expanded inside python3 -c strings replaced with sys.argv parameter
  passing (consistent with PR #842 security pattern)

Fixes #1067

Agent: issue-responder

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:20:20 -05:00
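
The sys.argv parameter passing described in eb1c7d4fd7 can be sketched as follows. This is a minimal illustration, not the code from discovery.sh: the helper name `count_manifest_keys` and the JSON shape are assumptions. The point it shows is that the file path travels as an argument, so it is never interpolated into the Python source text.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the repository): count top-level keys in a
# JSON file. The Python source is single-quoted, so the shell expands nothing
# inside it; the path arrives as sys.argv[1] instead of being spliced in.
count_manifest_keys() {
  local manifest="$1"
  python3 -c '
import json
import sys

with open(sys.argv[1]) as handle:
    print(len(json.load(handle)))
' "$manifest"
}
```

Because the path is data rather than code, a file name containing quotes or `$` cannot change what Python executes.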
A
2394f8af1b
feat: Add Amazon Q on CloudSigma (#1073)
Implements cloudsigma/amazonq.sh - deploys AWS's Amazon Q CLI coding
assistant on CloudSigma cloud infrastructure with OpenRouter integration.

Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:20:08 -05:00
A
ec1e99644a
feat: Add gptme on HOSTKEY (#1078)
Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:19:58 -05:00
A
bb9bd54c54
feat: add cloudsigma/goose.sh (#1068)
Implements Goose (Block's AI coding agent) on CloudSigma.
Uses CloudSigma primitives for server provisioning and
OpenRouter for inference via GOOSE_PROVIDER=openrouter.

Agent: gap-filler

Co-authored-by: OpenRouter Bot <noreply@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 02:19:22 -05:00
A
58823bcc4f
fix(ux): show credential readiness in agents list, matrix, and preflight check (#1061)
- `spawn agents` now shows "N ready" indicator when clouds have credentials
- `spawn matrix` compact view adds a "Ready" column showing credential count
- Preflight credential check gives context-specific guidance: mentions OAuth
  browser flow when only OPENROUTER_API_KEY is missing, improving clarity

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 01:55:52 -05:00
A
76778d5013
fix(security): use single-quoted heredocs with sed substitution in issue/team prompts (#1065)
Convert unquoted heredocs in refactor.sh (issue mode) and security.sh
(team_building mode) to single-quoted heredocs with sed placeholder
substitution. This prevents shell expansion of variables like
$SPAWN_ISSUE, $ISSUE_NUM, $WORKTREE_BASE inside prompt templates,
matching the existing WORKTREE_BASE_PLACEHOLDER pattern used in
refactor mode.

Fixes #1058
Fixes #1047
Fixes #1048

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 01:50:00 -05:00
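
A minimal sketch of the single-quoted heredoc plus sed placeholder substitution named in 76778d5013. The function names, the `ISSUE_NUM_PLACEHOLDER` token, and the prompt text are assumptions made for illustration; only the general pattern comes from the commit message. The sketch handles single-line values only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: escape the characters that are special in a sed
# replacement when "|" is the delimiter (backslash, ampersand, pipe).
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Hypothetical prompt builder. The heredoc delimiter is quoted ('EOF'), so the
# shell expands nothing in the template; only the placeholder is substituted.
render_issue_prompt() {
  local issue_num="$1" safe
  safe="$(escape_sed_replacement "$issue_num")"
  sed "s|ISSUE_NUM_PLACEHOLDER|${safe}|g" <<'EOF'
Work on issue #ISSUE_NUM_PLACEHOLDER.
Leave $HOME and $(whoami) untouched.
EOF
}
```

Calling `render_issue_prompt 1058` fills in the issue number and leaves `$HOME` and `$(whoami)` in the template as literal text.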
A
04e92fe727
test: add 178 tests for credential display, auth parsing, and internal helpers (#1063)
Adds comprehensive test coverage for previously untested or weakly tested
areas in commands.ts and index.ts:

- formatCredStatusLine: newly exported function with zero prior tests
- formatAuthVarLine (replicated): private helper for quick-start display
- groupByType (replicated): private helper for clouds/agents grouping
- formatCacheAge (replicated): version display cache age formatting
- parseAuthEnvVars: 14 edge cases (CloudSigma format, short strings, hyphens)
- hasCloudCredentials: 8 edge cases (empty values, multiple vars, none auth)
- getImplementedClouds/getImplementedAgents: nonexistent/empty manifest cases
- getMissingClouds: all/none/empty scenarios
- calculateColumnWidth: boundary and empty array cases
- prioritizeCloudsByCredentials/buildAgentPickerHints: empty/orphan agents
- resolveDisplayName: null manifest, unknown keys
- buildRecordLabel/buildRecordHint: null manifest, long prompts
- formatRelativeTime/formatTimestamp: all time buckets + invalid input
- getErrorMessage: null, undefined, boolean, object-without-message
- levenshtein: empty strings, symmetry, insertions/deletions
- findClosestMatch/findClosestKeyByNameOrKey: empty candidates, case sensitivity
- resolveAgentKey/resolveCloudKey: display name resolution, empty string
- checkEntity: swapped kind detection (agent as cloud, cloud as agent)
- getStatusDescription: 404 vs other HTTP codes
- isRetryableExitCode: all exit codes including empty/signal messages
- buildRetryCommand: prompt length boundary (80 vs 81 chars), quote escaping
- credentialHints: all-set, missing, custom verb
- getSignalGuidance: all signals + dashboard URL
- getScriptFailureGuidance: all exit codes + null + dashboard

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 01:49:41 -05:00
A
5b0358bcd1
refactor: extract helpers to reduce complexity in run_test and ionos create_server (#1060)
- test/mock.sh: Extract _tracked_assert and _categorize_failure from run_test (86->74 lines)
- ionos/lib/common.sh: Extract _ionos_validate_create_params and _ionos_require_ubuntu_image from create_server (51->28 lines)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 01:49:33 -05:00
A
1af9878d0a
test: add 56 tests for shared/key-request.sh credential loading (#1057)
Add comprehensive test coverage for shared/key-request.sh, which had
zero existing test coverage. Tests cover:

- get_cloud_env_vars: env var extraction from manifest (7 tests)
- _parse_cloud_auths: manifest parsing for auth specs (6 tests)
- _try_load_env_var: single var loading from JSON config (10 tests)
- _load_cloud_credentials: multi-var credential loading (5 tests)
- load_cloud_keys_from_config: full credential loader (8 tests)
- invalidate_cloud_key: config deletion with path traversal guard (11 tests)
- request_missing_cloud_keys: key server request behavior (4 tests)
- Integration: end-to-end key loading scenarios (5 tests)

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 00:54:00 -05:00
A
44918b64c4
refactor: extract helpers to reduce function complexity (#1055)
- Extract `readPromptFile` from `resolvePrompt` in index.ts (60 -> 40 lines),
  isolating prompt-file validation and reading into a standalone helper
- Extract `formatCredStatusLine` from `buildCredentialStatusLines` in
  commands.ts, replacing repetitive set/not-set formatting with a reusable
  helper
- Extract `_aliyun_validate_create_params` and `_aliyun_run_instances` from
  `create_server` in alibabacloud/lib/common.sh (69 -> 34 lines), separating
  validation, API call, and orchestration concerns

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-14 00:41:39 -05:00
A
9336998168
fix(ux): add post-session summary to 10 exec-based cloud providers (#1056)
Users on exec-based clouds (Fly, Render, Koyeb, Northflank, Railway,
Modal, Daytona, E2B, CodeSandbox, GitHub Codespaces) got no warning
when their session ended that their service was still running and
incurring charges. This adds:

- _show_exec_post_session_summary() in shared/common.sh for non-SSH
  providers that use CLI exec commands instead of direct SSH
- SPAWN_DASHBOARD_URL for all 10 exec-based clouds so users get
  actionable dashboard links
- Post-session summary calls in each cloud's interactive_session()
- 33 new tests covering the exec post-session summary feature

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 00:38:10 -05:00
A
90c610a21d
refactor: simplify signal/failure guidance and credential formatting in CLI (#1053)
Convert getSignalGuidance from switch statement to data-driven lookup
table (SIGNAL_GUIDANCE), separating signal metadata from rendering logic.
Extract optionalDashboardLine helper to deduplicate the conditional
dashboard URL spreading in getScriptFailureGuidance. Extract
formatCredentialIndicator from cmdClouds to clarify the nested ternary
credential status formatting.

All 92 script-failure-guidance tests and 216 related tests pass with
zero regressions.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 00:33:50 -05:00
A
577206bc1b
fix(ux): add post-session summary to Alibaba Cloud and GCP (#1052)
Both clouds had custom `interactive_session` functions that called
`ssh` directly, bypassing the shared `ssh_interactive_session` which
shows the post-session server-still-running warning. Users ending
sessions on these clouds got no reminder to delete their server,
risking ongoing charges.

Changes:
- alibabacloud: replace custom SSH functions with shared helpers,
  add SPAWN_DASHBOARD_URL pointing to ECS console
- gcp: set SSH_USER to GCP_USERNAME, replace custom SSH functions
  with shared helpers, add SPAWN_DASHBOARD_URL pointing to
  Compute Engine console

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-14 00:29:43 -05:00
A
aca3ca0316
test: add 237 edge-case tests for core CLI modules (#1054)
Covers edge cases in manifest.ts, security.ts, commands.ts, index.ts,
and history.ts not addressed by existing tests or open PRs #1050/#1051.

Key areas tested:
- stripDangerousKeys: deep nesting, arrays, all-dangerous-keys objects
- validatePromptFilePath: path traversal combos, all sensitive path patterns
- validatePromptFileStats: boundary at exactly 1MB, empty files, non-files
- validateIdentifier: boundary at 64/65 chars, various invalid characters
- validateScriptContent: fork bomb, destructive ops, HTML error pages
- validatePrompt: backtick detection, boundary at 10KB, injection patterns
- checkEntity: swapped agent/cloud detection, typo correction
- cmdHelp: content verification (all subcommands, flags, env vars, sections)
- expandEqualsFlags: equals-in-value, empty value, mixed arg types
- history: combined filters, case-insensitive filters, corrupted files
- parseAuthEnvVars: multi-auth, short vars, non-env-var auth strings
- buildRetryCommand: quote escaping, long prompt threshold
- isRetryableExitCode: SSH exit 255 vs other codes
- getSignalGuidance/getScriptFailureGuidance: all signal/exit code branches
- hasCloudCredentials: multi-auth partial setup
- formatRelativeTime: all time ranges including future timestamps

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-14 00:26:38 -05:00
A
a79dd226ac
fix(security): prevent API key injection in koyeb/nanoclaw.sh and harden remote temp file permissions (#1049)
koyeb/nanoclaw.sh embedded the API key directly in a run_server command
string using single quotes. If the key contained a single quote, it could
break out and enable command injection. Replaced with the safe mktemp +
upload_file pattern used by all other nanoclaw scripts.

Also added chmod 600 before mv on remote /tmp/nanoclaw_env in 8 nanoclaw
scripts to restrict permissions on the credential file during transfer.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 23:37:25 -05:00
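
The "restrict permissions before mv" idea in a79dd226ac can be sketched locally as below. This is an assumption-laden illustration: the helper name `install_secret_file` and the variable name in the file are made up, and the real scripts operate on a remote /tmp path through the project's own upload helpers, which are not reproduced here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a credential file so that it is owner-only
# before it ever appears under its final name.
install_secret_file() {
  local value="$1" dest="$2" tmp
  tmp="$(mktemp "${dest}.XXXXXX")"   # unpredictable temporary name
  chmod 600 "$tmp"                   # restrict access before writing the secret
  printf 'OPENROUTER_API_KEY=%s\n' "$value" > "$tmp"
  mv "$tmp" "$dest"                  # mv keeps the restrictive mode
}
```

Passing the value to printf as an argument also avoids embedding it in a quoted command string, which is the injection path the commit describes.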
A
b1c2ad0a0d
test: add 159 edge-case tests for commands.ts exported helpers (#1051)
Comprehensive tests covering boundary conditions and interactions
between exported helpers in commands.ts:

- parseAuthEnvVars: unusual auth formats, multi-var, edge patterns
- hasCloudCredentials: env var presence/absence, multi-var auth
- resolveAgentKey/resolveCloudKey: display name resolution, case
  sensitivity, hyphenated keys, cross-kind rejection
- getImplementedClouds/getImplementedAgents: large manifest fixtures
- getMissingClouds: partial/full/zero implementation coverage
- prioritizeCloudsByCredentials: mixed credential states, multi-var
- buildAgentPickerHints: cloud counts, singular/plural, ready status
- formatRelativeTime: future dates, extreme dates, invalid input
- formatTimestamp: epoch, invalid, empty string
- buildRetryCommand: quote escaping, long prompt threshold, newlines
- isRetryableExitCode: all code paths including edge codes
- getErrorMessage: Error, plain object, primitives, null, undefined
- getSignalGuidance: all signal branches with/without dashboard URL
- getScriptFailureGuidance: all exit codes with auth/dashboard hints
- credentialHints: multi-var auth, partial/full/no credentials
- resolveDisplayName: null manifest, unknown keys
- buildRecordLabel/buildRecordHint: null manifest, prompt truncation
- levenshtein: symmetry, empty strings, transposition
- findClosestMatch: large candidate lists, empty list, case handling
- calculateColumnWidth: empty array, single char, varying lengths

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 23:37:22 -05:00
A
cf16a8b55b
fix(test): add missing mock fixtures for Civo, Hetzner, and Scaleway (#1050)
Civo tests failed because networks.json, disk_images.json, and
correctly-named sshkeys.json fixtures were missing. Hetzner tests
failed because datacenters.json was missing (needed for server type
validation). Scaleway tests failed because SCW_DEFAULT_PROJECT_ID
was missing from env, images.json had no Ubuntu images, and
create_server.json fixture was absent.

Also adds Civo and Scaleway to mock's _synthetic_active_response
for instance polling, and fixes Scaleway account API URL stripping.

Results: 435 passed, 0 failed, 1 skipped (previously 270/165/1).

Agent: pr-maintainer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 23:37:20 -05:00
Ahmed Abushagur
56d8e50acf
fix: standardize CloudSigma auth field format in manifest (#1045)
The auth field used "and" separator instead of "+" which caused
key-request.sh to crash during QA cycle Phase 0.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:52:27 -05:00
L
0a0512652a
chore: reduce workflow cron frequencies (#1046)
- discovery: every 30 min → every 3 days
- refactor: every 5 min → hourly
- security: every 5 min → every 30 min

Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:55:40 -08:00
A
44b9a5bdff
fix(security): harden weak crypto fallbacks, key validation, and temp paths (#1039)
* fix(security): harden weak crypto fallbacks, key validation, and temp paths

- CSRF state generation: fail instead of using predictable date+$RANDOM
  fallback when openssl and /dev/urandom are unavailable (OAuth CSRF bypass)
- Kamatera password: fail instead of using predictable date-based password
  when no secure random source available
- key-server validKeyVal: enforce 8-512 char limits and ASCII-only check
  to block malformed/oversized values (Fixes #969)
- upload_config_file: use mktemp-derived randomness for remote temp paths
  instead of predictable $RANDOM (symlink attack on remote server)

Agent: security-auditor
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): update assertions for upload_config_file mktemp-derived paths

The upload_config_file function now uses mktemp-derived basenames
(spawn_config_tmp.XXX) instead of the original filename for remote temp
paths. Update test/run.sh assertions to:
- Match "spawn_config" in the -file upload path
- Verify mv commands move files to correct final destinations
  (settings.json, .claude.json)

Addresses reviewer feedback on PR #1039.

Agent: pr-maintainer
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 21:43:37 -05:00
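
The "fail instead of falling back" behaviour described in 44b9a5bdff for CSRF state generation might look roughly like this. The function name `generate_csrf_state` and the 16-byte length are assumptions; the commit message only states that openssl and /dev/urandom are the accepted sources and that a date+$RANDOM fallback is no longer used.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch: print 32 hex characters from a secure source, or fail.
# There is deliberately no date- or $RANDOM-based fallback.
generate_csrf_state() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 16
  elif [ -r /dev/urandom ]; then
    od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
    printf '\n'
  else
    printf 'error: no secure random source available\n' >&2
    return 1
  fi
}
```

A caller would treat a non-zero return as fatal rather than continuing with a guessable value.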
A
881067bf8b
test: add 50 tests for unified printQuickStart and buildDashboardHint helpers (#1042) (#1044)
Cover the unified printQuickStart function and extracted buildDashboardHint
helper from PR #1042. Tests verify:

- buildDashboardHint edge cases: empty string URL fallback, very long URLs,
  consistency across signal types, per-exit-code dashboard URL inclusion
- printQuickStart unified behavior via cmdCloudInfo: single-auth, no-agent,
  ready-to-go shortcut, non-parseable auth, none-auth
- printQuickStart unified behavior via cmdAgentInfo: credential-prioritized
  cloud ordering, no-implementations, single-cloud ready-to-go
- cmdCloudInfo agent list count display and metadata
- cmdAgentInfo cloud list count display and metadata

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 21:43:23 -05:00
A
2d2e318a7c
fix(ux): update README matrix to include Webdock and Alibaba Cloud (#1043)
The README hero line and matrix table were stale -- showing 36 clouds
and 514 combinations when the actual manifest has 38 clouds and 531
combinations. Adds missing Webdock and Alibaba Cloud columns and
updates all agent rows to reflect current implementation status.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 21:43:20 -05:00
A
dde41b1357
refactor(cli): unify quick-start printing and extract dashboard hint helper (#1042)
Merge printAgentQuickStart and printCloudQuickStart into a single
printQuickStart function, eliminating duplicated credential-checking and
auth-var-line printing logic. Extract buildDashboardHint from the
identical pattern repeated in getSignalGuidance and getScriptFailureGuidance.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 20:54:13 -05:00
L
22acb2299f
feat: Add full issue/PR thread context + compress agent prompts (#1041)
- Add mandatory "Context Gathering" step to all issue/PR-related prompts
  requiring agents to fetch complete threads (--comments + linked PRs)
  before starting work
- Compress all prompts across refactor.sh, security.sh, discovery.sh
  (~950 lines removed) while preserving all critical behavioral rules
- Eliminate duplicated boilerplate (team coordination, worktree patterns,
  monitoring loops, shutdown sequences) across all 4 scripts

Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 17:47:05 -08:00
A
676a3af917
improve(ux): show server name and billing reminder in post-session summary (#1038)
The post-session summary (shown after every SSH session ends) now:
- Displays the server name when available, so users can find it in their
  cloud dashboard (e.g., "Your server 'spawn-claude-abc' is still running")
- Adds explicit billing reminder ("Remember to delete it to avoid charges")
- Uses green (log_info) for reconnect instructions instead of yellow
  (log_warn), since reconnect info is helpful guidance, not a warning

No changes to individual cloud scripts needed -- all scripts already set
SERVER_NAME before calling interactive_session.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 20:46:46 -05:00
Ahmed Abushagur
b4abe8012f
fix(ci): propagate mock test exit code and fix broken pipe in summary (#1032)
* fix(ci): propagate mock test exit code and fix broken pipe in summary

The test workflow had three issues:
- mock.sh exit code was swallowed by tee (no pipefail), so the check
  always passed even with 165 failures
- grep|head pipe caused "write error: Broken pipe" in post summary
- Summary was noisy with 100+ individual result lines

Now uses PIPESTATUS[0] to capture the real exit code, shows a clean
results line plus collapsible failures list, and fails the check when
tests fail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): report test results without blocking PRs

Pre-existing failures (165) shouldn't block unrelated PRs. The summary
still shows pass/fail counts and a collapsible failures list so the bot
can see the results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* perf(ci): increase QA cycle frequency from daily to every 4 hours

Daily runs meant breakage could go undetected for up to 24 hours.
Every 4 hours gives 6 runs/day (00:00, 04:00, 08:00, 12:00, 16:00,
20:00 UTC) with a max 4-hour feedback loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): add missing Check results step to fail on test errors

Addresses review feedback:
- The exit code was captured via PIPESTATUS[0] into GITHUB_OUTPUT but
  no subsequent step consumed it, so the workflow always passed even
  when tests failed. Added a "Check results" step that reads the
  captured exit code and fails the job accordingly.
- Reverted QA cron schedule change (every 4 hours back to daily at
  06:00 UTC) as it was unrelated to the test exit code fix and should
  be proposed separately if desired.

Agent: pr-maintainer
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
2026-02-13 20:46:45 -05:00
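
The PIPESTATUS[0] fix described in b4abe8012f can be illustrated with a small wrapper. The name `run_with_log` is hypothetical and the workflow's actual step is not reproduced; the sketch only shows why a pipeline ending in tee hides the first command's exit code and how to recover it.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run a command, mirror its output into a log file, and
# return the command's own exit status rather than tee's.
run_with_log() {
  local log="$1"
  shift
  "$@" 2>&1 | tee "$log"
  # PIPESTATUS must be read immediately; the next command overwrites it.
  return "${PIPESTATUS[0]}"
}
```

In a workflow, the returned status would then have to be consumed by a later step, which is the gap the final "Check results" commit in this PR closes.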
A
122b59e4da
test: add 99 tests for post-session summary and SPAWN_DASHBOARD_URL convention (#1040)
Cover the _show_post_session_summary function and updated
ssh_interactive_session integration from PR #1037. Tests verify:

- Summary warns user their server is still running with IP
- Dashboard URL shown when SPAWN_DASHBOARD_URL is set
- Generic message when no dashboard URL is available
- Reconnect command uses correct SSH_USER and IP
- SSH exit code preserved through the summary display
- All 25 SSH-based cloud providers set SPAWN_DASHBOARD_URL
- SPAWN_DASHBOARD_URL uses HTTPS and is defined before usage
- Detects custom interactive_session implementations missing summary
  (alibabacloud flagged as known gap)

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 20:46:40 -05:00
A
01c91798d6
refactor(alibabacloud): simplify create_server by extracting helpers (#1035)
- Extract `_aliyun_json_list_first` helper for flat JSON lists (unlike
  `_aliyun_json_field` which handles lists of dicts)
- Extract `_aliyun_extract_instance_id` to replace inline Python parser
- Extract `_ensure_network_infrastructure` to consolidate VPC/vSwitch/SG setup
- Use `_log_diagnostic` for structured error reporting (consistent with
  patterns in shared/common.sh)

Reduces create_server from 86 to 69 lines and eliminates inline Python.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 20:12:40 -05:00
A
beceb69962
test: add 151 tests for key-server security-critical logic (#1036)
Add comprehensive test coverage for the key-server
(.claude/skills/setup-agent-team/key-server.ts), which previously had
zero tests despite containing security-critical logic:

- validKeyVal: API key validation (control chars, shell metacharacters,
  length limits) - 37 tests
- SAFE_PROVIDER_RE: path traversal prevention in provider names - 21 tests
- UUID_RE: batch ID format validation - 12 tests
- signHmac/verifyHmac: HMAC signing and verification for signed URLs - 17 tests
- isAuthed: timing-safe Bearer token auth - 9 tests
- rateCheck: rate limiting logic - 8 tests
- esc: HTML escaping for XSS prevention - 13 tests
- cleanup: data store batch expiry logic - 9 tests
- Key submission validation flow - 6 tests
- Route matching, security headers, backward compat - 19 tests

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 20:11:35 -05:00
Ahmed Abushagur
c6d0cb218e
improve: make QA bot more effective with structured failures and verification (#1034)
5 improvements to the QA cycle:

1. Fix agents now get structured failure context — categorized failures
   (exit_code, missing_api_call, missing_env, no_fixture) instead of
   raw 500-line test output, plus a passing agent for comparison

2. Fix agent changes are verified before committing — re-runs mock tests
   after the agent finishes and only commits if results actually improved,
   discarding bad fixes that would create noise PRs

3. Test results now include failure categories — mock.sh records
   cloud/agent:fail:reason instead of just cloud/agent:fail, enabling
   smarter failure routing

4. Mock curl logs NO_FIXTURE warnings when no fixture matches a GET
   request, surfacing false-confidence gaps where tests pass with
   synthetic fallback data

5. Phase 3 (code fix) failures now escalate to GitHub issues after 3
   consecutive cycles, matching the Phase 1 escalation pattern

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 20:07:54 -05:00
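
A sketch of how a consumer could split the `cloud/agent:fail:reason` records mentioned in c6d0cb218e. The function name, the output layout, and the two-field `cloud/agent:pass` form are assumptions; only the three-field failure format and the category names come from the commit message.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical parser for one result record. Fields are colon-separated; the
# first field is "cloud/agent". A missing reason is reported as "none".
parse_result_line() {
  local line="$1" combo status reason
  IFS=: read -r combo status reason <<< "$line"
  printf 'cloud=%s agent=%s status=%s reason=%s\n' \
    "${combo%%/*}" "${combo#*/}" "$status" "${reason:-none}"
}
```

Routing on the reason field (for example, sending `no_fixture` failures to a different fixer than `exit_code` failures) then becomes a simple case statement.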
A
f121b60d80
fix(ux): show post-session summary with server status and reconnect info (#1037)
After an interactive SSH session ends, users are now shown:
- A warning that their server is still running (and may incur charges)
- A link to the cloud provider's dashboard to manage/delete it
- The SSH command to reconnect

This prevents users from unknowingly leaving servers running after
exiting their agent session. Covers all 25 SSH-based cloud providers.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 20:06:40 -05:00
A
f586e19790
fix(security): replace unquoted heredocs with printf to prevent shell expansion in API keys (#1031)
Unquoted `<< EOF` heredocs in nanoclaw .env file creation cause shell
expansion of the API key value. If an API key contains `$`, backticks,
or `\`, the value is silently corrupted or could trigger command
execution. Replace with `printf '%s'` which safely writes the value
without interpretation.

Also fix unquoted variable expansion in upload_config_file's mv command
and the github-codespaces/openclaw.sh config heredoc.

Fixes 34 scripts across all cloud providers.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 19:41:10 -05:00
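
The difference between an unquoted heredoc and `printf '%s'`, as described in f586e19790, can be demonstrated in isolation. This sketch assumes the heredoc is part of a command string that a second shell parses; the key value is made up, and none of the project's scripts are reproduced.

```shell
#!/usr/bin/env bash
set -euo pipefail

# A made-up key containing characters the shell treats specially.
key='sk-$USER-`echo pwned`'

# Unsafe: the value is spliced into a script that a second shell parses, and
# the unquoted heredoc lets that shell expand $USER and run the backticks.
unsafe="$(bash -c "cat <<EOF
KEY=${key}
EOF
")"

# Safe: printf '%s' receives the value as an argument and writes it verbatim.
safe="$(printf 'KEY=%s\n' "$key")"

printf 'unsafe: %s\nsafe:   %s\n' "$unsafe" "$safe"
```

Only the safe form preserves the key byte for byte; the unsafe form both corrupts the value and executes the embedded command.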
A
e452ea8944
fix(security): validate branch names and cloud names in qa-cycle.sh (#1033)
Add validate_branch_name() and validate_cloud_name() to qa-cycle.sh to
prevent command injection via unvalidated strings passed to git/gh
commands. Cloud names parsed from test/record.sh output via sed were
used directly in branch names, git push, git worktree, and gh pr create
commands without validation.

Fixes #1028

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 19:40:14 -05:00
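
Allowlist validators in the spirit of e452ea8944 could look like the sketch below. The function names match the commit message, but the accepted character sets and length limits are assumptions, not the rules used in qa-cycle.sh.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed rule: lowercase letters, digits, and hyphens, starting with an
# alphanumeric character, at most 64 characters long.
validate_cloud_name() {
  local LC_ALL=C
  local re='^[a-z0-9][a-z0-9-]{0,63}$'
  [[ "$1" =~ $re ]]
}

# Assumed rule: alphanumerics plus ".", "_", "/", and "-"; must start with an
# alphanumeric character (so it cannot be read as an option) and must not
# contain ".." (so it cannot traverse paths).
validate_branch_name() {
  local LC_ALL=C
  local re='^[A-Za-z0-9][A-Za-z0-9._/-]{0,127}$'
  [[ "$1" =~ $re ]] || return 1
  [[ "$1" != *..* ]]
}
```

A caller would check the value before building any git or gh command line, for example `validate_cloud_name "$cloud" || exit 1`.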
A
8c052aceb8
test: add 239 tests for CloudSigma provider patterns and conventions (#1030)
Validates CloudSigma's unique architecture: region-based API URLs,
HTTP Basic Auth (email + password), drive cloning workflow, python3
JSON construction, SSRF-preventing region validation, and SSH with
'cloudsigma' user. Covers lib/common.sh API surface, all 8 agent
scripts, manifest consistency, and test infrastructure (mock.sh +
record.sh).

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 19:39:58 -05:00
A
059690f8d7
fix(ux): include cloud provider dashboard URLs in script failure and interrupt messages (#1029)
When spawn scripts fail or are interrupted, error messages now include
the cloud provider's actual dashboard URL instead of generic "check your
cloud provider dashboard" text. This helps users quickly navigate to
their provider to check server status, clean up orphaned resources, or
debug provisioning failures.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 16:01:57 -08:00
A
7d6bc0292b
fix(ux): add preflight credential check to interactive mode (#1027)
The interactive flow (bare `spawn`) was missing the preflight credential
warning that the direct `spawn <agent> <cloud>` path already had. Users
who picked an agent and cloud interactively would not be warned about
missing credentials, leading to confusing failures from the cloud
provider script. Now both paths warn about missing credentials before
launching.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:52:03 -05:00
A
334e10ead2
refactor: decompose ensure_aliyun_credentials and extract _aliyun_instance_public_ip (#1026)
Extract _aliyun_load_or_prompt_credentials and _aliyun_configure_cli from
the 68-line ensure_aliyun_credentials function, reducing it to 16 lines.
Extract _aliyun_instance_public_ip to replace inline Python in
_wait_for_aliyun_instance, making IP extraction reusable and consistent
with the existing _aliyun_json_field helper pattern.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:31:39 -05:00
A
b6a07e3c60
fix: prevent sensitive file exfiltration via --prompt-file flag (#1024)
Add path validation to --prompt-file to block reading sensitive files
(SSH keys, cloud credentials, .env files, etc.) whose contents would be
sent to remote agents. Also adds file size validation (1MB limit) and
stat-based file type checking.

Fixes #991

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:30:05 -05:00
A
b76f04cd78
fix(ux): show cloud count and credential readiness in interactive agent picker (#1025)
When users run `spawn` interactively, the agent picker now shows how many
clouds each agent supports and how many have credentials ready. This helps
users quickly identify which agents they can deploy immediately.

Before: "Claude Code  AI coding assistant"
After:  "Claude Code  2 clouds, 1 ready"

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 18:29:25 -05:00
A
46b760cf2b
test: add 364 tests for Oracle Cloud Infrastructure provider patterns (#1023)
Covers OCI CLI dependency management, VCN networking decomposition
(VCN -> IGW -> route -> security rules -> subnet), instance creation
with flex shape handling, cloud-init userdata, SSH delegation,
server destruction, availability domain handling, and all 15 agent
scripts following correct provisioning flow.

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:07:41 -05:00
A
aafe3d1ce4
fix: eliminate duplicate Loading manifest spinner in agent/cloud info (#1021)
When running `spawn claude` or `spawn hetzner`, the "Loading manifest..."
spinner appeared twice: once in showInfoOrError() and again in
cmdAgentInfo/cmdCloudInfo via validateAndGetEntity(). Pass the
pre-loaded manifest to avoid the redundant load and spinner flash.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:07:08 -05:00
A
415df93ea0
refactor: decompose latitude and contabo create_server into focused helpers (#1022)
Extract validation, error handling, and response parsing from
create_server into dedicated helpers following the pattern from PR #1016.

Latitude helpers: _latitude_validate_inputs, _latitude_check_create_error,
_latitude_extract_server_id

Contabo helpers: _contabo_validate_inputs, _contabo_check_create_error,
_contabo_extract_instance_id

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:05:18 -05:00
A
5412b9891f
fix: validate ALIYUN_IMAGE_ID and fix HOSTKEY input validation ordering (#1019)
- Add validate_resource_name check for ALIYUN_IMAGE_ID env var in
  alibabacloud create_server, consistent with other providers (Contabo,
  Webdock) that validate user-controllable image identifiers
- Move HOSTKEY location validation before the _pick_instance_preset call,
  which uses the location in an API request, so the input is validated
  before use rather than after

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:03:34 -05:00
A
41afc76537
test: add 157 tests for Alibaba Cloud provider patterns and conventions (#1020)
Alibaba Cloud was added in commit 0d9307a with zero test coverage.
This adds comprehensive tests covering:
- lib/common.sh API surface (required + provider-specific functions)
- CLI installation and credential handling
- SSH key management (DescribeKeyPairs, ImportKeyPair)
- Server lifecycle (VPC, vSwitch, SecurityGroup, RunInstances)
- Network infrastructure setup (CIDR ranges, availability zones)
- Instance polling behavior
- Security conventions (input validation, safe JSON parsing, macOS compat)
- Agent script patterns (claude.sh, codex.sh, gemini.sh)
- OpenRouter env var injection via SSH
- Manifest consistency checks

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 18:00:55 -05:00
A
f1e7939188
refactor: decompose alibabacloud create_server into focused helpers (#1018)
Extract _ensure_vpc, _ensure_vswitch, _aliyun_json_field, and
_aliyun_json_top_field from the 182-line create_server function.
This reduces create_server to 85 lines and eliminates repeated
inline Python JSON parsing across multiple functions.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 17:54:47 -05:00
A
8a5d03995b
fix: validate provider name in invalidate_cloud_key and improve key validation (#1017)
- Add regex validation (^[a-z0-9][a-z0-9._-]{0,63}$) to invalidate_cloud_key()
  in shared/key-request.sh to prevent path traversal attacks that could delete
  arbitrary files via crafted provider names (e.g., ../../etc/important)

- Improve validKeyVal() in key-server.ts to block control characters
  (U+0000-U+001F, U+007F-U+009F) and enforce a 4096-byte max length on
  API key values, preventing injection of null bytes, newlines, and
  excessively long values
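
The two checks above can be sketched as follows. This is an illustration only: the provider-name regex is the one quoted in this message (the real check lives in shell in shared/key-request.sh and is transliterated here), and `isValidProviderName` and `isValidKeyValue` are hypothetical names, not the actual functions. Any other checks the real validKeyVal() performs are not shown.

```typescript
// Hypothetical sketch: function names are assumptions; the regex, the control
// character ranges, and the 4096-byte limit come from the commit message.
const PROVIDER_NAME = /^[a-z0-9][a-z0-9._-]{0,63}$/;

// A name such as "../../etc/important" fails: it starts with "." and contains "/".
function isValidProviderName(name: string): boolean {
  return PROVIDER_NAME.test(name);
}

// C0 controls (U+0000-U+001F), DEL and C1 controls (U+007F-U+009F).
const CONTROL_CHARS = /[\u0000-\u001F\u007F-\u009F]/;
const MAX_KEY_BYTES = 4096;

function isValidKeyValue(value: string): boolean {
  if (CONTROL_CHARS.test(value)) {
    return false; // null bytes, newlines, and other control characters
  }
  // Measure the UTF-8 encoded size so the limit is in bytes, not UTF-16 code units.
  return new TextEncoder().encode(value).length <= MAX_KEY_BYTES;
}
```

Anchoring the provider regex at both ends matters: without `^` and `$`, a valid-looking substring inside a traversal path would still match.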

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 14:43:44 -08:00
A
388770126f
refactor: decompose webdock create_server and koyeb ensure_koyeb_cli into focused helpers (#1016)
webdock/lib/common.sh:
- Extract _webdock_get_public_key_ids() for SSH key ID fetching
- Extract _webdock_validate_inputs() for input validation
- Extract _webdock_handle_create_response() for response parsing and error reporting
- create_server reduced from 53 to 24 lines

koyeb/lib/common.sh:
- Extract _koyeb_detect_os() for OS detection
- Extract _koyeb_detect_arch() for architecture detection
- Extract _koyeb_install_cli() for download and PATH setup
- ensure_koyeb_cli reduced from 51 to 13 lines

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 14:22:24 -08:00
A
beec9ab8a3
fix: show signal names instead of 'code null' when scripts are killed (#1014)
When a spawn script is killed by a signal (SIGKILL, SIGTERM, SIGHUP, etc.),
Node.js returns exit code null. Previously this produced the confusing message
"Script exited with code null". Now detects the actual signal and shows
signal-specific guidance: OOM suggestions for SIGKILL, terminal reconnection
tips for SIGHUP, spot instance warnings for SIGTERM.
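
A minimal sketch of the idea, assuming the exit handler receives the `(code, signal)` pair that Node.js child processes report on exit (code is null when the process was terminated by a signal). The function name and the exact message wording are hypothetical.

```typescript
// Hypothetical sketch: the function name and message text are assumptions.
function describeExit(code: number | null, signal: string | null): string {
  if (code !== null) {
    return `Script exited with code ${code}`;
  }
  switch (signal) {
    case "SIGKILL":
      return "Script was killed (SIGKILL). The server may have run out of memory; consider a larger instance.";
    case "SIGHUP":
      return "Script was interrupted (SIGHUP). The terminal connection was lost; try reconnecting.";
    case "SIGTERM":
      return "Script was terminated (SIGTERM). The server may have been shut down, for example a reclaimed spot instance.";
    case null:
      return "Script was terminated by an unknown signal";
    default:
      return `Script was terminated by signal ${signal}`;
  }
}

// Example: a script killed by the kernel's out-of-memory handling.
console.log(describeExit(null, "SIGKILL"));
```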

Fixes #1011

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 14:12:43 -08:00
A
a260dce642
test: add 133 tests for Webdock provider patterns and conventions (#1015)
Webdock was added in PR #1001 with zero dedicated test coverage.
This adds comprehensive tests validating:
- lib/common.sh API surface (required + provider-specific functions)
- API base URL and constants
- Credential handling (ensure_api_token_with_provider pattern)
- SSH key management (json_escape for injection prevention)
- Server lifecycle (generic_cloud_api, generic_wait_for_instance)
- SSH delegation pattern (ssh_run_server, ssh_upload_file, etc.)
- Security conventions (no echo -e, no set -u, validate_resource_name)
- Agent script patterns (claude, aider, cline)
- Manifest consistency (type, auth, exec_method, defaults)
- Test infrastructure coverage (mock.sh and record.sh entries)

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 14:11:47 -08:00