Commit graph

917 commits

Author SHA1 Message Date
A
89ffe4802e
refactor: extract mock test env config and API assertions into per-cloud fixture files (#803)
Reduces setup_env_for_cloud (84 lines -> 8 lines) and assert_cloud_api_calls
(32 lines -> 9 lines) in test/mock.sh by moving cloud-specific data into
per-cloud _env.sh and _api_assertions.sh files in test/fixtures/.

Adding a new cloud's test config now only requires creating two small files
in the fixtures directory instead of editing case branches in mock.sh.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:16:11 -08:00
A
7a441813fd
fix: detect slash notation and suggest correct syntax (#859)
When users type `spawn claude/hetzner` or `spawn hetzner/claude`,
the CLI now splits on the slash and forwards to the correct handler
with a helpful tip, instead of showing a confusing "invalid characters"
error from identifier validation.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:15:59 -08:00
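
The slash handling described in #859 above could look roughly like the following Python sketch. The CLI source is not shown in this log, so `KNOWN_AGENTS`, `KNOWN_CLOUDS`, `split_slash_notation`, `suggest`, and the tip wording are all hypothetical names and values chosen for illustration.

```python
from typing import Optional, Tuple

# Hypothetical lookup sets; a real CLI would presumably load these from its manifest.
KNOWN_AGENTS = {"claude", "aider"}
KNOWN_CLOUDS = {"hetzner", "sprite"}


def split_slash_notation(arg: str) -> Optional[Tuple[str, str]]:
    """Return (agent, cloud) if arg looks like 'agent/cloud' or 'cloud/agent'."""
    if arg.count("/") != 1:
        return None
    left, right = arg.split("/")
    if left in KNOWN_AGENTS and right in KNOWN_CLOUDS:
        return left, right
    if left in KNOWN_CLOUDS and right in KNOWN_AGENTS:
        # Accept the reversed order too and normalise it.
        return right, left
    return None


def suggest(arg: str) -> Optional[str]:
    """Build a tip for the user, or None when arg is not slash notation."""
    pair = split_slash_notation(arg)
    if pair is None:
        return None
    agent, cloud = pair
    return f"Tip: use `spawn {agent} {cloud}` (space-separated)"
```

Splitting before identifier validation is what lets the tool forward to the right handler instead of rejecting the `/` as an invalid character.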
A
de5a0c16de
refactor: extract helpers from Hetzner server type validation (#845)
Break down _validate_server_type_for_location (74 lines -> 29 lines) and
create_server (64 lines -> 43 lines) by extracting focused helpers:

- _hetzner_get_available_ids: fetch datacenter availability data
- _hetzner_find_fallback_type: search for compatible alternative types
- _hetzner_resolve_server_type: handle validation errors and fallback logging

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 02:14:30 -08:00
L
019e7b3ce5
fix: require full review protocol in every pr-reviewer prompt (#856)
The lead agent was abbreviating subsequent reviewer prompts to
"follow the same protocol as pr-reviewer-851" — but sub-agents
can't see each other's prompts. Result: only the first reviewer
got the --approve/--merge instructions, the rest defaulted to
--comment reviews that don't satisfy branch protection.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:13:01 -08:00
A
fa5b4979e8
fix: upgrade SSH to StrictHostKeyChecking=accept-new (TOFU) and randomize temp paths (#849)
- Change SSH default from StrictHostKeyChecking=no to accept-new, which
  accepts host keys on first connection but rejects if they change later
  (Trust On First Use). This protects against MITM attacks on subsequent
  connections. Requires OpenSSH 7.6+ (released Oct 2017).
- Replace predictable $$-based temp file path in upload_config_file with
  $RANDOM to prevent symlink attacks on the remote server.

Addresses findings from issue #763.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:11:47 -08:00
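
As an illustration of the two changes in #849 above, here is a small Python sketch. It is not the project's shell code: `ssh_base_args`, `random_temp_path`, and the `/tmp/spawn_config` prefix are hypothetical, and the sketch uses `secrets.token_hex` where the commit uses `$RANDOM`.

```python
import secrets
from typing import List


def ssh_base_args(host: str, user: str = "root") -> List[str]:
    """Build an ssh argv that trusts a host key on first use only."""
    return [
        "ssh",
        # accept-new: record unknown host keys, refuse keys that later change.
        "-o", "StrictHostKeyChecking=accept-new",
        f"{user}@{host}",
    ]


def random_temp_path(prefix: str = "/tmp/spawn_config") -> str:
    """Return a hard-to-guess temp path (the prefix is a hypothetical value)."""
    # Substitution made by this sketch: a random hex token instead of $RANDOM.
    return f"{prefix}_{secrets.token_hex(8)}"
```

A path derived from the process ID can be predicted by another user on the same machine, which is why the commit moves away from `$$`.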
A
0f69e2abe9
feat: implement local/plandex.sh (#854)
Add Plandex agent support for local machine execution:
- Install Plandex via official installer
- Inject OPENROUTER_API_KEY into shell config
- Support both interactive and prompt modes
- Follow local cloud provider pattern

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:10:08 -08:00
A
bfb125c028
test: add cloud lib API surface tests (#852)
Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 02:09:56 -08:00
A
05eb3e957d
feat: implement ramnode gaps for gemini, amazonq, plandex, kilocode (#855)
Implements 4 missing ramnode cloud provider integrations:
- ramnode/gemini.sh - Gemini CLI with OpenRouter support
- ramnode/amazonq.sh - Amazon Q CLI via OpenRouter
- ramnode/plandex.sh - Plandex agent with OpenRouter native support
- ramnode/kilocode.sh - Kilo Code CLI with OpenRouter provider

All scripts follow the ramnode pattern:
1. Source ramnode/lib/common.sh (OpenStack API primitives)
2. Authenticate and provision Ubuntu 24.04 server
3. Install the agent via npm/curl
4. Inject OPENROUTER_API_KEY and agent-specific env vars
5. Launch interactive session

Note: ramnode/codex.sh already existed but was marked as missing in manifest.json.
Updated manifest to mark all 5 agents as "implemented".

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 02:09:35 -08:00
A
be903f0089
feat: add CodeSandbox cloud provider (#857)
Add CodeSandbox as a new sandbox cloud provider for running AI agents.

CodeSandbox features:
- Firecracker microVMs with ~2 second start times
- SDK/CLI-based exec (no SSH)
- Free tier: 40 hours/month on Build plan
- Secure isolated environments

Implementation:
- Created codesandbox/lib/common.sh with SDK wrapper functions
- Implemented 3 agent scripts: claude, aider, openclaw
- Added CodeSandbox to manifest.json clouds
- Created matrix entries (3 implemented, 12 missing)
- Updated test/record.sh to list as non-recordable CLI cloud
- Added codesandbox/README.md with usage instructions

The implementation follows the existing pattern from e2b and modal,
using Node.js SDK (@codesandbox/sdk) for sandbox lifecycle management.

Agent: cloud-scout

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:09:31 -08:00
A
de265c543b
feat: implement local/cline.sh (#853)
Add Cline agent support on local machine. Combines local provider primitives
with Cline installation and OpenRouter credential injection.

Features:
- npm-based installation of Cline globally
- OpenRouter API key injection via OAuth or env var
- Persistent env vars in shell config (.zshrc or .bashrc)
- Sets OPENAI_BASE_URL to route through OpenRouter

Agent: gap-filler-2

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:09:26 -08:00
L
2e32f2c9fe
fix: rewrite monitoring loops to require TaskList on every iteration (#858)
* fix: require full review protocol in every pr-reviewer prompt

The lead agent was abbreviating subsequent reviewer prompts to
"follow the same protocol as pr-reviewer-851" — but sub-agents
can't see each other's prompts. Result: only the first reviewer
got the --approve/--merge instructions, the rest defaulted to
--comment reviews that don't satisfy branch protection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rewrite monitoring loops to require TaskList on every iteration

Root cause: leads were looping on sleep 5 without ever calling
TaskList — 90 consecutive sleeps, 0 TaskList calls, 0 messages
processed. Teammate messages arrive as user turns but the lead
never checked for them.

Changes:
- All monitoring loops now require TaskList on every iteration
- Added agent teams reference docs (code.claude.com/docs/en/agent-teams)
- SKILL.md: added Agent Teams section with coordination pattern,
  spawn requirements, and prompt completeness rule
- Explicit "DO NOT just loop on sleep 5" warnings with examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:59:06 -08:00
A
ebdab346df
fix: warn about missing credentials before running spawn scripts (#851)
Previously, users would run `spawn claude hetzner` without HCLOUD_TOKEN
set, the CLI would download and start executing the script, and it would
fail mid-execution after potentially provisioning resources. Now the CLI
checks for missing credentials before running and warns the user upfront.

In interactive mode, shows a confirmation prompt so the user can abort
or continue. In non-interactive mode, shows a warning without blocking.

- Add preflightCredentialCheck() that inspects cloud auth env vars
- Call it in cmdRun before script execution
- 9 tests covering all credential states (all set, partial, missing,
  multi-var, CLI-based auth, none auth)
- Version bump to 0.2.69

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:52:41 -08:00
A
b9e21bded6
refactor: use sys.argv instead of bash interpolation in Python body builders (#842)
Replace unsafe '$var' bash string interpolation inside Python code with
sys.argv parameter passing across 9 cloud provider libs. This eliminates
a class of potential injection bugs where values containing single quotes
could break the Python string context.

Affected functions:
- binarylane: _binarylane_build_server_body
- contabo: _contabo_build_instance_body
- digitalocean: _build_droplet_request_body
- hostinger: _hostinger_build_create_body
- ionos: ionos_register_ssh_key, _ionos_create_datacenter,
         _ionos_build_volume_body, _ionos_build_server_body
- linode: _linode_build_create_payload
- ovh: ovh_register_ssh_key, _ovh_find_flavor_id,
       _ovh_get_ssh_key_id, _ovh_build_instance_body
- upcloud: _build_upcloud_server_body
- vultr: _vultr_build_instance_body

This aligns with the pattern already used by cherry, scaleway, netcup,
and ramnode providers.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:45:11 -08:00
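
The pattern described in #842 above can be sketched as follows. This is an illustrative body builder, not one of the listed functions: `build_server_body` and the `name`, `region`, and `size` fields are hypothetical.

```python
import json
from typing import List


def build_server_body(argv: List[str]) -> str:
    """Build a JSON request body from positional arguments.

    Meant to be invoked from a shell roughly as
        python3 script.py "$name" "$region" "$size"
    so the values arrive as data in sys.argv and are never pasted
    into the Python source text.
    """
    name, region, size = argv[1:4]
    # json.dumps handles quoting, so a quote inside a value cannot break the body.
    return json.dumps({"name": name, "region": region, "size": size})
```

With string interpolation, a value such as `it's-a-name` would end a single-quoted Python literal early. Passed through `sys.argv`, the same value is just a string and round-trips through `json.dumps` unchanged.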
A
7b5f84141f
fix: show specific missing credentials in script failure messages (#813)
When a spawn script fails, the error message now checks which
required environment variables are actually set vs missing, instead
of generically saying "Missing or invalid credentials". This helps
users immediately see which credential they need to add.

- All set: "Credentials appear to be set (invalid or expired?)"
- Some missing: lists only the specific vars that are not set
- None set: lists all required vars

Version bump to 0.2.67.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:45:01 -08:00
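
A minimal Python sketch of the three cases listed in #813 above. The function name `credential_hint` and the "Missing credentials:" prefix are assumptions; only the "all set" message is quoted from the commit.

```python
from typing import Dict, List


def credential_hint(required: List[str], env: Dict[str, str]) -> str:
    """Describe the credential state after a script failure."""
    missing = [name for name in required if not env.get(name)]
    if not missing:
        # Everything is set, so the values themselves are suspect.
        return "Credentials appear to be set (invalid or expired?)"
    # Covers both "some missing" and "none set": list exactly the unset vars.
    return "Missing credentials: " + ", ".join(missing)
```

Because the "some missing" and "none set" cases both reduce to listing the unset variables, a single filter over the required names is enough.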
L
49bb39c8ec
fix: prevent duplicate review_all runs via reason-based dedup (#848)
Two problems:
1. Schedule was every 20 min but review_all cycles take 35 min,
   causing overlapping triggers that fill both slots
2. Trigger server only deduped by issue number, not by reason,
   so two review_all runs could stack up

Fixes:
- Change schedule from */20 to 0,45 (minutes 0 and 45 of each hour)
- Add reason-based dedup in trigger-server.ts: reject with HTTP 409 if a
  non-issue run with the same reason is already in progress

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:41:11 -08:00
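
The dedup rule from #848 above, sketched in Python rather than the TypeScript of trigger-server.ts. `TriggerState`, `try_start`, and `finish` are hypothetical names, and the 202 status for accepted runs is an assumption; only the 409 rejection comes from the commit.

```python
from typing import Optional, Set


class TriggerState:
    """Tracks in-progress runs so duplicate triggers can be rejected."""

    def __init__(self) -> None:
        self.active_issues: Set[int] = set()
        self.active_reasons: Set[str] = set()

    def try_start(self, reason: str, issue: Optional[int] = None) -> int:
        """Return an HTTP-style status: 202 if accepted, 409 if a duplicate."""
        if issue is not None:
            # Issue-triggered runs are deduplicated by issue number.
            if issue in self.active_issues:
                return 409
            self.active_issues.add(issue)
            return 202
        # Non-issue runs (for example review_all) are deduplicated by reason.
        if reason in self.active_reasons:
            return 409
        self.active_reasons.add(reason)
        return 202

    def finish(self, reason: str, issue: Optional[int] = None) -> None:
        """Mark a run as complete so the same trigger is accepted again."""
        if issue is not None:
            self.active_issues.discard(issue)
        else:
            self.active_reasons.discard(reason)
```

With this in place a second `review_all` trigger that arrives while the first is still running is refused instead of occupying another slot.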
A
d9a18b49d3
fix: show credential-aware quick start in spawn <agent> and spawn <cloud> info (#817)
Prioritize clouds with detected credentials in spawn <agent> info pages.
Skip showing export instructions for env vars already set. Show credential
status in spawn <cloud> info header and available clouds list.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:33:19 -08:00
L
5763f4dd14
fix: move branch/PR cleanup responsibility to security team (#847)
Discovery and refactor teams should not prune branches or
merge/close PRs — that's the security team's job (via
branch-cleaner agent in review_all mode).

- discovery.sh: remove Branch Cleaner agent, remove branch
  pruning and PR merge/close from cleanup_between_cycles()
  and run_team_cycle() pre-cycle cleanup
- refactor.sh: remove merged branch deletion and stale PR
  checks from pre-cycle cleanup, remove orphan branch cleanup
  from pr-maintainer role

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:27:35 -08:00
A
70060f29f5
refactor: reduce complexity in ionos/lib/common.sh (#815)
Replace inline python3 JSON extraction with shared _extract_json_field
helper and use sys.argv in body builders instead of string interpolation:

- _ionos_find_existing_datacenter: consolidate 3 python3 calls into 1
- _ionos_build_server_body: use sys.argv for name, cores, ram
- _ionos_build_volume_body: use sys.argv; remove intermediate encoding
- _ionos_create_datacenter: use sys.argv for location; use _extract_json_field
- ionos_register_ssh_key: use sys.argv for key_name and pub_key
- _ionos_wait_for_volume: use _extract_json_field for state extraction
- _ionos_wait_for_server_ip: use _extract_json_field for IP extraction
- _ionos_launch_and_attach: use _extract_json_field for server ID
- _ionos_create_boot_volume: use _extract_json_field for volume ID

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:24:44 -08:00
A
2f671d8edf
test: add 66 tests for OAuth security functions in shared/common.sh (#814)
Cover previously untested security-critical OAuth functions:
- _generate_oauth_html: HTML generation for success/error pages
- _validate_oauth_server_args: port validation + CSRF state file
- _generate_oauth_server_script: Node.js server script generation
- cleanup_oauth_session: temp resource cleanup
- exchange_oauth_code: JSON injection prevention via json_escape
- execute_agent_non_interactive: prompt escaping with printf %q
- wait_for_oauth_code: timeout behavior
- _check_oauth_prerequisites: connectivity + runtime detection
- find_node_runtime: bun/node discovery

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:24:33 -08:00
A
1725fa79d4
test: add cloud lib security convention regression tests (69 tests) (#816)
Validates that all cloud provider lib/common.sh files follow security
conventions from the security audit. Tests cover SSH key encoding
(json_escape or python json.dumps), config file permissions, Python
code injection prevention, API body JSON safety, heredoc injection
prevention, shared/common.sh sourcing, and credential handling patterns.

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:23:20 -08:00
A
8446e785cf
test: add 88 tests for OAuth flow functions in shared/common.sh (#843)
The OAuth flow is the primary authentication mechanism for spawn users,
yet its component functions had zero test coverage. This adds tests for:

- validate_oauth_port: port range validation (boundary values, injection)
- _generate_csrf_state: CSRF token generation (entropy, uniqueness)
- _generate_oauth_html: success/error HTML page generation
- _generate_oauth_server_script: Node.js callback server (CSRF, ports)
- _validate_oauth_server_args: prerequisite validation (port, state, runtime)
- _init_oauth_session: temp directory and CSRF state file creation
- cleanup_oauth_session: PID and directory cleanup
- exchange_oauth_code: OAuth code-to-key exchange with json_escape security
- check_openrouter_connectivity: network reachability fallback chain
- Integration: session lifecycle and CSRF security properties

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:22:11 -08:00
A
5fd9d19775
feat: implement ramnode/opencode.sh (#812)
Add OpenCode agent script for RamNode Cloud using OpenStack API.

- Use ramnode cloud primitives for server provisioning
- Install OpenCode via opencode_install_cmd helper
- Inject OPENROUTER_API_KEY environment variable
- Launch interactive OpenCode session via SSH

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 01:21:57 -08:00
A
6182348641
fix: show credential status in dry-run and specify missing env vars on failure (#841)
Two UX improvements:

1. `spawn <agent> <cloud> --dry-run` now shows a Credentials section that
   checks which env vars (OPENROUTER_API_KEY, cloud auth vars) are set vs
   missing, so users can verify readiness before a real run.

2. Script failure guidance (exit code 1 and default) now checks which
   specific env vars are unset instead of showing a generic "need X + Y"
   message, making it immediately clear what's missing.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:20:21 -08:00
A
b1a576a52a
test: add 51 tests for _classify_api_result and _report_api_failure (#834)
These helpers were extracted from _cloud_api_retry_loop in PR #821 to
reduce cyclomatic complexity but had zero test coverage. They are
invoked on every cloud API call across all providers:

- _classify_api_result: Classifies curl/HTTP results into retry reasons
  (network error, rate limit 429, service unavailable 503) or empty
  (success/non-retryable error). Tests cover all branches including
  curl exit codes 1/6/7/28, HTTP 429/503, success codes 200/201/204,
  non-retryable errors 400-502, and edge cases.

- _report_api_failure: Generates user-facing error messages after
  retries are exhausted. Differentiates network vs HTTP errors,
  outputs API response body only for HTTP errors. Tests cover
  retry count display, response body handling, and special chars.

Also includes integration tests verifying the classify-then-report
pipeline and realistic cloud provider scenarios (Hetzner, DigitalOcean,
DNS failures, auth errors, validation errors).

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 01:19:31 -08:00
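
The classification described in #834 above can be approximated with the Python sketch below. The real helper is a shell function; the signature, the reason strings, and the choice to treat every non-zero curl exit code as a network error are assumptions of this sketch.

```python
def classify_api_result(curl_exit: int, http_code: int) -> str:
    """Return a retry reason, or "" when the call should not be retried."""
    if curl_exit != 0:
        # curl failed before a usable response arrived
        # (for example 6 = DNS failure, 7 = connect failure, 28 = timeout).
        return "network error"
    if http_code == 429:
        return "rate limit"
    if http_code == 503:
        return "service unavailable"
    # Success codes and all other HTTP errors are not retried.
    return ""
```

Returning an empty string for both success and non-retryable errors keeps the retry loop simple: it retries only when a reason is present.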
A
087a14c276
test: add agent env injection contract tests (128 tests) (#838)
Validates the critical contract that every implemented agent script
correctly injects the environment variables from manifest.json.
Catches silent breakage where an agent starts but cannot reach the
LLM API due to missing OPENROUTER_API_KEY or provider-specific vars.

Tests cover:
- OPENROUTER_API_KEY presence in all scripts
- Provider-specific env vars (ANTHROPIC_BASE_URL, OPENAI_BASE_URL, etc.)
- OpenRouter API key acquisition patterns (env check, OAuth, manual)
- Agent install and launch command references
- Cloud lib env injection infrastructure
- Base URL values pointing to openrouter.ai
- No hardcoded API keys (security)
- Full coverage statistics across all agents and clouds

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:19:14 -08:00
L
2d4827f9ed
fix: replace sleep 15/30 polling with sleep 5 yield pattern (#844)
Long sleeps block message delivery — teammate messages arrive as
user turns between the lead's responses. Sleeping 30s means 30s
of queued messages. Use sleep 5 to yield quickly, process messages
immediately, and do useful work between polls.

Updated all 3 service scripts: security.sh, refactor.sh, discovery.sh.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:18:07 -08:00
A
813089def7
test: add 67 tests for shared/github-auth.sh (zero prior coverage) (#832)
Add comprehensive test coverage for the standalone GitHub auth helper
(shared/github-auth.sh) merged in PR #824 with no tests.

Coverage includes:
- Source pattern and function availability (9 tests)
- Fallback log functions when common.sh unavailable (3 tests)
- ensure_gh_cli: detection, installation paths, error handling (7 tests)
- _install_gh_binary: OS/arch detection, error paths, cleanup (11 tests)
- ensure_gh_auth: token auth, interactive login, post-login checks (8 tests)
- ensure_github_auth: combined wrapper success/failure (4 tests)
- Direct execution mode and set -eo pipefail (2 tests)
- Script conventions: bash 3.x compat, no echo -e, safe var access (10 tests)
- Installation path coverage: macOS/Linux/APT/DNF/Homebrew (4 tests)
- Error handling edge cases: curl failure, tar failure, auth failures (6 tests)
- GITHUB_TOKEN security: piped via printf, not CLI arg (2 tests)
- Shebang check (1 test)

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:17:57 -08:00
A
cb2a8614e9
refactor: reduce complexity in latitude and ovh cloud libs (#835)
- latitude/lib/common.sh: Replace custom 38-line wait_for_server_ready
  polling loop with generic_wait_for_instance from shared/common.sh.
  Consolidate extract_latitude_server_ip (36 lines of inline Python) into
  a single readonly expression constant. Net -59 lines.

- ovh/lib/common.sh: Replace shell variable interpolation in Python
  strings ('${var}') with sys.argv[] in _ovh_find_flavor_id,
  _ovh_get_ssh_key_id, _ovh_build_instance_body, and ovh_register_ssh_key.
  This eliminates injection surface and follows the established pattern
  used by other cloud providers.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 01:17:20 -08:00
L
ac3a8c58a5
docs: add systemd as service option alongside sprite-env in SKILL.md (#840)
For non-Sprite environments (standard Linux VMs, sandboxes), systemd
is the recommended way to run the trigger server. Added full unit
file template, management commands, and troubleshooting entries.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:09:01 -08:00
L
ac5ef953f9
fix: enforce worktrees, clarify team separation of concerns (#839)
* fix: add worktree requirement to security team prompts

PR reviewers must check out PRs in sub-worktrees before running
bash -n or bun test. Scan mode agents must also work inside a
worktree. This prevents concurrent agents from conflicting in
the main repo checkout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: enforce worktrees everywhere, refactor pr-maintainer role

- SKILL.md: expand worktree convention to cover all agent work
  (PR review, testing, audits) not just branch creation
- refactor.sh pr-maintainer: strip review/approve/merge
  responsibilities — that's the security team's job. pr-maintainer
  now focuses on rebasing conflicting PRs, addressing review
  comments, and fixing failing checks
- Remove stale PR auto-merge from pre-cycle cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove self-review from refactor team, clarify separation of concerns

Refactor team focuses on research, deep-dives, and solving problems.
Security team owns the entire PR review/approve/merge lifecycle.

- Replace "No Self-Merge Rule" with "Separation of Concerns" section
- Remove all self-review steps from issue and refactor mode workflows
- Remove needs-team-review labeling from agent instructions
- Simplify monitoring loop (no more review verification)
- Simplify lifecycle checks (verify PRs exist, not reviewed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove stale needs-team-review label from security triage reference

The refactor team no longer applies this label, so remove it
from the available labels documentation in triage mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 00:52:13 -08:00
L
c602443711
fix: remove hardcoded /home/sprite paths from service scripts (#836)
The LOG_FILE paths in security.sh and refactor.sh were hardcoded to
/home/sprite/spawn, causing permission errors on non-Sprite environments.
Use $REPO_ROOT (already computed dynamically) instead. Also update
SKILL.md examples to use dynamic path resolution.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 00:34:34 -08:00
A
9f76af00d2
fix: show credential status in quick-start sections (#823)
The quick-start sections in `spawn <cloud>` and `spawn <agent>` now show
whether required env vars are already set (green with "set" indicator)
or still need to be configured (cyan "export" instruction). This helps
users immediately see what credentials are missing before launching.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:59:57 -08:00
A
4d3c54a11e
refactor: extract helpers from execScript and _cloud_api_retry_loop (#821)
Reduce cyclomatic complexity in the two highest-scoring functions:

- cli/src/commands.ts: Extract `handleUserInterrupt` and `runWithRetries`
  from `execScript` (complexity score 6 -> 2 for execScript, retry logic
  now independently testable)

- shared/common.sh: Extract `_classify_api_result` and `_report_api_failure`
  from `_cloud_api_retry_loop` (complexity score 9 -> 4, removes duplicated
  error-classification logic from loop body)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:57:20 -08:00
A
e73d6b9793
fix: support --flag=value syntax in CLI argument parsing (#826)
Previously, `spawn --prompt="Fix bugs" claude sprite` or
`spawn list --agent=claude` would fail with "Unknown flag" because
the CLI only recognized `--flag value` (space-separated) syntax.
Now `--flag=value` is expanded to `--flag value` early in the
arg parsing pipeline, supporting the common GNU-style convention.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
2026-02-12 23:55:46 -08:00
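
The expansion step from #826 above, as a short Python sketch. The CLI itself is not shown in this log; `expand_equals_flags` is a hypothetical name.

```python
from typing import List


def expand_equals_flags(args: List[str]) -> List[str]:
    """Rewrite '--flag=value' tokens as the two tokens '--flag', 'value'."""
    out: List[str] = []
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            # Split on the first '=' only, so values may contain '='.
            flag, value = arg.split("=", 1)
            out.extend([flag, value])
        else:
            out.append(arg)
    return out
```

For example, `expand_equals_flags(["--prompt=Fix bugs", "claude", "sprite"])` returns `["--prompt", "Fix bugs", "claude", "sprite"]`. Doing this early means the rest of the parser only has to understand the space-separated form.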
A
716da5d43b
fix: auto re-exec command after CLI auto-update (fixes #780) (#830)
When a CLI auto-update triggers mid-command (e.g. `spawn claude sprite`),
the updated binary now automatically re-runs with the original arguments
instead of asking the user to manually re-run. Sets SPAWN_NO_UPDATE_CHECK=1
on re-exec to prevent infinite update loops. Falls back to the old "run
again" message when no arguments were provided (bare `spawn`).

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:54:49 -08:00
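
A Python sketch of the re-exec decision in #830 above. `plan_reexec` is a hypothetical helper; the only details taken from the commit are the `SPAWN_NO_UPDATE_CHECK=1` guard and the fallback for a bare `spawn`.

```python
from typing import Dict, List, Optional, Tuple


def plan_reexec(
    argv: List[str], env: Dict[str, str]
) -> Optional[Tuple[List[str], Dict[str, str]]]:
    """Return (argv, env) for re-running after an update, or None for a bare call."""
    if len(argv) <= 1:
        # No arguments: fall back to asking the user to run the command again.
        return None
    new_env = dict(env)
    # Guard against an update -> re-exec -> update loop.
    new_env["SPAWN_NO_UPDATE_CHECK"] = "1"
    return list(argv), new_env
```

In a Python program the returned pair could be handed to `os.execvpe(argv[0], argv, env)` to replace the current process; the environment copy keeps the caller's variables intact.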
A
c7bbe8bc3b
refactor: extract generic _live_create_delete_cycle in test/record.sh (#818)
The 5 per-cloud live recording functions (_live_hetzner, _live_digitalocean,
_live_vultr, _live_linode, _live_civo) each duplicated 50-65 lines of
identical create->save->extract-id->delete->save logic. Extract a generic
_live_create_delete_cycle helper that handles the shared flow, with per-cloud
body builder functions providing only the cloud-specific parts.

Reduces test/record.sh by 112 lines (1016 -> 904) while preserving all
behavior including cloud-specific delete delays and empty-response fallbacks.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:51 -08:00
A
317d931e87
test: add 32 tests for extract_api_error_message in shared/common.sh (#820)
This function parses JSON error responses from cloud provider APIs (used
by Hetzner, DigitalOcean, Vultr, and Contabo) and had zero test coverage.
Tests cover: field priority order, fallback behavior, realistic cloud
provider responses, and edge cases (non-object JSON, null/empty fields).

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:27 -08:00
A
fbea9303f0
test: add 48 tests for SSH key lifecycle functions (#828)
Cover ensure_ssh_key_with_provider (zero prior coverage), plus edge cases
for generate_ssh_key_if_missing, get_ssh_fingerprint, extract_ssh_key_ids,
and check_ssh_key_by_fingerprint. Tests validate the callback-based SSH
key registration flow used by all cloud providers.

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:22 -08:00
A
a96b310861
refactor: reduce complexity in Hetzner _validate_server_type_for_location (#831)
- Extract _hetzner_find_candidates helper to eliminate duplicated jq
  candidate-search logic (same-family and any-family searches were
  nearly identical 15-line blocks)
- Consolidate 3 separate jq calls for wanted_cpu/cores/memory into
  a single jq invocation
- Replace duplicated replacement-picking code with a loop over
  family strategies

Function reduced from 106 to ~72 lines (plus 17-line reusable helper).

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:17 -08:00
A
5169350feb
fix: use buildRetryCommand in spawn list footer to avoid truncated prompts (#819)
The "Rerun last" hint in `spawn list` was truncating prompts at 30
characters and appending "...", producing broken copy-paste commands.
Now delegates to the existing buildRetryCommand helper which properly
handles long prompts by suggesting --prompt-file instead of truncating.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:52:08 -08:00
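
The behaviour attributed to buildRetryCommand in #819 above might be sketched like this in Python. The length threshold `MAX_INLINE_PROMPT`, the `<file>` placeholder, and the function signature are hypothetical; the commit only states that long prompts lead to a `--prompt-file` suggestion instead of truncation.

```python
import shlex
from typing import Optional

# Hypothetical cut-off; the real helper's limit is not stated in the commit.
MAX_INLINE_PROMPT = 80


def build_retry_command(agent: str, cloud: str, prompt: Optional[str] = None) -> str:
    """Build a copy-pasteable rerun command without truncating the prompt."""
    base = f"spawn {agent} {cloud}"
    if not prompt:
        return base
    if len(prompt) > MAX_INLINE_PROMPT:
        # Too long to show inline: point at a file instead of cutting it off.
        return f"{base} --prompt-file <file>"
    return f"{base} --prompt {shlex.quote(prompt)}"
```

The key property is that the output is either a complete command or an explicit placeholder, never a prompt cut off part-way with "..." appended.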
A
3f28d5f29f
test: add 52 tests for SSH helpers and instance polling in shared/common.sh (#822)
Cover critical infrastructure functions that had zero dedicated test coverage:
- ssh_run_server, ssh_upload_file, ssh_interactive_session (SSH command construction)
- ssh_verify_connectivity (ConnectTimeout, max_attempts, test command)
- generic_ssh_wait (exponential backoff, success/failure, elapsed time logging)
- wait_for_cloud_init (argument delegation, cloud-init file check)
- generic_wait_for_instance (API polling, status matching, IP export, timeout)
- extract_api_error_message (all 5 error field patterns + fallbacks)
- SSH_USER default behavior (root fallback across all helpers)

Uses mock SSH/SCP/sleep commands via PATH override to test argument
construction and behavior without requiring network connectivity.

Agent: test-engineer

-- refactor/test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:51:46 -08:00
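
The exponential backoff that #822 above tests in generic_ssh_wait can be illustrated with a Python sketch. The parameter names and defaults are assumptions; the injectable `sleep` plays the same role as the mock sleep command the commit places on PATH.

```python
import time
from typing import Callable


def wait_with_backoff(
    check: Callable[[], bool],
    max_attempts: int = 5,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() until it succeeds, doubling the delay after each failure."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        if check():
            return True
        if attempt < max_attempts:
            # Wait before the next attempt, then double the delay up to the cap.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

Passing a fake `sleep` that records its arguments lets a test assert on the delay sequence without waiting or needing network access.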
L
88954f0e12
feat: add standalone GitHub auth helper (shared/github-auth.sh) (#824)
Standalone, sourceable script that installs the gh CLI and runs
interactive gh auth login. Any agent script on any cloud can source
it and call ensure_github_auth to get authenticated with GitHub.

- ensure_gh_cli: installs via brew/apt/dnf/binary fallback
- ensure_gh_auth: uses GITHUB_TOKEN or interactive OAuth flow
- ensure_github_auth: combined convenience wrapper
- Idempotent, macOS bash 3.x compatible, curl|bash compatible

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:37:02 -08:00
L
608104a76d
fix: set IS_SANDBOX=1 in all spawn environments (#829)
All spawn environments are disposable cloud VMs. Setting IS_SANDBOX=1
helps agents like Claude Code recognize the environment as a sandbox,
avoiding unnecessary safety prompts for root-level operations.

Added in two places for full coverage:
- generate_env_config(): included automatically in every env injection
- get_cloud_init_userdata(): set in .bashrc/.zshrc during cloud-init

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:36:36 -08:00
L
6633873ccc
refactor: replace Python with jq in Hetzner lib, fix /lab → /labs URLs (#827)
Hetzner lib: replace all Python JSON parsing with jq. Uses the
/datacenters API as the authoritative source for server type
availability (server_types.available), cross-referenced with
/server_types for specs and pricing. jq is auto-installed if missing.

URLs: update openrouter.ai/lab/spawn → openrouter.ai/labs/spawn
across all READMEs and CLI source.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:14:11 -08:00
L
720753c644
fix: validate Hetzner server type availability at selected location (#825)
When HETZNER_SERVER_TYPE and HETZNER_LOCATION are both pre-set (e.g. by
automated scripts), nothing checked that the type is actually available
at that location. ARM types like cax11 are only available in EU
datacenters, so pairing them with a US location like ash failed with an
"unsupported location for server type" error.

Now create_server validates the type against the Hetzner /datacenters
API (server_types.available) before attempting creation. If incompatible,
it auto-selects the cheapest compatible alternative (same CPU family,
>= specs) and warns the user.

Uses jq for JSON parsing (auto-installed if missing) and the Hetzner
/datacenters + /server_types APIs for authoritative availability data.
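
The availability check can be sketched with jq as below. The payloads are
trimmed samples with made-up IDs, and type_available_at is a hypothetical
helper rather than the function added in this commit. The field names
(location.name, server_types.available, server_types[].id and .name) follow
the Hetzner API responses referenced above.

```shell
#!/usr/bin/env bash
# Sample payloads, trimmed to the fields used below. The IDs are made up.
DATACENTERS_JSON='{"datacenters":[
  {"location":{"name":"fsn1"},"server_types":{"available":[1,2]}},
  {"location":{"name":"ash"},"server_types":{"available":[1]}}]}'
SERVER_TYPES_JSON='{"server_types":[
  {"id":1,"name":"cpx11"},
  {"id":2,"name":"cax11"}]}'

# Hypothetical helper: succeed if server type $1 is available at location $2.
type_available_at() {
    local type_id
    # Resolve the type name to its numeric ID via /server_types data.
    type_id="$(printf '%s' "${SERVER_TYPES_JSON}" |
        jq -r --arg name "$1" '.server_types[] | select(.name == $name) | .id')"
    [ -n "${type_id}" ] || return 1
    # jq -e exits non-zero when the final result is false or null.
    printf '%s' "${DATACENTERS_JSON}" |
        jq -e --arg loc "$2" --argjson id "${type_id}" \
            'any(.datacenters[] | select(.location.name == $loc)
                 | .server_types.available[]; . == $id)' >/dev/null
}

if type_available_at cax11 ash; then
    echo "cax11 is available at ash"
else
    echo "cax11 is not available at ash"
fi
```

A fallback search would run the same check over candidate types and keep the
cheapest one that passes. Pricing is left out here to keep the sketch short.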

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:01:37 -08:00
L
56c4c020d5
feat: consolidate security review_all and scan into single 20-min cycle (#802)
The two scheduled modes (review_all every 15 min, scan every 30 min)
competed for the single MAX_CONCURRENT=1 slot on the trigger server,
causing 429 drops and gaps of 30-55+ min. Merge both into a single cycle
that runs every 20 min, prioritizing PR review and also performing
lightweight repo scanning when capacity allows (≤5 open PRs).
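
The merged cycle's decision can be sketched as follows. plan_cycle and
SCAN_MAX_OPEN_PRS are hypothetical names: the commit only states the
threshold (≤5 open PRs), not how it is implemented.

```shell
#!/usr/bin/env bash
# Hypothetical threshold name. The value comes from the commit message.
SCAN_MAX_OPEN_PRS=5

# Print the work one cycle would do for a given open-PR count.
plan_cycle() {
    local open_prs="$1"
    echo "review_all"    # PR review always runs first
    if [ "${open_prs}" -le "${SCAN_MAX_OPEN_PRS}" ]; then
        echo "scan"      # lightweight scan only when there is spare capacity
    fi
}

plan_cycle 3
plan_cycle 8
```

With one cycle deciding both tasks, the two modes no longer race for the
single concurrency slot.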

Also prevents refactor agents from closing issues manually — issues
now auto-close via `Fixes #N` in the PR body when merged.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 20:29:56 -08:00
L
8bcdb59c09
docs: add Contributing section to README (#788)
* docs: add Contributing section to README

Adds guidance for testing cloud providers, reporting issues,
requesting new clouds/agents, and reporting auth problems.

* fix: use correct issue template URLs for cloud, agent, and CLI requests

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 18:05:47 -08:00
L
ba6b0bd98f
fix: add missing sign-off footers to all agent gh comments and PR bodies (#785)
Many gh commands in agent team prompts were missing the mandatory
`-- team/agent-name` sign-off footer, causing dedup checks to fail
and making it impossible to identify which agent posted a comment.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 17:59:02 -08:00
A
0fe83fe311
fix: improve CLI error messages for retry commands and unknown names (#777)
- buildRetryCommand: suggest --prompt-file for long prompts instead of
  truncating into a non-functional command (threshold raised to 80 chars)
- showUnknownCommandError: change "Unknown command" to "Unknown agent or cloud"
  since users are passing agent/cloud names, not commands
- Bump CLI version to 0.2.66

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-12 17:19:46 -08:00
A
cb1005ab31
refactor: extract helpers from run_script_test and run_shellcheck in test/run.sh (#776)
Split run_script_test (61 lines -> 25 lines) into focused helpers:
- _assert_sprite_common_commands: standard command lifecycle assertions
- _assert_agent_specific: per-agent install assertions
- _assert_no_temp_leaks: temp file cleanup check

Split run_shellcheck (57 lines -> 12 lines) into:
- _discover_shell_scripts: dynamic script discovery across cloud dirs
- _run_shellcheck_on_scripts: per-script shellcheck execution and reporting

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-12 17:19:32 -08:00