Commit graph

882 commits

Author SHA1 Message Date
A
fba986abea
feat: add HOSTKEY cloud provider (#909)
Add HOSTKEY (https://hostkey.com/) as a new cloud provider to the spawn
matrix. HOSTKEY offers affordable VPS hosting starting from €1/month with
hourly billing, making it suitable for running AI agents that use remote
API inference.

Changes:
- Created hostkey/lib/common.sh with HOSTKEY API wrappers
- Implemented hostkey/claude.sh (Claude Code agent)
- Implemented hostkey/openclaw.sh (OpenClaw agent)
- Added HOSTKEY to manifest.json clouds section
- Added matrix entries for all 15 agents (2 implemented, 13 missing)
- Updated test/record.sh with HOSTKEY test infrastructure
- Updated test/mock.sh with HOSTKEY URL handling
- Created hostkey/README.md with usage instructions

Data centers: Amsterdam, Frankfurt, Helsinki, Reykjavik, Istanbul, New York

Agent: cloud-scout

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 05:08:26 -08:00
A
841618f1b8
feat: implement codesandbox/nanoclaw (#915)
Combines CodeSandbox SDK primitives with NanoClaw agent setup:
- Creates sandbox using CodeSandbox API
- Installs Node.js dependencies (tsx)
- Clones and builds nanoclaw from GitHub
- Injects OpenRouter API key as ANTHROPIC_API_KEY
- Configures .env file with API credentials
- Launches interactive WhatsApp QR code authentication flow

Updates manifest.json matrix status to "implemented"

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 05:08:04 -08:00
A
b2dd67a0af
refactor: extract helpers to reduce complexity in fly and netcup providers (#912)
fly/lib/common.sh:
- Extract _get_fly_cmd() to eliminate duplicated fly/flyctl CLI resolution
  across run_server, interactive_session, _try_flyctl_auth, ensure_fly_cli
- Extract _fly_parse_error() to deduplicate JSON error parsing (was inline
  in _validate_fly_token, _fly_create_app, _fly_create_machine)
- Extract _fly_build_machine_body() from _fly_create_machine (50→32 lines)
- Use shared _extract_json_field in _fly_create_machine and
  _fly_wait_for_machine_start instead of inline python3 calls

netcup/lib/common.sh:
- Extract _netcup_is_success() for repeated status=='success' checks
  (was inline python3 in create_server, destroy_server, _netcup_wait_for_ip)
- Extract _netcup_build_login_body() from netcup_get_session (51→30 lines)
- Use _extract_json_field throughout instead of inline python3 one-liners
- Net reduction: 351→335 lines (-16)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 05:07:53 -08:00
A
3babfa08ca
feat: Implement atlanticnet/nanoclaw (#919)
Agent: gap-filler-atlanticnet

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 05:07:42 -08:00
A
81bb668ee0
refactor: replace hand-rolled loops/helpers with shared utilities in cherry and ionos (#916)
Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 05:07:17 -08:00
A
6be6537f1b
feat: Add goose on CodeSandbox (#898)
Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 05:07:00 -08:00
A
3a86cecccf
feat: Add gemini on Atlantic.Net (#900)
Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 05:06:53 -08:00
A
0d4bfdeb98
refactor: replace hand-rolled loops and inline Python with shared helpers in Scaleway provider (#923)
- Replace 38-line _scaleway_power_on_and_wait polling loop with generic_wait_for_instance
- Remove _scaleway_extract_ip (IP extraction now handled by generic_wait_for_instance)
- Replace inline Python JSON building in create_server and scaleway_register_ssh_key with json_escape
- Replace inline Python error parsing with extract_api_error_message shared helper
- Replace inline Python field extraction with _extract_json_field shared helper

Net reduction: 58 lines (372 -> 315)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 05:06:44 -08:00
A
e242f1d85c
fix: use safe single-quoted env injection in cline.sh and plandex.sh (#914)
local/cline.sh and local/plandex.sh were writing API keys to shell
config using double-quoted printf format strings. If an API key
contained shell metacharacters (", $, backtick), sourcing the shell
config could execute arbitrary code.

Replace manual printf with inject_env_vars_local which uses the safe
generate_env_config helper (single-quoted values with proper escaping).

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 05:04:25 -08:00
A
e760de064a
fix: use shared setup_continue_config to prevent JSON injection in local/continue.sh (#921)
local/continue.sh used a double-quoted heredoc to write the API key
directly into ~/.continue/config.json without escaping. If the key
contained double quotes, it could produce invalid JSON or inject
additional config fields. Replace inline heredoc with the shared
setup_continue_config helper which uses json_escape.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 05:03:38 -08:00
A
150f500085
docs: Sync README matrix with manifest.json (#917)
Agent: team-lead

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 04:25:42 -08:00
A
7731306f37
test: add local cloud provider pattern tests (239 tests) (#911)
Adds comprehensive test coverage for the local cloud provider, which
runs agents directly on the user's machine without cloud provisioning.
Previously had zero dedicated tests despite 14 implemented agent scripts.

Tests cover:
- local/lib/common.sh API surface (no-op destroy, bash -c exec, cp uploads)
- All 14 local agent scripts follow local-specific patterns
- No SSH/SCP patterns leak into local scripts
- OpenRouter API key handling with OAuth fallback
- SPAWN_PROMPT handling for interactive/non-interactive modes
- Installation verification (command -v checks)
- Safety checks (no sudo, no rm -rf system dirs)
- Manifest consistency for local cloud entries

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 04:02:45 -08:00
A
fc34a640bd
feat: add Atlantic.Net cloud provider (#883)
Add Atlantic.Net Cloud as a new cloud provider with REST API support.
Starting at $4-8/mo for budget VPS instances with SSH access.

Implementation:
- Created atlanticnet/lib/common.sh with HMAC-SHA256 API auth
- Implemented 3 agent scripts: claude.sh, aider.sh, openclaw.sh
- Updated manifest.json with cloud entry and 15 matrix entries
- Added test coverage in test/record.sh and test/mock.sh
- Created atlanticnet/README.md with usage docs

API authentication uses timestamp + random GUID signed with private key.
Defaults: G2.2GB plan, ubuntu-24.04_64bit image, USEAST2 location.

Agent: cloud-scout-1

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 03:07:22 -08:00
A
f30c255d2a
feat: implement local/kilocode.sh (#807)
Add Kilo Code agent support for local machine cloud provider.

- Install @kilocode/cli via npm if not already installed
- Inject OpenRouter credentials via env vars
- Set KILO_PROVIDER_TYPE=openrouter and KILO_OPEN_ROUTER_API_KEY
- Support SPAWN_PROMPT for non-interactive execution
- Update manifest.json matrix entry to "implemented"

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 02:30:59 -08:00
L
5a8cd13237
fix: let pr-maintainer merge already-approved PRs (#861)
The refactor team's pr-maintainer can now rebase and merge PRs
that the security team has already approved. This closes the gap
where approved PRs sat unmerged because neither team was merging
them.

- pr-maintainer: merge APPROVED+MERGEABLE PRs (rebase first)
- Still NEVER review or approve PRs (security team only)
- Updated separation of concerns section

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:19:56 -08:00
A
89ffe4802e
refactor: extract mock test env config and API assertions into per-cloud fixture files (#803)
Reduces setup_env_for_cloud (84 lines -> 8 lines) and assert_cloud_api_calls
(32 lines -> 9 lines) in test/mock.sh by moving cloud-specific data into
per-cloud _env.sh and _api_assertions.sh files in test/fixtures/.

Adding a new cloud's test config now only requires creating two small files
in the fixtures directory instead of editing case branches in mock.sh.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:16:11 -08:00
A
7a441813fd
fix: detect slash notation and suggest correct syntax (#859)
When users type `spawn claude/hetzner` or `spawn hetzner/claude`,
the CLI now splits on the slash and forwards to the correct handler
with a helpful tip, instead of showing a confusing "invalid characters"
error from identifier validation.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:15:59 -08:00
A
de5a0c16de
refactor: extract helpers from Hetzner server type validation (#845)
Break down _validate_server_type_for_location (74 lines -> 29 lines) and
create_server (64 lines -> 43 lines) by extracting focused helpers:

- _hetzner_get_available_ids: fetch datacenter availability data
- _hetzner_find_fallback_type: search for compatible alternative types
- _hetzner_resolve_server_type: handle validation errors and fallback logging

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 02:14:30 -08:00
L
019e7b3ce5
fix: require full review protocol in every pr-reviewer prompt (#856)
The lead agent was abbreviating subsequent reviewer prompts to
"follow the same protocol as pr-reviewer-851" — but sub-agents
can't see each other's prompts. Result: only the first reviewer
got the --approve/--merge instructions, the rest defaulted to
--comment reviews that don't satisfy branch protection.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:13:01 -08:00
A
fa5b4979e8
fix: upgrade SSH to StrictHostKeyChecking=accept-new (TOFU) and randomize temp paths (#849)
- Change SSH default from StrictHostKeyChecking=no to accept-new, which
  accepts host keys on first connection but rejects if they change later
  (Trust On First Use). This protects against MITM attacks on subsequent
  connections. Requires OpenSSH 7.6+ (released Oct 2017).
- Replace predictable $$-based temp file path in upload_config_file with
  $RANDOM to prevent symlink attacks on the remote server.

Addresses findings from issue #763.

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 02:11:47 -08:00
A
0f69e2abe9
feat: implement local/plandex.sh (#854)
Add Plandex agent support for local machine execution:
- Install Plandex via official installer
- Inject OPENROUTER_API_KEY into shell config
- Support both interactive and prompt modes
- Follow local cloud provider pattern

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:10:08 -08:00
A
bfb125c028
test: add cloud lib API surface tests (#852)
Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 02:09:56 -08:00
A
05eb3e957d
feat: implement ramnode gaps for gemini, amazonq, plandex, kilocode (#855)
Implements 4 missing ramnode cloud provider integrations:
- ramnode/gemini.sh - Gemini CLI with OpenRouter support
- ramnode/amazonq.sh - Amazon Q CLI via OpenRouter
- ramnode/plandex.sh - Plandex agent with OpenRouter native support
- ramnode/kilocode.sh - Kilo Code CLI with OpenRouter provider

All scripts follow the ramnode pattern:
1. Source ramnode/lib/common.sh (OpenStack API primitives)
2. Authenticate and provision Ubuntu 24.04 server
3. Install the agent via npm/curl
4. Inject OPENROUTER_API_KEY and agent-specific env vars
5. Launch interactive session

Note: ramnode/codex.sh already existed but was marked as missing in manifest.json.
Updated manifest to mark all 5 agents as "implemented".

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 02:09:35 -08:00
A
be903f0089
feat: add CodeSandbox cloud provider (#857)
Add CodeSandbox as a new sandbox cloud provider for running AI agents.

CodeSandbox features:
- Firecracker microVMs with ~2 second start times
- SDK/CLI-based exec (no SSH)
- Free tier: 40 hours/month on Build plan
- Secure isolated environments

Implementation:
- Created codesandbox/lib/common.sh with SDK wrapper functions
- Implemented 3 agent scripts: claude, aider, openclaw
- Added CodeSandbox to manifest.json clouds
- Created matrix entries (3 implemented, 12 missing)
- Updated test/record.sh to list as non-recordable CLI cloud
- Added codesandbox/README.md with usage instructions

The implementation follows the existing pattern from e2b and modal,
using Node.js SDK (@codesandbox/sdk) for sandbox lifecycle management.

Agent: cloud-scout

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:09:31 -08:00
A
de265c543b
feat: implement local/cline.sh (#853)
Add Cline agent support on local machine. Combines local provider primitives
with Cline installation and OpenRouter credential injection.

Features:
- npm-based installation of Cline globally
- OpenRouter API key injection via OAuth or env var
- Persistent env vars in shell config (.zshrc or .bashrc)
- Sets OPENAI_BASE_URL to route through OpenRouter

Agent: gap-filler-2

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-13 02:09:26 -08:00
L
2e32f2c9fe
fix: rewrite monitoring loops to require TaskList on every iteration (#858)
* fix: require full review protocol in every pr-reviewer prompt

The lead agent was abbreviating subsequent reviewer prompts to
"follow the same protocol as pr-reviewer-851" — but sub-agents
can't see each other's prompts. Result: only the first reviewer
got the --approve/--merge instructions, the rest defaulted to
--comment reviews that don't satisfy branch protection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rewrite monitoring loops to require TaskList on every iteration

Root cause: leads were looping on sleep 5 without ever calling
TaskList — 90 consecutive sleeps, 0 TaskList calls, 0 messages
processed. Teammate messages arrive as user turns but the lead
never checked for them.

Changes:
- All monitoring loops now require TaskList on every iteration
- Added agent teams reference docs (code.claude.com/docs/en/agent-teams)
- SKILL.md: added Agent Teams section with coordination pattern,
  spawn requirements, and prompt completeness rule
- Explicit "DO NOT just loop on sleep 5" warnings with examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:59:06 -08:00
A
ebdab346df
fix: warn about missing credentials before running spawn scripts (#851)
Previously, users would run `spawn claude hetzner` without HCLOUD_TOKEN
set, the CLI would download and start executing the script, and it would
fail mid-execution after potentially provisioning resources. Now the CLI
checks for missing credentials before running and warns the user upfront.

In interactive mode, shows a confirmation prompt so the user can abort
or continue. In non-interactive mode, shows a warning without blocking.

- Add preflightCredentialCheck() that inspects cloud auth env vars
- Call it in cmdRun before script execution
- 9 tests covering all credential states (all set, partial, missing,
  multi-var, CLI-based auth, none auth)
- Version bump to 0.2.69

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:52:41 -08:00
A
b9e21bded6
refactor: use sys.argv instead of bash interpolation in Python body builders (#842)
Replace unsafe '$var' bash string interpolation inside Python code with
sys.argv parameter passing across 9 cloud provider libs. This eliminates
a class of potential injection bugs where values containing single quotes
could break the Python string context.

Affected functions:
- binarylane: _binarylane_build_server_body
- contabo: _contabo_build_instance_body
- digitalocean: _build_droplet_request_body
- hostinger: _hostinger_build_create_body
- ionos: ionos_register_ssh_key, _ionos_create_datacenter,
         _ionos_build_volume_body, _ionos_build_server_body
- linode: _linode_build_create_payload
- ovh: ovh_register_ssh_key, _ovh_find_flavor_id,
       _ovh_get_ssh_key_id, _ovh_build_instance_body
- upcloud: _build_upcloud_server_body
- vultr: _vultr_build_instance_body

This aligns with the pattern already used by cherry, scaleway, netcup,
and ramnode providers.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:45:11 -08:00
A
7b5f84141f
fix: show specific missing credentials in script failure messages (#813)
When a spawn script fails, the error message now checks which
required environment variables are actually set vs missing, instead
of generically saying "Missing or invalid credentials". This helps
users immediately see which credential they need to add.

- All set: "Credentials appear to be set (invalid or expired?)"
- Some missing: lists only the specific vars that are not set
- None set: lists all required vars

Version bump to 0.2.67.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:45:01 -08:00
L
49bb39c8ec
fix: prevent duplicate review_all runs via reason-based dedup (#848)
Two problems:
1. Schedule was every 20 min but review_all cycles take 35 min,
   causing overlapping triggers that fill both slots
2. Trigger server only deduped by issue number, not by reason,
   so two review_all runs could stack up

Fixes:
- Change schedule from */20 to 0,45 (every 45 min)
- Add reason-based dedup in trigger-server.ts: reject 409 if a
  non-issue run with the same reason is already in progress

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:41:11 -08:00
A
d9a18b49d3
fix: show credential-aware quick start in spawn <agent> and spawn <cloud> info (#817)
Prioritize clouds with detected credentials in spawn <agent> info pages.
Skip showing export instructions for env vars already set. Show credential
status in spawn <cloud> info header and available clouds list.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:33:19 -08:00
L
5763f4dd14
fix: move branch/PR cleanup responsibility to security team (#847)
Discovery and refactor teams should not prune branches or
merge/close PRs — that's the security team's job (via
branch-cleaner agent in review_all mode).

- discovery.sh: remove Branch Cleaner agent, remove branch
  pruning and PR merge/close from cleanup_between_cycles()
  and run_team_cycle() pre-cycle cleanup
- refactor.sh: remove merged branch deletion and stale PR
  checks from pre-cycle cleanup, remove orphan branch cleanup
  from pr-maintainer role

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:27:35 -08:00
A
70060f29f5
refactor: reduce complexity in ionos/lib/common.sh (#815)
Replace inline python3 JSON extraction with shared _extract_json_field
helper and use sys.argv in body builders instead of string interpolation:

- _ionos_find_existing_datacenter: consolidate 3 python3 calls into 1
- _ionos_build_server_body: use sys.argv for name, cores, ram
- _ionos_build_volume_body: use sys.argv; remove intermediate encoding
- _ionos_create_datacenter: use sys.argv for location; use _extract_json_field
- ionos_register_ssh_key: use sys.argv for key_name and pub_key
- _ionos_wait_for_volume: use _extract_json_field for state extraction
- _ionos_wait_for_server_ip: use _extract_json_field for IP extraction
- _ionos_launch_and_attach: use _extract_json_field for server ID
- _ionos_create_boot_volume: use _extract_json_field for volume ID

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:24:44 -08:00
A
2f671d8edf
test: add 66 tests for OAuth security functions in shared/common.sh (#814)
Cover previously untested security-critical OAuth functions:
- _generate_oauth_html: HTML generation for success/error pages
- _validate_oauth_server_args: port validation + CSRF state file
- _generate_oauth_server_script: Node.js server script generation
- cleanup_oauth_session: temp resource cleanup
- exchange_oauth_code: JSON injection prevention via json_escape
- execute_agent_non_interactive: prompt escaping with printf %q
- wait_for_oauth_code: timeout behavior
- _check_oauth_prerequisites: connectivity + runtime detection
- find_node_runtime: bun/node discovery

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:24:33 -08:00
A
1725fa79d4
test: add cloud lib security convention regression tests (69 tests) (#816)
Validates that all cloud provider lib/common.sh files follow security
conventions from the security audit. Tests cover SSH key encoding
(json_escape or python json.dumps), config file permissions, Python
code injection prevention, API body JSON safety, heredoc injection
prevention, shared/common.sh sourcing, and credential handling patterns.

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:23:20 -08:00
A
8446e785cf
test: add 88 tests for OAuth flow functions in shared/common.sh (#843)
The OAuth flow is the primary authentication mechanism for spawn users,
yet its component functions had zero test coverage. This adds tests for:

- validate_oauth_port: port range validation (boundary values, injection)
- _generate_csrf_state: CSRF token generation (entropy, uniqueness)
- _generate_oauth_html: success/error HTML page generation
- _generate_oauth_server_script: Node.js callback server (CSRF, ports)
- _validate_oauth_server_args: prerequisite validation (port, state, runtime)
- _init_oauth_session: temp directory and CSRF state file creation
- cleanup_oauth_session: PID and directory cleanup
- exchange_oauth_code: OAuth code-to-key exchange with json_escape security
- check_openrouter_connectivity: network reachability fallback chain
- Integration: session lifecycle and CSRF security properties

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:22:11 -08:00
A
5fd9d19775
feat: implement ramnode/opencode.sh (#812)
Add OpenCode agent script for RamNode Cloud using OpenStack API.

- Use ramnode cloud primitives for server provisioning
- Install OpenCode via opencode_install_cmd helper
- Inject OPENROUTER_API_KEY environment variable
- Launch interactive OpenCode session via SSH

Agent: gap-filler

Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
2026-02-13 01:21:57 -08:00
A
6182348641
fix: show credential status in dry-run and specify missing env vars on failure (#841)
Two UX improvements:

1. `spawn <agent> <cloud> --dry-run` now shows a Credentials section that
   checks which env vars (OPENROUTER_API_KEY, cloud auth vars) are set vs
   missing, so users can verify readiness before a real run.

2. Script failure guidance (exit code 1 and default) now checks which
   specific env vars are unset instead of showing a generic "need X + Y"
   message, making it immediately clear what's missing.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:20:21 -08:00
A
b1a576a52a
test: add 51 tests for _classify_api_result and _report_api_failure (#834)
These helpers were extracted from _cloud_api_retry_loop in PR #821 to
reduce cyclomatic complexity but had zero test coverage. They are
invoked on every cloud API call across all providers:

- _classify_api_result: Classifies curl/HTTP results into retry reasons
  (network error, rate limit 429, service unavailable 503) or empty
  (success/non-retryable error). Tests cover all branches including
  curl exit codes 1/6/7/28, HTTP 429/503, success codes 200/201/204,
  non-retryable errors 400-502, and edge cases.

- _report_api_failure: Generates user-facing error messages after
  retries are exhausted. Differentiates network vs HTTP errors,
  outputs API response body only for HTTP errors. Tests cover
  retry count display, response body handling, and special chars.

Also includes integration tests verifying the classify-then-report
pipeline and realistic cloud provider scenarios (Hetzner, DigitalOcean,
DNS failures, auth errors, validation errors).

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 01:19:31 -08:00
A
087a14c276
test: add agent env injection contract tests (128 tests) (#838)
Validates the critical contract that every implemented agent script
correctly injects the environment variables from manifest.json.
Catches silent breakage where an agent starts but cannot reach the
LLM API due to missing OPENROUTER_API_KEY or provider-specific vars.

Tests cover:
- OPENROUTER_API_KEY presence in all scripts
- Provider-specific env vars (ANTHROPIC_BASE_URL, OPENAI_BASE_URL, etc.)
- OpenRouter API key acquisition patterns (env check, OAuth, manual)
- Agent install and launch command references
- Cloud lib env injection infrastructure
- Base URL values pointing to openrouter.ai
- No hardcoded API keys (security)
- Full coverage statistics across all agents and clouds

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:19:14 -08:00
L
2d4827f9ed
fix: replace sleep 15/30 polling with sleep 5 yield pattern (#844)
Long sleeps block message delivery — teammate messages arrive as
user turns between the lead's responses. sleeping 30s means 30s
of queued messages. Use sleep 5 to yield quickly, process messages
immediately, and do useful work between polls.

Updated all 3 service scripts: security.sh, refactor.sh, discovery.sh.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:18:07 -08:00
A
813089def7
test: add 67 tests for shared/github-auth.sh (zero prior coverage) (#832)
Add comprehensive test coverage for the standalone GitHub auth helper
(shared/github-auth.sh) merged in PR #824 with no tests.

Coverage includes:
- Source pattern and function availability (9 tests)
- Fallback log functions when common.sh unavailable (3 tests)
- ensure_gh_cli: detection, installation paths, error handling (7 tests)
- _install_gh_binary: OS/arch detection, error paths, cleanup (11 tests)
- ensure_gh_auth: token auth, interactive login, post-login checks (8 tests)
- ensure_github_auth: combined wrapper success/failure (4 tests)
- Direct execution mode and set -eo pipefail (2 tests)
- Script conventions: bash 3.x compat, no echo -e, safe var access (10 tests)
- Installation path coverage: macOS/Linux/APT/DNF/Homebrew (4 tests)
- Error handling edge cases: curl failure, tar failure, auth failures (6 tests)
- GITHUB_TOKEN security: piped via printf, not CLI arg (2 tests)
- Shebang check (1 test)

Agent: test-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:17:57 -08:00
A
cb2a8614e9
refactor: reduce complexity in latitude and ovh cloud libs (#835)
- latitude/lib/common.sh: Replace custom 38-line wait_for_server_ready
  polling loop with generic_wait_for_instance from shared/common.sh.
  Consolidate extract_latitude_server_ip (36 lines of inline Python) into
  a single readonly expression constant. Net -59 lines.

- ovh/lib/common.sh: Replace shell variable interpolation in Python
  strings ('${var}') with sys.argv[] in _ovh_find_flavor_id,
  _ovh_get_ssh_key_id, _ovh_build_instance_body, and ovh_register_ssh_key.
  This eliminates injection surface and follows the established pattern
  used by other cloud providers.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-13 01:17:20 -08:00
L
ac3a8c58a5
docs: add systemd as service option alongside sprite-env in SKILL.md (#840)
For non-Sprite environments (standard Linux VMs, sandboxes), systemd
is the recommended way to run the trigger server. Added full unit
file template, management commands, and troubleshooting entries.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 01:09:01 -08:00
L
ac5ef953f9
fix: enforce worktrees, clarify team separation of concerns (#839)
* fix: add worktree requirement to security team prompts

PR reviewers must check out PRs in sub-worktrees before running
bash -n or bun test. Scan mode agents must also work inside a
worktree. This prevents concurrent agents from conflicting in
the main repo checkout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: enforce worktrees everywhere, refactor pr-maintainer role

- SKILL.md: expand worktree convention to cover all agent work
  (PR review, testing, audits) not just branch creation
- refactor.sh pr-maintainer: strip review/approve/merge
  responsibilities — that's the security team's job. pr-maintainer
  now focuses on rebasing conflicting PRs, addressing review
  comments, and fixing failing checks
- Remove stale PR auto-merge from pre-cycle cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove self-review from refactor team, clarify separation of concerns

Refactor team focuses on research, deep-dives, and solving problems.
Security team owns the entire PR review/approve/merge lifecycle.

- Replace "No Self-Merge Rule" with "Separation of Concerns" section
- Remove all self-review steps from issue and refactor mode workflows
- Remove needs-team-review labeling from agent instructions
- Simplify monitoring loop (no more review verification)
- Simplify lifecycle checks (verify PRs exist, not reviewed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove stale needs-team-review label from security triage reference

The refactor team no longer applies this label, so remove it
from the available labels documentation in triage mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 00:52:13 -08:00
L
c602443711
fix: remove hardcoded /home/sprite paths from service scripts (#836)
The LOG_FILE paths in security.sh and refactor.sh were hardcoded to
/home/sprite/spawn, causing permission errors on non-Sprite environments.
Use $REPO_ROOT (already computed dynamically) instead. Also update
SKILL.md examples to use dynamic path resolution.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-13 00:34:34 -08:00
A
9f76af00d2
fix: show credential status in quick-start sections (#823)
The quick-start sections in `spawn <cloud>` and `spawn <agent>` now show
whether required env vars are already set (green with "set" indicator)
or still need to be configured (cyan "export" instruction). This helps
users immediately see what credentials are missing before launching.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:59:57 -08:00
A
4d3c54a11e
refactor: extract helpers from execScript and _cloud_api_retry_loop (#821)
Reduce cyclomatic complexity in the two highest-scoring functions:

- cli/src/commands.ts: Extract `handleUserInterrupt` and `runWithRetries`
  from `execScript` (complexity score 6 -> 2 for execScript, retry logic
  now independently testable)

- shared/common.sh: Extract `_classify_api_result` and `_report_api_failure`
  from `_cloud_api_retry_loop` (complexity score 9 -> 4, removes duplicated
  error-classification logic from loop body)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:57:20 -08:00
A
e73d6b9793
fix: support --flag=value syntax in CLI argument parsing (#826)
Previously, `spawn --prompt="Fix bugs" claude sprite` or
`spawn list --agent=claude` would fail with "Unknown flag" because
the CLI only recognized `--flag value` (space-separated) syntax.
Now `--flag=value` is expanded to `--flag value` early in the
arg parsing pipeline, supporting the common GNU-style convention.

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
2026-02-12 23:55:46 -08:00
A
716da5d43b
fix: auto re-exec command after CLI auto-update (fixes #780) (#830)
When a CLI auto-update triggers mid-command (e.g. `spawn claude sprite`),
the updated binary now automatically re-runs with the original arguments
instead of asking the user to manually re-run. Sets SPAWN_NO_UPDATE_CHECK=1
on re-exec to prevent infinite update loops. Falls back to the old "run
again" message when no arguments were provided (bare `spawn`).

Agent: ux-engineer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 23:54:49 -08:00