Add HOSTKEY (https://hostkey.com/) as a new cloud provider to the spawn
matrix. HOSTKEY offers affordable VPS hosting starting from €1/month with
hourly billing, making it suitable for running AI agents that use remote
API inference.
Changes:
- Created hostkey/lib/common.sh with HOSTKEY API wrappers
- Implemented hostkey/claude.sh (Claude Code agent)
- Implemented hostkey/openclaw.sh (OpenClaw agent)
- Added HOSTKEY to manifest.json clouds section
- Added matrix entries for all 15 agents (2 implemented, 13 missing)
- Updated test/record.sh with HOSTKEY test infrastructure
- Updated test/mock.sh with HOSTKEY URL handling
- Created hostkey/README.md with usage instructions
Data centers: Amsterdam, Frankfurt, Helsinki, Reykjavik, Istanbul, New York
Agent: cloud-scout
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Combines CodeSandbox SDK primitives with NanoClaw agent setup:
- Creates sandbox using CodeSandbox API
- Installs Node.js dependencies (tsx)
- Clones and builds nanoclaw from GitHub
- Injects OpenRouter API key as ANTHROPIC_API_KEY
- Configures .env file with API credentials
- Launches interactive WhatsApp QR code authentication flow
Updates manifest.json matrix status to "implemented"
Agent: gap-filler
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
- Replace 38-line _scaleway_power_on_and_wait polling loop with generic_wait_for_instance
- Remove _scaleway_extract_ip (IP extraction now handled by generic_wait_for_instance)
- Replace inline Python JSON building in create_server and scaleway_register_ssh_key with json_escape
- Replace inline Python error parsing with extract_api_error_message shared helper
- Replace inline Python field extraction with _extract_json_field shared helper
Net reduction: 57 lines (372 -> 315)
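The polling pattern that a shared helper like generic_wait_for_instance centralizes can be sketched roughly as below. This is a Python illustration only: the real helper is a shell function whose signature is not shown in this commit, and `wait_for_instance`, `fetch_status`, and the injectable `sleep` parameter are hypothetical names.

```python
import time
from typing import Callable, Optional, Tuple


def wait_for_instance(
    fetch_status: Callable[[], Tuple[str, Optional[str]]],
    ready_state: str = "running",
    max_attempts: int = 60,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until the instance is ready and has an IP.

    fetch_status is a hypothetical callable returning (state, ip_or_None).
    """
    for attempt in range(1, max_attempts + 1):
        state, ip = fetch_status()
        if state == ready_state and ip:
            return ip
        if attempt < max_attempts:
            sleep(interval)  # back off before the next poll
    raise TimeoutError(f"instance not {ready_state} after {max_attempts} attempts")
```

Passing the status check in as a callable is what lets one loop serve several providers; each provider only supplies its own state and IP extraction.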
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
local/cline.sh and local/plandex.sh were writing API keys to shell
config using double-quoted printf format strings. If an API key
contained shell metacharacters (", $, backtick), sourcing the shell
config could execute arbitrary code.
Replace manual printf with inject_env_vars_local which uses the safe
generate_env_config helper (single-quoted values with proper escaping).
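A minimal sketch of the single-quote escaping idea, written in Python for illustration. The real generate_env_config is a shell helper whose implementation is not shown here; `shell_single_quote` and `env_export_line` are hypothetical names.

```python
def shell_single_quote(value: str) -> str:
    """Wrap value in single quotes for a POSIX shell.

    Inside single quotes the shell expands nothing, so only an embedded
    single quote needs handling: close the quote, emit an escaped quote,
    and reopen ('\\'').
    """
    return "'" + value.replace("'", "'\\''") + "'"


def env_export_line(name: str, value: str) -> str:
    """Build one line suitable for appending to a shell config file."""
    return f"export {name}={shell_single_quote(value)}"
```

With this quoting, a key such as `a"$(id)` is written literally and is not evaluated when the config file is sourced.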
Agent: security-auditor
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
local/continue.sh used a double-quoted heredoc to write the API key
directly into ~/.continue/config.json without escaping. If the key
contained double quotes, it could produce invalid JSON or inject
additional config fields. Replace inline heredoc with the shared
setup_continue_config helper which uses json_escape.
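The difference between pasting a key into a JSON template and serializing it can be sketched as follows. This is an illustration in Python, not the json_escape helper itself; `build_continue_config` and the field layout are assumptions, since the actual config schema is not shown in this commit.

```python
import json


def build_continue_config(api_key: str) -> str:
    """Serialize the config so the key is always a single JSON string.

    The field names below are illustrative only.
    """
    config = {"models": [{"provider": "openrouter", "apiKey": api_key}]}
    return json.dumps(config, indent=2)
```

A key containing `","admin":true,"x":"` stays inside the one string value instead of adding fields to the document.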
Agent: security-auditor
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds comprehensive test coverage for the local cloud provider, which
runs agents directly on the user's machine without cloud provisioning.
It previously had zero dedicated tests despite 14 implemented agent scripts.
Tests cover:
- local/lib/common.sh API surface (no-op destroy, bash -c exec, cp uploads)
- All 14 local agent scripts follow local-specific patterns
- No SSH/SCP patterns leak into local scripts
- OpenRouter API key handling with OAuth fallback
- SPAWN_PROMPT handling for interactive/non-interactive modes
- Installation verification (command -v checks)
- Safety checks (no sudo, no rm -rf system dirs)
- Manifest consistency for local cloud entries
Agent: test-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Atlantic.Net Cloud as a new cloud provider with REST API support.
Starting at $4-8/mo for budget VPS instances with SSH access.
Implementation:
- Created atlanticnet/lib/common.sh with HMAC-SHA256 API auth
- Implemented 3 agent scripts: claude.sh, aider.sh, openclaw.sh
- Updated manifest.json with cloud entry and 15 matrix entries
- Added test coverage in test/record.sh and test/mock.sh
- Created atlanticnet/README.md with usage docs
API authentication uses timestamp + random GUID signed with private key.
Defaults: G2.2GB plan, ubuntu-24.04_64bit image, USEAST2 location.
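A rough sketch of the signing step described above, assuming the signature is the base64-encoded HMAC-SHA256 of the timestamp concatenated with the GUID. The concatenation order, encoding, and parameter names are assumptions for illustration; `sign_request` and `auth_params` are hypothetical names, not functions from atlanticnet/lib/common.sh.

```python
import base64
import hashlib
import hmac
import time
import uuid
from typing import Dict, Optional


def sign_request(private_key: str, timestamp: int, rnd_guid: str) -> str:
    """Return base64(HMAC-SHA256(private_key, timestamp + guid))."""
    message = f"{timestamp}{rnd_guid}".encode("utf-8")
    digest = hmac.new(private_key.encode("utf-8"), message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def auth_params(
    api_key: str,
    private_key: str,
    timestamp: Optional[int] = None,
    rnd_guid: Optional[str] = None,
) -> Dict[str, str]:
    """Build per-request auth parameters (key names are illustrative)."""
    ts = int(time.time()) if timestamp is None else timestamp
    guid = str(uuid.uuid4()) if rnd_guid is None else rnd_guid
    return {
        "api_key": api_key,
        "timestamp": str(ts),
        "guid": guid,
        "signature": sign_request(private_key, ts, guid),
    }
```

Because the GUID changes on every call, two requests in the same second still produce different signatures.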
Agent: cloud-scout-1
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Add Kilo Code agent support for the local machine cloud provider.
- Install @kilocode/cli via npm if not already installed
- Inject OpenRouter credentials via env vars
- Set KILO_PROVIDER_TYPE=openrouter and KILO_OPEN_ROUTER_API_KEY
- Support SPAWN_PROMPT for non-interactive execution
- Update manifest.json matrix entry to "implemented"
Agent: gap-filler
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
The refactor team's pr-maintainer can now rebase and merge PRs
that the security team has already approved. This closes the gap
where approved PRs sat unmerged because neither team was merging
them.
- pr-maintainer: merge APPROVED+MERGEABLE PRs (rebase first)
- Still NEVER review or approve PRs (security team only)
- Updated separation of concerns section
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reduces setup_env_for_cloud (84 lines -> 8 lines) and assert_cloud_api_calls
(32 lines -> 9 lines) in test/mock.sh by moving cloud-specific data into
per-cloud _env.sh and _api_assertions.sh files in test/fixtures/.
Adding a new cloud's test config now only requires creating two small files
in the fixtures directory instead of editing case branches in mock.sh.
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When users type `spawn claude/hetzner` or `spawn hetzner/claude`,
the CLI now splits on the slash and forwards to the correct handler
with a helpful tip, instead of showing a confusing "invalid characters"
error from identifier validation.
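The slash handling can be sketched like this. The CLI itself is not written in Python; this is only an illustration of the logic, and `split_slash_arg` plus the `agents`/`clouds` sets are hypothetical stand-ins for whatever the CLI looks up in the manifest.

```python
from typing import Optional, Set, Tuple


def split_slash_arg(
    arg: str, agents: Set[str], clouds: Set[str]
) -> Optional[Tuple[str, str]]:
    """Return (agent, cloud) for 'agent/cloud' or 'cloud/agent', else None."""
    parts = arg.split("/")
    if len(parts) != 2:
        return None
    first, second = parts
    if first in agents and second in clouds:
        return first, second
    if first in clouds and second in agents:
        return second, first  # reversed order is accepted and normalized
    return None
```

Returning None for anything else leaves the existing identifier validation to report genuinely invalid input.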
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Change SSH default from StrictHostKeyChecking=no to accept-new, which
accepts host keys on first connection but rejects if they change later
(Trust On First Use). This protects against MITM attacks on subsequent
connections. Requires OpenSSH 7.6+ (released Oct 2017).
- Replace predictable $$-based temp file path in upload_config_file with
a $RANDOM-based one, making the path harder to guess and reducing the
risk of symlink attacks on the remote server.
Addresses findings from issue #763.
Agent: security-auditor
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Plandex agent support for local machine execution:
- Install Plandex via official installer
- Inject OPENROUTER_API_KEY into shell config
- Support both interactive and prompt modes
- Follow local cloud provider pattern
Agent: gap-filler
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements 4 missing ramnode cloud provider integrations:
- ramnode/gemini.sh - Gemini CLI with OpenRouter support
- ramnode/amazonq.sh - Amazon Q CLI via OpenRouter
- ramnode/plandex.sh - Plandex agent with OpenRouter native support
- ramnode/kilocode.sh - Kilo Code CLI with OpenRouter provider
All scripts follow the ramnode pattern:
1. Source ramnode/lib/common.sh (OpenStack API primitives)
2. Authenticate and provision Ubuntu 24.04 server
3. Install the agent via npm/curl
4. Inject OPENROUTER_API_KEY and agent-specific env vars
5. Launch interactive session
Note: ramnode/codex.sh already existed but was marked as missing in manifest.json.
Updated manifest to mark all 5 agents as "implemented".
Agent: gap-filler
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Add CodeSandbox as a new sandbox cloud provider for running AI agents.
CodeSandbox features:
- Firecracker microVMs with ~2 second start times
- SDK/CLI-based exec (no SSH)
- Free tier: 40 hours/month on Build plan
- Secure isolated environments
Implementation:
- Created codesandbox/lib/common.sh with SDK wrapper functions
- Implemented 3 agent scripts: claude, aider, openclaw
- Added CodeSandbox to manifest.json clouds
- Created matrix entries (3 implemented, 12 missing)
- Updated test/record.sh to list as non-recordable CLI cloud
- Added codesandbox/README.md with usage instructions
The implementation follows the existing pattern from e2b and modal,
using the Node.js SDK (@codesandbox/sdk) for sandbox lifecycle management.
Agent: cloud-scout
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Add Cline agent support on local machine. Combines local provider primitives
with Cline installation and OpenRouter credential injection.
Features:
- npm-based installation of Cline globally
- OpenRouter API key injection via OAuth or env var
- Persistent env vars in shell config (.zshrc or .bashrc)
- Sets OPENAI_BASE_URL to route through OpenRouter
Agent: gap-filler-2
Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: require full review protocol in every pr-reviewer prompt
The lead agent was abbreviating subsequent reviewer prompts to
"follow the same protocol as pr-reviewer-851" — but sub-agents
can't see each other's prompts. Result: only the first reviewer
got the --approve/--merge instructions, the rest defaulted to
--comment reviews that don't satisfy branch protection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: rewrite monitoring loops to require TaskList on every iteration
Root cause: leads were looping on sleep 5 without ever calling
TaskList — 90 consecutive sleeps, 0 TaskList calls, 0 messages
processed. Teammate messages arrive as user turns but the lead
never checked for them.
Changes:
- All monitoring loops now require TaskList on every iteration
- Added agent teams reference docs (code.claude.com/docs/en/agent-teams)
- SKILL.md: added Agent Teams section with coordination pattern,
spawn requirements, and prompt completeness rule
- Explicit "DO NOT just loop on sleep 5" warnings with examples
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, when a user ran `spawn claude hetzner` without HCLOUD_TOKEN
set, the CLI would download and start executing the script, which then
failed mid-execution after potentially provisioning resources. Now the CLI
checks for missing credentials before running and warns the user upfront.
In interactive mode, shows a confirmation prompt so the user can abort
or continue. In non-interactive mode, shows a warning without blocking.
- Add preflightCredentialCheck() that inspects cloud auth env vars
- Call it in cmdRun before script execution
- 9 tests covering all credential states (all set, partial, missing,
multi-var, CLI-based auth, none auth)
- Version bump to 0.2.69
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a spawn script fails, the error message now checks which
required environment variables are actually set vs missing, instead
of generically saying "Missing or invalid credentials". This helps
users immediately see which credential they need to add.
- All set: "Credentials appear to be set (invalid or expired?)"
- Some missing: lists only the specific vars that are not set
- None set: lists all required vars
Version bump to 0.2.67.
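The three cases above reduce to one check over the required variables. The sketch below is a Python illustration, not the CLI's code; `credential_hint` is a hypothetical name and the "Missing credentials:" prefix is an assumed wording (only the all-set message is quoted from this commit).

```python
import os
from typing import Iterable, Mapping


def credential_hint(
    required: Iterable[str], env: Mapping[str, str] = os.environ
) -> str:
    """Describe credential state after a script failure."""
    # Treat unset and empty variables the same way.
    missing = [name for name in required if not env.get(name)]
    if not missing:
        return "Credentials appear to be set (invalid or expired?)"
    # Covers both "some missing" and "none set": list exactly what is absent.
    return "Missing credentials: " + ", ".join(missing)
```

When nothing is set, the missing list equals the full required list, so the "none set" case needs no separate branch.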
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two problems:
1. Schedule was every 20 min but review_all cycles take 35 min,
causing overlapping triggers that fill both slots
2. Trigger server only deduped by issue number, not by reason,
so two review_all runs could stack up
Fixes:
- Change schedule from */20 to 0,45 (runs at :00 and :45 of each hour)
- Add reason-based dedup in trigger-server.ts: reject with HTTP 409 if a
non-issue run with the same reason is already in progress
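The dedup rule can be sketched as a small in-memory registry. This is a Python illustration of the logic only, not the contents of trigger-server.ts; `RunRegistry` is a hypothetical name and the 202 status for accepted runs is an assumption (only the 409 is stated in this commit).

```python
from typing import Optional, Set


class RunRegistry:
    """Track in-progress runs, deduping by issue number or by reason."""

    def __init__(self) -> None:
        self._issues: Set[int] = set()
        self._reasons: Set[str] = set()

    def try_start(self, reason: str, issue: Optional[int] = None) -> int:
        """Return 202 if the run is accepted, 409 if a duplicate is running."""
        if issue is not None:
            # Issue-triggered runs keep the existing per-issue dedup.
            if issue in self._issues:
                return 409
            self._issues.add(issue)
            return 202
        # Non-issue runs (e.g. review_all) are deduped by reason.
        if reason in self._reasons:
            return 409
        self._reasons.add(reason)
        return 202

    def finish(self, reason: str, issue: Optional[int] = None) -> None:
        """Release the slot once a run completes."""
        if issue is not None:
            self._issues.discard(issue)
        else:
            self._reasons.discard(reason)
```

Keeping the two key spaces separate means two different issues can still run concurrently while a second review_all is rejected.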
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prioritize clouds with detected credentials in spawn <agent> info pages.
Skip showing export instructions for env vars already set. Show credential
status in spawn <cloud> info header and available clouds list.
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Discovery and refactor teams should not prune branches or
merge/close PRs — that's the security team's job (via
branch-cleaner agent in review_all mode).
- discovery.sh: remove Branch Cleaner agent, remove branch
pruning and PR merge/close from cleanup_between_cycles()
and run_team_cycle() pre-cycle cleanup
- refactor.sh: remove merged branch deletion and stale PR
checks from pre-cycle cleanup, remove orphan branch cleanup
from pr-maintainer role
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace inline python3 JSON extraction with shared _extract_json_field
helper and use sys.argv in body builders instead of string interpolation:
- _ionos_find_existing_datacenter: consolidate 3 python3 calls into 1
- _ionos_build_server_body: use sys.argv for name, cores, ram
- _ionos_build_volume_body: use sys.argv; remove intermediate encoding
- _ionos_create_datacenter: use sys.argv for location; use _extract_json_field
- ionos_register_ssh_key: use sys.argv for key_name and pub_key
- _ionos_wait_for_volume: use _extract_json_field for state extraction
- _ionos_wait_for_server_ip: use _extract_json_field for IP extraction
- _ionos_launch_and_attach: use _extract_json_field for server ID
- _ionos_create_boot_volume: use _extract_json_field for volume ID
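The sys.argv pattern referred to above can be sketched as follows. `build_server_body` is a hypothetical stand-in for the inline Python inside a helper such as _ionos_build_server_body, and the JSON field layout is an assumption for illustration.

```python
import json
from typing import List


def build_server_body(argv: List[str]) -> str:
    """Build a request body from positional arguments.

    Values arrive as data via argv, e.g. from a shell call like
        python3 -c "$SCRIPT" "$name" "$cores" "$ram"
    rather than being interpolated into the Python source text, so a name
    containing quotes cannot change the code that runs.
    """
    name, cores, ram = argv[1], int(argv[2]), int(argv[3])
    return json.dumps({"properties": {"name": name, "cores": cores, "ram": ram}})
```

Converting cores and ram with int() also rejects non-numeric input early instead of sending a malformed body to the API.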
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Validates that all cloud provider lib/common.sh files follow security
conventions from the security audit. Tests cover SSH key encoding
(json_escape or python json.dumps), config file permissions, Python
code injection prevention, API body JSON safety, heredoc injection
prevention, shared/common.sh sourcing, and credential handling patterns.
Agent: test-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two UX improvements:
1. `spawn <agent> <cloud> --dry-run` now shows a Credentials section that
checks which env vars (OPENROUTER_API_KEY, cloud auth vars) are set vs
missing, so users can verify readiness before a real run.
2. Script failure guidance (exit code 1 and default) now checks which
specific env vars are unset instead of showing a generic "need X + Y"
message, making it immediately clear what's missing.
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These helpers were extracted from _cloud_api_retry_loop in PR #821 to
reduce cyclomatic complexity but had zero test coverage. They are
invoked on every cloud API call across all providers:
- _classify_api_result: Classifies curl/HTTP results into retry reasons
(network error, rate limit 429, service unavailable 503) or empty
(success/non-retryable error). Tests cover all branches including
curl exit codes 1/6/7/28, HTTP 429/503, success codes 200/201/204,
non-retryable errors 400-502, and edge cases.
- _report_api_failure: Generates user-facing error messages after
retries are exhausted. Differentiates network vs HTTP errors,
outputs API response body only for HTTP errors. Tests cover
retry count display, response body handling, and special chars.
Also includes integration tests verifying the classify-then-report
pipeline and realistic cloud provider scenarios (Hetzner, DigitalOcean,
DNS failures, auth errors, validation errors).
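The classification described above can be sketched in a few lines. This is a Python illustration rather than the shell helper itself; `classify_api_result` mirrors the name of _classify_api_result, and the exact reason strings are assumptions.

```python
def classify_api_result(curl_exit: int, http_code: int) -> str:
    """Return a retry reason, or "" when the call should not be retried."""
    if curl_exit != 0:
        # Any curl failure (e.g. exit codes 6, 7, 28) is treated as a network error.
        return "network error"
    if http_code == 429:
        return "rate limit"
    if http_code == 503:
        return "service unavailable"
    # Success (200/201/204) and non-retryable errors both yield no reason.
    return ""
```

An empty result for both success and non-retryable errors keeps the retry loop's condition to a single emptiness check.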
Agent: test-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Validates the critical contract that every implemented agent script
correctly injects the environment variables from manifest.json.
Catches silent breakage where an agent starts but cannot reach the
LLM API due to missing OPENROUTER_API_KEY or provider-specific vars.
Tests cover:
- OPENROUTER_API_KEY presence in all scripts
- Provider-specific env vars (ANTHROPIC_BASE_URL, OPENAI_BASE_URL, etc.)
- OpenRouter API key acquisition patterns (env check, OAuth, manual)
- Agent install and launch command references
- Cloud lib env injection infrastructure
- Base URL values pointing to openrouter.ai
- No hardcoded API keys (security)
- Full coverage statistics across all agents and clouds
Agent: test-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Long sleeps block message delivery — teammate messages arrive as
user turns between the lead's responses. Sleeping 30s means 30s
of queued messages. Use sleep 5 to yield quickly, process messages
immediately, and do useful work between polls.
Updated all 3 service scripts: security.sh, refactor.sh, discovery.sh.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- latitude/lib/common.sh: Replace custom 38-line wait_for_server_ready
polling loop with generic_wait_for_instance from shared/common.sh.
Consolidate extract_latitude_server_ip (36 lines of inline Python) into
a single readonly expression constant. Net -59 lines.
- ovh/lib/common.sh: Replace shell variable interpolation in Python
strings ('${var}') with sys.argv[] in _ovh_find_flavor_id,
_ovh_get_ssh_key_id, _ovh_build_instance_body, and ovh_register_ssh_key.
This eliminates injection surface and follows the established pattern
used by other cloud providers.
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
For non-Sprite environments (standard Linux VMs, sandboxes), systemd
is the recommended way to run the trigger server. Added a full unit
file template, management commands, and troubleshooting entries.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add worktree requirement to security team prompts
PR reviewers must check out PRs in sub-worktrees before running
bash -n or bun test. Scan mode agents must also work inside a
worktree. This prevents concurrent agents from conflicting in
the main repo checkout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: enforce worktrees everywhere, refactor pr-maintainer role
- SKILL.md: expand worktree convention to cover all agent work
(PR review, testing, audits) not just branch creation
- refactor.sh pr-maintainer: strip review/approve/merge
responsibilities — that's the security team's job. pr-maintainer
now focuses on rebasing conflicting PRs, addressing review
comments, and fixing failing checks
- Remove stale PR auto-merge from pre-cycle cleanup
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove self-review from refactor team, clarify separation of concerns
Refactor team focuses on research, deep-dives, and solving problems.
Security team owns the entire PR review/approve/merge lifecycle.
- Replace "No Self-Merge Rule" with "Separation of Concerns" section
- Remove all self-review steps from issue and refactor mode workflows
- Remove needs-team-review labeling from agent instructions
- Simplify monitoring loop (no more review verification)
- Simplify lifecycle checks (verify PRs exist, not reviewed)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove stale needs-team-review label from security triage reference
The refactor team no longer applies this label, so remove it
from the available labels documentation in triage mode.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LOG_FILE paths in security.sh and refactor.sh were hardcoded to
/home/sprite/spawn, causing permission errors on non-Sprite environments.
Use $REPO_ROOT (already computed dynamically) instead. Also update
SKILL.md examples to use dynamic path resolution.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The quick-start sections in `spawn <cloud>` and `spawn <agent>` now show
whether required env vars are already set (green with "set" indicator)
or still need to be configured (cyan "export" instruction). This helps
users immediately see what credentials are missing before launching.
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reduce cyclomatic complexity in the two highest-scoring functions:
- cli/src/commands.ts: Extract `handleUserInterrupt` and `runWithRetries`
from `execScript` (complexity score 6 -> 2 for execScript, retry logic
now independently testable)
- shared/common.sh: Extract `_classify_api_result` and `_report_api_failure`
from `_cloud_api_retry_loop` (complexity score 9 -> 4, removes duplicated
error-classification logic from loop body)
Agent: complexity-hunter
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, `spawn --prompt="Fix bugs" claude sprite` or
`spawn list --agent=claude` would fail with "Unknown flag" because
the CLI only recognized `--flag value` (space-separated) syntax.
Now `--flag=value` is expanded to `--flag value` early in the
arg parsing pipeline, supporting the common GNU-style convention.
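The expansion step can be sketched as below. This is a Python illustration of the idea, not the CLI's implementation; `expand_equals_flags` is a hypothetical name.

```python
from typing import List


def expand_equals_flags(args: List[str]) -> List[str]:
    """Rewrite each '--flag=value' as the two arguments '--flag', 'value'."""
    expanded: List[str] = []
    for arg in args:
        flag, sep, value = arg.partition("=")
        # Only long flags with a non-empty name are split; split on the
        # first '=' so values may themselves contain '='.
        if sep and flag.startswith("--") and len(flag) > 2:
            expanded.extend([flag, value])
        else:
            expanded.append(arg)
    return expanded
```

Running this before the existing parser means the `--flag value` handling does not need to change.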
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
When a CLI auto-update triggers mid-command (e.g. `spawn claude sprite`),
the updated binary now automatically re-runs with the original arguments
instead of asking the user to manually re-run. Sets SPAWN_NO_UPDATE_CHECK=1
on re-exec to prevent infinite update loops. Falls back to the old "run
again" message when no arguments were provided (bare `spawn`).
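The re-exec decision can be sketched as a pure function that returns what to run next. This is a Python illustration only; `plan_reexec` is a hypothetical name, and how the CLI actually spawns the updated binary is not shown in this commit.

```python
from typing import Dict, List, Optional, Tuple


def plan_reexec(
    binary: str, args: List[str], env: Dict[str, str]
) -> Optional[Tuple[List[str], Dict[str, str]]]:
    """Return (argv, env) for re-running after an update, or None.

    None means there were no arguments (bare `spawn`), so the caller
    should fall back to the "run again" message.
    """
    if not args:
        return None
    child_env = dict(env)  # copy so the caller's environment is untouched
    child_env["SPAWN_NO_UPDATE_CHECK"] = "1"  # stops the child from updating again
    return [binary, *args], child_env
```

Setting the variable only in the child's environment keeps the guard scoped to the re-run and avoids an update loop.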
Agent: ux-engineer
Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>