Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
17 KiB
Spawn
Spawn is a matrix of agents x clouds. Every script provisions a cloud server, installs an agent, injects OpenRouter credentials, and drops the user into an interactive session.
The Matrix
manifest.json is the source of truth. It tracks:
- agents — AI agents and self-hosted AI tools (Claude Code, OpenClaw, NanoClaw, ...)
- clouds — cloud providers to run them on (Sprite, Hetzner, ...)
- matrix — which
cloud/agentcombinations are"implemented"vs"missing"
How to Improve Spawn
When run via ./discovery.sh, your job is to pick ONE of these tasks and execute it:
1. Fill a missing matrix entry
Look at manifest.json → matrix for any "missing" entry. To implement it:
- Find the cloud's
lib/common.sh— it has all the provider-specific primitives (create server, run command, upload file, interactive session) - Find the agent's existing script on another cloud — it shows the install steps, config files, env vars, and launch command
- Combine them: use the cloud's primitives to execute the agent's setup steps
- The script goes at
{cloud}/{agent}.sh
Pattern for every script:
1. Source {cloud}/lib/common.sh (local or remote fallback)
2. Authenticate with cloud provider
3. Provision server/VM
4. Wait for readiness
5. Install the agent
6. Get OpenRouter API key (env var or OAuth)
7. Inject env vars into shell config
8. Write agent-specific config files
9. Launch interactive session
OpenRouter injection is mandatory. Every agent script MUST:
- Set
OPENROUTER_API_KEYin the shell environment - Set provider-specific env vars (e.g.,
ANTHROPIC_BASE_URL=https://openrouter.ai/api) - These come from the agent's
envfield inmanifest.json
2. Add a new cloud provider (HIGH BAR)
We are currently shipping with 9 curated clouds (sorted by price):
- local — free (no provisioning)
- hetzner — ~€3.29/mo (CX22)
- ovh — ~€3.50/mo (d2-2)
- fly — free tier (3 shared-cpu VMs)
- aws — $3.50/mo (nano)
- daytona — pay-per-second sandboxes
- digitalocean — $4/mo (Basic droplet)
- gcp — $7.11/mo (e2-micro)
- sprite — Fly.io managed VMs
Do NOT add clouds speculatively. Every cloud must be manually tested and verified end-to-end before shipping. Adding a cloud that can't be tested is worse than not having it.
Requirements to add a new cloud:
- Prestige or unbeatable pricing — must be a well-known brand OR beat our cheapest options
- Must be manually testable — you need an account and can verify scripts work
- REST API or CLI with SSH/exec — no proprietary-only access methods
- Test coverage is mandatory (see "Mock Test Infrastructure" section below)
Steps to add one:
- Create
{cloud}/lib/common.shwith the provider's primitives - Add an entry to
manifest.json→clouds - Add
"missing"entries to the matrix for every existing agent - Implement at least 2-3 agent scripts to prove the lib works
- Update the cloud's
README.md - Add test coverage (mandatory):
test/record.sh— add toALL_RECORDABLE_CLOUDS, add cases inget_endpoints(),get_auth_env_var(),call_api(),has_api_error(), and add a_live_{cloud}()functiontest/mock.sh— add cases in_strip_api_base(),_validate_body(),assert_cloud_api_calls(), andsetup_env_for_cloud()
DO NOT add GPU clouds (CoreWeave, RunPod, etc.). Spawn agents call remote LLM APIs for inference — they need cheap CPU instances with SSH, not expensive GPU VMs.
3. Add a new agent (only with community demand)
Do NOT add agents speculatively. Only add one if there's real community buzz:
Required evidence (at least 2 of these):
- 1000+ GitHub stars on the agent's repo
- Hacker News post with 50+ points (search:
https://hn.algolia.com/api/v1/search?query=AGENT_NAME) - Reddit post with 100+ upvotes in r/LocalLLaMA, r/MachineLearning, or r/ChatGPT
- Explicit user request in this repo's GitHub issues
Technical requirements:
- Installable via a single command (npm, pip, curl)
- Accepts API keys via env vars (
OPENAI_API_KEY,ANTHROPIC_API_KEY, orOPENROUTER_API_KEY) - Works with OpenRouter (natively or via
OPENAI_BASE_URLoverride)
To add: same steps as before (manifest.json entry, matrix entries, implement on 1+ cloud, README).
4. Respond to GitHub issues
Check gh issue list --repo OpenRouterTeam/spawn --state open for user requests:
- If someone requests an agent or cloud, implement it and comment with the PR link
- If something is already implemented, close the issue with a note
- If a bug is reported, fix it
4. Extend tests
test/run.sh contains the test harness. When adding a new cloud or agent:
- Add mock functions for the cloud's CLI/API calls
- Add per-script assertions matching the agent's setup steps
- Run
bash test/run.shto verify
File Structure Convention
spawn/
cli/
src/index.ts # CLI entry point (bun/TypeScript)
src/manifest.ts # Manifest fetch + cache logic
src/commands.ts # All subcommands (interactive, list, run, etc.)
src/version.ts # Version constant
package.json # npm package (@openrouter/spawn)
install.sh # One-liner installer (bun → npm → auto-install bun)
shared/
common.sh # Provider-agnostic shared utilities
{cloud}/
lib/common.sh # Cloud-specific functions (sources shared/common.sh)
{agent}.sh # Agent deployment scripts
.claude/skills/setup-agent-team/
trigger-server.ts # HTTP trigger server (concurrent runs, dedup)
discovery.sh # Discovery cycle script (fill gaps, scout new clouds/agents)
refactor.sh # Dual-mode cycle script (issue fix or full refactor)
start-discovery.sh # Launcher with secrets (gitignored)
start-refactor.sh # Launcher with secrets (gitignored)
.github/workflows/
discovery.yml # Scheduled + issue-triggered discovery workflow
refactor.yml # Scheduled + issue-triggered refactor workflow
manifest.json # The matrix (source of truth)
discovery.sh # Run this to trigger one discovery cycle
test/run.sh # Test harness
README.md # User-facing docs
CLAUDE.md # This file - contributor guide
Documentation Policy
NEVER commit documentation files to the repository. All documentation, testing guides, implementation notes, security audits, and similar files MUST be stored in .docs/ directory (git-ignored).
Examples of files that should NOT be committed:
TESTING_*.mdSECURITY_AUDIT.mdIMPLEMENTATION_NOTES.mdTODO.md- Any other internal documentation files
The only documentation files allowed in the repository are:
README.md(user-facing)CLAUDE.md(contributor guide)- Cloud-specific
README.mdfiles in{cloud}/README.md
If you need to create documentation during development, write it to .docs/ and add .docs/ to .gitignore.
Architecture: Shared Library Pattern
shared/common.sh - Core utilities used by all clouds:
- Logging:
log_info,log_warn,log_error(colored output) - Input handling:
safe_read(works in interactive and piped contexts) - OAuth flow:
try_oauth_flow,get_openrouter_api_key_oauth(browser-based auth) - Network utilities:
nc_listen(cross-platform netcat wrapper),open_browser - SSH helpers:
generate_ssh_key_if_missing,get_ssh_fingerprint,generic_ssh_wait - Security:
validate_model_id,json_escape
{cloud}/lib/common.sh - Cloud-specific extensions:
- Sources
shared/common.shat the top - Adds provider-specific functions:
- Sprite:
ensure_sprite_installed,get_sprite_name,run_sprite, etc. - Hetzner: API wrappers for server creation, SSH key management, etc.
- DigitalOcean: Droplet provisioning, API calls, etc.
- Vultr: Instance management via REST API
- Linode: Linode-specific provisioning functions
- Sprite:
Agent scripts ({cloud}/{agent}.sh):
- Source their cloud's
lib/common.sh(which auto-sourcesshared/common.sh) - Use shared functions for logging, OAuth, SSH setup
- Use cloud functions for provisioning and connecting to servers
- Deploy the specific agent with its configuration
Why This Structure?
- DRY principle: OAuth, logging, SSH logic written once in
shared/common.sh - Consistency: All scripts use same authentication and error handling patterns
- Maintainability: Bug fixes in shared code benefit all providers automatically
- Extensibility: New clouds only need to implement provider-specific logic
- Testability: Shared functions can be tested independently
Source Pattern
Every cloud's lib/common.sh starts with:
#!/bin/bash
# Cloud-specific functions for {provider}
# Source shared provider-agnostic functions
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/../../shared/common.sh" || {
echo "ERROR: Failed to load shared/common.sh" >&2
exit 1
}
# ... cloud-specific functions below ...
This pattern ensures:
- Shared utilities are always available
- Path resolution works when sourced from any location
- Script fails fast if shared library is missing
Shell Script Rules
These rules are non-negotiable — violating them breaks remote execution for all users.
curl|bash Compatibility
Every script MUST work when executed via bash <(curl -fsSL URL):
- NEVER use relative paths for sourcing (
source ./lib/...,source ../shared/...) - NEVER rely on
$0,dirname $0, orBASH_SOURCEresolving to a real filesystem path - ALWAYS use the local-or-remote fallback pattern:
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" 2>/dev/null && pwd)" if [[ -f "$SCRIPT_DIR/lib/common.sh" ]]; then source "$SCRIPT_DIR/lib/common.sh" else eval "$(curl -fsSL https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/{cloud}/lib/common.sh)" fi - Similarly,
{cloud}/lib/common.shMUST use the same fallback forshared/common.sh
macOS bash 3.x Compatibility
macOS ships bash 3.2. All scripts MUST work on it:
- NO
echo -e— useprintffor escape sequences - NO
source <(cmd)insidebash <(curl ...)— useeval "$(cmd)"instead - NO
((var++))withset -e— usevar=$((var + 1))(avoids falsy-zero exit) - NO
localkeyword inside( ... ) &subshells — not function scope - NO
set -u(nounset) — use${VAR:-}for optional env var checks instead
Conventions
#!/bin/bash+set -eo pipefail(nouflag)- Use
${VAR:-}for all optional env var checks (OPENROUTER_API_KEY, cloud tokens, etc.) - Remote fallback URL:
https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/{path} - All env vars documented in the cloud's README.md
Use Bun + TypeScript for Inline Scripting — NEVER python/python3
When shell scripts need JSON processing, HTTP calls, crypto, or any non-trivial logic:
- ALWAYS use
bun eval '...'or write a temp.tsfile andbun runit - NEVER use
python3 -corpython -cfor inline scripting — python is not a project dependency - Prefer
jqfor simple JSON extraction; fall back tobun evalwhen jq is unavailable - Pass data to bun via environment variables (e.g.,
_DATA="${var}" bun eval "...") or temp files — never interpolate untrusted values into JS strings - For complex operations (SigV4 signing, API calls with retries), write a heredoc
.tsfile andbun runit
Testing
- NEVER use vitest — use Bun's built-in test runner (
bun:test) exclusively - Test files go in
cli/src/__tests__/ - Run tests with
bun test - Use
import { describe, it, expect, beforeEach, afterEach, mock, spyOn } from "bun:test"
Mock Test Infrastructure (MANDATORY for new clouds)
When adding a new cloud provider, you MUST update all of these in test/mock.sh:
| Function | What to add |
|---|---|
_strip_api_base() |
URL pattern to strip the cloud's API base (e.g., https://api.newcloud.com/v1*) |
_validate_body() |
Server creation endpoint + required POST fields |
assert_cloud_api_calls() |
Expected API calls (SSH key fetch + server create endpoints) |
setup_env_for_cloud() |
Test env vars (API token, server name, plan, region) |
And in test/record.sh:
| Function | What to add |
|---|---|
ALL_RECORDABLE_CLOUDS |
Cloud name to the list |
get_endpoints() |
GET-only API endpoints for fixture recording |
get_auth_env_var() |
Auth env var name mapping |
call_api() |
API function dispatcher |
has_api_error() |
Error response detection for the cloud's API |
Without these, the cloud has no test coverage, body validation is missing, mock tests log UNHANDLED_URL warnings, and the QA cycle skips the cloud entirely.
CLI Version Management
CRITICAL: Bump the version on every CLI change!
- ANY change to
cli/requires a version bump incli/package.json - Use semantic versioning:
- Patch (0.2.X → 0.2.X+1): Bug fixes, minor improvements, documentation
- Minor (0.X.0 → 0.X+1.0): New features, significant improvements
- Major (X.0.0 → X+1.0.0): Breaking changes
- The CLI has auto-update enabled — users get new versions immediately on next run
- Version bumps ensure users always have the latest fixes and features
- NEVER commit
cli/cli.js— it is a build artifact (already in.gitignore). It is produced during releases, not checked into the repo. Do NOT usegit add -f cli/cli.js.
Autonomous Loops
When running autonomous discovery/refactoring loops (./discovery.sh --loop):
- Run
bash -non every changed .sh file before committing — syntax errors break everything - NEVER revert a prior fix — if
shared/common.shwas changed to fix macOS compat, don't undo it - NEVER re-introduce deleted functions — if
write_oauth_response_filewas removed, don't call it - NEVER change the source/eval fallback pattern in lib/common.sh files — it's load-bearing for curl|bash
- Test after EACH iteration — don't batch multiple changes without verification
- If a change breaks tests, STOP — revert and ask for guidance rather than compounding the regression
Refactoring Service
The automated refactoring service runs via .claude/skills/setup-agent-team/. It is triggered by GitHub Actions (on schedule, on issue open, or manual dispatch).
Architecture
trigger-server.ts — HTTP server (port 8080), spawns refactor.sh per trigger
start-refactor.sh — Sets env vars (secrets, MAX_CONCURRENT), execs trigger-server
refactor.sh — Dual-mode: issue fix or full refactor cycle
refactor.yml — GitHub Actions workflow that POSTs to the trigger server
Dual-Mode Cycles
refactor.sh detects its mode from the SPAWN_ISSUE env var (set by trigger-server.ts):
| Issue Mode | Refactor Mode | |
|---|---|---|
| Trigger | ?reason=issues&issue=N |
?reason=schedule |
| Teammates | 2 (issue-fixer, issue-tester) | 6 (security, ux, complexity, test, branch, community) |
| Prompt timeout | 15 min | 30 min |
| Hard timeout | 20 min | 40 min |
| Worktree | /tmp/spawn-worktrees/issue-N/ |
/tmp/spawn-worktrees/refactor/ |
| Team name | spawn-issue-N |
spawn-refactor |
| Pre-cycle cleanup | Skip | Branch/PR/worktree cleanup |
| Post-cycle commit | Skip (uses PR workflow) | Direct commit to main |
Concurrency
MAX_CONCURRENT=3allows 1 refactor + 2 issue runs simultaneously- Each run gets an isolated worktree — no cross-contamination
- Cleanup only touches its own worktree, never
rm -rf /tmp/spawn-worktrees - Duplicate issue triggers (same issue number already running) return 409 Conflict
- Capacity full returns 429 Too Many Requests
Modifying the Service
start-refactor.shis gitignored (containsTRIGGER_SECRET) — edit locally onlytrigger-server.tsandrefactor.share committed — changes require a PR- After merging changes, restart the service for them to take effect
- The refactor prompt uses
WORKTREE_BASE_PLACEHOLDERwhich getssed-substituted at runtime - Issue prompt uses heredoc variable expansion directly (not single-quoted)
Git Workflow
- Always work on a feature branch — never commit directly to main (except urgent one-line fixes)
- Before creating a PR, check
git statusandgit logto verify branch state - Use
gh pr createfrom the feature branch, thengh pr merge --squash - Every PR must be MERGED or CLOSED with a comment — never close silently
- If a PR can't be merged (conflicts, superseded, wrong approach), close it with
gh pr close {number} --comment "Reason" - Never rebase main or use
--forceunless explicitly asked
After Each Change
bash -n {file}syntax check on all modified scripts- Update
manifest.jsonmatrix status to"implemented" - Update the cloud's
README.mdwith usage instructions - Commit with a descriptive message