vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-04-28 03:49:31 +00:00

Sprite 288d191320 refactor: Migrate tests from vitest to bun:test and add testing rules

- Convert all test files to use bun:test instead of vitest
- Update CLAUDE.md to prohibit vitest, mandate bun:test
- Replace vi.fn() with mock() from bun:test
- Replace vi.spyOn with spyOn from bun:test
- Note: commands.test.ts needs module mocking refactor (TODO)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-08 04:29:37 +00:00


Spawn

Spawn is a matrix of agents x clouds. Every script provisions a cloud server, installs an agent, injects OpenRouter credentials, and drops the user into an interactive session.

The Matrix

manifest.json is the source of truth. It tracks:

agents — coding agents / AI tools (Claude Code, OpenClaw, NanoClaw, ...)
clouds — cloud providers to run them on (Sprite, Hetzner, ...)
matrix — which cloud/agent combinations are "implemented" vs "missing"

How to Improve Spawn

When run via ./improve.sh, your job is to pick ONE of these tasks and execute it:

1. Fill a missing matrix entry

Look at manifest.json → matrix for any "missing" entry. To implement it:

Find the cloud's lib/common.sh — it has all the provider-specific primitives (create server, run command, upload file, interactive session)
Find the agent's existing script on another cloud — it shows the install steps, config files, env vars, and launch command
Combine them: use the cloud's primitives to execute the agent's setup steps
The script goes at {cloud}/{agent}.sh
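Finding a candidate entry can be scripted. The following is a minimal sketch, assuming (hypothetically) that the matrix maps "cloud/agent" keys to a status string, one per line; the real manifest.json layout may differ, and the agent and cloud names in the demo data are placeholders. Where jq is available it would be a more robust choice than sed.

```shell
#!/bin/bash
# Sketch: list "missing" matrix entries.
# ASSUMPTION: matrix keys look like "cloud/agent": "missing", one per line.
# The real manifest.json layout may differ.

list_missing_entries() {
    # $1 = path to a manifest file
    # Match lines of the form  "cloud/agent": "missing"  and print only the key.
    sed -n 's|^[[:space:]]*"\([^"]*/[^"]*\)"[[:space:]]*:[[:space:]]*"missing".*$|\1|p' "$1"
}

# Demo against a throwaway manifest (hypothetical content).
demo_manifest="$(mktemp)"
cat > "$demo_manifest" <<'EOF'
{
  "matrix": {
    "sprite/claude": "implemented",
    "hetzner/claude": "missing",
    "hetzner/openclaw": "missing"
  }
}
EOF

missing_entries="$(list_missing_entries "$demo_manifest")"
printf '%s\n' "$missing_entries"
rm -f "$demo_manifest"
```

Each printed key would then map to a script path of the form {cloud}/{agent}.sh.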

Pattern for every script:

1. Source {cloud}/lib/common.sh (local or remote fallback)
2. Authenticate with cloud provider
3. Provision server/VM
4. Wait for readiness
5. Install the agent
6. Get OpenRouter API key (env var or OAuth)
7. Inject env vars into shell config
8. Write agent-specific config files
9. Launch interactive session
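The nine steps above can be sketched as a script skeleton. Every function name here is a hypothetical stub, not a real primitive from any cloud's lib/common.sh; real names and arguments differ per provider. The stubs only record that they ran, so the skeleton shows the ordering and nothing more.

```shell
#!/bin/bash
# Skeleton of a {cloud}/{agent}.sh script following the nine steps.
# ASSUMPTION: all functions below are hypothetical stubs standing in for
# primitives that step 1 would normally load from {cloud}/lib/common.sh.
set -eo pipefail

steps_run=""
record() { steps_run="${steps_run}${steps_run:+ }$1"; }

# Hypothetical stubs (a real script gets these from lib/common.sh).
cloud_authenticate()  { record auth; }
cloud_create_server() { record provision; }
cloud_wait_ready()    { record wait; }
agent_install()       { record install; }
get_api_key()         { record key; }
inject_env_vars()     { record env; }
write_agent_config()  { record config; }
start_session()       { record launch; }

main() {
    cloud_authenticate    # 2. Authenticate with cloud provider
    cloud_create_server   # 3. Provision server/VM
    cloud_wait_ready      # 4. Wait for readiness
    agent_install         # 5. Install the agent
    get_api_key           # 6. Get OpenRouter API key (env var or OAuth)
    inject_env_vars       # 7. Inject env vars into shell config
    write_agent_config    # 8. Write agent-specific config files
    start_session         # 9. Launch interactive session
}

main
printf 'steps: %s\n' "$steps_run"
```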

OpenRouter injection is mandatory. Every agent script MUST:

Set OPENROUTER_API_KEY in the shell environment
Set provider-specific env vars (e.g., ANTHROPIC_BASE_URL=https://openrouter.ai/api)
These come from the agent's env field in manifest.json
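One way to perform the injection is to append export lines to a shell config file. This is a sketch under stated assumptions: inject_env_var is a hypothetical helper, the key value is a placeholder, and the demo writes to a local temp file, whereas a real script would run the equivalent on the server through the cloud's run-command primitive.

```shell
#!/bin/bash
# Sketch: append export lines to a shell config file.
# ASSUMPTION: inject_env_var is a hypothetical helper and the key is a placeholder.

inject_env_var() {
    # $1 = rc file, $2 = variable name, $3 = value
    # Single-quote the value and escape embedded single quotes so it is stored literally.
    local escaped
    escaped="$(printf '%s' "$3" | sed "s/'/'\\\\''/g")"
    printf "export %s='%s'\n" "$2" "$escaped" >> "$1"
}

# Demo: write to a temp file instead of a real ~/.bashrc or ~/.zshrc.
rc_file="$(mktemp)"
inject_env_var "$rc_file" OPENROUTER_API_KEY "sk-or-demo-key"
inject_env_var "$rc_file" ANTHROPIC_BASE_URL "https://openrouter.ai/api"

rc_contents="$(cat "$rc_file")"
printf '%s\n' "$rc_contents"
rm -f "$rc_file"
```

Quoting the value matters because keys and URLs end up in a file that the shell later evaluates.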

2. Add a new cloud provider (PRIORITY)

We bias heavily toward adding more clouds/sandboxes over more agents. To add one:

Create {cloud}/lib/common.sh with the provider's primitives:
- Auth/token management (env var → config file → prompt)
- Server/container creation (API call or CLI)
- SSH/exec connectivity
- File upload
- Interactive session
- Server destruction
Add an entry to manifest.json → clouds
Add "missing" entries to the matrix for every existing agent
Implement at least 2 agent scripts (ideally 3) to prove the lib works
Update the cloud's README.md
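The "env var → config file → prompt" lookup order for auth tokens can be sketched as follows. The names resolve_token and EXAMPLECLOUD_TOKEN and the config path are hypothetical; a real lib/common.sh would probably prompt through the shared safe_read helper rather than a bare read.

```shell
#!/bin/bash
# Sketch of the "env var -> config file -> prompt" token lookup.
# ASSUMPTION: resolve_token, EXAMPLECLOUD_TOKEN and the config path are hypothetical.

resolve_token() {
    # $1 = name of the env var to check, $2 = path of a saved-token file
    local var_name="$1" config_file="$2" token=""

    # 1. Environment variable (indirect expansion with an empty default).
    token="${!var_name:-}"

    # 2. Config file saved by a previous run.
    if [[ -z "$token" && -f "$config_file" ]]; then
        token="$(cat "$config_file")"
    fi

    # 3. Interactive prompt, only when stdin is a terminal.
    if [[ -z "$token" && -t 0 ]]; then
        printf 'Enter API token: ' >&2
        read -r token
    fi

    [[ -n "$token" ]] || return 1
    printf '%s' "$token"
}

# Demo: the env var wins when set; otherwise the file is used.
export EXAMPLECLOUD_TOKEN="token-from-env"
from_env="$(resolve_token EXAMPLECLOUD_TOKEN /nonexistent/config)"
unset EXAMPLECLOUD_TOKEN

demo_cfg="$(mktemp)"
printf 'token-from-file' > "$demo_cfg"
from_file="$(resolve_token EXAMPLECLOUD_TOKEN "$demo_cfg")"
rm -f "$demo_cfg"

printf '%s\n%s\n' "$from_env" "$from_file"
```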

Good candidate clouds:

Container/sandbox platforms (fast spin-up, developer-friendly)
GPU clouds (CoreWeave, RunPod, Vast.ai, Together AI)
Regional providers with simple APIs (OVH, Scaleway, UpCloud)
Any provider with REST API or CLI + SSH/exec + pay-per-hour pricing

3. Add a new agent (only with community demand)

Do NOT add agents speculatively. Only add one if there's real community buzz:

Required evidence (at least 2 of these):

1000+ GitHub stars on the agent's repo
Hacker News post with 50+ points (search: https://hn.algolia.com/api/v1/search?query=AGENT_NAME)
Reddit post with 100+ upvotes in r/LocalLLaMA, r/MachineLearning, or r/ChatGPT
Explicit user request in this repo's GitHub issues

Technical requirements:

Installable via a single command (npm, pip, curl)
Accepts API keys via env vars (OPENAI_API_KEY, ANTHROPIC_API_KEY, or OPENROUTER_API_KEY)
Works with OpenRouter (natively or via OPENAI_BASE_URL override)

To add: follow the same steps as for a new cloud (manifest.json entry, matrix entries, implementation on at least one cloud, README update).

4. Respond to GitHub issues

Check gh issue list --repo OpenRouterTeam/spawn --state open for user requests:

If someone requests an agent or cloud, implement it and comment with the PR link
If something is already implemented, close the issue with a note
If a bug is reported, fix it

5. Extend tests

test/run.sh contains the test harness for the shell scripts. When adding a new cloud or agent:

Add mock functions for the cloud's CLI/API calls
Add per-script assertions matching the agent's setup steps
Run bash test/run.sh to verify
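A common way to mock a CLI in bash is to shadow it with a shell function that logs its arguments. The sketch below illustrates that idea only: examplecloud, create_server, and assert_called are hypothetical names, and the actual harness in test/run.sh may structure its mocks and assertions differently.

```shell
#!/bin/bash
# Sketch: mock a cloud CLI by shadowing it with a function that logs each call.
# ASSUMPTION: "examplecloud", create_server and assert_called are hypothetical.

MOCK_LOG="$(mktemp)"

# Mock: shell functions take precedence over binaries found on PATH.
examplecloud() {
    printf 'examplecloud %s\n' "$*" >> "$MOCK_LOG"
}

# Code under test (stand-in for a function in {cloud}/lib/common.sh).
create_server() {
    examplecloud server create --name "$1"
}

assert_called() {
    # $1 = exact call line expected in the mock log
    if grep -Fqx -- "$1" "$MOCK_LOG"; then
        printf 'PASS: %s\n' "$1"
    else
        printf 'FAIL: expected call: %s\n' "$1" >&2
        return 1
    fi
}

create_server demo-box
assert_called "examplecloud server create --name demo-box"
assert_result=$?

mock_calls="$(cat "$MOCK_LOG")"
rm -f "$MOCK_LOG"
```

Logging to a file rather than a variable keeps the record intact even when the code under test calls the mock from a subshell.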

File Structure Convention

spawn/
  cli/
    src/index.ts                 # CLI entry point (bun/TypeScript)
    src/manifest.ts              # Manifest fetch + cache logic
    src/commands.ts              # All subcommands (interactive, list, run, etc.)
    src/version.ts               # Version constant
    package.json                 # npm package (@openrouter/spawn)
    install.sh                   # One-liner installer (bun → npm → bash fallback)
    spawn.sh                     # Bash fallback CLI (no bun/node required)
  shared/
    common.sh                    # Provider-agnostic shared utilities
  {cloud}/
    lib/common.sh                # Cloud-specific functions (sources shared/common.sh)
    {agent}.sh                   # Agent deployment scripts
  manifest.json                  # The matrix (source of truth)
  improve.sh                     # Run this to trigger one improvement cycle
  test/run.sh                    # Test harness
  README.md                      # User-facing docs
  CLAUDE.md                      # This file - contributor guide

Architecture: Shared Library Pattern

shared/common.sh - Core utilities used by all clouds:

Logging: log_info, log_warn, log_error (colored output)
Input handling: safe_read (works in interactive and piped contexts)
OAuth flow: try_oauth_flow, get_openrouter_api_key_oauth (browser-based auth)
Network utilities: nc_listen (cross-platform netcat wrapper), open_browser
SSH helpers: generate_ssh_key_if_missing, get_ssh_fingerprint, generic_ssh_wait
Security: validate_model_id, json_escape
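To give a feel for the kind of helper that lives in this layer, here is an illustrative sketch. It is not the code in shared/common.sh: the demo_ prefix marks every function as hypothetical, and the set of characters accepted in a model ID is an assumption.

```shell
#!/bin/bash
# Sketch only: NOT the actual code in shared/common.sh.
# ASSUMPTION: the demo_ functions and the allowed character set are illustrative.

# Colored logging to stderr, using printf rather than echo -e.
demo_log_info()  { printf '\033[0;32m%s\033[0m\n' "$*" >&2; }
demo_log_error() { printf '\033[0;31m%s\033[0m\n' "$*" >&2; }

demo_validate_model_id() {
    # Reject empty input and anything outside a conservative character set,
    # so shell metacharacters never reach a config file or command line.
    case "$1" in
        "" | *[!A-Za-z0-9._:/-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Demo with a placeholder ID and an injection attempt.
if demo_validate_model_id "example-vendor/example-model-1.0"; then
    valid_result="ok"
else
    valid_result="rejected"
fi

if demo_validate_model_id 'x"; rm -rf /'; then
    invalid_result="ok"
else
    invalid_result="rejected"
fi

demo_log_info "valid id: $valid_result"
demo_log_error "suspicious id: $invalid_result"
```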

{cloud}/lib/common.sh - Cloud-specific extensions:

Sources shared/common.sh at the top
Adds provider-specific functions:
- Sprite: ensure_sprite_installed, get_sprite_name, run_sprite, etc.
- Hetzner: API wrappers for server creation, SSH key management, etc.
- DigitalOcean: Droplet provisioning, API calls, etc.
- Vultr: Instance management via REST API
- Linode: Linode-specific provisioning functions

Agent scripts ({cloud}/{agent}.sh):

Source their cloud's lib/common.sh (which auto-sources shared/common.sh)
Use shared functions for logging, OAuth, SSH setup
Use cloud functions for provisioning and connecting to servers
Deploy the specific agent with its configuration

Why This Structure?

DRY principle: OAuth, logging, SSH logic written once in shared/common.sh
Consistency: All scripts use same authentication and error handling patterns
Maintainability: Bug fixes in shared code benefit all providers automatically
Extensibility: New clouds only need to implement provider-specific logic
Testability: Shared functions can be tested independently

Source Pattern

Every cloud's lib/common.sh starts with:

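#!/bin/bash
# Cloud-specific functions for {provider}

# Source shared provider-agnostic functions.
# Use the local copy when it exists; fall back to the remote copy under curl|bash.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" 2>/dev/null && pwd)"
if [[ -f "$SCRIPT_DIR/../../shared/common.sh" ]]; then
    source "$SCRIPT_DIR/../../shared/common.sh"
else
    eval "$(curl -fsSL https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/shared/common.sh)"
fi

# Fail fast if the shared library did not load (log_info is defined there)
if ! type log_info >/dev/null 2>&1; then
    echo "ERROR: Failed to load shared/common.sh" >&2
    exit 1
fi

# ... cloud-specific functions below ...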

This pattern ensures:

Shared utilities are always available
Path resolution works when sourced from any location
Script fails fast if shared library is missing

Shell Script Rules

These rules are non-negotiable — violating them breaks remote execution for all users.

curl|bash Compatibility

Every script MUST work when executed via bash <(curl -fsSL URL):

NEVER use relative paths for sourcing (source ./lib/..., source ../shared/...)
NEVER rely on $0, dirname $0, or BASH_SOURCE resolving to a real filesystem path

ALWAYS use the local-or-remote fallback pattern:

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" 2>/dev/null && pwd)"
if [[ -f "$SCRIPT_DIR/lib/common.sh" ]]; then
    source "$SCRIPT_DIR/lib/common.sh"
else
    eval "$(curl -fsSL https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/{cloud}/lib/common.sh)"
fi

Similarly, {cloud}/lib/common.sh MUST use the same fallback for shared/common.sh

macOS bash 3.x Compatibility

macOS ships bash 3.2. All scripts MUST work on it:

NO echo -e — use printf for escape sequences
NO source <(cmd) inside bash <(curl ...) — use eval "$(cmd)" instead
NO ((var++)) with set -e: when var is 0 the expression evaluates to 0, which bash reports as exit status 1 and set -e treats as a failure. Use var=$((var + 1)) instead
NO local keyword inside ( ... ) & subshells — not function scope
NO set -u (nounset) — use ${VAR:-} for optional env var checks instead
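The compliant forms of several of these rules look like this in practice. It is a small sketch; SPAWN_DEMO_OPTIONAL_VAR is a hypothetical variable name used only for the demo.

```shell
#!/bin/bash
set -eo pipefail   # no -u flag, per the rules above

# printf instead of echo -e for escape sequences
greeting="$(printf 'line1\nline2')"

# var=$((var + 1)) instead of ((var++)): the latter returns status 1 when the
# value before the increment is 0, which aborts the script under set -e
count=0
count=$((count + 1))
count=$((count + 1))

# ${VAR:-} for optional env vars (SPAWN_DEMO_OPTIONAL_VAR is hypothetical)
if [[ -z "${SPAWN_DEMO_OPTIONAL_VAR:-}" ]]; then
    optional_state="unset"
else
    optional_state="set"
fi

# eval "$(cmd)" instead of source <(cmd)
eval "$(printf 'loaded_via_eval=yes')"

printf '%s\ncount=%s optional=%s eval=%s\n' \
    "$greeting" "$count" "$optional_state" "$loaded_via_eval"
```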

Conventions

#!/bin/bash + set -eo pipefail (no u flag)
Use ${VAR:-} for all optional env var checks (OPENROUTER_API_KEY, cloud tokens, etc.)
Remote fallback URL: https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/{path}
All env vars documented in the cloud's README.md

Testing

NEVER use vitest — use Bun's built-in test runner (bun:test) exclusively
Test files go in cli/src/__tests__/
Run tests with bun test
Use import { describe, it, expect, beforeEach, afterEach, mock, spyOn } from "bun:test"

Autonomous Loops

When running autonomous improvement/refactoring loops (./improve.sh --loop):

Run bash -n on every changed .sh file before committing — syntax errors break everything
NEVER revert a prior fix — if shared/common.sh was changed to fix macOS compat, don't undo it
NEVER re-introduce deleted functions — if write_oauth_response_file was removed, don't call it
NEVER change the source/eval fallback pattern in lib/common.sh files — it's load-bearing for curl|bash
Test after EACH iteration — don't batch multiple changes without verification
If a change breaks tests, STOP — revert and ask for guidance rather than compounding the regression
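The bash -n check from the first rule can be wrapped in a small helper. This is a sketch: check_shell_syntax is a hypothetical name, and in a real loop the file list would come from something like `git diff --name-only -- '*.sh'` rather than the temp files used here.

```shell
#!/bin/bash
# Sketch: syntax-check a list of shell files with bash -n.
# ASSUMPTION: check_shell_syntax is a hypothetical helper, not part of the repo.

check_shell_syntax() {
    # Returns 0 only if every file passed as an argument parses cleanly.
    local failed=0 file
    for file in "$@"; do
        if bash -n "$file" 2>/dev/null; then
            printf 'ok   %s\n' "$file"
        else
            printf 'FAIL %s\n' "$file" >&2
            failed=$((failed + 1))
        fi
    done
    [[ "$failed" -eq 0 ]]
}

# Demo: one valid script and one with a missing "fi".
demo_dir="$(mktemp -d)"
printf 'echo "hello"\n' > "$demo_dir/good.sh"
printf 'if true; then\n' > "$demo_dir/bad.sh"

if check_shell_syntax "$demo_dir/good.sh"; then
    good_result="pass"
else
    good_result="fail"
fi

if check_shell_syntax "$demo_dir/good.sh" "$demo_dir/bad.sh"; then
    mixed_result="pass"
else
    mixed_result="fail"
fi

rm -rf "$demo_dir"
```

A non-zero return from the helper is a natural point to stop the iteration before committing.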

Git Workflow

Always work on a feature branch — never commit directly to main (except urgent one-line fixes)
Before creating a PR, check git status and git log to verify branch state
Use gh pr create from the feature branch, then gh pr merge --squash
Never rebase main or use --force unless explicitly asked

After Each Change

bash -n {file} syntax check on all modified scripts
Update manifest.json matrix status to "implemented"
Update the cloud's README.md with usage instructions
Commit with a descriptive message
