All cloud claude.sh scripts had inline curl-only installs with no fallback.
When the curl installer failed (transient outage, rate limit), installation
failed with no recovery. Additionally, fnm-installed Node.js was invisible
to subsequent SSH sessions because each SSH command runs in a non-interactive
shell that doesn't source .bashrc/.zshrc.
Changes:
- Migrate 8 cloud scripts to use shared install_claude_code (curl → npm → bun)
- Move _ensure_node_runtime before npm/bun install attempts (not after)
- Add fnm paths to claude_path so node is discoverable across SSH sessions
- Prefix npm/bun install commands with claude_path for PATH visibility
- Update test assertion to match new install_claude_code behavior
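
A minimal sketch of the PATH-prefix idea, for illustration only. The function names `claude_path_prefix` and `with_claude_path` are hypothetical, and the directory locations (`~/.local/bin`, fnm's `aliases/default/bin`, `~/.bun/bin`) are assumptions rather than the exact value of claude_path in shared/common.sh:

```bash
# Hypothetical helper: emit an "export PATH=..." prefix for remote commands.
# The directories are assumed install locations, left unexpanded on purpose
# so that $HOME and $PATH are resolved by the remote shell.
claude_path_prefix() {
  local dirs='$HOME/.local/bin:$HOME/.local/share/fnm/aliases/default/bin:$HOME/.bun/bin'
  printf 'export PATH="%s:$PATH"; ' "$dirs"
}

# Prefix a command so it still finds node/claude in a non-interactive
# SSH shell that never sources .bashrc or .zshrc.
with_claude_path() {
  printf '%s%s' "$(claude_path_prefix)" "$1"
}

# Usage: ssh user@host "$(with_claude_path 'npm install -g @anthropic-ai/claude-code')"
```

Because the prefix travels with every command, it does not matter which rc files the remote shell reads.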
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 5 composable helper functions to shared/common.sh (install_agent,
verify_agent, get_or_prompt_api_key, inject_env_vars_cb, launch_session)
using the same callback pattern as offer_github_auth and
setup_claude_code_config. Refactor all 15 hetzner scripts to use them,
reducing total line count from 868 to 579 (-33%).
Add install_claude_code helper with 3-method fallback (curl → npm → bun)
and per-step error logging. When npm/bun fallback needs node, installs it
via fnm (platform-agnostic) with nodesource as Debian/Ubuntu fallback.
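
The fallback chain can be sketched roughly as follows. This is not the actual install_claude_code implementation; `install_with_fallback` and the three `install_via_*` functions are hypothetical names, and the npm/bun package name is an assumption:

```bash
# Try each installer in order and stop at the first one that succeeds.
# Each argument names a function (or command) that returns 0 on success.
install_with_fallback() {
  local method
  for method in "$@"; do
    echo "Trying install method: ${method}" >&2
    if "$method"; then
      echo "Installed via ${method}" >&2
      return 0
    fi
    echo "Install method failed: ${method}" >&2
  done
  echo "All install methods failed" >&2
  return 1
}

# Hypothetical method implementations.
install_via_curl() {
  local script
  # Download first so a failed curl is reported instead of piping nothing to bash.
  script="$(curl -fsSL https://claude.ai/install.sh)" || return 1
  bash -c "$script"
}
install_via_npm() { npm install -g @anthropic-ai/claude-code; }
install_via_bun() { bun install -g @anthropic-ai/claude-code; }

# Usage: install_with_fallback install_via_curl install_via_npm install_via_bun
```

Logging each step to stderr keeps the per-method failure visible even when a later method succeeds.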
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: wire shared/github-auth.sh into all agent flows
Add offer_github_auth() to shared/common.sh and call it from the
inject_env_vars_* functions so all agent flows automatically offer
GitHub CLI setup after env var injection — no per-script changes needed.
Changes:
- shared/common.sh: add offer_github_auth() function, call it from
inject_env_vars_ssh() and inject_env_vars_local()
- sprite/lib/common.sh: call offer_github_auth() from
inject_env_vars_sprite()
- OVH is covered automatically (inject_env_vars_ovh delegates to
inject_env_vars_ssh)
Behavior:
- Prompts "Set up GitHub CLI (gh) on this machine? (y/N):"
- Defaults to No (non-blocking for users who don't need it)
- Skippable via SPAWN_SKIP_GITHUB_AUTH=1 env var for CI/automation
- Uses safe_read for curl|bash compatibility
- Downloads and runs shared/github-auth.sh on the remote VM
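
The prompt behavior above can be sketched as follows. `offer_optional_step` is a hypothetical name, and this version reads from stdin with plain `read` instead of the real safe_read helper:

```bash
# Default-No prompt that automation can skip with an env var.
# Returns 0 only when the user explicitly answers yes.
offer_optional_step() {
  local answer=""
  if [ "${SPAWN_SKIP_GITHUB_AUTH:-}" = "1" ]; then
    return 1   # skipped: behave as if the user answered No
  fi
  printf 'Set up GitHub CLI (gh) on this machine? (y/N): ' >&2
  read -r answer || answer=""
  case "$answer" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;   # empty input or anything else means No
  esac
}

# Usage: if offer_optional_step; then echo "would run github-auth.sh"; fi
```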
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: add shared agent setup helpers, deduplicate hetzner scripts (#1236)
Add 5 composable helper functions to shared/common.sh (install_agent,
verify_agent, get_or_prompt_api_key, inject_env_vars_cb, launch_session)
that use the same callback pattern as offer_github_auth and
setup_claude_code_config. Refactor all 15 hetzner agent scripts to use
them, reducing total line count from 868 to 579 (-33%).
Phase 1 of multi-phase rollout — remaining clouds to follow.
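
A rough illustration of the callback pattern, assuming the helper receives the name of a "run" function as an argument. `verify_agent_sketch` and `run_local` are hypothetical and do not reflect the real signatures in shared/common.sh:

```bash
# Callback pattern: the helper takes the name of a function that knows how
# to execute a command on the target (SSH, sprite console, local shell, ...).
verify_agent_sketch() {
  local agent_name="$1" check_cmd="$2" run_cb="$3"
  if "$run_cb" "$check_cmd" >/dev/null 2>&1; then
    echo "${agent_name} is installed"
  else
    echo "${agent_name} verification failed" >&2
    return 1
  fi
}

# A local runner for illustration; a cloud script would pass its own SSH runner.
run_local() { bash -c "$1"; }

# Usage: verify_agent_sketch "Claude Code" "command -v claude" run_local
```

The helper stays cloud-agnostic because only the callback knows how commands reach the machine.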
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests calling loadManifest(true) with mocked fetch were writing test
manifests (only 2 agents) to the real ~/.cache/spawn/manifest.json.
This caused `spawn` to show only "Claude Code" and "Aider" instead
of all 15 agents.
Root cause: CACHE_DIR/CACHE_FILE were computed once at import time,
so tests setting XDG_CACHE_HOME in beforeEach() had no effect.
Fix:
- Make CACHE_DIR/CACHE_FILE dynamic via getter functions so test
isolation via XDG_CACHE_HOME actually works
- Skip disk writes in test environments unless XDG_CACHE_HOME is
explicitly set (tests that need disk cache use setupTestEnvironment
which sets XDG_CACHE_HOME to a temp dir)
- Bump CLI version to 0.2.88
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #1180
When running `spawn <agent>` (e.g., `spawn claude`), now shows an interactive
cloud picker instead of requiring the full command or showing agent info.
- Add cmdAgentInteractive() function for agent-first cloud selection
- Route `spawn <agent>` to interactive picker when in TTY mode
- Fall back to agent info display in non-interactive contexts
- Update help text to reflect new interactive behavior
- Version bump 0.2.83 → 0.2.84
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
API keys and env vars were only written to .zshrc, so SSH sessions using
bash couldn't find credentials. Also fixes incorrect ~/.claude/local/bin
PATH (claude installs to ~/.local/bin) and syncs interactive_session PATH
with cloud-init PATH across all 9 clouds.
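
A minimal sketch of writing a credential to both rc files, assuming `.bashrc` and `.zshrc` are the files of interest. `persist_env_var` is a hypothetical name, not the function used by the scripts:

```bash
# Append an export line to every rc file a bash or zsh session might read.
# The optional third argument overrides the home directory (useful in tests).
persist_env_var() {
  local name="$1" value="$2" home_dir="${3:-$HOME}" rc
  for rc in "${home_dir}/.bashrc" "${home_dir}/.zshrc"; do
    # %q shell-quotes the value so spaces and metacharacters survive.
    printf 'export %s=%q\n' "$name" "$value" >> "$rc"
  done
}

# Usage: persist_env_var OPENROUTER_API_KEY "$key"
```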
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixed Hetzner installation issue where curl to claude.ai/install.sh
was returning 403 errors. Added fallback to use bun (already installed
by cloud-init) to install Claude Code.
Also added --debug flag to enable verbose bash output (set -x) for
easier troubleshooting.
Changes:
- hetzner/claude.sh: Use bun fallback installation method
- CLI: Added --debug flag support (v0.2.86)
- shared/common.sh: Enable set -x when SPAWN_DEBUG=1
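
The debug switch amounts to something like the following sketch; the function name `enable_debug_if_requested` is hypothetical:

```bash
# Turn on bash command tracing only when the caller opts in.
enable_debug_if_requested() {
  if [ "${SPAWN_DEBUG:-0}" = "1" ]; then
    set -x
  fi
}

# Usage: call once near the top of a script, then run with SPAWN_DEBUG=1.
```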
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* test: fix failing agent-config-setup tests by handling $HOME path substitution
Fixed 15 failing tests in agent-config-setup.test.ts by adding proper $HOME
path substitution in mock_run callbacks. The config setup functions use
$HOME instead of ~ in the mv commands, but the test mocks were only
replacing ~/ paths. Now all mock_run callbacks properly replace both:
- ~/ paths (for mkdir commands)
- $HOME paths (for mv commands in upload_config_file)
All 8061 CLI tests now pass. All 80 shell tests remain passing.
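
The double substitution can be sketched like this. `rewrite_home_paths` is a hypothetical helper, not the code in the test file:

```bash
# Redirect both "$HOME/..." and "~/..." references in a command string
# to a sandbox directory before the mock evaluates it.
rewrite_home_paths() {
  local cmd="$1" dir="$2"
  cmd=${cmd//\$HOME/$dir}   # literal "$HOME" becomes the sandbox dir
  cmd=${cmd//\~\//$dir/}    # literal "~/" becomes "<sandbox dir>/"
  printf '%s\n' "$cmd"
}

# Usage: rewrite_home_paths 'mv /tmp/x $HOME/.claude/settings.json' "$temp_dir"
```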
Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: clarify monitoring loop requirements to prevent early session termination (#1194)
All four run modes (team_building, triage, review_all, scan) now have
explicit "Monitor Loop (CRITICAL)" sections with step-by-step instructions:
1. Call TaskList to check task status
2. Process completed tasks/messages
3. Call Bash("sleep 15") to wait
4. REPEAT until done or timeout
This fixes the issue where team leads would spawn teammates, then fail
to enter the monitoring loop, causing the session to end prematurely
(since "session ENDS when you produce a response with NO tool calls").
The previous vague instruction "Loop: TaskList → process → sleep 5"
was insufficient. The new format makes it crystal clear that:
- The loop must be INFINITE (keep repeating)
- EVERY iteration must include BOTH TaskList AND Bash sleep calls
- The session will end if you stop calling tools
This addresses the bug where review_all sessions ended after ~115s
instead of running the full 30min cycle.
Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: document Sprite vs normal VM paths in SKILL.md, never invent directories (#1196)
- Add environment table: Sprite VMs use /home/sprite/, normal VMs use /root/
- Replace all hardcoded /root/spawn paths with <REPO_ROOT> placeholders
- Instruct agents to ask the user for the repo path, never guess
- Explicitly ban inventing directories like /home/claude-runner/
Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: fix agent-config-setup.test.ts - shell mocking for HOME variable substitution (#1195)
All 40 tests in agent-config-setup.test.ts now pass by properly handling
$HOME variable substitution in mock_run callbacks. Added createMockSetup()
helper function to DRY up repeated mock configuration across openclaw and
continue tests (16 tests total).
Changes:
- Fix mock_run() to replace $HOME before evaluating commands
- Add createMockSetup(tempDir, configDir) helper to reduce code duplication
- Update all setup_openclaw_config and setup_continue_config tests to use helper
- Ensures /tmp/spawn_config_* temp files are redirected to temp test directory
Agent: test-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* refactor: reduce complexity in cmdConnect and setup_claude_code_config (#1191)
Extract helper functions to reduce nesting and duplication:
1. cmdConnect (54 → 28 lines): Extract runInteractiveCommand() helper to
eliminate duplicate spawn/Promise handling for Sprite and SSH connections
2. interactiveListPicker (48 → 21 lines): Extract handleRecordAction() helper
to reduce nesting in reconnect/rerun logic
3. setup_claude_code_config (46 → 40 lines): Extract _generate_claude_code_settings()
and _generate_claude_code_state() helpers to clarify JSON generation and
make the main function focus on orchestration
All changes preserve existing behavior and pass existing tests.
Agent: complexity-hunter
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Security Reviewer <security-reviewer@spawn.dev>
Co-authored-by: pr-maintainer <pr-maintainer@spawn>
Reduce from 41 cloud providers to 10 (9 + local) curated for launch:
- local (free), oracle (free tier), hetzner (~€3.29/mo), ovh (~€3.50/mo),
fly (free tier), aws-lightsail ($3.50/mo), daytona (pay-per-second),
digitalocean ($4/mo), gcp ($7.11/mo), sprite (Fly.io VMs)
Changes:
- Remove 30 cloud directories, test fixtures, and provider-specific tests
- Slim manifest.json from 600 to 150 matrix entries, sorted by price
- Update CLAUDE.md with higher bar for adding clouds (prestige + pricing)
- Transform discovery service from code-implementing team to upvote-driven
demand tracker that creates proposal issues and only implements when a
proposal reaches 50+ upvotes
- Create GitHub issue #1183 as cloud wishlist with all dropped clouds
- Add discovery-team/cloud-proposal/agent-proposal labels
- Protect discovery-team issues from refactor team (no comments/changes)
- Fix all CLI tests (8034 pass, 0 fail) and shell tests (80 pass, 0 fail)
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add VM reconnect functionality to spawn list (#1144)
Implements ability to reconnect to previously spawned VMs instead of
always creating new instances. Changes include:
- Add VMConnection interface to track IP, user, and server metadata
- Add save_vm_connection() bash function for scripts to persist connection info
- Modify spawn list to show connection status and offer reconnect option
- Support both SSH (cloud providers) and sprite console reconnection
- Update digitalocean/claude.sh and sprite/claude.sh as reference implementations
Fixes #1144
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* improve: add helpful error message when VM reconnect fails
Show user-friendly message suggesting to spawn a new VM if
reconnection fails, rather than just showing raw SSH error.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Added comprehensive test suite for cmdLast function (PR #1171 feature).
Covers:
- Empty history (no records)
- History with records (rerunning latest)
- Record hints and prompt display
- Helper functions (buildRecordLabel, buildRecordHint)
- Edge cases (old timestamps, metadata fields, selection logic)
Tests increased from 13,685 to 13,712 (+27 tests).
Agent: test-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit fixes 3 critical reliability bugs in shared/common.sh:
1. Float arithmetic in OAuth polling loop (line 702)
- Bug: elapsed=$((elapsed + POLL_INTERVAL)) fails when POLL_INTERVAL is decimal
- Impact: OAuth timeout detection breaks when users set SPAWN_POLL_INTERVAL=0.5
- Fix: Use python3 for float addition with integer fallback
2. Missing error handling in extract_ssh_key_ids (line 1249)
- Bug: No error handling when python3 fails or API returns malformed JSON
- Impact: Silent failures in SSH key provisioning across 7+ cloud providers
- Fix: Add error handling with clear diagnostic messages
3. Unsafe fallback in calculate_retry_backoff (line 1312)
- Bug: Empty interval returned if python3 unavailable and echo fails
- Impact: sleep "" errors break retry loops in all cloud API wrappers
- Fix: Add input validation and use printf instead of echo
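
The first fix can be illustrated with a sketch along these lines. `add_interval` is a hypothetical name, and the integer fallback policy (truncate, never add less than 1) is an assumption:

```bash
# Add two numbers that may be decimals (e.g. 1 + 0.5).
# Uses python3 when available; otherwise falls back to integer arithmetic.
add_interval() {
  local a="$1" b="$2" result="" ia ib
  if command -v python3 >/dev/null 2>&1; then
    result="$(python3 -c 'import sys; print(float(sys.argv[1]) + float(sys.argv[2]))' "$a" "$b" 2>/dev/null)" || result=""
  fi
  if [ -z "$result" ]; then
    # Integer fallback: drop fractional parts, and never add less than 1
    # so a polling loop still makes progress toward its timeout.
    ia="${a%%.*}"
    ib="${b%%.*}"
    [ -n "$ia" ] || ia=0
    [ -n "$ib" ] && [ "$ib" -gt 0 ] 2>/dev/null || ib=1
    result=$((ia + ib))
  fi
  printf '%s\n' "$result"
}

# Usage: elapsed="$(add_interval "$elapsed" "$POLL_INTERVAL")"
```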
All tests pass (13685 pass, 0 fail).
Agent: code-health
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* security: fix command injection in upload_config_file via unquoted path
VULNERABILITY: The upload_config_file() function passes remote_path
to mv without proper quoting, enabling command injection if the path
contains spaces or shell metacharacters.
IMPACT: HIGH — While current callers use hardcoded paths (~/.claude/...),
the function signature accepts arbitrary paths, making this a latent
vulnerability. A malicious or crafted path could execute arbitrary
commands on the remote server.
FIX: Double-quote remote_path in all command contexts (dirname, mv).
Tilde expansion still works correctly in double quotes when the tilde
is at the start of the path.
BEFORE:
mv '${temp_remote}' ${remote_path}
# If remote_path = "~/.config; rm -rf /" → command injection
AFTER:
mv '${temp_remote}' "${remote_path}"
# Path is properly quoted, no injection possible
Tracked in: #763
Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: replace ~ with $HOME in upload_config_file callers
- Replace ~ with $HOME in all upload_config_file calls (lines 2432, 2443, 2522, 2575)
- Update comment to clarify tilde does not expand inside double quotes
- Update documentation example to use $HOME instead of ~
This addresses the review feedback that tilde expansion does not work
inside double quotes in bash. Using $HOME allows proper path expansion
on the remote shell while maintaining secure double-quoting.
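
A small sketch of why $HOME is needed here. `build_remote_mv` is a hypothetical helper that only builds the command string. Double quotes stop word splitting but still allow `$(...)` expansion, so paths should still come from trusted callers:

```bash
# Build the mv command that will run on the remote shell.
# The destination is wrapped in double quotes, where "$HOME" expands
# but a leading "~" does not.
build_remote_mv() {
  local temp_remote="$1" remote_path="$2"
  printf "mv '%s' \"%s\"" "$temp_remote" "$remote_path"
}

# Pass $HOME in single quotes so it is expanded remotely, not locally:
#   ssh host "$(build_remote_mv /tmp/spawn_config_x '$HOME/.claude/settings.json')"
```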
Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
echo "$response" | grep -q can cause "write error: Broken pipe" when
grep -q exits early and echo gets SIGPIPE. This is non-deterministic
and depends on response size and timing, which is why it only fails
intermittently in CI. Using [[ == *pattern* ]] avoids pipes entirely.
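
A minimal sketch of the pipe-free check; `response_contains` is a hypothetical wrapper name:

```bash
# Substring check without a pipe, so there is no writer to receive SIGPIPE.
response_contains() {
  local response="$1" needle="$2"
  # Quoting the needle makes it match literally rather than as a glob.
  [[ "$response" == *"$needle"* ]]
}

# Usage: if response_contains "$response" '"status":"active"'; then ...; fi
```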
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates test assertion strings in 10 test files to match current
implementation error messages. Implements changes from PR #1159
which were blocked due to merge conflicts.
Fixes #1161
Agent: test-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds a new `spawn last` command (with `rerun` alias) that instantly
reruns the most recent spawn from history without requiring the
interactive picker. This improves the workflow for users who frequently
want to restart their last session.
Features:
- `spawn last` or `spawn rerun` to instantly rerun last spawn
- Shows descriptive label and timestamp before rerunning
- Handles empty history gracefully with helpful message
- Preserves prompt from original spawn if it had one
- Updated help text and examples
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit fixes 3 high-impact reliability issues that could cause runtime failures:
1. **OAuth server PID race condition** (shared/common.sh)
- BEFORE: Used pgrep to find server PID, which could match wrong processes
- AFTER: Store PID in a file and read it reliably
- IMPACT: Prevents OAuth cleanup failures and orphaned server processes
2. **Unhandled curl failures in OAuth code exchange** (shared/common.sh)
- BEFORE: curl failures returned empty response without error detection
- AFTER: Check curl exit code and report network/API errors clearly
- IMPACT: Users get actionable feedback instead of cryptic "empty key" errors
3. **Missing error handling in script download** (cli/src/commands.ts)
- BEFORE: Caught download error but continued execution with undefined scriptContent
- AFTER: Exit early when download fails to prevent crash
- IMPACT: Prevents "Cannot read property of undefined" runtime errors
All changes preserve existing behavior while adding defensive error handling.
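
The PID-file approach from item 1 can be sketched as below. `start_with_pidfile` and `stop_from_pidfile` are hypothetical names, not the functions in shared/common.sh:

```bash
# Start a background process and record its PID in a file.
start_with_pidfile() {
  local pidfile="$1"
  shift
  "$@" &
  echo "$!" > "$pidfile"
}

# Stop exactly the recorded process, instead of guessing with pgrep.
stop_from_pidfile() {
  local pidfile="$1" pid
  [ -f "$pidfile" ] || return 0
  pid="$(cat "$pidfile")"
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null || true
  fi
  rm -f "$pidfile"
}

# Usage: start_with_pidfile /tmp/oauth.pid some_server_command; ...; stop_from_pidfile /tmp/oauth.pid
```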
Agent: code-health
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Diversify help command examples to showcase more agents and clouds
(openclaw, goose, interpreter, vultr, digitalocean, linode)
- Remove duplicate "Auth: token" text in cloud info display
- Update test to match new help examples
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Bots no longer run on Sprite VMs. Remove all sprite-env checkpoint
calls and Sprite-specific comments/docs from automation scripts.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Previously only org members were allowed. Now checks both org membership
and repo collaborator status, so invited collaborators can open issues
and PRs without being blocked.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New teammate that proactively scans for reliability, maintainability,
readability, testability, scalability, and best practice issues. Picks
top 3 highest-impact findings per cycle and fixes them in one PR.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split checkEntity into three focused helpers that each handle a specific
correction strategy (wrong kind, same-kind typo, opposite-kind typo).
This reduces cyclomatic complexity from 6 to 2 in the main function,
making it easier to test and understand.
Agent: complexity-hunter
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Extract cmdHelp's 76-line help message into 7 modular helper functions
(getHelpUsageSection, getHelpExamplesSection, getHelpAuthSection,
getHelpInstallSection, getHelpTroubleshootingSection, getHelpEnvVarsSection,
getHelpFooterSection) to improve maintainability and allow reuse.
Extract cmdAgentInfo's cloud listing logic into printAgentCloudsList helper
to reduce the function's cognitive load and separate display concerns.
Both refactorings maintain identical user-facing behavior while reducing
code duplication and improving testability.
Agent: complexity-hunter
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fix install-script-validation tests that checked for non-existent
source-mode fallback features (PRs #707, #710 were not implemented)
- Rename test suite to "build fallback and binary download" to match
actual behavior (pre-built binary download, not source mode)
- Remove assertions for non-existent features (${HOME}/.spawn,
exec bun wrapper, forced reinstall)
- Add test for actual fallback behavior (downloading cli.js from releases)
- Fix download-and-failure test to match actual error message casing
("Firewall or proxy" not "firewall or proxy")
These tests were blocking CI and obscuring the difference between the actual
and the desired implementation. The tests now accurately reflect the current
install.sh behavior.
Agent: test-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Fixes #1145
Replaces numeric input with interactive fuzzy picker for server/location selection.
- Uses fzf when available for interactive filtering
- Falls back to numbered list when fzf is not installed
- Applies to all interactive_pick flows (Hetzner locations, server types, etc.)
- Improves UX with type-to-filter capability
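
A rough sketch of the fzf-with-fallback idea. `pick_option` and `pick_numbered` are hypothetical names and are not the actual interactive_pick implementation:

```bash
# Numbered-list fallback: print options to stderr, read a number from stdin.
pick_numbered() {
  local i=1 choice opt
  for opt in "$@"; do
    printf '%d) %s\n' "$i" "$opt" >&2
    i=$((i + 1))
  done
  printf 'Enter number: ' >&2
  read -r choice || return 1
  case "$choice" in ''|*[!0-9]*) return 1 ;; esac   # digits only
  choice=$((10#$choice))                            # normalize leading zeros
  [ "$choice" -ge 1 ] && [ "$choice" -le "$#" ] || return 1
  printf '%s\n' "${!choice}"                        # the Nth positional argument
}

# Use fzf for type-to-filter selection when installed, else fall back.
pick_option() {
  if command -v fzf >/dev/null 2>&1; then
    printf '%s\n' "$@" | fzf --prompt='Select: '
  else
    pick_numbered "$@"
  fi
}

# Usage: location="$(pick_option fsn1 nbg1 hel1)"
```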
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Enhanced error messages in gcp/lib/common.sh to provide comprehensive,
platform-specific guidance for users getting started with GCP:
Changes:
- Added detailed install instructions for macOS (Homebrew), Ubuntu/Debian,
and Fedora/RHEL platforms
- Included post-installation steps (auth and project configuration)
- Added links to create GCP project and enable Compute Engine API
- Improved error message structure with clear "How to fix" sections
- Maintained color-coded output for better readability
This addresses the issue where users felt the GCP flow "doesn't guide
me toward finding the right resource". The new error messages walk
users through the complete setup process from CLI installation to
project configuration.
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
When there are ≤10 open issues, the team lead must assign them to
teammates and deliver PRs immediately instead of acknowledging and
deferring. Community-coordinator is now required to assign every issue
to a teammate for fixing, never just comment and move on.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use sonnet instead of opus for security triage and routine tasks
- Triage mode: run claude with --model sonnet (single-agent safety check
doesn't need opus)
- PR reviewers: opus → sonnet (routine review work)
- Issue-checker: haiku → sonnet (needs better judgment for dedup/labels)
- Kept opus for: team-building implementer/reviewer, full scan auditors
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use moonshotai/kimi-k2.5 for issue triage and community coordination
- Security triage mode: sonnet → moonshotai/kimi-k2.5
- Security issue-checker: sonnet → moonshotai/kimi-k2.5
- Refactor community-coordinator: sonnet → moonshotai/kimi-k2.5
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two problems fixed:
1. issue-checker was posting duplicate triage comments ("Re-triage",
"Status Check") on already-triaged issues. Strengthened dedup to
check for both security/triage AND security/issue-checker sign-offs,
and explicitly banned re-triage comment patterns.
2. Issues stuck in "under-review" forever because no agent transitioned
them to "safe-to-work". Added silent label progression: if a triage
comment exists, move under-review → safe-to-work without commenting.
Also fixed triage mode to recognize issue-checker sign-offs as prior
triage to prevent cross-agent duplicate work.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update test expectations to match current UX error messages:
- "Cannot run interactive picker" instead of "No interactive terminal"
- "Next steps" instead of "What to do"
- "experiencing issues" instead of "recovering"
- "Firewall or proxy" (capitalized) instead of "firewall or proxy"
All affected tests now pass with the current CLI error messages.
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The default GITHUB_TOKEN lacks issues and pull-requests write access,
causing 403 when trying to close issues/PRs from non-org members.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Automatically closes issues and PRs opened by non-members of the
OpenRouterTeam org with an explanatory comment.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updated failing test cases to match the actual error messages generated by the CLI:
- "Cannot run interactive picker: not a terminal" (not "No interactive terminal")
- "Try manual installation:" (not "Try the installation manually")
- "Retry with a fresh server" (not "Re-run spawn to try")
- "installation failed" (not "installation failed to complete successfully")
- "Next steps" (not "What to do")
- "temporarily unavailable" (not "recovering")
Shell tests (80/80) pass. CLI tests improved from 128 failures to 47 failures.
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The mock test assertion was checking for GET /sshkeys but the actual
Scaleway API endpoint is /ssh-keys (with a hyphen), causing all 15
scaleway agent tests to fail the "fetches SSH keys" check.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Improved the error message when spawn is run without arguments in a
non-interactive environment (piped/redirected stdin/stdout).
Before: 'No interactive terminal detected.'
After: 'Cannot run interactive picker: not a terminal'
'(stdin/stdout is piped or redirected)'
The new message states why the interactive picker cannot run and names
the actual cause (stdin/stdout is piped or redirected) instead of only
reporting that no terminal was detected.
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Extracted helper functions to improve code maintainability:
1. shared/common.sh:
- Extracted _prompt_and_validate_api_key() from get_openrouter_api_key_manual()
- Simplified API key validation loop and confirmation logic
2. cli/commands.ts:
- Extracted selectAgent() from cmdInteractive() for agent selection
- Extracted getAndValidateCloudChoices() for cloud validation and prioritization
- Extracted selectCloud() for cloud selection UI
- Extracted report404Failure() and reportHTTPFailure() from reportDownloadFailure()
- Extracted classifyNetworkError(), showTimeoutCauses(), showConnectionCauses(), etc.
- Simplified error handling with switch statement in reportDownloadError()
These changes reduce cyclomatic complexity and improve testability while preserving
all existing functionality.
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds a /cleanup-branches skill that deletes local and remote branches
with no open PR, and prunes stale worktrees. Supports --dry-run to
preview what would be deleted. Protects branches with open PRs.
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: fix codesandbox provider pattern tests for helper function indirection
Update tests to account for functions that delegate to SDK helpers
(_csb_sdk_eval and _csb_run_cmd) rather than directly inlining SDK code.
Also add aliyun CLI auth pattern to credential handling test.
- Fix codesandbox tests to check for helper calls when patterns aren't direct
- Update test_codesandbox_token test to accept "How to fix" variant
- Allow interactive_session validation to check via run_server delegation
- Fixed: 42 codesandbox failures reduced to 0, 1 alibabacloud failure fixed
Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* test: fix alibabacloud provider pattern tests for delegation
Update tests to account for alibabacloud delegating to shared SSH functions
instead of implementing SSH/SCP directly. Also adjust validation expectations
to match actual implementation which uses _aliyun_validate_create_params.
- Accept _aliyun_validate_create_params as validation pattern
- Update SSH test expectations for ssh_run_server and ssh_interactive_session
- Fixed: 6 alibabacloud failures reduced to 0
Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* test: fix new-cloud-provider-patterns codesandbox validation tests
Update tests to account for codesandbox delegating to _csb_run_cmd helper
and interactive_session delegating to run_server.
- Accept _csb_run_cmd as SDK execution pattern
- Allow interactive_session validation via run_server delegation
- Fixed: 2 codesandbox validation failures reduced to 0
Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Enhance user-facing error messages with better structure and visual hierarchy:
**CLI Error Messages:**
- Add bold headers for "Next steps:" and "Possible causes:" sections
- Make action items more scannable and directive
- Simplify language (e.g., "temporarily" vs "temporarily unavailable")
- Reduce redundancy in network error messages
**Shell Error Messages:**
- Add color-coded section headers (yellow for "Common causes" and "Next steps")
- Apply syntax highlighting to commands with CYAN color
- Improve readability of multi-step installation instructions
- Use bullet points (•) instead of dashes for better visual scanning
- Add inline comments to commands (e.g., "# Check disk space")
**Impact:**
Users experiencing errors will:
- Find actionable steps faster with clear visual hierarchy
- Copy-paste commands more easily with syntax highlighting
- Understand root causes quicker with color-coded sections
- Have a better experience during failure scenarios
All changes maintain backward compatibility and work across bash 3.x (macOS) and modern bash.
Agent: ux-engineer
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Agents were posting redundant comments on issues because dedup checks
were soft prompt instructions that agents didn't reliably follow.
Strengthens all three team prompts with explicit STRICT DEDUP rules:
- security/issue-checker: skip issues entirely if already commented,
do label fixes silently without commenting
- refactor/community-coordinator: only re-comment to link a new PR or
report a concrete resolution, remove interim update instructions
- refactor/issue-fixer: check for ANY team's sign-off before posting
acknowledgment, not just own team
- discovery/issue-responder: skip if already commented unless linking
a concrete PR or fix
Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract helper functions to simplify complex control flow:
- try_oauth_flow: Extract _start_oauth_session_with_server helper to handle server startup phase, improving readability and testability
- _hetzner_resolve_server_type: Extract _hetzner_log_validation_error and _hetzner_log_type_change helpers to separate error handling logic from main flow
These changes reduce nesting levels and improve function cohesion while maintaining identical behavior.
Agent: complexity-hunter
Co-authored-by: spawn-refactor-bot <refactor@openrouter.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>