vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-05-15 09:59:46 +00:00

Author	SHA1	Message	Date
A	1725fa79d4	test: add cloud lib security convention regression tests (69 tests) (#816 ) Validates that all cloud provider lib/common.sh files follow security conventions from the security audit. Tests cover SSH key encoding (json_escape or python json.dumps), config file permissions, Python code injection prevention, API body JSON safety, heredoc injection prevention, shared/common.sh sourcing, and credential handling patterns. Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:23:20 -08:00
A	8446e785cf	test: add 88 tests for OAuth flow functions in shared/common.sh (#843 ) The OAuth flow is the primary authentication mechanism for spawn users, yet its component functions had zero test coverage. This adds tests for: - validate_oauth_port: port range validation (boundary values, injection) - _generate_csrf_state: CSRF token generation (entropy, uniqueness) - _generate_oauth_html: success/error HTML page generation - _generate_oauth_server_script: Node.js callback server (CSRF, ports) - _validate_oauth_server_args: prerequisite validation (port, state, runtime) - _init_oauth_session: temp directory and CSRF state file creation - cleanup_oauth_session: PID and directory cleanup - exchange_oauth_code: OAuth code-to-key exchange with json_escape security - check_openrouter_connectivity: network reachability fallback chain - Integration: session lifecycle and CSRF security properties Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:22:11 -08:00
A	5fd9d19775	feat: implement ramnode/opencode.sh (#812 ) Add OpenCode agent script for RamNode Cloud using OpenStack API. - Use ramnode cloud primitives for server provisioning - Install OpenCode via opencode_install_cmd helper - Inject OPENROUTER_API_KEY environment variable - Launch interactive OpenCode session via SSH Agent: gap-filler Co-authored-by: B (Discovery Team) <6723574+louisgv@users.noreply.github.com>	2026-02-13 01:21:57 -08:00
A	6182348641	fix: show credential status in dry-run and specify missing env vars on failure (#841 ) Two UX improvements: 1. `spawn <agent> <cloud> --dry-run` now shows a Credentials section that checks which env vars (OPENROUTER_API_KEY, cloud auth vars) are set vs missing, so users can verify readiness before a real run. 2. Script failure guidance (exit code 1 and default) now checks which specific env vars are unset instead of showing a generic "need X + Y" message, making it immediately clear what's missing. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:20:21 -08:00
A	b1a576a52a	test: add 51 tests for _classify_api_result and _report_api_failure (#834 ) These helpers were extracted from _cloud_api_retry_loop in PR #821 to reduce cyclomatic complexity but had zero test coverage. They are invoked on every cloud API call across all providers: - _classify_api_result: Classifies curl/HTTP results into retry reasons (network error, rate limit 429, service unavailable 503) or empty (success/non-retryable error). Tests cover all branches including curl exit codes 1/6/7/28, HTTP 429/503, success codes 200/201/204, non-retryable errors 400-502, and edge cases. - _report_api_failure: Generates user-facing error messages after retries are exhausted. Differentiates network vs HTTP errors, outputs API response body only for HTTP errors. Tests cover retry count display, response body handling, and special chars. Also includes integration tests verifying the classify-then-report pipeline and realistic cloud provider scenarios (Hetzner, DigitalOcean, DNS failures, auth errors, validation errors). Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-13 01:19:31 -08:00
A	087a14c276	test: add agent env injection contract tests (128 tests) (#838 ) Validates the critical contract that every implemented agent script correctly injects the environment variables from manifest.json. Catches silent breakage where an agent starts but cannot reach the LLM API due to missing OPENROUTER_API_KEY or provider-specific vars. Tests cover: - OPENROUTER_API_KEY presence in all scripts - Provider-specific env vars (ANTHROPIC_BASE_URL, OPENAI_BASE_URL, etc.) - OpenRouter API key acquisition patterns (env check, OAuth, manual) - Agent install and launch command references - Cloud lib env injection infrastructure - Base URL values pointing to openrouter.ai - No hardcoded API keys (security) - Full coverage statistics across all agents and clouds Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:19:14 -08:00
L	2d4827f9ed	fix: replace sleep 15/30 polling with sleep 5 yield pattern (#844) Long sleeps block message delivery: teammate messages arrive as user turns between the lead's responses, so sleeping for 30s means 30s of queued messages. Use sleep 5 to yield quickly, process messages immediately, and do useful work between polls. Updated all 3 service scripts: security.sh, refactor.sh, discovery.sh. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:18:07 -08:00
A	813089def7	test: add 67 tests for shared/github-auth.sh (zero prior coverage) (#832 ) Add comprehensive test coverage for the standalone GitHub auth helper (shared/github-auth.sh) merged in PR #824 with no tests. Coverage includes: - Source pattern and function availability (9 tests) - Fallback log functions when common.sh unavailable (3 tests) - ensure_gh_cli: detection, installation paths, error handling (7 tests) - _install_gh_binary: OS/arch detection, error paths, cleanup (11 tests) - ensure_gh_auth: token auth, interactive login, post-login checks (8 tests) - ensure_github_auth: combined wrapper success/failure (4 tests) - Direct execution mode and set -eo pipefail (2 tests) - Script conventions: bash 3.x compat, no echo -e, safe var access (10 tests) - Installation path coverage: macOS/Linux/APT/DNF/Homebrew (4 tests) - Error handling edge cases: curl failure, tar failure, auth failures (6 tests) - GITHUB_TOKEN security: piped via printf, not CLI arg (2 tests) - Shebang check (1 test) Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:17:57 -08:00
A	cb2a8614e9	refactor: reduce complexity in latitude and ovh cloud libs (#835 ) - latitude/lib/common.sh: Replace custom 38-line wait_for_server_ready polling loop with generic_wait_for_instance from shared/common.sh. Consolidate extract_latitude_server_ip (36 lines of inline Python) into a single readonly expression constant. Net -59 lines. - ovh/lib/common.sh: Replace shell variable interpolation in Python strings ('${var}') with sys.argv[] in _ovh_find_flavor_id, _ovh_get_ssh_key_id, _ovh_build_instance_body, and ovh_register_ssh_key. This eliminates injection surface and follows the established pattern used by other cloud providers. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-13 01:17:20 -08:00
L	ac3a8c58a5	docs: add systemd as service option alongside sprite-env in SKILL.md (#840 ) For non-Sprite environments (standard Linux VMs, sandboxes), systemd is the recommended way to run the trigger server. Added full unit file template, management commands, and troubleshooting entries. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:09:01 -08:00
L	ac5ef953f9	fix: enforce worktrees, clarify team separation of concerns (#839 ) * fix: add worktree requirement to security team prompts PR reviewers must check out PRs in sub-worktrees before running bash -n or bun test. Scan mode agents must also work inside a worktree. This prevents concurrent agents from conflicting in the main repo checkout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: enforce worktrees everywhere, refactor pr-maintainer role - SKILL.md: expand worktree convention to cover all agent work (PR review, testing, audits) not just branch creation - refactor.sh pr-maintainer: strip review/approve/merge responsibilities — that's the security team's job. pr-maintainer now focuses on rebasing conflicting PRs, addressing review comments, and fixing failing checks - Remove stale PR auto-merge from pre-cycle cleanup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove self-review from refactor team, clarify separation of concerns Refactor team focuses on research, deep-dives, and solving problems. Security team owns the entire PR review/approve/merge lifecycle. - Replace "No Self-Merge Rule" with "Separation of Concerns" section - Remove all self-review steps from issue and refactor mode workflows - Remove needs-team-review labeling from agent instructions - Simplify monitoring loop (no more review verification) - Simplify lifecycle checks (verify PRs exist, not reviewed) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: remove stale needs-team-review label from security triage reference The refactor team no longer applies this label, so remove it from the available labels documentation in triage mode. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 00:52:13 -08:00
L	c602443711	fix: remove hardcoded /home/sprite paths from service scripts (#836 ) The LOG_FILE paths in security.sh and refactor.sh were hardcoded to /home/sprite/spawn, causing permission errors on non-Sprite environments. Use $REPO_ROOT (already computed dynamically) instead. Also update SKILL.md examples to use dynamic path resolution. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 00:34:34 -08:00
A	9f76af00d2	fix: show credential status in quick-start sections (#823 ) The quick-start sections in `spawn <cloud>` and `spawn <agent>` now show whether required env vars are already set (green with "set" indicator) or still need to be configured (cyan "export" instruction). This helps users immediately see what credentials are missing before launching. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:59:57 -08:00
A	4d3c54a11e	refactor: extract helpers from execScript and _cloud_api_retry_loop (#821 ) Reduce cyclomatic complexity in the two highest-scoring functions: - cli/src/commands.ts: Extract `handleUserInterrupt` and `runWithRetries` from `execScript` (complexity score 6 -> 2 for execScript, retry logic now independently testable) - shared/common.sh: Extract `_classify_api_result` and `_report_api_failure` from `_cloud_api_retry_loop` (complexity score 9 -> 4, removes duplicated error-classification logic from loop body) Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:57:20 -08:00
A	e73d6b9793	fix: support --flag=value syntax in CLI argument parsing (#826 ) Previously, `spawn --prompt="Fix bugs" claude sprite` or `spawn list --agent=claude` would fail with "Unknown flag" because the CLI only recognized `--flag value` (space-separated) syntax. Now `--flag=value` is expanded to `--flag value` early in the arg parsing pipeline, supporting the common GNU-style convention. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-12 23:55:46 -08:00
A	716da5d43b	fix: auto re-exec command after CLI auto-update (fixes #780 ) (#830 ) When a CLI auto-update triggers mid-command (e.g. `spawn claude sprite`), the updated binary now automatically re-runs with the original arguments instead of asking the user to manually re-run. Sets SPAWN_NO_UPDATE_CHECK=1 on re-exec to prevent infinite update loops. Falls back to the old "run again" message when no arguments were provided (bare `spawn`). Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:54:49 -08:00
A	c7bbe8bc3b	refactor: extract generic _live_create_delete_cycle in test/record.sh (#818 ) The 5 per-cloud live recording functions (_live_hetzner, _live_digitalocean, _live_vultr, _live_linode, _live_civo) each duplicated 50-65 lines of identical create->save->extract-id->delete->save logic. Extract a generic _live_create_delete_cycle helper that handles the shared flow, with per-cloud body builder functions providing only the cloud-specific parts. Reduces test/record.sh by 112 lines (1016 -> 904) while preserving all behavior including cloud-specific delete delays and empty-response fallbacks. Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:52:51 -08:00
A	317d931e87	test: add 32 tests for extract_api_error_message in shared/common.sh (#820 ) This function parses JSON error responses from cloud provider APIs (used by Hetzner, DigitalOcean, Vultr, and Contabo) and had zero test coverage. Tests cover: field priority order, fallback behavior, realistic cloud provider responses, and edge cases (non-object JSON, null/empty fields). Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:52:27 -08:00
A	fbea9303f0	test: add 48 tests for SSH key lifecycle functions (#828 ) Cover ensure_ssh_key_with_provider (zero prior coverage), plus edge cases for generate_ssh_key_if_missing, get_ssh_fingerprint, extract_ssh_key_ids, and check_ssh_key_by_fingerprint. Tests validate the callback-based SSH key registration flow used by all cloud providers. Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:52:22 -08:00
A	a96b310861	refactor: reduce complexity in Hetzner _validate_server_type_for_location (#831 ) - Extract _hetzner_find_candidates helper to eliminate duplicated jq candidate-search logic (same-family and any-family searches were nearly identical 15-line blocks) - Consolidate 3 separate jq calls for wanted_cpu/cores/memory into a single jq invocation - Replace duplicated replacement-picking code with a loop over family strategies Function reduced from 106 to ~72 lines (plus 17-line reusable helper). Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:52:17 -08:00
A	5169350feb	fix: use buildRetryCommand in spawn list footer to avoid truncated prompts (#819 ) The "Rerun last" hint in `spawn list` was truncating prompts at 30 characters and appending "...", producing broken copy-paste commands. Now delegates to the existing buildRetryCommand helper which properly handles long prompts by suggesting --prompt-file instead of truncating. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:52:08 -08:00
A	3f28d5f29f	test: add 52 tests for SSH helpers and instance polling in shared/common.sh (#822 ) Cover critical infrastructure functions that had zero dedicated test coverage: - ssh_run_server, ssh_upload_file, ssh_interactive_session (SSH command construction) - ssh_verify_connectivity (ConnectTimeout, max_attempts, test command) - generic_ssh_wait (exponential backoff, success/failure, elapsed time logging) - wait_for_cloud_init (argument delegation, cloud-init file check) - generic_wait_for_instance (API polling, status matching, IP export, timeout) - extract_api_error_message (all 5 error field patterns + fallbacks) - SSH_USER default behavior (root fallback across all helpers) Uses mock SSH/SCP/sleep commands via PATH override to test argument construction and behavior without requiring network connectivity. Agent: test-engineer -- refactor/test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:51:46 -08:00
L	88954f0e12	feat: add standalone GitHub auth helper (shared/github-auth.sh) (#824) Standalone, sourceable script that installs the gh CLI and runs interactive gh auth login. Any agent script on any cloud can source it and call ensure_github_auth to get authenticated with GitHub. - ensure_gh_cli: installs via brew/apt/dnf/binary fallback - ensure_gh_auth: uses GITHUB_TOKEN or interactive OAuth flow - ensure_github_auth: combined convenience wrapper - Idempotent, macOS bash 3.x compatible, curl|bash compatible Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:37:02 -08:00
L	608104a76d	fix: set IS_SANDBOX=1 in all spawn environments (#829 ) All spawn environments are disposable cloud VMs. Setting IS_SANDBOX=1 helps agents like Claude Code recognize the environment as a sandbox, avoiding unnecessary safety prompts for root-level operations. Added in two places for full coverage: - generate_env_config(): included automatically in every env injection - get_cloud_init_userdata(): set in .bashrc/.zshrc during cloud-init Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:36:36 -08:00
L	6633873ccc	refactor: replace Python with jq in Hetzner lib, fix /lab → /labs URLs (#827 ) Hetzner lib: replace all Python JSON parsing with jq. Uses the /datacenters API as the authoritative source for server type availability (server_types.available), cross-referenced with /server_types for specs and pricing. jq is auto-installed if missing. URLs: update openrouter.ai/lab/spawn → openrouter.ai/labs/spawn across all READMEs and CLI source. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:14:11 -08:00
L	720753c644	fix: validate Hetzner server type availability at selected location (#825 ) When HETZNER_SERVER_TYPE and HETZNER_LOCATION are both pre-set (e.g. by automated scripts), there was no validation that the type is actually available at that location. ARM types like cax11 are only available in EU datacenters, causing "unsupported location for server type" errors when paired with US locations like ash. Now create_server validates the type against the Hetzner /datacenters API (server_types.available) before attempting creation. If incompatible, it auto-selects the cheapest compatible alternative (same CPU family, >= specs) and warns the user. Uses jq for JSON parsing (auto-installed if missing) and the Hetzner /datacenters + /server_types APIs for authoritative availability data. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 23:01:37 -08:00
L	56c4c020d5	feat: consolidate security review_all and scan into single 20-min cycle (#802 ) The two scheduled modes (review_all every 15 min, scan every 30 min) competed for MAX_CONCURRENT=1 on the trigger server, causing 429 drops and 30-55+ min gaps. Merge both into a single cycle that runs every 20 min, prioritizing PR review but also performing lightweight repo scanning when capacity allows (≤5 open PRs). Also prevents refactor agents from closing issues manually — issues now auto-close via `Fixes #N` in the PR body when merged. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 20:29:56 -08:00
L	8bcdb59c09	docs: add Contributing section to README (#788 ) * docs: add Contributing section to README Adds guidance for testing cloud providers, reporting issues, requesting new clouds/agents, and reporting auth problems. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use correct issue template URLs for cloud, agent, and CLI requests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 18:05:47 -08:00
L	ba6b0bd98f	fix: add missing sign-off footers to all agent gh comments and PR bodies (#785 ) Many gh commands in agent team prompts were missing the mandatory `-- team/agent-name` sign-off footer, causing dedup checks to fail and making it impossible to identify which agent posted a comment. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 17:59:02 -08:00
A	0fe83fe311	fix: improve CLI error messages for retry commands and unknown names (#777 ) - buildRetryCommand: suggest --prompt-file for long prompts instead of truncating into a non-functional command (threshold raised to 80 chars) - showUnknownCommandError: change "Unknown command" to "Unknown agent or cloud" since users are passing agent/cloud names, not commands - Bump CLI version to 0.2.66 Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-12 17:19:46 -08:00
A	cb1005ab31	refactor: extract helpers from run_script_test and run_shellcheck in test/run.sh (#776 ) Split run_script_test (61 lines -> 25 lines) into focused helpers: - _assert_sprite_common_commands: standard command lifecycle assertions - _assert_agent_specific: per-agent install assertions - _assert_no_temp_leaks: temp file cleanup check Split run_shellcheck (57 lines -> 12 lines) into: - _discover_shell_scripts: dynamic script discovery across cloud dirs - _run_shellcheck_on_scripts: per-script shellcheck execution and reporting Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-12 17:19:32 -08:00
L	43a1b1f815	fix: add comment dedup checks to all community-facing agent prompts (#778 ) * fix: add comment dedup checks to all community-facing agent prompts - security.sh triage: check for existing triage labels/comments before re-triaging (prevents duplicate "Security triage: SAFE" comments) - security.sh issue-checker: check for prior nudge comments before re-posting "Re-flagging for attention" on stale issues - discovery.sh Issue Responder: added DEDUP CHECK section (was missing entirely — could post duplicate acknowledgments/PR links) - Fixed stale Pending Review/Under Review/In Progress references to kebab-case in issue-checker prompt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: standardize team/agent sign-off on all comments for dedup Every agent comment now ends with `— team/agent-name` sign-off: - security/triage, security/pr-reviewer, security/issue-checker, security/scan, security/shell-auditor, etc. - refactor/community-coordinator, refactor/pr-maintainer, etc. - discovery/issue-responder, discovery/cloud-scout, etc. - qa/cycle Dedup checks now look for the sign-off string instead of author login, making it reliable across bot account changes. Each agent can identify its own prior comments by grepping for its sign-off. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use double-hyphen sign-off and document convention in SKILL.md Replace emdash (—) with double-hyphen (--) in all agent sign-offs to avoid encoding issues in shell strings. Format: `-- team/agent`. Added section 5 "Comment sign-off for dedup" to SKILL.md documenting the convention, format rules, and how agents use it for dedup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 17:18:15 -08:00
A	ff0ccfdbd0	refactor: reduce complexity in ramnode picker and cmdInteractive (#756 ) - Replace RamNode's custom _pick_flavor (37 lines) with shared interactive_pick helper (1 line), eliminating duplicated picker logic - Extract credential sorting from cmdInteractive into reusable prioritizeCloudsByCredentials helper for testability and clarity Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:47:43 -08:00
A	fde0ed16b6	refactor: extract shared extract_api_error_message helper to reduce inline Python duplication (#767 ) Replace 10 inline `python3 -c "import json,sys; d=json.loads(...)..."` one-liners across vultr, hetzner, digitalocean, and contabo with calls to a new shared `extract_api_error_message` helper in shared/common.sh. The helper tries common JSON error field patterns (message, error, error.message, error.error_message, reason) and falls back to a caller-specified default. This pattern appears 35+ times across cloud libs; this PR converts the first 4 clouds as a proof of concept. Remaining clouds can adopt incrementally. Net reduction: 10 lines per converted cloud (~3 lines saved per call site). Agent: complexity-hunter Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:47:20 -08:00
A	4b0d25ca39	fix: prevent Python code injection via unescaped variables in inline Python (#771 ) Use sys.argv to pass shell values to inline Python instead of direct string interpolation, preventing single-quote injection attacks across cloud lib common.sh files and test/record.sh. Also fix eval injection in test/record.sh try_load_config() by replacing eval of Python-generated export statements with safe tab-separated parsing and direct variable assignment. Fixes #759 Fixes #760 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:47:13 -08:00
A	ee71d1980a	fix: use log_step (cyan) for in-progress messages in shared libs and installer (#772) Replace log_info (green, implies success/completion) with log_step (cyan, implies in-progress) for messages that describe actions currently happening: - shared/common.sh: OAuth flow steps, agent execution, server setup - cli/install.sh: add log_step function, use for download/install steps - Cloud libs: interactive session starts, image finding, auth initiation - Agent scripts: gateway startup, session opening This makes the color semantics consistent: green (log_info) = success, completion, informational facts; cyan (log_step) = in-progress actions, status updates; yellow (log_warn) = warnings; red (log_error) = errors. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:46:21 -08:00
A	f4b3d99cff	test: add 73 tests for logging, temp-file, cloud-init, and SSH key helpers (#765 ) Add comprehensive test coverage for previously untested utility functions in shared/common.sh that are used pervasively across all cloud providers: - log_step: cyan progress messages (added PR #757) - _log_diagnostic: structured error output (header + causes + numbered fixes) - check_python_available: Python 3 dependency detection with install hints - find_node_runtime: bun/node runtime discovery - track_temp_file + cleanup_temp_files: secure credential temp file cleanup - register_cleanup_trap: EXIT/INT/TERM signal handlers - get_cloud_init_userdata: cloud-init YAML generation for provisioning - calculate_retry_backoff: jittered exponential backoff - generate_ssh_key_if_missing: ed25519 key generation with directory creation - get_ssh_fingerprint: MD5 fingerprint extraction - opencode_install_cmd: opencode install script content - POLL_INTERVAL / SSH_OPTS: configurable constants and defaults - All 4 log functions: stderr-only output verification Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:46:13 -08:00
A	a290815108	test: add 111 tests for trigger-server security and validation logic (#774 ) Add comprehensive test coverage for the trigger-server HTTP service (.claude/skills/setup-agent-team/trigger-server.ts), which had zero test coverage despite recent security-critical changes (PRs #745, #747). Tests cover: - Timing-safe Bearer token auth (17 tests including injection attempts) - VALID_REASONS allowlist enforcement (13 tests including injection) - Issue parameter validation regex (17 tests including shell injection) - Issue dedup logic (8 tests) - Capacity checking (6 tests) - reapAndEnforce process cleanup (9 tests including boundary cases) - Health response structure (4 tests) - Streaming response metadata (4 tests) - Environment variable parsing (5 tests) - Route matching logic (10 tests) - Full validation flow with priority ordering (8 tests) Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-12 16:46:10 -08:00
A	cdf6f1dba5	fix: use log_step (cyan) for in-progress messages instead of log_info (green) (#768 ) In-progress actions (installing, starting, connecting...) should use log_step (cyan) to visually distinguish them from completion messages which use log_info (green). This makes it easier for users to see at a glance what is happening vs what has finished. Changes: - cli/install.sh: add log_step function, use it for install progress - shared/common.sh: OAuth flow and non-interactive exec messages - Cloud libs: interactive_session, auth, and cleanup messages - Agent scripts: gateway startup and session opening messages Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:45:58 -08:00
A	5a1037d92c	fix: replace ((var++)) with var=$((var + 1)) for macOS bash 3.x compat (#769 ) ((var++)) returns exit code 1 when the variable is 0 (falsy), which causes set -e to terminate the script. Replace all instances with the safe var=$((var + 1)) pattern in sprite/lib/common.sh and test/run.sh. Fixes #762 Agent: community-coordinator Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:45:51 -08:00
A	13a3a187d7	chore: remove junk file and GPU cloud stubs (#770 ) - Remove `1n` (accidental zsh error output from automated process) - Remove `runpod/` and `vastai/` directories — GPU clouds are explicitly prohibited per CLAUDE.md and these were orphaned stubs with no lib/common.sh, no README, and no manifest entries Fixes #764 Agent: community-coordinator Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:45:46 -08:00
L	f7c6e07867	feat: security triage applies full label taxonomy (#766) * feat: security triage now applies full label taxonomy Triage mode now applies: - Safety label (safe-to-work / malicious / needs-human-review) - Content-type label (bug, enhancement, security, question, etc.) - Lifecycle label (Pending Review) so downstream teams can pick up Team-building mode now transitions lifecycle labels: - Adds "In Progress" at start, removes it on close Added an "Available Labels Reference" section to the triage prompt documenting all label categories for the agent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: all security-filed issues get safe-to-work + Pending Review Issues filed by the security team (scan findings, drift/anomaly reports, follow-up issues from closed PRs) now automatically get `safe-to-work` and `Pending Review` labels so downstream teams can immediately pick them up without waiting for another triage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove Pending Review from safe-to-work issues safe-to-work already means triage is complete, so adding Pending Review is redundant and confusing. Now only UNCLEAR issues get Pending Review (they still need human attention). SAFE issues and security-filed issues skip straight to actionable with just safe-to-work. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: normalize all labels to kebab-case Renamed on GitHub: - "In Progress" → "in-progress" - "Pending Review" → "pending-review" - "Under Review" → "under-review" - "good first issue" → "good-first-issue" - "help wanted" → "help-wanted" Updated all references in security.sh and refactor.sh to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: align issue templates and workflows with actual repo labels Created missing labels: cloud-request, agent-request, cli. Replaced nonexistent needs-triage with pending-review in all templates. Templates updated: - bug_report: bug + pending-review - cli_feature_request: cli + enhancement + pending-review - cloud_request: cloud-request + enhancement + pending-review - agent_request: agent-request + enhancement + pending-review Workflows updated: - refactor.yml: trigger on safe-to-work AND (bug|cli|enhancement|maintenance) - discovery.yml: already correct (safe-to-work AND cloud-request|agent-request) - security.yml: already correct (team-building label check) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 16:20:07 -08:00
A	fa5fe26d31	test: add 57 tests for credential-based cloud prioritization (PR #752) (#758) Tests cover parseAuthEnvVars, hasCloudCredentials, cloud sorting by detected credentials, mapToSelectOptions with hintOverrides, getAuthHint, getImplementedClouds, and the full interactive picker prioritization flow. Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-12 15:49:29 -08:00
A	68c53c4d3f	feat: implement netcup/kilocode integration (#751) * feat: implement netcup/kilocode integration Add Kilo Code support on Netcup VPS provider. This fills a missing matrix entry by combining Netcup cloud primitives with Kilo Code agent setup. Changes: - netcup/kilocode.sh: New script using Netcup API provisioning + Kilo Code CLI - netcup/README.md: Added Kilo Code to available agents list - manifest.json: Updated netcup/kilocode from "missing" to "implemented" Agent: gap-filler-1 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: address review findings in netcup/kilocode.sh - Add missing upload_file/run_server args to inject_env_vars_ssh (HIGH - prevented credential leak where API key was passed as upload_func) - Add wait_for_cloud_init call after verify_server_connectivity (MEDIUM) - Add shellcheck source directive and SC2154 disable - Add server info to completion message Agent: pr-maintainer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-12 15:49:19 -08:00
A	4e33cc39cd	fix: address medium security findings from #753 (#755) - Replace `echo -e` with `printf` in cli/install.sh for macOS bash 3.x compat - Remove `-u` (nounset) from test/run.sh — use `${VAR:-}` pattern instead - Replace `source <(curl ...)` with `eval "$(curl ...)"` in test/run.sh for curl|bash compat - Add .gitignore patterns for sensitive files (.env, .pem, .key, credentials) Refs #753 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:48:52 -08:00
A	4bd5f2205f	test: add 71 tests for cloud API helper functions in shared/common.sh (#754) Cover _parse_api_response, _update_retry_interval, _api_should_retry_on_error, calculate_retry_backoff, _cloud_api_retry_loop, generic_cloud_api, generic_cloud_api_custom_auth, _make_api_request, _make_api_request_custom_auth, and _curl_api -- all recently refactored with zero prior test coverage. Agent: test-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:48:46 -08:00
A	cf53ea1fb2	fix: use log_step (cyan) for in-progress messages instead of log_info (green) (#757) Consistently use log_step for progress/status messages ("Waiting for...", "Fetching...", "Creating...") and reserve log_info for success/completion messages. This gives users a clear visual distinction between operations that are still running (cyan) vs operations that have completed (green). Also adds periodic progress updates to silent polling loops in ramnode, cherry, and netcup IP wait functions so users see activity during long waits. Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:48:38 -08:00
L	32a3e6e276	feat: add PR labeling + stale issue re-triage to security review_all (#748) - Created missing repo labels: malicious, needs-human-review, maintenance, team-building - PR reviewers now label PRs after review: - security-review-required for CRITICAL/HIGH findings - security-approved for clean/MEDIUM/LOW PRs - Added issue-checker agent to review_all mode that monitors stale issues (no activity >1 hour) and re-triggers review: - Re-flags safe-to-work issues with no activity - Re-notifies needs-human-review issues via Slack - Ensures all open issues have a status label - Adds Pending Review to un-triaged issues Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:39:44 -08:00
A	e36e087029	feat: prioritize clouds with detected credentials in interactive picker (#752) When running `spawn` interactively, clouds where the user already has auth env vars set (e.g. HCLOUD_TOKEN, DO_API_TOKEN) now appear first in the cloud selection list with a "credentials detected" hint. This reduces friction by surfacing the most likely-to-succeed options. Fixes #685 Agent: ux-engineer Co-authored-by: A <6723574+louisgv@users.noreply.github.com>	2026-02-12 15:33:14 -08:00
A	eae6219f27	fix: use timing-safe comparison and validate reason param in trigger-server (#747) Replace direct string comparison (!==) with crypto.timingSafeEqual() for Bearer token authentication, preventing timing side-channel attacks on TRIGGER_SECRET. Pattern matches existing key-server.ts implementation. Also validate the `reason` query parameter against an allowlist of known values to prevent injection of arbitrary strings into the SPAWN_REASON env var passed to spawned scripts. Fixes #745 Agent: security-auditor Co-authored-by: A <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:32:44 -08:00
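
The two checks described in eae6219f27 (a timing-safe Bearer token comparison and an allowlist for the `reason` parameter) could look roughly like the sketch below. The function names, the allowlist values, and the hash-before-compare step are assumptions made for illustration; the actual trigger-server code is not shown in this log and may differ.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical allowlist: the real set of accepted reasons is not listed in the commit message.
const ALLOWED_REASONS: ReadonlySet<string> = new Set(["schedule", "manual"]);

// Hypothetical helper: returns true only when the Authorization header carries the expected secret.
function isAuthorized(authHeader: string | undefined, secret: string): boolean {
  if (!authHeader || !authHeader.startsWith("Bearer ")) return false;
  const token = authHeader.slice("Bearer ".length);
  // crypto.timingSafeEqual throws when its inputs differ in length, so both
  // values are hashed to fixed-size digests first (an assumption of this sketch).
  const given = createHash("sha256").update(token).digest();
  const expected = createHash("sha256").update(secret).digest();
  return timingSafeEqual(given, expected);
}

// Hypothetical helper: passes a reason through only if it is on the allowlist.
function sanitizeReason(reason: string | null): string | null {
  return reason !== null && ALLOWED_REASONS.has(reason) ? reason : null;
}
```

A caller would reject the request when `isAuthorized` returns false and set SPAWN_REASON only from a non-null `sanitizeReason` result, so arbitrary query strings never reach the spawned script's environment.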

1598 commits