spawn

vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-05-20 01:11:18 +00:00

Author	SHA1	Message	Date
A	ecc876f3bc	fix: remove dead shellQuote re-export from gcp/gcp.ts (#2551 ) Dead backwards-compat re-export left over from the shellQuote consolidation (PRs #2533, #2535, #2546). Zero consumers import shellQuote from gcp/gcp.ts — all correctly import from shared/ui.ts. Per CLAUDE.md: avoid backwards-compatibility hacks; delete unused code. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 21:42:09 -04:00
A	9bb39a213a	test: remove theatrical tests from manifest-integrity (#2552 ) Remove 2 tests from the manifest-integrity.test.ts "structure" describe block that can never fail: - "should parse as valid JSON": manifest.json is already parsed via JSON.parse() at module scope (line 23). If parsing fails, the module throws and ALL tests fail — this individual test can never provide an independent failure signal. - "should have agents, clouds, and matrix top-level keys": after parsing, Object.keys(manifest.agents/clouds) and Object.entries(manifest.matrix) are called at module scope (lines 25-27). If those properties were missing, the module load itself would throw. This test is also guaranteed to pass whenever any test in the file runs. Removing these 2 theatrical tests leaves 1403 tests (down from 1405). All remaining tests provide real signal. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 21:41:03 -04:00
Ahmed Abushagur	f683dd857b	feat: add --config and --steps CLI flags for programmatic setup (#2545 ) * feat: add Telegram and WhatsApp options to OpenClaw setup picker Adds separate "Telegram" and "WhatsApp" checkboxes to the OpenClaw setup screen: - Telegram: prompts for bot token from @BotFather, injects into OpenClaw config via `openclaw config set` - WhatsApp: reminds user to scan QR code via the web dashboard after launch (no CLI setup possible) Updates USER.md with channel-specific guidance when either is selected. Bump CLI version to 0.16.16. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: run WhatsApp QR scan interactively before TUI launch Instead of punting WhatsApp setup to "after launch", runs `openclaw channels login --channel whatsapp` as an interactive SSH session between gateway start and TUI launch. The user scans the QR code with their phone during provisioning setup. Flow: gateway starts → tunnel set up → WhatsApp QR scan → TUI launch Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update WhatsApp hint to reflect pre-TUI QR scanning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add --config and --steps CLI flags for programmatic setup Add --config <path> flag to load spawn options from a JSON config file (model, steps, name, setup data like telegram_bot_token). Add --steps <list> flag for comma-separated setup step control. Both enable the web UI and headless automation to control which setup steps run. Priority order: CLI flags > --config file > env vars > defaults. 
- New spawn-config.ts module with valibot validation - OptionalStep extended with dataEnvVar and interactive metadata - validateStepNames() for step name validation with warnings - Telegram setup reads TELEGRAM_BOT_TOKEN env var before prompting - WhatsApp auto-skipped in headless mode with warning - promptSetupOptions() skipped when SPAWN_ENABLED_STEPS already set - E2E verify helpers for github, browser, telegram setup artifacts - QA reference file documenting all agent setup options - Version bump to 0.17.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add --model flag and priority order tests - Add --model <id> CLI flag that sets MODEL_ID env var - --model is extracted before --config so it takes priority - Add config-priority.test.ts with 8 tests verifying: - --model overrides config model - --steps overrides config steps - --steps "" disables all steps - --name overrides config name - Config tokens apply as defaults - Explicit env vars override config tokens - Remove preferences.json from priority order docs (not needed) - Add --model to help text and unknown-flag guidance Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add --model, --config, --steps to README Document config file format, setup steps table, and new CLI flags in the commands table. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address security review feedback - Move null byte check before path resolution (defense-in-depth) - Move agent-setup-options.md from .claude/rules/ to .docs/ (git-ignored) per documentation policy Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve rebase conflicts and deduplicate --model flag extraction Rebase on main introduced a duplicate --model flag extraction block (one from the PR at line 804, one from main at line 941). Consolidated into the single early extraction point with -m shorthand support. Also removed duplicate --model entry from KNOWN_FLAGS set. 
Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-13 00:32:58 +00:00
A	ff8bff4c02	chore: standardize featured_cloud to digitalocean + sprite for all agents (#2548 ) Set every agent's featured_cloud to ["digitalocean", "sprite"] — one primary recommendation (DigitalOcean) and one fallback (Sprite). Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-12 19:47:08 -04:00
A	6081c0a17f	feat(qa): telegram soak test on digitalocean + fix bun -e (#2547 ) - soak.sh: SOAK_CLOUD env var makes cloud configurable (default: sprite) - qa.sh: load TELEGRAM_BOT_TOKEN, TELEGRAM_TEST_CHAT_ID, SOAK_CLOUD from /etc/spawn-qa-auth.env in soak mode - qa.yml: add weekly Monday 3am UTC scheduled soak trigger - fix: bun eval → bun -e across soak.sh, key-request.sh, github-auth.sh (bun eval is not a valid subcommand in bun 1.3.9) - fix: export _TOKEN via env prefix so process.env._TOKEN works in bun -e - docs: update shell-scripts.md rule to say bun -e (not bun eval) Verified: 3/4 Telegram tests pass in smoke test on DigitalOcean (120s wait) getMe ✓ sendMessage ✓ getWebhookInfo ✓; cron test needs full 55-min window. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 19:45:18 -04:00
A	2b83a8106d	security: use shellQuote() in agent-setup.ts for consistent null-byte defense (#2546 ) Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 19:44:50 -04:00
Ahmed Abushagur	e640d1bfe5	fix: update Codex default model to gpt-5.3-codex and add agent model reference (#2540 ) The previous PR (#2536) set the Codex default to gpt-5.1-codex, but the latest available on OpenRouter is gpt-5.3-codex. Also adds a rules file documenting each agent's default model to prevent future regressions. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-12 15:49:19 -07:00
Ahmed Abushagur	d2d71b17ef	feat: add --model flag and preferences file for LLM model override (#2543 ) Adds --model / -m CLI flag to override the agent's default LLM model: spawn codex gcp --model openai/gpt-5.3-codex Also supports persistent per-agent model preferences via config file at ~/.config/spawn/preferences.json: { "models": { "codex": "openai/gpt-5.3-codex" } } Priority: --model flag > preferences file > agent default. This enables a future web UI to pass model selection via CLI args when invoking spawn programmatically to provision machines. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-12 18:47:09 -04:00
A	0d66125fd6	fix: add junie to tarball build pipeline (#2541 ) Junie was added as a fully implemented agent (manifest, agent scripts, agent-setup.ts) but the packer/tarball pipeline was never updated. This meant the nightly agent-tarballs workflow could not build a pre-built tarball for Junie, forcing all deployments to do a live npm install. - Add junie entry to packer/agents.json (tier: node, @jetbrains/junie-cli) - Add junie to capture-agent.sh allowlist and path-capture case (npm-based, same as codex/kilocode — captures /root/.npm-global/) Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 18:45:03 -04:00
A	0963f708b4	test: remove duplicate and theatrical tests (#2539) Remove redundant existsSync check inside icon-integrity "is actual PNG data" tests — the file existence is already verified in the preceding test, and isPng() will throw if the file is missing. Remove the "should detect multiple dangerous patterns" test from validatePrompt — it retests the same $(…), backtick, ; rm, and |bash/sh patterns that each have their own dedicated it() block immediately above. Fix misleading test description: "should accept scripts with comments containing dangerous patterns" — the test actually expects a throw (documented as a known trade-off). Rename to "should reject…". Removes 1 test (1381 → 1380) and 18 expect() calls. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 16:48:33 -04:00
A	f6f36cc452	security: add DO_CLIENT_SECRET env var override (#2538 ) * security: add DO_CLIENT_SECRET env var override Allows users/organizations to supply their own DigitalOcean OAuth client secret via DO_CLIENT_SECRET env var rather than relying on the bundled default. The bundled secret remains as fallback. Fixes #2537 Agent: security-auditor Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * chore: bump CLI version to 0.16.19 Agent: security-auditor Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 15:48:36 -04:00
A	91b66f4b40	fix(e2e): fix input test prompt delivery and agent flags (#2536) Three root-cause bugs in input test functions: 1. Stdin pass-through broken: cloud_exec uses "printf '...' | base64 -d | bash" on the remote, meaning bash reads the script from its own stdin — not the outer process's stdin. "PROMPT=$(base64 -d)" inside the script was reading from the already-consumed pipe, always producing an empty prompt. Fix: embed the base64-encoded prompt directly in the remote command string. Base64 output is [A-Za-z0-9+/=] only — safe to embed in single-quoted strings. 2. Zeroclaw flag wrong: "zeroclaw agent -p" was passing the prompt as --provider (not --prompt). The correct flag for non-interactive single-message mode is "-m"/"--message". 3. Codex model stale: "openai/gpt-5-codex" does not exist on OpenRouter. Updated to "openai/gpt-5.1-codex" which is available. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 13:50:06 -04:00
A	dfd08ad48c	security: consolidate shellQuote across all clouds (defense-in-depth) (#2535 ) PR #2533 hardened GCP with shellQuote() and null-byte rejection, but left Hetzner, DigitalOcean, AWS, and connect.ts using inline .replace(/'/g, "'\\''") without null-byte validation. - Move shellQuote to shared/ui.ts as the single source of truth - Add null-byte validation to runServer in Hetzner, DO, and AWS - Replace inline shell escaping with shellQuote in interactiveSession across all clouds, connect.ts, and agents.ts buildEnvBlock - Re-export shellQuote from gcp.ts for backwards compatibility Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 12:54:31 -04:00
A	58a2d3bf18	test: Remove duplicate and theatrical tests (#2534 ) Consolidate 9 per-credential-type it() blocks in prompt-file-security.test.ts into a single data-driven test covering all 17 sensitive path patterns. Merge 2 validatePromptFileStats "accept" tests into one. Consolidate 4 unicode/encoding-attack it() blocks in security.test.ts into a single data-driven test. Merge 3 "accept identifier" it() blocks into one. Removes 19 redundant tests (1400 → 1381) with no loss of coverage. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 12:52:45 -04:00
A	868ebbe4fe	security: harden shellQuote and consolidate shell escaping in gcp.ts (#2533 ) - Add null-byte rejection to shellQuote (defense-in-depth) - Export shellQuote for testability - Refactor interactiveSession to use shellQuote instead of inline escaping - Add comprehensive test suite for shellQuote security properties Fixes #2529 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 10:27:48 -04:00
A	595e36ffb6	test: Remove duplicate and theatrical tests (#2531 ) Consolidate 8 fragmented pipe-to-bash/sh tests in validatePrompt into 2 data-driven tests covering all inputs (with/without whitespace, complex pipelines, and standalone word acceptance). Merge 3 backtick tests into 1. Merge 2 whitespace tests into 1. Removes 19 lines of duplicate test setup. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-12 09:35:21 -04:00
A	6bdef06351	refactor: deduplicate generateCsrfState into shared/oauth.ts (#2530 ) The identical generateCsrfState() helper existed in both digitalocean/digitalocean.ts and shared/oauth.ts. Export it from oauth.ts (which digitalocean.ts already imports) and remove the duplicate copy. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-12 09:33:53 -04:00
A	6fda75ccc8	security: validate base64 output in cloud_exec and soak.sh (defense-in-depth) (#2532 ) Add base64 character validation ([A-Za-z0-9+/=]) before use in SSH command strings for gcp.sh, aws.sh, and hetzner.sh cloud_exec functions -- matching the existing fix in digitalocean.sh (#2528). Also add a validated _encode_b64 helper to soak.sh and use it for all Telegram bot token encoding, preventing corrupted base64 from breaking out of single-quoted SSH command strings. Closes #2527 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 09:32:48 -04:00
A	76399eafd9	security: validate base64 in digitalocean.sh SSH exec (defense-in-depth) (#2528 ) Add explicit base64 character validation in _digitalocean_exec after encoding the command, matching the existing pattern in provision.sh. This ensures the encoded value contains only [A-Za-z0-9+/=] before embedding it in the SSH command string. Note: #2527 (provision.sh base64 validation) was already fixed in a prior commit — the validation at lines 284-289 already rejects non-base64 characters and empty output. Fixes #2526 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 08:16:48 -04:00
A	afff57db5b	test: remove conditional-expect anti-patterns from 3 test files (#2525 ) Replace `if (!r.ok) { expect(...) }` and `if (result.ok) { return }` guards with unconditional assertions using toThrow() or toMatchObject(). These conditional blocks silently skipped assertions when the condition evaluated the wrong way, providing false confidence. Also remove now-unused tryCatch imports from prompt-file-security.test.ts and security.test.ts. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 02:21:20 -07:00
A	7278638a31	security: validate localPath in uploadFile() and harden runServer() in gcp.ts (#2524 ) Fixes #2521 - Add path traversal and argument injection protection for localPath Fixes #2522 - Add validation for cmd parameter before SSH execution Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 04:50:56 -04:00
Ahmed Abushagur	5b5e7d4706	test: add cron-triggered Telegram reminder to soak test (#2519 ) * test: add cron-triggered Telegram reminder to soak test Tests OpenClaw's ability to stay alive and execute scheduled tasks. Installs a one-shot cron on the VM before the 1h soak wait that sends a Telegram message at ~55 min, then verifies the message was sent after the wait completes. Also moves Telegram config injection before the soak wait so the cron can use the bot token immediately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: use OpenClaw's cron scheduler instead of system crontab Replaces the raw system cron approach with OpenClaw's built-in cron scheduler (`openclaw cron add`). This properly tests that OpenClaw's gateway stays alive after 1 hour and can execute scheduled tasks. The test now: 1. Injects Telegram config + schedules an OpenClaw cron job (--at +55min) 2. Waits 1 hour (soak) 3. Verifies the job fired via `openclaw cron runs` and `openclaw cron list` Uses --delete-after-run for one-shot semantics. Verification checks both the run history and the auto-deletion as proof of execution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: verify cron message on Telegram side via forwardMessage Instead of trusting OpenClaw's self-reported cron status, we now verify the message actually exists in the Telegram chat: 1. Extract message_id from OpenClaw's cron execution logs (tries `openclaw cron runs`, then ~/.openclaw/cron/ directory) 2. Call Telegram's forwardMessage API with that message_id 3. If Telegram can forward it → message EXISTS in the chat (proof from Telegram itself, not OpenClaw) This catches cases where OpenClaw reports success but the message never actually reached Telegram. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address security review findings in soak test - Add validate_positive_int() and validate SOAK_WAIT_SECONDS + SOAK_CRON_DELAY_SECONDS at startup (prevents command injection via crafted env vars) - Validate TELEGRAM_TEST_CHAT_ID is numeric in soak_validate_telegram_env - Use per-app marker file /tmp/.spawn-cron-scheduled-${app} to avoid race conditions when multiple soak tests run on the same VM Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 04:49:42 -04:00
A	c5a40b04a6	fix(e2e): add retry-with-backoff for DigitalOcean 422 droplet limit errors (#2520 ) When provisioning hits a 422 "droplet limit exceeded" response, wait 30s and retry up to 3 times. Makes E2E suite resilient to transient limit hits during parallel batch provisioning. Fixes #2516 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-12 07:49:47 +00:00
A	85a2289bb0	fix(e2e): dynamically calculate DigitalOcean parallel capacity from account limit (#2518 ) Previously, _digitalocean_max_parallel() always returned 3, assuming all quota slots were available. When pre-existing droplets occupy slots, the batch-3 parallel runs fail with "droplet limit exceeded" API errors. Now queries /v2/account for the actual droplet_limit and subtracts the current droplet count to compute available capacity. Falls back to 3 if the API is unreachable. -- qa/e2e-tester Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-12 02:50:48 -04:00
Ahmed Abushagur	553cbad7bf	fix: revert OpenClaw default model to openrouter/auto (#2509 ) OpenClaw requires the openrouter/ provider prefix for model IDs. The previous default (moonshotai/kimi-k2.5) was missing the prefix, causing "Unknown model" warnings. Reverted to openrouter/openrouter/auto which uses OpenRouter's auto-router to pick the best model per prompt. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-12 01:06:50 -04:00
A	129f72d8e1	test: remove conditional-expect anti-pattern in result-helpers.test.ts (#2514 ) Replace `if (result.ok) { expect(result.data)... }` guards with `expect(result).toMatchObject({ ok: true, data: ... })`. The old pattern silently skips inner expects when the condition is false — `toMatchObject` asserts both discriminant and value in a single unconditional call. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 01:04:57 -04:00
A	d4328f38c2	fix: correct DigitalOcean default droplet size in README and stale getUserHome path (#2513 ) DO_DROPLET_SIZE default documented as s-2vcpu-4gb ($24/mo) but code and manifest both use s-2vcpu-2gb ($18/mo). Also fixes stale getUserHome() source reference in testing rules (shared/paths.ts, not shared/ui.ts). Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-11 21:25:30 -07:00
Ahmed Abushagur	b548c5b75a	fix: only pre-select Chrome browser in setup picker (#2512 ) #2507 pre-selected all setup options. Only browser should default to enabled — GitHub CLI and reuse-saved-key are opt-in. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 23:05:31 -04:00
A	a6e3d3304d	test: remove theatrical getTerminalWidth tests that can never fail (#2510) The two getTerminalWidth tests only checked that the function returns a number >= 80. Since the implementation is `process.stdout.columns || 80`, both assertions are trivially satisfied in any environment and provide zero regression signal. Removed them along with the unused import. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-11 21:42:23 -04:00
A	6ef7dfc99d	fix(e2e): add claude and codex to .spawnrc fallback in provision.sh (#2511 ) When Sprite (or another cloud) times out during provisioning, provision.sh falls back to constructing .spawnrc manually over SSH. The claude and codex agents were missing from the agent-specific case block, so: - claude: ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN were never written, causing verify_claude's openrouter.ai check to fail - codex: OPENAI_API_KEY and OPENAI_BASE_URL were never written Discovered during E2E run: sprite/claude failed with .spawnrc timeout + missing openrouter.ai in fallback .spawnrc. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 21:40:03 -04:00
A	6c535ac1e8	fix: replace stale bun -e with bun eval in key-request.sh (#2506 ) PR #2505 migrated all bun -e → bun eval across shell scripts but missed 2 instances in sh/shared/key-request.sh (lines 32 and 61). This completes the migration for consistency. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-11 17:22:56 -07:00
Ahmed Abushagur	aa6e7dd1fc	fix: default all setup options to enabled in picker (#2507 ) The multiselect picker for setup options (Chrome browser, GitHub CLI, etc.) started with nothing selected. Now all available options are pre-selected so users get the full setup by default. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 19:43:03 -04:00
A	529a1b3b02	fix: bump quality cycle timeout to 90 min and recognize gcp cli auth (#2501 ) * fix: bump quality cycle timeout to 90 min and recognize gcp cli auth - Quality cycle was hitting the 45 min hard limit mid-run; bumped CYCLE_TIMEOUT from 2400s (40 min) to 5400s (90 min) so E2E tests (provision + install + verify across multiple clouds) have room to complete without getting killed - Updated qa-quality-prompt time budget from 35 min to 85 min to match - Added _check_cli_auth_clouds() to key-request.sh: for clouds that use CLI auth (gcp via gcloud), check if the CLI has an active account instead of reporting them as missing and sending key-request emails - GCP_PROJECT is loaded from ~/.config/spawn/gcp.json when gcloud is authenticated; other CLI-auth clouds (sprite) are excluded from the count since they are not auto-checkable * fix: replace local -n namerefs with eval for bash 3.2 compatibility local -n (namerefs) requires bash 4.3+ and breaks on macOS which ships bash 3.2. Replace with eval-based variable indirection that works on all supported bash versions. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: validate GCP_PROJECT format before export to prevent shell injection Security: project ID from config now validated against ^[a-z][a-z0-9-]*$ pattern before export. Invalid IDs are rejected with a log message. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-11 17:48:21 -04:00
A	1d2bf324c4	refactor: replace bun -e with bun eval and require() with ESM imports in shell scripts (#2505 ) Per shell-scripts.md rules: always use `bun eval` (not `bun -e`) and ESM-only (never `require()`). Fixed in: - sh/shared/key-request.sh: 3 instances of `bun -e` → `bun eval` - sh/e2e/lib/soak.sh: `bun -e` → `bun eval`; `require("fs")`/`require("path")` → named ESM imports from node:fs and node:path Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-11 14:03:46 -07:00
A	65a2efd5ba	fix: gcp use root SSH user instead of whoami (#2503 ) The `resolveUsername()` function called `whoami` and validated against a regex that rejected dots in usernames (e.g. `adrian.hale`), causing "Invalid username" errors. All other clouds use a static SSH user (root for Hetzner/DO, ubuntu for AWS). Switch GCP to use `root` consistently: - Replace dynamic `whoami` lookup with static `GCP_SSH_USER = "root"` - Simplify cloud-init startup script (already runs as root) - Fix bun symlink path to use /root instead of /home/${username} - Remove unused `username` field from GcpState Closes #2502 Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-11 13:48:49 -07:00
A	9859cc6a31	test: remove theatrical always-pass test from fs-sandbox (#2504 ) The "real home ~/.spawn/history.json should not be modified" test was a false signal: if the file doesn't exist it does `expect(true).toBe(true)`, and if it does exist it only checks `stat.isFile()` while admitting in comments that it "can't detect retroactively" whether the file was modified. This test could never catch the regression it claimed to guard against. Remove it and drop the unused `statSync` import. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 16:47:52 -04:00
A	150d094ef2	fix: fallback to manual project entry when gcloud projects list fails (#2500 ) * fix: fallback to manual project entry when gcloud projects list fails When the user declines the suggested default GCP project and `gcloud projects list` fails (e.g. lacking resourcemanager.projects.list permission), prompt for a manual project ID instead of hard-failing. Also fix selectFromList() to return "" on cancel (Ctrl+C/Escape) rather than defaultValue, so canceling a project picker is treated as "no selection" rather than silently re-using the first project. Fixes #2499 Agent: issue-fixer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: add GCP project ID format validation for manual entry Validates user-entered GCP project IDs against the required format (^[a-z][a-z0-9-]{4,28}[a-z0-9]$) before accepting them. Invalid entries are rejected with a helpful message and the user is re-prompted. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-11 15:47:53 -04:00
A	25f46d4742	test: remove duplicate per-entity micro-tests in manifest-type-contracts (#2498 ) Replace nested describe-per-agent/cloud loops with data-driven it() blocks that loop over all entities internally. Reduces test count by 192 (235→43) while preserving all 659 expect() calls and identical coverage. Failures now include the entity key in the assertion message for debuggability. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 10:34:32 -07:00
A	794fd1f950	Update Junie icon to official JetBrains logo (#2497 ) Replace the GitHub avatar with the official Junie icon SVG (converted to 200x200 PNG to match existing format). Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-11 09:35:16 -07:00
A	479cbbc009	fix: pass --skip-setup to hermes installer for headless installs (#2496 ) The Hermes Agent installer's setup wizard tries to read from /dev/tty, which fails in headless/non-interactive cloud VM environments. The installer supports --skip-setup to bypass the wizard; pass it via bash -s -- --skip-setup. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 09:33:27 -04:00
A	1a9cd59ae8	fix: correct stale GritQL plugin path in type-safety rules (#2495 ) The `.claude/rules/type-safety.md` referenced the GritQL no-type-assertion plugin at `packages/cli/no-type-assertion.grit`, but the actual location is `lint/no-type-assertion.grit` (root-level lint/ directory, not packages/cli/). Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 08:50:42 -04:00
Ahmed Abushagur	330c10fcd2	feat: add Telegram soak test for OpenClaw (--soak mode) (#2492 ) Add a soak test that provisions OpenClaw on Sprite, waits 1 hour for stabilization, injects a Telegram bot token, and runs integration tests against the Telegram Bot API (getMe, sendMessage, getWebhookInfo). - New: sh/e2e/lib/soak.sh — soak test library with all Telegram-specific logic - Modified: sh/e2e/e2e.sh — add --soak flag to arg parser - Modified: qa.sh — add soak run mode (bypasses Claude, runs e2e.sh directly) - Modified: trigger-server.ts — add "soak" to VALID_REASONS - Modified: qa.yml — add soak to workflow_dispatch options Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: A <258483684+la14-1@users.noreply.github.com>	2026-03-11 05:51:53 -04:00
A	c0cedc3887	docs: add missing agent entries to all cloud READMEs (#2494 ) Junie was added to all 6 clouds (scripts + matrix) but none of the READMEs documented it. Sprite README was also missing Hermes, and local README was missing OpenCode and Junie. All 6 cloud READMEs now list all 8 agents consistently. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-11 05:49:50 -04:00
A	d6c1140612	test: remove duplicate validatePrompt test cases (#2493 ) The "should accept all example prompts from issue #2249" test block contained 3 assertions already covered by surrounding tests: - "Fix the merge conflict >> registration flow" (duplicated) - "Run tests && deploy if they pass" (duplicated) - "The output where X > Y is slow" (duplicated) The one unique assertion ("Add a heredoc to the Dockerfile") has been folded into the existing "developer phrases" test, which covers the same false-positive category (prose containing shell-like syntax). Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-11 04:49:13 -04:00
Ahmed Abushagur	a209daf492	fix: upgrade code-health teammate to do post-merge sweeps and gap detection (#2489) Replaces the generic "scan for code smells" prompt with a structured 3-step process: (1) post-merge consistency sweep: fix lint violations and straggler patterns left behind by recent PRs, (2) implementation gap detection: manifest.json vs actual scripts, missing READMEs, orphaned entries, (3) general health scan as fallback. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 04:15:59 -04:00
A	68abbee4df	fix(e2e): fix OPENROUTER_API_KEY fallback and sprite env whitelist (#2491) On QA VMs running Claude Code via OpenRouter, the API key is stored as ANTHROPIC_AUTH_TOKEN. Add a fallback in common.sh so e2e.sh picks up the key from ANTHROPIC_AUTH_TOKEN when ANTHROPIC_BASE_URL points to openrouter.ai and OPENROUTER_API_KEY is unset. Also add SPRITE_NAME and SPRITE_ORG to the headless env var whitelist in provision.sh; these are emitted by _sprite_headless_env() but were missing from the positive whitelist, causing every Sprite provisioning attempt to log errors and silently skip the env vars. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 03:23:46 -04:00
A	37fa334d78	fix: navigate back to list after delete/remove errors (#2488) * fix: navigate back to list after delete/remove errors instead of exiting Previously, choosing "Delete this server" or "Remove from history" from the action menu would always exit the picker, even if the operation failed. Now handleRecordAction returns "back" for delete/remove actions, and activeServerPicker refreshes the remaining list and loops back to the picker. Cancel on the action menu also returns to the list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ValueOf<T> type helper and GritQL enum ban rule - Add shared ValueOf<T> type that extracts value unions from const objects and readonly tuples - Update RecordActionOutcome to use ValueOf<typeof RecordActionOutcome> - Add lint/no-ts-enum.grit GritQL rule that bans TypeScript enum keyword - Register new rule in biome.json plugins Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sort type export before value exports in shared index Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add biome config for shared package, fix export sort order Add biome.json to packages/shared so lint + format + import organization is enforced on the shared library. Fix ValueOf export position to match biome's organizeImports sort order (type specifiers after value exports). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: hoist type re-exports to top of shared index Split inline `type Result` and `type ValueOf` out of mixed export statements into separate `export type { ... }` re-exports, hoisted to the top per biome's organizeImports group config. biome's useExportType rule doesn't flag re-exports (only locally defined types), so these must be manually separated. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: consolidate biome config to single root biome.json Remove per-package biome.json files (packages/cli, packages/shared, .claude/scripts, .claude/skills/setup-spa) and consolidate into a single root config with includes glob covering packages/*/.ts. Update GritQL rule exclusions to also match shared/src/ paths now that the shared package is covered by the root config. Fix build-clouds.ts lint issues (node: protocol, block statements, import sort) that were newly caught. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: replace grit filename exclusions with biome-ignore comments Remove all $filename exclusion logic from GritQL rules and instead add biome-ignore-all comments at the top of files that legitimately need the banned patterns (result.ts, parse.ts, type-guards.ts). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-11 00:04:51 -07:00
A	5031d84e6c	refactor: eliminate process-global mock.module() pollution in tests (#2490) Replace mock.module() calls with dependency injection to prevent cross-file test pollution in Bun's shared worker process. Changes: - orchestrate.ts: add getApiKey to OrchestrationOptions - billing-guidance.ts: add injectable BillingGuidanceDeps parameter - delete.ts: add optional deleteHandler parameter to confirmAndDelete - update.ts: add UpdateOptions with injectable runUpdate function - sprite.ts: add optional spawnFn parameter to interactiveSession - Remove unnecessary oauth mocks from junie-agent and do-snapshot tests Only @clack/prompts mock (shared via test-helpers.ts) and do-payment-warning.test.ts (safe spread pattern) remain. Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-10 23:57:57 -07:00
A	6439cba58c	fix: remove spinner from delete to prevent output overlap (#2487) * fix: remove spinner from delete command to prevent output overlap The delete spinner in confirmAndDelete collided with cloud-specific destroy functions that print their own progress (logStep/logInfo). This caused the "Instance destroyed" message to overwrite the spinner line without a newline, producing garbled output. Remove the spinner and let the cloud destroy functions handle progress output directly, then show a clean success/failure message after. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: redirect cloud destroy output into delete spinner Cloud destroy functions (logStep/logInfo) write progress to stderr, which collided with the @clack spinner on the terminal. Now stderr writes during the delete are intercepted and fed into s.message() so the spinner text updates in place instead of garbling the output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add delete spinner behavior tests Verify that confirmAndDelete: - Feeds stderr output from cloud destroy functions into spinner.message() - Calls spinner.clear() (not stop) so no spinner chrome remains - Shows p.log.success with the last stderr message as detail - Shows p.log.error on failure - Always restores process.stderr.write, even on error - Works when destroy produces no stderr output Also adds spinnerClear to the shared test-helpers mock. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove global cloud module mocks that polluted other tests Only mock hetzner (the cloud used by test records). Other cloud modules are left un-mocked since they're never called for hetzner records. This fixes the DO payment warning test failures caused by mock.module being process-global in Bun. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-10 23:35:12 -07:00
Ahmed Abushagur	4318acad19	fix: prompt to enable Compute Engine API for new GCP users (#2484) * fix: prompt to enable Compute Engine API on GCP SERVICE_DISABLED error New GCP users hit SERVICE_DISABLED because the Compute Engine API isn't enabled by default. Detects this error, opens the activation URL in the browser, and prompts the user to retry after enabling it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add beta flags section to README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 23:07:09 -07:00
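
Commit 37fa334d78 above describes a shared ValueOf<T> type that extracts value unions from const objects and readonly tuples, used together with a GritQL rule that bans the TypeScript enum keyword. The log does not show the helper itself, so the following is only a minimal sketch of how such a helper could look. The ValueOf definition, the "exit" member, the CLOUDS list, and the isCloud and nextStep functions are all assumptions made for illustration; only the names ValueOf, RecordActionOutcome, and the "back" value come from the commit message.

```typescript
// Hypothetical sketch of a ValueOf<T> helper; the real definition in the
// shared package may differ.
// Tuples/arrays yield the union of their elements; objects yield the union
// of their property values.
type ValueOf<T> = T extends readonly unknown[] ? T[number] : T[keyof T];

// Const object used in place of a TypeScript enum. Only "back" is named in
// the commit message; "exit" is an assumed second member.
const RecordActionOutcome = {
  Back: "back",
  Exit: "exit",
} as const;

// Same identifier in type space: resolves to "back" | "exit".
type RecordActionOutcome = ValueOf<typeof RecordActionOutcome>;

// Readonly tuples work as well (illustrative list, not the repo's real one).
const CLOUDS = ["hetzner", "gcp"] as const;
type Cloud = ValueOf<typeof CLOUDS>; // "hetzner" | "gcp"

// Type guard that narrows an arbitrary string to the Cloud union.
function isCloud(value: string): value is Cloud {
  return CLOUDS.some((c) => c === value);
}

// Exhaustive switch over the union; adding a member to the const object
// without handling it here becomes a compile-time error under strict mode.
function nextStep(outcome: RecordActionOutcome): string {
  switch (outcome) {
    case RecordActionOutcome.Back:
      return "return to list";
    case RecordActionOutcome.Exit:
      return "leave picker";
  }
}
```

The conditional branch for readonly arrays matters: indexing a tuple type with keyof would also pull in array method types, whereas T[number] yields only the element union.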

2018 commits