spawn

vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-05-22 03:14:57 +00:00

Author	SHA1	Message	Date
Ahmed Abushagur	8c7a381375	fix: auto-reconnect on Sprite connection drops (#2855 ) Sprite CLI exits with code 1 on "connection closed" (not 255 like SSH). The reconnect loop now treats exit code 1 on Sprite as a connection drop, retrying up to 5 times with a 3s delay between attempts. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 15:13:14 +07:00
A	a3e0dbd4dd	test: remove duplicate and theatrical tests (#2853 ) - Remove `digitalocean/findSpawnSnapshot` describe from do-cov.test.ts (3 basic tests) — fully superseded by do-snapshot.test.ts (7 thorough tests covering name filtering, invalid IDs, network failure, etc.) - Remove `setupAutoUpdate` describe from agent-setup-cov.test.ts (2 shallow tests checking only "systemd" string presence) — fully superseded by auto-update.test.ts which verifies exact systemd unit content, base64-encoded scripts, timer schedules, and error handling Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 12:24:00 +07:00
Ahmed Abushagur	26332afa56	fix: prevent silent exit in --fast mode on Sprite (#2852 ) In fast mode, Promise.allSettled runs server boot, OAuth, and tarball download concurrently. When all operations complete — especially after Bun.serve.stop(true) in the OAuth flow removes its event loop handle — the event loop can appear empty before the await continuation starts new I/O operations. This causes Bun to exit silently with code 0, dropping the user back to their shell after "Successfully obtained OpenRouter API key via OAuth!" with no error. Fix: keep a dummy setInterval handle alive during the fast-mode concurrent section so the event loop never drains prematurely. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 20:51:02 -07:00
A	9a98589cef	fix(security): prevent command injection via INPUT_TEST_TIMEOUT in verify.sh (#2851 ) Add defense-in-depth validation of INPUT_TEST_TIMEOUT directly in verify.sh (not just relying on common.sh). Each input test function now calls _validate_timeout() to ensure the value contains only digits before use. Additionally, instead of interpolating INPUT_TEST_TIMEOUT directly into remote command strings passed to cloud_exec, the timeout value is now assigned to a single-quoted remote variable (_TIMEOUT) and referenced via "$_TIMEOUT" on the remote side. This eliminates the injection surface even if validation were somehow bypassed. Affected functions: input_test_claude(), input_test_codex(), input_test_openclaw(), input_test_zeroclaw(). Fixes #2849 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-20 19:58:52 -07:00
A	acfc31027b	test: delete theatrical unicode-cov.test.ts (#2848 ) Fixes #2847 Removes 273 lines of false-confidence tests that copy-paste shouldForceAscii() logic inline 9x with zero imports from unicode-detect.ts. Every test passed even if the real source was deleted — a theatrical test is worse than no test. Agent: test-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 19:29:14 -07:00
A	84e78a0274	fix(test): prevent flaky timeout in checkBillingEnabled test (#2845 ) The test assumed _state.project would be empty, but module-level state persists across tests due to import caching. Prior resolveProject tests set _state.project, so checkBillingEnabled would attempt a real gcloudSync call and time out at 5s. Mock spawnSync to handle both cases. Agent: pr-maintainer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-20 18:38:48 -07:00
A	a7690f8400	test: remove duplicate and theatrical tests (#2846 ) - history-cov.test.ts: remove duplicate filterHistory ordering test and no-cap saveSpawnRecord test; both are already covered more thoroughly in history-trimming.test.ts - unicode-cov.test.ts: remove theatrical pattern where each test re-implemented shouldForceAscii as an inline lambda (testing an inline copy instead of the real function). consolidate into a single shared helper that mirrors the actual module logic, tested once per scenario. -- qa/dedup-scanner Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-20 17:57:40 -07:00
A	858e348a24	fix(security): add HOME validation before rm -rf in cleanup (#2842 ) Add safe_cleanup_test_dirs() helper to qa.sh and security.sh that validates HOME is set, exists, and is not "/" before running find + rm -rf for test directory cleanup. Prevents unintended deletions if HOME is unset or maliciously set. Fixes #2838 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-20 17:36:53 -07:00
A	62e5918078	fix(security): wrap runServer SSH commands with shellQuote in DO and Hetzner (#2843 ) DigitalOcean and Hetzner runServer() passed the command string directly to SSH without shell-quoting, allowing metacharacters (;, \|, $(), etc.) to be interpreted by the remote shell. AWS and GCP already used `bash -c ${shellQuote(fullCmd)}` — this applies the same pattern to the two affected modules. Fixes #2836 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-20 17:34:43 -07:00
A	ffb4cbeb11	fix(security): prevent path traversal in uploadFile/downloadFile across all cloud providers (#2844 ) Check for ".." path traversal in the raw input BEFORE normalize() strips it, fixing CWE-22 where crafted paths like "/tmp/../../etc/passwd" normalized to "/etc/passwd" and bypassed the post-normalize ".." check. Extracts a shared validateRemotePath() into shared/ssh.ts and replaces the duplicated inline validation in all 5 providers (DigitalOcean, Hetzner, GCP, AWS, Sprite) plus agent-setup.ts. Fixes #2835 Agent: complexity-hunter Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-20 16:48:58 -07:00
A	b9e326d649	fix: use base64 encoding for GITHUB_TOKEN to prevent injection (#2840 ) * fix: use base64 encoding for GITHUB_TOKEN to prevent injection Aligns GITHUB_TOKEN handling with the existing base64 pattern used for OPENROUTER_API_KEY in orchestrate.ts, eliminating the single-quote escaping vulnerability. Fixes #2834 Agent: security-auditor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: apply shellQuote to base64-encoded GITHUB_TOKEN Address security review feedback: wrap the base64-encoded token in shellQuote() for defense-in-depth, preventing any theoretical shell metacharacter escape from the interpolated value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 16:46:49 -07:00
A	0c13679dde	fix(security): quote branch name variables to prevent word-splitting (#2841 ) Replace `for branch in $VAR` with `while IFS= read -r branch` loops in qa.sh and security.sh to prevent word-splitting on branch names containing spaces or special characters. This closes a MEDIUM severity vulnerability where a malicious branch name like `qa/test main` could cause the loop to iterate over split tokens separately. Fixes #2837 Agent: style-reviewer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-20 23:44:22 +00:00
A	5acf598615	fix: use stdin piping in _stage_prompt_remotely to prevent injection (#2839 ) Replaces command string interpolation with stdin piping for the base64 prompt in verify.sh. Also anchors the _validate_base64 regex. Fixes #2833 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 15:46:00 -07:00
A	3551995aa1	refactor: remove dead code and stale references (#2832 ) Deduplicate identical mockBunSpawn helper that was copy-pasted across five test files (aws-cov, gcp-cov, do-cov, hetzner-cov, sprite-cov). Centralise it in test-helpers.ts and import from there instead. -- qa/code-quality Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 14:12:20 -07:00
A	5e263dd12f	test: remove duplicate and theatrical tests (#2831 ) Remove 10 duplicate test cases from cmd-list-cov.test.ts and cmd-run-cov.test.ts that were already covered by dedicated test files: - buildRecordLabel (3 tests) — duplicated from cmdlast.test.ts - buildRecordSubtitle (3 tests) — duplicated from cmdlast.test.ts - cmdListClear (2 tests) — weaker duplicates of clear-history.test.ts - cmdLast (1 test) — duplicated from cmdlast.test.ts - cmdRun detectAndFixSwappedArgs (1 test) — duplicated from commands-swap-resolve.test.ts which has 10 thorough swap tests -- qa/dedup-scanner Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-20 13:47:17 -07:00
A	32525f5dd7	test: remove duplicate and theatrical tests (#2830 ) - delete manifest-cov.test.ts: it duplicated stripDangerousKeys, agentKeys/cloudKeys/matrixStatus/countImplemented from manifest.test.ts; unique tests (isStaleCache, getCacheAge, richer loadManifest edge cases) consolidated into manifest.test.ts - remove sprite/interactiveSession from sprite-cov.test.ts: superseded by sprite-keep-alive.test.ts which tests actual script content - remove sprite/installSpriteKeepAlive from sprite-cov.test.ts: superseded by sprite-keep-alive.test.ts - remove startGateway from agent-setup-cov.test.ts: superseded by gateway-resilience.test.ts which checks systemd config, cron, and port-wait all 2050 tests pass Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-20 09:51:57 -07:00
A	8be8b650b0	docs: sync README commands table with help.ts (add spawn link) (#2829 ) Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-20 09:49:38 -07:00
A	1dc9c04eeb	fix: standardize ESM import extensions across 35 production files (#2827 ) Add .js extensions to 124 relative imports that were missing them. The codebase is "type": "module" (ESM) and the dominant pattern already used .js extensions, but 35 files had a mix of extensionless and .js imports — sometimes within the same file. Standardize to .js everywhere. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 08:51:40 -07:00
A	f4e2cd80a4	fix(ux): add spawn link to help output and --fast to KNOWN_FLAGS (#2828 ) spawn link is a fully implemented command (440 lines) that was completely missing from `spawn help`. Users had no way to discover it through the CLI's self-documentation. Also adds --fast to the KNOWN_FLAGS set for consistency — it was accepted by the CLI but not registered in the flag validation set. Agent: ux-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 08:49:26 -07:00
Ahmed Abushagur	21c0e1511c	fix: remove 100-entry history cap — keep all records (#2819 ) The MAX_HISTORY_ENTRIES=100 cap silently archived records when you spawned more than 100 times, making older active servers vanish from `spawn list`. The cap was solving a non-problem — 1000 records is ~500KB. Removed: - MAX_HISTORY_ENTRIES constant and trimming logic - archiveRecords() and readExistingArchive() (no longer needed) - Smart trim tests (history-trimming.test.ts rewritten to test ordering only) Existing archive files (~/.spawn/history-YYYY-MM-DD.json) are still readable by recoverFromArchives() for corruption recovery. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 06:32:08 -07:00
A	a7cebd4054	test: remove duplicate and theatrical tests (#2826 ) - delete commands-update-download.test.ts (7 tests): superseded by cmd-update-cov.test.ts which has 13 tests with better fallback URL coverage and uses clack mocks properly - remove saveSpawnRecord id generation describe from history-cov.test.ts (1 test): superseded by history-spawn-id.test.ts which has 3 more thorough tests covering the same scenario - remove 4 describe blocks from cmd-run-cov.test.ts (18 tests): getSignalGuidance, getScriptFailureGuidance, getScriptFailureGuidance additional, and getSignalGuidance additional are all covered more thoroughly by the dedicated script-failure-guidance.test.ts; the "additional" blocks were theatrical (only checked joined.length > 0) - delete picker.test.ts and merge its 8 parsePickerInput tests into picker-cov.test.ts to eliminate duplicate describe name collision 2063 -> 2036 tests (-27), 0 failures Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-20 06:11:57 -07:00
A	c323f10ae9	fix(gcp): add /usr/local/bin to PATH for kilocode binary detection (#2825 ) Fixes #2823: npm installs kilocode to /usr/local/bin when running as root on GCP, but the E2E binary verify step didn't include /usr/local/bin in PATH, causing false "binary not found" failures. The .spawnrc PATH (generated by generateEnvConfig) already includes /usr/local/bin, but verify_kilocode used a hardcoded PATH that omitted it. This aligns kilocode and codex verify checks with openclaw and junie which already include /usr/local/bin. Also fixes the same latent issue in verify_codex. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 05:25:15 -07:00
A	8f24067336	test: remove duplicate and theatrical tests (#2820 ) Remove thin duplicate test blocks that were redundant with more comprehensive coverage elsewhere: - ui-cov.test.ts: drop shellQuote (4 tests → gcp-shellquote.test.ts has 11), jsonEscape (1 test → ui-utils.test.ts has 4), toKebabCase (2 tests → ui-utils.test.ts has 5), sanitizeTermValue (2 tests → ui-utils.test.ts has 6), withRetry (3 tests → with-retry-result.test.ts has 8) - agent-setup-cov.test.ts: drop wrapSshCall (5 tests → with-retry-result.test.ts has 7 plus integration tests) - run-path-credential-display.test.ts: drop isRetryableExitCode (2 tests → cmd-run-cov.test.ts has 5) - history-cov.test.ts: drop generateSpawnId (2 tests → history-spawn-id.test.ts has 2 with UUID format check) and clearHistory (2 tests → clear-history.test.ts has extensive coverage) - cmd-list-cov.test.ts: drop formatRelativeTime (9 tests → commands-exported-utils.test.ts has 10 with an extra boundary case) All 2063 tests pass, biome lint clean. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-20 05:00:22 -07:00
A	9ddf8b67b0	fix(ux): remove -n short flag from spawn link --name to prevent silent conflict (#2822 ) The top-level arg parser in index.ts:820 claims -n for --dry-run before any subcommand sees it. Running `spawn link 1.2.3.4 -n my-server` silently drops the intended name value — the user gets no error, the spawn is registered without the name they specified. Removing -n from link's --name extractFlag call eliminates the conflict. The --name long form is unaffected and documented in the usage string. Also updates cmd-link-cov.test.ts to use --name in the short-flags test. Agent: ux-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 04:01:00 -07:00
A	24bdf664ab	fix(types): resolve TypeScript strict mode errors in production code (#2824 ) Fix 24 TypeScript strict mode errors across 7 production files: - interactive.ts: guard against undefined `val` in validate callback - list.ts: use already-narrowed `conn` variable instead of `selected.connection` - run.ts: widen `buildCloudLines` defaults param to `Record<string, unknown>` - digitalocean.ts: use `toRecord()` to safely drill into nested API responses; capture narrowed `oauthCode` in const for async closure - history.ts: backfill missing record IDs via `backfillRecordIds()` helper; use `v.safeParse` output directly to get properly typed records - index.ts: use `Manifest` type for `showUnknownCommandError` parameter - orchestrate.ts: capture narrowed `tunnel` and `getConnectionInfo` in const variables before async closures Fixes #2821 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-20 03:17:04 -07:00
A	c82865707a	feat: fix coverage threshold enforcement with correct bunfig syntax (#2818 ) The original bunfig.toml used `line` and `function` (singular) which Bun silently ignores. The correct field names are `lines` and `functions` (plural). Changes: - Fix field names: line→lines, function→functions - Set thresholds: lines=0.35 (floor: digitalocean.ts 38.5%), functions=0.5 (floor: preload.ts 50%) - Add coverageSkipTestFiles=true - Keep --coverage in CI (bunfig thresholds enforce exit code on failure) Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-20 02:21:40 -07:00
Ahmed Abushagur	54820f0bea	fix: add file locking to history writes + backfill missing record IDs (#2817 ) History records were being silently lost when concurrent spawn processes did load→modify→save simultaneously (last writer wins, first record vanishes). This explains records disappearing from `spawn list`. Changes: - Add mkdir-based advisory file locking (withHistoryLock) around all write operations: saveSpawnRecord, saveLaunchCmd, saveMetadata, markRecordDeleted, removeRecord, updateRecordIp, updateRecordConnection - Stale lock detection (>30s) prevents deadlocks from crashed processes - Backfill IDs on legacy records without them during loadHistory() - Validate archive records during merge (readExistingArchive) - Limit archive recovery scan to 30 most recent files Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 01:48:58 -07:00
A	0ea0d5bb61	test: add coverage for retryOrQuit and skipCloudInit auto-detection (#2810 ) Both functions were added in recent commits but had zero test coverage: - retryOrQuit (`ed127cf`): non-interactive mode now verified to throw - skipCloudInit (`2280550`): 4 cases verify correct tier/cloud/mode conditions 1468 tests pass, 0 failures. Agent: test-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 23:45:04 -07:00
A	69b6f8aa66	fix(test): fix 7 failing tests — GCP mock gaps and sandbox pollution (#2816 ) - GCP coverage tests (6 failures): getServerIp, listServers, and authenticate tests did not mock the `which gcloud` spawnSync call inside requireGcloudCmd(), causing "gcloud CLI not found" errors. Add mockSpawnSyncWithGcloud/mockWhichGcloud helpers that satisfy the gcloud discovery call before the test-specific mock. - Sandbox guardrail test (1 failure): cmd-uninstall-cov deletes ~/.spawn and other sandbox directories but never re-creates them. Since Bun runs test files in the same process, the fs-sandbox test then fails. Add afterEach restoration of sandbox dirs. - Add coverageThreshold to bunfig.toml with correct syntax (coverageThreshold under [test], not [test.coverage]) Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-19 23:43:13 -07:00
A	407a4ee901	fix: rename duplicate `providers` variable in key-server.ts (#2815 ) The second `const providers` declaration shadowed the first in the same scope, causing a parse error that crashed the key server on startup. Renamed to `providerRequests` to fix the conflict. Closes #2808 Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 23:09:59 -07:00
A	221b25507f	fix(security): use consistent SPAWN_ISSUE validation pattern (#2814 ) Update security.sh to use `^[1-9][0-9]*$` instead of `^[0-9]+$`, matching refactor.sh and rejecting leading zeros. Closes #2761 Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-19 23:04:04 -07:00
A	18c7834d24	fix: restore packages/cli/bunfig.toml for preload when running from subdir (#2813 ) The pre-merge hook and `cd packages/cli && bun test` need a local bunfig.toml so the preload path resolves correctly for the sandbox. Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 22:57:03 -07:00
A	9ae3525030	feat: enforce CI coverage thresholds + colocate billing guidance (#2811 ) - Move bunfig.toml to repo root with valid coverageThreshold syntax (line=80%, function=0 to avoid per-file false positives) - Add --coverage flag to CI test step - Delete packages/cli/bunfig.toml (superseded by root config) - Add tests for packages/shared (type-guards, parse, result) - Colocate billing config into each cloud directory (aws/billing.ts, gcp/billing.ts, hetzner/billing.ts, digitalocean/billing.ts) - Refactor billing-guidance.ts: BillingConfig interface replaces cloud-string-keyed Record maps - Bump CLI version to 0.25.1 Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-19 22:52:45 -07:00
Ahmed Abushagur	aa4b2a23d6	feat: auto-reconnect on SSH drops during interactive session (#2806 ) When SSH exits with code 255 (connection dropped/timed out), retry up to 5 times with 3s delay between attempts. Clean exits (0), Ctrl+C (130), and agent crashes exit immediately without retrying. Only applies to remote clouds — local sessions skip reconnect logic. Signed-off-by: L <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-19 22:28:10 -07:00
A	72c3f23364	test: add comprehensive code coverage tests (#2802 ) * test: add comprehensive coverage tests (67% → 85% lines) Add 27 new test files with ~565 tests covering all major modules: Shared modules: - ui-cov: logging, prompts, validation, shellQuote, withRetry, loadApiToken - ssh-cov: spawnInteractive, killWithTimeout, startSshTunnel, waitForSsh - ssh-keys-cov: generateSshKey edge cases, key sorting, fingerprint - oauth-cov: PKCE flow, code verifier/challenge, key management - orchestrate-cov: provisioning flow, enabled steps, model preferences - agent-setup-cov: wrapSshCall, createCloudAgents, GitHub auth Commands: - connect, status, uninstall, pick, delete, update, fix, interactive - link, run, list (with formatRelativeTime, filters, actions) Cloud providers: - aws, gcp, digitalocean, hetzner, sprite (auth, CRUD, SSH ops) Remaining: - picker, unicode-detect, history, manifest, update-check Also fixes: - do-payment-warning.test.ts: use spyOn instead of mock.module for shared/ui to prevent cross-test contamination - preflight-credentials.test.ts: resilient to @clack/prompts mock replacement by other test files Coverage: 74% → 90% functions, 67% → 85% lines Tests: 1467 → 2032, 0 failures Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: expand coverage tests for commands, oauth, orchestrate, and link Add 65+ new tests across 7 test files: - cmd-list-cov: handleRecordAction branches (rerun, fix, no-connection), resolveListFilters with cloud filter, footer and empty message paths - cmd-run-cov: showDryRunPreview edge cases, getScriptFailureGuidance for all exit codes, getSignalGuidance, cmdRun validation - cmd-pick-cov: flag edge cases (missing values, multiple flags) - cmd-link-cov: IP generation, detection spinner, invalid IP - cmd-fix-cov: additional fix paths - oauth-cov: non-standard key confirmation, null config handling - orchestrate-cov: tunnel support, checkAccountReady, tarball, SPAWN_NAME, preLaunch, restart loop, step validation Coverage: 90.50% functions, 85.13% lines (2097 tests, 0 failures) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add coverage thresholds (80% lines, 90% functions) Configure bun test coverage thresholds in bunfig.toml to enforce minimum coverage levels and prevent regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 22:24:54 -07:00
A	646faf66e2	test: remove duplicate config_files test in manifest-type-contracts (#2809 ) Consolidated two overlapping describe blocks that both iterated over the same config_files data: - 'Agent optional field types' had a test checking config_files keys were strings with length > 0 - 'Config files structure' had a separate describe checking the same keys match a path regex and values are non-null objects Merged into a single test within 'Agent optional field types' that checks all constraints: key is string, key is non-empty, key matches path regex (/[/~./]), and value is a non-null object. Removed the now-redundant 'Config files structure' describe block. -- qa/dedup-scanner Co-authored-by: spawn-qa-bot <qa@openrouter.ai>	2026-03-19 22:05:41 -07:00
Ahmed Abushagur	ed127cf592	feat: never-give-up resilience layer (#2807 ) * feat: never-give-up resilience layer: retry every failure instead of exiting Add retryOrQuit() helper to shared/ui.ts that prompts "Try again? (Y/n)" after any recoverable failure. Wrap all fatal exit points with retry loops: - Cloud auth (Hetzner, DigitalOcean, AWS, GCP): retry after 3 failed tokens - API key acquisition: retry after 3 failed OAuth+manual attempts - Server creation: retry on any createServer failure (both fast & sequential) - SSH readiness: retry on waitForReady timeout - Agent install: retry on install failure - Pre-launch hooks: retry on preLaunch failure Non-interactive mode (SPAWN_NON_INTERACTIVE=1) still throws immediately. Ctrl+C at any retry prompt exits cleanly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(e2e): add AI-driven interactive test harness Add --interactive mode to the E2E test framework. Instead of running spawn in headless mode (SPAWN_NON_INTERACTIVE=1), this spawns the CLI in a real PTY and uses Claude Haiku to respond to prompts like a human user would. New files: - sh/e2e/interactive-harness.ts: Bun script that drives the PTY + AI loop - sh/e2e/lib/interactive.sh: Bash integration with the E2E framework Usage: e2e.sh --cloud hetzner claude --interactive Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(qa): wire interactive E2E into scheduled QA pipeline - Add `e2e-interactive` option to workflow_dispatch in qa.yml - Add `e2e-interactive` run mode to qa.sh (loads cloud creds + ANTHROPIC_API_KEY) - Runs `e2e.sh --cloud hetzner claude --interactive` directly (no Claude Code needed) - Defaults to hetzner (cheapest), overridable via E2E_INTERACTIVE_CLOUD/AGENT env vars Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(qa): schedule interactive E2E daily at 6am UTC Runs one agent (claude) on one cloud (hetzner) with AI-driven prompts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): offset soak cron to avoid GitHub Actions schedule dedup GitHub Actions deduplicates overlapping cron schedules into one run, making `github.event.schedule` unpredictable. The soak test at `0 3 * * 1` was getting absorbed by the `0 */4 * * *` quality sweep and never firing as reason=soak. Move soak to `30 1 * * 1` (Monday 1:30am UTC), safely between the 0am and 4am quality sweep slots. Interactive E2E at `0 6 * * *` is already safe (between the 4am and 8am slots). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): add e2e-interactive to trigger server valid reasons The trigger server validates reason query params against an allowlist. Without this, the `e2e-interactive` dispatch returns 400. Also note: `soak` is already in VALID_REASONS in the repo but the running service on the QA VM is stale and needs a restart to pick up both soak and e2e-interactive reasons. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 17:33:22 -07:00
Ahmed Abushagur	2280550c18	perf: skip cloud-init for minimal-tier agents with tarballs/snapshots (#2804 ) * perf: skip cloud-init for minimal-tier agents with tarballs/snapshots Ubuntu 24.04 base images already have curl + git, so minimal-tier agents (claude, opencode, zeroclaw, hermes) don't need the cloud-init package install step when using tarballs or snapshots. Adds skipCloudInit flag to CloudOrchestrator — set automatically when (tarball \|\| snapshot) && tier === "minimal". Each cloud's waitForReady checks this flag and calls waitForSshOnly instead of waitForCloudInit. Saves ~30-60s on minimal-tier agent deploys with --fast or --beta tarball. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add --fast mode and updated beta features to README Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remove timing table from README Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-19 16:14:49 -07:00
A	66036bfac9	fix(do): skip _run_with_restart in headless mode to prevent duplicate droplets (#2805 ) The _run_with_restart wrapper in all 8 DigitalOcean agent scripts catches SIGTERM/SIGKILL exit codes (143/137) and retries the orchestration process. In headless mode (E2E tests), when the provision timeout kills the process, this restart loop would re-run main.ts, creating duplicate droplets and exhausting the account's droplet quota — causing ALL subsequent DO agents to fail provisioning. Skip the restart loop entirely when SPAWN_HEADLESS=1 (set by runScriptHeadless in the CLI). The restart behavior is only useful for interactive sessions where the user's SSH connection drops. Fixes #2794 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 16:12:25 -07:00
A	8d76ad90d3	security: base64-encode cmd in _sprite_exec to prevent injection (#2803 ) Apply the same base64 encoding mitigation used by all other cloud drivers (aws, hetzner, digitalocean, gcp). The command is encoded locally, validated for safe characters, then decoded and executed on the remote side via `base64 -d \| bash`. Fixes #2800 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-19 13:19:07 -07:00
A	1d0349cc23	test: add SPAWN_FAST fast-mode coverage to orchestrate (#2801 ) Add 6 test cases verifying the Promise.allSettled parallel orchestration path introduced in #2796. Tests cover: happy path, server boot failure propagation, API key failure propagation, tarball fallback to agent.install, local cloud exclusion from fast mode, and non-fatal preProvision/checkAccountReady failures. Agent: test-engineer Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 13:16:02 -07:00
A	8fef58845c	fix(e2e): use aggressive cleanup threshold (5 min) for pre-run to prevent quota exhaustion (#2798 ) The pre-run stale cleanup (added in #2789) used the same 30-minute max_age as the post-run cleanup. Orphaned instances from recently-failed runs (< 30 min old) were not cleaned, causing quota exhaustion on DigitalOcean and other clouds. Pre-run cleanup now uses _CLEANUP_MAX_AGE=300 (5 min) to aggressively reclaim orphaned e2e instances before provisioning new ones. Post-run cleanup retains the 30-minute default. All 5 cloud drivers respect the override. Fixes #2793 Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 11:23:55 -07:00
A	e4bfd38443	security: pass encoded prompt via env var, not string interpolation (#2799 ) Fixes #2797. The _stage_prompt_remotely() function was interpolating ${encoded_prompt} directly into the remote command string passed to cloud_exec. While _validate_base64() ensures only [A-Za-z0-9+/=] characters are present, defense-in-depth requires eliminating the interpolation entirely. The fix uses printf %s format substitution to build the remote command, placing the encoded prompt into a single-quoted shell variable assignment (_EP='...') on the remote side. Single quotes prevent all shell expansion, and base64 charset cannot contain single quotes, making injection structurally impossible. Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 11:23:08 -07:00
Ahmed Abushagur	5efbcf9ee7	feat: add --fast flag for parallel server boot + setup (#2796 ) * feat: add --fast flag for parallel server boot + setup Adds `--fast` flag that runs server creation concurrently with API key prompt, account check, pre-provision hooks, tarball download, and env config generation. Once SSH is up, uploads tarball and applies config. --fast implies --beta tarball and --beta images, enabling snapshots and pre-built tarballs automatically. Flow without --fast (sequential): auth → API key → preProvision → size → create → boot → install → configure Flow with --fast (parallel): auth → size → [create+boot | API key | preProvision | tarball download | accountCheck] → upload tarball → inject env → configure Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add --beta parallel as standalone opt-in for parallel setup --beta parallel enables the parallel orchestration without implying tarball/images. --fast still implies all three (tarball + images + parallel). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 10:26:54 -07:00
A	6772ed1cd7	fix(cli): validate agentKey in buildFixScript and fixSpawn before manifest lookup (#2792 ) Add validateIdentifier() calls to buildFixScript() and fixSpawn() to ensure agent keys from spawn history match [a-z0-9_-]+ before using them to index manifest.agents. This prevents potential prototype pollution or unexpected behavior from tampered history files. Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-19 06:36:06 -07:00
A	5f8b7f1145	fix(e2e): run stale cleanup before agents, not just after (#2789 ) Orphaned e2e instances from previously interrupted test runs (e.g. killed by timeout) remain under the 30-minute max_age threshold and continue to consume account capacity. This caused DigitalOcean "droplet limit exceeded" 422 errors when re-running the suite within 30 minutes of a failed run. Add a pre-run stale cleanup call at the start of run_agents_for_cloud (after credentials are validated, before agents start). This clears leftover e2e-* instances immediately so they don't block provisioning in the new run. -- qa/e2e-tester Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 03:49:51 -07:00
A	9ab3993b39	fix(e2e): eliminate prompt interpolation in input_test commands (#2790 ) Replaces the pattern of embedding base64-encoded prompts directly into remote command strings via shell variable interpolation with a two-step approach: stage the encoded prompt to a remote temp file first, then read from that file in the agent command. This eliminates RCE risk if the prompt source ever becomes user-controlled. Changes: - Add _stage_prompt_remotely() helper that writes encoded prompt to /tmp/.e2e-prompt on the remote host via an isolated cloud_exec call - input_test_claude(): read prompt from temp file instead of _ENCODED_PROMPT var - input_test_codex(): same - input_test_openclaw(): same - input_test_zeroclaw(): same - Update _validate_base64() comment to reflect defense-in-depth role Closes #2788 Agent: security-auditor Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 03:48:53 -07:00
A	787087144c	fix(cli): bump version to 0.23.2 for missed patch releases (#2787 ) Two CLI changes landed after the last version bump (0.23.1) without incrementing the version: - `d9575acd`: fix(cli): exit with code 1 on spawn fix error paths - `148cc9e7`: refactor: extract duplicate waitForSshSnapshotBoot to shared/ssh.ts The CLI has auto-update enabled — without a version bump, users won't pick up these fixes on next run. Agent: code-health Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 01:00:10 -07:00
Ahmed Abushagur	5a23982513	fix: prevent grep pipefail from killing tarball release uploads (#2786 ) The old-asset cleanup pipeline `gh release view | grep | while` fails when grep finds no matches (exit 1) and pipefail is set. This kills the entire step before gh release upload runs. Fix: wrap grep in `{ grep ... || true; }` so no-match is not fatal. This caused all arm64 builds and some x86_64 builds to fail nightly. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:51:09 -07:00
Ahmed Abushagur	2825884fee	fix(packer): use cpx22 in nbg1 for Hetzner builds (#2785 ) cx23 is only available in Helsinki — poor availability. Switch to cpx22 (AMD, 2 vCPU, 4GB) which is available in nbg1/hel1/sin. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 22:52:01 -07:00
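
The grep/pipefail failure described in commit 5a23982513 can be reproduced and fixed in a few lines. This is only a sketch of the pattern, not the repository's workflow step: the function name `list_matching_assets` and the asset names are hypothetical, and `printf` stands in for the `gh release view` output.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: read asset names on stdin and report those matching
# a pattern. It succeeds even when nothing matches.
list_matching_assets() {
  local pattern="$1"
  # grep exits 1 when it finds no match. Under pipefail that status would
  # fail the whole pipeline, and under `set -e` abort the script.
  # Wrapping grep in `{ ...; || true; }` form makes "no match" non-fatal.
  { grep -- "$pattern" || true; } | while IFS= read -r name; do
    printf 'stale: %s\n' "$name"
  done
}

# No asset matches 'arm64' here, yet the script continues to the next step.
printf 'a-x86_64.tar.gz\nb-x86_64.tar.gz\n' | list_matching_assets 'arm64'
echo "upload step still runs"
```

One trade-off of this pattern: `|| true` also hides genuine grep errors (exit status 2, for example an invalid pattern), so it suits cleanup steps where a missed match is harmless.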

2225 commits