vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-05-03 06:10:21 +00:00

Author	SHA1	Message	Date
A	f5f0b9ec64	fix(lint): fix biome violations in packages/shared and add to CI (#2923 ) The CI biome check only covered packages/cli/src/, .claude/scripts/, and .claude/skills/setup-spa/ — packages/shared/src/ was unchecked, allowing 7 lint/format violations to accumulate in its test files. - Auto-fix import ordering, formatting, and useNumberNamespace lint across 3 test files in packages/shared/src/__tests__/ - Add packages/shared/src/ to the biome check in lint.yml so future violations are caught in CI Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-23 17:49:55 -07:00
Ahmed Abushagur	66a1749b4b	fix: add sprite-keep-running.sh, remove Hetzner from Packer, cleanup on cancel (#2869) * fix: destroy orphaned Packer builder instances on workflow cancel When a Packer Snapshots workflow is cancelled mid-build, Packer's process is killed before it can clean up its temporary builder droplet/server. This leaves orphaned packer-* instances running and costing money. Add `if: cancelled()` cleanup steps for both DigitalOcean and Hetzner that destroy any packer-* prefixed instances after cancellation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: remove Hetzner cleanup step — only DO needed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove Hetzner from Packer snapshots, add cancel cleanup Remove Hetzner from the Packer workflow entirely — only DigitalOcean snapshots are built. Deletes packer/hetzner.pkr.hcl and simplifies the workflow by removing all Hetzner-specific steps and cloud conditionals. Also adds a cancelled() cleanup step that destroys orphaned packer-* builder droplets when a workflow run is cancelled mid-build. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add missing sprite-keep-running.sh script The keep-alive install was 404ing because sh/shared/sprite-keep-running.sh never existed in the repo. The TypeScript code downloaded it from the CDN (which maps to sh/shared/) but the file was never created. The script wraps a command and pings the sprite's own public URL every 30s to prevent inactivity shutdown. It resolves the URL via sprite-env info (available on all sprites) and falls back to exec without keep-alive if the URL can't be determined.
Also removes Hetzner from the Packer snapshots workflow entirely — only DigitalOcean snapshots are built. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address security review — scope cleanup filter, fix JSON injection 1. Add `spawn-packer` tag to DO builder droplets in Packer template and filter cleanup by tag instead of broad `packer-` name prefix. Prevents accidentally destroying builder instances from other concurrent builds. 2. Use `jq --arg` for SINGLE_AGENT_INPUT instead of string interpolation to prevent JSON injection via crafted agent names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 18:13:38 +00:00
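The JSON-injection fix above swaps string interpolation for `jq --arg`. The same idea can be sketched in TypeScript, where `JSON.stringify` plays the role of `jq --arg`; this is an analogy, not the workflow's actual code. The `agent` key and both helper names are assumptions made for illustration; only `SINGLE_AGENT_INPUT` is named in the commit.

```typescript
// Hypothetical helpers: the real workflow builds SINGLE_AGENT_INPUT with jq.

// Unsafe: the agent name is pasted into the JSON text verbatim.
function naiveInput(agent: string): string {
  return `{"agent":"${agent}"}`;
}

// Safe: the serializer escapes quotes, so the name stays a single string value.
function safeInput(agent: string): string {
  return JSON.stringify({ agent });
}

// A crafted name that closes the string early and smuggles in a second key.
const crafted = 'x","admin":"true';

console.log(naiveInput(crafted)); // {"agent":"x","admin":"true"}
console.log(Object.keys(JSON.parse(safeInput(crafted)))); // [ 'agent' ]
```

With the serializer in charge of escaping, a crafted name can only ever change the value of the one field it was meant to fill.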
A	c82865707a	feat: fix coverage threshold enforcement with correct bunfig syntax (#2818 ) The original bunfig.toml used `line` and `function` (singular) which Bun silently ignores. The correct field names are `lines` and `functions` (plural). Changes: - Fix field names: line→lines, function→functions - Set thresholds: lines=0.35 (floor: digitalocean.ts 38.5%), functions=0.5 (floor: preload.ts 50%) - Add coverageSkipTestFiles=true - Keep --coverage in CI (bunfig thresholds enforce exit code on failure) Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-20 02:21:40 -07:00
A	9ae3525030	feat: enforce CI coverage thresholds + colocate billing guidance (#2811 ) - Move bunfig.toml to repo root with valid coverageThreshold syntax (line=80%, function=0 to avoid per-file false positives) - Add --coverage flag to CI test step - Delete packages/cli/bunfig.toml (superseded by root config) - Add tests for packages/shared (type-guards, parse, result) - Colocate billing config into each cloud directory (aws/billing.ts, gcp/billing.ts, hetzner/billing.ts, digitalocean/billing.ts) - Refactor billing-guidance.ts: BillingConfig interface replaces cloud-string-keyed Record maps - Bump CLI version to 0.25.1 Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-03-19 22:52:45 -07:00
Ahmed Abushagur	ed127cf592	feat: never-give-up resilience layer (#2807) * feat: never-give-up resilience layer — retry every failure instead of exiting Add retryOrQuit() helper to shared/ui.ts that prompts "Try again? (Y/n)" after any recoverable failure. Wrap all fatal exit points with retry loops: - Cloud auth (Hetzner, DigitalOcean, AWS, GCP): retry after 3 failed tokens - API key acquisition: retry after 3 failed OAuth+manual attempts - Server creation: retry on any createServer failure (both fast & sequential) - SSH readiness: retry on waitForReady timeout - Agent install: retry on install failure - Pre-launch hooks: retry on preLaunch failure Non-interactive mode (SPAWN_NON_INTERACTIVE=1) still throws immediately. Ctrl+C at any retry prompt exits cleanly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(e2e): add AI-driven interactive test harness Add --interactive mode to the E2E test framework. Instead of running spawn in headless mode (SPAWN_NON_INTERACTIVE=1), this spawns the CLI in a real PTY and uses Claude Haiku to respond to prompts like a human user would.
New files: - sh/e2e/interactive-harness.ts — Bun script that drives the PTY + AI loop - sh/e2e/lib/interactive.sh — Bash integration with the E2E framework Usage: e2e.sh --cloud hetzner claude --interactive Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(qa): wire interactive E2E into scheduled QA pipeline - Add `e2e-interactive` option to workflow_dispatch in qa.yml - Add `e2e-interactive` run mode to qa.sh (loads cloud creds + ANTHROPIC_API_KEY) - Runs `e2e.sh --cloud hetzner claude --interactive` directly (no Claude Code needed) - Defaults to hetzner (cheapest), overridable via E2E_INTERACTIVE_CLOUD/AGENT env vars Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(qa): schedule interactive E2E daily at 6am UTC Runs one agent (claude) on one cloud (hetzner) with AI-driven prompts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): offset soak cron to avoid GitHub Actions schedule dedup GitHub Actions deduplicates overlapping cron schedules into one run, making `github.event.schedule` unpredictable. The soak test at `0 3 * * 1` was getting absorbed by the `0 */4 * * *` quality sweep and never firing as reason=soak. Move soak to `30 1 * * 1` (Monday 1:30am UTC) — safely between the 0am and 4am quality sweep slots. Interactive E2E at `0 6 * * *` is already safe (between the 4am and 8am slots). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): add e2e-interactive to trigger server valid reasons The trigger server validates reason query params against an allowlist. Without this, the `e2e-interactive` dispatch returns 400. Also note: `soak` is already in VALID_REASONS in the repo but the running service on the QA VM is stale — needs a restart to pick up both soak and e2e-interactive reasons. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 17:33:22 -07:00
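The retry behaviour described for `retryOrQuit()` can be sketched roughly as below. The log does not show the helper's real signature, so the parameters here (an injected `confirm` callback and a `nonInteractive` flag) are assumptions, and the sketch is synchronous for brevity where a real CLI prompt would most likely be async.

```typescript
// Hypothetical sketch: the actual retryOrQuit() in shared/ui.ts is not shown in the log.
type Confirm = (question: string) => boolean;

function retryOrQuit<T>(
  operation: () => T,
  confirm: Confirm,
  nonInteractive: boolean,
): T {
  while (true) {
    try {
      return operation();
    } catch (err) {
      // Non-interactive runs (SPAWN_NON_INTERACTIVE=1) rethrow immediately.
      if (nonInteractive) throw err;
      // Declining the prompt gives up and surfaces the last error.
      if (!confirm("Try again? (Y/n)")) throw err;
    }
  }
}

// Usage: an operation that fails twice, then succeeds on the third attempt.
let attempts = 0;
const result = retryOrQuit(
  () => {
    attempts += 1;
    if (attempts < 3) throw new Error("createServer failed");
    return "server-ready";
  },
  () => true, // stand-in for a user who always answers "Y"
  false,
);
console.log(result, attempts); // server-ready 3
```

Injecting the prompt as a callback keeps the loop testable without a TTY, which matters given that the same commit adds a PTY-driven harness for the interactive path.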
Ahmed Abushagur	5a23982513	fix: prevent grep pipefail from killing tarball release uploads (#2786) The old-asset cleanup pipeline `gh release view | grep | while` fails when grep finds no matches (exit 1) and pipefail is set. This kills the entire step before gh release upload runs. Fix: wrap grep in `{ grep ... || true; }` so no-match is not fatal. This caused all arm64 builds and some x86_64 builds to fail nightly. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:51:09 -07:00
Ahmed Abushagur	a023223a58	fix: correct jq cross-product syntax in packer workflow (#2784) The nested comprehension `[($agents[] | . as $a) | ...]` is invalid jq. Use `[$agents[] as $a | $clouds[] as $c | ...]` instead. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 22:08:26 -07:00
Ahmed Abushagur	7289f3ef36	feat(hetzner): add snapshot support + Packer image builds (#2774) CLI changes: - Add findSpawnSnapshot() to query Hetzner /images?type=snapshot API for pre-built spawn-{agent}-* images (matches by description prefix) - Add waitForSshOnly() for snapshot boots (skips cloud-init polling) - Update createServer() to accept optional snapshotId — boots from snapshot instead of ubuntu-24.04, skips cloud-init userdata - Wire up orchestrator with skipAgentInstall flag Packer changes: - Add packer/hetzner.pkr.hcl using hcloud plugin, mirroring the DO template (tier scripts, agent install, cleanup, manifest) - Unify packer-snapshots.yml to build both DO and Hetzner in a single workflow with cloud×agent matrix and per-cloud cleanup steps Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 16:46:48 -07:00
Ahmed Abushagur	34fc9b6d4d	fix: increase packer snapshot transfer timeout to 60m (#2648 ) * fix: increase packer snapshot transfer timeout to 60m The default 30m timeout is too short for transferring snapshots to distant DO regions (blr1, sgp1, syd1). This caused zeroclaw and kilocode builds to fail despite successful provisioning. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * revert: remove batch splitting from packer workflow DO droplet cap is no longer an issue — revert to single parallel build job for all agents. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 04:48:11 -04:00
Ahmed Abushagur	1d61c77d95	fix: batch packer snapshot builds to avoid DO droplet cap (#2642 ) Splits the 8 agents into 2 sequential batches of 4 so we stay under DigitalOcean's concurrent droplet creation limit. Batch 2 waits for batch 1 to finish before starting. Single-agent builds are unaffected. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: A <258483684+la14-1@users.noreply.github.com>	2026-03-14 18:58:07 -07:00
A	d8ab5c4724	fix: add junie to Docker build matrix in docker.yml (#2644 ) The junie.Dockerfile was added in PR #2601 but the docker.yml workflow matrix was not updated, so no Docker image for junie was ever being built. Add junie to the agent list so ghcr.io/openrouterteam/spawn-junie gets built alongside all other agents. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-14 21:38:57 -04:00
A	6081c0a17f	feat(qa): telegram soak test on digitalocean + fix bun -e (#2547 ) - soak.sh: SOAK_CLOUD env var makes cloud configurable (default: sprite) - qa.sh: load TELEGRAM_BOT_TOKEN, TELEGRAM_TEST_CHAT_ID, SOAK_CLOUD from /etc/spawn-qa-auth.env in soak mode - qa.yml: add weekly Monday 3am UTC scheduled soak trigger - fix: bun eval → bun -e across soak.sh, key-request.sh, github-auth.sh (bun eval is not a valid subcommand in bun 1.3.9) - fix: export _TOKEN via env prefix so process.env._TOKEN works in bun -e - docs: update shell-scripts.md rule to say bun -e (not bun eval) Verified: 3/4 Telegram tests pass in smoke test on DigitalOcean (120s wait) getMe ✓ sendMessage ✓ getWebhookInfo ✓; cron test needs full 55-min window. Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 19:45:18 -04:00
Ahmed Abushagur	330c10fcd2	feat: add Telegram soak test for OpenClaw (--soak mode) (#2492 ) Add a soak test that provisions OpenClaw on Sprite, waits 1 hour for stabilization, injects a Telegram bot token, and runs integration tests against the Telegram Bot API (getMe, sendMessage, getWebhookInfo). - New: sh/e2e/lib/soak.sh — soak test library with all Telegram-specific logic - Modified: sh/e2e/e2e.sh — add --soak flag to arg parser - Modified: qa.sh — add soak run mode (bypasses Claude, runs e2e.sh directly) - Modified: trigger-server.ts — add "soak" to VALID_REASONS - Modified: qa.yml — add soak to workflow_dispatch options Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: A <258483684+la14-1@users.noreply.github.com>	2026-03-11 05:51:53 -04:00
A	7af389387d	fix: eliminate release race condition causing 404 on cloud bundle downloads (#2475 ) The cli-release workflow was deleting releases before recreating them, leaving a window where users downloading cloud bundles (gcp.js, aws.js, etc.) would get a 404. This affected all clouds on every push to main. Switch to gh release upload --clobber which atomically replaces assets without removing the release, and only create releases if they don't already exist. Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-10 18:13:13 -07:00
Ahmed Abushagur	7e2f9f45fc	fix: use Google Chrome .deb for OpenClaw browser tool (#2368 ) * fix: use Google Chrome .deb instead of Playwright for OpenClaw browser Snap Chromium on Ubuntu 24.04 fails because AppArmor confinement blocks CDP control. OpenClaw's own docs recommend installing Google Chrome via .deb package which bypasses snap entirely. Also adds browser.noSandbox and browser.executablePath to the OpenClaw config so the browser tool works out of the box on Linux VMs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove unnecessary confirmation prompt when OAuth fails If OAuth didn't complete, the user obviously wants to paste a key. The "Paste your API key manually? (Y/n)" prompt was pointless friction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove unnecessary "Continue anyway?" credential confirmation If the user selected a cloud, they obviously want to continue. The warning + setup guidance is sufficient — no need to block on a confirm. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: move Chrome install to configure step so it runs after tarball The tarball path skips agent.install() entirely, so Chrome never got installed. Moving it to configure() (setupOpenclawConfig) ensures it always runs regardless of install method. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: bundle Google Chrome in openclaw tarball Add Chrome .deb install to openclaw's tarball build so it ships pre-installed. Capture /usr/bin/google-chrome and /opt/google/chrome/ in the tarball. Add dl.google.com to the workflow domain allowlist. The configure() step still has a fallback install with idempotency check (command -v google-chrome) for non-tarball installs. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use openclaw config set for browser setup + correct binary name - Use `google-chrome-stable` (actual .deb binary name) not `google-chrome` - Set browser config via `openclaw config set` CLI (the supported way) instead of writing JSON directly which wasn't being picked up - Remove browser section from JSON config to avoid conflicts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 01:52:08 -04:00
Ahmed Abushagur	7bebc6558f	feat: full marketplace compliance + automated Vendor API submission (#2295 ) Packer template: - Match official 90-cleanup.sh: remove SSH host keys, create revoked_keys, remove cloud-init instances, zero-fill free space, use --force-confold for upgrades, autoremove/autoclean - Add Packer manifest post-processor for snapshot ID extraction - Remove PACKER_LOG=1 (debug logging not needed in production) Workflow: - Add "Submit to DO Marketplace" step after successful build - Reads agent→app_id mapping from MARKETPLACE_APP_IDS secret (JSON) - Extracts snapshot ID from Packer manifest, PATCHes Vendor API - Gracefully handles 400 (app already pending review) - Skips silently if no MARKETPLACE_APP_IDS secret is configured Setup: add MARKETPLACE_APP_IDS secret as JSON, e.g.: {"claude":"60089fc6...", "codex":"60089fc7..."} App IDs come from the DO Vendor Portal after initial approval. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 16:40:04 -05:00
Ahmed Abushagur	d77a067aa4	fix: snapshot cleanup + claude install (name-prefix filter) (#2273) * fix: claude snapshot build — remove npm fallback from install command The native install (curl | bash) succeeds but exits non-zero due to a PATH warning. The || fallback then tries `npm install` which doesn't exist on the "minimal" tier → exit 127. Fix: replace npm fallback with binary existence check (same pattern as hermes agent). If install exits non-zero but ~/.local/bin/claude exists, the build succeeds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: snapshot cleanup and lookup — use name prefix instead of tags DO Packer builder `tags` only apply to the temporary build droplet, not the resulting snapshot image. Both the workflow cleanup step and the CLI's findSpawnSnapshot() were querying by `tag_name` which returned nothing — old snapshots piled up and the CLI couldn't find existing snapshots. Fix: filter by snapshot name prefix (`spawn-{agent}-`) instead of tags, in both the workflow and the CLI. Remove misleading `tags` from the Packer template. Add test cases for name-prefix filtering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-06 21:32:58 -08:00
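The name-prefix lookup this commit describes might look roughly like the following. The `Snapshot` shape, the `createdAt` field, and the function name are illustrative assumptions; the real `findSpawnSnapshot()` queries the DigitalOcean API, and its response handling is not shown in the log.

```typescript
// Hypothetical snapshot record; field names are assumptions for illustration.
interface Snapshot {
  id: string;
  name: string;
  createdAt: string; // ISO 8601 UTC timestamp, all in the same format
}

// Pick the newest snapshot whose name starts with `spawn-{agent}-`.
function findLatestByPrefix(
  snapshots: Snapshot[],
  agent: string,
): Snapshot | undefined {
  const prefix = `spawn-${agent}-`;
  return snapshots
    .filter((s) => s.name.startsWith(prefix)) // filter() copies, so the input is not reordered
    .sort((a, b) =>
      a.createdAt < b.createdAt ? 1 : a.createdAt > b.createdAt ? -1 : 0,
    )[0];
}

const sample: Snapshot[] = [
  { id: "1", name: "spawn-claude-20260301", createdAt: "2026-03-01T04:00:00Z" },
  { id: "2", name: "spawn-codex-20260304", createdAt: "2026-03-04T04:00:00Z" },
  { id: "3", name: "spawn-claude-20260305", createdAt: "2026-03-05T04:00:00Z" },
  { id: "4", name: "packer-1741150000", createdAt: "2026-03-05T05:00:00Z" },
];

console.log(findLatestByPrefix(sample, "claude")?.id); // 3
```

Matching on the name rather than on tags follows the commit's observation that builder `tags` never reach the resulting snapshot image.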
A	3a1de9d4cf	refactor: remove packages/shared, deduplicate with CLI shared (#2257 ) * refactor: remove packages/shared, deduplicate with packages/cli/src/shared packages/shared duplicated packages/cli/src/shared (parse.ts, result.ts, type-guards.ts) with the CLI never importing from the shared package. The only consumer was .claude/skills/setup-spa, which now imports directly from packages/cli/src/shared via relative paths. - Delete packages/shared entirely - Update setup-spa imports to use relative paths to CLI shared - Remove @openrouter/spawn-shared workspace dependency from setup-spa - Update CLAUDE.md and type-safety.md references Agent: complexity-hunter Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: remove packages/shared from lint workflow, fix import sorting The Biome Lint CI step referenced packages/shared/src/ which no longer exists after this PR removes the package. Also fix import ordering in setup-spa files to satisfy Biome's organizeImports rule. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: address Devin review — update stale packages/shared references - Update type-safety.md line 67: packages/shared/src/parse.ts → packages/cli/src/shared/parse.ts - Update install.ps1 sparse-checkout: remove packages/shared reference Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-06 21:58:42 -05:00
Ahmed Abushagur	cefcd56327	feat: restore Packer DO snapshot pipeline for fast agent boot (#2262 ) Restores the nightly Packer snapshot build pipeline (reverted in #2205) that pre-bakes agent images as DigitalOcean snapshots. When a snapshot exists on the user's account, droplet boot skips cloud-init and tarball install entirely — cutting provisioning from ~10min to ~2min. - Add `packer/digitalocean.pkr.hcl` HCL2 template with multi-region distribution, apt-lock wait, and snapshot marker - Add `.github/workflows/packer-snapshots.yml` nightly build with matrix strategy, auto-cleanup of old snapshots, and injection-safe env var handling - Add `findSpawnSnapshot()` to query DO API for pre-built snapshots - Add `waitForSshOnly()` for snapshot boots (skip cloud-init wait) - Modify `createServer()` to accept optional `snapshotId` param - Wire snapshot detection in DO `main.ts` orchestrator - Add `skipAgentInstall` to `CloudOrchestrator` interface to skip tarball + install steps when booting from snapshot - Add 5 unit tests for snapshot lookup (happy path, empty, error, invalid ID, network failure) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-06 16:32:05 -08:00
Ahmed Abushagur	141254c4e1	feat: ARM tarball builds + arch-aware download (#2248 ) * feat: ARM tarball builds + arch-aware download - Add ARM64 matrix entries for native binary agents (zeroclaw, opencode, hermes, claude) in agent-tarballs.yml workflow - Update agent-tarball.ts to detect remote VM arch via uname -m and download the correct tarball (x86_64 or arm64) - Change release strategy to support multiple arch assets per tag - Document ARM build requirements in discovery.md for future agents - Bump CLI version to 0.15.2 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use sudo for tarball extraction on non-root SSH clouds On AWS Lightsail, SSH connects as 'ubuntu' (not root), but tarballs extract to /root/. Without sudo, tar fails with "Permission denied". Conditionally use sudo when not running as root (id -u != 0). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-06 17:10:33 -05:00
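The arch-aware download amounts to mapping the remote `uname -m` output onto one of the two tarball architectures. Here is a minimal sketch; the function names and the asset naming scheme are assumptions, since the log names neither the actual release assets nor the helper inside agent-tarball.ts.

```typescript
type TarballArch = "x86_64" | "arm64";

// Map `uname -m` output from the remote VM to a tarball architecture.
// Returns null for anything unrecognized so the caller can fall back to a live install.
function tarballArchFromUname(unameOutput: string): TarballArch | null {
  const machine = unameOutput.trim();
  if (machine === "x86_64" || machine === "amd64") return "x86_64";
  if (machine === "aarch64" || machine === "arm64") return "arm64";
  return null;
}

// Hypothetical asset name; the real release asset naming is not given in the log.
function tarballAssetName(agent: string, arch: TarballArch): string {
  return `${agent}-${arch}.tar.gz`;
}

const arch = tarballArchFromUname("aarch64\n");
console.log(arch === null ? "live install" : tarballAssetName("claude", arch));
// claude-arm64.tar.gz
```

Linux reports 64-bit ARM as `aarch64`, so normalizing it to `arm64` is what lets one lookup serve both spellings.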
Ahmed Abushagur	ba9690ea23	fix: tarball workflow failures (root ownership, swapfile, hermes TTY) (#2240 ) - Use sudo mv + chown for tarball in release step (root-owned from capture) - Skip swapfile creation if /swapfile already exists (GitHub Actions runners) - Tolerate hermes setup wizard failure when /dev/tty unavailable in CI Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-06 05:48:46 -05:00
Ahmed Abushagur	8072c084c2	feat: pre-built agent tarballs for fast install (#2232 ) * feat: pre-built agent tarballs on GitHub Releases for fast install Adds a nightly GitHub Actions workflow that builds and uploads agent tarballs to rolling GitHub Releases. During provisioning, the CLI now attempts to download and extract a tarball before falling back to live install. Priority chain: snapshot > tarball > live install. - New workflow: .github/workflows/agent-tarballs.yml - New capture script: packer/scripts/capture-agent.sh - New module: packages/cli/src/shared/agent-tarball.ts - Orchestrate tries tarball first on non-local clouds - Skip tarball when using DO snapshot (skipTarball flag) - Tests for tarball install + orchestration integration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use global.fetch mock pattern and address security review - Use `global.fetch = mock(...)` instead of `spyOn(globalThis, "fetch")` to match codebase convention and fix CI mock interception - Add URL validation regex to reject shell metacharacters (CRITICAL) - Add agent name validation in workflow input (MEDIUM) - Add `jq has()` check before executing install commands (CRITICAL) - Use `tar -T` instead of unquoted word-splitting in capture-agent.sh (MEDIUM) - Resolve merge conflicts with upstream/main (keep Docker fields, adapt to simplified DO flow, bump version to 0.15.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use globalThis.fetch for testability in CI Bun's native fetch binding doesn't go through global.fetch property lookup, so global.fetch = mock(...) doesn't intercept it. Using globalThis.fetch explicitly ensures the mock interception works. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add missing packer dependencies and harden install command safety - Add packer/agents.json (agent tier + install command definitions) - Add packer/scripts/tier-{minimal,node,bun,full}.sh (dependency scripts) - Add basic command safety check rejecting suspicious patterns - Document packer/agents.json as a trust boundary requiring PR review Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tarballs): fix npm prefix mismatch, add apt-get update, cleanup - Add apt-get update -y before apt-get install in all tier scripts - Add --prefix ~/.npm-global to npm install commands in agents.json so installed packages land where capture-agent.sh expects them - Rename misleading MARKER_DIR → MARKER_FILE in capture-agent.sh - Remove stale comment referencing packer snapshots in workflow Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tarballs): detect empty agent installs in capture script The "no files found" check was dead code — the marker file is always created before filtering, so FILTERED_FILE always had at least one entry. Now we count non-marker entries to catch cases where the agent install silently fails and no actual files are on disk. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tarballs): use bare fetch() for Bun mock compatibility in CI In Bun, global.fetch = mock(...) overrides bare fetch() calls but NOT globalThis.fetch() calls. Every other source file in the codebase uses bare fetch() and their mocks work fine in CI. Switch to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tarballs): use dependency injection for fetch in tests Bun's global.fetch mock doesn't reliably intercept bare fetch() calls across all Bun versions in CI. Instead of fighting the runtime, accept an optional fetchFn parameter (defaults to fetch) and pass mock fetch directly in tests. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tarballs): bypass mock.module bleed in agent-tarball tests orchestrate.test.ts uses mock.module("../shared/agent-tarball", ...) which is process-global in Bun and bleeds into agent-tarball.test.ts. Import via URL (import.meta.url resolution) to bypass the specifier- based mock matching and get the real module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tarballs): eliminate mock.module bleed between test files Bun's mock.module is process-global — orchestrate.test.ts mocking agent-tarball poisoned agent-tarball.test.ts (the mock function ignored the fetchFn parameter and always returned false). Fix: make tryTarballInstall injectable via OrchestrationOptions. orchestrate.test.ts passes the mock directly via options instead of using mock.module. agent-tarball.test.ts imports the real module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tests): mock Bun.which in credential priority tests Tests assumed no cloud CLIs were installed, but machines with hcloud/ doctl would get "CLI installed" hint overrides, failing the assertion. Spy on Bun.which to return null so tests are environment-independent. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: fix import ordering after rebase Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * security: add curl domain allowlist and expand command blocklist Addresses security review findings: - Add domain allowlist for curl/wget targets (claude.ai, opencode.ai, raw.githubusercontent.com, registry.npmjs.org, crates.io, github.com) - Expand suspicious command blocklist (python -c, perl -e, ruby -e, dd, /dev/) - Document 4-layer security model in workflow comments Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * security: add rm -rf to command blocklist Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-06 04:49:39 -05:00
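Two of the security measures in this commit, rejecting shell metacharacters in URLs and restricting download targets to an allowlist of domains, can be combined in one small check. This is a sketch under assumptions: the host list is copied from the commit message, but the metacharacter pattern and the function name are illustrative, and the real checks live in separate places (the TypeScript module and the workflow).

```typescript
// Hosts listed in the commit message's curl/wget allowlist.
const ALLOWED_HOSTS = new Set([
  "claude.ai",
  "opencode.ai",
  "raw.githubusercontent.com",
  "registry.npmjs.org",
  "crates.io",
  "github.com",
]);

// Illustrative, deliberately conservative pattern (not the project's actual regex):
// whitespace and characters a shell would interpret.
const SHELL_METACHARACTERS = /[\s;&|`$(){}<>\\'"]/;

function isAllowedDownloadUrl(raw: string): boolean {
  if (SHELL_METACHARACTERS.test(raw)) return false;
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not parseable as an absolute URL
  }
  // Exact hostname match: "github.com.evil.example" does not pass.
  return url.protocol === "https:" && ALLOWED_HOSTS.has(url.hostname);
}

console.log(isAllowedDownloadUrl("https://claude.ai/install.sh")); // true
console.log(isAllowedDownloadUrl("https://github.com/a/b; rm -rf /")); // false
```

Checking the parsed hostname rather than a substring of the URL is what keeps look-alike domains out.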
Ahmed Abushagur	77c3e34803	feat(docker): replace Packer snapshots with Docker-based agent delivery (#2206 ) * feat(docker): replace Packer snapshots with Docker-based agent delivery Docker images on GHCR are public and cross-account, unlike DO snapshots which are private/account-scoped. Cloud-init installs Docker + pulls the agent image during boot. The install step extracts pre-built binaries via `docker cp` and falls back to normal install if unavailable. - Add Dockerfiles for all 7 agents (claude, codex, openclaw, opencode, kilocode, zeroclaw, hermes) - Convert docker.yml to matrix build for all agents - Add tryInstallFromDocker() shared helper with Docker-first install - Add Docker pull to DigitalOcean cloud-init userdata - Remove Packer snapshot pipeline, lookup, and SSH-only wait - Remove packer/ directory (HCL templates, tier scripts, agents.json) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * security: address review findings in docker agent delivery - Add agentName validation regex (/^[a-z0-9-]+$/) in digitalocean.ts before interpolation into cloud-init script - Quote dockerImage variable in all docker command strings in agent-setup.ts to prevent command injection - Restrict docker cp to specific known directories (.claude, .bun, .local, .npm, .cargo, .opencode) instead of blanket /root/. Agent: pr-maintainer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: B <6723574+louisgv@users.noreply.github.com>	2026-03-05 11:23:56 -05:00
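The agent-name validation and quoting described in the security follow-up can be sketched as below. The regex is the one quoted in the commit; the function names are assumptions, and the image path follows the `ghcr.io/openrouterteam/spawn-{agent}` pattern seen elsewhere in this log.

```typescript
// Pattern quoted in the commit message.
const AGENT_NAME_PATTERN = /^[a-z0-9-]+$/;

// Hypothetical helper: throw before the name reaches any shell or cloud-init string.
function assertValidAgentName(agentName: string): string {
  if (!AGENT_NAME_PATTERN.test(agentName)) {
    throw new Error(`invalid agent name: ${JSON.stringify(agentName)}`);
  }
  return agentName;
}

// Build a docker command with the image reference quoted.
// Single quotes are safe here because a validated name cannot contain a quote.
function dockerPullCommand(agentName: string): string {
  const image = `ghcr.io/openrouterteam/spawn-${assertValidAgentName(agentName)}`;
  return `docker pull '${image}'`;
}

console.log(dockerPullCommand("claude"));
// docker pull 'ghcr.io/openrouterteam/spawn-claude'
```

Validation and quoting overlap on purpose: either one alone would stop the crafted-name case, and keeping both means a later change to one does not reopen the hole.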
Ahmed Abushagur	07c2c08e3a	revert: remove Packer snapshot pipeline (#2205 ) DO snapshots are private and account-scoped — users on different accounts cannot see snapshots built by the CI token. Docker images are the better approach for cross-account pre-built agents. Removes: packer/, packer-snapshots workflow, snapshot lookup code, and snapshot test. Reverts DO CLI to plain cloud-init flow. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 02:48:52 -05:00
Ahmed Abushagur	96ffb3e201	fix(packer): pass var file explicitly to packer build (#2203 ) Packer wasn't auto-loading build.auto.pkrvars.json, causing "Unset variable" errors. Pass it explicitly with -var-file. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 01:16:41 -05:00
Ahmed Abushagur	ed98a59318	feat(digitalocean): Packer nightly snapshot pipeline for fast boot (#2198) * feat(digitalocean): Packer nightly snapshot pipeline for fast boot Add pre-built Packer snapshots for DigitalOcean droplets. Instead of 10-20 min cloud-init + agent install on every boot, snapshot-based droplets boot in ~2-3 min (SSH only, agent pre-installed). - Packer HCL2 template with parametrized agent/tier builds - Agent build matrix (packer/agents.json) for all 7 agents - Tier scripts mirroring cloud-init.ts package tiers - Nightly GitHub Actions workflow (4 AM UTC, max-parallel: 3) - Automatic cleanup: keeps only latest snapshot per agent - CLI: findSpawnSnapshot() looks up pre-built images via DO API - CLI: waitForSshOnly() skips cloud-init when using snapshots - CLI: createServer() accepts optional snapshotId, skips user_data - CLI: main.ts routes to fast path when snapshot detected - Tests for findSpawnSnapshot() (5 cases, all passing) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(packer): use var-file for install_commands to avoid shell quoting issues The previous approach passed install_commands as `-var` inline, but GitHub Actions expands `${{ }}` before shell evaluation — JSON arrays with `|`, `&&`, and `"` characters break shell quoting. Fix: generate a `.auto.pkrvars.json` file (auto-loaded by Packer) using jq with --argjson for safe JSON handling. Also route all `${{ inputs }}` and `${{ matrix }}` values through env vars to prevent script injection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 20:47:46 -08:00
L	61bcedc0eb	feat: migrate to openrouter.ai/labs/spawn CDN + release artifact version checks (#2178 ) * feat: migrate shell script URLs to openrouter.ai/labs/spawn CDN Users on older CLI versions can't auto-update because the repo was restructured (cli/ → packages/cli/), so old version-check URLs 404. This decouples the CLI from the repo's internal directory structure: - Shell script URLs (install, agent scripts, github-auth) now use openrouter.ai/labs/spawn/* as primary with GitHub raw as fallback - Version checks now use GitHub release artifact (cli-latest/version) as primary — a static URL that never changes regardless of repo layout - CI workflow updated to publish a `version` file alongside cli.js - Remove GITHUB_RAW_URL_PATTERN validation (no longer needed since install URL is now a hardcoded CDN string, not interpolated) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: fix biome formatting in update-check test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: CLAUDE.md says biome lint but should say biome check biome lint only checks lint rules, not formatting. biome check does both. The hooks and CI already run biome check — the docs were out of sync. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(hooks): PostToolUse hook wasn't running biome on CLI source files Two bugs in validate-file.ts: 1. Config search only checked 1-2 levels up from the edited file, but biome.json is at packages/cli/ — 3 levels above src/__tests__/*.ts. Fix: walk up directories until biome.json is found (or hit root). 2. Ran `biome format` (prints formatted output, always exits 0) instead of `biome format --check` (exits non-zero if file needs formatting). Fix: use `biome check` which does lint + format check in one pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-03 23:34:58 -08:00
A	446923c447	refactor: extract inline hook commands to TypeScript scripts (#2174 ) * refactor: extract inline hook commands to TypeScript scripts in .claude/scripts/ Replace long inline `bash -c '...'` one-liners in .claude/settings.json with standalone TypeScript scripts that are easier to read, debug, and maintain: - enforce-worktree.ts: PreToolUse hook ensuring edits happen in worktrees - validate-file.ts: PostToolUse hook for .sh/.ts file validation - pre-merge-check.ts: PreToolUse hook running biome + tests before merge Add .claude/scripts as a bun workspace package (@spawn/hooks). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: replace manual typeguards with valibot schemas in hook scripts - Extract shared schemas (FilePathInput, CommandInput, parseStdin) to schemas.ts - Replace inline multi-level typeof/in checks with v.safeParse() calls - Add valibot dependency to @spawn/hooks package - Add CLAUDE.md rule: always prefer valibot over manual typeguards, share schemas Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: split CLAUDE.md into modular .claude/rules/ files Split the 437-line monolithic CLAUDE.md into a lean 89-line project overview plus 9 focused rules files in .claude/rules/ (auto-loaded by Claude Code): - culture.md — embrace bold changes, parallelize, verify exhaustively - shell-scripts.md — curl\|bash compat, macOS bash 3.x, ESM only, bun not python - type-safety.md — no `as` assertions, ALWAYS use valibot (never manual typeguards) - testing.md — bun:test only, no vitest, no subprocess spawning - git-workflow.md — worktree-first mandatory workflow - autonomous-loops.md — discovery/refactor service architecture - discovery.md — how to fill matrix gaps, add clouds/agents - documentation.md — never commit docs, use .docs/ - cli-version.md — bump version on every CLI change The type-safety rule now explicitly mandates valibot schemas over manual typeguard chains in all cases beyond 
single-primitive narrowing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(lint): run biome check across all packages in CI The lint workflow only checked packages/cli/src/. Now it checks all TypeScript locations in a single biome check command: - packages/cli/src/ (with GritQL plugins) - packages/shared/src/ (new biome.json) - .claude/scripts/ (new biome.json) - .claude/skills/setup-spa/ Fixed all pre-existing lint/format errors: - node: protocol on all Node.js built-in imports in hook scripts - useBlockStatements in packages/shared/src/type-guards.ts - expand formatting in .claude/skills/setup-spa/main.ts and spa.test.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-03 23:05:41 -08:00
A	d04096a15b	feat!: remove Fly.io cloud provider support (#1979 ) * feat!: remove Fly.io cloud provider support Drop Fly.io as a supported cloud provider. Sprite (which uses Fly.io infrastructure internally) is retained. - Delete packages/cli/src/fly/ module, sh/fly/ scripts, fixtures/fly/ - Remove fly cloud entry and 6 fly matrix entries from manifest.json - Remove fly imports, destroy cases, and connection handlers from commands.ts - Remove fly-ssh sentinel from security.ts - Port E2E test suite from Fly.io to AWS Lightsail (fly-e2e.sh → aws-e2e.sh) - Update README (7 clouds, 42 combinations), CLAUDE.md, and skill prompts - Clean up fly references in build config, gitignore, icon sources - Bump CLI version to 0.11.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: restore Docker image build under sh/docker/ Move openclaw Dockerfile from sh/fly/docker/ to sh/docker/ and rename workflow from fly-docker.yml to docker.yml with updated paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix extra blank lines in commands.ts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: spawn-bot <spawn-bot@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: L <6723574+louisgv@users.noreply.github.com>	2026-02-27 00:06:32 -05:00
A	9e54f0cf57	ci: add Mock Tests job to satisfy required status check (#1904 ) * ci: add Mock Tests job to satisfy required status check Split the unit-tests job into mock-tests (runs bun test) and unit-tests (verifies cloud bundles build). The repo ruleset requires "Mock Tests", "Unit Tests", and "Biome Lint" checks — the missing "Mock Tests" job was blocking all PR merges. Fixes #1901 Agent: issue-fixer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * style: fix pre-existing Biome format issues in 9 files Auto-applied Biome formatter to src/ to resolve failing "Biome Lint" required status check. No logic changes — formatting only. Agent: issue-fixer Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: B <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-25 00:54:33 -05:00
A	b2bddc4ba5	ci: bump QA cron from daily to every 4 hours (#1895 ) Co-authored-by: spawn-qa-bot <qa@openrouter.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 16:46:55 -08:00
A	98a0d0f68f	feat(qa): add e2e-tester as subagent in scheduled quality sweep (#1894 ) E2E tests now run as a 4th teammate alongside test-runner, dedup-scanner, and code-quality-reviewer during schedule-triggered QA cycles. The standalone e2e mode is preserved for on-demand use. - Add e2e-tester teammate to qa-quality-prompt.md - Increase quality mode timeout from 35 to 40 min - Add "e2e" to trigger-server valid reasons - Re-enable daily schedule in qa.yml, default to "schedule" Co-authored-by: spawn-bot <spawn-bot@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 19:34:35 -05:00
A	58c91571f1	fix: make macOS compat linter blocking and add 6 missing rules (#1867 ) The linter was running in CI with --warn-only, meaning it never blocked anything — effectively vaporware. This removes --warn-only to make it a real gate. Also adds rules for bash 4.0+ features that were documented in CLAUDE.md but not enforced: - MC014: readarray/mapfile (bash 4.0+) - MC015: coproc (bash 4.0+) - MC016: &>> redirect (bash 4.0+) - MC017: relative source paths (breaks curl\|bash) - MC018: wait -n (bash 4.3+) - MC019: declare -g (bash 4.2+) Excludes .claude/worktrees/ from scanning (temp copies, not committed code). Co-authored-by: spawn-bot <spawn-bot@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 03:50:12 -05:00
A	65f6f1be32	feat: Bun workspace monorepo — packages/cli + packages/shared (#1853 ) Restructure the repo as a Bun workspace monorepo: - Move cli/ → packages/cli/ - Create packages/shared/ (@openrouter/spawn-shared) with type-guards and parse utilities - Add root package.json with workspace configuration - Update all CLI imports to use @openrouter/spawn-shared - Deduplicate toRecord/toObjectArray helpers from 4 cloud modules - Update SPA (slack-bot) to use shared package instead of local toObj() - Update 48 agent shell scripts for new packages/cli/ path - Update install.sh, install.ps1, e2e, and test scripts - Update all GitHub workflows, .gitignore, pre-commit hooks - Update CLAUDE.md, README.md, and skill prompt references - Pin all dependency versions (no ^ ranges) - Bump CLI version 0.9.1 → 0.10.0 All 1908 tests pass. Lint clean. All 8 cloud bundles build. Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-23 22:07:05 -08:00
A	b84adfb74e	refactor: move all shell scripts to /sh directory (#1843 ) Reorganizes the project so all shell scripts live under a dedicated /sh directory, enabling the OpenRouter rewrite URL to point at /sh/ instead of the repository root. Moves: - cli/install.sh → sh/cli/install.sh - shared/.sh → sh/shared/.sh - {cloud}/{agent}.sh → sh/{cloud}/{agent}.sh (48 scripts) - {cloud}/README.md → sh/{cloud}/README.md - e2e/.sh → sh/e2e/.sh - test/macos-compat.sh → sh/test/macos-compat.sh - test/fixtures/*/.sh → sh/test/fixtures/*/.sh Updates all references: - RAW_BASE path construction in commands.ts, update-check.ts - GitHub auth URL in agent-setup.ts - Self-referencing URLs in install.sh, github-auth.sh - CI workflow paths in lint.yml, cli-release.yml - Test file paths in install-script-validation, manifest-integrity - Documentation in README.md, cli/README.md, CLAUDE.md - QA scripts in .claude/skills/ Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-23 21:14:54 -08:00
A	fccb73d147	feat: add Fly.io E2E test suite and QA e2e mode (#1823 ) - Add e2e/ directory with fly-e2e.sh orchestrator and lib/ helpers (provision, verify, teardown, cleanup) that provision real Fly.io VMs, verify agent installation, and tear everything down - Fix openclaw E2E failure by setting MODEL_ID=openrouter/auto to bypass interactive model selection prompt in headless mode - Add e2e mode to qa.sh (reason=e2e) that launches a Claude agent to run the E2E suite and investigate/fix any failures - Update qa.yml with reason dropdown (e2e/schedule/fixtures), kept disabled Co-authored-by: spawn-bot <spawn-bot@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 19:31:44 -05:00
A	aa88e70488	fix: add concurrency guard and workflow_dispatch to CLI release (#1812 ) The race condition: two PRs merged 3 seconds apart both triggered the CLI Release workflow. The second run (v0.7.12) finished last and overwrote the release with a stale binary, even though the repo HEAD was at v0.8.0. - Add concurrency group so concurrent releases cancel the older one - Add workflow_dispatch trigger for manual re-runs Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-23 13:52:17 -05:00
A	a26d27f139	style: enforce biome format across codebase, add CI check (#1794 ) Run `biome format --write` on all 98 source files (38 needed fixes). The main change: object literals and long argument lists are now expanded onto separate lines per Biome's `"expand": "always"` setting, making code much easier to scan on narrow screens. Add `biome format` check step to CI lint workflow so formatting regressions are caught on every PR. Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-22 23:32:12 -08:00
A	ec210e37af	fix: Result monad for retry logic — prevent duplicate server creation (#1771 ) * fix: Result monad for retry logic — prevent duplicate server creation SSH exit 255 after an interactive session caused runWithRetries to retry the entire bash script, creating duplicate servers. The old withRetry also blindly retried all errors including timeouts where the remote command may have already completed. Introduces a Result<T> monad (Ok/Err) so callers explicitly signal whether a failure is retryable (return Err) or fatal (throw). Adds wrapSshCall() that classifies SSH errors: transient connection failures are retryable, timeouts are not. Removes retry loop from the top-level script runner entirely since it spans server creation + interactive session. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: mandate draft-PR-first workflow for all changes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add biome lint to CI and pre-commit hook, fix lint violations - Add Biome lint job to .github/workflows/lint.yml - Add TypeScript lint check to .githooks/pre-commit - Fix useBlockStatements violations in ui.ts and tests - Add biome lint to CLAUDE.md "After Each Change" checklist Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: rename Result.value to Result.data Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: clean up stale pre-commit hook - Remove dead check for deleted functions (write_oauth_response_file, create_oauth_response_html) — they no longer exist in the codebase - Fix early exit skipping Biome lint when no .sh files are staged - Replace echo -e with printf (the hook was using the pattern it bans) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve biome lint errors blocking CI - Fix useImportType: import { type Result } → import type { Result } - Fix noUnusedImports: remove unused KNOWN_FLAGS import - Fix 
noUnusedTemplateLiteral: template literal → string literal Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-22 20:39:42 -05:00
A	60986e5a05	refactor: remove shared/common.sh and 27 subprocess-heavy test files (#1728 ) shared/common.sh (3852 lines) was dead code — the entire architecture was rewritten to TypeScript in cli/src/. No agent scripts source it anymore. The only consumer was github-auth.sh which just needed 4 log functions (now inlined). Remove 27 test files that spawned ~800+ real bash/bun subprocesses per run (the root cause of slow bun test). Every shared-common-*.test.ts file forked a real bash shell per test case to source shared/common.sh. CLI subprocess tests spawned `bun run index.ts` per assertion. These were integration tests, not unit tests. Also removes: - mock-tests CI job from test.yml (ran test/mock.sh which opens browser) - Stale plan files referencing deleted infrastructure - All CLAUDE.md/README.md references to the old lib/common.sh pattern Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-22 11:32:27 -08:00
A	33bd3e615c	chore: disable QA workflow schedule until VM is fixed (#1722 ) Keep workflow_dispatch for manual testing. Re-enable cron when the QA VM is back online. Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-22 11:06:50 -08:00
A	0f4df7be71	feat: pre-built Docker image for OpenClaw on Fly.io (#1686 ) Eliminates the slow waitForCloudInit() + bun install phase by booting a pre-built image with Node.js, bun, and openclaw already installed. The image is rebuilt daily via GitHub Actions to pick up new releases. Other agents are unaffected — they still use ubuntu:24.04 + cloud-init. Co-authored-by: spawn-bot <spawn-bot@openrouter.ai> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 02:50:46 -05:00
A	262d081756	refactor: move fly TS into cli/src/fly/, add build-clouds.sh (#1604 ) Move all fly TypeScript files from fly/lib/.ts and fly/main.ts into cli/src/fly/. This gives them access to cli/node_modules (@clack/prompts), biome linting, and the existing bun:test infrastructure — no symlinks or NODE_PATH hacks needed. The org picker now uses @clack/prompts select() directly (static import, bundled at build time). New: cli/build-clouds.sh — auto-discovers cli/src//main.ts and bundles each into {cloud}.js. Scalable to future cloud TS migrations: bash cli/build-clouds.sh # build all bash cli/build-clouds.sh fly # build one Shims now check for cli/src/fly/main.ts (local) or download fly.js from GitHub releases (remote curl\|bash). Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-21 12:34:09 -08:00
Ahmed Abushagur	f2795a6d84	fix: Node.js v22 upgrade, aider uv install, SSH & cloud reliability (#1440 ) * fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds aider-chat on Python 3.13 fails with `ImportError: cannot import name '_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved — those releases have no Python 3.13 binary wheels, so the C extension is missing at runtime. Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>` and single quotes get mangled by `printf '%q'` in run_server before the command reaches the remote machine) with `--upgrade`, which forces all transitive deps including Pillow to their latest compatible versions. Also adds a plain-text echo before the install so users see progress instead of a silent hang during the 2-4 minute install. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: update aider/gptme/interpreter assertions from pip to uv The install method for aider, gptme, and open-interpreter was changed from pip to `uv tool install` across all clouds. The mock test assertions still checked for the old `pip.install.` patterns, causing 9 failures (3 agents × 3 clouds). Update patterns to match the actual `uv tool install` commands now used in all cloud scripts. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger test run for uv assertion fix * fix: prevent SSH hangs, restore stderr, fix command escaping across clouds - Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH stdin theft causing sequential install/verify/configure steps to hang - Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default SSH_OPTS so long-running installs don't silently drop on flaky networks - Remove 2>/dev/null from Fly.io run_server so remote command errors are no longer silently swallowed (--quiet flag still suppresses flyctl noise) - Fix Fly.io printf '%q' double-quoting: remove extra quotes around $escaped_cmd that prevented the remote shell from consuming escapes, breaking && \|\| \| operators in commands - Remove broken printf '%q' from Daytona run_server and interactive_session where it escaped shell operators into literal characters since daytona exec has no intermediate shell layer - Pin aider to --python 3.12 instead of --with audioop-lts across all clouds Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add --pty to fly ssh console for interactive sessions fly ssh console -C does not allocate a pseudo-terminal by default, causing interactive TUI agents (aider, claude) to fail with "Input is not a terminal (fd=0)" or completely unresponsive input. Adding --pty forces PTY allocation, matching how other clouds handle interactive sessions (SSH uses -t, Sprite uses -tty). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prepend ~/.local/bin to PATH in ssh_run_server After uv installs to ~/.local/bin, the current shell session doesn't have it in PATH, causing "uv: command not found" on DigitalOcean and all other SSH-based clouds (Hetzner, AWS, GCP, OVH). Fly.io's run_server already prepends this PATH — now the shared ssh_run_server does the same, fixing all SSH-based clouds at once. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add Node.js to cloud-init for all cloud providers npm-based agents (codex, kilocode, etc.) fail with "npm: command not found" because Node.js isn't installed during cloud-init. Fly.io was the only provider installing Node.js (in wait_for_cloud_init). Now all cloud-init scripts install Node.js v22 LTS from nodesource, matching Fly.io's setup. Also adds ~/.local/bin to PATH in AWS and GCP cloud-init (was already in shared/DigitalOcean/Hetzner). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use apt packages for nodejs/npm instead of nodesource The nodesource setup script (setup_22.x) runs its own apt-get update and repository configuration, nearly doubling cloud-init time and causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm in its default repos — just add them to the packages list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add timeouts and better error handling to Daytona CLI commands Daytona CLI commands (login, list, create) can hang indefinitely when the API is slow or unreachable. This causes: - "Failed to create sandbox: timeout" with no recovery - Token validation timeouts misreported as "invalid token" - Users re-entering valid tokens that also timeout Fixes: - Wrap all daytona CLI calls with timeout (30s for auth, 120s for create) - Detect timeout errors separately from auth errors - Show actionable "try again / check status" messages for timeouts - Add nodejs/npm to Daytona wait_for_cloud_init Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: set DAYTONA_API_URL to Daytona Cloud by default The Daytona CLI may default to connecting to a local self-hosted server instead of Daytona Cloud. Without DAYTONA_API_URL set to https://app.daytona.io/api, every CLI command (login, list, create) hangs trying to reach a non-existent local server and times out. 
The SDK documents this as the default, but the CLI doesn't always pick it up — now we export it explicitly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing n installs Node.js v22 to /usr/local/bin/node but apt's v18 at /usr/bin/node can shadow it in non-interactive SSH sessions. After n 22, symlink the new binaries over the apt ones so v22 is always resolved. Also fix hcloud CLI token extraction for new TOML format. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address security review, add curl timeouts to trigger workflows - Fix ssh_run_server command injection concern: use single-quoted path_prefix so $HOME/$PATH expand remotely, not locally - Add --connect-timeout 15 --max-time 30 to trigger workflows to prevent 5-min hangs when server streams responses - Handle 409 (dedup) as success — expected when cron fires every 15min but cycles take 35min - Reduce workflow timeout-minutes from 5 to 2 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-18 06:54:07 -05:00
Ahmed Abushagur	22b6a402f4	feat: E2E test harness, QA pipeline integration, macOS compat linter (#1425 ) * feat: add QA upgrade — macOS compat linter, per-agent mock assertions Layer 1: macOS compat linter (test/macos-compat.sh) - 12 rules (MC001–MC012) catching bash 3.2 incompatibilities - Detects: base64 -w0 file args, non-portable echo flags, source <(), ((var++)), read -d, nounset flag, sed -i, date %N, local -n, declare -A, ${var,,}, and \|& - Added to CI lint.yml in warn-only mode for burn-in - Integrated as Phase 0.5 in qa-dry-run.sh Layer 2: Per-agent mock assertions - test/fixtures/_shared_agent_assertions.sh with install checks for all 15 agents (claude, openclaw, aider, goose, etc.) - Integrated into test/mock.sh via _run_agent_assertions() Also includes branch fixes: - Fix base64 -w0 to use stdin redirect (aws, daytona, fly) - Fix fly/openclaw to use npm install instead of broken curl\|bash Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add E2E test harness and integrate into QA pipeline Add test/e2e.sh — a full E2E test harness that provisions real servers, installs agents, and verifies setup across all clouds. 
Features: - Smoke test (one canary agent per cloud) and full matrix modes - Credential auto-detection for 8 clouds - Per-cloud preflight validation (sequential) then parallel agent tests - Stale server cleanup, timing history, cross-cloud comparison - Auto-fix and optimization phases via Claude agents - macOS bash 3.2 compatible Integrate E2E as Phase 5 in both qa-cycle.sh and qa-dry-run.sh: - Runs after mock tests pass, gated on cloud credentials - Phase 5b auto-fixes failures using per-agent worktree branches - Parses results and includes in QA summary Also fixes: - shared/common.sh: honour SPAWN_NON_INTERACTIVE=1 in safe_read() - aws/lib/common.sh: fix SSH key import (use cat instead of base64, handle race condition on concurrent imports) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 20:41:07 -05:00
L	6e13256d96	refactor: simplify claude launch — no streaming, no output monitoring (#1412 ) Replace the complex claude launch pattern (subshell + PID file + tee pipe + stream-json + 50-line watchdog monitoring log file growth + session-end detection) with a simple direct launch: claude -p "..." >> "${LOG_FILE}" 2>&1 & The watchdog is now just a wall-clock timeout. The idle-output detection, stream-json result parsing, and tee piping are all removed. Also remove GitHub Actions concurrency groups — the trigger server already handles dedup (409 for same issue, 409 for same reason), making the GH Actions concurrency groups redundant queuing. Changes: - refactor.sh: simple launch + wall-clock-only watchdog - security.sh: same simplification - discovery.sh: same (refactored _kill_claude_process and _run_watchdog_loop to simpler signatures) - All 4 workflows: remove concurrency groups Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-17 09:02:47 -08:00
L	f3cfe890f7	refactor: simplify trigger server to fire-and-forget + fix monitoring loop prompts (#1384 ) The trigger server streamed script stdout back to GitHub Actions via a long-lived HTTP response, requiring --http1.1, heartbeat injection, server.timeout(req, 0), createEnqueuer, drainStreamOutput, and 90-min GH Actions timeouts. In practice GitHub Actions is just a dumb trigger — the real state lives on the VM (log files, journalctl). Simplify to fire-and-forget: spawn script, return 200 JSON immediately. Also fix the refactor and discovery team lead monitoring loops. The prompts buried the loop in a single compressed line that the model ignored (doing Bash("sleep 10") repeatedly without calling TaskList). Replace with a dedicated "Monitor Loop (CRITICAL)" section with numbered steps, matching the security.sh pattern that actually works. Changes: - trigger-server.ts: remove ~150 lines of streaming code (createEnqueuer, drainStreamOutput, startStreamingRun, heartbeat, ReadableStream), replace with startFireAndForgetRun (stdout: "inherit", immediate JSON) - All 4 workflows: simple curl POST, timeout-minutes 90→5, remove --http1.1/-N/--max-time/exit-code handling - refactor.sh: add Monitor Loop (CRITICAL) section with numbered steps - discovery-team-prompt.txt: same Monitor Loop fix - SKILL.md: update architecture docs, remove streaming sections Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-17 10:47:52 -05:00
A	99a9badf62	ci: increase refactor team frequency to every 15 minutes (#1378 ) Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-16 20:50:03 -08:00
Ahmed Abushagur	3fbdf56c4c	fix: add guardrails to prevent bots from inventing unnecessary work (#1347 ) - Add team lead pre-approval gate: teammates spawn in plan mode and must get approval before creating any PR (hard gate, not just prompt rules) - Add diminishing returns rule: default posture is "code is good, shut down" - Add dedup rule: check for existing open/closed PRs before creating new ones - Require concrete PR justification (what breaks without this change) - Add off-limits files list (.github/workflows, .claude/skills, CLAUDE.md) - Use git pathspec exclusions in refactor.sh to never stage protected files - Constrain pr-maintainer to only act on approved or feedback PRs - Reduce refactor cron from every 5 minutes to every 2 hours Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 20:24:25 -05:00
A	a4fe0388c1	fix: allow repo collaborators through the gate workflow (#1166 ) Previously only org members were allowed. Now checks both org membership and repo collaborator status, so invited collaborators can open issues and PRs without being blocked. Co-authored-by: lab <6723574+louisgv@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-14 18:32:50 -08:00
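
The retry change described in commit ec210e37af (#1771) can be illustrated with a minimal sketch. It assumes only what the message states: a `Result<T>` with Ok/Err variants and a `data` field, where returning Err marks a failure as retryable and throwing marks it as fatal. Everything else is hypothetical and is not the repository's code: the `ok` discriminant, the names `retryOnErr` and `classify`, and the `isTransient` predicate. The sketch is synchronous for brevity, whereas real SSH calls would be async with the same control flow.

```typescript
// Illustrative sketch only: all names below are hypothetical, not the repo's code.
type Result<T> = { ok: true; data: T } | { ok: false; error: Error };

const Ok = <T>(data: T): Result<T> => ({ ok: true, data });
const Err = (error: Error): Result<never> => ({ ok: false, error });

// Retries only when fn *returns* Err. Anything fn *throws* is not caught here,
// so it propagates to the caller immediately and is never retried.
function retryOnErr<T>(fn: () => Result<T>, maxAttempts = 3): T {
  let lastError = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = fn();
    if (result.ok === true) {
      return result.data;
    }
    lastError = result.error;
  }
  // Every attempt returned Err: surface the last retryable error.
  throw lastError;
}

// Classifies failures of op: transient ones become Err (retryable),
// everything else is rethrown (fatal). isTransient is an assumed predicate.
function classify<T>(op: () => T, isTransient: (e: Error) => boolean): Result<T> {
  try {
    return Ok(op());
  } catch (e) {
    const err = e instanceof Error ? e : new Error(String(e));
    if (isTransient(err)) {
      return Err(err);
    }
    throw err;
  }
}
```

With this split, a timeout (where the remote command may already have completed) can be rethrown by the classifier and is never re-run, while a refused connection can be returned as Err and retried.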

97 commits