agent-tarballs.yml has been failing nightly since 2026-03-27 and
packer-snapshots.yml since 2026-04-25. Two distinct breakages.
cursor:
capture-agent.sh's allowlist was missing cursor, so the install
step succeeded but the capture step rejected the agent name.
Adds cursor to the allowlist plus its capture paths
(~/.local/bin/ for the `agent` symlink, ~/.local/share/cursor-agent/
for the extracted package, matching what verify.sh and cursor-proxy
already expect).
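A minimal sketch of what the capture-path entry might look like (variable names here are assumptions; the paths come from the description above):

```shell
# Hypothetical shape of cursor's entry in capture-agent.sh's path-capture
# case; variable names are assumed, paths are from the commit text.
AGENT_NAME="cursor"

case "$AGENT_NAME" in
  cursor)
    # the `agent` symlink lives in ~/.local/bin/, the extracted
    # package in ~/.local/share/cursor-agent/
    CAPTURE_PATHS="$HOME/.local/bin/ $HOME/.local/share/cursor-agent/"
    ;;
esac

echo "$CAPTURE_PATHS"
```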
hermes:
The upstream installer launches an interactive setup wizard after
install, which fails in CI with `/dev/tty: No such device or
address`. Production code already passes `--skip-setup` (see
packages/cli/src/shared/agent-setup.ts:1336); packer/agents.json
was the lone exception. Adds the same flag.
Both pipelines read from packer/agents.json, so this single edit
unblocks both the daily tarball build and the DO marketplace image
build for hermes.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces all references to DO_API_TOKEN with DIGITALOCEAN_ACCESS_TOKEN,
matching DigitalOcean's official CLI and API documentation. This includes
TypeScript source, tests, shell scripts, Packer config, CI workflows,
and documentation.
Supersedes #3068 (rebased onto current main).
Agent: pr-maintainer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Cursor CLI installs a native binary via curl, so it needs both x86_64
and arm64 builds. Also adds cursor.com to the allowed domains list.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* fix: destroy orphaned Packer builder instances on workflow cancel
When a Packer Snapshots workflow is cancelled mid-build, Packer's process
is killed before it can clean up its temporary builder droplet/server.
This leaves orphaned packer-* instances running and costing money.
Add `if: cancelled()` cleanup steps for both DigitalOcean and Hetzner
that destroy any packer-* prefixed instances after cancellation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove Hetzner cleanup step — only DO needed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove Hetzner from Packer snapshots, add cancel cleanup
Remove Hetzner from the Packer workflow entirely — only DigitalOcean
snapshots are built. Deletes packer/hetzner.pkr.hcl and simplifies the
workflow by removing all Hetzner-specific steps and cloud conditionals.
Also adds a cancelled() cleanup step that destroys orphaned packer-*
builder droplets when a workflow run is cancelled mid-build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add missing sprite-keep-running.sh script
The keep-alive install was 404ing because sh/shared/sprite-keep-running.sh
never existed in the repo. The TypeScript code downloaded it from the CDN
(which maps to sh/shared/) but the file was never created.
The script wraps a command and pings the sprite's own public URL every 30s
to prevent inactivity shutdown. It resolves the URL via sprite-env info
(available on all sprites) and falls back to exec without keep-alive if
the URL can't be determined.
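The behavior described above could be sketched roughly like this (function names and the `sprite-env info` output parsing are assumptions, not the actual script):

```shell
# Sketch of sprite-keep-running.sh behavior: wrap a command and ping the
# sprite's public URL every 30s; if the URL can't be resolved, just run
# the command without keep-alive. Parsing of `sprite-env info` is assumed.
resolve_url() {
  command -v sprite-env >/dev/null 2>&1 \
    && sprite-env info | awk -F': ' '/^url/ {print $2}'
}

keep_running() {
  url="$(resolve_url)"
  if [ -n "$url" ]; then
    # background pinger, killed when the wrapped command exits
    ( while :; do curl -fsS -o /dev/null "$url" || true; sleep 30; done ) &
    pinger=$!
    "$@"; status=$?
    kill "$pinger" 2>/dev/null
    return "$status"
  fi
  "$@"   # fallback: no keep-alive
}
```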
Also removes Hetzner from the Packer snapshots workflow entirely — only
DigitalOcean snapshots are built.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address security review — scope cleanup filter, fix JSON injection
1. Add `spawn-packer` tag to DO builder droplets in Packer template and
filter cleanup by tag instead of broad `packer-` name prefix. Prevents
accidentally destroying builder instances from other concurrent builds.
2. Use `jq --arg` for SINGLE_AGENT_INPUT instead of string interpolation
to prevent JSON injection via crafted agent names.
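The `jq --arg` point can be illustrated in isolation (variable name from the commit; the surrounding workflow is omitted):

```shell
# A crafted agent name breaks out of the JSON when interpolated as a string:
AGENT='claude", "extra": "injected'
printf '{"single_agent": "%s"}\n' "$AGENT"   # attacker-shaped JSON

# jq --arg treats the value purely as data and escapes it properly:
SAFE_JSON="$(jq -n --arg agent "$AGENT" '{single_agent: $agent}')"
echo "$SAFE_JSON"
```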
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cx23 is only available in Helsinki, so availability is poor. Switch to
cpx22 (AMD, 2 vCPU, 4GB), which is available in nbg1/hel1/sin.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CLI changes:
- Add findSpawnSnapshot() to query Hetzner /images?type=snapshot API
for pre-built spawn-{agent}-* images (matches by description prefix)
- Add waitForSshOnly() for snapshot boots (skips cloud-init polling)
- Update createServer() to accept optional snapshotId — boots from
snapshot instead of ubuntu-24.04, skips cloud-init userdata
- Wire up orchestrator with skipAgentInstall flag
Packer changes:
- Add packer/hetzner.pkr.hcl using hcloud plugin, mirroring the DO
template (tier scripts, agent install, cleanup, manifest)
- Unify packer-snapshots.yml to build both DO and Hetzner in a single
workflow with cloud×agent matrix and per-cloud cleanup steps
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: increase packer snapshot transfer timeout to 60m
The default 30m timeout is too short for transferring snapshots to
distant DO regions (blr1, sgp1, syd1). This caused zeroclaw and
kilocode builds to fail despite successful provisioning.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* revert: remove batch splitting from packer workflow
DO droplet cap is no longer an issue — revert to single parallel build
job for all agents.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Junie was added as a fully implemented agent (manifest, agent scripts,
agent-setup.ts) but the packer/tarball pipeline was never updated.
This meant the nightly agent-tarballs workflow could not build a
pre-built tarball for Junie, forcing all deployments to do a live
npm install.
- Add junie entry to packer/agents.json (tier: node, @jetbrains/junie-cli)
- Add junie to capture-agent.sh allowlist and path-capture case
(npm-based, same as codex/kilocode — captures /root/.npm-global/)
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- wget is not available on many cloud VMs; use curl instead
- Remove 2>/dev/null from dpkg/apt so install errors are visible
- Capture /usr/bin/google-chrome-stable in tarball (actual .deb binary name)
- Use curl in packer/agents.json tarball build too
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* fix: use Google Chrome .deb instead of Playwright for OpenClaw browser
Snap Chromium on Ubuntu 24.04 fails because AppArmor confinement blocks
CDP control. OpenClaw's own docs recommend installing Google Chrome via
.deb package which bypasses snap entirely.
Also adds browser.noSandbox and browser.executablePath to the OpenClaw
config so the browser tool works out of the box on Linux VMs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove unnecessary confirmation prompt when OAuth fails
If OAuth didn't complete, the user obviously wants to paste a key.
The "Paste your API key manually? (Y/n)" prompt was pointless friction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove unnecessary "Continue anyway?" credential confirmation
If the user selected a cloud, they obviously want to continue.
The warning + setup guidance is sufficient — no need to block on a confirm.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: move Chrome install to configure step so it runs after tarball
The tarball path skips agent.install() entirely, so Chrome never got
installed. Moving it to configure() (setupOpenclawConfig) ensures it
always runs regardless of install method.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: bundle Google Chrome in openclaw tarball
Add Chrome .deb install to openclaw's tarball build so it ships
pre-installed. Capture /usr/bin/google-chrome and /opt/google/chrome/
in the tarball. Add dl.google.com to the workflow domain allowlist.
The configure() step still has a fallback install with idempotency
check (command -v google-chrome) for non-tarball installs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use openclaw config set for browser setup + correct binary name
- Use `google-chrome-stable` (actual .deb binary name) not `google-chrome`
- Set browser config via `openclaw config set` CLI (the supported way)
instead of writing JSON directly which wasn't being picked up
- Remove browser section from JSON config to avoid conflicts
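An idempotent-install sketch of the combined fixes (the binary name and `openclaw config set` usage come from the commits; the .deb URL is Google's standard download link and is an assumption here):

```shell
# Install Chrome only if it isn't already present (e.g. from the tarball).
ensure_chrome() {
  if command -v google-chrome-stable >/dev/null 2>&1; then
    echo "already-installed"
    return 0
  fi
  curl -fsSL -o /tmp/chrome.deb \
    "https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb" \
    && apt-get install -y /tmp/chrome.deb
  # then, per the commit:
  #   openclaw config set browser.executablePath "$(command -v google-chrome-stable)"
}
```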
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Three distinct E2E bugs fixed:
1. SSH key generation race condition: When multiple agents provision in
parallel, concurrent processes all call generateSshKey() and race to
create ~/.ssh/id_ed25519. ssh-keygen won't overwrite an existing file
(prompts on stdin which is "ignore"), causing zeroclaw/codex to fail
with "SSH key generation failed". Fix: check if key already exists
before generating, and re-check after a failed generation attempt.
2. Hetzner SSH key 409 uniqueness_error: The Hetzner API returns HTTP 409
with "SSH key not unique" when the same key content is registered under
a different name. The hetznerApi() function throws on non-2xx before
the error-parsing code runs, and the regex /already/ didn't match
"not unique". Fix: catch 409 in ensureSshKey() and match against
uniqueness_error/not unique/already patterns.
3. Hermes binary not found: The hermes install script (uv tool) creates
the actual binary + venv at ~/.hermes/hermes-agent/venv/ with a symlink
at ~/.local/bin/hermes. The tarball capture script only captured the
symlink + ~/.local/share/, leaving a dangling symlink. Fix: include
~/.hermes/ in capture paths, add venv/bin to verify.sh PATH check,
and update hermes launchCmd to include the venv PATH.
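The race fix in item 1 amounts to the following logic (the real code is TypeScript; this is an equivalent shell sketch with an assumed function name):

```shell
# Race-safe key generation: check for an existing key first, and re-check
# after a failed attempt in case a concurrent process created it mid-run.
KEY="/tmp/spawn_demo_ed25519"

generate_ssh_key() {
  [ -f "$KEY" ] && return 0          # already created (possibly by a peer)
  if ssh-keygen -t ed25519 -f "$KEY" -N "" -q </dev/null; then
    return 0
  fi
  [ -f "$KEY" ]                      # lost the race: fine if the key exists now
}
```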
Fixes #2304
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Packer template:
- Match official 90-cleanup.sh: remove SSH host keys, create
revoked_keys, remove cloud-init instances, zero-fill free space,
use --force-confold for upgrades, autoremove/autoclean
- Add Packer manifest post-processor for snapshot ID extraction
- Remove PACKER_LOG=1 (debug logging not needed in production)
Workflow:
- Add "Submit to DO Marketplace" step after successful build
- Reads agent→app_id mapping from MARKETPLACE_APP_IDS secret (JSON)
- Extracts snapshot ID from Packer manifest, PATCHes Vendor API
- Gracefully handles 400 (app already pending review)
- Skips silently if no MARKETPLACE_APP_IDS secret is configured
Setup: add MARKETPLACE_APP_IDS secret as JSON, e.g.:
{"claude":"60089fc6...", "codex":"60089fc7..."}
App IDs come from the DO Vendor Portal after initial approval.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add whitelist validation for AGENT_NAME immediately after the empty
check to prevent command injection and path traversal via the parameter.
While the existing case statement catches unknown agents, explicit
upfront validation makes the security intent obvious and adds defense
in depth.
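A minimal sketch of such a whitelist (the agent list here is assumed from names appearing elsewhere in this log):

```shell
# Exact-match case arms reject anything containing shell metacharacters
# or path components, since those can never match a listed name.
validate_agent() {
  case "$1" in
    claude|codex|cursor|hermes|junie|kilocode|openclaw|zeroclaw) return 0 ;;
    *) echo "error: unknown agent '$1'" >&2; return 1 ;;
  esac
}
```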
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Three fixes for marketplace validation failures:
1. Install all security updates (apt-get dist-upgrade) — img_check
fails if any security patches are pending.
2. Purge droplet-agent and /opt/digitalocean — img_check fails if
the DO monitoring agent directory exists.
3. Correct img_check.sh filename to 99-img-check.sh — the previous
URL returned 404.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The marketplace-partners repo uses `99-img-check.sh`, not
`img_check.sh`. The wrong filename caused a 404 on curl download,
failing all agent builds with exit code 22.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: claude snapshot build — remove npm fallback from install command
The native install (curl | bash) succeeds but exits non-zero due to a
PATH warning. The || fallback then tries `npm install` which doesn't
exist on the "minimal" tier → exit 127.
Fix: replace npm fallback with binary existence check (same pattern
as hermes agent). If install exits non-zero but ~/.local/bin/claude
exists, the build succeeds.
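The pattern reduces to (binary path from the commit; the helper name and example installer URL are illustrative):

```shell
# Treat the install as successful when the binary actually landed,
# even if the installer script itself exited non-zero.
CLAUDE_BIN="${CLAUDE_BIN:-$HOME/.local/bin/claude}"

run_install() {
  "$@" || [ -x "$CLAUDE_BIN" ]
}

# e.g.: run_install sh -c 'curl -fsSL <installer-url> | bash'
```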
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: snapshot cleanup and lookup — use name prefix instead of tags
DO Packer builder `tags` only apply to the temporary build droplet,
not the resulting snapshot image. Both the workflow cleanup step and
the CLI's findSpawnSnapshot() were querying by `tag_name` which
returned nothing — old snapshots piled up and the CLI couldn't find
existing snapshots.
Fix: filter by snapshot name prefix (`spawn-{agent}-`) instead of
tags, in both the workflow and the CLI. Remove misleading `tags`
from the Packer template. Add test cases for name-prefix filtering.
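The name-prefix lookup might look like this against the DO images API response (jq filter assumed; the real workflow/CLI queries may differ):

```shell
# Pick the first snapshot whose name starts with spawn-{agent}-.
AGENT="claude"
cat > /tmp/images_demo.json <<'EOF'
{"images":[{"id":1,"name":"spawn-claude-20260101"},
           {"id":2,"name":"spawn-codex-20260101"},
           {"id":3,"name":"unrelated"}]}
EOF
jq --arg prefix "spawn-$AGENT-" \
   '[.images[] | select(.name | startswith($prefix))] | .[0].id' \
   /tmp/images_demo.json
```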
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: packer build failures — OOM kill + history builtin
Two issues introduced by PR #2271 (marketplace compliance):
1. Droplet downsized to s-1vcpu-1gb (1GB RAM) — Claude's native
installer and zeroclaw's Rust build get OOM-killed. Restore
s-2vcpu-2gb.
2. Cleanup provisioner uses `history -c` which is a bash builtin.
Packer runs scripts with /bin/sh (dash on Ubuntu) which doesn't
have it → exit 127 on ALL agents. Remove it — the .bash_history
file deletion already handles persistent history.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Switch build droplet from s-2vcpu-2gb to s-1vcpu-1gb ($6/mo) per DO
Marketplace recommendation for cross-size snapshot compatibility
- Add ufw firewall provisioner (deny incoming, allow SSH, enable)
- Replace basic apt-get clean with full DO Marketplace cleanup sequence:
removes SSH authorized_keys, clears bash history, truncates /var/log,
resets machine-id, and runs cloud-init clean so each launched droplet
gets a fresh identity on first boot
- Add img_check.sh validation step (from digitalocean/marketplace-partners)
to verify firewall active, no root password, and security posture before
the snapshot is finalized — build fails if image doesn't meet requirements
Fixes #2269
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* feat: restore Packer DO snapshot pipeline for fast agent boot
Restores the nightly Packer snapshot build pipeline (reverted in #2205)
that pre-bakes agent images as DigitalOcean snapshots. When a snapshot
exists on the user's account, droplet boot skips cloud-init and tarball
install entirely — cutting provisioning from ~10min to ~2min.
- Add `packer/digitalocean.pkr.hcl` HCL2 template with multi-region
distribution, apt-lock wait, and snapshot marker
- Add `.github/workflows/packer-snapshots.yml` nightly build with
matrix strategy, auto-cleanup of old snapshots, and injection-safe
env var handling
- Add `findSpawnSnapshot()` to query DO API for pre-built snapshots
- Add `waitForSshOnly()` for snapshot boots (skip cloud-init wait)
- Modify `createServer()` to accept optional `snapshotId` param
- Wire snapshot detection in DO `main.ts` orchestrator
- Add `skipAgentInstall` to `CloudOrchestrator` interface to skip
tarball + install steps when booting from snapshot
- Add 5 unit tests for snapshot lookup (happy path, empty, error,
invalid ID, network failure)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use repo-root-relative path for tier scripts in Packer template
Packer resolves script paths relative to cwd (repo root), not relative
to the .pkr.hcl file. Changed `scripts/tier-*.sh` to
`packer/scripts/tier-*.sh`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: Packer build region/size and PATH for agent installs
Two issues causing build failures:
1. `s-2vcpu-4gb` not available in `nyc3` — changed build region to
`sfo3` and size to `s-2vcpu-2gb` (universally available, cheaper,
sufficient for building snapshots)
2. Claude install puts binary in `~/.local/bin` which isn't in PATH
during Packer provisioning — added full PATH to environment_vars
on both the install and marker provisioners so agent binaries and
subsequent scripts can find each other
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: tarball workflow failures (root ownership, swapfile, hermes TTY)
- Use sudo mv + chown for tarball in release step (root-owned from capture)
- Skip swapfile creation if /swapfile already exists (GitHub Actions runners)
- Tolerate hermes setup wizard failure when /dev/tty unavailable in CI
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: capture claude symlink target in tarball + fix verify PATH
The claude installer creates a symlink at ~/.local/bin/claude pointing
to ~/.local/share/claude/versions/X.Y.Z. The capture script was missing
~/.local/share/claude/, causing a broken symlink in the tarball.
Also add ~/.npm-global/bin to the verify PATH check for claude (npm
fallback install path).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* feat: pre-built agent tarballs on GitHub Releases for fast install
Adds a nightly GitHub Actions workflow that builds and uploads agent
tarballs to rolling GitHub Releases. During provisioning, the CLI now
attempts to download and extract a tarball before falling back to live
install. Priority chain: snapshot > tarball > live install.
- New workflow: .github/workflows/agent-tarballs.yml
- New capture script: packer/scripts/capture-agent.sh
- New module: packages/cli/src/shared/agent-tarball.ts
- Orchestrate tries tarball first on non-local clouds
- Skip tarball when using DO snapshot (skipTarball flag)
- Tests for tarball install + orchestration integration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use global.fetch mock pattern and address security review
- Use `global.fetch = mock(...)` instead of `spyOn(globalThis, "fetch")`
to match codebase convention and fix CI mock interception
- Add URL validation regex to reject shell metacharacters (CRITICAL)
- Add agent name validation in workflow input (MEDIUM)
- Add `jq has()` check before executing install commands (CRITICAL)
- Use `tar -T` instead of unquoted word-splitting in capture-agent.sh (MEDIUM)
- Resolve merge conflicts with upstream/main (keep Docker fields, adapt
to simplified DO flow, bump version to 0.15.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use globalThis.fetch for testability in CI
Bun's native fetch binding doesn't go through global.fetch property
lookup, so global.fetch = mock(...) doesn't intercept it. Using
globalThis.fetch explicitly ensures the mock interception works.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add missing packer dependencies and harden install command safety
- Add packer/agents.json (agent tier + install command definitions)
- Add packer/scripts/tier-{minimal,node,bun,full}.sh (dependency scripts)
- Add basic command safety check rejecting suspicious patterns
- Document packer/agents.json as a trust boundary requiring PR review
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): fix npm prefix mismatch, add apt-get update, cleanup
- Add apt-get update -y before apt-get install in all tier scripts
- Add --prefix ~/.npm-global to npm install commands in agents.json
so installed packages land where capture-agent.sh expects them
- Rename misleading MARKER_DIR → MARKER_FILE in capture-agent.sh
- Remove stale comment referencing packer snapshots in workflow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): detect empty agent installs in capture script
The "no files found" check was dead code — the marker file is always
created before filtering, so FILTERED_FILE always had at least one
entry. Now we count non-marker entries to catch cases where the agent
install silently fails and no actual files are on disk.
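A sketch of the corrected check (file and marker names assumed):

```shell
# Count entries other than the marker itself; the marker is always
# present, so a marker-only list means the agent install produced nothing.
FILTERED_FILE=/tmp/filtered_demo.txt
printf '%s\n' "/etc/spawn-agent-marker" > "$FILTERED_FILE"   # marker only

NON_MARKER_COUNT=$(grep -cv 'spawn-agent-marker' "$FILTERED_FILE" || true)
if [ "$NON_MARKER_COUNT" -eq 0 ]; then
  echo "error: agent install produced no files" >&2
fi
```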
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): use bare fetch() for Bun mock compatibility in CI
In Bun, global.fetch = mock(...) overrides bare fetch() calls but NOT
globalThis.fetch() calls. Every other source file in the codebase uses
bare fetch() and their mocks work fine in CI. Switch to match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): use dependency injection for fetch in tests
Bun's global.fetch mock doesn't reliably intercept bare fetch() calls
across all Bun versions in CI. Instead of fighting the runtime, accept
an optional fetchFn parameter (defaults to fetch) and pass mock fetch
directly in tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): bypass mock.module bleed in agent-tarball tests
orchestrate.test.ts uses mock.module("../shared/agent-tarball", ...)
which is process-global in Bun and bleeds into agent-tarball.test.ts.
Import via URL (import.meta.url resolution) to bypass the specifier-
based mock matching and get the real module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tarballs): eliminate mock.module bleed between test files
Bun's mock.module is process-global — orchestrate.test.ts mocking
agent-tarball poisoned agent-tarball.test.ts (the mock function
ignored the fetchFn parameter and always returned false).
Fix: make tryTarballInstall injectable via OrchestrationOptions.
orchestrate.test.ts passes the mock directly via options instead
of using mock.module. agent-tarball.test.ts imports the real module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tests): mock Bun.which in credential priority tests
Tests assumed no cloud CLIs were installed, but machines with hcloud/
doctl would get "CLI installed" hint overrides, failing the assertion.
Spy on Bun.which to return null so tests are environment-independent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: fix import ordering after rebase
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: add curl domain allowlist and expand command blocklist
Addresses security review findings:
- Add domain allowlist for curl/wget targets (claude.ai, opencode.ai,
raw.githubusercontent.com, registry.npmjs.org, crates.io, github.com)
- Expand suspicious command blocklist (python -c, perl -e, ruby -e, dd, /dev/)
- Document 4-layer security model in workflow comments
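A sketch of how such an allowlist check could work (the domain list comes from the commit; the host-extraction regex is an assumption):

```shell
# Extract the host from a URL and require an exact match against the
# allowlist; exact-match case arms defeat suffix tricks like
# claude.ai.evil.com.
allowed_domain() {
  host="$(printf '%s\n' "$1" | sed -E 's#^https?://([^/]+).*#\1#')"
  case "$host" in
    claude.ai|opencode.ai|raw.githubusercontent.com|registry.npmjs.org|crates.io|github.com)
      return 0 ;;
    *) return 1 ;;
  esac
}
```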
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: add rm -rf to command blocklist
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: Ahmed Abushagur <ahmed@abushagur.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
DO snapshots are private and account-scoped — users on different
accounts cannot see snapshots built by the CI token. Docker images
are the better approach for cross-account pre-built agents.
Removes: packer/, packer-snapshots workflow, snapshot lookup code,
and snapshot test. Reverts DO CLI to plain cloud-init flow.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(digitalocean): Packer nightly snapshot pipeline for fast boot
Add pre-built Packer snapshots for DigitalOcean droplets. Instead of
10-20 min cloud-init + agent install on every boot, snapshot-based
droplets boot in ~2-3 min (SSH only, agent pre-installed).
- Packer HCL2 template with parametrized agent/tier builds
- Agent build matrix (packer/agents.json) for all 7 agents
- Tier scripts mirroring cloud-init.ts package tiers
- Nightly GitHub Actions workflow (4 AM UTC, max-parallel: 3)
- Automatic cleanup: keeps only latest snapshot per agent
- CLI: findSpawnSnapshot() looks up pre-built images via DO API
- CLI: waitForSshOnly() skips cloud-init when using snapshots
- CLI: createServer() accepts optional snapshotId, skips user_data
- CLI: main.ts routes to fast path when snapshot detected
- Tests for findSpawnSnapshot() (5 cases, all passing)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(packer): use var-file for install_commands to avoid shell quoting issues
The previous approach passed install_commands as `-var` inline, but
GitHub Actions expands `${{ }}` before shell evaluation — JSON arrays
with `|`, `&&`, and `"` characters break shell quoting.
Fix: generate a `.auto.pkrvars.json` file (auto-loaded by Packer)
using jq with --argjson for safe JSON handling. Also route all
`${{ inputs }}` and `${{ matrix }}` values through env vars to
prevent script injection.
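The var-file generation step reduces to something like this (variable and file names assumed; in the workflow the array would arrive via an env var, not a literal):

```shell
# Build .auto.pkrvars.json with jq so pipes, &&, and quotes in the
# install-command JSON never pass through shell quoting.
INSTALL_COMMANDS='["curl -fsSL https://example.com/install.sh | bash && echo \"done\""]'
jq -n --argjson cmds "$INSTALL_COMMANDS" \
  '{install_commands: $cmds}' > /tmp/demo.auto.pkrvars.json
cat /tmp/demo.auto.pkrvars.json
```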
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>