Commit graph

1346 commits

Author SHA1 Message Date
A
3db19d90ac
fix: accept comma in Fly.io macaroon tokens & handle flat org dict (#1593)
Real `fly auth token` returns comma-separated multi-segment macaroon
tokens (fm2_...,fm2_...,fo1_...). The token validation regex rejected
commas, forcing re-auth on every run. Add comma to the allowed charset.

`fly orgs list --json` returns a flat dict ({"slug": "Name"}) on some
flyctl versions, not the list/nodes format the parser expected. Detect
and handle both formats so the org picker works correctly.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 08:20:48 -08:00
A
ed9501235b
fix: bash 3.2 compat — sed for pattern sub + split local var=$(cmd) (#1572, #1571) (#1587)
Issue #1572: Replace bash 4+ ${//} pattern substitution in generate_env_config
with sed for macOS bash 3.2 compatibility.

Issue #1571: Split local var=$(cmd) declarations in fly/lib/common.sh so
exit codes propagate correctly with set -e on macOS bash 3.2.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 07:49:17 -08:00
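The exit-code masking behind issue #1571 can be reproduced in a few lines. This is a minimal sketch, not the code from fly/lib/common.sh; `failing_cmd`, `combined_status`, and `split_status` are hypothetical names used only for illustration.

```shell
# Hypothetical stand-in for a command that fails with status 3.
failing_cmd() { return 3; }

# Combined form: `local` itself succeeds, so the failure is masked.
combined_status() {
  local status=0
  local out=$(failing_cmd) || status=$?
  echo "$status"
}

# Split form: the plain assignment carries the substitution's exit status.
split_status() {
  local out status=0
  out=$(failing_cmd) || status=$?
  echo "$status"
}

combined_status   # prints 0
split_status      # prints 3
```

With `set -e` active, only the split form lets a failing command substitution stop the script.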
A
ce8b1afdf8
fix: always rm temp env file even if .zshrc append fails (#1573) (#1586)
Use semicolons instead of && for rm in inject_env_vars, inject_env_vars_sprite,
inject_env_vars_cb, and inject_env_vars_cloud so the temp file containing the
API key is always deleted even if ~/.zshrc doesn't exist or append fails.

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 10:45:55 -05:00
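The difference between `&&` and `;` for the cleanup step can be sketched as follows. The helper name `append_and_cleanup` and its two parameters are assumptions for illustration; the real inject_env_vars functions are not shown in this log.

```shell
# Hypothetical helper: append a temp env file to an rc file, then delete it.
append_and_cleanup() {
  local tmp_file="$1" rc_file="$2"
  # With `append && rm`, a failed append would skip the rm and leave the
  # secret on disk. With `;`, rm runs whatever the append returned.
  { cat "$tmp_file" >> "$rc_file"; } 2>/dev/null; rm -f "$tmp_file"
}
```

Called with an rc path whose directory does not exist, the append fails but the temp file is still removed.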
A
aa4174db9e
fix: add retry logic to wait_for_cloud_init for error recovery (#1575) (#1588)
Add _fly_run_with_retry helper that wraps run_server with configurable
retry count, sleep interval, and timeout. Apply it to package manager
and installer commands in wait_for_cloud_init so transient failures
(network timeouts, apt lock contention) no longer abort the entire
cloud-init sequence.

Agent: complexity-hunter

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 10:45:32 -05:00
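A retry wrapper of the kind described above might look like the sketch below. It is not the `_fly_run_with_retry` implementation: the name `run_with_retry` and its argument order are hypothetical, and the timeout handling the commit mentions is left out.

```shell
# Hypothetical usage: run_with_retry <max_attempts> <sleep_seconds> <command...>
run_with_retry() {
  local max_attempts="$1" sleep_seconds="$2" attempt=1
  shift 2
  while :; do
    # Success on any attempt ends the loop immediately.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "command failed after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$sleep_seconds"
  done
}
```

For example, `run_with_retry 3 5 apt-get update` would try up to three times, five seconds apart, before giving up.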
A
cbb6198258
test: add Fly.io failure-mode tests for SSH tunnel, API errors (#1579) (#1590)
Add mock tests covering real failure scenarios that were previously
untested despite 36/36 happy-path tests passing:

- API rate limit (429): mock curl returns 429 for cloud API calls
- Machine creation failure (422): mock curl returns 422 for POST to */machines*
- SSH tunnel failure: fly ssh console / fly machine exec exit non-zero
  (simulates WireGuard tunnel context deadline exceeded)
- SSH timeout: fly CLI never returns "ok", _fly_wait_for_ssh exhausts retries

The fly mock now checks MOCK_ERROR_SCENARIO to simulate CLI-level failures
(ssh_tunnel_failure, ssh_timeout) in addition to the existing curl-level
error injection (rate_limit, create_failure).

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 10:43:17 -05:00
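The CLI-level error injection can be pictured with a small mock. This is an illustrative sketch only; the actual mock in test/mock.sh is not shown here, and the error text and the default "ok" reply are assumptions.

```shell
# Hypothetical mock of the fly CLI driven by MOCK_ERROR_SCENARIO.
fly() {
  case ${MOCK_ERROR_SCENARIO:-} in
    ssh_tunnel_failure)
      # Simulate a tunnel failure: message on stderr, non-zero exit.
      echo "Error: context deadline exceeded" >&2
      return 1
      ;;
    ssh_timeout)
      # Simulate a hang from the caller's view: exit 0 but never print "ok".
      return 0
      ;;
  esac
  echo "ok"
}
```

A test sets `MOCK_ERROR_SCENARIO` before sourcing the script under test, so every `fly` call in that script hits the mock.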
A
282803a9bb
fix: add debug logging to ensure_fly_token auth chain (#1574) (#1589)
Add log_info/log_warn messages at each step of the 5-step auth chain
so users can see which auth method is being tried and why fallbacks occur.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 10:42:46 -05:00
L
3eca4221c6
fix: address architectural brittleness in Fly.io integration (issue #1581) (#1585)
Resolves sub-issues #1569, #1570, #1576, #1577, #1578, #1580.

#1569 — /wait endpoint replaces polling loop:
  _fly_wait_for_machine_start now uses GET /apps/{app}/machines/{id}/wait
  ?state=started&timeout=90. One blocking API call instead of 30 polls.

#1570 — fly machine exec replaces fly ssh console for run_server:
  run_server uses 'fly machine exec MACHINE_ID --app APP -- bash -c cmd'
  (direct API, no WireGuard tunnel) when FLY_MACHINE_ID is set. Falls
  back to 'fly ssh console -C' for environments without a machine ID.

#1576 — App name collision loop capped at 5 retries:
  Prevents infinite re-prompt. Suggests FLY_APP_NAME env var after 5
  failed attempts.

#1577 — destroy_server errors are now reported:
  All fly_api calls check for error responses. Reports failed machine
  deletions and exits non-zero on app deletion failure instead of
  always logging "destroyed" regardless of outcome.

#1578 — bun replaced with python3 for all JSON parsing:
  _fly_json_get, _fly_build_machine_body, _fly_list_orgs, destroy_server,
  list_servers all use python3 -c now. python3 is universally available;
  bun was only available after cloud-init completed on the target machine.

#1580 — upload_file uses stdin pipe instead of base64 string injection:
  'fly machine exec ... -- bash -c "cat > path" < local_file' streams
  file content directly. Eliminates the command-length/injection risk of
  embedding base64 content in a shell argument string.

test/mock.sh: add 'fly machine exec' case to the fly CLI mock.
test/fixtures/fly/_env.sh: add FLY_MACHINE_ID to test env.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 07:19:23 -08:00
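The stdin-pipe upload from #1580 can be sketched locally. `remote_exec` below is a hypothetical stand-in that runs the command on the local machine instead of calling `fly machine exec`, so the sketch shows only the streaming pattern, not the real upload_file.

```shell
# Hypothetical stand-in for a remote exec command; it runs the string locally.
remote_exec() { sh -c "$1"; }

upload_file() {
  local src="$1" dest="$2" quoted
  # Escape single quotes so the destination path survives the remote shell.
  quoted=$(printf '%s' "$dest" | sed "s/'/'\\\\''/g")
  # File content travels over stdin and never appears in the command string.
  remote_exec "cat > '$quoted'" < "$src"
}
```

Because the content is streamed, its size and any shell metacharacters inside it do not affect the command line.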
L
b42d1a52d6
fix: don't bail on non-zero exit from fly orgs list + pass JSON as arg (#1582)
Some flyctl versions exit non-zero even on success. Removed '|| return 1'
so the output is always captured. Empty output is still a failure.

Also pass JSON as a bun argument (process.argv[1]) instead of piping via
stdin — avoids any Bun.stdin buffering issue in the _fly_list_orgs context.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 07:03:56 -08:00
L
07ce24b710
fix: capture interactive_pick output and export FLY_ORG in _fly_prompt_org (#1568)
interactive_pick() echoes the selected value to stdout — it does NOT
export the env var. _fly_prompt_org was calling it without capturing
the output, so FLY_ORG was never set and the echo printed the org
slug as a raw string to the terminal.

Fix: org=$(interactive_pick ...) && export FLY_ORG.
Also guard with the standard FLY_ORG / SPAWN_NON_INTERACTIVE early-exit.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:40:23 -08:00
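The capture-then-export fix follows a common shell pattern, sketched here under assumptions: `pick_first` is a hypothetical picker that prints its first argument, standing in for interactive_pick, and `prompt_org` is a simplified stand-in for _fly_prompt_org.

```shell
# Hypothetical picker: prints the chosen value on stdout and sets nothing.
pick_first() { printf '%s\n' "${1:-}"; }

prompt_org() {
  # Early exit when the caller already chose an org.
  if [ -n "${FLY_ORG:-}" ]; then
    return 0
  fi
  local org
  # Capture stdout; without the $(...) the slug would just print to the terminal.
  org=$(pick_first "$@") || return 1
  [ -n "$org" ] || return 1
  FLY_ORG="$org"
  export FLY_ORG
}
```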
L
6fab9b6ae5
fix: use fly orgs list for org picker + strip ANSI from token capture (#1567)
1. _fly_list_orgs: use 'fly orgs list --json' (flyctl) instead of the
   non-existent api.fly.io/v1/organizations REST endpoint. Pipe through
   interactive_pick (same pattern as Hetzner/GCP pickers) so org
   selection uses the shared arrow-key / fzf / numbered-list picker.

2. fly auth token captures: add 'sed s/\x1b...//g' to strip ANSI color
   escape codes. flyctl may output the token with terminal colors even
   when stdout is piped; the ESC character (\033) fails the security
   character check (^[a-zA-Z0-9._/@:+=\ -]+$) causing the token to be
   marked malformed and cleared on the next run.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:35:44 -08:00
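Stripping ANSI color sequences with sed can be sketched as below. The exact sed expression used in the repository is elided in the message above, so this pattern, which removes CSI sequences such as `ESC[32m`, is an assumption.

```shell
# Remove ANSI CSI escape sequences (for example color codes) from stdin.
strip_ansi() {
  local esc
  # Build a literal ESC character portably instead of relying on \x1b in sed.
  esc=$(printf '\033')
  sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

printf '\033[32mfm2_abc\033[0m\n' | strip_ansi   # prints fm2_abc
```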
L
0f63596a12
feat: use Fly.io API to list orgs in picker for _fly_prompt_org (#1566)
Replace flyctl-based org listing with a direct API call to
api.fly.io/v1/organizations, feeding results into _display_and_select
(the shared arrow-key / fzf / numbered-list picker).

_fly_list_orgs():
  - Calls GET /v1/organizations with Bearer auth
  - Emits pipe-delimited "slug|name (type)" lines for _display_and_select

_fly_prompt_org():
  - Single org: auto-selects silently
  - Multiple orgs: shows arrow-key picker via _display_and_select
    (defaults to "personal" if that slug is in the list)
  - API unavailable: falls back to safe_read prompt with "personal" default

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:31:57 -08:00
L
7f6d99f90f
fix: clear corrupt saved token + capture only first line of fly auth token (#1565)
Two fixes for persistent Fly.io auth failures:

1. shared/common.sh — _load_token_from_config():
   When the saved token fails the security character check, auto-delete
   the corrupt config file instead of silently returning 1. This prevents
   the user from being stuck in a loop where every run loads a malformed
   token (from a previous failed auth attempt) and immediately fails.
   Message changed from error to warn: "Saved token is malformed —
   clearing cached credentials."

2. fly/lib/common.sh — _try_flyctl_auth() and _try_fly_browser_auth():
   Pipe 'fly auth token' output through 'head -1' to capture only the
   first line. Newer flyctl versions may print warnings/metadata after
   the token on subsequent lines; previously these got concatenated into
   the token string via $() and could introduce characters that fail
   the security validator (newlines stripped by _sanitize_fly_token, but
   concatenated text from warning lines could contain unusual chars).

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:23:35 -08:00
L
4a7c0fab81
fix: skip token validation after flyctl browser auth + fix validation endpoint (#1564)
1. Skip _validate_fly_token after 'fly auth login':
   Token from flyctl is definitionally valid — calling the Machines API
   (api.machines.dev) with a user OAuth token causes a false failure
   because that API only accepts deploy tokens, not OAuth user tokens.

2. Fix _validate_fly_token endpoint:
   Now tries api.fly.io/v1/user (Bearer, accepts OAuth tokens) first,
   then falls back to the Machines API for deploy tokens. Prevents
   'no tokens found in header' false failures for env/config tokens.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:16:19 -08:00
L
e8eb836534
fix: use fly auth login for browser auth + add org selection (#1563)
Root cause of persistent 'no tokens found in header':
The CLI Sessions API returns a user-level OAuth code that requires
flyctl's internal token exchange step to become a valid API token.
We were using the raw access_token directly, bypassing that step.

_try_fly_browser_auth() — now delegates to flyctl:
  - Calls 'fly auth login' directly (flyctl handles browser open,
    polling, and token exchange internally)
  - Gets the final token via 'fly auth token' (always correct format)
  - Falls back to manual token entry if flyctl unavailable

_fly_prompt_org() — new function:
  - Called after successful auth (flyctl, browser, or manual)
  - Lists orgs via 'fly orgs list --json' if multiple exist
  - Shows picker or simple prompt; defaults to "personal"
  - Exports FLY_ORG for use in app creation / list_servers
  - Skipped when FLY_ORG is already set or SPAWN_NON_INTERACTIVE=1

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:11:32 -08:00
L
f1ca9cbce1
fix: smart bun mock + restore Bun JSON parsing in fly/lib (reverts #1553) (#1556)
* Revert "fix: handle raw m2. macaroon tokens from Fly.io CLI Sessions API (#1552)"

This reverts commit 9fc59ded1c.

* Revert "fix: replace bun -e with python3 in fly/lib/common.sh to fix 18 mock test failures (#1553)"

This reverts commit 328e6a6da4.

* fix: bun passthrough mock + restore Bun JSON parsing in fly/lib

Reverts PR #1553 (which reverted Bun in favour of Python to fix tests)
and instead fixes the root cause: the test/mock.sh bun mock was a dumb
no-op that discarded all output, causing _fly_json_get() to return empty
string and every fly script to fail with "Failed to extract machine ID".

test/mock.sh — smart bun mock:
- `bun -e "..."` (inline eval, used for JSON processing) → delegates to
  the real bun binary so _fly_json_get() / _fly_build_machine_body()
  actually produce correct output during tests
- All other bun invocations (install, run, etc.) → logged no-op as before

fly/lib/common.sh:
- Restores Bun-based _fly_json_get(), _fly_build_machine_body(),
  destroy_server machine-ID extraction, and list_servers table formatter
- Re-applies m2. macaroon token fix from #1552 (which was lost when
  #1553 reverted the whole file):
  _sanitize_fly_token now wraps raw m2.* tokens as "FlyV1 m2.*" so
  CLI Sessions OAuth tokens are sent with the correct auth header

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* test: add node fallback to bun mock for CI environments

CI (GitHub Actions ubuntu-latest) has node but not bun, so the bun
passthrough mock silently returns empty string, causing _fly_json_get
to fail and 18 Fly.io tests to break. Add a fallback chain:
real bun -> node (with Bun.stdin.text() polyfill) -> exit 0.

Agent: test-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-21 06:01:58 -08:00
A
25d7bfe027
fix: align key-request.sh token regex with shared/common.sh for FlyV1 tokens (#1562)
The _try_load_env_var regex in key-request.sh rejected tokens containing
spaces, colons, plus signs, or equals signs. This caused FlyV1 prefixed
tokens ("FlyV1 fm2_...") to fail validation during QA cycle key loading,
making Fly.io always appear as a missing key provider.

Updated regex to match _load_token_from_config in shared/common.sh which
already allows these characters.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 07:15:34 -05:00
A
184bc21b3a
fix: replace printf -v with export for bash 3.2 compat in key-request.sh (#1561)
printf -v requires bash 4.0+; macOS ships bash 3.2, causing _try_load_env_var()
to fail with 'printf: -v: invalid option' and breaking saved API key loading for
all cloud providers. Both var_name and val are validated against strict regexes
immediately above, so export "NAME=VALUE" is injection-safe and works on bash 3.2+.
The macos-compat linter already flags this pattern as MC013 error.

Agent: team-lead

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 06:42:14 -05:00
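Assigning a variable by name through `export` after validating both parts might look like this sketch. `set_env_var` is a hypothetical name, and the value allowlist is illustrative, not the exact regex from key-request.sh.

```shell
# Hypothetical helper: export NAME=VALUE only after both pass an allowlist.
set_env_var() {
  local name="$1" value="$2"
  # Name must be a plain shell identifier.
  case $name in
    ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) return 1 ;;
  esac
  # Value may contain only allowlisted characters (illustrative set).
  case $value in
    ''|*[!A-Za-z0-9._/@:+=\ -]*) return 1 ;;
  esac
  # Safe because neither part can contain quotes, `$`, or newlines.
  export "$name=$value"
}
```

`export "NAME=VALUE"` needs no eval, which is why prior validation of both strings is enough to keep it injection-safe.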
A
907d21f030
fix: allow space in token validation regex for FlyV1 prefixed tokens (#1560)
The _load_token_from_config regex (added in #1547) rejects tokens
containing spaces, but Fly.io browser OAuth tokens are saved with
a "FlyV1 " prefix (e.g., "FlyV1 fm2_xxx"). This causes the token
to be silently rejected on reload, forcing re-authentication every
session. Space is safe inside curl -K double-quoted header values.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 10:48:53 +00:00
A
3af5005896
fix: pass response via env var in record.sh has_api_error (SC2259) (#1559)
The heredoc overrode piped stdin, so $response never reached python3.
sys.stdin.read() got empty input, making API error detection silently
fail during live fixture recording. Pass data via environment variables
instead.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 05:47:50 -05:00
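The stdin conflict flagged by SC2259 can be reproduced without python3. In this sketch `sh -s` plays the role of the interpreter that reads its program from the heredoc; the function names and the `RESPONSE` variable are hypothetical.

```shell
# Broken pattern: the pipe and the heredoc both target stdin; the heredoc wins,
# so the piped text never reaches the program.
read_via_pipe() {
  printf '%s' "$1" | sh -s <<'EOF'
cat
EOF
}

# Fixed pattern: stdin carries the program, the data travels in the environment.
read_via_env() {
  RESPONSE="$1" sh -s <<'EOF'
printf '%s' "$RESPONSE"
EOF
}
```

`read_via_pipe` prints nothing for any input, which mirrors the silent failure the commit describes.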
A
c1f730c69a
fix: replace eval with declare and add base64 validation (#1557)
* fix: replace eval with declare and add base64 validation (issues #1554, #1555)

- shared/key-request.sh: replace eval with declare for defense-in-depth
  (eval avoided when safer declare alternative exists; validated vars stay safe)
- fly/lib/common.sh: add base64 output alphabet validation before shell
  interpolation, matching daytona/lib/common.sh proven-safe pattern

Fixes #1554
Fixes #1555

Agent: team-lead
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use printf -v instead of declare for safe variable assignment in key-request.sh

Addresses security review feedback on PR #1557. The declare approach
created a local variable whose export had no effect outside the function.
printf -v assigns directly in the current scope without eval or command
substitution.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 04:47:33 -05:00
A
e9431430dd
fix: report temp file leaks in _assert_no_temp_leaks test assertion (#1558)
The function only had a success branch — when temp files were leaked,
it silently returned without incrementing FAILED or printing output.
Add the missing else branch so leaked temp files are detected.

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 04:45:09 -05:00
L
9fc59ded1c
fix: handle raw m2. macaroon tokens from Fly.io CLI Sessions API (#1552)
Root cause of 'no tokens found in header' after browser OAuth:

The Fly.io CLI Sessions API returns raw macaroon tokens (e.g. m2.XXXX)
WITHOUT the 'FlyV1 ' prefix. _sanitize_fly_token only handled fm2_
tokens, so m2. tokens fell through unchanged and were sent as:
  Authorization: Bearer m2.XXXX
Fly.io's Machines API expects FlyV1 macaroon format, not Bearer.

Fixes:
- _sanitize_fly_token: add m2.* case that wraps as 'FlyV1 m2.XXX'
- _try_fly_browser_auth polling: eagerly wrap any non-FlyV1 token with
  'FlyV1 ' prefix at the source, before it's echoed back to the caller

Token format handling after fix:
  m2.XXXX         → FlyV1 m2.XXXX      ← CLI Sessions API (was broken)
  fm2_XXXX        → FlyV1 fm2_XXXX     ← still handled (unchanged)
  FlyV1 fm2_XXXX  → FlyV1 fm2_XXXX     ← already correct (unchanged)
  eyJhbGci...     → Bearer eyJ...      ← legacy JWT (fallback to manual)

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 23:54:34 -08:00
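The token format table in this commit maps naturally onto a case statement. The sketch below is not `_sanitize_fly_token` itself; `fly_auth_value` is a hypothetical function that prints the value an Authorization header would carry for each token shape.

```shell
# Hypothetical helper: print the Authorization header value for a token.
fly_auth_value() {
  local token="$1"
  case $token in
    'FlyV1 '*)  printf '%s\n' "$token" ;;         # already prefixed, unchanged
    m2.*|fm2_*) printf 'FlyV1 %s\n' "$token" ;;   # raw macaroon, add the prefix
    *)          printf 'Bearer %s\n' "$token" ;;  # anything else, e.g. a JWT
  esac
}

fly_auth_value 'm2.XXXX'          # prints FlyV1 m2.XXXX
fly_auth_value 'FlyV1 fm2_XXXX'   # prints FlyV1 fm2_XXXX
```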
A
328e6a6da4
fix: replace bun -e with python3 in fly/lib/common.sh to fix 18 mock test failures (#1553)
bun is not installed in the mock test environment (CI or local test runs).
The mock harness stubs bun as a no-op logger, so _fly_json_get() always
returned empty string, causing "Failed to extract machine ID" and 18 fly
script test failures in bash test/mock.sh.

Replace all 4 bun -e invocations with equivalent python3 code:
- _fly_json_get: extract top-level JSON field from stdin
- _fly_build_machine_body: build machine creation JSON body
- _fly_destroy_app: extract machine IDs array
- list_servers: format apps table

python3 is always available and already has a pass-through mock in
test/mock.sh (like /usr/bin/python3). No behavior change for real runs.

Before: bash test/mock.sh fly → 18 passed, 18 failed
After:  bash test/mock.sh fly → 36 passed, 0 failed

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 02:19:46 -05:00
A
9acc239001
fix: validate token characters in _load_token_from_config to prevent curl injection (#1547)
* fix: validate token characters in _load_token_from_config to prevent curl injection

Tokens loaded from ~/.config/spawn/{cloud}.json were exported without
character validation. A tampered config file containing a token with
embedded newlines could exploit the _curl_api function's -K - (stdin
config) mechanism to inject arbitrary curl directives (e.g., output,
url), since curl interprets newlines in the config format as directive
separators.

Add allowlist validation (^[a-zA-Z0-9._/@:-]+$) matching the pattern
already used in key-request.sh _try_load_env_var and validate_api_token,
making all three token-loading paths consistent.

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review feedback on token validation PR

- Update backslash test to expect validation failure (backslashes not
  valid in any known API token format; the old expectation was wrong
  after validation was added)
- Fix test so exit code comes from _load_token_from_config directly,
  not the trailing echo which always exits 0
- Add comment in shared/common.sh explaining why the pattern includes
  colon vs key-request.sh pattern (Fly.io FlyV1 tokens use colons)

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review feedback — widen token charset for base64 segments

The original regex rejected + and = which are valid base64 characters
found in API tokens (e.g. sk-or-v1-abc/def+ghi==). This caused a
pre-existing test to fail. Widen the allowlist to include + and =
while keeping the security comment documenting the pattern difference
with key-request.sh.

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 01:18:34 -05:00
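The newline injection risk comes from the line-oriented format that `curl -K -` reads from stdin. This sketch assumes the widened allowlist from the final fixup (letters, digits, and `._/@:+=-`) and a hypothetical function `auth_config` that emits one config line; `_curl_api` itself is not shown in this log.

```shell
# Hypothetical helper: emit a curl config line only for allowlisted tokens.
auth_config() {
  local token="$1"
  case $token in
    ''|*[!A-Za-z0-9._/@:+=-]*)
      # A newline here would start a new directive (url, output, ...) in the config.
      echo "refusing token with unexpected characters" >&2
      return 1
      ;;
  esac
  printf 'header = "Authorization: Bearer %s"\n' "$token"
}
```

The output could be piped into `curl -K - https://example.invalid/`, where a rejected token produces no config at all.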
A
53e6de7f55
fix: validate mock-curl-script.sh stays in sync with mock.sh in test-infra-sync (#1550)
The test-infra-sync test validates that mock.sh's _strip_api_base() and
_validate_body() cover all clouds with fixtures. However, the actual
runtime mock used by tests is mock-curl-script.sh, which has its own
copies of these functions. Nothing enforced these copies staying in sync,
so a contributor could update mock.sh to pass validation while the
runtime mock silently fails to handle new cloud URLs.

Add cross-file sync tests that verify both files handle the same cloud
patterns for _strip_api_base() and _validate_body(). Also refactor
helpers to accept content as a parameter for reuse across both files.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 01:18:15 -05:00
L
fe2c0b024b
fix: prevent premature exit when Fly.io CLI session access_token is null (#1551)
The polling loop in _try_fly_browser_auth() was returning immediately
on the first poll (t=2s) because:

  access_token=$(... "d.get('access_token','')")

When the JSON has "access_token": null (before the user completes
browser auth), Python's print(None) outputs the string "None".
Bash $() captures "None" as non-empty, passes [[ -n "$access_token" ]],
and returns it as the token — before the user even sees the browser.

Then _validate_fly_token(FLY_API_TOKEN="None") sends:
  Authorization: Bearer None
which Fly.io rejects with:
  verify: invalid token: no tokens found in header

Fix:
  d.get('access_token') or ''   →  None or '' = ''  (empty, keeps polling)
  + explicit != "None" guard for belt-and-suspenders

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 21:37:34 -08:00
L
4ae781d2a8
fix: remove 2>/dev/null from token validation calls in auth flow (#1549)
Token validation functions (test_hcloud_token, test_do_token,
test_daytona_token, _validate_fly_token) contain rich diagnostic
log_error/log_warn messages with error details and fix instructions.
Calling them with 2>/dev/null silently discarded all that output,
leaving users with no explanation when their token was rejected.

shared/common.sh — ensure_api_token_with_provider():
  Remove 2>/dev/null from "${test_func}" in both the env-var and
  config-file validation branches, so callers like test_hcloud_token
  can print API error details and remediation steps.

fly/lib/common.sh — ensure_fly_token():
  Remove 2>/dev/null from both _validate_fly_token calls (config-file
  path and post-browser-OAuth path) so users see why validation failed.

Note: Issue 1 (API polling in _poll_instance_once) is intentionally
left with 2>/dev/null — suppressing curl errors during a 60-iteration
polling loop prevents terminal flooding and is handled by '|| true'.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 21:27:42 -08:00
L
8c437435eb
fix: show Fly.io login URL immediately by removing 2>/dev/null suppression (#1548)
2>/dev/null on _try_fly_browser_auth() was swallowing all stderr,
including the auth URL printf and log_step messages that the user
needs to see for sandbox/headless environments.

Also add a 'Fetching Fly.io login URL...' log_step before the API
call so the user gets immediate feedback while the session is created
(the curl call can take 1-2 seconds before the URL is available).

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 21:12:17 -08:00
A
dce55a3f6c
fix: prevent Python code injection in shared utility functions (#1544)
Pass field names via sys.argv instead of interpolating bash variables
directly into Python source strings in extract_ssh_key_ids() and
_load_json_config_fields(). This aligns with the secure pattern already
used elsewhere (e.g., _try_load_env_var in key-request.sh).

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 23:37:24 -05:00
A
d6c53d838f
fix: source .spawnrc directly in agent launch commands for reliable env loading (#1546)
24 agent scripts (codex, opencode, kilocode, openclaw across 6 clouds) used
`source ~/.zshrc && <agent>` which loads env vars indirectly via a hook.
This fails silently when .zshrc has errors or the hook install was non-fatal,
causing agents to launch without OPENROUTER_API_KEY.

Change to `source ~/.spawnrc 2>/dev/null; source ~/.zshrc 2>/dev/null; <agent>`
which loads env vars directly (matching claude/zeroclaw pattern) and tolerates
.zshrc failures without blocking the agent.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 23:37:03 -05:00
A
af475629d8
fix: exclude echo -n from macos-compat MC002 rule to eliminate false positives (#1545)
The MC002 regex matched both `echo -e` and `echo -n`, but only
`echo -e` is non-portable on macOS bash 3.2. `echo -n` works fine
as a bash builtin. This caused 3 false positive errors (all TTY
probe patterns using `echo -n "" > /dev/tty`) making the linter
exit non-zero incorrectly.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 23:36:47 -05:00
L
031b8fbcf0
update: Claude Code description to match Anthropic's official branding (#1542)
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 20:19:28 -08:00
A
2c214c2f5b
fix: call ensure_jq before jq usage in hetzner and daytona libs (#1541)
Both cloud_authenticate() functions use jq for JSON construction in
create_server() but never verify jq is installed. On minimal Ubuntu/
Debian, Alpine, or fresh macOS without Homebrew, this causes a hard
failure with "jq: command not found" after the user has already entered
their API token. ensure_jq() in shared/common.sh auto-installs jq on
Linux/macOS -- wire it in before the first jq-dependent call.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 22:00:09 -05:00
A
aa0b182f71
fix: validate GCP_USERNAME before assignment to prevent injection (#1537)
* fix: validate GCP_USERNAME before assignment to prevent injection

Assign logname output to _username first, validate against
^[a-zA-Z0-9_-]+$ regex, then assign to GCP_USERNAME. This
ensures the validated value is what gets used in su commands.

Fixes #1536

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: validate whoami output in gcp/lib/common.sh main script

Apply same validation pattern to line 27 as was applied in cloud-init.
Assigns whoami output to temp var, validates against alphanumeric pattern,
then assigns to GCP_USERNAME only after validation passes.

Agent: security-auditor
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 20:38:34 -05:00
A
3c4d92cc9f
test: fix 25 test failures from mock.module global pollution and sandbox env (#1539)
Add autocomplete mock to 38 @clack/prompts mock.module declarations
that were missing it. Bun's mock.module is process-global, so when any
other test file's mock wins the race, p.autocomplete was undefined,
causing 17 cmd-interactive tests to fail non-deterministically.

Also guard sandbox-verification tests with describe.skipIf(!isSandboxed)
so the 8 meta-tests skip cleanly when running from repo root (where
bunfig.toml preload is not active) instead of failing.

Result: 6995 pass, 0 fail from cli/; 6978 pass, 0 fail, 17 skip from root.

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 19:36:20 -05:00
A
38b972f5ce
fix: use destroy_server for sprite delete to support org users (#1538)
The sprite case in buildDeleteScript called `sprite destroy` directly,
bypassing ensure_sprite_authenticated and destroy_server. This meant
SPRITE_ORG was never detected, so org users got "sprite not found"
errors and orphaned sprites continued incurring charges.

Align with every other cloud (hetzner, digitalocean, fly, gcp, aws,
daytona) by calling ensure_sprite_authenticated then destroy_server,
which applies _sprite_org_flags automatically.

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 19:34:58 -05:00
A
c7e1c73c8a
fix: unbreak spawn delete and align error handling conventions (#1534)
spawn delete was broken for all clouds because execDeleteServer passed
inline scripts (without shebangs) through runBash, which calls
validateScriptContent requiring a #! prefix. Extract spawnBash helper
and add runBashTrusted for locally-generated delete scripts that already
validate their inputs via validateServerIdentifier/validateMetadataValue.

Also fix instanceof Error usage in manifest.ts and history.ts to use
duck typing, matching the convention documented in index.ts and
commands.ts. Fix stale comment in security.ts that claimed colons were
in the server ID allowlist when the regex excludes them.

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 18:17:29 -05:00
A
7d83bb6191
test: sync 21 failing tests with current source behavior (#1535)
Tests fell out of sync with recent source changes:
- _display_and_select: check for "server types" (agnostic of UI path)
- opencode_install_cmd: check for "tr A-Z a-z" (new OS detection)
- _curl_api: test non-auth headers (auth now via -K stdin)
- ensure_gh_auth: use valid token prefix, match new log messages
- GITHUB_TOKEN piping: match _gh_token variable name
- daytona: remove from exec-based clouds (uses SSH)
- cmdrun/prompt-file: add --dry-run to prevent script execution timeouts
- sandbox: clean stale /root/subprocess-test.txt before assertion

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 18:17:21 -05:00
A
c69c12c8db
fix: validate RAW_BASE URL in update-check to prevent future injection (#1533)
Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 12:52:02 -05:00
A
2bb1b82bc3
fix: align tests with re-exec update behavior and sprite upload classification (#1532)
- update-check.test.ts: mock execFileSync for re-exec path added in eea43ad,
  account for findUpdatedBinary() "which spawn" call, update bare-spawn test
  to expect re-exec instead of "Run your spawn command again"
- upload-file-security.test.ts: fix sprite classification to match
  "sprite $(...) exec" with org flags; remove daytona from strict allowlist
  regression list (uses printf %q escaping, validated by general exec tests)
- version-comparison.test.ts: mock execFileSync for auto-update integration test

Agent: test-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 11:52:35 -05:00
A
4b9e6ae0b8
fix: validate agent.repo format in update.ts before passing to Bun.spawn (#1530)
Adds a GITHUB_REPO_PATTERN allowlist check to refreshAgentStats() so that
only well-formed owner/repo strings reach `gh api repos/…`.  A malicious
or corrupted manifest.json entry with shell metacharacters in the repo
field is now rejected before it is interpolated into the command argument.
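
A minimal sketch of such an allowlist check follows. The regular expression is an assumption made for illustration; the commit does not show the actual value of GITHUB_REPO_PATTERN, and `REPO_PATTERN` and `isValidRepo` are hypothetical names.

```typescript
// Assumed pattern, NOT the project's actual GITHUB_REPO_PATTERN:
// owner = alphanumerics and hyphens, starting with an alphanumeric;
// repo  = alphanumerics, dots, underscores and hyphens, but not "." or "..".
const REPO_PATTERN = /^[A-Za-z0-9][A-Za-z0-9-]*\/(?!\.{1,2}$)[A-Za-z0-9._-]+$/;

// Returns true only for well-formed "owner/repo" strings.
function isValidRepo(repo: unknown): repo is string {
  return typeof repo === "string" && REPO_PATTERN.test(repo);
}
```

With a check of this shape, a caller would skip or reject the manifest entry when `isValidRepo(agent.repo)` is false, so the value never reaches the command argument.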

Fixes #1527

Agent: security-auditor

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 11:51:31 -05:00
A
3570caa840
fix: accept localhost and hostnames in validateConnectionIP (#1531)
validateConnectionIP rejected "localhost" (written by local cloud) and
hostnames like "ssh.app.daytona.io" (written by Daytona), causing
mergeLastConnection to silently discard connection data. This broke
spawn list and spawn delete for these providers.

- Add "localhost" to CONNECTION_SENTINELS
- Add HOSTNAME_PATTERN for valid multi-label DNS hostnames
- Update tests: localhost now valid, add hostname acceptance/rejection tests
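
The three changes above can be sketched roughly as follows. Everything in this block is an illustrative assumption: the patterns, the IPv4-only address check, and the names `SENTINELS`, `IPV4`, `HOSTNAME` and `isValidConnectionHost` are not copied from security.ts.

```typescript
// Illustrative assumptions only; not the project's actual constants.

// Literal values that are accepted without further pattern checks.
const SENTINELS = new Set(["localhost"]);

// Dotted-quad IPv4 address with each octet in 0-255.
const IPV4 = /^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$/;

// Multi-label DNS hostname: at least two labels, each 1-63 characters,
// alphanumeric at both ends, 253 characters in total at most. The last
// label must start with a letter so malformed IPs are not accepted.
const HOSTNAME =
  /^(?=.{1,253}$)(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;

function isValidConnectionHost(value: string): boolean {
  return SENTINELS.has(value) || IPV4.test(value) || HOSTNAME.test(value);
}
```

In this sketch "localhost" and "ssh.app.daytona.io" are accepted, while a bare single label, a label starting with a hyphen, or an out-of-range address such as "1.2.3.999" is rejected.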

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 11:49:23 -05:00
A
6782618b7c
feat: add user-friendly cloud descriptions to manifest (#1529)
Replace technical API-focused cloud descriptions with short,
user-friendly ones that highlight pricing and key selling points.

Co-authored-by: lab <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 08:18:19 -08:00
L
eea43adcad
fix: re-exec with new binary after auto-update for all invocations (#1526)
Two bugs in reExecWithArgs():

1. args.length === 0 early exit:
   Running bare `spawn` (interactive picker) after an auto-update would
   print "Run your spawn command again" and exit, requiring the user to
   manually re-invoke. Now always re-exec so the new flow triggers
   immediately.

2. process.argv[1] stale binary path:
   If the installer places the updated binary in a different directory than
   the currently running binary (e.g. old: ~/.local/bin, new: /usr/local/bin),
   re-exec would run the stale binary. Fix: add findUpdatedBinary() which
   resolves via `which spawn` (PATH lookup) first, falling back to
   process.argv[1] only if which fails.
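
The lookup-then-fallback order described in item 2 could be sketched like this. The function name `resolveBinary` and the injected `which` callback are hypothetical, chosen so the example runs without spawning a process; the real findUpdatedBinary() is not shown in this commit message.

```typescript
// Hypothetical sketch; not the project's findUpdatedBinary().

// A PATH lookup: returns the resolved path, or throws if the command
// is not found. Injected so the logic can be exercised without a shell.
type Which = (cmd: string) => string;

// Prefer the binary found on PATH; fall back to the path of the
// currently running script only when the lookup fails or is empty.
function resolveBinary(which: Which, fallback: string): string {
  try {
    const resolved = which("spawn").trim();
    if (resolved.length > 0) return resolved;
  } catch {
    // Lookup failed: fall through to the fallback path.
  }
  return fallback;
}
```

In a real CLI the `which` callback might wrap something like `execFileSync("which", ["spawn"], { encoding: "utf8" })` from `node:child_process`, with `process.argv[1]` passed as the fallback; that wiring is an assumption here.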

Bump CLI version 0.5.17 → 0.5.18.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 10:26:02 -05:00
A
be48fe8576
fix: display spawn names in list output (#1523)
Users who name their spawns via the interactive "Name your spawn" prompt
cannot see those names in `spawn list` output. Multiple spawns of the
same agent/cloud combo (e.g. two "Claude Code on Hetzner") are
indistinguishable despite having different names.

Show the spawn name in both interactive picker labels and non-interactive
table output so users can tell their spawns apart.

Agent: ux-engineer

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 10:22:14 -05:00
A
fc87ebf939
fix: replace printf -v (bash 4.0+) with eval for macOS bash 3.2 compat (#1522)
printf -v was introduced in bash 4.0 but macOS ships bash 3.2.
_update_retry_interval() in shared/common.sh used printf -v and is called
from generic_ssh_wait and _cloud_api_retry_loop — meaning ALL SSH
connectivity checks and cloud API retries would fail on macOS with:
"printf: -v: invalid option"

Changes:
- shared/common.sh: replace printf -v with eval in _update_retry_interval()
- shared/common.sh: remove dead code in calculate_retry_backoff() where
  next_interval was computed but never used
- shared/key-request.sh: same printf -v fix
- test/macos-compat.sh: add MC013 rule to catch printf -v in future

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 10:20:12 -05:00
L
be176e4cdb
fix: confirm kebab resource name + improve Fly.io sandbox auth (#1525)
shared/common.sh — prompt_spawn_name():
  Replace log_info with safe_read so the user confirms (or overrides) the
  derived kebab-case resource name before it's used for any cloud resource:
    Spawn name (e.g. "My Dev Box"): My Claude Box
    Resource name [my-claude-box]: ⏎   ← press Enter to accept

fly/lib/common.sh — _try_fly_browser_auth():
  - Print auth URL prominently on its own line (not just as a warning)
    so sandbox users can copy-paste it into their local browser
  - Suppress open_browser errors (|| true) so the script doesn't abort
    if no browser is available
  - Add explicit sandbox hint while polling
  - After 120s timeout: offer manual API token entry as a last resort
    with a direct link to fly.io/dashboard → Tokens

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 07:12:49 -08:00
L
f2df9bffa5
feat: add gcp and aws (Lightsail) to featured_cloud for all agents (#1524)
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-02-20 07:08:02 -08:00
A
eff1dc2512
fix: repair Daytona delete and Fly.io reconnect in spawn list (#1521)
- Remove nonexistent `ensure_daytona_cli` call from Daytona delete script
  (causes "command not found" error when running `spawn delete` on Daytona)
- Add Fly.io SSH handler in cmdConnect to use `fly ssh console -a NAME`
  instead of falling through to broken `ssh root@fly-ssh` path

Agent: code-health

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 09:26:31 -05:00
A
7b6d6eed3b
fix: replace hardcoded history path in security.ts error messages (#1520)
* fix: replace hardcoded ~/.spawn/history.json path in security.ts error messages

Error messages in security validation functions (validateConnectionIP,
validateUsername, validateServerIdentifier, validateMetadataValue) hardcoded
~/.spawn/history.json as the fix path. This is wrong when SPAWN_HOME is set,
directing users to a nonexistent file. Replace all 9 occurrences with
'spawn list --clear' which works regardless of SPAWN_HOME and is simpler
than manually editing JSON.

Agent: ux-engineer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: bump cli version to 0.5.17

Required by CLAUDE.md: any change to cli/ needs a version bump.
PR #1520 changes security.ts error messages (cli/ change).

Agent: pr-maintainer
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-20 08:37:01 -05:00