Commit graph

24 commits

Author SHA1 Message Date
A
4e33cc39cd
fix: address medium security findings from #753 (#755)
- Replace `echo -e` with `printf` in cli/install.sh for macOS bash 3.x compat
- Remove `-u` (nounset) from test/run.sh — use `${VAR:-}` pattern instead
- Replace `source <(curl ...)` with `eval "$(curl ...)"` in test/run.sh for curl|bash compat
- Add .gitignore patterns for sensitive files (.env, *.pem, *.key, credentials)

Refs #753

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:48:52 -08:00
A
cec1806128
refactor: improve readability of config setup and shellcheck discovery (#744)
- Replace hardcoded 4-cloud script list in run_shellcheck with dynamic
  discovery that covers all 21 clouds automatically
- Convert 3 inline JSON templates (setup_claude_code_config,
  setup_openclaw_config, setup_continue_config) from single-line printf
  to readable heredocs while preserving json_escape security

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-12 15:19:11 -08:00
A
35997c8ae5
refactor: extract helpers from run_test() in test/mock.sh (#713)
Break down the 150-line run_test() function into focused helpers:
- run_script_with_timeout(): script execution with env vars and timeout
- show_failure_output(): display last 20 lines on failure
- assert_error_scenario(): handle error scenario assertions
- assert_cloud_api_calls(): cloud-specific API call assertions
- record_test_result(): write pass/fail to RESULTS_FILE

run_test() is now 57 lines (62% reduction), each helper is under 35 lines.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:01:49 -08:00
A
ea943d1583
refactor: decompose 287-line setup_mock_curl into named helpers (#718)
The mock curl heredoc script was a monolithic 287-line function with
inline arg parsing, error injection, URL routing, body validation,
fixture lookup, and state tracking all in one flow.

Extract 10 focused helper functions within the heredoc:
- _parse_args: curl argument parsing
- _maybe_inject_error: MOCK_ERROR_SCENARIO handling
- _handle_special_urls: install scripts, OpenRouter, spawn repo
- _strip_api_base: URL-to-endpoint mapping for 14 cloud APIs
- _check_fields / _validate_body: POST body validation
- _try_fixture: fixture file lookup
- _synthetic_active_response: cloud-specific GET-by-ID responses
- _respond_get / _respond_post: METHOD-based response routing
- _track_state: creation/deletion state tracking

The main logic is now a 26-line sequence of named function calls,
making the mock's control flow immediately readable.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:01:41 -08:00
A
9a851b36b6
refactor: extract assert_equals/assert_match helpers in test/run.sh (#727)
Replace 36 inline if/else assertion blocks across 9 test functions with
calls to two new reusable helpers (assert_equals, assert_match). Reduces
test/run.sh by 126 lines (794 -> 668) while keeping all 79 tests passing.

Key functions reduced:
- _test_open_browser: 53 -> 36 lines (-32%)
- _test_ssh_key_utils: 48 -> 26 lines (-46%)
- _test_cloud_init: 41 -> 22 lines (-46%)
- _test_oauth_functions: 39 -> 23 lines (-41%)
- _test_ssh_wait: 33 -> 21 lines (-36%)

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:00:59 -08:00
A
d5d7da0833
refactor: decompose setup_mock_agents and record_cloud into helpers (#722)
- Extract _create_logging_mock and _create_silent_mock from setup_mock_agents
  (test/mock.sh) to eliminate repetitive mock creation patterns
- Extract _record_ensure_credentials, _record_endpoint, and
  _record_write_metadata from record_cloud (test/record.sh) to separate
  credential checking, API recording, and metadata writing concerns

Pure refactoring — no behavior changes.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 15:00:56 -08:00
Ahmed Abushagur
1ad2371a25
feat: qa bot and emails (#565) 2026-02-11 20:19:45 -08:00
A
385a8a9b56
refactor: split 3 large test functions in test/run.sh into focused units (#544)
- _test_browser_and_cloud_init (94 lines) -> _test_open_browser (54) + _test_cloud_init (42)
- test_common_source (87 lines) -> _test_sprite_functions_and_syntax + _test_sprite_log_and_name + _test_sprite_remote_source
- _test_json_ssh_utils (59 lines) -> _test_json_escape + _test_ssh_key_utils (49)

All 75 tests pass. No behavioral changes.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 15:34:06 -08:00
A
7ba3559773
refactor: extract helpers from test_shared_common to reduce complexity (#511)
Break the 415-line test_shared_common() function in test/run.sh into
7 focused sub-functions grouped by feature:
- _test_model_validation (validate_model_id tests)
- _test_json_ssh_utils (json_escape, SSH key ops)
- _test_syntax_and_logging (syntax check, logging functions)
- _test_browser_and_cloud_init (open_browser, cloud-init, connectivity)
- _test_oauth_functions (wait_for_oauth_code, cleanup_oauth_session)
- _test_ssh_wait (generic_ssh_wait success/failure)
- _test_input_and_server_validation (safe_read, validate_server_name)

Also add assert_common_succeeds and assert_common_fails helpers to
eliminate repeated test boilerplate for simple pass/fail assertions.

Agent: complexity-hunter

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-02-11 12:53:40 -08:00
A
88fa9e48e6
fix: prevent shell/Python injection in env var and credential handling (#443)
- binarylane/continue.sh: Replace unsafe inline echo with inject_env_vars_ssh
  to prevent command injection if OPENROUTER_API_KEY contains single quotes
- test/record.sh: Pass credential values via sys.argv instead of interpolating
  into Python string literals to prevent Python injection

Agent: security-auditor

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 04:50:34 -08:00
A
1576577ed8
feat: Add RamNode cloud provider with OpenStack API support (#408)
Add RamNode budget VPS cloud provider ($0.006/hr) with full OpenStack API integration.

Implementation:
- ramnode/lib/common.sh: OpenStack Keystone v3 auth + Compute API wrapper
- ramnode/claude.sh, ramnode/aider.sh, ramnode/goose.sh: 3 agent scripts
- manifest.json: Added ramnode cloud entry + 15 matrix entries (3 implemented)
- ramnode/README.md: Complete documentation
- test/record.sh: Live cycle testing for RamNode (_live_ramnode function)
- test/mock.sh: URL stripping for Identity/Compute/Network APIs

Technical details:
- Auth: RAMNODE_USERNAME + RAMNODE_PASSWORD + RAMNODE_PROJECT_ID
- APIs: Identity (5000/v3), Compute (8774/v2.1), Network (9696/v2.0)
- Token-based authentication (X-Auth-Token header)
- Server provisioning with cloud-init via base64-encoded userdata
- SSH key management via OpenStack keypairs API

Agent: cloud-scout-1

Co-authored-by: B <6723574+louisgv@users.noreply.github.com>
2026-02-11 01:36:02 -08:00
A
5181f28704
feat: Add local cloud provider for running agents on local machine (#381)
Adds a "local" cloud provider that installs and runs agents directly on the
user's machine without any cloud provisioning. This is useful for local
development and testing.

- local/lib/common.sh: Cloud lib with local execution functions
- local/claude.sh: Claude Code agent script
- local/openclaw.sh: OpenClaw agent script
- local/nanoclaw.sh: NanoClaw agent script
- manifest.json: Added local cloud + matrix entries
- test/: Updated record.sh and mock.sh for local cloud support

Fixes #378

Agent: issue-fixer

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 00:50:05 -08:00
Ahmed Abushagur
8b9f9a0e5a
QA-Bot setup (#335)
* feat: testing

* feat: auto-fix dead apis

* fix: mock works

* feat: new fixtures

* fix: more clouds tested

* fix: dry run fix

* fix: civo valid size

* fix: civo result wait

* feat: fixtures

* feat: per cloud agent
2026-02-10 19:51:07 -08:00
Sprite
b2e2462f0d fix: Poll for sprite provisioning instead of blind sleep
ensure_sprite_exists() now polls `sprite list` until the sprite
appears (up to 30s) instead of a fixed sleep. This eliminates the
spurious "sprite not found" errors that appeared while the sprite
was still provisioning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-11 00:52:35 +00:00
A
bbbe815035
refactor: Security fixes, complexity reduction, and UX improvements (#58)
Security:
- Fix command injection in modal/lib/common.sh (run_server, upload_file, interactive_session)
- Fix command injection in fly/lib/common.sh (run_server, upload_file, interactive_session)
- All container providers now use printf '%q' for proper shell escaping

Complexity:
- Extract _api_should_retry_on_error() helper in shared/common.sh (-19 lines)
- Refactor scaleway_api and upcloud_api to use shared retry helper (-24 lines)
- Extract _save_fly_token() helper in fly/lib/common.sh (-11 lines)
- Extract validateAndGetAgent() in commands.ts, reducing cmdRun/cmdAgentInfo duplication
- Refactor cmdList column width calculation to use calculateColumnWidth()

UX:
- Add actionable next steps to error messages in shared/common.sh
- Improve CLI bash fallback error messages with guidance (spawn.sh)
- Add OAuth progress indicator during browser authentication wait
- Show invalid model ID value and link to openrouter.ai/models
- Add troubleshooting steps for agent installation failures

Tests:
- Update test assertions in test/run.sh to match refactored patterns
- All tests passing: 74 TypeScript + 75 bash = 149 total, 0 failures

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 17:09:27 -08:00
Sprite
3ae83aa867 fix: Fix 4 failing claude.sh tests
Root causes:
- `clear` command fails with exit 1 when TERM is not set (test env has
  no terminal), crashing the script due to set -e. Guard with || true.
- Test patterns for Claude settings/state uploads used old temp file
  naming convention (/tmp/claude_settings, /tmp/claude_global) that no
  longer matches the paths generated by upload_config_file +
  upload_file_sprite (/tmp/*settings.json, /tmp/*.claude.json).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 18:26:53 +00:00
Sprite
8302cafbd7 Remove stale tests and fix echo -e in test harness
Remove tests for deleted nc_listen and create_oauth_response_html
functions. Replace echo -e with printf for macOS bash 3.x compat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 05:13:37 +00:00
Sprite
ce0f2ce7fb refactor: Add default case to script-specific assertions
Added default '*) ' case to handle agents without specific assertions,
resolving SC2249 info warning and improving code clarity.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 03:56:29 +00:00
Sprite
326850dc17 refactor: fix SC2188 and SC2155 warnings in test suite
- Fix SC2188: Use proper null command syntax (: >) for log truncation
- Fix SC2155: Separate local declarations from command substitutions
  - leaked_temps, missing, mtime_before/after, test_str, long_name
- Prevents masking return values in test harness
2026-02-08 03:12:53 +00:00
Sprite
cabdbc37ba refactor: add pipefail to error handling flags
Changed 65 agent scripts from `set -e` to `set -eo pipefail` to ensure
errors in piped commands are properly caught. This prevents silent
failures when commands like `curl | bash` fail in the middle.

Files updated across all cloud providers:
- aws-lightsail: 10 scripts
- digitalocean: 3 scripts
- e2b: 10 scripts
- gcp: 10 scripts
- hetzner: 3 scripts
- lambda: 10 scripts
- linode: 3 scripts
- modal: 10 scripts
- sprite: 3 scripts
- vultr: 3 scripts

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 02:34:45 +00:00
Sprite
3b6c761904 refactor: add username parameter to generic_ssh_wait
- Add required username parameter to generic_ssh_wait()
- Update SSH command to use dynamic username instead of hardcoded "root"
- Update all existing callers to pass username explicitly
- Enables GCP and AWS Lightsail to adopt generic_ssh_wait in future

Score: 40 (Impact: 8, Confidence: 10, Risk: 2)
2026-02-08 01:58:48 +00:00
Sprite
39ee858943 security: fix SC2064 trap quoting to prevent early variable expansion
Changed trap commands from double quotes to single quotes so variables
expand at trap execution time instead of definition time. This prevents
security issues where variables could be tampered with between trap
definition and execution.

Fixed 3 instances:
- cli/install.sh (2 instances): trap 'rm -rf "$tmpdir"' EXIT
- test/run.sh (1 instance): trap 'cleanup' EXIT

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 01:29:56 +00:00
L
3fb2e77b03
Autonomous refactoring: 5 rounds, ~1,400 lines eliminated, production-ready
Five rounds of autonomous AI agent team refactoring with security fixes, code consolidation, and expanded test coverage.
2026-02-08 00:06:46 +00:00
L
f43c52eb61
Add NanoClaw spawn script (#2)
* Add NanoClaw spawn script

NanoClaw is a lightweight WhatsApp-based Claude AI assistant that runs
agents in isolated containers. This script sets up a sprite with
nanoclaw pre-configured: clones the repo, installs dependencies,
configures the API key, and launches in dev mode for WhatsApp QR auth.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix verify_sprite_connectivity exiting script early after single failed check

Retry connectivity up to 6 attempts (30s) instead of trying once and
silently continuing, which caused the next sprite exec to fail under set -e.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add test harness for spawn scripts

Mocks the sprite CLI and runs each script end-to-end verifying:
- common.sh sources correctly and all functions resolve
- Log functions write to stderr (not stdout)
- Env var flow (SPRITE_NAME, OPENROUTER_API_KEY)
- Sprite commands called in correct order
- Temp files created and cleaned up
- Each script reaches its final interactive launch

Usage: bash test/run.sh

42 tests, all passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-06 21:23:18 -08:00