Mirror of https://github.com/OpenRouterTeam/spawn.git
Synced 2026-04-28 03:49:31 +00:00
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

  aider-chat on Python 3.13 fails with `ImportError: cannot import name '_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved: those releases have no Python 3.13 binary wheels, so the C extension is missing at runtime. Replace `--with 'Pillow>=10.2.0'` (which was silently broken: the `>` and single quotes get mangled by `printf '%q'` in run_server before the command reaches the remote machine) with `--upgrade`, which forces all transitive deps, including Pillow, to their latest compatible versions. Also add a plain-text echo before the install so users see progress instead of a silent hang during the 2-4 minute install.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

  The install method for aider, gptme, and open-interpreter was changed from pip to `uv tool install` across all clouds, but the mock test assertions still checked for the old `pip.*install.*` patterns, causing 9 failures (3 agents × 3 clouds). Update the patterns to match the `uv tool install` commands now used in all cloud scripts.
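The `printf '%q'` mangling described above is easy to reproduce locally. A minimal illustrative sketch, not the repo's actual run_server code: `%q` backslash-escapes shell metacharacters, so a constraint like `'Pillow>=10.2.0'` survives only if the remote side strips exactly one layer of escaping.

```shell
#!/bin/bash
# Illustrative only: show how printf '%q' rewrites a command containing
# quotes and '>' before it is forwarded to a remote shell.
cmd="uv tool install --with 'Pillow>=10.2.0' aider-chat"
escaped=$(printf '%q' "$cmd")
echo "$escaped"
# The quotes and '>' are now backslash-escaped; if the forwarding layer adds
# another level of quoting, those backslashes reach the remote shell as
# literal characters and the version constraint is garbled.
```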
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

  - Add `< /dev/null` to ssh_run_server and generic_ssh_wait so SSH cannot steal stdin and hang the sequential install/verify/configure steps
  - Add ServerAliveInterval, ServerAliveCountMax, and ConnectTimeout to the default SSH_OPTS so long-running installs don't silently drop on flaky networks
  - Remove `2>/dev/null` from the Fly.io run_server so remote command errors are no longer silently swallowed (the --quiet flag still suppresses flyctl noise)
  - Fix Fly.io `printf '%q'` double-quoting: remove the extra quotes around $escaped_cmd that prevented the remote shell from consuming escapes, breaking the && || | operators in commands
  - Remove the broken `printf '%q'` from the Daytona run_server and interactive_session, where it escaped shell operators into literal characters, since daytona exec has no intermediate shell layer
  - Pin aider to --python 3.12 instead of --with audioop-lts across all clouds

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

  `fly ssh console -C` does not allocate a pseudo-terminal by default, so interactive TUI agents (aider, claude) fail with "Input is not a terminal (fd=0)" or have completely unresponsive input. Adding --pty forces PTY allocation, matching how the other clouds handle interactive sessions (SSH uses -t, Sprite uses -tty).

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prepend ~/.local/bin to PATH in ssh_run_server

  After uv installs to ~/.local/bin, the current shell session doesn't have it in PATH, causing "uv: command not found" on DigitalOcean and all other SSH-based clouds (Hetzner, AWS, GCP, OVH). Fly.io's run_server already prepends this PATH; now the shared ssh_run_server does the same, fixing all SSH-based clouds at once.
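The stdin-theft hang fixed by `< /dev/null` can be demonstrated without SSH. In this sketch (not the project's code), `cat` stands in for `ssh`, since both drain their stdin greedily:

```shell
#!/bin/bash
# Without a stdin redirect, the inner command swallows the rest of the pipe
# feeding the loop, so only the first item is processed.
printf 'a\nb\nc\n' | while read -r host; do
  cat >/dev/null              # stands in for: ssh "$host" install-step
  echo "ran $host"
done                          # prints only: ran a

printf 'a\nb\nc\n' | while read -r host; do
  cat >/dev/null < /dev/null  # the fix: give the command an empty stdin
  echo "ran $host"
done                          # prints: ran a, ran b, ran c
```

`ssh -n` has the same effect as the redirect; the repo's fix redirects explicitly so it also covers the `generic_ssh_wait` polling loop.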
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add Node.js to cloud-init for all cloud providers

  npm-based agents (codex, kilocode, etc.) fail with "npm: command not found" because Node.js isn't installed during cloud-init. Fly.io was the only provider installing Node.js (in wait_for_cloud_init). Now all cloud-init scripts install Node.js v22 LTS from nodesource, matching Fly.io's setup. Also add ~/.local/bin to PATH in the AWS and GCP cloud-init (it was already present in shared/DigitalOcean/Hetzner).

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use apt packages for nodejs/npm instead of nodesource

  The nodesource setup script (setup_22.x) runs its own apt-get update and repository configuration, nearly doubling cloud-init time and causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm in its default repos, so just add them to the packages list.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add timeouts and better error handling to Daytona CLI commands

  Daytona CLI commands (login, list, create) can hang indefinitely when the API is slow or unreachable. This causes:
  - "Failed to create sandbox: timeout" with no recovery
  - Token validation timeouts misreported as "invalid token"
  - Users re-entering valid tokens that also time out

  Fixes:
  - Wrap all daytona CLI calls with timeout (30s for auth, 120s for create)
  - Detect timeout errors separately from auth errors
  - Show actionable "try again / check status" messages for timeouts
  - Add nodejs/npm to the Daytona wait_for_cloud_init

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: set DAYTONA_API_URL to Daytona Cloud by default

  The Daytona CLI may default to connecting to a local self-hosted server instead of Daytona Cloud. Without DAYTONA_API_URL set to https://app.daytona.io/api, every CLI command (login, list, create) hangs trying to reach a non-existent local server and times out.
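A sketch of the timeout wrapping described in the Daytona commit. `run_with_timeout` is an illustrative name, not the repo's actual helper; the one real behavior it relies on is that coreutils `timeout` exits with status 124 when it kills the command:

```shell
#!/bin/bash
# Illustrative wrapper: distinguish a slow or unreachable API (exit 124 from
# coreutils timeout) from an ordinary CLI/auth failure.
run_with_timeout() {
  local secs="$1"; shift
  timeout "$secs" "$@"
  local rc=$?
  if [[ ${rc} -eq 124 ]]; then
    echo "error: '$1' timed out after ${secs}s; the API may be slow or unreachable" >&2
  fi
  return "${rc}"
}

# Usage, mirroring the commit's budgets (30s for auth calls, 120s for create):
#   run_with_timeout 30  daytona list
#   run_with_timeout 120 daytona create "${name}"
```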
  The SDK documents this URL as the default, but the CLI doesn't always pick it up, so we now export it explicitly.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing

  n installs Node.js v22 to /usr/local/bin/node, but apt's v18 at /usr/bin/node can shadow it in non-interactive SSH sessions. After `n 22`, symlink the new binaries over the apt ones so v22 is always resolved. Also fix hcloud CLI token extraction for the new TOML format.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review, add curl timeouts to trigger workflows

  - Fix the ssh_run_server command injection concern: use a single-quoted path_prefix so $HOME/$PATH expand remotely, not locally
  - Add `--connect-timeout 15 --max-time 30` to the trigger workflows to prevent 5-minute hangs when the server streams responses
  - Handle 409 (dedup) as success; it is expected when cron fires every 15 minutes but cycles take 35
  - Reduce the workflow timeout-minutes from 5 to 2

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
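The single-quoting fix from the security-review commit hinges on which shell expands `$HOME`. A minimal sketch (the variable name mirrors the commit; the ssh line is illustrative and not executed here):

```shell
#!/bin/bash
# Single quotes defer expansion: $HOME and $PATH stay literal on the local
# machine and are expanded by the remote shell that finally runs the command.
path_prefix='export PATH="$HOME/.local/bin:$PATH"; '
# Double quotes would have expanded them locally, baking the *local* home
# directory into the remote command:
#   path_prefix="export PATH=\"$HOME/.local/bin:$PATH\"; "
echo "$path_prefix"
# Forwarding sketch (not run):
#   ssh "ubuntu@${ip}" "${path_prefix}uv tool install aider-chat"
```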
248 lines
9.4 KiB
Bash
#!/bin/bash
# Common bash functions for AWS Lightsail spawn scripts
# Uses the AWS CLI (aws lightsail); requires `aws` to be configured with credentials

# Bash safety flags
set -eo pipefail

# ============================================================
# Provider-agnostic functions
# ============================================================

# Source shared provider-agnostic functions (local or remote fallback)
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" 2>/dev/null && pwd)"
if [[ -n "${SCRIPT_DIR}" && -f "${SCRIPT_DIR}/../../shared/common.sh" ]]; then
    source "${SCRIPT_DIR}/../../shared/common.sh"
else
    eval "$(curl -fsSL https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/shared/common.sh)"
fi

# Note: provider-agnostic functions (logging, OAuth, browser, nc_listen) live in shared/common.sh

# ============================================================
# AWS Lightsail specific functions
# ============================================================

SPAWN_DASHBOARD_URL="https://lightsail.aws.amazon.com/"
# SSH_OPTS is defined in shared/common.sh

# Configurable timeout/delay constants
INSTANCE_STATUS_POLL_DELAY=${INSTANCE_STATUS_POLL_DELAY:-5}  # Delay between instance status checks

ensure_aws_cli() {
    if ! command -v aws &>/dev/null; then
        _log_diagnostic \
            "AWS CLI is required but not installed" \
            "aws command not found in PATH" \
            --- \
            "Install the AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html" \
            "Or on macOS: brew install awscli"
        return 1
    fi
    # Verify credentials are configured
    if ! aws sts get-caller-identity &>/dev/null; then
        _log_diagnostic \
            "AWS CLI is not configured with valid credentials" \
            "No AWS credentials found or credentials have expired" \
            --- \
            "Run: aws configure" \
            "Or set environment variables: export AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=..."
        return 1
    fi
    local region="${AWS_DEFAULT_REGION:-${LIGHTSAIL_REGION:-us-east-1}}"
    export AWS_DEFAULT_REGION="${region}"
    log_info "Using AWS region: ${region}"
}

ensure_ssh_key() {
    local key_path="${HOME}/.ssh/id_ed25519"
    local pub_path="${key_path}.pub"

    # Generate key if needed
    generate_ssh_key_if_missing "${key_path}"

    # Validate SSH public key path before upload
    if [[ ! -f "${pub_path}" ]]; then
        log_error "SSH public key not found: ${pub_path}"
        return 1
    fi
    if [[ -L "${pub_path}" ]]; then
        log_error "SSH public key cannot be a symlink: ${pub_path}"
        return 1
    fi
    # SSH public keys are typically 100-600 bytes (ed25519/RSA)
    # Reject suspiciously large files to prevent arbitrary file upload
    local size
    size=$(wc -c <"${pub_path}")
    if [[ ${size} -gt 10000 ]]; then
        log_error "SSH public key file too large: ${size} bytes (max 10000)"
        return 1
    fi

    local key_name="spawn-key"

    # Check if already registered
    if aws lightsail get-key-pair --key-pair-name "${key_name}" &>/dev/null; then
        log_info "SSH key already registered with Lightsail"
        return 0
    fi

    log_step "Importing SSH key to Lightsail..."
    # --public-key-base64 accepts the OpenSSH key directly (not base64-wrapped)
    aws lightsail import-key-pair \
        --key-pair-name "${key_name}" \
        --public-key-base64 "$(cat "${pub_path}")" \
        >/dev/null 2>&1 || {
        # Race condition: another process may have imported it
        if aws lightsail get-key-pair --key-pair-name "${key_name}" &>/dev/null; then
            log_info "SSH key already registered with Lightsail"
            return 0
        fi
        log_error "Failed to import SSH key to Lightsail"
        return 1
    }
    log_info "SSH key imported to Lightsail"
}

get_server_name() {
    get_resource_name "LIGHTSAIL_SERVER_NAME" "Enter Lightsail instance name: "
}

get_cloud_init_userdata() {
    cat << 'CLOUD_INIT_EOF'
#!/bin/bash
apt-get update -y
apt-get install -y curl unzip git zsh nodejs npm
# Upgrade Node.js to v22 LTS (apt ships v18; agents like Cline need v20+)
# n installs to /usr/local/bin, but apt's v18 in /usr/bin can shadow it, so symlink over it
npm install -g n && n 22 && ln -sf /usr/local/bin/node /usr/bin/node && ln -sf /usr/local/bin/npm /usr/bin/npm && ln -sf /usr/local/bin/npx /usr/bin/npx
# Install Bun
su - ubuntu -c 'curl -fsSL https://bun.sh/install | bash'
# Install Claude Code
su - ubuntu -c 'curl -fsSL https://claude.ai/install.sh | bash'
# Configure PATH
echo 'export PATH="${HOME}/.claude/local/bin:${HOME}/.local/bin:${HOME}/.bun/bin:${PATH}"' >> /home/ubuntu/.bashrc
echo 'export PATH="${HOME}/.claude/local/bin:${HOME}/.local/bin:${HOME}/.bun/bin:${PATH}"' >> /home/ubuntu/.zshrc
chown ubuntu:ubuntu /home/ubuntu/.bashrc /home/ubuntu/.zshrc
touch /home/ubuntu/.cloud-init-complete
chown ubuntu:ubuntu /home/ubuntu/.cloud-init-complete
CLOUD_INIT_EOF
}

# Wait for a Lightsail instance to reach "running" and capture its public IP
# Sets: LIGHTSAIL_SERVER_IP
# Usage: _wait_for_lightsail_instance NAME [MAX_ATTEMPTS]
_wait_for_lightsail_instance() {
    local name="${1}"
    local max_attempts=${2:-60}
    local attempt=1

    log_step "Waiting for instance to become running..."
    while [[ ${attempt} -le ${max_attempts} ]]; do
        local state
        state=$(aws lightsail get-instance --instance-name "${name}" \
            --query 'instance.state.name' --output text 2>/dev/null)

        if [[ "${state}" == "running" ]]; then
            LIGHTSAIL_SERVER_IP=$(aws lightsail get-instance --instance-name "${name}" \
                --query 'instance.publicIpAddress' --output text)
            export LIGHTSAIL_SERVER_IP
            log_info "Instance running: IP=${LIGHTSAIL_SERVER_IP}"
            return 0
        fi
        log_step "Instance state: ${state} (${attempt}/${max_attempts})"
        sleep "${INSTANCE_STATUS_POLL_DELAY}"
        attempt=$((attempt + 1))
    done

    log_error "Instance did not become running after ${max_attempts} checks"
    log_warn "The instance may still be provisioning. You can:"
    log_warn "  1. Re-run the command to try again"
    log_warn "  2. Check the instance status: aws lightsail get-instance --instance-name '${name}'"
    log_warn "  3. Check the Lightsail console: https://lightsail.aws.amazon.com/"
    return 1
}

create_server() {
    local name="${1}"
    local bundle="${LIGHTSAIL_BUNDLE:-medium_3_0}"
    local region="${AWS_DEFAULT_REGION:-us-east-1}"
    local az="${region}a"
    local blueprint="ubuntu_24_04"

    # Validate env var inputs to prevent command injection
    validate_resource_name "${bundle}" || { log_error "Invalid LIGHTSAIL_BUNDLE"; return 1; }
    validate_region_name "${region}" || { log_error "Invalid AWS_DEFAULT_REGION"; return 1; }

    log_step "Creating Lightsail instance '${name}' (bundle: ${bundle}, AZ: ${az})..."

    local userdata
    userdata=$(get_cloud_init_userdata)

    if ! aws lightsail create-instances \
        --instance-names "${name}" \
        --availability-zone "${az}" \
        --blueprint-id "${blueprint}" \
        --bundle-id "${bundle}" \
        --key-pair-name "spawn-key" \
        --user-data "${userdata}" \
        >/dev/null; then
        log_error "Failed to create Lightsail instance"
        log_warn "Common issues:"
        log_warn "  - Instance limit reached for your account"
        log_warn "  - Bundle unavailable in region (try a different LIGHTSAIL_BUNDLE or LIGHTSAIL_REGION)"
        log_warn "  - AWS credentials lack Lightsail permissions (check the IAM policy)"
        log_warn "  - Instance name '${name}' already in use"
        return 1
    fi

    export LIGHTSAIL_INSTANCE_NAME="${name}"
    log_info "Instance creation initiated: ${name}"

    _wait_for_lightsail_instance "${name}"

    save_vm_connection "${LIGHTSAIL_SERVER_IP}" "ubuntu" "" "${name}" "aws"
}

# Lightsail uses the 'ubuntu' user, not 'root'
SSH_USER="ubuntu"

# SSH operations delegate to the shared helpers
verify_server_connectivity() { ssh_verify_connectivity "$@"; }
run_server() { ssh_run_server "$@"; }
upload_file() { ssh_upload_file "$@"; }
interactive_session() { ssh_interactive_session "$@"; }

wait_for_cloud_init() {
    local ip="${1}"
    local max_attempts=${2:-60}

    # First ensure SSH connectivity is established
    ssh_verify_connectivity "${ip}" 30 5 || return 1

    # Then wait for the cloud-init completion marker
    generic_ssh_wait "ubuntu" "${ip}" "${SSH_OPTS}" "test -f /home/ubuntu/.cloud-init-complete" "cloud-init" "${max_attempts}" 5
}

destroy_server() {
    local name="${1}"
    log_step "Destroying Lightsail instance ${name}..."
    aws lightsail delete-instance --instance-name "${name}" >/dev/null
    log_info "Instance ${name} destroyed"
}

list_servers() {
    aws lightsail get-instances --query 'instances[].{Name:name,State:state.name,IP:publicIpAddress,Bundle:bundleId}' --output table
}

# ============================================================
# Cloud adapter interface
# ============================================================

cloud_authenticate() { ensure_aws_cli; ensure_ssh_key; }
cloud_provision() { create_server "$1"; }
cloud_wait_ready() { verify_server_connectivity "${LIGHTSAIL_SERVER_IP}"; wait_for_cloud_init "${LIGHTSAIL_SERVER_IP}" 60; }
cloud_run() { run_server "${LIGHTSAIL_SERVER_IP}" "$1"; }
cloud_upload() { upload_file "${LIGHTSAIL_SERVER_IP}" "$1" "$2"; }
cloud_interactive() { interactive_session "${LIGHTSAIL_SERVER_IP}" "$1"; }
cloud_label() { echo "Lightsail instance"; }