mirror of
https://github.com/OpenRouterTeam/spawn.git
synced 2026-04-28 03:49:31 +00:00
* fix: use uv --upgrade to ensure Python 3.13-compatible Pillow across all clouds

aider-chat on Python 3.13 fails with `ImportError: cannot import name '_imaging' from 'PIL'` when an old Pillow version (pre-10.4) is resolved; those releases have no Python 3.13 binary wheels, so the C extension is missing at runtime.

Replace `--with 'Pillow>=10.2.0'` (which was silently broken: the `>` and single quotes get mangled by `printf '%q'` in run_server before the command reaches the remote machine) with `--upgrade`, which forces all transitive deps including Pillow to their latest compatible versions.

Also adds a plain-text echo before the install so users see progress instead of a silent hang during the 2-4 minute install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: update aider/gptme/interpreter assertions from pip to uv

The install method for aider, gptme, and open-interpreter was changed from pip to `uv tool install` across all clouds. The mock test assertions still checked for the old `pip.*install.*` patterns, causing 9 failures (3 agents × 3 clouds). Update patterns to match the actual `uv tool install` commands now used in all cloud scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger test run for uv assertion fix

* fix: prevent SSH hangs, restore stderr, fix command escaping across clouds

- Add < /dev/null to ssh_run_server and generic_ssh_wait to prevent SSH stdin theft causing sequential install/verify/configure steps to hang
- Add ServerAliveInterval, ServerAliveCountMax, ConnectTimeout to default SSH_OPTS so long-running installs don't silently drop on flaky networks
- Remove 2>/dev/null from Fly.io run_server so remote command errors are no longer silently swallowed (--quiet flag still suppresses flyctl noise)
- Fix Fly.io printf '%q' double-quoting: remove extra quotes around $escaped_cmd that prevented the remote shell from consuming escapes, breaking && || | operators in commands
- Remove broken printf '%q' from Daytona run_server and interactive_session where it escaped shell operators into literal characters since daytona exec has no intermediate shell layer
- Pin aider to --python 3.12 instead of --with audioop-lts across all clouds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add --pty to fly ssh console for interactive sessions

fly ssh console -C does not allocate a pseudo-terminal by default, causing interactive TUI agents (aider, claude) to fail with "Input is not a terminal (fd=0)" or completely unresponsive input. Adding --pty forces PTY allocation, matching how other clouds handle interactive sessions (SSH uses -t, Sprite uses -tty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prepend ~/.local/bin to PATH in ssh_run_server

After uv installs to ~/.local/bin, the current shell session doesn't have it in PATH, causing "uv: command not found" on DigitalOcean and all other SSH-based clouds (Hetzner, AWS, GCP, OVH). Fly.io's run_server already prepends this PATH; now the shared ssh_run_server does the same, fixing all SSH-based clouds at once.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add Node.js to cloud-init for all cloud providers

npm-based agents (codex, kilocode, etc.) fail with "npm: command not found" because Node.js isn't installed during cloud-init. Fly.io was the only provider installing Node.js (in wait_for_cloud_init). Now all cloud-init scripts install Node.js v22 LTS from nodesource, matching Fly.io's setup.

Also adds ~/.local/bin to PATH in AWS and GCP cloud-init (was already in shared/DigitalOcean/Hetzner).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use apt packages for nodejs/npm instead of nodesource

The nodesource setup script (setup_22.x) runs its own apt-get update and repository configuration, nearly doubling cloud-init time and causing hangs on DigitalOcean. Ubuntu 24.04 includes nodejs and npm in its default repos, so just add them to the packages list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add timeouts and better error handling to Daytona CLI commands

Daytona CLI commands (login, list, create) can hang indefinitely when the API is slow or unreachable. This causes:
- "Failed to create sandbox: timeout" with no recovery
- Token validation timeouts misreported as "invalid token"
- Users re-entering valid tokens that also timeout

Fixes:
- Wrap all daytona CLI calls with timeout (30s for auth, 120s for create)
- Detect timeout errors separately from auth errors
- Show actionable "try again / check status" messages for timeouts
- Add nodejs/npm to Daytona wait_for_cloud_init

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: set DAYTONA_API_URL to Daytona Cloud by default

The Daytona CLI may default to connecting to a local self-hosted server instead of Daytona Cloud. Without DAYTONA_API_URL set to https://app.daytona.io/api, every CLI command (login, list, create) hangs trying to reach a non-existent local server and times out. The SDK documents this as the default, but the CLI doesn't always pick it up, so now we export it explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: symlink n-installed Node.js v22 over apt v18 to prevent shadowing

n installs Node.js v22 to /usr/local/bin/node but apt's v18 at /usr/bin/node can shadow it in non-interactive SSH sessions. After n 22, symlink the new binaries over the apt ones so v22 is always resolved.

Also fix hcloud CLI token extraction for new TOML format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address security review, add curl timeouts to trigger workflows

- Fix ssh_run_server command injection concern: use single-quoted path_prefix so $HOME/$PATH expand remotely, not locally
- Add --connect-timeout 15 --max-time 30 to trigger workflows to prevent 5-min hangs when server streams responses
- Handle 409 (dedup) as success: expected when cron fires every 15min but cycles take 35min
- Reduce workflow timeout-minutes from 5 to 2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
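The `< /dev/null` change described in the commit message above guards against a child process consuming the caller's stdin. The sketch below illustrates the mechanism only; it is not the code from `shared/common.sh`. `consume_stdin` and `run_steps` are hypothetical names, and `cat` stands in for `ssh`, which also reads from its inherited stdin unless told otherwise. Here the symptom is that later steps are silently skipped; with a real SSH session the same mechanism can show up as a stall.

```bash
#!/bin/bash
# Hypothetical stand-in for ssh: any child that reads stdin behaves the same way.
consume_stdin() { cat >/dev/null; }

# Runs three queued steps and prints how many were actually executed.
# With redirect=no the child inherits the loop's stdin and drains the queue.
run_steps() {
    local redirect="${1}" count=0 step
    while read -r step; do
        count=$((count + 1))
        if [[ "${redirect}" == "yes" ]]; then
            consume_stdin < /dev/null   # child sees EOF immediately
        else
            consume_stdin               # child eats the remaining queued steps
        fi
    done < <(printf '%s\n' install verify configure)
    echo "${count}"
}

without_redirect=$(run_steps no)
with_redirect=$(run_steps yes)
echo "without redirect: ${without_redirect} step(s) ran"
echo "with redirect: ${with_redirect} step(s) ran"
```

Redirecting stdin at the call site keeps the fix local to the helper that spawns the child, so callers do not need to know how the helper is implemented.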
303 lines
12 KiB
Bash
#!/bin/bash
# Common bash functions for GCP Compute Engine spawn scripts
# Uses gcloud CLI; requires Google Cloud SDK installed and configured

# Bash safety flags
set -eo pipefail

# ============================================================
# Provider-agnostic functions
# ============================================================

# Source shared provider-agnostic functions (local or remote fallback)
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" 2>/dev/null && pwd)"
if [[ -n "${SCRIPT_DIR}" && -f "${SCRIPT_DIR}/../../shared/common.sh" ]]; then
    source "${SCRIPT_DIR}/../../shared/common.sh"
else
    eval "$(curl -fsSL https://raw.githubusercontent.com/OpenRouterTeam/spawn/main/shared/common.sh)"
fi

# Note: Provider-agnostic functions (logging, OAuth, browser, nc_listen) are now in shared/common.sh

# ============================================================
# GCP Compute Engine specific functions
# ============================================================

# Cache username to avoid repeated subprocess calls
GCP_USERNAME=$(whoami)
SSH_USER="${GCP_USERNAME}"

SPAWN_DASHBOARD_URL="https://console.cloud.google.com/compute/instances"

# SSH_OPTS is now defined in shared/common.sh

# Verify gcloud CLI is installed
_gcp_check_cli_installed() {
    if ! command -v gcloud &>/dev/null; then
        log_error "Google Cloud SDK (gcloud) is required but not installed"
        log_error ""
        log_error "Possible causes:"
        log_error " - gcloud CLI has not been installed on this machine"
        log_error ""
        log_error "How to fix:"
        log_error " 1. Install gcloud CLI for your platform:"
        log_error ""
        log_error " ${CYAN}macOS (Homebrew)${NC}"
        log_error " brew install google-cloud-sdk"
        log_error ""
        log_error " ${CYAN}Ubuntu/Debian${NC}"
        log_error " curl https://sdk.cloud.google.com | bash"
        log_error " exec -l \$SHELL # Restart shell"
        log_error ""
        log_error " ${CYAN}Fedora/RHEL${NC}"
        log_error " sudo tee -a /etc/yum.repos.d/google-cloud-sdk.repo << EOM"
        log_error " [google-cloud-cli]"
        log_error " name=Google Cloud CLI"
        log_error " baseurl=https://packages.cloud.google.com/yum/repos/cloud-sdk-el9-x86_64"
        log_error " enabled=1"
        log_error " gpgcheck=1"
        log_error " repo_gpgcheck=0"
        log_error " gpgkey=https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg"
        log_error " EOM"
        log_error " sudo dnf install google-cloud-cli"
        log_error ""
        log_error " 2. Full installation guide: ${CYAN}https://cloud.google.com/sdk/docs/install${NC}"
        log_error ""
        log_error " 3. After installation, authenticate:"
        log_error " gcloud auth login"
        log_error " gcloud config set project YOUR_PROJECT_ID"
        return 1
    fi
}

# Verify gcloud has an active authenticated account
_gcp_check_auth() {
    if ! gcloud auth list --filter=status:ACTIVE --format="value(account)" 2>/dev/null | head -1 | grep -q '@'; then
        log_warn "No active Google Cloud account; launching gcloud auth login..."
        gcloud auth login || {
            log_error "Authentication failed. You can also set credentials via:"
            log_error " export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json"
            return 1
        }
    fi
}

# Resolve and export GCP_PROJECT from env var or gcloud config
_gcp_resolve_project() {
    local project="${GCP_PROJECT:-$(gcloud config get-value project 2>/dev/null)}"
    if [[ -z "${project}" || "${project}" == "(unset)" ]]; then
        log_error "No GCP project configured"
        log_error ""
        log_error "Possible causes:"
        log_error " - No project is set in gcloud config or GCP_PROJECT env var"
        log_error " - You haven't created a GCP project yet"
        log_error ""
        log_error "How to fix:"
        log_error " 1. List your existing projects:"
        log_error " ${CYAN}gcloud projects list${NC}"
        log_error ""
        log_error " 2. Set a project via environment variable:"
        log_error " ${CYAN}export GCP_PROJECT=your-project-id${NC}"
        log_error ""
        log_error " 3. Or set via gcloud config:"
        log_error " ${CYAN}gcloud config set project YOUR_PROJECT_ID${NC}"
        log_error ""
        log_error " 4. Don't have a project? Create one:"
        log_error " ${CYAN}https://console.cloud.google.com/projectcreate${NC}"
        log_error ""
        log_error " 5. Enable Compute Engine API for your project:"
        log_error " ${CYAN}https://console.cloud.google.com/apis/library/compute.googleapis.com${NC}"
        return 1
    fi
    export GCP_PROJECT="${project}"
    log_info "Using GCP project: ${project}"
}

ensure_gcloud() {
    _gcp_check_cli_installed || return 1
    _gcp_check_auth || return 1
    _gcp_resolve_project
}

ensure_ssh_key() {
    local key_path="${HOME}/.ssh/id_ed25519"

    # Generate key if needed
    generate_ssh_key_if_missing "${key_path}"

    # GCP handles SSH keys via project/instance metadata, added during create
    log_info "SSH key ready"
}

get_server_name() {
    get_resource_name "GCP_INSTANCE_NAME" "Enter instance name: "
}

get_cloud_init_userdata() {
    cat << 'CLOUD_INIT_EOF'
#!/bin/bash
apt-get update -y
apt-get install -y curl unzip git zsh nodejs npm
# Upgrade Node.js to v22 LTS (apt has v18, agents like Cline need v20+)
# n installs to /usr/local/bin but apt's v18 at /usr/bin can shadow it, so symlink over
npm install -g n && n 22 && ln -sf /usr/local/bin/node /usr/bin/node && ln -sf /usr/local/bin/npm /usr/bin/npm && ln -sf /usr/local/bin/npx /usr/bin/npx
# Install Bun
su - $(logname 2>/dev/null || echo "$USER") -c 'curl -fsSL https://bun.sh/install | bash' || true
# Install Claude Code
su - $(logname 2>/dev/null || echo "$USER") -c 'curl -fsSL https://claude.ai/install.sh | bash' || true
# Configure PATH for all users
echo 'export PATH="${HOME}/.claude/local/bin:${HOME}/.local/bin:${HOME}/.bun/bin:${PATH}"' >> /etc/profile.d/spawn.sh
chmod +x /etc/profile.d/spawn.sh
touch /tmp/.cloud-init-complete
CLOUD_INIT_EOF
}

# Prepare startup script and SSH metadata temp files for gcloud instance creation
# Sets startup_script_file and pub_key variables in caller's scope
_gcp_prepare_instance_files() {
    startup_script_file=$(mktemp)
    track_temp_file "${startup_script_file}"
    get_cloud_init_userdata > "${startup_script_file}"

    pub_key=$(cat "${HOME}/.ssh/id_ed25519.pub")
}

# Run gcloud compute instances create and handle errors
# Returns 0 on success, 1 on failure with diagnostic output
_gcp_run_create() {
    local name="${1}" zone="${2}" machine_type="${3}"
    local image_family="${4}" image_project="${5}" startup_script_file="${6}" pub_key="${7}"

    local gcloud_err
    gcloud_err=$(mktemp)
    track_temp_file "${gcloud_err}"

    if gcloud compute instances create "${name}" \
        --zone="${zone}" \
        --machine-type="${machine_type}" \
        --image-family="${image_family}" \
        --image-project="${image_project}" \
        --metadata-from-file="startup-script=${startup_script_file}" \
        --metadata="ssh-keys=${GCP_USERNAME}:${pub_key}" \
        --project="${GCP_PROJECT}" \
        --quiet \
        >/dev/null 2>"${gcloud_err}"; then
        return 0
    fi

    local err_output
    err_output=$(cat "${gcloud_err}" 2>/dev/null)

    # Auto-reauth on expired tokens, then retry once
    # (-E keeps the alternation portable; \| in basic regex is a GNU extension)
    if printf '%s' "${err_output}" | grep -qiE "reauthentication|refresh.*auth|token.*expired|credentials.*invalid"; then
        log_warn "Auth tokens expired; running gcloud auth login..."
        if gcloud auth login && gcloud config set project "${GCP_PROJECT}"; then
            log_info "Re-authenticated, retrying instance creation..."
            if gcloud compute instances create "${name}" \
                --zone="${zone}" \
                --machine-type="${machine_type}" \
                --image-family="${image_family}" \
                --image-project="${image_project}" \
                --metadata-from-file="startup-script=${startup_script_file}" \
                --metadata="ssh-keys=${GCP_USERNAME}:${pub_key}" \
                --project="${GCP_PROJECT}" \
                --quiet \
                >/dev/null 2>"${gcloud_err}"; then
                return 0
            fi
            err_output=$(cat "${gcloud_err}" 2>/dev/null)
        fi
    fi

    log_error "Failed to create GCP instance"
    if [[ -n "${err_output}" ]]; then
        log_error "gcloud error: ${err_output}"
    fi
    log_warn "Common issues:"
    log_warn " - Billing not enabled for the project (enable at https://console.cloud.google.com/billing)"
    log_warn " - Compute Engine API not enabled (enable at https://console.cloud.google.com/apis)"
    log_warn " - Instance quota exceeded in zone (try different GCP_ZONE)"
    log_warn " - Machine type unavailable in zone (try different GCP_MACHINE_TYPE or GCP_ZONE)"
    return 1
}

# Get the external IP of a GCP instance
# Usage: _gcp_get_instance_ip NAME ZONE
_gcp_get_instance_ip() {
    local name="${1}" zone="${2}"
    gcloud compute instances describe "${name}" \
        --zone="${zone}" \
        --project="${GCP_PROJECT}" \
        --format='get(networkInterfaces[0].accessConfigs[0].natIP)' 2>/dev/null
}

create_server() {
    local name="${1}"
    local machine_type="${GCP_MACHINE_TYPE:-e2-medium}"
    local zone="${GCP_ZONE:-us-central1-a}"
    local image_family="ubuntu-2404-lts-amd64"
    local image_project="ubuntu-os-cloud"

    # Validate env var inputs to prevent command injection
    validate_resource_name "${machine_type}" || { log_error "Invalid GCP_MACHINE_TYPE"; return 1; }
    validate_region_name "${zone}" || { log_error "Invalid GCP_ZONE"; return 1; }

    log_step "Creating GCP instance '${name}' (type: ${machine_type}, zone: ${zone})..."

    local startup_script_file pub_key
    _gcp_prepare_instance_files

    _gcp_run_create "${name}" "${zone}" "${machine_type}" \
        "${image_family}" "${image_project}" "${startup_script_file}" "${pub_key}" || return 1

    # shellcheck disable=SC2034 # Variables exported for use by sourcing scripts
    export GCP_INSTANCE_NAME_ACTUAL="${name}"
    export GCP_ZONE="${zone}"
    export GCP_SERVER_IP="$(_gcp_get_instance_ip "${name}" "${zone}")"

    log_info "Instance created: IP=${GCP_SERVER_IP}"

    save_vm_connection "${GCP_SERVER_IP}" "${SSH_USER:-$(whoami)}" "" "$name" "gcp" "{\"zone\":\"${zone}\",\"project\":\"${GCP_PROJECT}\"}"
}

verify_server_connectivity() { ssh_verify_connectivity "$@"; }

wait_for_cloud_init() {
    local ip="${1}" max_attempts=${2:-60}

    # First establish SSH connectivity
    ssh_verify_connectivity "${ip}" 30 5

    # Then wait for startup script completion marker
    generic_ssh_wait "${SSH_USER}" "${ip}" "$SSH_OPTS -o ConnectTimeout=5" "test -f /tmp/.cloud-init-complete" "startup script completion" "${max_attempts}" 5
}

# Standard SSH operations (delegates to shared helpers in shared/common.sh)
# GCP uses current username via SSH_USER set above
run_server() { ssh_run_server "$@"; }
upload_file() { ssh_upload_file "$@"; }
interactive_session() { ssh_interactive_session "$@"; }

destroy_server() {
    local name="${1}"
    local zone="${GCP_ZONE:-us-central1-a}"
    log_step "Destroying GCP instance ${name}..."
    gcloud compute instances delete "${name}" --zone="${zone}" --project="${GCP_PROJECT}" --quiet >/dev/null 2>&1
    log_info "Instance ${name} destroyed"
}

list_servers() {
    gcloud compute instances list --project="${GCP_PROJECT}" --format='table(name,zone,status,networkInterfaces[0].accessConfigs[0].natIP:label=EXTERNAL_IP,machineType.basename())'
}

# ============================================================
# Cloud adapter interface
# ============================================================

cloud_authenticate() { ensure_gcloud; ensure_ssh_key; }
cloud_provision() { create_server "$1"; }
cloud_wait_ready() { verify_server_connectivity "${GCP_SERVER_IP}"; wait_for_cloud_init "${GCP_SERVER_IP}" 60; }
cloud_run() { run_server "${GCP_SERVER_IP}" "$1"; }
cloud_upload() { upload_file "${GCP_SERVER_IP}" "$1" "$2"; }
cloud_interactive() { interactive_session "${GCP_SERVER_IP}" "$1"; }
cloud_label() { echo "GCP instance"; }
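The cloud adapter interface at the end of the script suggests a fixed call order for any script that sources it. The sketch below is one possible driver, not code from the repository: `launch_agent` and the `calls` array are hypothetical, and the `cloud_*` functions are replaced with recording stubs so the example runs without gcloud or a real instance.

```bash
#!/bin/bash
set -eo pipefail

# Hypothetical stubs that record each call instead of touching GCP.
calls=()
cloud_authenticate() { calls+=("authenticate"); }
cloud_provision()    { calls+=("provision:${1}"); }
cloud_wait_ready()   { calls+=("wait_ready"); }
cloud_run()          { calls+=("run:${1}"); }
cloud_interactive()  { calls+=("interactive:${1}"); }
cloud_label()        { echo "stub instance"; }

# Hypothetical driver: authenticate, provision, wait, install, then hand
# the terminal over to the agent. The order is an assumption inferred from
# the adapter function names.
launch_agent() {
    local name="${1}" install_cmd="${2}" start_cmd="${3}"
    echo "Launching on $(cloud_label)"
    cloud_authenticate
    cloud_provision "${name}"
    cloud_wait_ready
    cloud_run "${install_cmd}"
    cloud_interactive "${start_cmd}"
}

launch_agent "demo-vm" "echo install" "echo start"
printf '%s\n' "${calls[@]}"
```

Because a driver like this only touches the `cloud_*` names, the same sequence could in principle be reused with another provider's script that defines the same functions.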