spawn/.github
Ahmed Abushagur ed127cf592
Some checks failed
CLI Release / Build and release CLI (push) Failing after 5s
Lint / Biome Lint (push) Failing after 4s
Lint / macOS Compatibility (push) Successful in 15s
Lint / ShellCheck (push) Successful in 59s
feat: never-give-up resilience layer (#2807)
* feat: never-give-up resilience layer — retry every failure instead of exiting

Add retryOrQuit() helper to shared/ui.ts that prompts "Try again? (Y/n)"
after any recoverable failure. Wrap all fatal exit points with retry loops:

- Cloud auth (Hetzner, DigitalOcean, AWS, GCP): retry after 3 failed tokens
- API key acquisition: retry after 3 failed OAuth+manual attempts
- Server creation: retry on any createServer failure (both fast & sequential)
- SSH readiness: retry on waitForReady timeout
- Agent install: retry on install failure
- Pre-launch hooks: retry on preLaunch failure

Non-interactive mode (SPAWN_NON_INTERACTIVE=1) still throws immediately.
Ctrl+C at any retry prompt exits cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(e2e): add AI-driven interactive test harness

Add --interactive mode to the E2E test framework. Instead of running spawn
in headless mode (SPAWN_NON_INTERACTIVE=1), this spawns the CLI in a real
PTY and uses Claude Haiku to respond to prompts like a human user would.

New files:
- sh/e2e/interactive-harness.ts — Bun script that drives the PTY + AI loop
- sh/e2e/lib/interactive.sh — Bash integration with the E2E framework

Usage:
  e2e.sh --cloud hetzner claude --interactive

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(qa): wire interactive E2E into scheduled QA pipeline

- Add `e2e-interactive` option to workflow_dispatch in qa.yml
- Add `e2e-interactive` run mode to qa.sh (loads cloud creds + ANTHROPIC_API_KEY)
- Runs `e2e.sh --cloud hetzner claude --interactive` directly (no Claude Code needed)
- Defaults to hetzner (cheapest), overridable via E2E_INTERACTIVE_CLOUD/AGENT env vars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(qa): schedule interactive E2E daily at 6am UTC

Runs one agent (claude) on one cloud (hetzner) with AI-driven prompts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): offset soak cron to avoid GitHub Actions schedule dedup

GitHub Actions deduplicates overlapping cron schedules into one run,
making `github.event.schedule` unpredictable. The soak test at `0 3 * * 1`
was getting absorbed by the `0 */4 * * *` quality sweep and never firing
as reason=soak.

Move soak to `30 1 * * 1` (Monday 1:30am UTC) — safely between the
0am and 4am quality sweep slots. Interactive E2E at `0 6 * * *` is
already safe (between the 4am and 8am slots).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): add e2e-interactive to trigger server valid reasons

The trigger server validates reason query params against an allowlist.
Without this, the `e2e-interactive` dispatch returns 400.

Also note: `soak` is already in VALID_REASONS in the repo but the running
service on the QA VM is stale — needs a restart to pick up both soak and
e2e-interactive reasons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:33:22 -07:00
..
ISSUE_TEMPLATE fix: allow rich text in bug report issue template (#1710) 2026-02-22 10:04:20 -08:00
workflows feat: never-give-up resilience layer (#2807) 2026-03-19 17:33:22 -07:00