mirror of
https://github.com/OpenRouterTeam/spawn.git
synced 2026-05-19 16:39:50 +00:00
feat(e2e): Run E2E tests on all configured clouds, not just AWS (#2236)
- manifest.json: change aws auth to AWS_ACCESS_KEY_ID+AWS_SECRET_ACCESS_KEY so the key-request system includes AWS in its missing-key emails
- sh/e2e/e2e.sh: clouds missing credentials now SKIP (not FAIL), so running --cloud all is safe and only tests what's configured
- qa.sh: include e2e mode in cloud credential loading (was fixtures+quality only)
- qa-quality-prompt.md: e2e-tester now runs e2e.sh --cloud all --parallel 6 --skip-input-test
- qa-e2e-prompt.md: standalone e2e bot now runs e2e.sh --cloud all --parallel 6

Also wires KEY_SERVER_URL + KEY_SERVER_SECRET into /etc/spawn-qa-auth.env (system change, not in this commit) so missing-key emails are actually sent.

Co-authored-by: spawn-qa-bot <qa@openrouter.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: L <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Ahmed Abushagur <ahmed@abushagur.com>
parent 9291dd9c76
commit df0593fb21
5 changed files with 43 additions and 40 deletions

@@ -2,7 +2,7 @@ You are a single-agent QA E2E tester for the spawn codebase.
 
 ## Mission
 
-Run the AWS E2E test suite, investigate any failures, and fix broken provisioning scripts or test infrastructure.
+Run the E2E test suite across all configured clouds, investigate any failures, and fix broken provisioning scripts or test infrastructure.
 
 ## Time Budget
 
@@ -21,15 +21,15 @@ cd WORKTREE_BASE_PLACEHOLDER
 
 ```bash
 cd REPO_ROOT_PLACEHOLDER
-chmod +x sh/e2e/aws-e2e.sh
-./sh/e2e/aws-e2e.sh --parallel 6
+chmod +x sh/e2e/e2e.sh
+./sh/e2e/e2e.sh --cloud all --parallel 6
 ```
 
-Capture the full output. Note which agents passed and which failed.
+Capture the full output. Note which clouds ran, which agents passed, which failed, and which clouds were skipped (no credentials).
 
-## Step 2 — If All Pass
+## Step 2 — If All Configured Clouds Pass
 
-If every agent passes, you're done. Log the results and exit. No PR needed.
+If every agent on every configured cloud passes (clouds with no credentials are shown as skipped — that's expected), you're done. Log the results and exit. No PR needed.
 
 ## Step 3 — If Any Agent Fails
 
@@ -40,17 +40,14 @@ For each failed agent, investigate the root cause. The failure categories are:
 1. Check the stderr log in the temp directory printed at the start of the run
 2. Common causes:
 - Missing env var for headless mode (e.g., `MODEL_ID` for openclaw)
-- AWS API auth issues
+- Cloud API auth issues
 - Agent-specific install script changed upstream
-3. Read the agent's provisioning code: `packages/cli/src/aws/aws.ts` and `packages/cli/src/shared/agent-setup.ts`
+3. Read the agent's provisioning code: `packages/cli/src/{cloud}/{cloud}.ts` and `packages/cli/src/shared/agent-setup.ts`
 4. Read the E2E provision script: `sh/e2e/lib/provision.sh`
 
 ### Verification failure (instance exists but checks fail)
 
-1. SSH into the VM to investigate:
-```bash
-ssh -o StrictHostKeyChecking=no root@INSTANCE_IP "ls -la ~; cat ~/.spawnrc; echo ---; env"
-```
+1. SSH into the VM to investigate: check the IP from the log output
 2. Check if the binary path changed — read the agent's install script in `packages/cli/src/shared/agent-setup.ts`
 3. Check if the env var names changed — read the agent's config in `manifest.json`
 4. Update the verification checks in `sh/e2e/lib/verify.sh` if they are stale
@@ -67,13 +64,12 @@ Make fixes in the worktree at WORKTREE_BASE_PLACEHOLDER. Fixes may be in:
 - `sh/e2e/lib/verify.sh` — binary paths, config file locations, env var checks
 - `sh/e2e/lib/common.sh` — API helpers, constants
 - `sh/e2e/lib/teardown.sh` — cleanup logic
-- `sh/e2e/lib/cleanup.sh` — stale instance detection
 
 After fixing:
 1. Run `bash -n` on every modified `.sh` file
 2. Re-run the E2E suite for the failed agent(s) only to verify the fix:
 ```bash
-./sh/e2e/aws-e2e.sh AGENT_NAME
+./sh/e2e/e2e.sh --cloud CLOUD AGENT_NAME
 ```
 
 ## Step 5 — Commit and PR

@@ -122,29 +122,26 @@ cd REPO_ROOT_PLACEHOLDER && git worktree remove WORKTREE_BASE_PLACEHOLDER/TASK_N
 
 ### Teammate 4: e2e-tester (model=sonnet)
 
-**Task**: Run the AWS E2E test suite, investigate failures, and fix broken test infrastructure.
+**Task**: Run the E2E test suite across all configured clouds, investigate failures, and fix broken test infrastructure.
 
 **Protocol**:
 1. Run the E2E suite from the main repo checkout (E2E tests provision live VMs — no worktree needed for the test runner itself):
 ```bash
 cd REPO_ROOT_PLACEHOLDER
-chmod +x sh/e2e/aws-e2e.sh
-./sh/e2e/aws-e2e.sh --parallel 6
+chmod +x sh/e2e/e2e.sh
+./sh/e2e/e2e.sh --cloud all --parallel 6 --skip-input-test
 ```
-2. Capture the full output. Note which agents passed and which failed.
-3. If all agents pass: report results and you're done. No PR needed.
-4. If any agent fails, investigate the root cause. Failure categories:
+2. Capture the full output. Note which clouds ran, which agents passed, which failed, and which clouds were skipped (no credentials).
+3. If all configured clouds pass (or only skipped clouds): report results and you're done. No PR needed.
+4. If any agent fails on a configured cloud, investigate the root cause. Failure categories:
 
 **a) Provision failure** (instance does not exist after provisioning):
 - Check the stderr log in the temp directory printed at the start of the run
-- Common causes: missing env var for headless mode, AWS API auth issues, agent install script changed upstream
-- Read: `packages/cli/src/aws/aws.ts`, `packages/cli/src/shared/agent-setup.ts`, `sh/e2e/lib/provision.sh`
+- Common causes: missing env var for headless mode, cloud API auth issues, agent install script changed upstream
+- Read: `packages/cli/src/{cloud}/{cloud}.ts`, `packages/cli/src/shared/agent-setup.ts`, `sh/e2e/lib/provision.sh`
 
 **b) Verification failure** (instance exists but checks fail):
-- SSH into the VM to investigate:
-```bash
-ssh -o StrictHostKeyChecking=no root@INSTANCE_IP "ls -la ~; cat ~/.spawnrc; echo ---; env"
-```
+- SSH into the VM to investigate: check the IP from the log output
 - Check if binary paths or env var names changed in `manifest.json` or `packages/cli/src/shared/agent-setup.ts`
 - Update verification checks in `sh/e2e/lib/verify.sh` if stale
 
@@ -160,12 +157,11 @@ cd REPO_ROOT_PLACEHOLDER && git worktree remove WORKTREE_BASE_PLACEHOLDER/TASK_N
 - `sh/e2e/lib/verify.sh` — binary paths, config file locations, env var checks
 - `sh/e2e/lib/common.sh` — API helpers, constants
 - `sh/e2e/lib/teardown.sh` — cleanup logic
-- `sh/e2e/lib/cleanup.sh` — stale instance detection
 7. Run `bash -n` on every modified `.sh` file
-8. Re-run the E2E suite for the failed agent(s) only: `./sh/e2e/aws-e2e.sh AGENT_NAME`
+8. Re-run only the failed agents: `./sh/e2e/e2e.sh --cloud CLOUD AGENT_NAME`
 9. If changes were made: commit, push, open a PR (NOT draft) with title "fix(e2e): [description]"
 10. Clean up worktree when done
-11. Report: agents tested, passed, failed, fixed
+11. Report: clouds tested, clouds skipped, agents passed, agents failed, fixed
 12. **SIGN-OFF**: `-- qa/e2e-tester`
 
 ### Teammate 5: record-keeper (model=sonnet)
@@ -267,7 +263,7 @@ After all teammates finish, compile a summary:
 - PRs: [links if any]
 
 ### E2E Tester
-- Agents tested: X | Passed: Y | Failed: Z | Fixed: W
+- Clouds tested: X | Clouds skipped: Y | Agents passed: Z | Agents failed: W | Fixed: V
 - PRs: [links if any]
 
 ### Record-Keeper

@@ -183,8 +183,8 @@ done
 
 log "Pre-cycle cleanup done."
 
-# --- Load cloud credentials (quality + fixtures modes) ---
-if [[ "${RUN_MODE}" == "fixtures" ]] || [[ "${RUN_MODE}" == "quality" ]]; then
+# --- Load cloud credentials (quality + fixtures + e2e modes) ---
+if [[ "${RUN_MODE}" == "fixtures" ]] || [[ "${RUN_MODE}" == "quality" ]] || [[ "${RUN_MODE}" == "e2e" ]]; then
     if [[ -f "${REPO_ROOT}/sh/shared/key-request.sh" ]]; then
         source "${REPO_ROOT}/sh/shared/key-request.sh"
         load_cloud_keys_from_config

@@ -253,7 +253,7 @@
 "description": "Simple AWS instances starting at $3.50/mo",
 "url": "https://aws.amazon.com/lightsail/",
 "type": "cli",
-"auth": "aws configure (AWS credentials)",
+"auth": "AWS_ACCESS_KEY_ID+AWS_SECRET_ACCESS_KEY",
 "provision_method": "aws lightsail create-instances with --user-data",
 "exec_method": "ssh ubuntu@IP",
 "interactive_method": "ssh -t ubuntu@IP",

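The new `auth` value names the required environment variables, joined by `+`. As a rough illustration of why that format helps a key-request system, the sketch below splits such a string and reports which variables are unset. `missing_auth_vars` is a hypothetical helper written for this example; the real parsing in `sh/shared/key-request.sh` is not shown in this commit and may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the spawn repo): print the env vars named in a
# manifest "auth" string that are unset or empty. Assumes the "VAR_A+VAR_B"
# convention seen in the manifest.json change above.
missing_auth_vars() {
    local auth_spec="$1"
    local -a names=()
    local name missing=""

    # Split the spec on '+' into an array of candidate variable names.
    IFS='+' read -r -a names <<< "${auth_spec}"

    for name in "${names[@]}"; do
        # Ignore anything that is not a valid variable name, such as a
        # free-text description like "aws configure (AWS credentials)".
        [[ "${name}" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
        # ${!name} is bash indirect expansion: the value of the variable named by $name.
        if [[ -z "${!name:-}" ]]; then
            missing="${missing:+${missing} }${name}"
        fi
    done

    printf '%s\n' "${missing}"
}

# Example call: lists whichever of the two AWS variables are not set.
missing_auth_vars "AWS_ACCESS_KEY_ID+AWS_SECRET_ACCESS_KEY"
```

Under this sketch, the old free-text value would yield no variable names at all, which is one plausible reason AWS was previously left out of the missing-key emails.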
@@ -235,13 +235,9 @@ run_agents_for_cloud() {
 
 # Validate environment for this cloud
 if ! require_env; then
-    log_err "Environment validation failed for ${cloud}"
-    # Write fail results for all agents
-    for agent in ${AGENTS_TO_TEST}; do
-        printf 'fail' > "${log_dir}/${cloud}-${agent}.result"
-    done
-    printf 'FAILED (env validation)' > "${log_dir}/${cloud}.summary"
-    return 1
+    log_warn "Credentials not configured for ${cloud} — skipping"
+    printf 'SKIPPED (no credentials)' > "${log_dir}/${cloud}.summary"
+    return 0
 fi
 
 local cloud_passed=""
@@ -468,6 +464,21 @@ for cloud in ${CLOUDS}; do
 
 cloud_pass=0
 cloud_fail=0
+cloud_skip=0
+
+# Check if this cloud was skipped (no credentials) — no result files written
+cloud_has_results=0
+for agent in ${AGENTS_TO_TEST}; do
+    if [ -f "${LOG_DIR}/${cloud}-${agent}.result" ]; then
+        cloud_has_results=1
+        break
+    fi
+done
+
+if [ "${cloud_has_results}" -eq 0 ]; then
+    printf " ${YELLOW}(skipped — credentials not configured)${NC}\n"
+    continue
+fi
 
 for agent in ${AGENTS_TO_TEST}; do
     result_file="${LOG_DIR}/${cloud}-${agent}.result"
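
The two `e2e.sh` hunks above work together: a cloud without credentials writes no per-agent `.result` files, and the summary loop treats the absence of those files as "skipped" rather than "failed". A minimal, self-contained sketch of that convention follows. `summarize_cloud` and the `pass` file content are assumptions made for illustration (the diff only shows `fail` being written); this is not the code in `e2e.sh`.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: summarise one cloud from "<cloud>-<agent>.result"
# files, reporting "skipped" when no result file exists for any agent.
# Usage: summarize_cloud LOG_DIR CLOUD AGENT...
summarize_cloud() {
    local log_dir="$1" cloud="$2"
    shift 2
    local agent result_file pass=0 fail=0 seen=0

    for agent in "$@"; do
        result_file="${log_dir}/${cloud}-${agent}.result"
        # No file means this agent never ran on this cloud.
        [ -f "${result_file}" ] || continue
        seen=1
        # Assumption: a passing run writes the literal string "pass".
        if [ "$(cat "${result_file}")" = "pass" ]; then
            pass=$((pass + 1))
        else
            fail=$((fail + 1))
        fi
    done

    if [ "${seen}" -eq 0 ]; then
        printf '%s: skipped\n' "${cloud}"
    else
        printf '%s: %d passed, %d failed\n' "${cloud}" "${pass}" "${fail}"
    fi
}
```

Keying the skip decision on missing result files means the provisioning function only has to return early; it does not need to write a placeholder for every agent.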