spawn/test
Ahmed Abushagur c6d0cb218e
improve: make QA bot more effective with structured failures and verification (#1034)
Five improvements to the QA cycle:

1. Fix agents now get structured failure context — categorized failures
   (exit_code, missing_api_call, missing_env, no_fixture) instead of
   raw 500-line test output, plus a passing agent for comparison

2. Fix agent changes are verified before committing — re-runs mock tests
   after the agent finishes and only commits if results actually improved,
   discarding bad fixes that would create noise PRs

3. Test results now include failure categories — mock.sh records
   cloud/agent:fail:reason instead of just cloud/agent:fail, enabling
   smarter failure routing

4. Mock curl logs NO_FIXTURE warnings when no fixture matches a GET
   request, surfacing false-confidence gaps where tests pass with
   synthetic fallback data

5. Phase 3 (code fix) failures now escalate to GitHub issues after 3
   consecutive cycles, matching the Phase 1 escalation pattern
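
A minimal sketch of how items 2, 3, and 5 could fit together, assuming a results file with one record per line in the cloud/agent:fail:reason form described above. The function names, the cloud/agent:pass record shape, the "uncategorized" fallback, the counter file, and the definition of "improved" as strictly fewer failing records are all assumptions made for illustration; none of them are taken from mock.sh or the QA bot itself.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and file layout are illustrative assumptions.

# Print "cloud/agent<TAB>reason" for every failing record in a results file.
# A legacy record without a reason (cloud/agent:fail) is reported as
# "uncategorized".
list_failures() {
  awk -F: '$2 == "fail" { print $1 "\t" ($3 == "" ? "uncategorized" : $3) }' "$1"
}

# Print the number of failing records in a results file.
count_failures() {
  awk -F: '$2 == "fail" { n++ } END { print n + 0 }' "$1"
}

# Succeed only if the second results file has strictly fewer failures than
# the first. "Improved" is assumed to mean exactly this and nothing more.
results_improved() {
  local before after
  before=$(count_failures "$1")
  after=$(count_failures "$2")
  [ "$after" -lt "$before" ]
}

# Increment a consecutive-failure counter kept in a file, and succeed once it
# reaches the threshold (default 3, the number given in item 5).
should_escalate() {
  local counter_file=$1 threshold=${2:-3} n=0
  if [ -f "$counter_file" ]; then
    n=$(cat "$counter_file")
  fi
  n=$((n + 1))
  printf '%s\n' "$n" > "$counter_file"
  [ "$n" -ge "$threshold" ]
}

# Clear the counter after a successful cycle so only consecutive failures count.
reset_streak() {
  rm -f "$1"
}
```

Under these assumptions, a caller would snapshot the results before the fix agent runs, re-run the mock tests afterwards, and commit only when results_improved succeeds; otherwise it would discard the change and call should_escalate to decide whether to open an issue.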

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 20:07:54 -05:00
Name | Last commit | Date
fixtures | refactor: extract mock test env config and API assertions into per-cloud fixture files (#803) | 2026-02-13 02:16:11 -08:00
mock.sh | improve: make QA bot more effective with structured failures and verification (#1034) | 2026-02-13 20:07:54 -05:00
qa-dry-run.sh | feat: qa bot and emails (#565) | 2026-02-11 20:19:45 -08:00
record.sh | refactor: decompose multi-credential config handling in test/record.sh (#1004) | 2026-02-13 13:34:37 -08:00
run.sh | refactor: extract helpers from run_script_test and run_shellcheck in test/run.sh (#776) | 2026-02-12 17:19:32 -08:00
update-readme.py | QA-Bot setup (#335) | 2026-02-10 19:51:07 -08:00