vrr/spawn

mirror of https://github.com/OpenRouterTeam/spawn.git synced 2026-04-29 12:29:31 +00:00

Author	SHA1	Message	Date
L	6e13256d96	refactor: simplify claude launch - no streaming, no output monitoring (#1412) Replace the complex claude launch pattern (subshell + PID file + tee pipe + stream-json + 50-line watchdog monitoring log file growth + session-end detection) with a simple direct launch: claude -p "..." >> "${LOG_FILE}" 2>&1 & The watchdog is now just a wall-clock timeout. The idle-output detection, stream-json result parsing, and tee piping are all removed. Also remove GitHub Actions concurrency groups: the trigger server already handles dedup (409 for same issue, 409 for same reason), making the GH Actions concurrency groups redundant queuing. Changes: - refactor.sh: simple launch + wall-clock-only watchdog - security.sh: same simplification - discovery.sh: same (refactored _kill_claude_process and _run_watchdog_loop to simpler signatures) - All 4 workflows: remove concurrency groups Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-17 09:02:47 -08:00
L	f3cfe890f7	refactor: simplify trigger server to fire-and-forget + fix monitoring loop prompts (#1384) The trigger server streamed script stdout back to GitHub Actions via a long-lived HTTP response, requiring --http1.1, heartbeat injection, server.timeout(req, 0), createEnqueuer, drainStreamOutput, and 90-min GH Actions timeouts. In practice GitHub Actions is just a dumb trigger; the real state lives on the VM (log files, journalctl). Simplify to fire-and-forget: spawn script, return 200 JSON immediately. Also fix the refactor and discovery team lead monitoring loops. The prompts buried the loop in a single compressed line that the model ignored (doing Bash("sleep 10") repeatedly without calling TaskList). Replace with a dedicated "Monitor Loop (CRITICAL)" section with numbered steps, matching the security.sh pattern that actually works. Changes: - trigger-server.ts: remove ~150 lines of streaming code (createEnqueuer, drainStreamOutput, startStreamingRun, heartbeat, ReadableStream), replace with startFireAndForgetRun (stdout: "inherit", immediate JSON) - All 4 workflows: simple curl POST, timeout-minutes 90→5, remove --http1.1/-N/--max-time/exit-code handling - refactor.sh: add Monitor Loop (CRITICAL) section with numbered steps - discovery-team-prompt.txt: same Monitor Loop fix - SKILL.md: update architecture docs, remove streaming sections Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-17 10:47:52 -05:00
L	0a0512652a	chore: reduce workflow cron frequencies (#1046) - discovery: every 30 min → every 3 days - refactor: every 5 min → hourly - security: every 5 min → every 30 min Co-authored-by: Security Reviewer <security-reviewer@spawn.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 18:55:40 -08:00
L	f69f95c7c7	refactor: Simplify security workflow to match discovery/refactor pattern (#929) Move mode-detection logic from the GitHub Actions workflow into security.sh where it belongs. The workflow now passes github.event_name directly as the reason parameter (like discovery.yml and refactor.yml), and security.sh uses `gh issue view` to check labels when reason=issues. - Remove 25-line if/elif/else reason-mapping block from security.yml - Remove workflow_dispatch mode input (server-side handles it) - Add `if:` label guard for issues (safe-to-work + team-building/security) - Add `labeled` to issue trigger types - Set cancel-in-progress: false (prevents killing long review_all runs) - Bump cron to */5 - Handle schedule/workflow_dispatch → review_all in security.sh - Keep backwards compat for direct team_building/triage reasons Co-authored-by: Security Reviewer <security-reviewer@spawn.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 05:26:21 -08:00
L	49bb39c8ec	fix: prevent duplicate review_all runs via reason-based dedup (#848) Two problems: 1. Schedule was every 20 min but review_all cycles take 35 min, causing overlapping triggers that fill both slots 2. Trigger server only deduped by issue number, not by reason, so two review_all runs could stack up Fixes: - Change schedule from */20 to 0,45 (every 45 min) - Add reason-based dedup in trigger-server.ts: reject 409 if a non-issue run with the same reason is already in progress Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-13 01:41:11 -08:00
L	56c4c020d5	feat: consolidate security review_all and scan into single 20-min cycle (#802) The two scheduled modes (review_all every 15 min, scan every 30 min) competed for MAX_CONCURRENT=1 on the trigger server, causing 429 drops and 30-55+ min gaps. Merge both into a single cycle that runs every 20 min, prioritizing PR review but also performing lightweight repo scanning when capacity allows (≤5 open PRs). Also prevents refactor agents from closing issues manually; issues now auto-close via `Fixes #N` in the PR body when merged. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 20:29:56 -08:00
L	15e2ca6caf	feat: consolidate security modes - merge pr+hygiene into review_all (#739) Simplify from 6 modes (Hexa-Mode) to 4 modes (Quad-Mode) by folding single-PR review and hygiene into a unified review_all mode that runs every 15 minutes. This removes the pull_request trigger entirely since review_all catches all open PRs on schedule, and absorbs staleness checks + branch cleanup into the same cycle. Remaining modes: team_building, triage, review_all, scan. Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 14:53:26 -08:00
L	4924a7d5db	feat: add security triage gate for issue safety before agent processing (#734) New issues are triaged by the security team before other workflows can act on them. The triage agent checks for prompt injection, social engineering, spam, and unsafe payloads, marking safe issues with `safe-to-work`, closing malicious ones, or flagging unclear ones for human review. Discovery and refactor workflows now require the `safe-to-work` label in addition to their existing label requirements. Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 14:23:33 -08:00
L	4d175ae6c7	feat: add Team Building issue template + route workflows by label (#733) - New issue template: Team Building (team-building label), 2 fields: which agent team to improve + what to change - Security team gets a new team_building mode: reads the issue, spawns implementer + reviewer (both Opus), creates PR, reviews, merges, closes issue - Discovery workflow: only triggers on cloud-request / agent-request issues - Refactor workflow: only triggers on bug / cli issues - Security workflow: only triggers on team-building issues (+ PR/schedule) - All workflows still run on schedule and workflow_dispatch as before Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 14:17:57 -08:00
L	56ba47109c	feat: add security review team for PR review (#543) (#730) * feat: add security review team for PR review (#543) Adds a security team that automatically reviews every PR for security issues (injection, credential leaks, unsafe patterns, macOS compat) and sends Slack notifications to #spawn when concerns are found. - security.sh: dual-mode cycle script (PR review + scheduled scan) - security.yml: GitHub Actions workflow on pull_request events - start-security.sh: gitignored wrapper with secrets (deployed) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: expand security team with hygiene, scan modes + auto-merge clean PRs - PR mode: 2-agent team (code-reviewer + test-verifier) reviews PRs. If zero findings, auto-approves AND merges. If concerns, requests changes and sends Slack notification to #spawn. - Hygiene mode (every 6h): pr-triager + branch-cleaner close stale PRs, file follow-up issues, delete orphan branches. - Scan mode (daily): shell-auditor + code-auditor + drift-detector perform full repo security audit, file GitHub issues for findings. - All modes use Claude Code agent teams (TeamCreate, parallel teammates via Task tool, SendMessage coordination, TaskList monitoring). - Workflow updated with schedule triggers and workflow_dispatch inputs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: upgrade all security auditor agents to Opus model All security-critical roles (code-reviewer, pr-triager, shell-auditor, code-auditor) now use Opus. Helper roles (test-verifier, branch-cleaner, drift-detector) remain on Haiku. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: auto-merge PRs with MEDIUM/LOW or no findings Only CRITICAL/HIGH findings block a PR. MEDIUM/LOW are informational notes included in the approving review; PR still gets merged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Sprite <noreply@sprites.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 14:04:38 -08:00

10 commits
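
Several of the commits above describe how the trigger server decides whether to accept a run: 409 when the same issue or the same non-issue reason is already in progress, 429 when capacity is exhausted, and 200 otherwise. The sketch below is one possible reading of that behavior, not the code in trigger-server.ts. The `Run` shape, the `admit` function, the order of the checks, and the `maxConcurrent` parameter are all assumptions made for illustration.

```typescript
// Hypothetical sketch only: names, types, and check order are assumptions,
// not taken from the real trigger-server.ts.
interface Run {
  reason: string;   // e.g. "schedule", "issues", "review_all"
  issue?: number;   // present only for issue-triggered runs
}

// Decide the HTTP status for an incoming trigger given the runs in progress.
function admit(active: Run[], incoming: Run, maxConcurrent: number): 200 | 409 | 429 {
  const duplicate = active.some((run) =>
    incoming.issue !== undefined
      ? run.issue === incoming.issue                                  // same issue already running
      : run.issue === undefined && run.reason === incoming.reason     // same non-issue reason already running
  );
  if (duplicate) return 409;                       // dedup: reject the duplicate
  if (active.length >= maxConcurrent) return 429;  // capacity: no free slot
  return 200;                                      // accept (fire-and-forget)
}

// Usage: a second review_all is rejected while the first is still running.
const inProgress: Run[] = [{ reason: "review_all" }];
console.log(admit(inProgress, { reason: "review_all" }, 2));        // 409
console.log(admit(inProgress, { reason: "issues", issue: 7 }, 2));  // 200
console.log(admit(inProgress, { reason: "issues", issue: 7 }, 1));  // 429
```

Checking for duplicates before capacity is a design choice in this sketch: a duplicate trigger then gets the more specific 409 even when every slot is busy.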