Commit graph

20 commits

Author SHA1 Message Date
L
f7c6e07867
feat: security triage applies full label taxonomy (#766)
* feat: security triage now applies full label taxonomy

Triage mode now applies:
- Safety label (safe-to-work / malicious / needs-human-review)
- Content-type label (bug, enhancement, security, question, etc.)
- Lifecycle label (Pending Review) so downstream teams can pick up

Team-building mode now transitions lifecycle labels:
- Adds "In Progress" at start, removes it on close

Added a "Available Labels Reference" section to the triage prompt
documenting all label categories for the agent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: all security-filed issues get safe-to-work + Pending Review

Issues filed by the security team (scan findings, drift/anomaly
reports, follow-up issues from closed PRs) now automatically get
`safe-to-work` and `Pending Review` labels so downstream teams
can immediately pick them up without waiting for another triage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove Pending Review from safe-to-work issues

safe-to-work already means triage is complete — adding Pending Review
is redundant and confusing. Now only UNCLEAR issues get Pending Review
(they still need human attention). SAFE issues and security-filed
issues skip straight to actionable with just safe-to-work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: normalize all labels to kebab-case

Renamed on GitHub:
- "In Progress" → "in-progress"
- "Pending Review" → "pending-review"
- "Under Review" → "under-review"
- "good first issue" → "good-first-issue"
- "help wanted" → "help-wanted"

Updated all references in security.sh and refactor.sh to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: align issue templates and workflows with actual repo labels

Created missing labels: cloud-request, agent-request, cli.
Replaced nonexistent needs-triage with pending-review in all templates.

Templates updated:
- bug_report: bug + pending-review
- cli_feature_request: cli + enhancement + pending-review
- cloud_request: cloud-request + enhancement + pending-review
- agent_request: agent-request + enhancement + pending-review

Workflows updated:
- refactor.yml: trigger on safe-to-work AND (bug|cli|enhancement|maintenance)
- discovery.yml: already correct (safe-to-work AND cloud-request|agent-request)
- security.yml: already correct (team-building label check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sprite <noreply@sprites.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 16:20:07 -08:00
L
4924a7d5db
feat: add security triage gate for issue safety before agent processing (#734)
New issues are triaged by the security team before other workflows can
act on them. The triage agent checks for prompt injection, social
engineering, spam, and unsafe payloads — marking safe issues with
`safe-to-work`, closing malicious ones, or flagging unclear ones for
human review. Discovery and refactor workflows now require the
`safe-to-work` label in addition to their existing label requirements.

Co-authored-by: Sprite <noreply@sprites.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 14:23:33 -08:00
L
4d175ae6c7
feat: add Team Building issue template + route workflows by label (#733)
- New issue template: Team Building (team-building label) — 2 fields:
  which agent team to improve + what to change
- Security team gets a new team_building mode: reads the issue, spawns
  implementer + reviewer (both Opus), creates PR, reviews, merges, closes issue
- Discovery workflow: only triggers on cloud-request / agent-request issues
- Refactor workflow: only triggers on bug / cli issues
- Security workflow: only triggers on team-building issues (+ PR/schedule)
- All workflows still run on schedule and workflow_dispatch as before

Co-authored-by: Sprite <noreply@sprites.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-12 14:17:57 -08:00
B
200b6dc5b2 fix: Force HTTP/1.1 for streaming to avoid HTTP/2 stream errors
HTTP/2 has strict stream lifecycle management that doesn't play well
with long-lived chunked responses — curl exits with error 92
(stream not closed cleanly: INTERNAL_ERROR). HTTP/1.1 handles
persistent streaming connections natively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-10 22:51:35 +00:00
B
874b9c95f4 feat: Stream script output back to GH Actions instead of keep-alive
Replace the broken keep-alive ping loop with a fundamentally better
approach: the trigger server now streams the script's stdout/stderr
back as the HTTP response body in chunks. The GH Action holds the
curl connection open for the entire cycle duration (~90 min timeout).

This works because Sprite keeps VMs alive while "actively servicing
HTTP requests." A single long-lived streaming response satisfies
this naturally — no synthetic pings needed.

Key changes:

trigger-server.ts:
- /trigger now returns a streaming text/plain Response
- stdout/stderr piped through ReadableStream with chunked output
- 30s heartbeat lines injected during silent periods
- Client disconnect handled gracefully (process keeps running)
- X-Accel-Buffering: no header to prevent proxy buffering

discovery.yml / refactor.yml:
- curl -sSN --fail-with-body streams output in real-time
- timeout-minutes: 90 to hold the connection for full cycles
- Error responses (429/409/401) still print body and exit cleanly

discovery.sh / refactor.sh:
- Removed all keep-alive logic (start_keepalive/stop_keepalive)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-10 18:09:26 +00:00
A
6f47c852c8 Increase refactor workflow frequency from 30min to 5min
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-10 16:24:12 +00:00
A
7ace2695e6
feat: Run issue-fix cycles concurrently with refactor cycles (#145)
Issue triggers now spawn lightweight 2-agent runs (15-min timeout) in
isolated worktrees, while refactor cycles continue independently with
the full 6-agent team (30-min timeout). Duplicate issue runs are
rejected with 409.

- trigger-server.ts: pass SPAWN_ISSUE/SPAWN_REASON env vars to script,
  add issue dedup (409), include issue in health/trigger responses
- refactor.sh: dual-mode (issue vs refactor) with isolated worktrees,
  mode-specific prompts and timeouts, scoped cleanup
- start-refactor.sh: set MAX_CONCURRENT=3 (gitignored, local only)
- refactor.yml: handle 409 alongside existing 429

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 22:15:19 -08:00
B
6b5a547e2d fix: Treat 429 (cycle already running) as success in workflows
When MAX_CONCURRENT=1 and a cycle is in progress, the trigger server
returns 429. This is expected behavior, not an error — the previous
curl -f treated it as failure (exit code 22).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-10 03:43:32 +00:00
A
ab343d26a2
fix: Prevent duplicate work, add graceful shutdown, and enforce team lifecycle (#86)
- Change trigger-server MAX_CONCURRENT default from 3 to 1 to prevent
  overlapping cycles that duplicate GitHub issue comments
- Add SIGTERM/SIGINT handling to trigger-server so running scripts finish
  gracefully on service restart instead of being killed mid-flight
- Add cleanup trap to refactor.sh for worktree/tempfile cleanup on exit
- Add pre-cycle cleanup of stale worktrees, merged branches, and
  abandoned PRs from previously interrupted cycles
- Add mandatory Lifecycle Management section to team lead prompt requiring
  shutdown_request to all teammates before exiting
- Add dedup checks to community-coordinator: check existing comments
  before posting to prevent duplicate acknowledgments/resolutions
- Pass issue number in workflow trigger reason for better logging

Co-authored-by: A <6723574+louisgv@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 09:10:56 -08:00
B
4c456df091 fix: Switch to direct sprite URL with bearer auth
The Sprite start service API (/services/{name}/start) returns
"service name required" for all service names — appears to be an API
bug. Switched to hitting the sprite's public URL directly with
TRIGGER_SECRET bearer auth instead.

- Re-added TRIGGER_SECRET auth to trigger-server.ts
- Set sprite url_settings.auth to "public"
- Updated both workflows to use SPRITE_URL + TRIGGER_SECRET pattern
- Aligned workflow structure (both use same env vars and curl format)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 09:07:49 +00:00
B
9eb9e74295 debug: Print secret lengths and hash to verify values
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 08:35:23 +00:00
B
87e5790880 debug: Echo SVC_NAME in refactor workflow
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 08:16:52 +00:00
Sprite
a361d92e13 fix: Pass env vars correctly in refactor workflow
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 08:13:09 +00:00
Sprite
758e79bb59 fix: Inline secret refs in curl URL to avoid env var issues
SERVICE_NAME env var may conflict with GitHub Actions internals.
Inline the secrets directly in the URL template instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 01:04:26 +00:00
Sprite
57cf080c39 chore: Run refactor workflow every 30 minutes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-09 00:09:04 +00:00
Sprite
66221dac80 fix: Use duration=0s to fire-and-forget on start service API
The Sprite start service API returns streaming NDJSON, causing curl -f
to fail with exit code 22. Use duration=0s to return immediately and
drop -f flag since the response is streaming.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 23:40:50 +00:00
Sprite
b7b102a352 fix: Remove curl timeout on trigger workflows
Sprite may take time to wake from pause, causing --max-time 30 to fail
with exit code 22.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 21:33:03 +00:00
Sprite
38ffd7ebd6 feat: Update trigger workflows to use Sprite start service API
- Replace SPRITE_URL/SPRITE_SECRET pattern with SPRITE_NAME/SERVICE_NAME
- Use Sprite start service API endpoint (api.sprites.dev)
- Share SPRITE_TOKEN across all services
- Update skill documentation to reflect new approach
- Delete deprecated URL/SECRET based secrets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-08 20:29:19 +00:00
Sprite
286609c1ed feat: Add concurrency limits to trigger workflows
Add max 3 concurrent run limits:
- GitHub Actions: concurrency groups prevent workflow queue buildup
- trigger-server: tracks concurrent runs, rejects with 429 if at max
- Configurable via MAX_CONCURRENT env var (defaults to 3)
- Returns running count and max in trigger response

This prevents resource exhaustion when workflows trigger frequently.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 19:34:52 +00:00
L
4a05b32897
Add GitHub Actions triggers for Sprite services (#53)
* refactor: Automated improvements

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore: Remove __pycache__ and add to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sprite <noreply@sprite.dev>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 10:29:18 -08:00