codeburn

mirror of https://github.com/AgentSeal/codeburn.git synced 2026-05-17 12:20:43 +00:00

Author	SHA1	Message	Date
voidborne-d	c16b21ec50	fix(classifier): surface skill name as subCategory for general turns (#203) Turns whose only assistant tool is `Skill` collapse to category `general` because `classifyByToolPattern` returns `'general'` and `refineByKeywords` only operates on `coding`/`exploration`. In environments that lean on Claude Code skills, the per-activity dashboard column flattens: every `/init`, `/review`, `/security-review`, `/claude-api`, plus user-defined skills, all land in `general` with no signal about which workflow ran. Implements Option A from the issue: - `ParsedApiCall.skills: string[]` populated in the Anthropic-path parser via a new `extractSkillNames` helper that reads `input.skill || input.name` from each `Skill` ToolUseBlock (mirrors `detectGhostSkills` extraction at optimize.ts:765 so the two stay in sync). - `ClassifiedTurn.subCategory?: string` set to the first skill name when the resolved category is `general` AND any skill identifier was extracted. Top-level category stays `general`; existing aggregations, exports, and category-keyed code paths unchanged. - `SessionSummary.skillBreakdown: Record<string, {turns, costUSD, editTurns, oneShotTurns}>` populated in the same per-turn loop that builds `categoryBreakdown`. Provider sessions (Codex/Cursor/etc.) keep `skills: []` because they don't expose the Skill tool surface today. - Dashboard `ActivityBreakdown` renders top-N skill sub-rows beneath the `general` row when present (indented `/skill-name`, dimmed). Other categories render exactly as before; if no skills were invoked, the panel is byte-identical to current output. Existing 419 tests still pass. New `tests/classifier.test.ts` adds 8 cases: single skill via `input.skill`, single via `input.name`, first-wins for multi-skill turns, aggregation across multiple assistant calls in one turn, no-name fallback (`subCategory` stays undefined), `Skill+Edit` promoting to `coding` and dropping subCategory, non-Skill general turns, and a legacy ParsedApiCall shape with `skills` field absent (forward-compat). Pre-fix verification by stashing the source change reproduces 4/8 failures with the exact "expected 'init', received undefined" diff; restoring → 8/8 pass. Closes #203. 🤖 AI assistance disclosure: assistant-scaffolded by Claude (Opus 4.7); author of record reviewed every line, ran the full vitest suite locally (`npm test` → 32 files / 427 tests pass), `npx tsc --noEmit` clean, and `npm run build` produces a clean ESM bundle.	2026-05-04 06:26:45 +08:00
iamtoruk	bd43b15342	feat(compare): model comparison with planning rate fix 5-section compare view: Performance (one-shot, retry, self-correction), Efficiency (cost/call, cost/edit, output/call, cache hit), Category Head-to-Head bar charts, Working Style, and Context. Planning rate now detects TaskCreate/TaskUpdate/TodoWrite instead of only EnterPlanMode (which was never used, showing 0% for all models). Validated against raw JSONL with zero false positives. Responsive side-by-side layout at 90+ cols. Self-correction scanner with compact file skipping and model+timestamp dedup. 274 tests.	2026-04-19 08:34:49 -07:00
iamtoruk	fb24eea186	fix(compare): refine self-correction patterns, skip compact files, deduplicate Remove high-false-positive patterns (I'm sorry, I should have, sorry for). Add precise patterns (you're right I, that was incorrect, let me correct). Skip compact JSONL files that replay compressed context. Deduplicate by model+timestamp to prevent double-counting. Fix test timestamps to work with deduplication.	2026-04-19 07:14:02 -07:00
iamtoruk	3cb9a7a7bc	feat(compare): add self-correction JSONL scanner Adds scanSelfCorrections() which reads raw .jsonl session files (including subagent dirs) and counts per-model self-correction patterns for use in the model comparison metrics.	2026-04-19 05:25:31 -07:00
iamtoruk	ac9afffed5	feat(compare): add computeComparison with normalized metrics	2026-04-19 05:22:34 -07:00
iamtoruk	9d119bfe40	feat(compare): add ModelStats type and aggregateModelStats	2026-04-19 05:20:37 -07:00
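
The deduplication described in commit fb24eea186 (count each self-correction once per model+timestamp) can be sketched roughly as follows. This is only an illustration: the `AssistantEntry` shape, the `countSelfCorrections` name, and the exact regular expressions are assumptions made for this sketch, not the code in the repository. Only the three phrases come from the commit message.

```typescript
// Hypothetical entry shape; the real JSONL record layout is not shown in this log.
interface AssistantEntry {
  model: string
  timestamp: string
  text: string
}

// Phrases listed in the commit message; the regex details are assumptions.
const SELF_CORRECTION_PATTERNS: RegExp[] = [
  /you're right,? I\b/i,
  /that was incorrect/i,
  /let me correct/i,
]

// Count self-correcting entries per model, skipping any entry whose
// model+timestamp pair has already been seen (e.g. replayed context).
function countSelfCorrections(entries: AssistantEntry[]): Map<string, number> {
  const seen = new Set<string>()
  const counts = new Map<string, number>()
  for (const entry of entries) {
    const key = `${entry.model}\u0000${entry.timestamp}`
    if (seen.has(key)) continue
    seen.add(key)
    if (SELF_CORRECTION_PATTERNS.some((pattern) => pattern.test(entry.text))) {
      counts.set(entry.model, (counts.get(entry.model) ?? 0) + 1)
    }
  }
  return counts
}
```

Keying on model plus timestamp means a message that appears in both a main session file and a replayed copy is counted once, which is the double-counting problem the commit describes.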

6 commits
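
As a rough illustration of the behavior described in commit c16b21ec50, the sketch below extracts skill names from `Skill` tool-use blocks and attaches the first one as `subCategory` only when the category is `general`. The names `extractSkillNames`, `ClassifiedTurn`, and `subCategory` appear in the commit message; the `ToolUseBlock` fields, the `withSkillSubCategory` helper, and both function bodies are assumptions for this sketch and may differ from the actual implementation.

```typescript
// Assumed minimal shape of a tool-use block; only the fields used here.
interface ToolUseBlock {
  type: 'tool_use'
  name: string
  input: Record<string, unknown>
}

interface ClassifiedTurn {
  category: string
  subCategory?: string
}

// Collect skill identifiers from `Skill` blocks, reading
// `input.skill || input.name` as the commit message describes.
function extractSkillNames(blocks: ToolUseBlock[]): string[] {
  const names: string[] = []
  for (const block of blocks) {
    if (block.name !== 'Skill') continue
    const raw = block.input.skill || block.input.name
    if (typeof raw === 'string' && raw.length > 0) names.push(raw)
  }
  return names
}

// Hypothetical helper: set subCategory to the first skill only for
// `general` turns; `skills` may be absent on legacy parsed calls.
function withSkillSubCategory(category: string, skills?: string[]): ClassifiedTurn {
  const first = skills?.[0]
  if (category === 'general' && first !== undefined) {
    return { category, subCategory: first }
  }
  return { category }
}
```

For example, a turn with two `Skill` blocks for `init` and `review` would be reported as `general` with `subCategory` set to `init`, while the same skills on a `coding` turn would leave `subCategory` undefined.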