codeburn

mirror of https://github.com/AgentSeal/codeburn.git synced 2026-05-17 12:20:43 +00:00

Author	SHA1	Message	Date
Resham Joshi	daa673449c	Menubar and CLI hardening from multi-agent audit (#257) Two passes of validators across CLI accuracy, dashboard UX, menubar Swift, performance, security, and end-to-end smoke tests on real session data. Data-correctness fixes: - parseLocalDate rejects month/day overflow. JS Date silently rolled Feb 31 to Mar 3, so --from 2026-02-31 --to 2026-03-15 quietly dropped sessions on Feb 28 - Mar 2. Now throws "Invalid date" with a clear reason. Leap-day case covered (2024-02-29 valid, 2025-02-29 rejected). - CSV/JSON exports use the active currency's natural decimal places. The previous round2 helper produced ¥412.37 in CSV while the dashboard rendered ¥412; finance teams comparing the two surfaces saw a discrepancy. New roundForActiveCurrency consults Intl.NumberFormat for the right precision (0 for JPY/KRW/CLP, 2 for USD/EUR, etc). - Copilot toolRequests is Array.isArray-guarded in both modern and legacy event branches. Previously a corrupt session with toolRequests=null or a string aborted the whole file's parse loop and silently dropped every legitimate call after it. - Codex token_count dedup uses a null sentinel for prevCumulativeTotal so the first event is never confused with a duplicate. Sessions that emit only last_token_usage (no total_token_usage) report cumulativeTotal=0 on every event; with the previous 0-initialized prev, the first event matched the dedup guard and was dropped. - LiteLLM pricing values are clamped to [0, 1] per token via safePerTokenRate. Defense in depth against a tampered upstream JSON shipping negative or absurdly large per-token costs that would otherwise propagate into all cost totals. Performance: - Cursor SQLite parse no longer pegs at minutes on multi-GB DBs.
Two changes: per-conversation user-message buffer uses an index pointer instead of Array.shift() (which was O(n) per call); and a real ROWID cutoff via subquery limits the scan to the most recent 250k bubbles with a stderr warning so power users get a partial report rather than a stalled CLI. - Spawned codeburn CLI subprocesses are terminated when the calling Task is cancelled. Without this, rapid period/provider tab clicks in the menubar cancelled the Task but left the subprocess running to completion, piling up zombie processes. UX: - Dashboard period switch flips to loading and clears projects synchronously before reloadData runs, eliminating the frame where the new period label rendered over the old period's projects. - Optimize findings tab paginates 3-at-a-time with j/k scroll. With 4 new detectors plus 7 originals, 8-10 findings * 6 lines was scrolling the StatusBar off the alt buffer top. - Custom --from/--to ranges hide the period tab strip and disable the 1-5 / arrow keys so a stray period press no longer abandons the user's explicit range. A "Custom range: X to Y" banner replaces the tab strip. - OpenCode storage-format warning is per-table-set, rate-limited to once per process, and points the user at OpenCode's migration step or the issue tracker. The previous all-or-nothing check fired the generic "format not recognized" string for any schema mismatch. Menubar / OAuth: - Both Claude and Codex bootstrap (Reconnect button) now honour the usageBlockedUntil 429 backoff that refreshIfBootstrapped respects. Spamming Reconnect during sustained rate-limit windows previously hammered the upstream endpoint on every click. - Codex Retry-After HTTP header is parsed (delta-seconds plus IMF-fixdate fallback) so we don't over-back-off when ChatGPT tells us a shorter window than our 5-minute floor. 
- Both credential cache files are written via SafeFile.write (O_CREAT | O_EXCL | O_NOFOLLOW with explicit 0600) so there is no race window where the temp file briefly exists at default umask, and a symlink at the destination cannot redirect the write. Reads now route through SafeFile.read with a 64 KiB cap, closing the symlink-follow gap on Data(contentsOf:). CI signal: - TypeScript strict typecheck (tsc --noEmit) is now zero errors. The six errors in src/providers/copilot.ts came from a discriminated-union catch-all branch whose `data: Record<string, unknown>` shape TS picked over the specific event branches when narrowing on `type`. Removed the catch-all; runtime falls through unknown event types via the existing if/else chain. Tests added: 16 new (now 555 total) - date-range-filter: month/day/year overflow rejection, leap-day correctness - currency-rounding: convertCost no-rounding contract, roundForActiveCurrency for USD/JPY/KRW/EUR - providers/copilot: malformed toolRequests does not abort the parse - providers/cursor-bubble-dedup: re-parse after token mutation does not double-count, single parse yields one call per bubble - providers/codex: first event with cumulativeTotal=0 not dropped, consecutive zero-cumulative duplicates still deduped	2026-05-06 22:15:11 -07:00
ozymandiashh	ff8b20a79e	review: drop streamError flag, add multi-chunk and torn-write tests - Stop tracking a separate streamError flag. createReadStream's default 64 KiB highWaterMark means the stream may already be reading chunk 2 when we break out of the loop after yielding the first line; if that later chunk errors, the flag could reject an otherwise-valid line. readline's async iterator already re-throws stream errors on Node 16+, which the existing catch handles. - Test: 120 KB session_meta line forces multi-chunk line assembly. - Test: truncated mid-write first line is rejected, not parsed as half an object.	2026-05-02 02:34:41 +03:00
ozymandiashh	98bbe5b678	review: cap first-line read size and add edge-case tests - Cap createReadStream at 1 MiB so a malformed file with no newline cannot make readline buffer indefinitely (real session_meta lines are 22-27 KB). - Capture stream errors explicitly; readline's async iterator does not always re-throw underlying stream errors per Node docs. - Test: assert project is extracted from the >16 KB session_meta to prove the line was actually parsed, not just discovered. - Test: session_meta line with no trailing newline is still accepted. - Test: empty rollout file is silently skipped.	2026-05-02 02:30:17 +03:00
ozymandiashh	945da9f0ba	fix(codex): read full first line for session validation `readFirstLine` allocated a fixed 16 KB buffer, but Codex CLI 0.128+ embeds the entire base_instructions / system prompt in the `session_meta` line, pushing it past 20 KB. When the buffer doesn't catch a newline, `isValidCodexSession` rejects the session, so every recent Codex session is silently excluded from totals. Switch to a streaming readline read so the first line is captured regardless of length, and add a regression test that creates a 40 KB session_meta payload. Locally, this changes my 30-day Codex total from €267 (only ~half of sessions parsed) to €878 (all sessions parsed).	2026-05-02 02:17:53 +03:00
AgentSeal	475ab0da61	fix: case-insensitive Codex originator check Codex Desktop on Windows uses "Codex Desktop" as the originator string instead of "codex_cli" or "codex_vscode". The startsWith check was case-sensitive, rejecting these sessions silently. Fixes #1 (comment by @JiglioNero).	2026-04-15 16:06:28 -07:00
AgentSeal	51c56d0726	fix: include agent/subagent sessions, fix Codex cache hit and cost calculation - Remove agent-*.jsonl exclusion filter that was dropping ~46% of API calls - Scan subagents/ directories for subagent session files - Normalize Codex token semantics: OpenAI includes cached tokens inside input_tokens, subtract them to match Anthropic's separate reporting - Fixes cost double-counting and 100% cache hit display for Codex users	2026-04-14 10:18:14 -07:00
AgentSeal	391a235d1d	feat: multi-provider support (Codex + provider plugin system) Add Codex (OpenAI) as a second provider alongside Claude Code. Provider plugin architecture makes adding future providers (Pi, OpenCode, Amp) a single-file addition. - Provider interface: types, session discovery, stateful JSONL parsing - Codex parser: token_count dedup, tool normalization, model resolution - TUI: press p to cycle All/Claude/Codex with 1-min cache for instant switching - CLI: --provider flag on report, today, month, status, export commands - Pricing: Codex model fallbacks, fixed fuzzy matching for gpt-5.4-mini - Menubar: per-provider cost breakdown when multiple providers detected - 27 tests (10 new: Codex parser, provider registry, tool/model mapping)	2026-04-14 04:32:09 -07:00
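
Commit daa673449c describes parseLocalDate rejecting month/day overflow instead of letting JS Date roll Feb 31 forward into March. The following is a minimal sketch of one way to do that with a round-trip check, not the repository's actual code: the function name parseLocalDateStrict, the YYYY-MM-DD regex, and the exact error wording are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real parseLocalDate in codeburn may differ.
function parseLocalDateStrict(input: string): Date {
  // Assumption: dates arrive as YYYY-MM-DD, as in the --from/--to examples.
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (match === null) {
    throw new Error(`Invalid date "${input}": expected YYYY-MM-DD`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // The Date constructor silently normalizes overflow (Feb 31 becomes Mar 3).
  const date = new Date(year, month - 1, day);

  // Round-trip check: if any component changed, the input named a day
  // that does not exist on the calendar.
  if (
    date.getFullYear() !== year ||
    date.getMonth() !== month - 1 ||
    date.getDate() !== day
  ) {
    throw new Error(`Invalid date "${input}": no such calendar day`);
  }
  return date;
}
```

Under these assumptions, parseLocalDateStrict("2024-02-29") returns a Date, while "2025-02-29" and "2026-02-31" both throw, which mirrors the leap-day cases listed in the commit message.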

7 commits
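
The Codex token_count dedup fix in commit daa673449c relies on a null sentinel so that a first event with cumulativeTotal=0 is not mistaken for a duplicate. A rough sketch of that idea follows. Only the names cumulativeTotal and prevCumulativeTotal come from the commit message; the TokenCountEvent type and the dedupeTokenCounts function are hypothetical and simplified.

```typescript
// Hypothetical event shape; real Codex events carry more fields.
interface TokenCountEvent {
  cumulativeTotal: number;
}

// Drops consecutive events whose cumulative total did not change.
function dedupeTokenCounts<T extends TokenCountEvent>(events: T[]): T[] {
  // null means "no event seen yet". Initializing to 0 instead would make
  // a first event with cumulativeTotal === 0 look like a duplicate.
  let prevCumulativeTotal: number | null = null;
  const kept: T[] = [];
  for (const event of events) {
    if (
      prevCumulativeTotal !== null &&
      event.cumulativeTotal === prevCumulativeTotal
    ) {
      continue; // same running total as the previous event: duplicate
    }
    kept.push(event);
    prevCumulativeTotal = event.cumulativeTotal;
  }
  return kept;
}
```

With totals 0, 0, 5, 5, 7 as input, this sketch keeps the events with totals 0, 5, 7: the first zero survives and the repeated zero is still dropped, matching the two providers/codex test cases named in the commit.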