codeburn

mirror of https://github.com/AgentSeal/codeburn.git synced 2026-05-28 01:16:28 +00:00

Author	SHA1	Message	Date
AgentSeal	7aefd674fc	fix: drop better-sqlite3 to remove deprecated prebuild-install (#75) npm was warning on every install that prebuild-install@7.1.3 is no longer maintained. prebuild-install ships as a transitive dependency of better-sqlite3 and upstream PR #1446 to replace it is still open, so we switch to Node's built-in node:sqlite module (stable in Node 24, experimental in Node 22/23) and remove the better-sqlite3 dep entirely. - src/sqlite.ts: uses DatabaseSync from node:sqlite. The one-shot ExperimentalWarning about SQLite on Node 22/23 is silenced for that specific warning; other warnings pass through unchanged. - package.json: engines.node bumped to >=22 (Node 20 EOL 2026-04-30), better-sqlite3 and @types/better-sqlite3 removed, @types/node added (it was coming in transitively via @types/better-sqlite3). - tests/providers/opencode.test.ts: fixture DB creation switched to node:sqlite (API parity for the CREATE TABLE + INSERT + prepare path we use). End-user install footprint shrinks from 167 to 40 packages and prints zero deprecation warnings. Credit: @primeminister for the report.	2026-04-18 01:26:23 -07:00
Resham Joshi	495a254338	feat(mac): native Swift menubar app + one-command install Introduces mac/ with a native SwiftUI menubar app that replaces the previous SwiftBar plugin entirely. Install via `npx codeburn menubar`, which downloads the .app from GitHub Releases, strips Gatekeeper quarantine, and drops it into ~/Applications. Highlights - mac/ SwiftUI app: agent tabs, Today/7/30/Month/All period switcher, Trend/Forecast/Pulse/Stats/Plan insights, activity + model breakdowns, optimize findings, CSV/JSON export, Star-on-GitHub banner, live 60s refresh, instant currency switching with offline FX cache. - Security: CodeburnCLI argv-based spawn (no shell interpretation), SafeFile symlink guards + O_NOFOLLOW writes, FX rate clamping to [0.0001, 1_000_000], keychain filtered to account == "default", removed byte-window credential log, in-flight refresh guard, POSIX flock on config.json writes, TerminalLauncher validates argv before AppleScript interpolation. - Performance: shared static NumberFormatter (thousands of allocations per popover redraw eliminated), concurrent pipe drain with 20 MB cap + 60s timeout in DataClient, Observation-tracked reactive UI, 5-min payload cache keyed on (period, provider). - CLI: new `codeburn menubar` subcommand that downloads + installs + launches the .app (no clone, no build). New `status --format menubar-json` payload builder. `export` rewritten to produce a folder of one-table-per-file CSVs with a `.codeburn-export` marker so arbitrary -o paths cannot be silently deleted. - Removed: src/menubar.ts (SwiftBar plugin generator), install-menubar / uninstall-menubar subcommands, `status --format menubar` directive output, tests/menubar.test.ts, tests/security/menubar-injection.test.ts. - Release: .github/workflows/release-menubar.yml builds universal binary, assembles .app, ad-hoc signs, zips, uploads on mac-v* tag push. Runs on the free macos-latest runner. Tests - 230 TypeScript tests pass - 10 Swift CapacityEstimator tests pass - TypeScript typecheck clean - Swift release build clean	2026-04-17 16:55:56 -07:00
AgentSeal	77257bcb89	Merge pull request #68 from lfl1337/fix/remove-claudeignore-references docs(optimize): remove references to .claudeignore (#61)	2026-04-17 14:20:50 +02:00
Ninym	71461fb352	test(security): add failing test for MEDIUM-2 menubar injection Three cases (pipe-in-model, ANSI-in-model, pipe-in-category) reproduce the audit's SwiftBar directive-separator attack. Tests fail against current menubar.ts -- Task 13 will close with an allowlist sanitizer.	2026-04-17 08:32:20 +02:00
Ninym	0ab66f6fe9	test(fs-utils): add failing test for bounded read helper Tests the to-be-built readSessionFile helper: under-cap fast path, at-threshold stream path, over-cap null+skip, verbose stderr warning, and stat-failure graceful fallback. Fails against missing module -- Task 5 will implement src/fs-utils.ts to flip GREEN.	2026-04-17 08:32:18 +02:00
Ninym	e890d9bfc3	test(security): add failing test for HIGH-1 prototype pollution Three PoC fixtures (tool name, bash command, model name) reproduce the audit's HIGH-1 attack. Tests assert Object.prototype.calls stays undefined after parsing. They fail against current parser.ts -- Task 3 will close the pollution sink with Object.create(null).	2026-04-17 08:32:18 +02:00
Ninym	bd71377fdd	docs(optimize): remove references to non-existent .claudeignore Claude Code does not document or implement a .claudeignore feature. The junk-reads detector's fix is now a CLAUDE.md instruction asking Claude to avoid generated/dependency directories. The separate detectMissingClaudeignore finding and its tests are removed; checking for the presence of a non-existent file has no signal. Closes #61.	2026-04-17 08:32:07 +02:00
AgentSeal	1cd96ea19f	Merge origin/main into feat/optimize Resolve conflicts in src/dashboard.tsx: keep optimize view plumbing (setOptimizeResult, OptimizeResult state, o/b keys) while integrating project/exclude filters on reloadData and renderDashboard entry.	2026-04-16 16:08:20 -07:00
AgentSeal	98c1e266d7	test: cover filterProjectsByName include/exclude semantics Adds unit tests for the project-filter helper: include OR semantics, exclude AND-negation, case-insensitive matching against both project name and projectPath, ordering (exclude applied after include), empty-string edge case, and input immutability.	2026-04-16 15:49:57 -07:00
AgentSeal	21444df2bf	merge main into feat/optimize - resolve dashboard.tsx conflicts: keep optimize view + context budget column alongside main's all-time period and TopSessions panel - ProjectBreakdown: add avg/s column from main plus overhead column from optimize, widths 30/40 - StatusBar: 1-5 periods including all-time, plus o-optimize when findings exist - DashboardContent: all-time period handling and TopSessions panel preserved Copilot provider and its 253 tests from main merged cleanly as additions.	2026-04-16 12:56:53 -07:00
akki	9b6a9e8fc3	fix: respect SwiftBar settings when installing the menu bar	2026-04-16 22:39:51 +03:00
AgentSeal	1e01020ee9	Merge pull request #44 from theodorosD/feat/copilot-provider feat: add GitHub Copilot provider	2026-04-16 19:02:57 +02:00
AgentSeal	c02f63235a	test(optimize): add 34 filesystem-mocking tests Covers previously untested detectors and helpers with real temp-dir fixtures (not stubs) to verify behavior against actual file I/O. New coverage: - detectMissingClaudeignore: project with/without junk dirs and .claudeignore, impact scaling. - detectBloatedClaudeMd: plain oversized file, @-import expansion, circular import safety, email/npm-scope @-token filtering. - loadMcpConfigs: project reads, colon-to-underscore normalization, malformed JSON tolerance. - detectUnusedMcp: 24-hour grace period, config vs invocation merge. - detectBashBloat: env var unset, configured under/over limit. - detectGhostCommands: path prefixes are not commands, <command-name> tag parsing. - scanJsonlFile: missing file, tool_use parsing, malformed line skipping, date-range filter. - scanAndDetect: empty projects returns healthy result. - estimateContextBudget: system base, MCP tools, memory files. - discoverProjectCwd: empty dir, no jsonl, cwd extraction. Uses vi.mock to redirect os.homedir() to a disposable temp directory, so tests do not read the tester's real ~/.claude. 34 tests, <30ms wall time. Total suite now 160 tests.	2026-04-16 09:33:03 -07:00
Teo Delis	e7633d932b	fix: address PR review feedback on Copilot provider - init currentModel to '' and skip assistant messages before first session.model_change to avoid silent misattribution - add comment documenting why inputTokens is always 0 - fix delete_file tool mapping ('Edit' -> 'Delete') - add schema doc comment to ToolRequest optional fields - remove catch-all from CopilotEvent union for proper TS narrowing - add tests: pre-model-change skip, workspace.yaml quote/comment strip, longest-prefix model display name match	2026-04-16 19:30:08 +03:00
AgentSeal	a9ca2a1134	feat(optimize): fix tracking via recent vs baseline split Solves the problem where users who fixed an issue continued to see the finding for the remainder of the period. Findings now show visible progress or disappear entirely. Mechanism (no state file, no new I/O): - ToolCall and ApiCallMeta gain a `recent` boolean, set when the entry's timestamp falls inside a rolling 48-hour window. - Each session-based detector counts recent vs total occurrences. - computeTrend classifies each finding: active -- recent rate matches baseline improving -- recent rate under half of baseline (green arrow) resolved -- zero recent waste AND confirmed recent activity - Resolved findings are suppressed. Improving findings render with a green "improving down-arrow" badge next to the impact label. - When no recent activity exists, findings default to active so a user who simply paused is not told everything is fixed. Applies to the four session-based detectors: junk reads, duplicate reads, low read:edit ratio, cache bloat. The filesystem detectors (missing .claudeignore, bloated CLAUDE.md, unused MCP, ghost agents / skills / commands, bash limit) already self-heal on next run. 5 new tests cover computeTrend edge cases. 126 tests pass.	2026-04-16 06:53:08 -07:00
Ninym	0461193819	fix: handle empty firstTimestamp in TopSessions, add dashboard tests - TopSessions: show '----------' placeholder when session.firstTimestamp is empty (Copilot provider yields '' when timestamp is missing) - DashboardContent: add comment explaining undefined days tri-state - tests/dashboard.test.ts: cover top-5 selection (fewer than 5 sessions, tied costs, descending sort) and avg/s returning '-' for zero sessions Authored-by: AgentSeal <hello@agentseal.org>	2026-04-16 15:48:24 +02:00
AgentSeal	be45045fd8	refactor(optimize): correctness, constants, and real tests Phase 1 hardening pass. Bug fixes: - Move cwd/version collection inside date-range filter. 7d and 30d now produce different findings for filesystem detectors. - detectGhostSkills threshold aligned with peer detectors. - detectUnusedMcp gets 24-hour grace period via config file mtime so newly added servers are not flagged as unused. - detectCacheBloat replaces hardcoded 50K baseline with user-derived p25 of cache writes. Flags only when median exceeds 1.4x baseline. - detectBashBloat scans user shell profiles instead of the auditor's process.env. - @-import pattern requires ./ ../ or / to avoid matching email addresses or npm scopes. - Command usage pattern requires leading whitespace/start-of-line before /cmd so path references like /tmp are not counted as usage. - AVG_TOKENS_PER_READ lowered from 1500 to 600 and CLAUDEMD_TOKENS_PER_LINE lowered from 25 to 13 for realistic prose/config sizing. Code quality: - Every magic number extracted to named module-scope constants. - Dead code removed (IMPACT_ORDER, unused stat import). - Shared loadMcpConfigs helper deduplicates config walking. - Shared shortHomePath, isReadTool, inRange helpers. - All detectors and computeHealth exported for real tests. - Ghost detectors run in parallel via Promise.all. - Cost rate defaults to 0 when unknown so UI can suppress instead of showing fabricated numbers. Tests: - Replaced 17 fake tests that re-implemented detector logic with 26 tests importing and exercising the real exports. - Cover threshold boundaries, impact scaling, edge cases. - 121 tests pass. UX header: "Setup" renamed to "Health", issue count shown inline. CLAUDE.md: adds rule against "steal/copy/rip-off" wording in public-facing text.	2026-04-16 06:30:15 -07:00
Teo Delis	a8517d3235	feat: add GitHub Copilot provider - Parse ~/.copilot/session-state/*/events.jsonl - Track model via session.model_change events - Extract tools from assistant.message toolRequests - Add fallback pricing for gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5-mini, o3, o4-mini - Factory function createCopilotProvider(sessionStateDir?) for testability - Typed event variants (ModelChangeData, UserMessageData, AssistantMessageData) - bashCommands: [] in yield (Copilot does not log bash commands) - 13 tests covering parsing, model tracking, tool extraction, dedup, discoverSessions - Note: only outputTokens available (Copilot does not log input tokens)	2026-04-16 15:40:22 +03:00
AgentSeal	b88f2cd730	feat: ghost detectors, health grade, @-import expansion Expanded the optimize engine with new detectors and scoring: 1. Health score + letter grade (A-F) in optimize header. Weighted per-impact with caps. Gives users an instant "is my setup healthy" read that doubles as a shareable number. 2. Urgency score replaces impact-enum sort. Weighted 0.7 * impact + 0.3 * normalized tokens. Produces better-ranked findings. 3. Three new ghost detectors: - Ghost agents: files in ~/.claude/agents/ never invoked via Agent/Task tool - Ghost skills: SKILL.md directories never triggered - Ghost slash-commands: ~/.claude/commands/ files never referenced in user messages 4. @-import chain expansion for CLAUDE.md. Recursively follows @path/to/file imports (max depth 5) so bloat detection counts transitive load, not just the base file. Fixes undercounting for users with modular CLAUDE.md setups. 9 new tests covering health scoring and import expansion.	2026-04-16 05:09:01 -07:00
AgentSeal	710316053e	feat: add optimize command, in-TUI optimize view, and per-project context budget Optimize engine detects 8 waste patterns from Claude Code session data: - Junk directory reads (node_modules, .git, dist, etc.) - Duplicate file reads per session - Unused MCP servers (configured but never called) - Missing .claudeignore in projects with junk dirs - Bloated CLAUDE.md files (>200 lines) - Uncapped BASH_MAX_OUTPUT_LENGTH - Low Read:Edit ratio (edit-without-reading, per #42796) - High cache_creation overhead (per #46917) Each finding includes impact rating, token/cost savings estimate, and exact fix (paste, command, or file content). Dashboard integration: - o key switches to in-TUI optimize view, b key goes back - Background scan on load, o button only when findings exist - Per-project Context Budget column in By Project panel showing estimated per-call overhead (system + MCP tools + skills + CLAUDE.md) CLI: codeburn optimize [-p period] [--provider]	2026-04-16 04:04:37 -07:00
AgentSeal	97c0869763	chore: hoist Pi model sort + cover bash-utils edge cases - Move modelDisplayEntries sort to module scope so it runs once instead of on every modelDisplayName call. - Add bash-utils tests for the env var prefix and true/false skip behaviors that came in with the Pi commit. Verified Pi token semantics against real session data: input + cacheRead + cacheWrite + output equals totalTokens exactly, confirming Anthropic-style accounting (cached tokens disjoint from input). No double-counting.	2026-04-16 02:02:32 -07:00
AgentSeal	d92d5b3f26	chore: normalize Pi tool names via toolNameMap Maps Pi's lowercase tool names (bash, read, edit, write...) to the capitalized form used by every other provider, so the dashboard shows consistent tool names across providers and the activity classifier works without extra lowercase entries. Reverts the lowercase additions to classifier.ts since the provider now normalizes tool names at the source.	2026-04-16 01:57:39 -07:00
Damian Jackson	7ac512a7e4	feat: add Pi provider for tracking Pi agent sessions - Adds support for Pi (pi.ai) as a new session provider. - Pi sessions are stored as JSONL files under `~/.pi/agent/sessions/<project-dir>/` and use OpenAI-compatible model IDs (gpt-5, gpt-5.4, gpt-4o, etc.). - `src/providers/pi.ts` (new): Pi provider - discovers JSONL session files, parses assistant turns, extracts token counts, tool calls, and bash commands, deduplicates via response ID with line-index fallback - `src/providers/types.ts`: added bashCommands field to `ParsedProviderCall` so all providers carry extracted bash command lists - `src/providers/index.ts`: registered Pi as a core provider alongside Claude and Codex - `src/providers/codex.ts`, `cursor.ts`: added `bashCommands: []` to satisfy the new required field on `ParsedProviderCall` - `src/parser.ts`: fixed bug where `providerCallToTurn` always emitted an empty bashCommands array instead of passing through the parsed commands - `src/classifier.ts`: added lowercase tool name variants (bash, edit, read, write) to match Pi's tool naming convention in JSONL output - `src/bash-utils.ts`: exclude `true`, `false`, and shell variable assignments from extracted commands; scan past leading `NAME=val` tokens so `FOO=bar ls` correctly records `ls` rather than being dropped - `package.json`: added pi to keywords - `tests/providers/pi.test.ts` (new): 16 unit tests covering session discovery, multi-turn parsing, tool/bash extraction, deduplication, zero-token filtering, and display name mapping - `tests/provider-registry.test.ts`: updated core provider list to include pi - [X] Unit tests pass (`npx vitest run`, 56 tests across 6 files); - [X] Manually verified via `npx tsx src/cli.ts` report and showing Pi sessions alongside Claude and Codex in the dashboard.	2026-04-16 01:54:42 -07:00
AgentSeal	475ab0da61	fix: case-insensitive Codex originator check Codex Desktop on Windows uses "Codex Desktop" as the originator string instead of "codex_cli" or "codex_vscode". The startsWith check was case-sensitive, rejecting these sessions silently. Fixes #1 (comment by @JiglioNero).	2026-04-15 16:06:28 -07:00
AgentSeal	2d114d9393	feat: add OpenCode provider Reads session data from OpenCode's SQLite databases at ~/.local/share/opencode/. Reuses the existing better-sqlite3 adapter (same as Cursor), lazy-loaded so users without OpenCode see no difference. Adds bashCommands to the provider interface so shell command breakdowns work across all providers. 31 tests, schema validation, diagnostic stderr on failures. Also fixes a pre-existing tsc error in currency.ts.	2026-04-15 14:24:37 -07:00
AgentSeal	2afab5f71a	fix: final review cleanup - Remove unused vi import from cursor test - Move LANG_DISPLAY_NAMES to module scope (was re-created per render) - Remove redundant 'script' regex (scrip?t already covers it) - Unexport getDbFingerprint (internal to cache module) - Move beforeEach inside describe block (only cursor tests need it)	2026-04-15 05:35:11 -07:00
AgentSeal	94762ca1f4	fix: address review findings before merge - getProvider() now async, eliminates race condition with cursor loading - cursor:edit pseudo-tool prevents inflating Claude's Edit count in --provider all - Tightened SCRIPT_PATTERNS to avoid false positives (run requires file context) - Removed duplicated LANG_NAMES from cursor.ts (dashboard handles display) - Test no longer assumes cursor always loads (CI-safe) - Removed unnecessary type assertion and setTimeout yield	2026-04-15 05:31:51 -07:00
AgentSeal	3fabc105d8	perf: file-based result cache for Cursor DB First run parses the 21GB DB (slow, ~40-80s). Writes parsed results to ~/.cache/codeburn/cursor-results.json. Subsequent runs check DB mtime+size -- if unchanged, load from cache (instant). Cache auto-invalidates when Cursor modifies the DB.	2026-04-15 05:11:30 -07:00
AgentSeal	b7b7b2c7d6	perf: lazy-load cursor provider to eliminate startup overhead Cursor module (sqlite.ts, better-sqlite3) now only loads when cursor provider is actually requested. Claude/Codex startup is unaffected -- cursor import never happens unless needed.	2026-04-15 03:59:49 -07:00
AgentSeal	70931b7269	feat: add Cursor IDE provider with SQLite adapter Reads token usage from Cursor's local state.vscdb database. Supports per-request input/output tokens, model tracking, and incremental caching for large databases. - better-sqlite3 as optionalDependency (lazy-loaded, no impact on Claude/Codex) - Parameterized SQL queries, read-only mode, per-row error handling - Schema detection with clear error on format changes - Cache layer with timestamp watermark for incremental reads - Provider colors and [p] key cycling in dashboard - 39 tests passing, zero regressions	2026-04-15 03:44:43 -07:00
AgentSeal	51c56d0726	fix: include agent/subagent sessions, fix Codex cache hit and cost calculation - Remove agent-*.jsonl exclusion filter that was dropping ~46% of API calls - Scan subagents/ directories for subagent session files - Normalize Codex token semantics: OpenAI includes cached tokens inside input_tokens, subtract them to match Anthropic's separate reporting - Fixes cost double-counting and 100% cache hit display for Codex users	2026-04-14 10:18:14 -07:00
AgentSeal	9ab7f37f6f	Fix CSV formula injection in exports	2026-04-14 11:04:10 -04:00
AgentSeal	391a235d1d	feat: multi-provider support (Codex + provider plugin system) Add Codex (OpenAI) as a second provider alongside Claude Code. Provider plugin architecture makes adding future providers (Pi, OpenCode, Amp) a single-file addition. - Provider interface: types, session discovery, stateful JSONL parsing - Codex parser: token_count dedup, tool normalization, model resolution - TUI: press p to cycle All/Claude/Codex with 1-min cache for instant switching - CLI: --provider flag on report, today, month, status, export commands - Pricing: Codex model fallbacks, fixed fuzzy matching for gpt-5.4-mini - Menubar: per-provider cost breakdown when multiple providers detected - 27 tests (10 new: Codex parser, provider registry, tool/model mapping)	2026-04-14 04:32:09 -07:00
Rafael Calleja	a5696362f2	refactor: share BASH_TOOLS from classifier, remove comments - Export BASH_TOOLS from classifier.ts instead of duplicating in bash-utils.ts - Remove isBashTool helper (use BASH_TOOLS.has() directly) - Strip unnecessary comments per codebase conventions	2026-04-14 10:24:38 +02:00
Rafael Calleja	45ce697eea	fix: correct quote-handling alignment in extractBashCommands Replace quoted strings with same-length spaces in stripQuotedStrings so separator indices in the stripped string map correctly to the original. Add test coverage for quoted separators and isBashTool.	2026-04-14 10:24:24 +02:00
Rafael Calleja	b75c2663b4	feat: add extractBashCommands with TDD tests Implements bash command parsing utility that splits on &&, ;, and | separators while respecting quoted strings. Includes isBashTool helper. All 12 vitest tests pass.	2026-04-14 10:24:24 +02:00
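
The two oldest commits above (b75c2663b4 and 45ce697eea) describe splitting a shell line on `&&`, `;`, and `|` while ignoring separators inside quotes, by blanking quoted strings with same-length spaces so that indices in the stripped string still map onto the original. A minimal sketch of that idea follows. The function names are borrowed from the commit messages, but the bodies, the regular expressions, and the choice to return only the first word of each segment are assumptions, not the repository's actual code.

```typescript
// Hypothetical sketch: names come from the commit messages, bodies are assumed.

/** Blank out every quoted run with the same number of spaces so indices still line up. */
function stripQuotedStrings(input: string): string {
  const quoted = /"(?:[^"\\]|\\.)*"|'[^']*'/g;
  // Replace each character of the match with one space (length is preserved).
  return input.replace(quoted, (m) => m.replace(/[\s\S]/g, " "));
}

/** Split on &&, ; and | found outside quotes; return the first word of each segment. */
function extractBashCommands(line: string): string[] {
  const stripped = stripQuotedStrings(line);
  const separator = /&&|;|\|/g;
  const commands: string[] = [];
  let start = 0;

  // Slice the ORIGINAL line using indices found in the STRIPPED line.
  const cut = (end: number): void => {
    const first = line.slice(start, end).trim().split(/\s+/)[0];
    if (first) commands.push(first);
  };

  let match: RegExpExecArray | null;
  while ((match = separator.exec(stripped)) !== null) {
    cut(match.index);
    start = match.index + match[0].length;
  }
  cut(line.length);
  return commands;
}

// The separators inside the quoted commit message are ignored:
console.log(extractBashCommands('git commit -m "a && b; c" && npm test | tee log'));
// ["git", "npm", "tee"]
```

Because the stripped string has exactly the same length as the input, no offset bookkeeping is needed when slicing the original line.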

36 commits
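
Commit e890d9bfc3 mentions closing a prototype-pollution sink with `Object.create(null)` and asserting that `Object.prototype.calls` stays undefined after parsing. The sketch below illustrates why a null-prototype map helps when the keys are untrusted names such as tool or model names. `CallStats` and `countCalls` are hypothetical names chosen for illustration; the real parser.ts is not shown in this log and may be structured differently.

```typescript
// Hypothetical sketch: CallStats and countCalls are illustrative, not taken from parser.ts.
interface CallStats {
  calls: number;
}

/** Count occurrences of untrusted names in a map that has no prototype chain. */
function countCalls(names: string[]): Record<string, CallStats> {
  // A null-prototype object inherits nothing, so a key like "__proto__" is
  // stored as ordinary own data instead of resolving to Object.prototype.
  const stats: Record<string, CallStats> = Object.create(null);
  for (const name of names) {
    if (stats[name] === undefined) stats[name] = { calls: 0 };
    stats[name].calls += 1;
  }
  return stats;
}

const stats = countCalls(["Read", "__proto__", "Read"]);
console.log(stats["Read"].calls); // 2
console.log((Object.prototype as any).calls); // undefined
```

With a plain `{}` map, the lookup for `"__proto__"` would return `Object.prototype` itself, and the increment would write a `calls` property onto every object in the process.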