Adds desktop/ with a native tray app that mirrors the macOS popover via
a shared tokens.json and the same codeburn status --format menubar-json
data source. Same security posture as the Swift app: argv-validated CLI
spawn, O_NOFOLLOW cache writes, flock on config.json, FX rate clamping
to [0.0001, 1_000_000].
Stack:
- Tauri 2.x (Rust) for tray + window lifecycle, shells out to the CLI
- React + TypeScript + Vite for the popover UI
- libayatana-appindicator on Linux, system tray on Windows
- Produces .deb / .rpm / .AppImage on ubuntu-latest, .msi on
windows-latest. Both workflows run on free GitHub Actions minutes.
Rust modules (src-tauri/src/):
- lib.rs: tray icon, menu events, popover toggle, state wiring
- cli.rs: CodeburnCli with argv allowlist and bounded pipe drain
(20 MB stdout / 256 KB stderr / 60 s wall time)
- config.rs: flock-guarded read-modify-write of ~/.config/codeburn/config.json
- fx.rs: Frankfurter fetch with 24 h disk cache, bounds check
Frontend:
- App.tsx with agent tabs, period switcher, hero cost, activity rows,
optimize findings CTA, footer (currency picker / refresh / Open
Full Report). Listens for `codeburn://refresh` tray events.
- lib/payload.ts mirrors the CLI's MenubarPayload shape
- lib/currency.ts mirrors the Swift Double.asCurrency helpers
- styles.css with design tokens as CSS custom properties
CLI:
- `codeburn menubar` now platform-dispatches: macOS (.app zip),
Linux (.AppImage into ~/.local/bin), Windows (.msi via msiexec).
macOS behaviour preserved exactly.
Release workflow:
- .github/workflows/release-desktop-linux.yml triggers on `linux-v*`
tags, builds all three Linux formats, uploads to GitHub Releases.
Scaffold verified:
- cargo check -> clean
- tsc --noEmit -> clean
- npm run build (CLI) -> 205 KB
- Existing test suite: 230 / 230 still pass
renderStatusBar computed `today` via `new Date().toISOString().slice(0,10)`,
which is the UTC date. Session timestamps are also UTC ISO strings, but
the user's expectation of "today" is their wall-clock day. During the
window between local midnight and UTC midnight (e.g. 17:00 PDT on
2026-04-17, which is already 00:00 UTC on 2026-04-18), every session
bucketed under local April 17 missed the UTC-April-18 filter and the
status bar read `Today $0.00 0 calls` even while `--format json`
and the menubar app correctly showed the spend.
Both sides of the comparison now use the local date of each session
timestamp, so the terminal status and the JSON / menubar paths agree.
Verified at UTC midnight (the regression moment that surfaced the bug):
Before: Today $0.0000 0 calls
After: Today $339.87 1839 calls
Caught during the fresh-clone review of the menubar PR.
Sets CODEBURN_VERBOSE=1 via commander preAction, which the fs-utils
helpers check before emitting stderr lines on skipped or failed reads.
Closes LOW-1 from the 2026-04-16 audit.
Replaces any character outside [A-Za-z0-9 ._/-] with ? in model and
category labels and truncates to 14 chars before padEnd. Closes the
MEDIUM-2 finding from the 2026-04-16 audit: an attacker-controlled
JSONL with a crafted model name no longer injects SwiftBar directives
or ANSI escapes.
All four read paths in the optimizer (async session scan + three sync
config/import/profile scans) now pass through the 128 MB-capped
helpers. JSON.parse in readJsonFile stays wrapped in try/catch.
MEDIUM-1 coverage for the optimize module.
Config JSON, CLAUDE.md scans, and session-discovery reads now pass
through the 128 MB-capped helper. JSON.parse remains wrapped in
try/catch to preserve the previous 'null on malformed JSON' contract.
MEDIUM-1 coverage for the context-budget module.
Events JSONL and workspace.yaml reads now pass through the 128 MB-capped
helper. The workspace.yaml path stays non-fatal: a null read skips cwd
derivation but still pushes the session with sessionId as the fallback
project label. MEDIUM-1 coverage for the Copilot provider.
Both Codex session read paths (first-line meta and full-session parse)
now pass through the 128 MB-capped helper. MEDIUM-1 coverage for the
Codex provider.
Replaces the unbounded readFile in parseSessionFile with the 128 MB-capped
helper from src/fs-utils. Addresses MEDIUM-1 for the Claude provider
hot path.
Verbose-mode stderr output replaces the previous silent catch,
closing LOW-1 as a side effect.
Adds readSessionFile / readSessionFileSync / readSessionLines with a
128 MB hard cap and 8 MB streaming threshold. Verbose mode
(CODEBURN_VERBOSE=1) logs skipped and failed reads to stderr.
Prepares the MEDIUM-1 migration of all provider read paths.
Initialize the four breakdown maps (model, tool, mcp, bash) with null
prototype so attacker-controlled keys named __proto__ create own
properties on the map instead of mutating Object.prototype.
Closes the HIGH-1 finding from the 2026-04-16 external security audit.
Claude Code does not document or implement a .claudeignore feature.
The junk-reads detector's fix is now a CLAUDE.md instruction asking
Claude to avoid generated/dependency directories. The separate
detectMissingClaudeignore finding and its tests are removed; checking
for the presence of a non-existent file has no signal.
Closes#61.
SwiftBar/xbar launch plugins with a minimal environment. If an older
system Node (e.g. v18.x from Homebrew) appears earlier on PATH than the
NVM-managed Node used to install codeburn, the plugin silently fails
because string-width uses the /v regex flag (requires Node >= 20.11).
Resolve the node binary directory at install time via process.execPath
and prepend it to PATH in the generated plugin script. Also explicitly
set HOME, which SwiftBar does not always inherit.
Fixes#63
Rename slice(0, N) and length - N literals used in waste-finding
preview strings to GHOST_NAMES_PREVIEW, GHOST_CLEANUP_COMMANDS_LIMIT,
TOP_ITEMS_PREVIEW, MISSING_IGNORE_PATHS_PREVIEW, JUNK_DIRS_IGNORE_PREVIEW.
Drop the punctuation double-hyphen from the config-health header.
The menubar status output computes per-provider today costs by iterating
all providers after the main period blocks. That loop bypassed the
project filter, so --project/--exclude affected the main totals but not
the provider breakdown shown below them.
The TopSessions row layout summed to the full panel width without
leaving room for the Box border (round) + paddingX, so Ink truncated
the last 4 characters -- landing exactly on the calls column and
producing rows like "$182.58 ..." with no calls value.
Introduce PANEL_CHROME = 4 and subtract it from the name column so the
row fits inside the panel's inner area.
Adds two new repeatable flags to all commands (report, today, month, status, export):
- --project <name>: include only projects matching name (substring, case-insensitive)
- --exclude <name>: exclude projects matching name (substring, case-insensitive)
Both flags can be specified multiple times to match multiple projects.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- resolve dashboard.tsx conflicts: keep optimize view + context budget column alongside main's all-time period and TopSessions panel
- ProjectBreakdown: add avg/s column from main plus overhead column from optimize, widths 30/40
- StatusBar: 1-5 periods including all-time, plus o-optimize when findings exist
- DashboardContent: all-time period handling and TopSessions panel preserved
Copilot provider and its 253 tests from main merged cleanly as additions.
Outputs full dashboard data as structured JSON to stdout, including:
overview, daily breakdown, projects, models with token counts,
activities with one-shot rates, core tools, MCP servers, and
shell commands.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ContextBudgetPanel in dashboard.tsx was defined but never rendered
after the per-project overhead column replaced it in Phase 1.
- detectLowReadEditRatio accepted a dateRange parameter that was
never used inside the function (trend is computed via recent-call
flags on ToolCall).
Verified with tsc --noUnusedLocals --noUnusedParameters: 0 errors.
160 tests pass.
- init currentModel to '' and skip assistant messages before first
session.model_change to avoid silent misattribution
- add comment documenting why inputTokens is always 0
- fix delete_file tool mapping ('Edit' -> 'Delete')
- add schema doc comment to ToolRequest optional fields
- remove catch-all from CopilotEvent union for proper TS narrowing
- add tests: pre-model-change skip, workspace.yaml quote/comment strip,
longest-prefix model display name match
- ScanData.versions and ScanFileResult.versions were collected but never
read in scanAndDetect. Per-call version lives on ApiCallMeta.version
which is what detectCacheBloat actually uses. Dropped the unused
aggregation path end-to-end.
- Extract DEFAULT_CACHE_BASELINE_TOKENS, CACHE_BASELINE_QUANTILE,
CACHE_VERSION_MIN_SAMPLES, CACHE_VERSION_DIFF_THRESHOLD as named
module-scope constants. Honors the "no magic numbers" project rule
across the full optimize engine.
- Rename "ctx" column to "overhead" with wider padding for legibility
in the By Project panel.
- Name magic numbers for the project column widths.
- Empty-state now includes a one-line description of what optimize
does, so first-time users with no findings still understand the
feature.
Solves the problem where users who fixed an issue continued to see
the finding for the remainder of the period. Findings now show
visible progress or disappear entirely.
Mechanism (no state file, no new I/O):
- ToolCall and ApiCallMeta gain a `recent` boolean, set when the
entry's timestamp falls inside a rolling 48-hour window.
- Each session-based detector counts recent vs total occurrences.
- computeTrend classifies each finding:
active -- recent rate matches baseline
improving -- recent rate under half of baseline (green arrow)
resolved -- zero recent waste AND confirmed recent activity
- Resolved findings are suppressed. Improving findings render with
a green "improving down-arrow" badge next to the impact label.
- When no recent activity exists, findings default to active so a
user who simply paused is not told everything is fixed.
Applies to the four session-based detectors: junk reads, duplicate
reads, low read:edit ratio, cache bloat. The filesystem detectors
(missing .claudeignore, bloated CLAUDE.md, unused MCP, ghost agents
/ skills / commands, bash limit) already self-heal on next run.
5 new tests cover computeTrend edge cases. 126 tests pass.
- TopSessions: show '----------' placeholder when session.firstTimestamp
is empty (Copilot provider yields '' when timestamp is missing)
- DashboardContent: add comment explaining undefined days tri-state
- tests/dashboard.test.ts: cover top-5 selection (fewer than 5 sessions,
tied costs, descending sort) and avg/s returning '-' for zero sessions
Authored-by: AgentSeal <hello@agentseal.org>
- runWithConcurrency helper runs file reads with configurable parallelism
(default 16) instead of sequential await.
- isFileStaleForRange skips files whose mtime is older than the date range
start, avoiding unnecessary reads for narrow periods.
- Result-level cache keyed on (dateRange, project fingerprint) with
60s TTL. Warm dashboard 'o' press now hits cache instead of rescanning.
- Early-return when projects array is empty so empty-state path does not
trigger filesystem walk.
Measured CLI cold scan on 12K files, 1833 sessions week:
before: 12-17s
after: 6-7s
Dashboard warm cache hit: <50ms.
Phase 1 hardening pass.
Bug fixes:
- Move cwd/version collection inside date-range filter. 7d and 30d
now produce different findings for filesystem detectors.
- detectGhostSkills threshold aligned with peer detectors.
- detectUnusedMcp gets 24-hour grace period via config file mtime so
newly added servers are not flagged as unused.
- detectCacheBloat replaces hardcoded 50K baseline with user-derived
p25 of cache writes. Flags only when median exceeds 1.4x baseline.
- detectBashBloat scans user shell profiles instead of the auditor's
process.env.
- @-import pattern requires ./ ../ or / to avoid matching email
addresses or npm scopes.
- Command usage pattern requires leading whitespace/start-of-line
before /cmd so path references like /tmp are not counted as usage.
- AVG_TOKENS_PER_READ lowered from 1500 to 600 and
CLAUDEMD_TOKENS_PER_LINE lowered from 25 to 13 for realistic
prose/config sizing.
Code quality:
- Every magic number extracted to named module-scope constants.
- Dead code removed (IMPACT_ORDER, unused stat import).
- Shared loadMcpConfigs helper deduplicates config walking.
- Shared shortHomePath, isReadTool, inRange helpers.
- All detectors and computeHealth exported for real tests.
- Ghost detectors run in parallel via Promise.all.
- Cost rate defaults to 0 when unknown so UI can suppress instead of
showing fabricated numbers.
Tests:
- Replaced 17 fake tests that re-implemented detector logic with
26 tests importing and exercising the real exports.
- Cover threshold boundaries, impact scaling, edge cases.
- 121 tests pass.
UX header: "Setup" renamed to "Health", issue count shown inline.
CLAUDE.md: adds rule against "steal/copy/rip-off" wording in
public-facing text.
Expanded the optimize engine with new detectors and scoring:
1. Health score + letter grade (A-F) in optimize header. Weighted
per-impact with caps. Gives users an instant "is my setup healthy"
read that doubles as a shareable number.
2. Urgency score replaces impact-enum sort. Weighted 0.7 * impact
+ 0.3 * normalized tokens. Produces better-ranked findings.
3. Three new ghost detectors:
- Ghost agents: files in ~/.claude/agents/ never invoked via
Agent/Task tool
- Ghost skills: SKILL.md directories never triggered
- Ghost slash-commands: ~/.claude/commands/ files never referenced
in user messages
4. @-import chain expansion for CLAUDE.md. Recursively follows
@path/to/file imports (max depth 5) so bloat detection counts
transitive load, not just the base file. Fixes undercounting for
users with modular CLAUDE.md setups.
9 new tests covering health scoring and import expansion.
- Add 'all' period (key 5) to dashboard showing data from all time
- Daily Activity panel shows all available days when All Time is active
- Add avg cost per session column to Project breakdown
- Add Top Sessions panel highlighting the 5 most expensive sessions
- Add Pi entry to PROVIDER_COLORS (pink) and PROVIDER_DISPLAY_NAMES.
- Update README with Pi in supported providers, requirements,
command examples, and the data-format section.
dispatch_agent is Pi's delegation tool. Mapping it to Agent makes
the classifier route those turns into the Delegation category
instead of Conversation, matching the existing semantics for the
task tool.
- Move modelDisplayEntries sort to module scope so it runs once
instead of on every modelDisplayName call.
- Add bash-utils tests for the env var prefix and true/false
skip behaviors that came in with the Pi commit.
Verified Pi token semantics against real session data:
input + cacheRead + cacheWrite + output equals totalTokens
exactly, confirming Anthropic-style accounting (cached tokens
disjoint from input). No double-counting.