Commit graph

239 commits

Author SHA1 Message Date
iamtoruk
38d21643bd Merge origin/main into feat/session-outlier-detection 2026-05-04 20:21:26 -07:00
iamtoruk
5120ec696c Merge origin/main into feat/mcp-tool-coverage 2026-05-04 20:13:15 -07:00
iamtoruk
735f41bc6c Fix cache-write pricing and shell-quote server names in fix commands
- Use 1.25x multiplier for cache-write tokens to match Anthropic's
  actual pricing (was incorrectly using 1x)
- Shell-quote server names in `claude mcp remove` fix text to prevent
  issues with unusual server names
2026-05-04 20:11:50 -07:00
iamtoruk
bfa5fe7fa0 fix(labels): update remaining 'all' period labels to '6 Months'
PR #221 unified the period logic but missed the TUI hotkey bar,
GNOME indicator popup, and macOS menubar app. All surfaces now
consistently show '6 Months' instead of 'All' or 'all time'.
2026-05-04 19:46:20 -07:00
ozymandiashh
d18ba3d2fe feat(optimize): detect session cost outliers 2026-05-05 05:25:49 +03:00
ozymandiashh
e46b20b927 fix(optimize): reuse mcp coverage and type schema estimator 2026-05-05 05:11:00 +03:00
ozymandiashh
9a258a8a99 fix(date-range): avoid all-period month overflow 2026-05-05 05:05:13 +03:00
ozymandiashh
1a080a006f feat(optimize): MCP tool coverage detector with cache-aware costing
Adds a per-tool optimizer finding for MCP servers whose schema is loaded
on every turn but rarely invoked. Builds on the existing server-level
`detectUnusedMcp` (zero invocations) by reporting partial-use cases:
"loaded 54 tools, called 0" or "loaded 26 tools, called 2 (8% coverage)".

Inventory comes from Claude Code's JSONL `attachment.deferred_tools_delta`
entries: `addedNames` lists the exact tools available at that turn,
including every fully-qualified `mcp__<server>__<tool>` name. We union
across all delta entries in a session (not just the first) because tool
availability can change mid-session when the user reloads MCP config or
a subagent inherits a different tool set. Names that don't match the
`mcp__<server>__<tool>` shape with both segments non-empty are rejected
at extraction so downstream `split('__')` consumers can't be poisoned.

Token-savings estimates are cache-aware. MCP tool schemas live in the
cached prefix of the system prompt: a session pays the full input price
on each cache-creation turn (rebuilds happen every ~5 minutes of
inactivity) and the cache-read discount on subsequent turns. Each call's
contribution is capped at its observed `cacheCreationInputTokens` /
`cacheReadInputTokens` so we never claim more MCP overhead than the
call's own cache buckets could contain.

When multiple servers are flagged, costing happens in a single combined
pass: the per-call cap applies to the total unused-schema budget across
all flagged servers, not per server. Two flagged servers cannot both
independently claim the same call's cache bucket, which would otherwise
overstate `tokensSaved` and misclassify findings as high impact.

A session counts toward `loadedSessions` (and toward the cost estimate)
only if its observed inventory included the server. Pure invocation-only
sessions, where the server appears in `mcpBreakdown` or `call.mcpTools`
without any matching `deferred_tools_delta`, do not satisfy the
`>= 2 sessions` threshold on their own. The same invariant applies in
`estimateMcpSchemaCost` so the two passes agree.

Coverage is computed against the inventory only: invocations of names
not present in any observed inventory (older config, hallucinated tool,
typo) do not inflate `toolsInvoked` and cannot drive `unusedCount`
negative. `toolsInvoked` is derived as `inventory.size - unusedTools.length`
to keep both numbers consistent.

`detectUnusedMcp` and the new detector are explicitly disjoint:
`detectUnusedMcp` skips servers that the coverage detector will report,
not every server that happens to be in any inventory, so a small
inventoried-but-uninvoked server below the coverage thresholds still
gets flagged as "configured but never called."

Thresholds for the coverage finding:
- > 10 tools available (small servers are noise)
- < 20% coverage
- >= 2 sessions with observed inventory
- High impact when total effective tokens >= 200_000 or >= 3 servers flagged

Smoke-tested on a real account: 7 servers flagged across 93 sessions
(`office-word-mcp` 0/54, `notebooklm-mcp` 0/38, `office-ppt-mcp` 0/37,
`excel-mcp-server` 0/25, `github-mcp-server` 2/26, `peekaboo` 3/22, plus
`claude_ai_Asana`). Combined-cap costing keeps `tokensSaved` honest.

Changes:
- src/types.ts: optional `mcpInventory: string[]` on `SessionSummary`.
  Provider-agnostic field; currently populated only by the Claude parser.
- src/parser.ts: `extractMcpInventory` walks all entries, validates
  fully-qualified names, returns sorted unique list. `buildSessionSummary`
  passes it through; field is omitted when empty so JSON exports stay
  clean.
- src/optimize.ts: `aggregateMcpCoverage`, `estimateMcpSchemaCost`
  (single- and multi-server signatures), `detectMcpToolCoverage`. Wired
  into `scanAndDetect`. `detectUnusedMcp` updated to disjoint with the
  new detector.
- tests/mcp-coverage.test.ts: 23 cases covering aggregation, costing,
  combined-cap behaviour, threshold gates, invocation-only-session
  filtering, foreign-tool invocations, cache rebuild events, write+read
  on the same call, multi-server pluralisation.
- tests/parser-mcp-inventory.test.ts: 12 cases for the JSONL extractor
  including malformed name rejection and tolerant attachment parsing.
- CHANGELOG.md: entry under Unreleased / Added (CLI).

Closes #2
2026-05-05 04:13:04 +03:00
ozymandiashh
3dc3e32715 fix(date-range): unify 'all' period semantics between CLI and dashboard
`getDateRange` was duplicated across `src/cli.ts` and `src/dashboard.tsx`
with conflicting semantics for `'all'`. The CLI intentionally bounded
`'all'` to the last 6 months (justified inline: keeps Codex/Cursor parses
responsive on sparse multi-year history). The dashboard returned
`new Date(0)` instead, so the same `--period all` flag silently meant
two different windows depending on which entry point you hit.

`Period`, `PERIODS`, `PERIOD_LABELS`, and `toPeriod` were duplicated as
well, and `cli-date.ts` already existed for date helpers
(`parseDateRangeFlags`) so the consolidation lives there.

Both call sites now go through a single `getDateRange(period: string)`
in `cli-date.ts` that returns `{ range, label }`. The dashboard wraps it
as `getPeriodRange(period: Period)` to keep the strict `Period` type at
the React boundary while letting the CLI continue to accept extras like
`'yesterday'`.

`PERIOD_LABELS.all` becomes `'6 Months'` (short, for the dashboard tab
strip; the previous `'All Time'` was misleading and the long-form
`'Last 6 months'` from `getDateRange().label` already drives CLI output).

Changes:
- src/cli-date.ts: add `Period`, `PERIODS`, `PERIOD_LABELS`, `toPeriod`,
  `getDateRange`. Pull the existing 6-month rationale into a named
  `ALL_TIME_MONTHS` constant.
- src/cli.ts: drop the local copies and import from cli-date.
- src/dashboard.tsx: drop the local copies, route through
  `getPeriodRange`, alias the shared `getDateRange` import to
  `getDateRangeShared` to avoid shadowing the wrapper.
- tests/cli-date.test.ts: 13 cases covering `'all'` regression guard
  (must never silently fall back to `Date(0)`), CLI/dashboard agreement,
  end-of-month clamping tolerance, `'yesterday'` support, and
  unknown-input fallback.
- README.md, CHANGELOG.md: surface the bound and point heavy users at
  `--from`/`--to` for unbounded windows.

The CLI flag `--period all` continues to be accepted; only the dashboard
window changes to match what the CLI was already doing. No public API
or schema change.

Refs #93
2026-05-05 03:53:46 +03:00
iamtoruk
6fb05c6813 Fix command injection in yield via execFileSync
Replace execSync with execFileSync and argument arrays so shell
metacharacters in git branch names cannot be interpreted as commands.
Add SAFE_REF_PATTERN validation as defense in depth for branch names
from git symbolic-ref.

Addresses #214.
2026-05-04 10:13:40 -07:00
iamtoruk
15334fac67 Add SHA-256 checksum verification to menubar installer
The installer now downloads and verifies a .sha256 companion file
before extracting and launching the menubar app. Build script and
CI workflow generate the checksum alongside the zip. Adds SECURITY.md
with reporting instructions.

Addresses #215.
2026-05-04 10:08:58 -07:00
Resham Joshi
cf8c2aa493
Merge pull request #206 from voidborne-d/fix/skill-subcategory-203
Some checks are pending
CI / semgrep (push) Waiting to run
fix(classifier): surface skill name as subCategory for general turns (#203)
2026-05-03 16:42:47 -07:00
ozymandiashh
61be92a834 Stream-parse Codex session files to handle 250+ MB rollouts
Heavy Codex users hit MAX_SESSION_FILE_BYTES (128 MB) on long-running
sessions. The file is read in full via readSessionFile and then split on
'\n', so even bumping the cap eventually runs into V8's 512 MB string
limit (split doubles the high-water mark).

readSessionLines is a streaming generator that already exists in
fs-utils for exactly this case but only readFirstLine was using it.
Switch the Codex provider to consume it and let the cap apply only when
streaming would still be unreasonable.

Changes:
- src/fs-utils.ts: introduce MAX_STREAM_SESSION_FILE_BYTES (2 GB) and
  apply it in readSessionLines instead of the full-read cap. Keep
  MAX_SESSION_FILE_BYTES for readSessionFile / readSessionFileSync
  consumers that materialize the whole file.
- src/providers/codex.ts: replace `readSessionFile -> split('\n')` with
  `for await (... of readSessionLines)`. Add sawAnyLine guard so a
  failed/empty stream skips cache write, preserving the previous
  early-return behavior.

Empirical impact on a real account with one 247 MB rollout: 7-day totals
went from 4,536 calls / €358.69 / 20.1M input tokens to 6,111 calls /
€550.67 / 37.3M input tokens. The previously-skipped session is now
included; no other behavior changes.

Refs #204
2026-05-04 02:15:04 +03:00
voidborne-d
c16b21ec50 fix(classifier): surface skill name as subCategory for general turns (#203)
Turns whose only assistant tool is `Skill` collapse to category `general`
because `classifyByToolPattern` returns `'general'` and `refineByKeywords`
only operates on `coding`/`exploration`. In environments that lean on Claude
Code skills, the per-activity dashboard column flattens — every `/init`,
`/review`, `/security-review`, `/claude-api`, plus user-defined skills, all
land in `general` with no signal about which workflow ran.

Implements Option A from the issue:

- `ParsedApiCall.skills: string[]` populated in the Anthropic-path parser
  via a new `extractSkillNames` helper that reads `input.skill || input.name`
  from each `Skill` ToolUseBlock (mirrors `detectGhostSkills` extraction at
  optimize.ts:765 so the two stay in sync).
- `ClassifiedTurn.subCategory?: string` set to the first skill name when the
  resolved category is `general` AND any skill identifier was extracted.
  Top-level category stays `general` — existing aggregations, exports, and
  category-keyed code paths unchanged.
- `SessionSummary.skillBreakdown: Record<string, {turns,costUSD,editTurns,
  oneShotTurns}>` populated in the same per-turn loop that builds
  `categoryBreakdown`. Provider sessions (Codex/Cursor/etc.) keep `skills:
  []` — they don't expose the Skill tool surface today.
- Dashboard `ActivityBreakdown` renders top-N skill sub-rows beneath the
  `general` row when present (indented `/skill-name`, dimmed). Other
  categories render exactly as before; if no skills were invoked, the panel
  is byte-identical to current output.

Existing 419 tests still pass. New `tests/classifier.test.ts` adds 8 cases:
single skill via `input.skill`, single via `input.name`, first-wins for
multi-skill turns, aggregation across multiple assistant calls in one turn,
no-name fallback (`subCategory` stays undefined), `Skill+Edit` promoting to
`coding` and dropping subCategory, non-Skill general turns, and a legacy
ParsedApiCall shape with `skills` field absent (forward-compat). Pre-fix
verification by stashing the source change reproduces 4/8 failures with the
exact "expected 'init', received undefined" diff; restoring → 8/8 pass.

Closes #203.

🤖 AI assistance disclosure: assistant-scaffolded by Claude (Opus 4.7);
author of record reviewed every line, ran the full vitest suite locally
(`npm test` → 32 files / 427 tests pass), `npx tsc --noEmit` clean, and
`npm run build` produces a clean ESM bundle.
2026-05-04 06:26:45 +08:00
iamtoruk
988398abfc Fix $0.0000 display for near-zero costs
Closes #205
2026-05-03 12:11:10 -07:00
AgentSeal
8cf68e7a16 Fix Antigravity dedup collision and add Codex ChatGPT Plus token estimation (#204)
- Antigravity: use loop index as fallback when responseId is empty to prevent
  all entries in a cascade sharing the same dedup key; bump CACHE_VERSION to
  force re-parse of stale cached data
- Codex: estimate tokens from message text when info is null (ChatGPT Plus/Pro
  subscription sessions), feeding through calculateCost so subscription users
  see API-equivalent spend; add costIsEstimated flag to ParsedProviderCall
- Update LiteLLM pricing snapshot
2026-05-03 19:51:13 +02:00
iamtoruk
800c106250 Fix streaming dedup: keep last occurrence of each message.id within session files
Claude Code writes the same message.id multiple times during streaming.
The first write has partial tokens (often 1) and no tool_use blocks.
The last write has authoritative token counts and all tool_use/MCP blocks.

Old behavior kept the first occurrence (keep-first), silently dropping
real output tokens (+6.3% undercount) and all MCP tool calls.

New behavior keeps the last occurrence's content but preserves the first
occurrence's timestamp for correct date bucketing.

Validated against 21,390 real session files: 40.5% had duplicate IDs,
output tokens were understated by up to 78% per session.
2026-05-02 22:30:17 -07:00
iamtoruk
341aa46f78 Strip ANSI escapes from bash commands across all providers
Use strip-ansi (already in dep tree via Ink) in extractBashCommands
to prevent terminal escape codes from leaking into dashboard bash
breakdown keys. Route goose, gemini, qwen, and openclaw through
extractBashCommands instead of inline split, which also gives them
multi-command extraction (matching claude/codex/droid behavior).
2026-05-02 20:59:24 -07:00
iamtoruk
fe1007cc63 Add missing Antigravity model aliases for gemini-3-pro, flash-image, flash-lite 2026-05-02 20:26:45 -07:00
Nihal Jain
73cf90cb2c Add Antigravity Gemini model IDs with preview pricing aliases 2026-05-03 05:46:33 +05:30
iamtoruk
c511627e87 Fix dashboard hang and ExperimentalWarning on Windows
Strip Ink v7 DEC mode 2026 synchronized output markers (BSU/ESU) on
Windows. ConPTY does not implement this protocol and buffers
indefinitely, causing the dashboard to hang with no output. The patch
intercepts standalone BSU/ESU writes on stdout while preserving full
interactivity (keyboard, live refresh, cursor management).

Fix ExperimentalWarning timing: the process.emit patch was restored
synchronously in the finally block, but Node defers the warning via
process.nextTick. Delay restore by one tick so the patch is still
active when the warning fires.

Closes #195
2026-05-02 11:54:58 -07:00
Resham Joshi
3ea4a62408
Add Goose provider, fix Codex fork dedup
Some checks are pending
CI / semgrep (push) Waiting to run
Goose: read token usage from ~/.local/share/goose/sessions/sessions.db.
Lazy-loaded, zero overhead for non-Goose users.

Codex: use sessionId instead of file path in dedup key so forked
sessions sharing the same session_id don't double-count tokens.
2026-05-02 09:58:52 -07:00
Resham Joshi
8c845253c2
Add Antigravity IDE provider
Fetch token usage from Antigravity's local language server via RPC.
Falls back to cached results when the IDE is closed.
2026-05-02 08:58:23 -07:00
Nihal Jain
791f2b077d
Add gpt-5.5 model display name for Codex 2026-05-02 08:57:44 -07:00
ozymandiashh
ffe14fecb7 review: silence stream errors and tighten comment wording
- Attach a no-op `stream.on('error', () => {})` so a late read-ahead
  error that races with a successful first-line yield can never escape
  as an unhandled 'error' event. Defense in depth: empirically the
  destroy() in finally already swallows it on Node 18+, but the listener
  removes any version-dependent surprise.
- Tighten the comment to say "up to FIRST_LINE_READ_CAP" instead of
  "regardless of length"; the cap is real and worth being precise about.
2026-05-02 02:38:36 +03:00
ozymandiashh
ff8b20a79e review: drop streamError flag, add multi-chunk and torn-write tests
- Stop tracking a separate streamError flag. createReadStream's default
  64 KiB highWaterMark means the stream may already be reading chunk 2
  when we break out of the loop after yielding the first line; if that
  later chunk errors, the flag could reject an otherwise-valid line.
  readline's async iterator already re-throws stream errors on Node 16+,
  which the existing catch handles.
- Test: 120 KB session_meta line forces multi-chunk line assembly.
- Test: truncated mid-write first line is rejected, not parsed as half
  an object.
2026-05-02 02:34:41 +03:00
ozymandiashh
98bbe5b678 review: cap first-line read size and add edge-case tests
- Cap createReadStream at 1 MiB so a malformed file with no newline
  cannot make readline buffer indefinitely (real session_meta lines
  are 22-27 KB).
- Capture stream errors explicitly; readline's async iterator does
  not always re-throw underlying stream errors per Node docs.
- Test: assert project is extracted from the >16 KB session_meta to
  prove the line was actually parsed, not just discovered.
- Test: session_meta line with no trailing newline is still accepted.
- Test: empty rollout file is silently skipped.
2026-05-02 02:30:17 +03:00
ozymandiashh
945da9f0ba fix(codex): read full first line for session validation
`readFirstLine` allocated a fixed 16 KB buffer, but Codex CLI 0.128+
embeds the entire base_instructions / system prompt in the
`session_meta` line, pushing it past 20 KB. When the buffer doesn't
catch a newline, `isValidCodexSession` rejects the session, so every
recent Codex session is silently excluded from totals.

Switch to a streaming readline read so the first line is captured
regardless of length, and add a regression test that creates a
40 KB session_meta payload.

Locally, this changes my 30-day Codex total from €267 (only ~half
of sessions parsed) to €878 (all sessions parsed).
2026-05-02 02:17:53 +03:00
Resham Joshi
ffc0e486d3
Merge pull request #188 from getagentseal/feat/menubar-hardening
Harden menubar: refresh loop, concurrency, data sync, edge cases
2026-05-01 08:03:03 -07:00
iamtoruk
39fc05595c Harden menubar: fix refresh loop, concurrency, data sync, and edge cases
- Fix refresh loop: proper while loop with 30s sleep and force:true
  instead of single-fire Task that never repeated
- Fix loading overlay: counter-based isLoading so concurrent fetches
  don't flicker the overlay on/off
- Fix rapid tab switching: cancel previous switchTask, check
  Task.isCancelled after CLI returns to discard stale results
- Fix tab strip vs hero desync: fetch provider-specific and all-provider
  data in parallel so costs arrive from same data snapshot
- Fix stale menubar icon after wake: forceRefresh now fetches today/all
  in parallel alongside the current selection
- Fix accent color: ThemeState is now @Observable so color changes
  propagate via observation, removing .id() view hierarchy teardown
- Fix currency flash: defer store.currency and symbol update until a
  rate is available so symbol and rate apply atomically
- Fix export: terminationHandler instead of waitUntilExit (no UI freeze),
  HHmmss in filename to prevent overwrite on double-export
- Fix CurrencyState: @MainActor isolation with proper Sendable
  conformance, nonisolated on pure static functions
- Fix streak count: iterate calendar days instead of sparse history
  entries so gaps are counted as streak-breakers
- Fix TrendBar identity: stable date-based id instead of UUID
- Add GPT-5.3 and DeepSeek model display names
2026-05-01 08:01:25 -07:00
Resham Joshi
bbe99fa298
Merge pull request #180 from josteinaj/locate-workspaceStorage-in-vscode-dev-container
Some checks are pending
CI / semgrep (push) Waiting to run
VSCode: correctly locate workspaceStorage when working inside a dev container
2026-04-30 18:41:20 -07:00
Resham Joshi
78ad04c77f
Merge pull request #186 from getagentseal/fix/menubar-timezone-184
Fix timezone handling for non-UTC users
2026-04-30 17:33:27 -07:00
iamtoruk
68c6f2c710 Fix timezone handling: menubar UTC bugs, --timezone flag, DST-safe dates
Three fixes for issue #184:

1. Menubar Swift code used UTC instead of local timezone in two places:
   computeHistoryStats hardcoded TimeZone("UTC") and
   effectiveTokensInLast7Days used ISO8601DateFormatter (UTC default).
   Both now use .current to match CLI-produced local date keys.

2. Add --timezone flag and CODEBURN_TZ env var to override the system
   timezone for all date grouping. Sets process.env.TZ before any Date
   operations so all existing local-timezone code works unchanged.

3. Replace MS_PER_DAY arithmetic with Date constructor day-of-month
   math for yesterday/backfill computations. Subtracting 86400000ms
   from midnight skips a day on DST spring-forward (23-hour day).

Fixes #184
2026-04-30 17:33:02 -07:00
iamtoruk
8ab9ea916b Add per-file result cache for Codex provider
Fixes #183. Users with large Codex session directories (45 GB, 10K+
files) experienced CPU pegging because every 30-second refresh re-parsed
all session files from scratch.

Three optimizations:

1. readFirstLine now reads 16 KB via fs.open() instead of loading the
   entire file through readSessionFile. Cuts discovery I/O from ~45 GB
   to ~160 MB for 10K files.

2. Per-file result cache (codex-results.json) with mtime+size
   fingerprinting. Parsed results are cached on first run; subsequent
   runs return cached data instantly for unchanged files.

3. Cache-accelerated discovery skips header validation for cached files,
   pulling the project name directly from the cache manifest.

Cache safety: fingerprint captured before read (no TOCTOU), atomic
write via temp+fsync+rename, 0o600 permissions, Object.hasOwn for
prototype pollution defense, eviction of deleted files on flush,
try/finally ensures flush even on parse errors.
2026-04-30 16:43:41 -07:00
Jostein Austvik Jacobsen
2de8909823
locate workspaceStorage in vscode dev container 2026-04-30 22:51:52 +02:00
David
116388427e fix(copilot): discover VS Code Insiders transcripts 2026-04-30 16:27:22 +02:00
AgentSeal
7b8f4df856 Fix review findings: error handling, constant dedup, migration persist
- Wrap hydrateCache() in try/catch so disk errors don't crash commands
  that previously never touched the cache
- Export MS_PER_DAY, BACKFILL_DAYS, toDateString from daily-cache.ts
  and remove duplicates from cli.ts
- Remove double hydrateCache() call in report JSON path
- Persist migrated cache to disk so old-version files aren't
  re-migrated on every run
- Export emptyCache() for use as fallback on hydration failure
2026-04-28 22:56:47 +02:00
AgentSeal
888f592bd2 Make daily cache durable: hydrate from all commands, migrate instead of nuke
- Extract ensureCacheHydrated() from menubar-json path into daily-cache.ts
- Call it from every command that parses sessions (report, status, today,
  month, export, optimize, compare, yield) so CLI-only users also persist
  historical data that survives source file deletion
- Replace strict version equality check with fill-defaults migration for
  cache versions 2-4, preserving history across schema changes
- Back up old cache to .bak before discarding on unmigrateable versions
- Fix Copilot auto bucket display names in menubar (Copilot (Anthropic),
  Copilot (OpenAI))
- Fix Roo Code / KiloCode provider key matching in menubar tab strip
2026-04-28 22:41:01 +02:00
Resham Joshi
fbb2c4e69c
Merge pull request #171 from ksp2000/feature/copilot-auto-model-buckets
refactor(copilot): use auto model buckets for transcript inference
2026-04-28 12:17:50 -07:00
Dunccan de Weerdt
26ebe75aa1 Add Droid CLI provider
Discovers and parses sessions from ~/.factory/sessions/, reading JSONL
message logs and companion settings.json files for token usage tracking.

- Discovers sessions by scanning per-cwd subdirectories
- Skips internal .factory housekeeping sessions
- Extracts tools, bash commands, and user messages from JSONL
- Distributes session-level cumulative token counts across calls
- Normalizes Droid model wrappers before existing pricing lookup
- Derives clean project names from cwd paths
- Adds menubar provider filtering for Droid
2026-04-28 20:16:45 +02:00
AgentSeal
d043795855 Add Qwen provider and replace hardcoded pricing with LiteLLM snapshot
- Add Qwen CLI provider (discovers sessions from ~/.qwen/projects/)
- Replace FALLBACK_PRICING (40 hand-maintained entries) with auto-generated
  LiteLLM snapshot (3595 models including Azure, OpenRouter pricing)
- Build script fetches and bundles LiteLLM data before tsup
- Provider-prefixed lookups (azure/, openrouter/) resolve to correct pricing
- Add display names for all GPT-5.x model variants
- Add Qwen to menubar provider filter and tab strip
2026-04-28 19:49:14 +02:00
Resham Joshi
ec2de6a642
Add OpenClaw, Roo Code, and KiloCode providers (#175)
- OpenClaw: JSONL parser with multi-path discovery, tool extraction
  (toolCall + tool_use block types), model tracking via model_change
  and custom model-snapshot events
- Roo Code + KiloCode: shared Cline-family parser extracts model from
  <model> tags in api_conversation_history.json, strips provider
  prefixes from model names
- Add cline-auto and openclaw-auto aliases and display names
- Add menubar provider filters and tab colors for all three
- Show cached data instantly instead of blocking on CLI refresh
2026-04-28 09:24:14 -07:00
saipraneeth.konda
74c1c4b4c1 refactor(copilot): use auto model buckets for transcript inference 2026-04-28 19:32:03 +05:30
AgentSeal
ce78ac52c1 Add builtin aliases for Cursor model names
Some checks are pending
CI / semgrep (push) Waiting to run
Cursor emits dot-version tier-last names like claude-4.6-sonnet
that don't match our pricing keys. Map them so individual models
show costs instead of $0.
2026-04-28 15:09:54 +02:00
AgentSeal
64259c929c Fix Gemini provider for JSONL format (CLI 0.39+)
Gemini CLI 0.39 switched from single JSON to JSONL with one object
per line and $set metadata lines. Parser now handles both formats.
Also updated --provider help text to list all providers.
2026-04-28 15:03:39 +02:00
Resham Joshi
383df90c93
Fix Cursor provider dropping data older than 35 days (#169)
Increase lookback from 35 to 180 days so longer period views
(30-day, month, all) show accurate totals.

Fixes #159, fixes #163
2026-04-27 20:02:06 -07:00
Resham Joshi
6d15ea43a5
Add Gemini CLI provider for session tracking (#168)
Parse ~/.gemini/tmp/<project>/chats/session-*.json files from Gemini
CLI 0.38+. Uses real token counts (input, output, cached, thoughts)
embedded in each message instead of character estimation. Correctly
separates cached tokens from fresh input to avoid double-charging.

- Pricing for gemini-3.1-pro-preview, gemini-3-flash-preview,
  gemini-2.5-pro, gemini-2.5-flash from official Google API rates
- Tool name normalization (ReadFile->Read, SearchText->Grep, etc.)
- Menubar tab with Google Blue color (#4485F4)

Closes #166
2026-04-27 19:48:25 -07:00
Resham Joshi
f7f64a01ab
Add new providers, fix menubar tabs, accent color picker (#167)
* Add Kiro provider and transparent auto-model naming

- Add Kiro IDE provider: parses .chat JSON files, estimates tokens,
  normalizes dot-versioned model IDs for cost lookup
- Show "Cursor (auto)", "Copilot (auto)", "Kiro (auto)" in CLI
  dashboard instead of pretending to know which model was used
- Route auto model names through BUILTIN_ALIASES for cost estimation

* Fix menubar tabs: add missing providers, show period-scoped costs

- Add Kiro, OMP to ProviderFilter enum so installed providers appear as tabs
- Merge Cursor + Cursor Agent into single Cursor tab
- Tab costs now reflect the selected period (7d/30d/month/all) instead
  of always showing today
- Tab visibility still uses today's provider list so tabs don't
  disappear when switching to periods with no data

* Add accent color picker to menubar with Apple system presets

- 9 presets using Apple's exact macOS dark-mode accent colors
  (Ember, Blue, Purple, Pink, Red, Orange, Yellow, Green, Graphite)
- Color picker in header, persisted via UserDefaults
- "Burn" text stays fixed ember regardless of accent
- ThemeState is MainActor-isolated for thread safety
- Picker state lifted to AppStore so it survives .id() tree rebuild
- Accessibility labels on all color swatches
- Renamed brandAccentDark/brandEmberDeep/brandEmberGlow to match
  their actual light/deep/glow semantics

* Fix review findings: case-sensitive cost lookup, Kiro timestamp guard, cache versioning

- Normalize provider dictionary keys to lowercase in tab cost lookup
  so "Cursor Agent" (title-case from CLI) matches providerKeys
- Guard against missing/invalid/epoch startTime in Kiro parser to
  prevent RangeError crash or 1970-01-01 ghost entries
- Bump DAILY_CACHE_VERSION to 4 so upgraded users get a clean
  recompute with the new auto-model naming (cursor-auto vs default)
- Add version field to cursor-results.json cache to invalidate stale
  entries that still use the old 'default' model name
2026-04-27 19:46:30 -07:00
Resham Joshi
5d1b335c0a
Fix Copilot provider to read VS Code workspace transcripts (#165)
The Copilot provider only looked in ~/.copilot/session-state/ which is
from an older CLI tool. VS Code Copilot agent stores transcripts in
~/Library/Application Support/Code/User/workspaceStorage/*/GitHub.copilot-chat/transcripts/.

The new transcript format has no outputTokens or model_change events,
so tokens are estimated from content length and the model is inferred
from tool call ID prefixes. Both legacy and VS Code paths are now
scanned in parallel.

Fixes #161
2026-04-27 19:44:35 -07:00
AgentSeal
410fad9495 Fix Cursor provider reporting $0 for v3 bubble format and NULL createdAt rows
Cursor v3 stores zero token counts in bubbles, causing parseBubbles to
return empty results. The query also dropped rows with NULL createdAt
via the SQL comparison, hiding data from older Cursor versions too.

Changes:
- Remove inputTokens > 0 SQL filter, estimate tokens from text length
  when token counts are zero (same 4 chars/token ratio as agentKv)
- Include NULL createdAt rows with OR IS NULL, fall back to current
  timestamp when createdAt is missing
- Parse agentKv entries with plain string content instead of skipping
  them (not all content is a JSON array)
- Always parse both bubbles and agentKv instead of agentKv-only fallback
- Discover subagent transcripts in subagents/ subdirectories
- Fix timezone-dependent test in day-aggregator

Fixes #159, #163
2026-04-28 00:35:51 +02:00