Commit graph

222 commits

Author SHA1 Message Date
iamtoruk
3cb9a7a7bc feat(compare): add self-correction JSONL scanner
Adds scanSelfCorrections() which reads raw .jsonl session files (including subagent dirs) and counts per-model self-correction patterns for use in the model comparison metrics.
2026-04-19 05:25:31 -07:00
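The scanner described above reads JSONL and tallies matches per model. As a minimal sketch of that idea only: the record shape (`model`, `text`), the function name countSelfCorrections, and the phrases in the pattern are all assumptions for illustration, since the commit message does not list the real fields or patterns.

```typescript
// Hypothetical phrases; the commit message does not list the real patterns.
const SELF_CORRECTION = /\b(my mistake|i was wrong|let me fix that)\b/i

// Count lines per model whose text matches a self-correction pattern.
// Assumes each JSONL line is an object with string `model` and `text` fields.
function countSelfCorrections(jsonl: string): Map<string, number> {
  const counts = new Map<string, number>()
  for (const line of jsonl.split('\n')) {
    if (line.trim() === '') continue
    let parsed: unknown
    try {
      parsed = JSON.parse(line)
    } catch {
      continue // skip malformed lines rather than abort the scan
    }
    if (typeof parsed !== 'object' || parsed === null) continue
    const { model, text } = parsed as { model?: unknown; text?: unknown }
    if (typeof model !== 'string' || typeof text !== 'string') continue
    if (SELF_CORRECTION.test(text)) {
      counts.set(model, (counts.get(model) ?? 0) + 1)
    }
  }
  return counts
}

// Made-up sample data, including one malformed line.
const sample = [
  '{"model":"model-a","text":"My mistake, the import path was wrong."}',
  '{"model":"model-b","text":"Done."}',
  'not json',
  '{"model":"model-a","text":"Let me fix that test."}',
].join('\n')

console.log(countSelfCorrections(sample).get('model-a')) // 2
```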
iamtoruk
ac9afffed5 feat(compare): add computeComparison with normalized metrics
2026-04-19 05:22:34 -07:00
iamtoruk
9d119bfe40 feat(compare): add ModelStats type and aggregateModelStats
2026-04-19 05:20:37 -07:00
iamtoruk
7cb1cf58bf Add implementation plan for model comparison feature
9 tasks covering: ModelStats aggregation, comparison metrics,
self-correction scanner, Ink components, CLI command, dashboard integration.
2026-04-19 05:09:11 -07:00
iamtoruk
b69bf39deb Add design spec for model comparison feature
Side-by-side comparison of any two AI models using normalized metrics:
cost/call, one-shot rate, retry rate, self-correction rate, cache hit.
Accessible via codeburn compare and dashboard [c] shortcut.
2026-04-19 04:55:32 -07:00
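Normalizing means dividing raw totals by a volume measure so that a heavily used model and a lightly used one can be compared side by side. A rough sketch, assuming per-call denominators and hypothetical field names (RawTotals, costUsd, and retries are not taken from the spec):

```typescript
// Hypothetical raw totals for one model; the real ModelStats fields may differ.
interface RawTotals {
  calls: number
  costUsd: number
  retries: number
}

// Divide, returning 0 when there is nothing to divide by.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator
}

// Per-call normalization is an assumption made for this sketch.
function normalize(t: RawTotals): { costPerCall: number; retryRate: number } {
  return {
    costPerCall: ratio(t.costUsd, t.calls),
    retryRate: ratio(t.retries, t.calls),
  }
}

const heavy = normalize({ calls: 400, costUsd: 20, retries: 40 })
const light = normalize({ calls: 50, costUsd: 5, retries: 5 })
console.log(heavy.costPerCall, light.costPerCall) // 0.05 0.1
console.log(heavy.retryRate === light.retryRate) // true
```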
iamtoruk
e3395d241f Fix daily cache gap fill using UTC instead of local time
The gapStart date was constructed with T00:00:00.000Z (UTC midnight),
causing it to land hours before local midnight. In PDT this meant
the gap fill re-parsed a partial slice of the previous day, and the
upsert replaced the full day with that partial data, losing cost.

Bump DAILY_CACHE_VERSION to 3 to force cache rebuild.
2026-04-19 04:23:17 -07:00
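The difference between the two midnights can be shown in a few lines. This is a sketch under assumptions: localMidnight and utcMidnight are hypothetical helper names, not the project's code.

```typescript
// Hypothetical helper: interpret a "YYYY-MM-DD" day as local midnight.
function localMidnight(day: string): Date {
  const [y, m, d] = day.split('-').map(Number)
  // The multi-argument Date constructor reads its fields in local time.
  return new Date(y, m - 1, d)
}

// For contrast: an ISO string with a trailing Z is always UTC midnight.
function utcMidnight(day: string): Date {
  return new Date(`${day}T00:00:00.000Z`)
}

const local = localMidnight('2026-04-19')
const utc = utcMidnight('2026-04-19')

// The two instants differ by the local UTC offset (420 minutes in PDT).
const skewMinutes = (local.getTime() - utc.getTime()) / 60000
console.log(local.getHours()) // 0
console.log(skewMinutes === local.getTimezoneOffset()) // true
```

West of UTC, the UTC-midnight instant falls on the previous local evening, which matches the partial-day re-parse the commit describes.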
iamtoruk
070a378160 chore: bump to 0.7.4 and update CHANGELOG
2026-04-19 03:36:27 -07:00
iamtoruk
bc92b49c1b feat(mac): auto-update checker and Plan pane button cleanup
Remove the broken "Connect Claude" / "Reconnect Claude" buttons from
the Plan pane -- they opened a terminal session that did nothing useful
for already-logged-in users. Keep only the "Retry" button.

Add an auto-update checker that queries GitHub releases every 2 days in
the background. When a newer menubar build is available, an "Update"
pill appears in the header. Clicking it runs the existing installer
flow (download, replace, relaunch) with no manual steps.
2026-04-19 03:33:37 -07:00
iamtoruk
72ccf34a5a fix: use local timezone for daily date bucketing instead of UTC
Timestamps in session files are UTC ISO strings. Several code paths
extracted the date via .slice(0, 10) which gives the UTC date, while
date range filtering uses local-time boundaries. This caused turns
between UTC midnight and local midnight to be bucketed under the wrong
day -- the menubar showed lower today cost than the TUI because those
turns were attributed to tomorrow (UTC) but filtered as today (local).

format.ts already had a localDateString fix; this applies the same
pattern everywhere via dateKey() in day-aggregator.ts.
2026-04-19 03:18:38 -07:00
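A local-time day key can be built from the Date getters instead of slicing the ISO string. The sketch below only illustrates the pattern; the real dateKey() in day-aggregator.ts may take different arguments, and utcDateKey is a hypothetical name for the replaced approach.

```typescript
// Sketch of a local-time day key (signature is an assumption).
function dateKey(timestamp: string | Date): string {
  const d = typeof timestamp === 'string' ? new Date(timestamp) : timestamp
  const year = d.getFullYear()
  const month = String(d.getMonth() + 1).padStart(2, '0')
  const day = String(d.getDate()).padStart(2, '0')
  return `${year}-${month}-${day}`
}

// The UTC-based extraction the commit replaces, shown for contrast.
function utcDateKey(isoTimestamp: string): string {
  return isoTimestamp.slice(0, 10)
}

// 23:30 local time on April 19 is always bucketed under April 19 locally,
// whatever UTC date that instant happens to fall on.
const lateEvening = new Date(2026, 3, 19, 23, 30)
console.log(dateKey(lateEvening)) // 2026-04-19
console.log(utcDateKey(lateEvening.toISOString())) // 2026-04-20 in zones west of UTC
```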
iamtoruk
888030fce3 fix: recompute yesterday in daily cache to prevent stale menubar data
The daily cache never re-processed yesterday once cached, so a mid-day
run would freeze partial cost/call data permanently. The "All" provider
path in menubar-json relied on this cache, causing the menubar to show
wildly incorrect numbers while per-provider views (which parse fresh)
were correct. Now yesterday is evicted and recomputed on every run, and
addNewDays upserts instead of skipping duplicates as defense-in-depth.
2026-04-19 03:07:54 -07:00
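The skip-versus-upsert distinction is small in code but decides whether a partial day can ever be corrected. A minimal sketch, assuming the cache is a Map keyed by day string; DayEntry and addSkippingDuplicates are hypothetical names, and only addNewDays is named in the commit.

```typescript
interface DayEntry {
  date: string
  cost: number
  calls: number
}

// Old behavior as described: a day that is already cached is never replaced.
function addSkippingDuplicates(cache: Map<string, DayEntry>, fresh: DayEntry[]): void {
  for (const entry of fresh) {
    if (!cache.has(entry.date)) cache.set(entry.date, entry)
  }
}

// Upsert: a freshly computed day always overwrites the cached one.
function addNewDays(cache: Map<string, DayEntry>, fresh: DayEntry[]): void {
  for (const entry of fresh) cache.set(entry.date, entry)
}

// Made-up numbers: a mid-day run caches a partial day, a later run has the full day.
const cache = new Map<string, DayEntry>()
addNewDays(cache, [{ date: '2026-04-18', cost: 3.1, calls: 40 }])
addSkippingDuplicates(cache, [{ date: '2026-04-18', cost: 9.8, calls: 120 }])
console.log(cache.get('2026-04-18')?.cost) // 3.1, the stale partial value survives

addNewDays(cache, [{ date: '2026-04-18', cost: 9.8, calls: 120 }])
console.log(cache.get('2026-04-18')?.cost) // 9.8
```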
Resham Joshi
64aae10175
Merge pull request #104 from aaronflorey/fix/opencode-sqlite-esm-loader
fix(sqlite): load node:sqlite in ESM runtime
2026-04-19 02:11:21 -07:00
AgentSeal
11b3de89e4
fix(sqlite): load node:sqlite in ESM runtime
Replace eval-based require with createRequire(import.meta.url) so the SQLite driver loads correctly when the CLI runs as ESM.

This restores OpenCode and Cursor session discovery instead of returning empty results when require is unavailable.
2026-04-19 05:27:05 +00:00
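createRequire from node:module builds a CommonJS-style require function anchored at a given file, which is what lets an ESM module load something synchronously. The sketch below takes the anchor as a parameter so it runs in either module system; loadWithRequire is a hypothetical helper name, not the project's code.

```typescript
import { createRequire } from 'node:module'

// Build a require() anchored at `base` and use it to load `specifier`.
// In an ESM entry point, `base` would be import.meta.url.
function loadWithRequire<T>(specifier: string, base: string | URL): T {
  const localRequire = createRequire(base)
  return localRequire(specifier) as T
}

// Usage inside an ESM module, on a Node version that ships node:sqlite:
//   const { DatabaseSync } = loadWithRequire<typeof import('node:sqlite')>(
//     'node:sqlite',
//     import.meta.url,
//   )
```

Unlike an eval-based require, this does not depend on a global require being present at runtime, which is the failure mode the commit describes.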
AgentSeal
82df214958
docs: cover --from/--to, avgCostPerSession, and semgrep guard (#99)
README gains a --from/--to example in the Usage block, a dedicated
'Date range filtering' subsection, and a note that JSON projects[]
now includes avgCostPerSession.

CHANGELOG opens an Unreleased section crediting @lfl1337 for PRs #78
and #80. It flags the projects.csv column-order shift (Avg/Session now
sits between Cost and Share) so that consumers parsing by position
switch to reading by header.

Co-authored-by: AgentSeal <hello@agentseal.org>
2026-04-18 15:45:45 -07:00
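Reading by header makes a consumer immune to column reordering like the one flagged above. A small sketch, assuming a simple CSV without quoted fields; parseByHeader, the Project column, and the sample row are made up, and only the Cost, Avg/Session, Share ordering comes from the commit.

```typescript
// Parse a simple CSV (no quoted fields) into one Map per row, keyed by header name.
function parseByHeader(csv: string): Map<string, string>[] {
  const lines = csv.trim().split('\n')
  const headers = (lines[0] ?? '').split(',').map(h => h.trim())
  return lines.slice(1).map(line => {
    const cells = line.split(',')
    const row = new Map<string, string>()
    headers.forEach((header, i) => row.set(header, (cells[i] ?? '').trim()))
    return row
  })
}

// Hypothetical export content using the new column order.
const csv = [
  'Project,Cost,Avg/Session,Share',
  'demo,12.50,1.25,40%',
].join('\n')

const first = parseByHeader(csv)[0]
console.log(first.get('Share')) // 40%, whichever column Share sits in
console.log(first.get('Avg/Session')) // 1.25
```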
Ninym
c634b10560
feat(report): add --from/--to date range filtering and avgCostPerSession (#80)
* test(cli): failing tests for parseDateRangeFlags helper

* feat(cli): add parseDateRangeFlags helper with local-time dates

* feat(report): add --from/--to date range filtering

* feat(report): add avgCostPerSession to JSON report and CSV/JSON export
2026-04-18 15:11:33 -07:00
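A date-range helper that works in local time might look like the sketch below. Everything here is an assumption: the YYYY-MM-DD input format, the signature and return shape of parseDateRangeFlags, the inclusive end-of-day handling, and the helper parseLocalDay are not taken from the PR.

```typescript
interface DateRange {
  from?: Date
  to?: Date
}

// Parse "YYYY-MM-DD" as a local-time instant: start of day, or end of day for --to.
function parseLocalDay(value: string, endOfDay: boolean): Date {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value)
  if (!match) throw new Error(`Invalid date "${value}", expected YYYY-MM-DD`)
  const y = Number(match[1])
  const m = Number(match[2])
  const d = Number(match[3])
  const date = endOfDay ? new Date(y, m - 1, d, 23, 59, 59, 999) : new Date(y, m - 1, d)
  // The Date constructor silently rolls over (Feb 31 becomes Mar 3), so reject that.
  if (date.getMonth() !== m - 1 || date.getDate() !== d) {
    throw new Error(`Invalid calendar date "${value}"`)
  }
  return date
}

// Hypothetical signature; both flags are optional.
function parseDateRangeFlags(from?: string, to?: string): DateRange {
  const range: DateRange = {}
  if (from) range.from = parseLocalDay(from, false)
  if (to) range.to = parseLocalDay(to, true)
  if (range.from && range.to && range.from > range.to) {
    throw new Error('--from must not be after --to')
  }
  return range
}

const range = parseDateRangeFlags('2026-04-01', '2026-04-18')
console.log(range.from?.getHours(), range.to?.getHours()) // 0 23
```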
Ninym
5932a273a1
chore(ci): add semgrep guard against prototype pollution regressions in provider hot paths (#78)
* chore(ci): add semgrep rule no-bracket-assign-on-literal-object-map

* chore(ci): add workflow running semgrep bracket-assign guard on push/PR

* fix(parser): use Object.create(null) for categoryBreakdown map

* chore(ci): expand semgrep rule to cover ||, ??=, and if-guard variants

* chore(ci): limit push trigger to main and add semgrep --strict

* chore(ci): use jq to enforce finding count (--error unreliable in semgrep 1.x)
2026-04-18 15:10:24 -07:00
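The reason bracket assignment on a literal object map is worth guarding against: a literal inherits from Object.prototype, so keys such as constructor already resolve to something before any insert. The sketch below contrasts that with Object.create(null); addCost and the key values are illustrative, and only categoryBreakdown is a name from the PR.

```typescript
// Accumulate a cost under a dynamic key, the pattern the semgrep rule targets.
function addCost(map: Record<string, number>, category: string, cost: number): void {
  map[category] = (map[category] ?? 0) + cost
}

const key: string = 'constructor'

// Literal object: map['constructor'] is the inherited Object function, not undefined,
// so the ?? fallback never fires and the "sum" becomes a string concatenation.
const literal: Record<string, number> = {}
addCost(literal, key, 1.5)
console.log(typeof literal[key]) // string

// Null-prototype object: no inherited keys, so every key starts out undefined.
const categoryBreakdown: Record<string, number> = Object.create(null)
addCost(categoryBreakdown, key, 1.5)
addCost(categoryBreakdown, '__proto__', 2)
console.log(categoryBreakdown[key], Object.keys(categoryBreakdown).length) // 1.5 2
```

A Map would avoid the problem as well; the null-prototype form keeps plain-object serialization.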
AgentSeal
a031c8d32d
chore: point repo URLs at getagentseal org (#97)
Add package.json repository/bugs/homepage fields. Swap hardcoded
AgentSeal/codeburn URLs to getagentseal/codeburn across README,
mac README, macOS menubar star banner, and the menubar installer's
release-API endpoint. 301 redirects keep old URLs working, but
canonical links now point at the current org.

Co-authored-by: AgentSeal <hello@agentseal.org>
2026-04-18 14:55:44 -07:00
AgentSeal
af3676a2b1
Merge pull request #95 from getagentseal/fix/trend-tooltip-per-provider
fix(mac): show correct cost in trend tooltip for per-provider views
2026-04-18 13:41:49 -07:00
AgentSeal
94240f5341 fix(mac): show correct cost in trend tooltip for per-provider views
The trend chart tooltip always displayed `bar.tokens` in its header,
which is zero for provider-filtered history (the CLI only carries
per-provider cost+calls in the daily cache, not tokens). Result: when
you selected Claude/Codex/Cursor/Pi, hovering a bar showed $0.00 even
on days with real spend.

The trend chart's main metric already falls back to cost when tokens
are zero. Pass that same metric value through to the tooltip so both
stay consistent.

Also removed the misleading "No model breakdown available" fallback
line. For provider-filtered views the per-model breakdown legitimately
doesn't exist in the payload, so the tooltip now just shows date +
cost without the error-sounding message.
2026-04-18 13:18:11 -07:00
AgentSeal
70f47f8d9e
Merge pull request #94 from getagentseal/fix/menubar-today-cache-staleness
fix(mac): keep (today, all) cache fresh for menubar title and tab labels
2026-04-18 13:03:07 -07:00
AgentSeal
7ee8b679f9 fix(mac): keep (today, all) cache fresh for menubar title and tab labels
The refresh loop previously skipped `refreshQuietly(.today)` when the
user was already viewing the Today period. That guard meant while the
user was on (today, claude) or any other non-.all provider, the
(today, all) cache went stale. The menubar title and the agent tab
strip both read from that cache, so they displayed stale costs while
the hero section (which reads the currently-viewed payload) showed
the correct fresh value.

Remove the guard so the (today, all) cache refreshes every cycle
regardless of the currently selected period/provider.

Shipped as mac-v0.7.4.
2026-04-18 12:58:50 -07:00
AgentSeal
8ee1f38f86
Merge pull request #92 from AgentSeal/chore/reset-version-to-0.7.3
chore: reset version to 0.7.3 to match published npm
2026-04-18 09:57:52 -07:00
AgentSeal
c83a12efed chore: reset version to 0.7.3 to match published npm
2026-04-18 09:54:03 -07:00
AgentSeal
476b3c51ee
Merge pull request #91 from AgentSeal/revert/remove-npm-oidc
revert: remove npm OIDC publish workflow
2026-04-18 09:53:53 -07:00
AgentSeal
9ac2144950 revert: remove npm OIDC publish workflow
Three consecutive failed publish attempts on a live repo are not
acceptable. Reverting to manual `npm publish` from the laptop, which
has always worked. OIDC can be revisited later in a staging
environment, not on the production package.
2026-04-18 09:51:58 -07:00
AgentSeal
c62a1cf21f
Merge pull request #90 from AgentSeal/chore/bump-0.7.4-rc.2
chore: bump to 0.7.4-rc.2
2026-04-18 09:47:25 -07:00
AgentSeal
35d4d32955 chore: bump to 0.7.4-rc.2 for Node 24 OIDC retry
2026-04-18 09:47:21 -07:00
AgentSeal
ec130037f5
Merge pull request #89 from AgentSeal/fix/node-24-for-oidc
fix(ci): use Node 24 for npm OIDC trusted publishing
2026-04-18 09:47:18 -07:00
AgentSeal
4fccca47d2 fix(ci): use Node 24 for npm OIDC trusted publishing
Node 22 on GitHub's hosted runners currently pins to a broken npm
10.9.7 whose internal `promise-retry` module is missing from the
toolcache (runner-images#13883, nodejs/node#62430). Self-upgrading
via `npm install -g npm@latest` crashes before the install can run,
because `@npmcli/arborist` cannot start without that module.

Node 24 LTS bundles npm 11.x natively, which supports OIDC trusted
publishing out of the box (minimum is 11.5.1, per npm docs). Bumping
the runtime lets us delete the fragile upgrade step entirely.

Test: tag `v0.7.4-rc.2` after merge to validate the flow publishes
successfully with provenance.
2026-04-18 09:46:13 -07:00
AgentSeal
27af2ef96a
Merge pull request #88 from AgentSeal/chore/bump-0.7.4-rc.1
chore: bump to 0.7.4-rc.1
2026-04-18 09:36:52 -07:00
AgentSeal
e7f1b33196 chore: bump to 0.7.4-rc.1 for OIDC retry after npm upgrade fix
2026-04-18 09:36:38 -07:00
AgentSeal
679363a25c
Merge pull request #87 from AgentSeal/fix/npm-version-for-oidc
fix(ci): upgrade npm to 11.5.1+ for OIDC trusted publishing
2026-04-18 09:36:36 -07:00
AgentSeal
832dd4ada1 fix(ci): upgrade npm to 11.5.1+ for OIDC trusted publishing
Node 22 ships with npm 10.x, which does not know how to exchange the
GitHub OIDC id-token for a short-lived npm token. Without this upgrade,
the publish step silently falls back to the empty NODE_AUTH_TOKEN that
setup-node writes to .npmrc, and the registry returns 404.

First test publish (v0.7.4-rc.0) failed at exactly this point, even
though provenance signing via sigstore succeeded, confirming the OIDC
handshake with GitHub was fine and only the npm-side auth was broken.

Fix: `npm install -g npm@latest` before the publish step. Adds ~5s to
runtime.
2026-04-18 09:33:52 -07:00
AgentSeal
bed772b6a5
Merge pull request #86 from AgentSeal/chore/bump-0.7.4-rc.0
chore: bump to 0.7.4-rc.0 for OIDC test publish
2026-04-18 09:27:05 -07:00
AgentSeal
46f72ba348 chore: bump to 0.7.4-rc.0 for OIDC test publish
Pre-release bump to validate npm OIDC trusted publishing end to end:
workflow trigger, Environment approval gate, Trusted Publisher match,
provenance attestation. Will not be tagged as `latest` on npm (npm
auto-excludes SemVer pre-releases from dist-tags). After this RC
succeeds, cut 0.7.4 proper.
2026-04-18 09:25:59 -07:00
AgentSeal
882deafc2b
Merge pull request #84 from AgentSeal/feat/npm-oidc-publish
CI: npm OIDC trusted publishing workflow
2026-04-18 09:10:56 -07:00
AgentSeal
21a4627780
Merge pull request #85 from AgentSeal/chore/ignore-discord-brand-assets
chore: ignore local Discord brand assets
2026-04-18 09:10:53 -07:00
AgentSeal
c90340f205 chore: ignore local Discord brand assets
Adds `assets/discord-*.png` to .gitignore so local promo/branding
assets that aren't ready to publish don't show up as untracked noise
in `git status`. Any Discord asset that should be tracked later can be
added with `git add -f`.
2026-04-18 09:08:43 -07:00
AgentSeal
8c87947980
Merge pull request #83 from AgentSeal/chore/block-claude-coauthor-prs
CI: block Co-authored-by Claude/Anthropic trailers on PRs
2026-04-18 09:04:54 -07:00
AgentSeal
e834f64c22 ci: block Co-authored-by Claude/Anthropic trailers on PRs
New GitHub Actions check that scans every PR commit for
`Co-authored-by: ... claude ...` or `... anthropic ...` trailers and
fails the PR with a clear remediation message if found. Contributors
can still use AI tools; the trailer attribution must be removed before
the PR is eligible to merge, consistent with the project contributor
guidelines.

The workflow scans only commits introduced by the PR
(base.sha..head.sha), so existing history is untouched.
2026-04-18 09:02:48 -07:00
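The trailer check amounts to a case-insensitive, per-line match over each commit message in the PR range. A sketch of that matching logic only; the actual workflow is a GitHub Actions check whose implementation is not shown here, and the regex, the function name findBlockedCommits, and the sample commits are assumptions.

```typescript
// Matches a Co-authored-by trailer line that mentions claude or anthropic.
// `i` ignores case; `m` lets ^ anchor at the start of every line.
const BLOCKED_TRAILER = /^co-authored-by:.*\b(claude|anthropic)\b/im

interface CommitInfo {
  sha: string
  message: string
}

// Return the SHAs of commits whose message carries a blocked trailer.
function findBlockedCommits(commits: CommitInfo[]): string[] {
  return commits.filter(c => BLOCKED_TRAILER.test(c.message)).map(c => c.sha)
}

// Made-up commits standing in for base.sha..head.sha.
const prCommits: CommitInfo[] = [
  { sha: 'aaa111', message: 'fix: thing\n\nCo-authored-by: Claude <noreply@anthropic.com>' },
  { sha: 'bbb222', message: 'docs: update\n\nCo-authored-by: AgentSeal <hello@agentseal.org>' },
  { sha: 'ccc333', message: 'feat: mention claude in the body only' },
]

console.log(findBlockedCommits(prCommits)) // [ 'aaa111' ]
```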
AgentSeal
d80f68928b ci: add npm OIDC trusted-publish workflow
Triggers on v* tag push or manual dispatch. Builds, tests, then publishes
codeburn to npm with provenance attestation. Uses OIDC so no NPM_TOKEN is
stored in repo secrets. The npm-publish GitHub Environment gates the
publish step behind a required reviewer, so every release needs explicit
human approval before it reaches the registry.

Tag/package version mismatch fails fast before any build work. Tests run
before publish to prevent shipping a broken release.
2026-04-18 07:43:06 -07:00
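The fail-fast check compares the pushed tag against the version in package.json before any build work starts. A sketch of that comparison, assuming tags are the version prefixed with v; assertTagMatchesVersion is a hypothetical name and the workflow's real step may be a shell one-liner.

```typescript
// Throw if a tag such as "v0.7.4" does not match the package version "0.7.4".
function assertTagMatchesVersion(tag: string, packageVersion: string): void {
  const tagVersion = tag.startsWith('v') ? tag.slice(1) : tag
  if (tagVersion !== packageVersion) {
    throw new Error(`Tag ${tag} does not match package.json version ${packageVersion}`)
  }
}

assertTagMatchesVersion('v0.7.4', '0.7.4') // passes silently

try {
  assertTagMatchesVersion('v0.7.4-rc.2', '0.7.4')
} catch (err) {
  console.log((err as Error).message)
  // Tag v0.7.4-rc.2 does not match package.json version 0.7.4
}
```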
AgentSeal
7a5cb32e4c Merge: restore agent tabs + Connect Claude button
2026-04-18 07:17:49 -07:00
AgentSeal
43a938ff9e feat(mac): add Connect Claude button to Plan pane
The Plan pane previously told users to "run claude login in your
terminal, then retry" with no way to start the flow from the app.
Added a primary Connect Claude button on both the no-credentials and
failed states that launches Terminal.app with `claude login`, so the
OAuth flow is one click away.

TerminalLauncher.openClaudeLogin() uses a hardcoded literal, so no
user input reaches AppleScript. Refactored the common path into
runInTerminal(command:preValidated:) which re-validates any non-
literal input against CodeburnCLI.isSafe as defense-in-depth.

On machines without Terminal.app (iTerm/Ghostty/Warp), the button
surfaces an inline instruction to run `claude login` manually instead
of failing silently.
2026-04-18 06:54:57 -07:00
AgentSeal
9483d66e65 fix(mac): restore agent tab strip to show all detected providers
Tabs were filtering on `value > 0` (today's spend), which hid the row
whenever only one provider had activity today. The CLI's providers map
already contains only providers detected on the system, so showing the
map as-is matches user intent: a tab for each installed tool,
regardless of today's spend. Tab strip only hides when nothing is
detected.

This also makes the Plan pill reachable again: it gates on
`selectedProvider == .claude`, so the Claude tab must be visible for
the user to select it.
2026-04-18 06:54:06 -07:00
AgentSeal
29e175c735 Merge: hide Mac agent tabs when fewer than two providers have spend
Brings in the Mac menubar's agent tab visibility rule: tabs only appear when
two or more providers have non-zero spend in the all-provider today view.
Also expands ProviderFilter to include every provider the CLI supports
(OpenCode, Pi) so their tabs appear when those tools produce sessions.
2026-04-18 05:08:10 -07:00
AgentSeal
85d7bea7ea feat(mac): hide agent tabs when fewer than two providers have spend
The tab strip was visible for everyone regardless of which tools they
actually run, which produced a row of All + one provider for Claude-only
users and a row of All + zeros for users on exotic stacks. Hide the
whole row until a second provider has real spend, matching the behavior
the GNOME extension ships with.

Also expand ProviderFilter to include every provider the CLI supports
(OpenCode and Pi were missing) so their tabs appear when those tools
produce sessions. The CLI already emits pi and opencode in the payload's
providers map; the Mac app just wasn't offering a tab for them.

visibleFilters now filters on value > 0 instead of key presence, because
the CLI includes zero-cost entries for discovered-but-unused providers
and we don't want those rendering as blank tabs.
2026-04-18 05:07:36 -07:00
AgentSeal
b232b3cfbe
Merge pull request #76 from AgentSeal/fix/drop-prebuild-install
Drop better-sqlite3 to remove deprecated prebuild-install (#75)
2026-04-18 10:39:26 +02:00
AgentSeal
7aefd674fc fix: drop better-sqlite3 to remove deprecated prebuild-install (#75)
npm was warning on every install that prebuild-install@7.1.3 is no
longer maintained. prebuild-install ships as a transitive dependency
of better-sqlite3 and upstream PR #1446 to replace it is still open,
so we switch to Node's built-in node:sqlite module (stable in Node 24,
experimental in Node 22/23) and remove the better-sqlite3 dep entirely.

- src/sqlite.ts: uses DatabaseSync from node:sqlite. The one-shot
  ExperimentalWarning about SQLite on Node 22/23 is silenced for that
  specific warning; other warnings pass through unchanged.
- package.json: engines.node bumped to >=22 (Node 20 EOL 2026-04-30),
  better-sqlite3 and @types/better-sqlite3 removed, @types/node added
  (it was coming in transitively via @types/better-sqlite3).
- tests/providers/opencode.test.ts: fixture DB creation switched to
  node:sqlite (API parity for the CREATE TABLE + INSERT + prepare
  path we use).

End-user install footprint shrinks from 167 to 40 packages and prints
zero deprecation warnings.

Credit: @primeminister for the report.
2026-04-18 01:26:23 -07:00
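Silencing one specific warning while letting the rest through can be done by wrapping the warning emitter with a filter. The sketch below shows one way to do that and is not the code in src/sqlite.ts: the names EmitWarning, isSqliteExperimentalWarning, and silenceSqliteWarning are hypothetical, and matching on the word "sqlite" in the message is an assumption.

```typescript
// Loose stand-in for the shape of process.emitWarning.
type EmitWarning = (warning: string | Error, ...rest: unknown[]) => void

// True only for an ExperimentalWarning whose message mentions SQLite.
function isSqliteExperimentalWarning(warning: string | Error, rest: unknown[]): boolean {
  const firstArg = rest[0]
  // The type arrives as a string argument, an options object, or the Error's name.
  const type =
    typeof warning !== 'string'
      ? warning.name
      : typeof firstArg === 'string'
        ? firstArg
        : (firstArg as { type?: string } | undefined)?.type
  const message = typeof warning === 'string' ? warning : warning.message
  return type === 'ExperimentalWarning' && /sqlite/i.test(message)
}

// Wrap an emitter so the SQLite warning is dropped and everything else passes through.
function silenceSqliteWarning(original: EmitWarning): EmitWarning {
  return (warning, ...rest) => {
    if (isSqliteExperimentalWarning(warning, rest)) return
    original(warning, ...rest)
  }
}

// Installing it in Node would look like:
//   process.emitWarning = silenceSqliteWarning(process.emitWarning.bind(process)) as typeof process.emitWarning
```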
AgentSeal
2b15256189 docs: actually update README to the renamed screenshot path
The prior rename commit moved the PNG but forgot to stage the matching
README edit, leaving the live image tag pointing at a path that no
longer existed. This fixes it.
2026-04-17 17:43:09 -07:00
AgentSeal
1037d2c527 docs: rename README screenshot so CDN+Camo re-fetch
The ?v=0.7.2 query bust wasn't enough; GitHub's Camo proxy was still
serving the SwiftBar-era image to viewers. Renaming the asset to a
new path forces every downstream cache to treat it as a new resource.
2026-04-17 17:41:28 -07:00
AgentSeal
feda92124d docs: bust CDN cache on menubar screenshot
Appends ?v=0.7.2 to the image URL so GitHub's Camo proxy re-fetches
the new 0.7.2 screenshot instead of serving its stale copy of the
SwiftBar-era one.
2026-04-17 17:29:16 -07:00