diff --git a/.gitignore b/.gitignore
index 1fc45e8..abdd828 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,7 +16,7 @@ Thumbs.db
 # Planning artifacts (internal, not shipped)
 docs/superpowers/
 .claude/
-CLAUDE.md
+/CLAUDE.md
 
 # Config / secrets
 .env
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..069b3f9
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,110 @@
+# Contributing to CodeBurn
+
+Thanks for your interest. This document covers what you need to know to send a working pull request.
+
+## Prerequisites
+
+- Node.js 22.20 or newer (`engines.node` in `package.json`).
+- npm 10 or newer (ships with recent Node).
+- macOS or Linux for full provider coverage. Windows works for most providers but Cursor / Antigravity development is easier on macOS.
+- Optional: Swift 6 toolchain if you are touching the macOS menubar (`mac/`).
+- Optional: GNOME 45 or newer if you are touching the GNOME extension (`gnome/`).
+
+## Setup
+
+```bash
+git clone https://github.com/getagentseal/codeburn
+cd codeburn
+npm install
+```
+
+Running the dev CLI requires no separate build step. `npm run dev` runs `tsx` against `src/cli.ts` directly.
+
+## Common Commands
+
+| Command | What it does |
+|---|---|
+| `npm test` | Runs the vitest suite (41 test files, 558 tests). |
+| `npm run dev -- status` | Runs the CLI in dev mode against your real data. |
+| `npm run build` | Bundles the litellm pricing snapshot, then runs `tsup` to produce `dist/cli.js`. |
+| `npm run bundle-litellm` | Refreshes `src/data/litellm-snapshot.json` from the upstream litellm repo. |
+
+To test a specific suite, pass a path:
+
+```bash
+npm test -- tests/providers/codex.test.ts
+```
+
+## What to Read Before Editing
+
+- `docs/architecture.md` for the high-level codebase map.
+- `docs/providers/<name>.md` for the provider you intend to change.
+- `RELEASING.md` if you are touching version bumps or the release pipeline.
+- `SECURITY.md` for the disclosure policy.
+
+## Project Layout
+
+```
+src/                CLI, parsers, optimize detectors, cache layers
+src/providers/      One file per AI tool integration
+src/data/           Bundled litellm pricing snapshot
+tests/              vitest specs
+mac/                Swift menubar app
+gnome/              GNOME shell extension
+scripts/            Build helpers (litellm bundle)
+```
+
+See `docs/architecture.md` for a fuller map.
+
+## Coding Conventions
+
+- TypeScript strict mode is on. Do not introduce `any` without a comment explaining why.
+- Avoid bracket-assign (`obj[key] = value`) on parsed user input in hot paths inside `src/providers/` and `src/parser.ts`. There is a Semgrep rule (`.semgrep/rules/no-bracket-assign-hot-paths.yml`) enforced in CI that will fail your PR if you do. Use a `Map` or an explicit allowlist instead.
+- Provider parsers must be deterministic given the same input. If you read the system clock or the filesystem outside the documented session paths, add a fixture-based test.
+- New providers go through `src/providers/index.ts`. Lazy-load anything that pulls a heavy native dependency (sqlite, protobuf) so users without that provider are not slowed down.
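+
+A minimal sketch of the two alternatives to bracket-assign named above. The helper names (`countTools`, `pickAllowed`) and the field list are hypothetical and do not come from `src/`; the sketch only illustrates why a `Map` or an allowlist keeps keys such as `__proto__` away from `Object.prototype`.
+
+```ts
+// Hypothetical helper: count tool calls taken from parsed user input.
+// A Map stores every key as plain data, so "__proto__" or "constructor"
+// cannot reach Object.prototype the way `counts[name] = ...` could.
+function countTools(toolNames: string[]): Map<string, number> {
+  const counts = new Map<string, number>()
+  for (const name of toolNames) {
+    counts.set(name, (counts.get(name) ?? 0) + 1)
+  }
+  return counts
+}
+
+// Explicit allowlist (illustrative field names): only known fields are copied.
+const ALLOWED_FIELDS = ['model', 'inputTokens', 'outputTokens'] as const
+type AllowedField = (typeof ALLOWED_FIELDS)[number]
+
+function pickAllowed(raw: Record<string, unknown>): Map<AllowedField, unknown> {
+  const picked = new Map<AllowedField, unknown>()
+  for (const field of ALLOWED_FIELDS) {
+    // Own-property check: inherited keys are never copied.
+    if (Object.prototype.hasOwnProperty.call(raw, field)) {
+      picked.set(field, raw[field])
+    }
+  }
+  return picked
+}
+```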
+
+## Tests
+
+- Each new provider should ship with a fixture-based test under `tests/providers/`. The five providers without test files today (claude, gemini, goose, qwen, antigravity) are a known gap; new code should not add to that list.
+- Each new optimize detector in `src/optimize.ts` needs at least one positive and one negative case in `tests/optimize.test.ts`.
+- If your change affects the menubar JSON contract, update `tests/menubar-json.test.ts`.
+
+## Commit Message Format
+
+Short imperative subject, optional body. Examples from `git log`:
+
+```
+Enhance GNOME extension with scrollable UI, dark mode, charts, and performance fixes
+Add table column headers, oneshot placeholder, currency picker dropdown
+```
+
+### No AI Co-Author Trailers
+
+The `.github/workflows/block-claude-coauthor.yml` workflow rejects any PR whose commits contain a `Co-authored-by: ... claude ...` or `... anthropic ...` trailer. You may use AI tools to help write code, but strip the co-author line before pushing.
+
+If your PR fails this check, the workflow prints the exact rebase command to fix it.
+
+## Pull Requests
+
+1. Fork or branch from `main`.
+2. Push your branch and open a PR against `main`.
+3. The `firstlook` workflow will auto-assess the PR. The `semgrep` CI workflow runs the hot-path bracket-assign guard. The `block-claude-coauthor` workflow scans commits.
+4. A maintainer reviews. For non-trivial changes, expect requests for tests.
+5. Squash-merge is the default. Keep the PR title short and accurate; the description carries the context.
+
+## Reporting Bugs
+
+File issues at https://github.com/getagentseal/codeburn/issues. Useful details:
+
+- Output of `codeburn --version`.
+- Provider involved and rough size of your session history (`du -sh ~/.codex/sessions`, etc.).
+- Output of the failing command with `DEBUG=1` if applicable.
+- For parsing bugs: a redacted JSONL or SQLite snippet that reproduces the issue.
+
+## Security Issues
+
+Do not file security issues in the public tracker. See `SECURITY.md` for the disclosure process.
+
+## License
+
+CodeBurn is MIT-licensed. By contributing, you agree your contributions are licensed under the same terms.
diff --git a/RELEASING.md b/RELEASING.md
new file mode 100644
index 0000000..56e4124
--- /dev/null
+++ b/RELEASING.md
@@ -0,0 +1,252 @@
+# Releasing CodeBurn
+
+This document describes the actual steps a maintainer takes to cut a CLI or macOS menubar release. CLI releases are run by hand with `npm publish`; macOS menubar releases are automated by `.github/workflows/release-menubar.yml` when a `mac-v*` tag is pushed.
+
+## Versioning
+
+CodeBurn uses semantic versioning (major.minor.patch). The CLI and macOS menubar share the same version number for clarity.
+
+## Before Every Release
+
+Run the test suite to catch any regressions:
+
+```bash
+npm test
+```
+
+Verify that the build completes without errors:
+
+```bash
+npm run build
+```
+
+## CLI Release Process
+
+### 1. Update the Version
+
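+Bump the version in `package.json` and `package-lock.json` together. `npm version` updates both files; pass `--no-git-tag-version` so it does not also create a commit and a `v<version>` tag, which would collide with the commit in step 2 and the tag in step 4:
+
+```bash
+npm version <version> --no-git-tag-version
+```
+
+For example, `npm version 0.9.8 --no-git-tag-version` updates both files and leaves the commit and the tag to the later steps.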
+
+Alternatively, edit `package.json` by hand and run `npm install` to regenerate the lockfile with the new version.
+
+### 2. Update the Changelog
+
+Edit `CHANGELOG.md`. Move all changes from the "Unreleased" section into a new section with the version number and today's date:
+
+```markdown
+## Unreleased
+
+### ...
+
+## 0.9.8 - 2026-05-10
+
+### Added
+- Feature X
+
+### Fixed
+- Bug Y
+```
+
+Commit these changes:
+
+```bash
+git add CHANGELOG.md package.json package-lock.json
+git commit -m "chore: bump to 0.9.8"
+```
+
+### 3. Publish to npm
+
+There is no GitHub Actions workflow for the CLI; the maintainer runs `npm publish` from a clean working tree:
+
+```bash
+npm publish
+```
+
+The `prepublishOnly` script in `package.json` runs `npm run build` first, which bundles the litellm pricing snapshot and then runs `tsup` to emit `dist/cli.js`.
+
+If publishing for the first time on a new machine, run `npm login` first.
+
+### 4. Tag the Release
+
+After npm accepts the publish, tag the commit and push:
+
+```bash
+git tag v0.9.8
+git push origin v0.9.8
+```
+
+The tag is for human reference and to anchor the GitHub Release. No workflow runs on `v*` tags for the CLI today.
+
+### 5. Verify npm Publication
+
+```bash
+npm view codeburn version
+```
+
+### 6. Create a GitHub Release
+
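+Use the GitHub CLI to create a release with notes from the changelog. The `awk` filter prints the 0.9.8 section up to the next `## ` heading and runs with both GNU and BSD `awk`:
+
+```bash
+gh release create v0.9.8 --title v0.9.8 --notes "$(awk '/^## /{p=0} /^## 0\.9\.8/{p=1} p' CHANGELOG.md)"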
+```
+
+Or use the web interface to draft a release and copy the changelog section into the body.
+
+## macOS Menubar Release Process
+
+The macOS menubar is released separately with its own GitHub Release, but shares the same version number as the CLI.
+
+### 1. Same Version Bump
+
+Follow the same version bumping process as the CLI. Both `package.json` and `CHANGELOG.md` reflect the shared version.
+
+### 2. Tag the macOS Release
+
+After the CLI tag is published, create a separate tag for the menubar:
+
+```bash
+git tag mac-v0.9.8
+git push origin mac-v0.9.8
+```
+
+### 3. GitHub Actions Builds the Bundle
+
+The `.github/workflows/release-menubar.yml` workflow automatically detects the `mac-v*` tag and:
+
+1. Checks out the repo
+2. Runs `mac/Scripts/package-app.sh 0.9.8`
+3. Signs the app bundle (ad-hoc signing)
+4. Creates a zip file: `CodeBurnMenubar-0.9.8.zip`
+5. Computes a SHA-256 checksum: `CodeBurnMenubar-0.9.8.zip.sha256`
+6. Uploads both to a GitHub Release named "Menubar v0.9.8"
+
+The script output on the build machine shows:
+
+```
+✓ Built /path/mac/.build/dist/CodeBurnMenubar-0.9.8.zip
+✓ Checksum /path/mac/.build/dist/CodeBurnMenubar-0.9.8.zip.sha256
+<sha256-hash>  CodeBurnMenubar-0.9.8.zip
+```
+
+No manual action is needed; the workflow handles everything.
+
+### 4. Verify the Release
+
+After the workflow completes, the GitHub Release page shows the zip and sha256 files. The CLI's menubar installer (`npx codeburn menubar`) fetches the latest release from GitHub and installs it into `~/Applications`.
+
+## Homebrew Tap Update
+
+The Homebrew tap lives at `https://github.com/getagentseal/homebrew-codeburn`. A maintainer with access to that repository must manually update the formula.
+
+### 1. Fetch the npm Tarball
+
+When the CLI is published to npm, get its download URL and registry checksum:
+
+```bash
+npm view codeburn@0.9.8 dist.tarball
+npm view codeburn@0.9.8 dist.shasum
+```
+
+This returns a URL like `https://registry.npmjs.org/codeburn/-/codeburn-0.9.8.tgz` and a SHA-1 checksum (`dist.shasum` is SHA-1, not SHA-256).
+
+The formula's `sha256` field needs a SHA-256 digest, so compute it from the tarball:
+
+```bash
+curl -sL https://registry.npmjs.org/codeburn/-/codeburn-0.9.8.tgz | shasum -a 256
+```
+
+### 2. Update the Formula
+
+Edit `Formula/codeburn.rb` in the homebrew-codeburn tap:
+
+```ruby
+class Codeburn < Formula
+  desc "See where your AI coding tokens go"
+  homepage "https://github.com/getagentseal/codeburn"
+  url "https://registry.npmjs.org/codeburn/-/codeburn-0.9.8.tgz"
+  sha256 "<computed-hash>"
+  license "MIT"
+
+  depends_on "node"
+
+  def install
+    system "npm", "install", *Language::Node.std_npm_install_args(libexec)
+    bin.install_symlink Dir[libexec/"bin/*"]
+  end
+
+  test do
+    system "#{bin}/codeburn", "--version"
+  end
+end
+```
+
+Update the `url` and `sha256` fields with the new version's values.
+
+### 3. Test Locally
+
+Before pushing, test the formula locally:
+
+```bash
+brew install --build-from-source Formula/codeburn.rb
+codeburn --version
+```
+
+### 4. Commit and Push
+
+Commit the formula change:
+
+```bash
+git add Formula/codeburn.rb
+git commit -m "codeburn: bump to 0.9.8"
+git push origin main
+```
+
+Users can now install with:
+
+```bash
+brew tap getagentseal/codeburn
+brew install codeburn
+```
+
+Or upgrade an existing installation:
+
+```bash
+brew upgrade codeburn
+```
+
+## Replacing Assets on an Existing Release
+
+If a release is published with broken assets (e.g., a menubar zip with a build error), re-run the build and upload the fixed assets without creating a new tag.
+
+Use `gh release upload` with the `--clobber` flag to overwrite existing files:
+
+```bash
+# After re-running mac/Scripts/package-app.sh 0.9.8 to regenerate the zip and sha256
+gh release upload mac-v0.9.8 mac/.build/dist/CodeBurnMenubar-0.9.8.zip --clobber
+gh release upload mac-v0.9.8 mac/.build/dist/CodeBurnMenubar-0.9.8.zip.sha256 --clobber
+```
+
+The GitHub Release page will now serve the fixed assets. The menubar installer fetches from the Release by tag, so users who run `npx codeburn menubar` after the replacement get the fixed version automatically.
+
+## Rollback
+
+If a released version has a critical bug, the fastest path is to fix the bug and cut a new patch release (e.g., 0.9.8 -> 0.9.9). Delete the broken tag locally and on GitHub if it has not yet been widely distributed:
+
+```bash
+git tag -d v0.9.8
+git push origin --delete v0.9.8
+```
+
+npm does not allow republishing to the same version. If you must unpublish from npm, use `npm unpublish codeburn@0.9.8 --force` (requires Owner role). This is discouraged, and anyone who already installed that version keeps it.
+
+For the menubar, tag a new `mac-v0.9.9` and let the workflow build and upload it. Users will see the update pill in the menubar settings and upgrade automatically (or manually via `npx codeburn menubar --force`).
+
+## Summary
+
+The CLI release is manual: bump the version, update `CHANGELOG.md`, commit, run `npm publish`, then tag and create a GitHub Release. The macOS menubar release is automated: pushing a `mac-v*` tag fires `.github/workflows/release-menubar.yml`, which builds, signs, zips, and publishes the bundle. The Homebrew formula at `getagentseal/homebrew-codeburn` is updated by hand after each CLI publish.
diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..075206c
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,189 @@
+# CodeBurn Architecture
+
+A map of the codebase. Read this once before opening a non-trivial PR.
+
+## Three Surfaces
+
+CodeBurn is one Node.js CLI plus two GUI clients that shell out to it.
+
+```
++----------------------+      +-----------------+
+| mac/  (Swift)        | ---> |                 |
++----------------------+      |  src/cli.ts     |
+| gnome/ (JavaScript)  | ---> |  (the CLI)      |
++----------------------+      |                 |
+                              |  status         |
+                              |  --format       |
+                              |  menubar-json   |
+                              +-----------------+
+                                       |
+                                       v
+                          +----------------------------+
+                          | session files on disk      |
+                          | (JSONL, SQLite, protobuf)  |
+                          +----------------------------+
+```
+
+The macOS menubar (`mac/`) and the GNOME extension (`gnome/`) both invoke `codeburn status --format menubar-json --period <p>` and parse the JSON. They do not share code with the CLI; they only depend on its output contract.
+
+## CLI (`src/`)
+
+`src/cli.ts` is the Commander.js entry point. The bin field in `package.json` points at `dist/cli.js`. Twelve commands are registered:
+
+| Command | Line | Purpose |
+|---|---|---|
+| `report` | 274 | Default. Interactive Ink TUI dashboard. |
+| `status` | 358 | Compact text status, plus `--format menubar-json` for clients. |
+| `today` | 524 | Today-only view of `report`. |
+| `month` | 542 | Month-only view of `report`. |
+| `export` | 560 | CSV or JSON dump of usage data. |
+| `menubar` | 621 | Downloads and launches the macOS menubar bundle. |
+| `currency` | 636 | Sets display currency. |
+| `model-alias` | 687 | Maps an unknown model name to a known one for pricing. |
+| `plan` | 737 | Configures a subscription plan for overage tracking. |
+| `optimize` | 857 | Runs all 14 waste detectors. |
+| `compare` | 870 | Compares two models side by side. |
+| `yield` | 882 | Tracks which sessions shipped to main vs. were reverted (experimental). |
+
+### Pipeline
+
+```
+provider.discoverSessions()
+        |
+        v
+provider.createSessionParser(source, seenKeys)
+        |
+        v   yields ParsedProviderCall (see src/providers/types.ts)
+        |
+        v
+src/parser.ts: parseAllSessions()
+        |
+        v   aggregates into ProjectSummary[]
+        |
+        v
+src/daily-cache.ts: aggregate per day, persist
+        |
+        v
+output formatter (Ink TUI, JSON, or menubar-json)
+```
+
+`src/parser.ts` is the central aggregator. Public exports: `parseAllSessions`, `filterProjectsByName`, `extractMcpInventory`. It owns the dedup `Set` (`seenKeys`) that is passed into every provider parser so a turn that surfaces in two providers (Claude logs vs. Cursor mirror, for instance) is counted once.
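+
+A simplified sketch of how a shared `seenKeys` set lets two providers report the same turn while it is counted once. The `Call` shape, the `dedupKey` field, and `parseOnce` are hypothetical stand-ins, and a synchronous generator is used here for brevity where the real parsers may be asynchronous.
+
+```ts
+// Hypothetical record; the real type is ParsedProviderCall in src/providers/types.ts.
+type Call = { dedupKey: string; costUSD: number }
+
+// Stand-in for a provider parser: skips any call whose key was already seen.
+function* parseOnce(calls: Call[], seenKeys: Set<string>): Generator<Call> {
+  for (const call of calls) {
+    if (seenKeys.has(call.dedupKey)) continue
+    seenKeys.add(call.dedupKey)
+    yield call
+  }
+}
+
+// One Set is shared across providers, as parseAllSessions() is described to do.
+const seenKeys = new Set<string>()
+const fromFirstProvider = Array.from(
+  parseOnce([{ dedupKey: 'turn-1', costUSD: 0.02 }], seenKeys),
+)
+const fromSecondProvider = Array.from(
+  parseOnce(
+    [
+      { dedupKey: 'turn-1', costUSD: 0.02 }, // mirror of the same turn, skipped
+      { dedupKey: 'turn-2', costUSD: 0.01 },
+    ],
+    seenKeys,
+  ),
+)
+```
+
+In this sketch the provider that runs first claims the turn, so the order in which providers are parsed decides which copy is kept.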
+
+### Cache Layers
+
+Three caches under `~/.cache/codeburn/` (override with `CODEBURN_CACHE_DIR`):
+
+| File | Owner | Invalidation |
+|---|---|---|
+| `codex-results.json` | `src/codex-cache.ts` | `mtimeMs + sizeBytes` per Codex `.jsonl`. |
+| `cursor-results.json` | `src/cursor-cache.ts` | `mtimeMs + sizeBytes` of the Cursor SQLite db. |
+| `daily-cache.json` | `src/daily-cache.ts` | Tracks `lastComputedDate`; new days are backfilled, old days are reused. |
+
+All three use atomic write (temp file + `rename`) and write with mode `0o600`. All three carry a numeric `version` field; bumping it forces a recompute next run.
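+
+The write pattern described above can be sketched as follows. This is an illustration under assumptions, not the code in the cache modules: `writeCacheAtomically` and the temp-file naming are hypothetical, and only Node's built-in `fs`, `os`, and `path` modules are used.
+
+```ts
+import { mkdtempSync, readFileSync, renameSync, writeFileSync } from 'node:fs'
+import { tmpdir } from 'node:os'
+import { dirname, join } from 'node:path'
+
+// Hypothetical helper: write JSON to a temp file, then rename it over the target.
+function writeCacheAtomically(filePath: string, payload: { version: number }): void {
+  // Same directory as the target, so the rename stays on one filesystem.
+  const tmpPath = join(dirname(filePath), `.${process.pid}.${Date.now()}.tmp`)
+  // Mode 0o600: readable and writable by the owner only.
+  writeFileSync(tmpPath, JSON.stringify(payload), { mode: 0o600 })
+  // On POSIX filesystems the rename swaps the file in one step, so a reader
+  // sees either the old content or the new content, never a partial write.
+  renameSync(tmpPath, filePath)
+}
+
+// Usage: write into a throwaway directory and read the result back.
+const scratchDir = mkdtempSync(join(tmpdir(), 'codeburn-cache-'))
+const targetPath = join(scratchDir, 'daily-cache.json')
+writeCacheAtomically(targetPath, { version: 1 })
+const roundTripped = JSON.parse(readFileSync(targetPath, 'utf8')) as { version: number }
+```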
+
+### Optimize Detectors
+
+`src/optimize.ts` exports 14 detectors. Each returns a `WasteFinding | null`. They are composed by `runOptimize()` which collects findings, ranks them by impact, and returns them with `WasteAction` objects (paste-to-CLAUDE.md, paste-to-session-opener, prompt-now, edit shell config).
+
+| Detector | Line | What it catches |
+|---|---|---|
+| `detectJunkReads` | 428 | Reads into `node_modules`, `.git`, `dist`, etc. |
+| `detectDuplicateReads` | 477 | Re-reads of the same file in a session. |
+| `detectMcpToolCoverage` | 795 | MCP servers with many tools but low usage. |
+| `detectUnusedMcp` | 855 | MCP servers configured but never invoked. |
+| `detectBloatedClaudeMd` | 944 | `CLAUDE.md` files past a healthy size. |
+| `detectLowReadEditRatio` | 987 | Edit-heavy sessions with too few prior reads. |
+| `detectCacheBloat` | 1048 | High `cache_creation_input_tokens`. |
+| `detectGhostAgents` | 1124 | Defined but never-invoked Claude agents. |
+| `detectGhostSkills` | 1154 | Defined but never-invoked skills. |
+| `detectGhostCommands` | 1184 | Defined but never-invoked slash commands. |
+| `detectBashBloat` | 1228 | Shell output limit set above the recommended 15K chars. |
+| `detectLowWorthSessions` | 1405 | Sessions with cost but no edits or git delivery. |
+| `detectContextBloat` | 1512 | Input:output token ratio above 25:1. |
+| `detectSessionOutliers` | 1558 | Sessions costing more than 2x the project average. |
+
+### Output Formats
+
+| Command | `--format` choices | Default |
+|---|---|---|
+| `report`, `today`, `month` | `tui`, `json` | `tui` |
+| `status` | `terminal`, `menubar-json`, `json` | `terminal` |
+| `export` | `csv`, `json` | `csv` |
+| `plan` | `text`, `json` | `text` |
+
+The macOS menubar and GNOME extension consume `menubar-json`. `src/menubar-json.ts` defines the contract; `tests/menubar-json.test.ts` pins it.
+
+## Providers (`src/providers/`)
+
+Every provider implements the `Provider` interface in `src/providers/types.ts`:
+
+```ts
+type Provider = {
+  name: string
+  displayName: string
+  modelDisplayName(model: string): string
+  toolDisplayName(rawTool: string): string
+  discoverSessions(): Promise<SessionSource[]>
+  createSessionParser(source: SessionSource, seenKeys: Set<string>): SessionParser
+}
+```
+
+`src/providers/index.ts` registers seventeen providers across two tiers:
+
+- **Eager**: `claude`, `codex`, `copilot`, `droid`, `gemini`, `kilo-code`, `kiro`, `openclaw`, `pi`, `omp`, `qwen`, `roo-code`. Imported at module load.
+- **Lazy**: `antigravity`, `goose`, `cursor`, `opencode`, `cursor-agent`. Imported via dynamic `import()` so the heavy dependencies (SQLite, protobuf) do not touch users who do not have those tools installed.
+
+Both lists hit the same `getAllProviders()` aggregator. A failed lazy import is silent and excludes that provider from the run.
+
+`src/providers/vscode-cline-parser.ts` is a shared helper consumed by `kilo-code` and `roo-code`. It is not registered as a provider on its own.
+
+For the per-provider data location, storage format, parser quirks, and test coverage, see `docs/providers/`.
+
+## macOS Menubar (`mac/`)
+
+Swift package (`mac/Package.swift`), targets macOS 14, strict concurrency on. Layout under `mac/Sources/CodeBurnMenubar/`:
+
+- `CodeBurnApp.swift` boots the SwiftUI `App` and the `NSStatusItem`.
+- `AppStore.swift` is the single source of truth for UI state.
+- `Data/` holds models, the CLI client, credential stores, and subscription services.
+  - `DataClient.swift` spawns the CLI and decodes `MenubarPayload`. See file-level comment for why we never route through `/bin/zsh -c`.
+  - `MenubarPayload.swift` mirrors the JSON the CLI emits; keep it in sync with `src/menubar-json.ts`.
+- `Security/CodeburnCLI.swift` resolves the CLI binary (env override `CODEBURN_BIN`, fallback `codeburn`), validates each argv entry against an allowlist regex, and augments PATH for Homebrew and npm-global installs. The Process is launched via `/usr/bin/env`, never via a shell.
+- `Theme/` holds color and typography constants and the dark/light state.
+- `Views/` are the SwiftUI components rendered inside `NSPopover`.
+
+Tests live in `mac/Tests/CodeBurnMenubarTests/` (currently `CapacityEstimatorTests.swift`).
+
+The build artifact is a zipped `.app` bundle produced by `mac/Scripts/package-app.sh`. See `RELEASING.md` for how the GitHub Actions workflow uses it.
+
+## GNOME Extension (`gnome/`)
+
+Plain JavaScript, no bundler. Targets GNOME Shell 45-50 (`metadata.json`).
+
+- `extension.js` is the entry point. On `enable()` it constructs a `CodeBurnIndicator` and adds it to the panel.
+- `indicator.js` is the popover. It owns the period selector, the insight tabs, and the provider filter.
+- `dataClient.js` wraps `Gio.Subprocess` to call the CLI. It validates argv against the same allowlist pattern as the macOS client and augments PATH with `~/.local/bin`, `~/.npm-global/bin`, `~/.volta/bin`, `~/.bun/bin`, `~/.cargo/bin`, `~/.asdf/shims`, and a few others. Results are cached for 300 seconds.
+- `prefs.js` is the settings dialog backed by `schemas/org.gnome.shell.extensions.codeburn.gschema.xml`.
+- `install.sh` copies the extension into `~/.local/share/gnome-shell/extensions/`.
+
+## Build (`scripts/`, `tsup.config.ts`)
+
+`npm run build` is two steps:
+
+1. `node scripts/bundle-litellm.mjs` fetches the latest litellm pricing JSON and writes `src/data/litellm-snapshot.json`. The bundle script keeps a manual override for MiniMax variants. Direct (un-prefixed) entries win over prefixed ones. The result is checked in so the build is reproducible.
+2. `tsup` reads `tsup.config.ts` and emits a single ESM bundle at `dist/cli.js` with a Node shebang banner. No source maps in publish builds; sourcemaps on for development.
+
+The `prepublishOnly` hook in `package.json` runs `npm run build` so `npm publish` always ships fresh code.
+
+## Tests
+
+`npm test` runs vitest. Forty-one test files live under `tests/`:
+
+- `tests/` root (27 files) covers CLI, parser, optimize, cache, format, models, plans.
+- `tests/security/` (1 file) covers prototype-pollution guards.
+- `tests/providers/` (13 files) covers per-provider parsing.
+- `tests/fixtures/` holds redacted real-world session data.
+
+Five providers ship without test files today: `antigravity`, `claude`, `gemini`, `goose`, `qwen`. Closing this gap is a standing good-first-issue.
+
+CI runs Semgrep against `.semgrep/rules/no-bracket-assign-hot-paths.yml` over `src/providers/` and `src/parser.ts` (`.github/workflows/ci.yml`). CI does not run vitest today; tests are run locally before publish.
diff --git a/docs/providers/README.md b/docs/providers/README.md
new file mode 100644
index 0000000..e57d1b9
--- /dev/null
+++ b/docs/providers/README.md
@@ -0,0 +1,54 @@
+# Provider Docs
+
+One file per provider integration. If you are fixing a bug or adding a feature scoped to a single provider, read the file for that provider first; it tells you which file to edit, where on disk the source data lives, and what edge cases the test suite already covers.
+
+For the architectural picture, see `../architecture.md`.
+
+## Provider Index
+
+### Eager (always loaded)
+
+| Provider | Storage | Source | Test |
+|---|---|---|---|
+| [Claude](claude.md) | JSONL (parsed in `src/parser.ts`) | `src/providers/claude.ts` | none (covered indirectly) |
+| [Codex](codex.md) | JSONL | `src/providers/codex.ts` | `tests/providers/codex.test.ts` |
+| [Copilot](copilot.md) | JSONL | `src/providers/copilot.ts` | `tests/providers/copilot.test.ts` |
+| [Droid](droid.md) | JSONL | `src/providers/droid.ts` | `tests/providers/droid.test.ts` |
+| [Gemini](gemini.md) | JSON / JSONL | `src/providers/gemini.ts` | none |
+| [KiloCode](kilo-code.md) | JSON | `src/providers/kilo-code.ts` | `tests/providers/kilo-code.test.ts` |
+| [Kiro](kiro.md) | JSON | `src/providers/kiro.ts` | `tests/providers/kiro.test.ts` |
+| [OpenClaw](openclaw.md) | JSONL | `src/providers/openclaw.ts` | `tests/providers/openclaw.test.ts` |
+| [Pi](pi.md) | JSONL | `src/providers/pi.ts` | `tests/providers/pi.test.ts` |
+| [OMP](omp.md) | JSONL | `src/providers/pi.ts` | `tests/providers/omp.test.ts` |
+| [Qwen](qwen.md) | JSONL | `src/providers/qwen.ts` | none |
+| [Roo Code](roo-code.md) | JSON | `src/providers/roo-code.ts` | `tests/providers/roo-code.test.ts` |
+
+### Lazy (loaded on first call)
+
+| Provider | Storage | Source | Test |
+|---|---|---|---|
+| [Antigravity](antigravity.md) | protobuf over RPC | `src/providers/antigravity.ts` | none |
+| [Cursor](cursor.md) | SQLite | `src/providers/cursor.ts` | `tests/providers/cursor.test.ts` |
+| [Cursor Agent](cursor-agent.md) | text / JSONL | `src/providers/cursor-agent.ts` | `tests/providers/cursor-agent.test.ts` |
+| [Goose](goose.md) | SQLite | `src/providers/goose.ts` | none |
+| [OpenCode](opencode.md) | SQLite | `src/providers/opencode.ts` | `tests/providers/opencode.test.ts` |
+
+### Shared
+
+| Helper | Used by | Source |
+|---|---|---|
+| [vscode-cline-parser](vscode-cline-parser.md) | `kilo-code`, `roo-code` | `src/providers/vscode-cline-parser.ts` |
+
+## File Format
+
+Each provider doc has the same structure:
+
+1. **One-line summary** of what the provider integrates.
+2. **Where it reads from** on disk (or over RPC).
+3. **Storage format** and validation rules.
+4. **Caching** (which cache layer, if any).
+5. **Deduplication key** so you understand cross-provider dedup.
+6. **Quirks** that have bitten us before.
+7. **When fixing a bug here** as a checklist.
+
+If you add a new provider, copy `claude.md` as a template and fill in your provider's specifics. Update this index, and prefer adding a real test fixture under `tests/providers/`.
diff --git a/docs/providers/antigravity.md b/docs/providers/antigravity.md
new file mode 100644
index 0000000..723cef5
--- /dev/null
+++ b/docs/providers/antigravity.md
@@ -0,0 +1,43 @@
+# Antigravity
+
+Google Antigravity. The only provider that does not read files off disk: it speaks to a local language-server RPC endpoint instead.
+
+- **Source:** `src/providers/antigravity.ts`
+- **Loading:** lazy (`src/providers/index.ts:14-27`). Lazy because the protobuf dependency is heavy.
+- **Test:** none. Mocking the RPC endpoint cleanly is the open issue.
+
+## Where it reads from
+
+A local HTTPS RPC endpoint exposed by Antigravity's language server. The parser:
+
+1. Locates the running language-server process via `ps`.
+2. Reads its port and CSRF token from process metadata.
+3. Calls `GetCascadeTrajectoryGeneratorMetadata` over HTTPS.
+4. Validates the response (capped at 5-15 MB depending on cascade size).
+
+If the language server is not running, the parser falls back to the cached results file (`antigravity.ts:262-272`).
+
+## Storage format
+
+Protobuf. Cascade and response objects map to `ParsedProviderCall` directly; see `antigravity.ts:299-323`.
+
+## Caching
+
+Custom file cache at `$CODEBURN_CACHE_DIR/antigravity-results.json` (defaults to `~/.cache/codeburn/`). The version constant is at `antigravity.ts:12`; the cache machinery (`loadCache`, `flushCache`) lives in `antigravity.ts:75-125`. The cache is also used as the data source when the RPC endpoint is unavailable, not just as an optimization. Bumping the cache version forces a recompute.
+
+## Deduplication
+
+Per `<cascadeId>:<responseId>` (`antigravity.ts:308`).
+
+## Quirks
+
+- **Antigravity is the only provider that requires a live process.** A user who closes Antigravity loses the most-recent data until next launch (the cache covers older runs).
+- The 5-15 MB cap on RPC responses is necessary because individual cascades can balloon. Raising it risks OOM on the user's machine.
+- Token types are split across `inputTokens`, `responseOutputTokens`, and `thinkingOutputTokens` (`antigravity.ts:313-323`). Thinking is billed at output rate.
+
+## When fixing a bug here
+
+1. Reproducing requires Antigravity running locally. There is no fixture for the RPC, which is a real testing gap.
+2. Before any change, capture a sample protobuf response (anonymized) so future regressions can be tested against a recording.
+3. If the bug is "no data after Antigravity update", the protobuf schema may have shifted. The parser's response handling at `antigravity.ts:299-323` is the place to look.
+4. If the bug is "stale data", check whether the RPC is reachable; the cache fallback can mask connectivity issues.
diff --git a/docs/providers/claude.md b/docs/providers/claude.md
new file mode 100644
index 0000000..b0b7b8c
--- /dev/null
+++ b/docs/providers/claude.md
@@ -0,0 +1,43 @@
+# Claude
+
+Anthropic Claude Code CLI and Claude Desktop's local agent mode.
+
+- **Source:** `src/providers/claude.ts`
+- **Loading:** eager (`src/providers/index.ts:1`)
+- **Test:** none directly. Coverage comes from `tests/parser-claude-cwd.test.ts`, `tests/parser-filter.test.ts`, and `tests/parser-mcp-inventory.test.ts`, which exercise `src/parser.ts` end-to-end against fixture session files.
+
+## Where it reads from
+
+| Source | Path |
+|---|---|
+| Claude Code CLI | `$CLAUDE_CONFIG_DIR` if set, otherwise `~/.claude/projects/` |
+| Claude Desktop (macOS) | `~/Library/Application Support/Claude/local-agent-mode-sessions/` |
+| Claude Desktop (Windows) | `%APPDATA%/Claude/local-agent-mode-sessions/` |
+| Claude Desktop (Linux) | `~/.config/Claude/local-agent-mode-sessions/` |
+
+For Desktop, `findDesktopProjectDirs` walks up to 8 levels deep looking for `projects/` subdirectories, skipping `node_modules` and `.git`.
+
+## Storage format
+
+JSONL, one event per line, per session file. Sessions live under `<project>/<sessionId>.jsonl`.
+
+## Parser
+
+`createSessionParser` returns an empty async generator (`claude.ts:101-105`). Claude is a special case: `src/parser.ts` reads Claude JSONL files directly with full turn grouping, dedup of streaming message IDs, and MCP tool inventory extraction. The provider object exists only so `discoverSessions` can return Claude session sources alongside the others.
+
+## Caching
+
+None at the provider level. The daily aggregation cache (`src/daily-cache.ts`) reuses prior computed days.
+
+## Quirks
+
+- The parser is in `src/parser.ts`, not in `src/providers/claude.ts`. Anything that changes Claude parsing belongs in `parser.ts`.
+- Streaming responses produce duplicate message IDs across resumed sessions; `parser.ts` strips them via the global `seenMsgIds` Set.
+- Model display names are mapped in `claude.ts:7-20`; add new versions there when Anthropic releases them.
+
+## When fixing a bug here
+
+1. Confirm whether the bug is in **discovery** (sessions not picked up) or **parsing** (sessions found but data wrong).
+2. Discovery bugs live in `claude.ts:78-99`. Verify the directory layout you expect actually matches what Claude writes today.
+3. Parsing bugs live in `src/parser.ts`. Look for `parseSessionFile`, `groupIntoTurns`, and `dedupeStreamingMessageIds`.
+4. Add a fixture under `tests/fixtures/` and a test under `tests/parser-claude-cwd.test.ts` (or a new file). Do not mock the filesystem.
diff --git a/docs/providers/codex.md b/docs/providers/codex.md
new file mode 100644
index 0000000..268fd35
--- /dev/null
+++ b/docs/providers/codex.md
@@ -0,0 +1,50 @@
+# Codex
+
+OpenAI Codex CLI.
+
+- **Source:** `src/providers/codex.ts`
+- **Loading:** eager (`src/providers/index.ts:2`)
+- **Test:** `tests/providers/codex.test.ts` (374 lines)
+
+## Where it reads from
+
+`$CODEX_HOME` if set, otherwise `~/.codex`. Sessions are nested by date:
+
+```
+~/.codex/sessions/<YYYY>/<MM>/<DD>/rollout-*.jsonl
+```
+
+The discovery walk uses strict regex (`^\d{4}$`, `^\d{2}$`) on each path component.
+
+## Storage format
+
+JSONL. The first line must be a `session_meta` entry with `payload.originator` starting with `codex` (case-insensitive). Files that fail this check are silently skipped.
+
+The first line read is capped at 1 MB (`FIRST_LINE_READ_CAP`). Codex CLI 0.128+ embeds the full system prompt in `session_meta`, which can run 20-27 KB; the cap leaves headroom while bounding memory if a corrupt file has no newline.
+
+## Caching
+
+`src/codex-cache.ts` writes `~/.cache/codeburn/codex-results.json` (or `$CODEBURN_CACHE_DIR/codex-results.json`). Each entry is keyed by absolute file path and validated against `mtimeMs + sizeBytes`. Cached entries are returned wholesale.
+
+A session that yielded zero parseable lines does **not** write to the cache (`codex.ts:419`); this prevents a transient read failure from pinning an empty result against a fingerprint.
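+
+The fingerprint check and the skip-on-empty rule can be sketched as below. This is a minimal illustration under assumed shapes: `Fingerprint`, `CacheEntry`, `lookup`, and `store` are hypothetical names, not the actual exports of `src/codex-cache.ts`.
+
+```typescript
+// Hypothetical shapes; the real cache file layout may differ.
+type Fingerprint = { mtimeMs: number; sizeBytes: number };
+type CacheEntry<T> = Fingerprint & { calls: T[] };
+
+// Return cached calls only when the file's current fingerprint matches.
+function lookup<T>(
+  cache: Map<string, CacheEntry<T>>,
+  absPath: string,
+  current: Fingerprint,
+): T[] | undefined {
+  const entry = cache.get(absPath);
+  if (!entry) return undefined;
+  const fresh =
+    entry.mtimeMs === current.mtimeMs && entry.sizeBytes === current.sizeBytes;
+  return fresh ? entry.calls : undefined;
+}
+
+// Skip the write when nothing parsed, so a transient read failure
+// cannot pin an empty result to a fingerprint.
+function store<T>(
+  cache: Map<string, CacheEntry<T>>,
+  absPath: string,
+  current: Fingerprint,
+  calls: T[],
+): void {
+  if (calls.length === 0) return;
+  cache.set(absPath, { mtimeMs: current.mtimeMs, sizeBytes: current.sizeBytes, calls });
+}
+```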
+
+## Deduplication
+
+Keys take the form `codex:<sessionId>:<timestamp>:<cumulativeTotal>` for accounted events and `codex:<sessionId>:<timestamp>:est<n>` for estimated events that fall back to char-counting.
+
+## Quirks
+
+- Codex CLI emits both `last_token_usage` (per turn) and `total_token_usage` (cumulative). The parser handles three modes:
+  1. `last_token_usage` present: use it directly.
+  2. Only cumulative: compute deltas against the prior turn.
+  3. Neither: estimate from message text length (`CHARS_PER_TOKEN = 4`).
+- `prevCumulativeTotal` is initialized to `null`, not `0`. A session whose first event reports `total = 0` would otherwise be dropped as a "duplicate" of the initial state.
+- `prev*` token counters are advanced on **every** event, including ones that used `last_token_usage`. Earlier code only updated them on the fallback branch, which double-counted any session that mixed modes.
+- OpenAI counts cached tokens **inside** `input_tokens`. The parser subtracts them so the rest of the codebase can assume Anthropic semantics (cached are separate).
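+
+The three token modes and the counter advancement described above can be sketched as follows. This is an illustration only: `CodexEvent` is a hypothetical, flattened shape that carries totals rather than the real per-field usage objects, `step` is not a function in `codex.ts`, and rounding the estimate up is an assumption.
+
+```typescript
+const CHARS_PER_TOKEN = 4;
+
+// Hypothetical event shape: lastTotal stands in for last_token_usage,
+// cumulativeTotal for total_token_usage.
+type CodexEvent = { lastTotal?: number; cumulativeTotal?: number; text: string };
+type Step = { tokens: number; estimated: boolean; prev: number | null };
+
+function step(event: CodexEvent, prev: number | null): Step {
+  let tokens: number;
+  let estimated = false;
+  if (event.lastTotal !== undefined) {
+    tokens = event.lastTotal; // mode 1: per-turn usage reported
+  } else if (event.cumulativeTotal !== undefined) {
+    tokens = event.cumulativeTotal - (prev ?? 0); // mode 2: delta against prior turn
+  } else {
+    tokens = Math.ceil(event.text.length / CHARS_PER_TOKEN); // mode 3: estimate
+    estimated = true;
+  }
+  // Advance the counter whenever a cumulative total is present, including
+  // in mode 1, so a later mode-2 delta is not double-counted.
+  return { tokens, estimated, prev: event.cumulativeTotal ?? prev };
+}
+```
+
+Starting `prev` at `null` rather than `0` keeps "no prior event" distinguishable from "prior total was zero".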
+
+## When fixing a bug here
+
+1. Reproduce against a real `rollout-*.jsonl` if you can. Drop a redacted copy under `tests/fixtures/codex/` and reference it from `tests/providers/codex.test.ts`.
+2. If the bug is "zero tokens reported", first check whether the file is being skipped by `isValidCodexSession`.
+3. If the bug is "tokens counted twice", look at `prevCumulativeTotal` and the prev-counter advancement.
+4. If you change the dedup key shape, run `tests/providers/codex.test.ts` and `tests/parser-filter.test.ts` together; cross-provider dedup happens via the global `seenKeys` Set.
diff --git a/docs/providers/copilot.md b/docs/providers/copilot.md
new file mode 100644
index 0000000..a02198e
--- /dev/null
+++ b/docs/providers/copilot.md
@@ -0,0 +1,49 @@
+# Copilot
+
+GitHub Copilot Chat (CLI and VS Code extension transcripts).
+
+- **Source:** `src/providers/copilot.ts`
+- **Loading:** eager (`src/providers/index.ts:3`)
+- **Test:** `tests/providers/copilot.test.ts` (401 lines)
+
+## Where it reads from
+
+Two locations. Both are walked on every run; results merge.
+
+1. **Legacy CLI sessions:** `~/.copilot/session-state/`
+2. **VS Code transcripts:** `~/Library/Application Support/Code/User/workspaceStorage/<hash>/GitHub.copilot-chat/transcripts/` and equivalents on Windows / Linux
+
+## Storage format
+
+JSONL in both locations, but the schemas differ. The parser switches by detecting which schema the first event uses (`copilot.ts:83-159` for legacy, `copilot.ts:215-293` for transcripts).
+
+## Caching
+
+None at the provider level.
+
+## Deduplication
+
+Per `messageId` in both formats (`copilot.ts:118` for legacy, `copilot.ts:245` for transcripts).
+
+## Model inference
+
+Copilot does not always tag the model on each message. The parser infers it from the tool-call ID prefix:
+
+| Prefix | Inferred model family |
+|---|---|
+| `toolu_bdrk_`, `toolu_vrtx_`, `tooluse_`, `toolu_` | Anthropic |
+| `call_` | OpenAI |
+
+See `copilot.ts:176-213`.
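+
+A minimal sketch of the prefix lookup, using only the prefixes from the table above. `ModelFamily` and `inferModelFamily` are hypothetical names; the real code in `copilot.ts` may return concrete model IDs rather than a family.
+
+```typescript
+type ModelFamily = "anthropic" | "openai";
+
+// Prefixes from the table above, checked in order.
+const PREFIXES: Array<[string, ModelFamily]> = [
+  ["toolu_bdrk_", "anthropic"],
+  ["toolu_vrtx_", "anthropic"],
+  ["tooluse_", "anthropic"],
+  ["toolu_", "anthropic"],
+  ["call_", "openai"],
+];
+
+// Returns undefined when no known prefix matches, so the caller can
+// decide how to label the call.
+function inferModelFamily(toolCallId: string): ModelFamily | undefined {
+  for (const [prefix, family] of PREFIXES) {
+    if (toolCallId.startsWith(prefix)) return family;
+  }
+  return undefined;
+}
+```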
+
+## Quirks
+
+- `toolRequests` can be missing or non-array on older sessions; the parser guards against that (`copilot.ts:126`, `:260`).
+- When `outputTokens` is missing the parser falls back to char-counting (`CHARS_PER_TOKEN = 4`, `copilot.ts:252-254`).
+- A single chat may be mirrored across both legacy and transcript paths if the user upgraded; because both formats dedup on `messageId`, the mirrored copies collapse into one.
+
+## When fixing a bug here
+
+1. Determine which schema reproduces the bug. The two parsers share little code on purpose; do not unify them unless you understand both formats.
+2. If the model is misidentified, look at the tool-call ID prefix list and consider whether a new prefix should be added.
+3. New fixtures go under `tests/fixtures/copilot/` and are referenced from `tests/providers/copilot.test.ts`.
diff --git a/docs/providers/cursor-agent.md b/docs/providers/cursor-agent.md
new file mode 100644
index 0000000..d77775b
--- /dev/null
+++ b/docs/providers/cursor-agent.md
@@ -0,0 +1,41 @@
+# Cursor Agent
+
+Cursor's background agent transcripts (separate from the regular chat).
+
+- **Source:** `src/providers/cursor-agent.ts`
+- **Loading:** lazy (`src/providers/index.ts:62-87`)
+- **Test:** `tests/providers/cursor-agent.test.ts` (243 lines)
+
+## Where it reads from
+
+`~/.cursor/projects/<projectId>/agent-transcripts/`. Inside each project, two layouts coexist:
+
+1. **Legacy:** `*.txt` flat files.
+2. **Composer 2:** UUID-named subdirectories, each containing JSONL.
+
+Subagents (delegated runs) live in `subagents/` subdirectories under the parent (`cursor-agent.ts:479-490`). They are picked up too.
+
+## Storage format
+
+- Legacy: free-form text transcripts. The parser does line-based heuristic parsing (`cursor-agent.ts:219-314`).
+- Composer 2: JSONL (`cursor-agent.ts:167-217`).
+
+## Caching
+
+None at the provider level. Conversation metadata is read from the same SQLite db the Cursor provider uses (`state.vscdb`), specifically the `conversation_summaries` table (`cursor-agent.ts:46-50`). If the summary is missing, file mtime is used as the timestamp.
+
+## Deduplication
+
+Per `<provider>:<conversationId>:<turnIndex>` (`cursor-agent.ts:379`).
+
+## Quirks
+
+- A UUID-shaped file name is used as the conversation ID directly (`cursor-agent.ts:142-143`); for any other name, the ID is derived from the parent directory.
+- Token counts are estimated from char count (`CHARS_PER_TOKEN = 4`, `cursor-agent.ts:35`, `:81-84`). The legacy text format never reports real tokens.
+- The text parser is regex-driven and brittle. It is easier to fix a Composer 2 (JSONL) bug than a legacy (text) bug.
+
+## When fixing a bug here
+
+1. Check which format the failing transcript uses before opening a fix.
+2. For text-format bugs, copy the redacted transcript verbatim into `tests/fixtures/cursor-agent/` so the regex change can be regression-tested.
+3. If the bug is "wrong project", look at `cursor-agent.ts:46-50` and whether a `conversation_summaries` row exists for the conversation.
diff --git a/docs/providers/cursor.md b/docs/providers/cursor.md
new file mode 100644
index 0000000..8ccf6c4
--- /dev/null
+++ b/docs/providers/cursor.md
@@ -0,0 +1,50 @@
+# Cursor
+
+Cursor IDE chat history.
+
+- **Source:** `src/providers/cursor.ts`
+- **Loading:** lazy (`src/providers/index.ts:44-57`). The `node:sqlite` import is the heavy dependency that justifies lazy loading.
+- **Test:** `tests/providers/cursor.test.ts` (77 lines), `tests/providers/cursor-bubble-dedup.test.ts` (176 lines)
+
+## Where it reads from
+
+A single SQLite database per platform:
+
+| Platform | Path |
+|---|---|
+| macOS | `~/Library/Application Support/Cursor/User/globalStorage/state.vscdb` |
+| Windows | `%APPDATA%/Cursor/User/globalStorage/state.vscdb` |
+| Linux | `~/.config/Cursor/User/globalStorage/state.vscdb` |
+
+## Storage format
+
+SQLite. Two parallel sources within the same db:
+
+1. **Bubbles** (`cursor.ts:201-331`): per-message rows. The richer source.
+2. **agentKv** (`cursor.ts:350-460`): per-conversation key-value blobs. The fallback for older sessions.
+
+The parser tries both and dedupes via `seenKeys`.
+
+## Caching
+
+`src/cursor-cache.ts` writes `~/.cache/codeburn/cursor-results.json` (override with `$CODEBURN_CACHE_DIR`). The fingerprint is `dbMtimeMs + dbSizeBytes` of `state.vscdb`. Atomic write via temp + rename.
+
+## Deduplication
+
+- Bubbles: per `bubbleId` (`cursor.ts:282`).
+- agentKv: per `requestId` (`cursor.ts:429`).
+
+## Quirks
+
+- **180-day lookback.** The bubbles query bounds itself to the trailing 180 days (`cursor.ts:205`). Older history is ignored. If a user reports "Cursor data missing", confirm the date range first.
+- **250 000 bubble cap.** Power users with massive history are capped to prevent unbounded memory. If you need to raise this, also raise the cache size budget.
+- **Per-conversation user-message queue.** The parser caches the user-message stream per conversation to avoid an O(n) shift on every turn (`cursor.ts:171-191`).
+- **agentKv has no per-message timestamp.** The DB file's mtime is used as the timestamp for every agentKv-derived call (`cursor.ts:358-363`). This is wrong but consistent.
+- **Cursor v3 reports zero token counts.** The parser falls back to char-counting (`CHARS_PER_TOKEN = 4`) for those rows (`cursor.ts:265-272`).
+
+## When fixing a bug here
+
+1. **Always reproduce against a fixture, not a real db.** SQLite over the live db is racy; the user might be using Cursor while you read.
+2. If the bug is "tokens are zero", check whether the row is a v3 zero-token bubble, in which case the char-fallback should kick in.
+3. If the bug is "duplicate counts", check both `bubbleId` dedup and the cross-provider `seenKeys` dedup.
+4. Cache poisoning is the most common failure mode after a Cursor schema change. Bump `CURSOR_CACHE_VERSION` in `src/cursor-cache.ts` so old caches are invalidated.
diff --git a/docs/providers/droid.md b/docs/providers/droid.md
new file mode 100644
index 0000000..b8288e5
--- /dev/null
+++ b/docs/providers/droid.md
@@ -0,0 +1,36 @@
+# Droid
+
+Factory's Droid CLI.
+
+- **Source:** `src/providers/droid.ts`
+- **Loading:** eager (`src/providers/index.ts:4`)
+- **Test:** `tests/providers/droid.test.ts` (148 lines)
+
+## Where it reads from
+
+`$FACTORY_DIR` if set, otherwise `~/.factory/sessions/<subdir>/*.jsonl`.
+
+The parser ignores the `.factory/` directory itself (`droid.ts:293-296`); some installs nest it accidentally.
+
+## Storage format
+
+JSONL.
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `messageId` within a session (`droid.ts:253`).
+
+## Quirks
+
+- **Token totals are session-level only.** Droid does not report per-message tokens. The parser reads `settings.tokenUsage` once per session and **splits it evenly** across all assistant calls, with the remainder added to the last call (`droid.ts:223-251`). This is approximate but consistent.
+- Project name is derived from the session's `cwd`. If the cwd contains `projects/<name>`, that name is preferred over the basename (`droid.ts:299-319`).
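+
+The even split with the remainder on the last call can be sketched like this. `splitEvenly` is a hypothetical helper for illustration, not the function in `droid.ts`.
+
+```typescript
+// Split a session-level token total across callCount assistant calls.
+// Every call gets the floor of the average; the last call also absorbs
+// the remainder, so the shares always sum to the original total.
+function splitEvenly(total: number, callCount: number): number[] {
+  if (callCount <= 0) return [];
+  const base = Math.floor(total / callCount);
+  const shares = new Array<number>(callCount).fill(base);
+  shares[callCount - 1] = base + (total - base * callCount);
+  return shares;
+}
+```
+
+For example, `splitEvenly(10, 3)` yields `[3, 3, 4]`.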
+
+## When fixing a bug here
+
+1. If the bug is "tokens unevenly attributed", that is by design. The session-level total is the only signal Droid emits.
+2. If the bug is "no sessions found", confirm the user does not have `$FACTORY_DIR` pointing somewhere unexpected.
+3. New fixtures go under `tests/fixtures/droid/`.
diff --git a/docs/providers/gemini.md b/docs/providers/gemini.md
new file mode 100644
index 0000000..b411d23
--- /dev/null
+++ b/docs/providers/gemini.md
@@ -0,0 +1,35 @@
+# Gemini
+
+Google Gemini CLI.
+
+- **Source:** `src/providers/gemini.ts`
+- **Loading:** eager (`src/providers/index.ts:5`)
+- **Test:** none. Adding a fixture-based test is a known good first issue.
+
+## Where it reads from
+
+`~/.gemini/tmp/<project>/chats/session-*.json` and `session-*.jsonl` (`gemini.ts:218-252`).
+
+## Storage format
+
+Either a single JSON document per session or JSONL, depending on Gemini CLI version. The parser sniffs the first non-whitespace character to decide (`gemini.ts:197-206`).
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `sessionId` (`gemini.ts:72`). Gemini sessions are aggregated to a single call per session.
+
+## Quirks
+
+- **Cached tokens are a subset of input.** Gemini includes cached tokens in `promptTokenCount`. The parser subtracts them so callers see Anthropic semantics (cached are separate).
+- **Thoughts are billed at output rate** (`gemini.ts:125`).
+- Each session collapses to one `ParsedProviderCall`. If you need per-turn data, the upstream format does not support it without re-parsing the prompt history.
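+
+The cached-token subtraction can be sketched as below. `NormalizedTokens` and `normalizeInput` are hypothetical names, and the clamp against malformed data is an assumption, not something `gemini.ts` is confirmed to do.
+
+```typescript
+type NormalizedTokens = { input: number; cacheRead: number };
+
+// Convert "cached is a subset of prompt" into "cached is separate".
+function normalizeInput(promptTokenCount: number, cachedTokens: number): NormalizedTokens {
+  // Assumed safeguard: never let cached exceed the prompt count,
+  // which would otherwise produce a negative input figure.
+  const cacheRead = Math.min(cachedTokens, promptTokenCount);
+  return { input: promptTokenCount - cacheRead, cacheRead };
+}
+```
+
+With this shape, a session reporting 1000 prompt tokens of which 300 were cached is recorded as 700 input tokens and 300 cache-read tokens.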
+
+## When fixing a bug here
+
+1. The lack of a test file is a hazard. **Add a fixture and a test before changing parsing logic** so future regressions are caught.
+2. If the bug involves a new Gemini version's schema, sniff with the same first-character heuristic; do not call `JSON.parse` on the whole file.
+3. If the bug is "Gemini sessions report less than expected", check whether the cached-token subtraction is over-correcting.
diff --git a/docs/providers/goose.md b/docs/providers/goose.md
new file mode 100644
index 0000000..d203d55
--- /dev/null
+++ b/docs/providers/goose.md
@@ -0,0 +1,42 @@
+# Goose
+
+Block's Goose CLI.
+
+- **Source:** `src/providers/goose.ts`
+- **Loading:** lazy (`src/providers/index.ts:29-42`)
+- **Test:** none. Adding a fixture-based test is a known good first issue.
+
+## Where it reads from
+
+A SQLite database. Path resolution honors `XDG_DATA_HOME` and a `GOOSE_PATH_ROOT` override:
+
+| Platform | Default path |
+|---|---|
+| macOS / Linux | `~/.local/share/goose/sessions/sessions.db` |
+| Windows | `%APPDATA%/Block/goose/sessions/sessions.db` |
+
+See `goose.ts:52-62`.
+
+## Storage format
+
+SQLite.
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `sessionId` (`goose.ts:174`).
+
+## Quirks
+
+- Source paths are encoded as `<dbPath>:<sessionId>` so a single db can yield many session sources. The discovery code splits on the last colon (`goose.ts:148-150`).
+- Tool inventory comes from the `messages` table queried with `LIKE '%toolRequest%'` (`goose.ts:90`). This will miss tools whose payloads are encoded differently in a future Goose version.
+- Tokens are read directly from `accumulated_input_tokens` and `accumulated_output_tokens`. No estimation.
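+
+Splitting on the last colon matters because a Windows db path already contains a drive-letter colon. A minimal sketch, assuming session IDs never contain a colon; `splitSource` is a hypothetical name and throwing on malformed input is an illustrative choice:
+
+```typescript
+// Decode a "<dbPath>:<sessionId>" source string.
+function splitSource(source: string): { dbPath: string; sessionId: string } {
+  const i = source.lastIndexOf(":");
+  if (i === -1) {
+    throw new Error(`not a <dbPath>:<sessionId> source: ${source}`);
+  }
+  return { dbPath: source.slice(0, i), sessionId: source.slice(i + 1) };
+}
+```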
+
+## When fixing a bug here
+
+1. Add a fixture-based test before changing logic. `tests/providers/goose.test.ts` does not exist yet; create it and use a small SQLite file under `tests/fixtures/goose/`.
+2. If the bug is "no sessions", check `XDG_DATA_HOME` and `GOOSE_PATH_ROOT` first; users on non-default Linux setups will not match the default path.
+3. The `LIKE '%toolRequest%'` query is fragile. If Goose changes the message envelope, this is where it will break.
diff --git a/docs/providers/kilo-code.md b/docs/providers/kilo-code.md
new file mode 100644
index 0000000..188465f
--- /dev/null
+++ b/docs/providers/kilo-code.md
@@ -0,0 +1,34 @@
+# KiloCode
+
+KiloCode VS Code extension.
+
+- **Source:** `src/providers/kilo-code.ts`
+- **Loading:** eager (`src/providers/index.ts:6`)
+- **Test:** `tests/providers/kilo-code.test.ts` (62 lines)
+
+## Where it reads from
+
+VS Code extension globalStorage for `kilocode.kilo-code` (extension ID set at `kilo-code.ts:4`). The actual walk is delegated to `discoverClineTasks` in `src/providers/vscode-cline-parser.ts`.
+
+## Storage format
+
+Per-task directories with `ui_messages.json` and `api_conversation_history.json`. See [`vscode-cline-parser`](vscode-cline-parser.md) for the full schema description.
+
+## Caching
+
+None at the provider level; delegates to the shared helper.
+
+## Deduplication
+
+Delegated. Per `<providerName>:<taskId>:<index>` (handled in `vscode-cline-parser.ts:109`).
+
+## Quirks
+
+- This file is a thin wrapper. Almost every bug for KiloCode actually lives in `vscode-cline-parser.ts`.
+- The two providers using the cline parser (KiloCode and Roo Code) differ **only** by extension ID.
+
+## When fixing a bug here
+
+1. If the bug is "KiloCode and Roo Code both broken in the same way", fix it in `vscode-cline-parser.ts`.
+2. If the bug is "KiloCode broken, Roo Code fine", the difference is upstream (KiloCode's emitted JSON differs slightly). Reproduce with a fixture and consider whether the cline parser needs to branch on extension ID.
+3. Read [`vscode-cline-parser.md`](vscode-cline-parser.md) before editing.
diff --git a/docs/providers/kiro.md b/docs/providers/kiro.md
new file mode 100644
index 0000000..0c450fb
--- /dev/null
+++ b/docs/providers/kiro.md
@@ -0,0 +1,44 @@
+# Kiro
+
+Kiro IDE chat history.
+
+- **Source:** `src/providers/kiro.ts`
+- **Loading:** eager (`src/providers/index.ts:7`)
+- **Test:** `tests/providers/kiro.test.ts` (328 lines)
+
+## Where it reads from
+
+VS Code-style globalStorage at `kiro.kiroagent`:
+
+| Platform | Path |
+|---|---|
+| macOS | `~/Library/Application Support/Kiro/User/globalStorage/kiro.kiroagent` |
+| Windows | `%APPDATA%/Kiro/User/globalStorage/kiro.kiroagent` |
+| Linux | `~/.config/Kiro/User/globalStorage/kiro.kiroagent` |
+
+Sessions are `.chat` files under hash-named subdirectories. Discovery is in `kiro.ts:215-247`; the path-resolution helpers it uses start at `kiro.ts:164`.
+
+## Storage format
+
+JSON `.chat` files (`kiro.ts:153`).
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `executionId` (`kiro.ts:104`).
+
+## Quirks
+
+- **Workspace hash resolution** is non-trivial. The parser tries `workspace.json` first; if that fails, it base64-decodes the directory name to recover the workspace path (`kiro.ts:198-213`).
+- **Model ID normalization.** Kiro stores models like `claude-1.2`; the parser rewrites the dot to a hyphen so they match `claude-1-2` in the pricing snapshot (`kiro.ts:65-67`). Add new versions here when Kiro ships them.
+- **Tool name extraction is regex-driven.** Kiro embeds tool calls inside the message text as `<tool_use><name>...</name>` (`kiro.ts:69-78`). Brittle but unavoidable until Kiro emits structured tool data.
+- Token counts are estimated via char count (`CHARS_PER_TOKEN = 4`, `kiro.ts:9`, `:108-109`).
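+
+The model-ID rewrite and the tool-name extraction might look roughly like the sketch below. Both regexes are assumptions written from the description above, not copies of what `kiro.ts` uses, and the function names are hypothetical.
+
+```typescript
+// Rewrite a dot between two digits to a hyphen: "claude-1.2" -> "claude-1-2".
+function normalizeModelId(id: string): string {
+  return id.replace(/(\d)\.(\d)/g, "$1-$2");
+}
+
+// Collect tool names embedded as <tool_use><name>...</name> in message text.
+function extractToolNames(text: string): string[] {
+  const names: string[] = [];
+  const re = /<tool_use>\s*<name>([^<]+)<\/name>/g;
+  let m: RegExpExecArray | null;
+  while ((m = re.exec(text)) !== null) {
+    names.push(m[1].trim());
+  }
+  return names;
+}
+```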
+
+## When fixing a bug here
+
+1. If the bug is "wrong workspace", check the base64 fallback path. Some users name their workspaces with characters that are not valid base64.
+2. If the bug is "missing model in pricing", add the model to the normalization map at `kiro.ts:65-67` and verify against `tests/providers/kiro.test.ts`.
+3. If the bug is "tools missing", look at the regex at `kiro.ts:69-78`. Kiro changes its envelope occasionally.
diff --git a/docs/providers/omp.md b/docs/providers/omp.md
new file mode 100644
index 0000000..4546a2f
--- /dev/null
+++ b/docs/providers/omp.md
@@ -0,0 +1,34 @@
+# OMP
+
+OMP CLI. Same parser as Pi, different data directory.
+
+- **Source:** `src/providers/pi.ts` (the `omp` export)
+- **Loading:** eager (`src/providers/index.ts:9`)
+- **Test:** `tests/providers/omp.test.ts` (225 lines)
+
+## Where it reads from
+
+`~/.omp/agent/sessions/` (`pi.ts:59-61`).
+
+## Storage format
+
+JSONL, identical schema to Pi.
+
+## Caching
+
+None.
+
+## Deduplication
+
+Identical to Pi: `<provider>:<path>:<responseId>` with timestamp / line-index fallbacks (`pi.ts:164`).
+
+## Quirks
+
+- OMP and Pi share the **same** `createParser` function. The provider object differs only in name, displayName, and the discovery directory.
+- If OMP and Pi diverge in a future release, do **not** copy-paste the parser. Add a discriminator to `createParser` and branch.
+
+## When fixing a bug here
+
+1. Check if the bug also reproduces against Pi. If yes, fix both with one change; the parser is shared.
+2. If the bug is OMP-specific, the right fix is usually to pass an option into `createParser` rather than to fork the file.
+3. Read [`pi.md`](pi.md) for the parser-level details.
diff --git a/docs/providers/openclaw.md b/docs/providers/openclaw.md
new file mode 100644
index 0000000..255b736
--- /dev/null
+++ b/docs/providers/openclaw.md
@@ -0,0 +1,41 @@
+# OpenClaw
+
+OpenClaw, plus the older Clawdbot / Moltbot / Moldbot lineage.
+
+- **Source:** `src/providers/openclaw.ts`
+- **Loading:** eager (`src/providers/index.ts:8`)
+- **Test:** `tests/providers/openclaw.test.ts` (192 lines)
+
+## Where it reads from
+
+Four directories, all checked on every run (`openclaw.ts:62-70`):
+
+- `~/.openclaw/agents`
+- `~/.clawdbot/agents`
+- `~/.moltbot/agents`
+- `~/.moldbot/agents`
+
+The legacy directories are kept for users who upgraded from older builds.
+
+## Storage format
+
+JSONL (`openclaw.ts:242`). Each agents directory has a `sessions.json` index file plus per-session `.jsonl` files. The parser reads the index when present and falls back to a directory scan if it is missing or stale (`openclaw.ts:220-247`).
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `<sessionId>:<dedupId>` (`openclaw.ts:169`).
+
+## Quirks
+
+- **Cost is preferred from the provider when reported.** OpenClaw emits `costUSD` in `message.usage`; the parser uses it directly when present (`openclaw.ts:174-177`) and only computes from tokens when it is missing.
+- Tokens are reported across `input`, `output`, `cacheRead`, and `cacheWrite`. Anthropic semantics throughout, no normalization needed.
+
+## When fixing a bug here
+
+1. If the bug is "session not found", check all four directories, including the three legacy ones. A user might have a stray `~/.moltbot/` that the parser is reading instead of the real `~/.openclaw/`.
+2. If the bug is "wrong cost", confirm whether `costUSD` is present in the source data; the parser trusts it over its own calculation.
+3. The `sessions.json` index can drift when a session crashes partway through. Make sure the directory-scan fallback triggers in those cases.
diff --git a/docs/providers/opencode.md b/docs/providers/opencode.md
new file mode 100644
index 0000000..0251fcd
--- /dev/null
+++ b/docs/providers/opencode.md
@@ -0,0 +1,36 @@
+# OpenCode
+
+OpenCode (sst/opencode).
+
+- **Source:** `src/providers/opencode.ts`
+- **Loading:** lazy (`src/providers/index.ts:59-75`)
+- **Test:** `tests/providers/opencode.test.ts` (558 lines, the largest provider test)
+
+## Where it reads from
+
+Default `~/.local/share/opencode/` or `$XDG_DATA_HOME/opencode/`. The discovery walk picks up `opencode*.db` files (`opencode.ts:71-88`).
+
+## Storage format
+
+SQLite.
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `<sessionId>:<messageId>` (`opencode.ts:242`).
+
+## Quirks
+
+- **Schema validation is loud.** When a required table is missing, the parser logs an actionable warning telling the user which table is gone and what version of OpenCode it expects (`opencode.ts:104-131`). This is the right behavior; do not silently swallow these.
+- Source paths are encoded as `<dbPath>:<sessionId>` (`opencode.ts:147-150`).
+- Each message's `parts` are indexed (`opencode.ts:177-191`); preserving the order matters for reasoning-token correctness.
+- Tokens are reported across `input`, `output`, `reasoning`, `cache.read`, and `cache.write`. Anthropic semantics.
+
+## When fixing a bug here
+
+1. The 558-line test suite catches a lot. Run `npm test -- tests/providers/opencode.test.ts` before and after any change.
+2. If the bug is a "missing table" warning, do not catch and silence it. Either upgrade the version expectation in the parser or document the breaking schema change.
+3. If the bug is "reasoning tokens off by one", check the parts index ordering.
diff --git a/docs/providers/pi.md b/docs/providers/pi.md
new file mode 100644
index 0000000..9427226
--- /dev/null
+++ b/docs/providers/pi.md
@@ -0,0 +1,35 @@
+# Pi
+
+Pi agent CLI.
+
+- **Source:** `src/providers/pi.ts`
+- **Loading:** eager (`src/providers/index.ts:9`)
+- **Test:** `tests/providers/pi.test.ts` (336 lines)
+
+## Where it reads from
+
+`~/.pi/agent/sessions/` (`pi.ts:55-57`).
+
+## Storage format
+
+JSONL (`pi.ts:98`).
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `<provider>:<path>:<responseId>` when a response ID is present, falling back to the entry timestamp, and finally to a line index (`pi.ts:164`).
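+
+The fallback chain can be sketched as below. `Entry` and `dedupKey` are hypothetical names, and the `line-<n>` format of the final fallback is an assumption; only the order (response ID, then timestamp, then line index) comes from the description above.
+
+```typescript
+// Hypothetical entry shape; only the fields the key needs.
+type Entry = { responseId?: string; timestamp?: string };
+
+function dedupKey(provider: string, path: string, entry: Entry, lineIndex: number): string {
+  // Prefer the response ID, then the entry timestamp, then the line index.
+  const id = entry.responseId ?? entry.timestamp ?? `line-${lineIndex}`;
+  return `${provider}:${path}:${id}`;
+}
+```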
+
+## Quirks
+
+- Undefined token fields in `message.usage` are coerced to `0` (`pi.ts:156-159`); never `undefined`.
+- The provider name is taken from `source.provider` (`pi.ts:182`), not hard-coded. This matters because `pi.ts` is the parser for **both** Pi and OMP; see [`omp.md`](omp.md).
+- Tool-call content type is extracted from the message envelope (`pi.ts:169-176`).
+
+## When fixing a bug here
+
+1. If you change parsing logic, also run `tests/providers/omp.test.ts` because OMP shares this code.
+2. If the bug is "tokens are NaN", look at the coercion at `pi.ts:156-159`. A regression on this is silent and easy to miss.
+3. If the bug is specific to the dedup behavior, add a temporary log to determine which of the three key forms (response ID, timestamp, or line index) was used; the keys collide differently for old vs. new Pi versions.
diff --git a/docs/providers/qwen.md b/docs/providers/qwen.md
new file mode 100644
index 0000000..1970328
--- /dev/null
+++ b/docs/providers/qwen.md
@@ -0,0 +1,36 @@
+# Qwen
+
+Qwen Code CLI.
+
+- **Source:** `src/providers/qwen.ts`
+- **Loading:** eager (`src/providers/index.ts:10`)
+- **Test:** none. Adding a fixture-based test is a known good first issue.
+
+## Where it reads from
+
+`$QWEN_DATA_DIR` if set, otherwise `~/.qwen/projects/<project>/chats/*.jsonl` (`qwen.ts:52-54`).
+
+## Storage format
+
+JSONL.
+
+## Caching
+
+None.
+
+## Deduplication
+
+Per `<sessionId>:<uuid>` (`qwen.ts:110`).
+
+## Quirks
+
+- **Project name comes from the last path component** (`qwen.ts:56-59`), not from any in-file field. If a user puts the same project under two different paths, they will appear as two projects.
+- **Thought parts are filtered out** before token accounting (`qwen.ts:97`). Qwen reports `thoughtsTokenCount` separately from `candidatesTokenCount`; this parser counts both as output but does not double-count thoughts in the main message.
+- **Tool calls** are extracted from a fixed envelope shape (`qwen.ts:61-76`). If Qwen restructures its tool-call format in a future release, this is where it will break first.
+- Tokens come from `usageMetadata`: `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`, `cachedContentTokenCount`.
+
+## When fixing a bug here
+
+1. Add a fixture and a test before changing logic. The lack of `tests/providers/qwen.test.ts` makes regressions invisible.
+2. If the bug is "tools missing", look at the function-call extraction loop at `qwen.ts:61-76`.
+3. If the bug is "duplicate counts", confirm `<sessionId>:<uuid>` actually uniquely identifies a turn in your reproducer; some Qwen builds repeat UUIDs across resumed sessions.
diff --git a/docs/providers/roo-code.md b/docs/providers/roo-code.md
new file mode 100644
index 0000000..6f9d16a
--- /dev/null
+++ b/docs/providers/roo-code.md
@@ -0,0 +1,34 @@
+# Roo Code
+
+Roo Code VS Code extension.
+
+- **Source:** `src/providers/roo-code.ts`
+- **Loading:** eager (`src/providers/index.ts:11`)
+- **Test:** `tests/providers/roo-code.test.ts` (247 lines)
+
+## Where it reads from
+
+VS Code extension globalStorage for `rooveterinaryinc.roo-cline` (extension ID set at `roo-code.ts:4`). The actual walk is delegated to `discoverClineTasks` in `src/providers/vscode-cline-parser.ts`.
+
+## Storage format
+
+Per-task directories with `ui_messages.json` and `api_conversation_history.json`. See [`vscode-cline-parser`](vscode-cline-parser.md) for the schema.
+
+## Caching
+
+None at the provider level; delegates to the shared helper.
+
+## Deduplication
+
+Delegated. Per `<providerName>:<taskId>:<index>` (in `vscode-cline-parser.ts:109`).
+
+## Quirks
+
+- Thin wrapper. Almost every Roo Code bug actually lives in `vscode-cline-parser.ts`.
+- The two providers using the cline parser (KiloCode and Roo Code) differ **only** by extension ID.
+
+## When fixing a bug here
+
+1. If the bug also reproduces against KiloCode, fix it in `vscode-cline-parser.ts`.
+2. If the bug is Roo Code-specific, the difference is upstream JSON shape. Reproduce with a fixture and consider whether the cline parser needs to branch on extension ID.
+3. Read [`vscode-cline-parser.md`](vscode-cline-parser.md) before editing.
diff --git a/docs/providers/vscode-cline-parser.md b/docs/providers/vscode-cline-parser.md
new file mode 100644
index 0000000..5b6bdfa
--- /dev/null
+++ b/docs/providers/vscode-cline-parser.md
@@ -0,0 +1,49 @@
+# vscode-cline-parser (Shared Helper)
+
+Shared discovery and parsing for VS Code extensions descended from Cline.
+
+- **Source:** `src/providers/vscode-cline-parser.ts`
+- **Loading:** not a provider; imported by `kilo-code.ts` and `roo-code.ts`.
+- **Test:** none directly. Coverage comes from `tests/providers/kilo-code.test.ts` and `tests/providers/roo-code.test.ts`.
+
+## What it does
+
+Two responsibilities:
+
+1. `discoverClineTasks(extensionId)` walks VS Code's `globalStorage/<extensionId>/tasks/` directories and returns one source per task that has a `ui_messages.json` file (`vscode-cline-parser.ts:25-50`).
+2. `createClineParser` reads each task's `ui_messages.json` and `api_conversation_history.json`, extracts model, tools, and token counts, and yields `ParsedProviderCall` objects.
+
+## Storage layout
+
+Per task directory:
+
+```
+<globalStorage>/<extensionId>/tasks/<taskId>/
+  ui_messages.json                # event stream
+  api_conversation_history.json   # full prompt history with model tags
+```
+
+## Model resolution
+
+The model is extracted from `api_conversation_history.json` by searching user message content blocks for a `<model>...</model>` tag (`vscode-cline-parser.ts:54-72`). Falls back to `cline-auto` if no tag is found.
+
+## Token extraction
+
+From `api_req_started` entries inside `ui_messages.json`. Each such entry's `text` field is JSON-parsed; the parsed object holds `tokensIn`, `tokensOut`, `cacheReads`, `cacheWrites`, and (optionally) `cost` (`vscode-cline-parser.ts:119-134`).
+
+If `cost` is present, it is used directly. If not, `calculateCost` from `src/models.ts` computes it from tokens (`vscode-cline-parser.ts:139`).
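+
+A rough sketch of that extraction, under stated assumptions: the `say` discriminator field, the `UiMessage` and `Usage` shapes, and the injected `computeCost` callback (a stand-in for `calculateCost`, whose real signature is not shown here) are all hypothetical.
+
+```typescript
+type UiMessage = { say?: string; text?: string };
+type Tokens = { tokensIn: number; tokensOut: number; cacheReads: number; cacheWrites: number };
+type Usage = Tokens & { cost: number };
+
+function extractUsage(msg: UiMessage, computeCost: (t: Tokens) => number): Usage | undefined {
+  if (msg.say !== "api_req_started" || !msg.text) return undefined;
+  let raw: unknown;
+  try {
+    raw = JSON.parse(msg.text);
+  } catch {
+    return undefined; // malformed payload: skip rather than throw
+  }
+  if (typeof raw !== "object" || raw === null) return undefined;
+  const fields = raw as Record<string, unknown>;
+  const num = (v: unknown): number => (typeof v === "number" ? v : 0);
+  const tokens: Tokens = {
+    tokensIn: num(fields.tokensIn),
+    tokensOut: num(fields.tokensOut),
+    cacheReads: num(fields.cacheReads),
+    cacheWrites: num(fields.cacheWrites),
+  };
+  // Trust the reported cost when present; otherwise compute from tokens.
+  const cost = typeof fields.cost === "number" ? fields.cost : computeCost(tokens);
+  return { ...tokens, cost };
+}
+```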
+
+## Deduplication
+
+Per `<providerName>:<taskId>:<index>` where `index` is the position of the `api_req_started` entry within `ui_messages.json` (`vscode-cline-parser.ts:109`).
+
+## Quirks
+
+- Only the **first** user message is emitted as `userMessage` in the `ParsedProviderCall` (`vscode-cline-parser.ts:157`). Subsequent user turns are accounted for but not surfaced.
+- The model regex looks inside content blocks, not at top-level fields. Some Cline-derivative extensions emit the model elsewhere; if you add support for one, branch on extension ID rather than rewriting the regex.
+
+## When fixing a bug here
+
+1. A change here ripples to **both** KiloCode and Roo Code. Run both test files (`tests/providers/kilo-code.test.ts` and `tests/providers/roo-code.test.ts`) before opening a PR.
+2. If you find that one of the two extensions emits a different shape, branch on the extension ID parameter that the discovery function already takes; do not duplicate the parser.
+3. If you add support for a third Cline-derivative extension, register it as a thin wrapper file in the same shape as `kilo-code.ts` and `roo-code.ts`.