Mirror of https://github.com/AgentSeal/codeburn.git (synced 2026-05-17 12:20:43 +00:00)
Two validator passes across CLI accuracy, dashboard UX, the menubar Swift app, performance, security, and end-to-end smoke tests on real session data.

Data-correctness fixes:
- parseLocalDate rejects month/day overflow. JS Date silently rolled Feb 31 to Mar 3, so --from 2026-02-31 --to 2026-03-15 quietly dropped sessions on Feb 28 - Mar 2. It now throws "Invalid date" with a clear reason, and the leap-day cases are covered (2024-02-29 valid, 2025-02-29 rejected). A sketch of the overflow check follows these notes.
- CSV/JSON exports use the active currency's natural decimal places. The previous round2 helper produced ¥412.37 in CSV while the dashboard rendered ¥412, so finance teams comparing the two surfaces saw a discrepancy. The new roundForActiveCurrency consults Intl.NumberFormat for the right precision (0 for JPY/KRW/CLP, 2 for USD/EUR, etc.); see the sketch below.
- Copilot toolRequests is Array.isArray-guarded in both the modern and legacy event branches. Previously a corrupt session with toolRequests=null or a string aborted the whole file's parse loop and silently dropped every legitimate call after it.
- Codex token_count dedup uses a null sentinel for prevCumulativeTotal so the first event is never confused with a duplicate. Sessions that emit only last_token_usage (no total_token_usage) report cumulativeTotal=0 on every event; with the previous 0-initialized prev, the first event matched the dedup guard and was dropped. The sentinel is sketched below.
- LiteLLM pricing values are clamped to [0, 1] per token via safePerTokenRate: defense in depth against a tampered upstream JSON shipping negative or absurdly large per-token costs that would otherwise propagate into every cost total.

Performance:
- Cursor SQLite parsing no longer takes minutes on multi-GB DBs. Two changes: the per-conversation user-message buffer uses an index pointer instead of Array.shift() (which was O(n) per call), and a real ROWID cutoff via subquery limits the scan to the most recent 250k bubbles, with a stderr warning so power users get a partial report rather than a stalled CLI. The buffer change is sketched below.
- Spawned codeburn CLI subprocesses are terminated when the calling Task is cancelled. Without this, rapid period/provider tab clicks in the menubar cancelled the Task but left the subprocess running to completion, piling up zombie processes.

UX:
- The dashboard period switch flips to loading and clears projects synchronously before reloadData runs, eliminating the frame where the new period label rendered over the old period's projects.
- The Optimize findings tab paginates three at a time with j/k scroll. With 4 new detectors plus the 7 originals, 8-10 findings at 6 lines each was scrolling the StatusBar off the top of the alt buffer.
- Custom --from/--to ranges hide the period tab strip and disable the 1-5 / arrow keys, so a stray period keypress no longer abandons the user's explicit range. A "Custom range: X to Y" banner replaces the tab strip.
- The OpenCode storage-format warning is per-table-set, rate-limited to once per process, and points the user at OpenCode's migration step or the issue tracker. The previous all-or-nothing check fired the generic "format not recognized" string for any schema mismatch.

Menubar / OAuth:
- Both the Claude and Codex bootstrap paths (Reconnect button) now honour the usageBlockedUntil 429 backoff that refreshIfBootstrapped respects. Spamming Reconnect during sustained rate-limit windows previously hammered the upstream endpoint on every click.
- The Codex Retry-After HTTP header is parsed (delta-seconds plus IMF-fixdate fallback) so we don't over-back-off when ChatGPT tells us a shorter window than our 5-minute floor. A parsing sketch is included below.
- Both credential cache files are written via SafeFile.write (O_CREAT | O_EXCL | O_NOFOLLOW with explicit 0600), so there is no race window where the temp file briefly exists at the default umask, and a symlink at the destination cannot redirect the write. Reads now route through SafeFile.read with a 64 KiB cap, closing the symlink-follow gap on Data(contentsOf:).

CI signal:
- TypeScript strict typecheck (tsc --noEmit) is now at zero errors. The six errors in src/providers/copilot.ts came from a discriminated-union catch-all branch whose `data: Record<string, unknown>` shape TS picked over the specific event branches when narrowing on `type`. The catch-all is removed; at runtime, unknown event types fall through the existing if/else chain.

Tests added: 16 new (now 555 total)
- date-range-filter: month/day/year overflow rejection, leap-day correctness
- currency-rounding: convertCost no-rounding contract, roundForActiveCurrency for USD/JPY/KRW/EUR
- providers/copilot: malformed toolRequests does not abort the parse
- providers/cursor-bubble-dedup: re-parse after token mutation does not double-count, single parse yields one call per bubble
- providers/codex: first event with cumulativeTotal=0 not dropped, consecutive zero-cumulative duplicates still deduped
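The sketches below illustrate a few of the fixes above; helper names and signatures are assumptions unless the notes name them. First, the date-overflow rejection: the real parseLocalDate may be shaped differently, but the core idea is a round-trip check that catches JS Date's silent rolling.

```ts
// Sketch: strict YYYY-MM-DD parsing that rejects month/day overflow.
// JS Date rolls 2026-02-31 to Mar 3; round-tripping the parsed components
// exposes that, while 2024-02-29 still passes because it round-trips cleanly.
function parseLocalDateStrict(input: string): Date {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input)
  if (!match) throw new Error(`Invalid date "${input}": expected YYYY-MM-DD`)
  const year = Number(match[1])
  const month = Number(match[2])
  const day = Number(match[3])
  const date = new Date(year, month - 1, day)
  if (date.getFullYear() !== year || date.getMonth() !== month - 1 || date.getDate() !== day) {
    throw new Error(`Invalid date "${input}": month or day out of range`)
  }
  return date
}
```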
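The currency-precision lookup, as a sketch. The shipped helper is roundForActiveCurrency and presumably reads the active currency from config; this standalone version takes the currency code as a parameter instead.

```ts
// Sketch: round to the active currency's natural precision.
// Intl.NumberFormat knows each ISO currency's minor-unit count:
// 0 fraction digits for JPY/KRW/CLP, 2 for USD/EUR.
function roundForCurrency(amount: number, currency: string): number {
  const digits = new Intl.NumberFormat('en-US', { style: 'currency', currency })
    .resolvedOptions().maximumFractionDigits
  const factor = 10 ** digits
  return Math.round(amount * factor) / factor
}

// roundForCurrency(412.37, 'JPY')  -> 412
// roundForCurrency(412.374, 'USD') -> 412.37
```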
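The Codex null-sentinel guard, reduced to the essential idea; the surrounding event loop and the function name here are assumed.

```ts
// Sketch of the dedup guard. null means "no event seen yet", so a first
// event whose cumulativeTotal is 0 (sessions emitting only last_token_usage)
// is not mistaken for a repeat of a previous total.
let prevCumulativeTotal: number | null = null

function isDuplicateTokenEvent(cumulativeTotal: number): boolean {
  const duplicate = prevCumulativeTotal !== null && cumulativeTotal === prevCumulativeTotal
  prevCumulativeTotal = cumulativeTotal
  return duplicate
}
```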
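The Cursor buffer change, sketched as a tiny cursor-based queue. The actual parser's data structures are assumed here; only the shift()-vs-read-cursor point reflects the real change.

```ts
// Sketch: drain a per-conversation buffer with a read cursor instead of
// Array.shift(). shift() re-indexes the remaining elements on every call,
// so draining n messages costs O(n^2); the cursor makes each take O(1).
class MessageBuffer<T> {
  private items: T[] = []
  private head = 0 // index of the next unread item; nothing is ever spliced out

  push(item: T): void {
    this.items.push(item)
  }

  take(): T | undefined {
    if (this.head >= this.items.length) return undefined
    return this.items[this.head++]
  }
}
```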
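Finally, a Retry-After parser covering the two forms RFC 9110 allows. The function name and how callers reconcile the result with the 5-minute floor are assumptions.

```ts
// Sketch: turn a Retry-After header value into milliseconds to wait.
function parseRetryAfterMs(header: string, now: Date = new Date()): number | null {
  const trimmed = header.trim()
  // delta-seconds form, e.g. "Retry-After: 120"
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000
  // IMF-fixdate form, e.g. "Retry-After: Wed, 21 Oct 2026 07:28:00 GMT"
  const dateMs = Date.parse(trimmed)
  if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - now.getTime())
  return null // unparseable: caller falls back to its own default backoff
}
```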
176 lines, 6.6 KiB, TypeScript
import { describe, it, expect, beforeEach, afterEach } from 'vitest'
import { mkdtemp, rm, writeFile } from 'fs/promises'
import { tmpdir } from 'os'
import { join } from 'path'

import { isSqliteAvailable, openDatabase } from '../../src/sqlite.js'
import { getAllProviders } from '../../src/providers/index.js'
import type { Provider, ParsedProviderCall } from '../../src/providers/types.js'

/// Pinned regression for the v3 bubble-dedup fix. The previous (v2) code used
/// the bubble row's mutable token counts as part of the deduplication key, so
/// the same bubble was counted twice once Cursor wrote the streaming-complete
/// final token totals on top of the streaming-in-progress row. v3 switched to
/// the SQLite primary `key` column (which is the stable bubbleId:<id>:<id>
/// path) so re-parsing the same DB after token updates produces zero new
/// calls. This test:
/// 1. Builds a tmp SQLite DB with the cursorDiskKV schema and one bubble row
/// with low token counts (the streaming-in-progress shape).
/// 2. Parses it through the cursor provider. Asserts one call.
/// 3. Mutates the row in place to higher token counts (the streaming-complete
/// shape) without changing the SQLite key.
/// 4. Re-parses with the SAME seenKeys set. Asserts zero new calls.
/// If a future refactor brings back token-count-based dedup, the second parse
/// will produce a duplicate call and this test will fail.

const skipReason = isSqliteAvailable()
  ? null
  : 'node:sqlite not available — needs Node 22+; skipping'

let tmpDir: string

beforeEach(async () => {
  tmpDir = await mkdtemp(join(tmpdir(), 'cursor-dedup-'))
})

afterEach(async () => {
  await rm(tmpDir, { recursive: true, force: true })
})

function buildBubbleValue(opts: {
  conversationId: string
  text: string
  inputTokens: number
  outputTokens: number
  type: 1 | 2
  createdAt?: string
}): string {
  return JSON.stringify({
    type: opts.type,
    conversationId: opts.conversationId,
    text: opts.text,
    tokenCount: {
      inputTokens: opts.inputTokens,
      outputTokens: opts.outputTokens,
    },
    createdAt: opts.createdAt ?? new Date().toISOString(),
    modelId: 'gpt-5',
    capabilityType: 'composer',
  })
}

async function createCursorTestDb(): Promise<string> {
  // Cursor uses a non-extension state DB filename (state.vscdb in the real app);
  // any path works for openDatabase as long as we set up the schema and the
  // directory layout the parser expects. The parser only checks the DB
  // contents — discovery is bypassed because we hand it the path directly.
  const dbPath = join(tmpDir, 'state.vscdb')
  await writeFile(dbPath, '')
  // Use the underlying node:sqlite to create the schema.
  // We need cursorDiskKV with key + value columns.
  const Module = await import('node:module')
  const requireForSqlite = Module.createRequire(import.meta.url)
  const { DatabaseSync } = requireForSqlite('node:sqlite') as {
    DatabaseSync: new (path: string) => {
      exec(sql: string): void
      prepare(sql: string): { run(...p: unknown[]): unknown }
      close(): void
    }
  }
  const db = new DatabaseSync(dbPath)
  db.exec('CREATE TABLE cursorDiskKV (key TEXT PRIMARY KEY, value TEXT)')

  // Single assistant bubble (type=2). The parser yields one ParsedProviderCall
  // per bubbleId:% row, so a multi-row fixture would muddy the dedup count;
  // we keep the test surface minimal — one bubble through one parse, then
  // the same bubble again after token mutation.
  const bubbleKey = 'bubbleId:abc-123:bubble-xyz'
  db.prepare('INSERT INTO cursorDiskKV (key, value) VALUES (?, ?)').run(
    bubbleKey,
    buildBubbleValue({
      conversationId: 'abc-123',
      text: 'def hello(): pass',
      inputTokens: 100,
      outputTokens: 20,
      type: 2,
    })
  )

  db.close()
  return dbPath
}

async function updateAssistantBubbleTokens(dbPath: string, inputTokens: number, outputTokens: number): Promise<void> {
  const Module = await import('node:module')
  const requireForSqlite = Module.createRequire(import.meta.url)
  const { DatabaseSync } = requireForSqlite('node:sqlite') as {
    DatabaseSync: new (path: string) => {
      prepare(sql: string): { run(...p: unknown[]): unknown }
      close(): void
    }
  }
  const db = new DatabaseSync(dbPath)
  db.prepare('UPDATE cursorDiskKV SET value = ? WHERE key = ?').run(
    buildBubbleValue({
      conversationId: 'abc-123',
      text: 'def hello(): pass',
      inputTokens,
      outputTokens,
      type: 2,
    }),
    'bubbleId:abc-123:bubble-xyz'
  )
  db.close()
}

async function getCursorProvider(): Promise<Provider> {
  const all = await getAllProviders()
  const p = all.find(p => p.name === 'cursor')
  if (!p) throw new Error('cursor provider not registered')
  return p
}

describe.skipIf(skipReason !== null)('cursor bubble dedup (regression for v3 fix)', () => {
  it('does not double-count when bubble token counts mutate between parses', async () => {
    const dbPath = await createCursorTestDb()
    const provider = await getCursorProvider()

    // First parse: streaming-in-progress shape.
    const seenKeys = new Set<string>()
    const source = { path: dbPath, project: 'test-project', provider: 'cursor' }
    const firstRunCalls: ParsedProviderCall[] = []
    for await (const call of provider.createSessionParser(source, seenKeys).parse()) {
      firstRunCalls.push(call)
    }
    expect(firstRunCalls.length).toBe(1)

    // Cursor mutates the same bubble row to its final token totals when the
    // stream completes. Simulate by updating in place. The SQLite primary
    // key stays the same.
    await updateAssistantBubbleTokens(dbPath, 250, 80)

    // Second parse with the SAME seenKeys: must yield zero new calls. If the
    // dedup key were derived from token counts (the v2 bug), this would
    // produce a duplicate.
    const secondRunCalls: ParsedProviderCall[] = []
    for await (const call of provider.createSessionParser(source, seenKeys).parse()) {
      secondRunCalls.push(call)
    }
    expect(secondRunCalls.length).toBe(0)
  })

  it('does not yield the same bubble twice within a single parser run', async () => {
    const dbPath = await createCursorTestDb()
    const provider = await getCursorProvider()
    const seenKeys = new Set<string>()
    const source = { path: dbPath, project: 'test-project', provider: 'cursor' }
    const calls: ParsedProviderCall[] = []
    for await (const call of provider.createSessionParser(source, seenKeys).parse()) {
      calls.push(call)
    }
    // One bubble in the DB → one call. (The user message row at type=1 is
    // not surfaced as a separate ParsedProviderCall; it's threaded into the
    // assistant call's userMessage field.)
    expect(calls.length).toBe(1)
  })
})