mirror of
https://github.com/AgentSeal/codeburn.git
synced 2026-05-17 03:56:45 +00:00
Menubar and CLI hardening from multi-agent audit (#257)
Some checks are pending
CI / semgrep (push) Waiting to run
Two passes of validators across CLI accuracy, dashboard UX, menubar Swift, performance, security, and end-to-end smoke tests on real session data.

Data-correctness fixes:

- parseLocalDate rejects month/day overflow. JS Date silently rolled Feb 31 to Mar 3, so `--from 2026-02-31 --to 2026-03-15` quietly dropped sessions dated Feb 28 - Mar 2. It now throws "Invalid date" with a clear reason, and the leap-day cases are covered (2024-02-29 valid, 2025-02-29 rejected).
- CSV/JSON exports use the active currency's natural decimal places. The previous round2 helper produced ¥412.37 in CSV while the dashboard rendered ¥412, so finance teams comparing the two surfaces saw a discrepancy. The new roundForActiveCurrency consults Intl.NumberFormat for the right precision (0 for JPY/KRW/CLP, 2 for USD/EUR, etc.).
- Copilot toolRequests is Array.isArray-guarded in both the modern and legacy event branches. Previously a corrupt session with toolRequests=null or a string aborted the whole file's parse loop and silently dropped every legitimate call after it.
- Codex token_count dedup uses a null sentinel for prevCumulativeTotal so the first event is never confused with a duplicate. Sessions that emit only last_token_usage (no total_token_usage) report cumulativeTotal=0 on every event; with the previous 0-initialized prev, the first event matched the dedup guard and was dropped.
- LiteLLM pricing values are clamped to [0, 1] per token via safePerTokenRate: defense in depth against a tampered upstream JSON shipping negative or absurdly large per-token costs that would otherwise propagate into all cost totals.

Performance:

- Cursor SQLite parse no longer takes minutes on multi-GB DBs. Two changes: the per-conversation user-message buffer uses an index pointer instead of Array.shift() (which was O(n) per call), and a real ROWID cutoff via subquery limits the scan to the most recent 250k bubbles, with a stderr warning so power users get a partial report rather than a stalled CLI.
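The null-sentinel dedup fix can be sketched as follows. The helper name is hypothetical (the real parser tracks prevCumulativeTotal per session while streaming events); the point is that `null` means "no previous event yet", so a legitimate first value of 0 survives:

```typescript
// Sketch of the null-sentinel fix described above (hypothetical helper name).
// With prev initialized to 0 instead of null, a session whose events all
// report cumulativeTotal=0 (only last_token_usage present) lost its first
// event to the dedup guard.
function dedupCumulativeTotals(totals: number[]): number[] {
  let prev: number | null = null // null sentinel: "no previous event seen yet"
  const kept: number[] = []
  for (const t of totals) {
    if (prev !== null && t === prev) continue // genuine consecutive duplicate
    kept.push(t)
    prev = t
  }
  return kept
}
```

With the old 0-initialized `prev`, an input of `[0, 0, 5]` would keep only `[5]`; with the sentinel it keeps `[0, 5]`, deduping only the true repeat.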
- Spawned codeburn CLI subprocesses are terminated when the calling Task is cancelled. Without this, rapid period/provider tab clicks in the menubar cancelled the Task but left the subprocess running to completion, piling up zombie processes.

UX:

- The dashboard period switch flips to loading and clears projects synchronously before reloadData runs, eliminating the frame where the new period label rendered over the old period's projects.
- The Optimize findings tab paginates three at a time with j/k scrolling. With 4 new detectors plus the 7 originals, 8-10 findings at ~6 lines each scrolled the StatusBar off the top of the alt buffer.
- Custom --from/--to ranges hide the period tab strip and disable the 1-5 / arrow keys, so a stray period keypress no longer abandons the user's explicit range. A "Custom range: X to Y" banner replaces the tab strip.
- The OpenCode storage-format warning is per-table-set, rate-limited to once per process, and points the user at OpenCode's migration step or the issue tracker. The previous all-or-nothing check fired the generic "format not recognized" string for any schema mismatch.

Menubar / OAuth:

- Both the Claude and Codex bootstrap paths (Reconnect button) now honour the usageBlockedUntil 429 backoff that refreshIfBootstrapped respects. Spamming Reconnect during a sustained rate-limit window previously hammered the upstream endpoint on every click.
- The Codex Retry-After HTTP header is parsed (delta-seconds, with an IMF-fixdate fallback) so we don't over-back-off when ChatGPT announces a shorter window than our 5-minute floor.
- Both credential cache files are written via SafeFile.write (O_CREAT | O_EXCL | O_NOFOLLOW with an explicit 0600 mode), so there is no race window where the temp file briefly exists at the default umask, and a symlink at the destination cannot redirect the write. Reads now route through SafeFile.read with a 64 KiB cap, closing the symlink-follow gap in Data(contentsOf:).

CI signal:

- The TypeScript strict typecheck (tsc --noEmit) is now at zero errors.
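The SafeFile.write contract is Swift, but the same exclusive-create idea can be sketched in Node terms. The `safeWrite` function below is a hypothetical analogue, not part of this PR, and it assumes a POSIX platform (`O_NOFOLLOW` is undefined on Windows); unlike the real helper, this minimal sketch refuses to overwrite an existing file at all, since `O_EXCL` fails if the path exists:

```typescript
import { openSync, writeSync, closeSync, mkdirSync, constants } from 'node:fs'
import { dirname } from 'node:path'

// Hypothetical Node analogue of the SafeFile.write contract described above:
// O_CREAT | O_EXCL fails if anything (including a planted symlink) already
// exists at the path, O_NOFOLLOW refuses to follow symlinks, and the 0o600
// mode is applied atomically at creation, so the file never briefly exists
// at the default umask.
function safeWrite(path: string, data: Buffer): void {
  mkdirSync(dirname(path), { recursive: true, mode: 0o700 })
  const flags =
    constants.O_CREAT | constants.O_EXCL | constants.O_WRONLY | constants.O_NOFOLLOW
  const fd = openSync(path, flags, 0o600)
  try {
    writeSync(fd, data)
  } finally {
    closeSync(fd)
  }
}
```

The key design point is that existence check, creation, and permission assignment happen in one open(2) syscall, leaving no window for a time-of-check/time-of-use race.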
The six errors in src/providers/copilot.ts came from a discriminated-union catch-all branch whose `data: Record<string, unknown>` shape TS picked over the specific event branches when narrowing on `type`. The catch-all is removed; at runtime, unknown event types fall through the existing if/else chain.

Tests added: 16 new (now 555 total)

- date-range-filter: month/day/year overflow rejection, leap-day correctness
- currency-rounding: convertCost no-rounding contract, roundForActiveCurrency for USD/JPY/KRW/EUR
- providers/copilot: malformed toolRequests does not abort the parse
- providers/cursor-bubble-dedup: re-parse after token mutation does not double-count; a single parse yields one call per bubble
- providers/codex: first event with cumulativeTotal=0 is not dropped; consecutive zero-cumulative duplicates are still deduped
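That narrowing failure can be reproduced in miniature. The event names below are hypothetical, not the real copilot.ts shapes; they show why a catch-all member with a non-literal `type` defeats discriminated-union narrowing:

```typescript
// Minimal reproduction (hypothetical event names) of the narrowing failure
// described above. The removed catch-all member looked roughly like:
//   type OtherEvent = { type: string; data: Record<string, unknown> }
// Because its `type: string` overlaps every literal, narrowing on `type`
// could select it over the specific branches, widening `data` and breaking
// property access. With a closed union, each branch narrows cleanly.
type ToolEvent = { type: 'tool'; data: { toolName: string } }
type TextEvent = { type: 'text'; data: { text: string } }
type CopilotEvent = ToolEvent | TextEvent

function describeEvent(e: CopilotEvent): string {
  if (e.type === 'tool') return `tool:${e.data.toolName}` // e narrowed to ToolEvent
  if (e.type === 'text') return `text:${e.data.text}` // e narrowed to TextEvent
  return 'unknown' // unreachable with the closed union; kept for runtime safety
}
```

Truly unknown event types never reach this function under the closed union; at runtime they simply fall through the existing if/else chain, as the commit message notes.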
This commit is contained in:
parent
efac2bfa15
commit
daa673449c
20 changed files with 765 additions and 104 deletions
@@ -285,27 +285,24 @@ enum ClaudeCredentialStore {
     private static func readOurCache() throws -> CredentialRecord? {
         let url = cacheFileURL()
         guard FileManager.default.fileExists(atPath: url.path) else { return nil }
-        let data = try Data(contentsOf: url)
+        // Route through SafeFile.read so we lstat for symlinks before opening
+        // and bound the read with maxCredentialBytes. Without this, an
+        // attacker who can plant a symlink in ~/Library/Application Support/
+        // CodeBurn/ between disconnect and reconnect could redirect our read
+        // to /dev/zero (unbounded memory) or another file the user owns.
+        let data = try SafeFile.read(from: url.path, maxBytes: maxCredentialBytes)
         return try? JSONDecoder().decode(CredentialRecord.self, from: data)
     }

     private static func writeOurCache(record: CredentialRecord) throws {
         let url = cacheFileURL()
         let dir = url.deletingLastPathComponent()
         try? FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true, attributes: nil)
         let data = try JSONEncoder().encode(record)
-        // Atomic temp-rename so a crash mid-write cannot leave a half-file.
-        let tmp = url.appendingPathExtension("tmp-\(UUID().uuidString.prefix(8))")
-        try data.write(to: tmp)
-        // 0600 — owner read/write only. Mirrors ~/.claude/.credentials.json's
-        // permission posture; nothing extra to protect since this is just a
-        // cached copy of credentials the user already has on disk in cleartext.
-        try? FileManager.default.setAttributes([.posixPermissions: NSNumber(value: Int16(0o600))], ofItemAtPath: tmp.path)
-        if FileManager.default.fileExists(atPath: url.path) {
-            _ = try FileManager.default.replaceItemAt(url, withItemAt: tmp)
-        } else {
-            try FileManager.default.moveItem(at: tmp, to: url)
-        }
+        // SafeFile.write opens the temp file with O_CREAT | O_EXCL | O_NOFOLLOW
+        // and the explicit 0600 mode in a single syscall — no race window
+        // where the file briefly exists at default umask, and no chance of
+        // following a malicious symlink at the destination path. Also creates
+        // the parent dir at 0700.
+        try SafeFile.write(data, to: url.path, mode: 0o600)
     }

     private static func deleteOurCache() {
@@ -59,6 +59,13 @@ enum ClaudeSubscriptionService {
     /// User-initiated. Reads Claude's keychain (PROMPTS), copies to our keychain,
     /// then fetches usage. Idempotent — safe to call again to "reconnect".
     static func bootstrap() async throws -> SubscriptionUsage {
+        // Honour the same 429 backoff that refreshIfBootstrapped respects.
+        // Without this, a user spamming Reconnect during a sustained
+        // rate-limit window hammers Anthropic on every click — exactly the
+        // pattern that escalates the backoff.
+        if let until = usageBlockedUntil(), until > Date() {
+            throw FetchError.rateLimited(retryAt: until)
+        }
         let record: ClaudeCredentialStore.CredentialRecord
         do {
             record = try ClaudeCredentialStore.bootstrap()
@@ -200,24 +200,20 @@ enum CodexCredentialStore {
     private static func readOurCache() throws -> CredentialRecord? {
         let url = cacheFileURL()
         guard FileManager.default.fileExists(atPath: url.path) else { return nil }
-        let data = try Data(contentsOf: url)
+        // Symlink-defense + size cap (same hardening as ClaudeCredentialStore).
+        let data = try SafeFile.read(from: url.path, maxBytes: maxCredentialBytes)
         return try? JSONDecoder().decode(CredentialRecord.self, from: data)
     }

     private static func writeOurCache(record: CredentialRecord) throws {
         let url = cacheFileURL()
         let dir = url.deletingLastPathComponent()
         try? FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true, attributes: nil)
         let data = try JSONEncoder().encode(record)
-        let tmp = url.appendingPathExtension("tmp-\(UUID().uuidString.prefix(8))")
         do {
-            try data.write(to: tmp)
-            try? FileManager.default.setAttributes([.posixPermissions: NSNumber(value: Int16(0o600))], ofItemAtPath: tmp.path)
-            if FileManager.default.fileExists(atPath: url.path) {
-                _ = try FileManager.default.replaceItemAt(url, withItemAt: tmp)
-            } else {
-                try FileManager.default.moveItem(at: tmp, to: url)
-            }
+            // SafeFile.write opens the temp file with O_CREAT | O_EXCL | O_NOFOLLOW
+            // and the explicit 0600 mode in a single syscall — no race window
+            // where the file briefly exists at default umask, and no chance of
+            // following a malicious symlink at the destination path.
+            try SafeFile.write(data, to: url.path, mode: 0o600)
         } catch {
             throw StoreError.fileWriteFailed(String(describing: error))
         }
@@ -47,6 +47,13 @@ enum CodexSubscriptionService {
     }

     static func bootstrap() async throws -> CodexUsage {
+        // Honour the same 429 backoff that refreshIfBootstrapped respects.
+        // A user clicking Reconnect during a sustained ChatGPT rate-limit
+        // window would otherwise re-hit /wham/usage on every click and keep
+        // the backoff window pegged.
+        if let until = usageBlockedUntil(), until > Date() {
+            throw FetchError.rateLimited(retryAt: until)
+        }
         let record: CodexCredentialStore.CredentialRecord
         do {
             record = try CodexCredentialStore.bootstrap()
@@ -120,7 +127,12 @@ enum CodexSubscriptionService {
             }
             throw FetchError.usageHTTPError(401, String(data: data, encoding: .utf8))
         case 429:
-            let until = recordUsageRateLimit(retryAfterSeconds: nil)
+            // Honour the RFC Retry-After header when present — ChatGPT's quota
+            // endpoint sometimes sets it to a window shorter than our 5-min
+            // floor, and ignoring it forced users to wait longer than the
+            // server actually wanted.
+            let retryAfter = parseRetryAfterHeader(http.value(forHTTPHeaderField: "Retry-After"))
+            let until = recordUsageRateLimit(retryAfterSeconds: retryAfter)
             throw FetchError.rateLimited(retryAt: until)
         default:
             throw FetchError.usageHTTPError(http.statusCode, String(data: data, encoding: .utf8))
@@ -205,6 +217,23 @@ enum CodexSubscriptionService {
     }

+    /// RFC 7231 says Retry-After is either a delta-seconds or an HTTP-date.
+    /// chatgpt.com appears to send delta-seconds today; we still parse both
+    /// shapes defensively so a future change to HTTP-date doesn't drop us
+    /// onto the silent 5-minute floor.
+    private static func parseRetryAfterHeader(_ value: String?) -> Int? {
+        guard let value = value?.trimmingCharacters(in: .whitespaces), !value.isEmpty else { return nil }
+        if let seconds = Int(value), seconds >= 0 { return seconds }
+        let f = DateFormatter()
+        f.locale = Locale(identifier: "en_US_POSIX")
+        f.timeZone = TimeZone(secondsFromGMT: 0)
+        f.dateFormat = "EEE, dd MMM yyyy HH:mm:ss zzz"
+        if let date = f.date(from: value) {
+            return max(0, Int(date.timeIntervalSinceNow))
+        }
+        return nil
+    }
+
     @discardableResult
     private static func recordUsageRateLimit(retryAfterSeconds: Int?) -> Date {
         let seconds = max(retryAfterSeconds ?? 300, 60)
         let until = Date().addingTimeInterval(TimeInterval(seconds))
@@ -61,11 +61,6 @@ struct DataClient {
             throw DataClientError.spawn(error.localizedDescription)
         }

-        // Drain both pipes concurrently so a large stderr can't deadlock stdout (the child
-        // blocks on write once the pipe buffer fills). `drain` also enforces a byte cap.
-        async let stdoutData = drain(outPipe.fileHandleForReading, limit: maxPayloadBytes)
-        async let stderrData = drain(errPipe.fileHandleForReading, limit: maxStderrBytes)
-
         // Wall-clock timeout: if the CLI hangs (parser stuck, disk stall), kill it.
         let timeoutTask = Task.detached(priority: .utility) {
             try? await Task.sleep(nanoseconds: spawnTimeoutSeconds * 1_000_000_000)
@@ -75,7 +70,25 @@ struct DataClient {
         }
         defer { timeoutTask.cancel() }

-        let (out, err) = await (stdoutData, stderrData)
+        // If the caller cancels its Task (rapid period/provider tab clicks
+        // cancel switchTask in AppStore), terminate the in-flight subprocess.
+        // Without this the cancelled Task returns immediately but the spawned
+        // CLI keeps running to completion, piling up zombie codeburn processes
+        // on rapid UI interactions. We hold a strong reference to the Process
+        // in the cancellation handler so the closure can find it even if the
+        // surrounding scope has gone async.
+        let (out, err) = await withTaskCancellationHandler {
+            // Drain both pipes concurrently so a large stderr can't deadlock stdout
+            // (the child blocks on write once the pipe buffer fills). `drain`
+            // also enforces a byte cap.
+            async let stdoutData = drain(outPipe.fileHandleForReading, limit: maxPayloadBytes)
+            async let stderrData = drain(errPipe.fileHandleForReading, limit: maxStderrBytes)
+            return await (stdoutData, stderrData)
+        } onCancel: {
+            if process.isRunning {
+                process.terminate()
+            }
+        }
         process.waitUntilExit()

         if out.count >= maxPayloadBytes {
@@ -47,7 +47,15 @@ function parseLocalDate(s: string): Date {
     throw new Error(`Invalid date format "${s}": expected YYYY-MM-DD`)
   }
   const [y, m, d] = s.split('-').map(Number) as [number, number, number]
-  return new Date(y, m - 1, d)
+  const date = new Date(y, m - 1, d)
+  // JS Date silently rolls overflow forward (Feb 31 → Mar 3). That makes a
+  // typo like `--from 2026-02-31 --to 2026-03-15` quietly drop sessions
+  // dated Feb 28 - Mar 2. Reject overflow so the user gets a loud error
+  // instead of an off-by-N-days date range.
+  if (date.getFullYear() !== y || date.getMonth() !== m - 1 || date.getDate() !== d) {
+    throw new Error(`Invalid date "${s}": ${m}/${d}/${y} is not a real calendar date`)
+  }
+  return date
 }

 export function parseDateRangeFlags(from: string | undefined, to: string | undefined): DateRange | null {
@@ -300,7 +300,8 @@ program
       return
     }
     await hydrateCache()
-    await renderDashboard(period, opts.provider, opts.refresh, opts.project, opts.exclude, customRange)
+    const customRangeLabel = customRange ? formatDateRangeLabel(opts.from, opts.to) : undefined
+    await renderDashboard(period, opts.provider, opts.refresh, opts.project, opts.exclude, customRange, customRangeLabel)
   })

 function buildPeriodData(label: string, projects: ProjectSummary[]): PeriodData {
@@ -47,13 +47,24 @@ function resolveSymbol(code: string): string {
   return parts.find(p => p.type === 'currency')?.value ?? code
 }

-function getFractionDigits(code: string): number {
+export function getFractionDigits(code: string): number {
   return new Intl.NumberFormat('en', {
     style: 'currency',
     currency: code,
   }).resolvedOptions().maximumFractionDigits ?? 2
 }

+/// Round a converted cost to the currency's natural decimal places. JPY/KRW/CLP
+/// resolve to 0 fraction digits — exporting those with `round2` produced rows
+/// like `¥412.37` while the dashboard rendered `¥412`, breaking finance reports
+/// that compare the two surfaces.
+export function roundForActiveCurrency(value: number): number {
+  const code = getCurrency().code
+  const digits = getFractionDigits(code)
+  const factor = Math.pow(10, digits)
+  return Math.round(value * factor) / factor
+}
+
 function getCacheDir(): string {
   return join(homedir(), '.cache', 'codeburn')
 }
@@ -14,7 +14,7 @@ import { dateKey } from './day-aggregator.js'
 import { CompareView } from './compare.js'
 import { getPlanUsageOrNull, type PlanUsage } from './plan-usage.js'
 import { planDisplayName } from './plans.js'
-import { getDateRange, PERIODS, PERIOD_LABELS, type Period } from './cli-date.js'
+import { getDateRange, PERIODS, PERIOD_LABELS, type Period, formatDateRangeLabel } from './cli-date.js'
 import { join } from 'path'
 import { patchStdoutForWindows } from './ink-win.js'
@@ -563,13 +563,23 @@ function FindingPanel({ index, finding, costRate, width }: { index: number; find

 const GRADE_COLORS: Record<string, string> = { A: '#5BF5A0', B: '#5BF5A0', C: GOLD, D: ORANGE, F: '#F55B5B' }

-function OptimizeView({ findings, costRate, projects, label, width, healthScore, healthGrade }: { findings: WasteFinding[]; costRate: number; projects: ProjectSummary[]; label: string; width: number; healthScore: number; healthGrade: string }) {
+// Each finding panel takes ~6-8 lines. Show 3 at a time so the window fits a
+// 30-line terminal alongside the optimize header + status bar; users page
+// with j/k. Without this cap, 4 new detectors + 7 originals scrolled findings
+// off the alt-buffer top and the user couldn't see the StatusBar at all.
+const FINDINGS_WINDOW_SIZE = 3
+
+function OptimizeView({ findings, costRate, projects, label, width, healthScore, healthGrade, cursor }: { findings: WasteFinding[]; costRate: number; projects: ProjectSummary[]; label: string; width: number; healthScore: number; healthGrade: string; cursor: number }) {
   const periodCost = projects.reduce((s, p) => s + p.totalCostUSD, 0)
   const totalTokens = findings.reduce((s, f) => s + f.tokensSaved, 0)
   const totalCost = totalTokens * costRate
   const pctRaw = periodCost > 0 ? (totalCost / periodCost) * 100 : 0
   const pct = pctRaw >= 1 ? pctRaw.toFixed(0) : pctRaw.toFixed(1)
   const gradeColor = GRADE_COLORS[healthGrade] ?? DIM
+  const total = findings.length
+  const start = total === 0 ? 0 : Math.min(cursor, Math.max(0, total - FINDINGS_WINDOW_SIZE))
+  const end = Math.min(start + FINDINGS_WINDOW_SIZE, total)
+  const visible = findings.slice(start, end)
   return (
     <Box flexDirection="column" width={width}>
       <Box flexDirection="column" borderStyle="round" borderColor={ORANGE} paddingX={1} width={width}>
@@ -580,27 +590,36 @@ function OptimizeView({ findings, costRate, projects, label, width, healthScore,
           <Text dimColor> ({healthScore}/100)</Text>
         </Text>
         <Text color="#5BF5A0" wrap="truncate-end">Savings: ~{formatTokens(totalTokens)} tokens (~{formatCost(totalCost)}, ~{pct}% of spend)</Text>
+        {total > FINDINGS_WINDOW_SIZE && (
+          <Text dimColor>Showing {start + 1}–{end} of {total} · j/k to scroll</Text>
+        )}
       </Box>
-      {findings.map((f, i) => <FindingPanel key={i} index={i + 1} finding={f} costRate={costRate} width={width} />)}
+      {visible.map((f, i) => <FindingPanel key={start + i} index={start + i + 1} finding={f} costRate={costRate} width={width} />)}
       <Box paddingX={1} width={width}><Text dimColor>Token estimates are approximate.</Text></Box>
     </Box>
   )
 }

-function StatusBar({ width, showProvider, view, findingCount, optimizeAvailable, compareAvailable }: { width: number; showProvider?: boolean; view?: View; findingCount?: number; optimizeAvailable?: boolean; compareAvailable?: boolean }) {
+function StatusBar({ width, showProvider, view, findingCount, optimizeAvailable, compareAvailable, customRange }: { width: number; showProvider?: boolean; view?: View; findingCount?: number; optimizeAvailable?: boolean; compareAvailable?: boolean; customRange?: boolean }) {
   const isOptimize = view === 'optimize'
   return (
     <Box borderStyle="round" borderColor={DIM} width={width} justifyContent="center" paddingX={1}>
       <Text>
         {isOptimize
-          ? <><Text color={ORANGE} bold>b</Text><Text dimColor> back </Text></>
-          : <><Text color={ORANGE} bold>{'<'}</Text><Text color={ORANGE}>{'>'}</Text><Text dimColor> switch </Text></>}
-        <Text color={ORANGE} bold>q</Text><Text dimColor> quit </Text>
-        <Text color={ORANGE} bold>1</Text><Text dimColor> today </Text>
-        <Text color={ORANGE} bold>2</Text><Text dimColor> week </Text>
-        <Text color={ORANGE} bold>3</Text><Text dimColor> 30 days </Text>
-        <Text color={ORANGE} bold>4</Text><Text dimColor> month </Text>
-        <Text color={ORANGE} bold>5</Text><Text dimColor> 6 months</Text>
+          ? <><Text color={ORANGE} bold>b</Text><Text dimColor> back </Text><Text color={ORANGE} bold>j</Text><Text dimColor>/</Text><Text color={ORANGE} bold>k</Text><Text dimColor> scroll </Text></>
+          : !customRange
+            ? <><Text color={ORANGE} bold>{'<'}</Text><Text color={ORANGE}>{'>'}</Text><Text dimColor> switch </Text></>
+            : null}
+        <Text color={ORANGE} bold>q</Text><Text dimColor> quit</Text>
+        {!customRange && !isOptimize && (
+          <>
+            <Text dimColor> </Text><Text color={ORANGE} bold>1</Text><Text dimColor> today </Text>
+            <Text color={ORANGE} bold>2</Text><Text dimColor> week </Text>
+            <Text color={ORANGE} bold>3</Text><Text dimColor> 30 days </Text>
+            <Text color={ORANGE} bold>4</Text><Text dimColor> month </Text>
+            <Text color={ORANGE} bold>5</Text><Text dimColor> 6 months</Text>
+          </>
+        )}
         {!isOptimize && optimizeAvailable && findingCount != null && findingCount > 0 && (
           <><Text dimColor> </Text><Text color={ORANGE} bold>o</Text><Text dimColor> optimize</Text><Text color="#F55B5B"> ({findingCount})</Text></>
         )}
@@ -639,7 +658,7 @@ function DashboardContent({ projects, period, columns, activeProvider, budgets,
   )
 }

-function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider, initialPlanUsage, refreshSeconds, projectFilter, excludeFilter }: {
+function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider, initialPlanUsage, refreshSeconds, projectFilter, excludeFilter, customRange, customRangeLabel }: {
   initialProjects: ProjectSummary[]
   initialPeriod: Period
   initialProvider: string
@@ -647,6 +666,8 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,
   refreshSeconds?: number
   projectFilter?: string[]
   excludeFilter?: string[]
+  customRange?: DateRange | null
+  customRangeLabel?: string
 }) {
   const { exit } = useApp()
   const [period, setPeriod] = useState<Period>(initialPeriod)
@@ -658,6 +679,11 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,
   const [optimizeResult, setOptimizeResult] = useState<OptimizeResult | null>(null)
   const [projectBudgets, setProjectBudgets] = useState<Map<string, ContextBudget>>(new Map())
   const [planUsage, setPlanUsage] = useState<PlanUsage | undefined>(initialPlanUsage)
+  // Cursor for the OptimizeView's findings window. Reset whenever the user
+  // leaves the optimize view OR the underlying findings change so a long
+  // findings list never strands the user past the new array length.
+  const [findingsCursor, setFindingsCursor] = useState(0)
+  const isCustomRange = customRange != null
   const { columns } = useWindowSize()
   const { dashWidth } = getLayout(columns)
   const multipleProviders = detectedProviders.length > 1
@@ -743,7 +769,14 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,

   const switchPeriod = useCallback((np: Period) => {
     if (np === period) return
+    // Clear projects + flip loading synchronously so the dashboard never
+    // renders the new period label over the old period's numbers between
+    // setPeriod() and the reloadData() promise resolving. Without this,
+    // there's a frame-to-hundreds-of-ms window where users saw wrong
+    // figures captioned with the new period.
     setPeriod(np)
+    setProjects([])
+    setLoading(true)
     if (debounceRef.current) clearTimeout(debounceRef.current)
     debounceRef.current = setTimeout(() => { reloadData(np, activeProvider) }, 600)
   }, [period, activeProvider, reloadData])
@@ -751,6 +784,8 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,
   const switchPeriodImmediate = useCallback(async (np: Period) => {
     if (np === period) return
     setPeriod(np)
+    setProjects([])
+    setLoading(true)
     if (debounceRef.current) clearTimeout(debounceRef.current)
     await reloadData(np, activeProvider)
   }, [period, activeProvider, reloadData])
@@ -758,7 +793,13 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,
   useInput((input, key) => {
     if (input === 'q') { exit(); return }
     if (input === 'o' && findingCount > 0 && view === 'dashboard' && optimizeAvailable) { setView('optimize'); return }
-    if ((input === 'b' || key.escape) && view === 'optimize') { setView('dashboard'); return }
+    if ((input === 'b' || key.escape) && view === 'optimize') { setView('dashboard'); setFindingsCursor(0); return }
+    if (view === 'optimize') {
+      const total = optimizeResult?.findings.length ?? 0
+      const maxStart = Math.max(0, total - FINDINGS_WINDOW_SIZE)
+      if (input === 'j' || key.downArrow) { setFindingsCursor(c => Math.min(c + 1, maxStart)); return }
+      if (input === 'k' || key.upArrow) { setFindingsCursor(c => Math.max(c - 1, 0)); return }
+    }
     if (input === 'c' && compareAvailable && view === 'dashboard') { setView('compare'); return }
     if ((input === 'b' || key.escape) && view === 'compare') { setView('dashboard'); return }
     if (input === 'p' && multipleProviders && view !== 'compare') {
@@ -772,6 +813,11 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,
     // `projects` and would visibly change underneath the user without any
     // affordance back to the dashboard. Press `b` or Esc to return first.
     if (view === 'compare') return
+    // Also disable while a custom --from/--to range is in effect. Switching
+    // period would silently abandon the user's explicit range and reload
+    // standard period data; the period tab strip is hidden in this mode so
+    // users have no expectation that 1-5 should do anything.
+    if (isCustomRange) return
     const idx = PERIODS.indexOf(period)
     if (key.leftArrow) switchPeriod(PERIODS[(idx - 1 + PERIODS.length) % PERIODS.length]!)
     else if (key.rightArrow || key.tab) switchPeriod(PERIODS[(idx + 1) % PERIODS.length]!)
@@ -782,33 +828,46 @@ function InteractiveDashboard({ initialProjects, initialPeriod, initialProvider,
     else if (input === '5') switchPeriodImmediate('all')
   })

+  const headerLabel = customRangeLabel ?? PERIOD_LABELS[period]
+
   if (loading) {
     return (
       <Box flexDirection="column" width={dashWidth}>
-        <PeriodTabs active={period} providerName={activeProvider} showProvider={view !== 'compare' && multipleProviders} />
+        {!isCustomRange && <PeriodTabs active={period} providerName={activeProvider} showProvider={view !== 'compare' && multipleProviders} />}
+        {isCustomRange && <CustomRangeBanner label={headerLabel} width={dashWidth} />}
        {view === 'compare'
          ? <Box flexDirection="column" paddingX={2} paddingY={1}>
              <Box flexDirection="column" borderStyle="round" borderColor={ORANGE} paddingX={1}>
                <Text bold color={ORANGE}>Model Comparison</Text>
                <Text> </Text>
-               <Text dimColor>Loading {PERIOD_LABELS[period]} model data...</Text>
+               <Text dimColor>Loading {headerLabel} model data...</Text>
              </Box>
            </Box>
-          : <Panel title="CodeBurn" color={ORANGE} width={dashWidth}><Text dimColor>Loading {PERIOD_LABELS[period]}...</Text></Panel>}
-        {view !== 'compare' && <StatusBar width={dashWidth} showProvider={multipleProviders} view={view} findingCount={0} optimizeAvailable={false} compareAvailable={false} />}
+          : <Panel title="CodeBurn" color={ORANGE} width={dashWidth}><Text dimColor>Loading {headerLabel}...</Text></Panel>}
+        {view !== 'compare' && <StatusBar width={dashWidth} showProvider={multipleProviders} view={view} findingCount={0} optimizeAvailable={false} compareAvailable={false} customRange={isCustomRange} />}
      </Box>
    )
  }

  return (
    <Box flexDirection="column" width={dashWidth}>
-      <PeriodTabs active={period} providerName={activeProvider} showProvider={multipleProviders && view !== 'compare'} />
+      {!isCustomRange && <PeriodTabs active={period} providerName={activeProvider} showProvider={multipleProviders && view !== 'compare'} />}
+      {isCustomRange && <CustomRangeBanner label={headerLabel} width={dashWidth} />}
      {view === 'compare'
        ? <CompareView projects={projects} onBack={() => setView('dashboard')} />
        : view === 'optimize' && optimizeResult
-          ? <OptimizeView findings={optimizeResult.findings} costRate={optimizeResult.costRate} projects={projects} label={PERIOD_LABELS[period]} width={dashWidth} healthScore={optimizeResult.healthScore} healthGrade={optimizeResult.healthGrade} />
+          ? <OptimizeView findings={optimizeResult.findings} costRate={optimizeResult.costRate} projects={projects} label={headerLabel} width={dashWidth} healthScore={optimizeResult.healthScore} healthGrade={optimizeResult.healthGrade} cursor={findingsCursor} />
          : <DashboardContent projects={projects} period={period} columns={columns} activeProvider={activeProvider} budgets={projectBudgets} planUsage={planUsage} />}
-      {view !== 'compare' && <StatusBar width={dashWidth} showProvider={multipleProviders} view={view} findingCount={findingCount} optimizeAvailable={optimizeAvailable} compareAvailable={compareAvailable} />}
+      {view !== 'compare' && <StatusBar width={dashWidth} showProvider={multipleProviders} view={view} findingCount={findingCount} optimizeAvailable={optimizeAvailable} compareAvailable={compareAvailable} customRange={isCustomRange} />}
    </Box>
  )
}

+function CustomRangeBanner({ label, width }: { label: string; width: number }) {
+  return (
+    <Box width={width} paddingX={1} marginBottom={1}>
+      <Text dimColor>Custom range: </Text>
+      <Text color={ORANGE} bold>{label}</Text>
+    </Box>
+  )
+}
@@ -824,7 +883,7 @@ function StaticDashboard({ projects, period, activeProvider, planUsage }: { proj
   )
 }

-export async function renderDashboard(period: Period = 'week', provider: string = 'all', refreshSeconds?: number, projectFilter?: string[], excludeFilter?: string[], customRange?: DateRange | null): Promise<void> {
+export async function renderDashboard(period: Period = 'week', provider: string = 'all', refreshSeconds?: number, projectFilter?: string[], excludeFilter?: string[], customRange?: DateRange | null, customRangeLabel?: string): Promise<void> {
   await loadPricing()
   const range = customRange ?? getPeriodRange(period)
   const filteredProjects = filterProjectsByName(await parseAllSessions(range, provider), projectFilter, excludeFilter)
@@ -833,7 +892,7 @@ export async function renderDashboard(period: Period = 'week', provider: string
   patchStdoutForWindows()
   if (isTTY) {
     const { waitUntilExit } = render(
-      <InteractiveDashboard initialProjects={filteredProjects} initialPeriod={period} initialProvider={provider} initialPlanUsage={planUsage ?? undefined} refreshSeconds={refreshSeconds} projectFilter={projectFilter} excludeFilter={excludeFilter} />
+      <InteractiveDashboard initialProjects={filteredProjects} initialPeriod={period} initialProvider={provider} initialPlanUsage={planUsage ?? undefined} refreshSeconds={refreshSeconds} projectFilter={projectFilter} excludeFilter={excludeFilter} customRange={customRange} customRangeLabel={customRangeLabel} />
     )
     await waitUntilExit()
   } else {
@@ -2,7 +2,7 @@ import { writeFile, mkdir, readdir, open, stat, rm } from 'fs/promises'
 import { dirname, join, resolve } from 'path'

 import { CATEGORY_LABELS, type ProjectSummary, type TaskCategory } from './types.js'
-import { getCurrency, convertCost } from './currency.js'
+import { getCurrency, convertCost, roundForActiveCurrency } from './currency.js'
 import { dateKey } from './day-aggregator.js'
 import { aggregateModelEfficiency } from './model-efficiency.js'
@@ -70,7 +70,7 @@ function buildDailyRows(projects: ProjectSummary[], period: string): Row[] {
   return Object.entries(daily).sort().map(([date, d]) => ({
     Period: period,
     Date: date,
-    [`Cost (${code})`]: round2(convertCost(d.cost)),
+    [`Cost (${code})`]: roundForActiveCurrency(convertCost(d.cost)),
     'API Calls': d.calls,
     Sessions: d.sessions.size,
     'Input Tokens': d.input,
@@ -98,7 +98,7 @@ function buildActivityRows(projects: ProjectSummary[], period: string): Row[] {
     .map(([cat, d]) => ({
       Period: period,
       Activity: CATEGORY_LABELS[cat as TaskCategory] ?? cat,
-      [`Cost (${code})`]: round2(convertCost(d.cost)),
+      [`Cost (${code})`]: roundForActiveCurrency(convertCost(d.cost)),
       'Share (%)': pct(d.cost, totalCost),
       Turns: d.turns,
     }))
@@ -130,14 +130,14 @@ function buildModelRows(projects: ProjectSummary[], period: string): Row[] {
     return {
       Period: period,
       Model: model,
-      [`Cost (${code})`]: round2(convertCost(d.cost)),
+      [`Cost (${code})`]: roundForActiveCurrency(convertCost(d.cost)),
       'Share (%)': pct(d.cost, totalCost),
       'API Calls': d.calls,
       'Edit Turns': efficiency?.editTurns ?? 0,
       'One-shot Rate (%)': efficiency?.oneShotRate ?? '',
       'Retries/Edit': efficiency?.retriesPerEdit ?? '',
       [`Cost/Edit (${code})`]: efficiency?.costPerEditUSD !== null && efficiency?.costPerEditUSD !== undefined
-        ? round2(convertCost(efficiency.costPerEditUSD))
+        ? roundForActiveCurrency(convertCost(efficiency.costPerEditUSD))
         : '',
       'Input Tokens': d.input,
       'Output Tokens': d.output,
@@ -193,8 +193,8 @@ function buildProjectRows(projects: ProjectSummary[]): Row[] {
     .sort((a, b) => b.totalCostUSD - a.totalCostUSD)
     .map(p => ({
       Project: p.projectPath,
-      [`Cost (${code})`]: round2(convertCost(p.totalCostUSD)),
-      [`Avg/Session (${code})`]: p.sessions.length > 0 ? round2(convertCost(p.totalCostUSD / p.sessions.length)) : '',
+      [`Cost (${code})`]: roundForActiveCurrency(convertCost(p.totalCostUSD)),
+      [`Avg/Session (${code})`]: p.sessions.length > 0 ? roundForActiveCurrency(convertCost(p.totalCostUSD / p.sessions.length)) : '',
       'Share (%)': pct(p.totalCostUSD, total),
       'API Calls': p.totalApiCalls,
       Sessions: p.sessions.length,
@@ -210,7 +210,7 @@ function buildSessionRows(projects: ProjectSummary[]): Row[] {
         Project: p.projectPath,
         'Session ID': s.sessionId,
         'Started At': s.firstTimestamp ?? '',
-        [`Cost (${code})`]: round2(convertCost(s.totalCostUSD)),
+        [`Cost (${code})`]: roundForActiveCurrency(convertCost(s.totalCostUSD)),
         'API Calls': s.apiCalls,
         Turns: s.turns.length,
       })
@@ -233,7 +233,7 @@ function buildSummaryRows(periods: PeriodExport[]): Row[] {
     const projectCount = p.projects.filter(proj => proj.totalCostUSD > 0).length
     return {
       Period: p.label,
-      [`Cost (${code})`]: round2(convertCost(cost)),
+      [`Cost (${code})`]: roundForActiveCurrency(convertCost(cost)),
       'API Calls': calls,
       Sessions: sessions,
       Projects: projectCount,
@@ -65,13 +65,29 @@ function getCachePath(): string {
   return join(getCacheDir(), 'litellm-pricing.json')
 }

+/// Clamp a per-token rate to a sane non-negative value. Defense in depth
+/// against a tampered LiteLLM JSON shipping a negative `input_cost_per_token`,
+/// which would otherwise produce negative costs that subtract from totals.
+/// We use Number.isFinite to also reject NaN/Infinity, and cap at $1/token
+/// (well above the most expensive frontier model) so a stray decimal-place
+/// shift in the upstream JSON can't wildly inflate spend numbers either.
+function safePerTokenRate(n: number | undefined): number | null {
+  if (n === undefined || !Number.isFinite(n) || n < 0) return null
+  if (n > 1) return 1
+  return n
+}
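For reviewers, the clamp's contract is easy to exercise in isolation. A minimal standalone restatement of the same logic (mirroring the patch, not importing it):

```typescript
// Standalone restatement of safePerTokenRate's contract, for illustration only.
function safePerTokenRate(n: number | undefined): number | null {
  if (n === undefined || !Number.isFinite(n) || n < 0) return null // reject missing, NaN, Infinity, negative
  if (n > 1) return 1 // cap at $1/token against decimal-place-shifted upstream data
  return n
}

console.log(safePerTokenRate(-0.000003)) // null (negative rate rejected)
console.log(safePerTokenRate(Number.NaN)) // null
console.log(safePerTokenRate(15))         // 1 (clamped)
console.log(safePerTokenRate(0.000003))   // 0.000003 (sane rate passes through)
```

Rejected rates return `null` rather than `0` so `parseLiteLLMEntry` can drop the whole entry instead of silently pricing it as free.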

 function parseLiteLLMEntry(entry: LiteLLMEntry): ModelCosts | null {
   if (entry.input_cost_per_token === undefined || entry.output_cost_per_token === undefined) return null
+  const inputCost = safePerTokenRate(entry.input_cost_per_token)
+  const outputCost = safePerTokenRate(entry.output_cost_per_token)
+  if (inputCost === null || outputCost === null) return null
+  const cacheWrite = safePerTokenRate(entry.cache_creation_input_token_cost) ?? inputCost * 1.25
+  const cacheRead = safePerTokenRate(entry.cache_read_input_token_cost) ?? inputCost * 0.1
   return {
-    inputCostPerToken: entry.input_cost_per_token,
-    outputCostPerToken: entry.output_cost_per_token,
-    cacheWriteCostPerToken: entry.cache_creation_input_token_cost ?? entry.input_cost_per_token * 1.25,
-    cacheReadCostPerToken: entry.cache_read_input_token_cost ?? entry.input_cost_per_token * 0.1,
+    inputCostPerToken: inputCost,
+    outputCostPerToken: outputCost,
+    cacheWriteCostPerToken: cacheWrite,
+    cacheReadCostPerToken: cacheRead,
     webSearchCostPerRequest: WEB_SEARCH_COST,
     fastMultiplier: entry.provider_specific_entry?.fast ?? 1,
   }

@@ -203,7 +203,13 @@ function createParser(source: SessionSource, seenKeys: Set<string>): SessionPars

   let sessionModel: string | undefined
   let sessionId = ''
-  let prevCumulativeTotal = 0
+  // Null sentinel rather than `0` so the FIRST event is never confused
+  // with a duplicate. A session that only emits last_token_usage (no
+  // total_token_usage) reports cumulativeTotal=0 on every event; with a
+  // 0-initialized prev, the first event would have matched and been
+  // dropped. Once we've observed any event, we record its cumulative
+  // total and dedup on equality regardless of whether it is zero.
+  let prevCumulativeTotal: number | null = null
   let prevInput = 0
   let prevCached = 0
   let prevOutput = 0
@@ -315,7 +321,12 @@ function createParser(source: SessionSource, seenKeys: Set<string>): SessionPars
   }

   const cumulativeTotal = info.total_token_usage?.total_tokens ?? 0
-  if (cumulativeTotal > 0 && cumulativeTotal === prevCumulativeTotal) continue
+  // Dedup guard. Two consecutive events with cumulativeTotal=0 but
+  // non-empty last_token_usage would have been double-counted with
+  // the previous `> 0` clause. The null sentinel ensures the FIRST
+  // event always passes (so a session that never reports cumulative
+  // doesn't lose its opening turn).
+  if (prevCumulativeTotal !== null && cumulativeTotal === prevCumulativeTotal) continue
   prevCumulativeTotal = cumulativeTotal

   const last = info.last_token_usage

@@ -66,11 +66,19 @@ type LegacyToolRequest = {
   type?: string
 }

+// Per-event-type shapes. The previous union included a permissive catch-all
+// branch (`{ type: string; data: Record<string, unknown> }`); a literal type
+// like `'user.message'` is assignable to `string`, so TS picked the catch-all
+// over the specific branches when narrowing on `type`, which propagated
+// `unknown`/`{}` into `event.data.content` etc. We now keep only the three
+// shapes we actually read from. Unknown event types fall through the if/else
+// chain without further narrowing — they are not in the union, but JSON.parse
+// returns `any` so we re-type as LegacyCopilotEvent and let the runtime type
+// guards (`event.type === 'X'`) ignore anything else.
 type LegacyCopilotEvent =
-  | { type: 'session.model_change'; timestamp?: string; data: { newModel: string } }
-  | { type: 'user.message'; timestamp?: string; data: { content: string; interactionId?: string } }
-  | { type: 'assistant.message'; timestamp?: string; data: { messageId: string; outputTokens: number; interactionId?: string; toolRequests?: LegacyToolRequest[] } }
-  | { type: string; timestamp?: string; data: Record<string, unknown> }
+  | { type: 'session.model_change'; timestamp?: string; data: { newModel: string; model?: string } }
+  | { type: 'user.message'; timestamp?: string; data: { content: string; interactionId?: string; model?: string } }
+  | { type: 'assistant.message'; timestamp?: string; data: { messageId: string; outputTokens: number; interactionId?: string; toolRequests?: LegacyToolRequest[]; model?: string } }

 function parseLegacyEvents(content: string, sessionId: string, seenKeys: Set<string>): ParsedProviderCall[] {
   const results: ParsedProviderCall[] = []
@@ -103,7 +111,7 @@ function parseLegacyEvents(content: string, sessionId: string, seenKeys: Set<str
     }

     if (event.type === 'assistant.message') {
-      const { messageId, outputTokens, toolRequests = [] } = event.data
+      const { messageId, outputTokens, toolRequests: rawToolRequests } = event.data
       if (outputTokens === 0) continue
       if (!currentModel) continue

@@ -111,6 +119,11 @@ function parseLegacyEvents(content: string, sessionId: string, seenKeys: Set<str
       if (seenKeys.has(dedupKey)) continue
       seenKeys.add(dedupKey)

+      // Defensive: legacy / corrupt sessions have shipped toolRequests as a
+      // string, null, or missing. Without this guard, .map throws and aborts
+      // the whole file's parse loop, silently dropping every legitimate call
+      // that follows the bad event.
+      const toolRequests = Array.isArray(rawToolRequests) ? rawToolRequests : []
       const tools = toolRequests
         .map(t => t.name ?? '')
         .filter(Boolean)
@@ -242,7 +255,10 @@ function parseTranscriptEvents(content: string, sessionId: string, seenKeys: Set

     const inputTokens = Math.ceil(pendingUserMessage.length / CHARS_PER_TOKEN)

-    const tools = (data.toolRequests ?? [])
+    // Same defensive guard as the modern event branch — corrupt legacy
+    // sessions have shipped toolRequests as non-array values.
+    const legacyToolRequests = Array.isArray(data.toolRequests) ? data.toolRequests : []
+    const tools = legacyToolRequests
       .map(t => t.name ?? '')
       .filter(Boolean)
       .map(n => toolNameMap[n] ?? n)

@@ -140,10 +140,16 @@ const USER_MESSAGES_QUERY = `
   ORDER BY ROWID ASC
 `

-const BUBBLE_QUERY_SINCE = BUBBLE_QUERY_BASE + `
-  AND (json_extract(value, '$.createdAt') > ? OR json_extract(value, '$.createdAt') IS NULL)
+// Split into HEAD (predicates we always emit) and TAIL (ORDER BY) so the
+// caller can splice in an optional `ROWID >= ?` cutoff without rewriting
+// the whole template. The original combined string is preserved as
+// BUBBLE_QUERY_SINCE for any caller that doesn't want the cap.
+const BUBBLE_QUERY_SINCE_HEAD = BUBBLE_QUERY_BASE + `
+  AND (json_extract(value, '$.createdAt') > ? OR json_extract(value, '$.createdAt') IS NULL)`
+const BUBBLE_QUERY_SINCE_TAIL = `
+  ORDER BY ROWID ASC
+`
+const BUBBLE_QUERY_SINCE = BUBBLE_QUERY_SINCE_HEAD + BUBBLE_QUERY_SINCE_TAIL

 function validateSchema(db: SqliteDatabase): boolean {
   try {
@@ -158,20 +164,40 @@ function validateSchema(db: SqliteDatabase): boolean {

 type UserMsgRow = { conversation_id: string; created_at: string; text: string }

-function buildUserMessageMap(db: SqliteDatabase, timeFloor: string): Map<string, string[]> {
-  const map = new Map<string, string[]>()
+/// Per-conversation user-message buffer. We pop messages in arrival order via
+/// the `pos` cursor — a previous implementation called Array.shift(), which is
+/// O(n) per call on large conversations and pinned multi-GB Cursor DBs at
+/// minutes of parse time for power users. The cursor walk is O(1).
+type UserMessageQueue = {
+  messages: string[]
+  pos: number
+}
+
+function buildUserMessageMap(db: SqliteDatabase, timeFloor: string): Map<string, UserMessageQueue> {
+  const map = new Map<string, UserMessageQueue>()
   try {
     const rows = db.query<UserMsgRow>(USER_MESSAGES_QUERY, [timeFloor])
     for (const row of rows) {
       if (!row.conversation_id || !row.text) continue
-      const existing = map.get(row.conversation_id) ?? []
-      existing.push(row.text)
-      map.set(row.conversation_id, existing)
+      const existing = map.get(row.conversation_id)
+      if (existing) {
+        existing.messages.push(row.text)
+      } else {
+        map.set(row.conversation_id, { messages: [row.text], pos: 0 })
+      }
     }
   } catch {}
   return map
 }

+function takeUserMessage(queues: Map<string, UserMessageQueue>, conversationId: string): string {
+  const queue = queues.get(conversationId)
+  if (!queue || queue.pos >= queue.messages.length) return ''
+  const msg = queue.messages[queue.pos]
+  queue.pos += 1
+  return msg
+}
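The O(1) cursor walk that replaces `Array.shift()` can be shown in a self-contained sketch (the queue shape mirrors the patch; `take` is a simplified stand-in for `takeUserMessage`):

```typescript
// Array.shift() moves every remaining element on each call (O(n) per pop);
// advancing an index leaves the array untouched (O(1) per pop).
type UserMessageQueue = { messages: string[]; pos: number }

function take(queue: UserMessageQueue): string {
  if (queue.pos >= queue.messages.length) return ''
  const msg = queue.messages[queue.pos]
  queue.pos += 1
  return msg
}

const q: UserMessageQueue = { messages: ['first', 'second'], pos: 0 }
console.log(take(q)) // "first"
console.log(take(q)) // "second"
console.log(take(q)) // "" (drained queues return empty string, matching takeUserMessage)
```

The trade-off is that consumed strings stay referenced until the whole map is dropped, which is fine here since the map lives only for one parse pass.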

 function parseBubbles(db: SqliteDatabase, seenKeys: Set<string>): { calls: ParsedProviderCall[] } {
   const results: ParsedProviderCall[] = []
   let skipped = 0
@@ -179,11 +205,53 @@ function parseBubbles(db: SqliteDatabase, seenKeys: Set<string>): { calls: Parse
   const LOOKBACK_DAYS = 180
   const timeFloor = new Date(Date.now() - LOOKBACK_DAYS * 24 * 60 * 60 * 1000).toISOString()

+  // Hard cap on rows to scan. The BUBBLE_QUERY_SINCE filter relies on
+  // json_extract over the value BLOB, which SQLite cannot serve from an
+  // index — every row is JSON-decoded. Multi-GB Cursor DBs (power users,
+  // years of usage) regularly exceed 500k bubble rows and were producing
+  // 30s+ parse stalls. Compute a ROWID cutoff that limits the scan to the
+  // MAX_BUBBLES most-recent bubbles when the user is over the cap, and
+  // warn so they know older sessions may be missing.
+  const MAX_BUBBLES = 250_000
+  let rowIdCutoff = 0
+  try {
+    const countRows = db.query<{ cnt: number }>(
+      "SELECT COUNT(*) as cnt FROM cursorDiskKV WHERE key LIKE 'bubbleId:%'"
+    )
+    const total = countRows[0]?.cnt ?? 0
+    if (total > MAX_BUBBLES) {
+      // Find the ROWID of the (MAX_BUBBLES)th most-recent bubble. Anything
+      // below this rowid is older and gets skipped. Bubbles are written
+      // chronologically so ROWID order ≈ insertion order.
+      const cutoffRows = db.query<{ rid: number }>(
+        `SELECT MIN(rid) as rid FROM (
+          SELECT ROWID as rid FROM cursorDiskKV
+          WHERE key LIKE 'bubbleId:%'
+          ORDER BY ROWID DESC
+          LIMIT ?
+        )`,
+        [MAX_BUBBLES]
+      )
+      rowIdCutoff = cutoffRows[0]?.rid ?? 0
+      process.stderr.write(
+        `codeburn: Cursor database has ${total.toLocaleString()} bubbles, ` +
+        `scanning the most recent ${MAX_BUBBLES.toLocaleString()}. ` +
+        `Older sessions may be missing from this report.\n`
+      )
+    }
+  } catch { /* best-effort diagnostic */ }

   const userMessages = buildUserMessageMap(db, timeFloor)

+  // Append the rowid cutoff when active. Empty string when not capped so the
+  // query string compares identically to the un-capped version on small DBs.
+  const rowIdFilter = rowIdCutoff > 0 ? ' AND ROWID >= ?' : ''
+  const params: unknown[] = rowIdCutoff > 0 ? [timeFloor, rowIdCutoff] : [timeFloor]
+  const cappedQuery = BUBBLE_QUERY_SINCE_HEAD + rowIdFilter + BUBBLE_QUERY_SINCE_TAIL

   let rows: BubbleRow[]
   try {
-    rows = db.query<BubbleRow>(BUBBLE_QUERY_SINCE, [timeFloor])
+    rows = db.query<BubbleRow>(cappedQuery, params)
   } catch {
     return { calls: results }
   }
@@ -222,8 +290,7 @@ function parseBubbles(db: SqliteDatabase, seenKeys: Set<string>): { calls: Parse
     const costUSD = calculateCost(pricingModel, inputTokens, outputTokens, 0, 0, 0)

     const timestamp = createdAt || new Date().toISOString()
-    const convMessages = userMessages.get(conversationId) ?? []
-    const userQuestion = convMessages.length > 0 ? convMessages.shift()! : ''
+    const userQuestion = takeUserMessage(userMessages, conversationId)
     const assistantText = row.user_text ?? ''
     const userText = (userQuestion + ' ' + assistantText).trim()

@@ -92,18 +92,42 @@ function parseTimestamp(raw: number): string {
   return new Date(ms).toISOString()
 }

-function validateSchema(db: SqliteDatabase): boolean {
-  try {
-    db.query<{ cnt: number }>(
-      "SELECT COUNT(*) as cnt FROM session LIMIT 1"
-    )
-    db.query<{ cnt: number }>(
-      "SELECT COUNT(*) as cnt FROM message LIMIT 1"
-    )
-    return true
-  } catch {
-    return false
+type SchemaCheckResult =
+  | { ok: true }
+  | { ok: false; missing: string[] }
+
+/// Inspects OpenCode's SQLite schema. Returns the list of expected tables that
+/// are missing rather than just a boolean so the caller can produce an actionable
+/// warning ("missing 'part' table") instead of a generic "format not recognized".
+/// Only emits the warning when meaningful tables are absent — a brand-new
+/// OpenCode install with an empty DB but valid schema does NOT trigger it.
+function validateSchemaDetailed(db: SqliteDatabase): SchemaCheckResult {
+  const required = ['session', 'message', 'part']
+  const missing: string[] = []
+  for (const table of required) {
+    try {
+      db.query<{ cnt: number }>(`SELECT COUNT(*) as cnt FROM ${table} LIMIT 1`)
+    } catch {
+      missing.push(table)
+    }
+  }
+  return missing.length === 0 ? { ok: true } : { ok: false, missing }
+}
+
+function validateSchema(db: SqliteDatabase): boolean {
+  return validateSchemaDetailed(db).ok
+}
+
+const warnedOpenCodeSchemas = new Set<string>()
+
+function warnUnrecognizedOpenCodeSchemaOnce(missing: string[]): void {
+  const key = missing.slice().sort().join(',')
+  if (warnedOpenCodeSchemas.has(key)) return
+  warnedOpenCodeSchemas.add(key)
+  process.stderr.write(
+    `codeburn: OpenCode database is missing expected tables (${missing.join(', ')}). ` +
+    `Run OpenCode once to apply migrations, or report at https://github.com/getagentseal/codeburn/issues if this persists on a current OpenCode install.\n`
+  )
 }

 function createParser(
@@ -133,8 +157,14 @@ function createParser(
   }

   try {
-    if (!validateSchema(db)) {
-      process.stderr.write('codeburn: OpenCode storage format not recognized. You may need to update CodeBurn.\n')
+    const schema = validateSchemaDetailed(db)
+    if (!schema.ok) {
+      // Warn at most once per process per missing-table set so a directory
+      // with a half-migrated OpenCode DB doesn't spam stderr on every
+      // session iteration. Show which tables we couldn't find so the
+      // user (or a triage agent) knows whether to re-run OpenCode's
+      // migration or report a CodeBurn schema gap.
+      warnUnrecognizedOpenCodeSchemaOnce(schema.missing)
       return
     }

tests/currency-rounding.test.ts (new file, 104 lines)
@@ -0,0 +1,104 @@
+import { describe, it, expect, beforeEach, afterEach } from 'vitest'
+import { convertCost, roundForActiveCurrency, getFractionDigits } from '../src/currency.js'
+import * as currencyMod from '../src/currency.js'
+
+// switchCurrency does network + persistence, so these unit tests mutate the
+// module-level state returned by getCurrency() directly instead. convertCost
+// and the rounding helpers are pure functions of that state, which is all we
+// exercise here. Each test restores USD afterwards so state doesn't bleed.
+async function setActive(code: string, rate: number): Promise<void> {
+  const state = currencyMod.getCurrency()
+  // @ts-expect-error — directly mutating for test
+  state.code = code
+  // @ts-expect-error
+  state.rate = rate
+  // @ts-expect-error
+  state.symbol = code
+}
+
+beforeEach(async () => {
+  await setActive('USD', 1)
+})
+
+afterEach(async () => {
+  await setActive('USD', 1)
+})
+
+describe('convertCost — no rounding contract', () => {
+  it('returns unrounded float for USD (rate=1)', () => {
+    expect(convertCost(1.234567)).toBe(1.234567)
+    expect(convertCost(0.001)).toBe(0.001)
+  })
+
+  it('returns unrounded float for non-USD currencies', async () => {
+    await setActive('JPY', 150)
+    // A fractional input must NOT be rounded by convertCost: 0.123456 * 150 = 18.5184.
+    expect(convertCost(0.123456)).toBeCloseTo(18.5184, 4)
+    expect(convertCost(1.5)).toBe(225)
+  })
+
+  it('rounding is the caller\'s responsibility (display vs export)', async () => {
+    // Regression guard: previously convertCost did its own rounding which
+    // produced ¥412.37 in CSV exports while the dashboard rendered ¥412.
+    // Confirm we now return the raw value and the caller decides.
+    await setActive('JPY', 150)
+    const raw = convertCost(2.7491)
+    expect(raw).toBe(412.365) // unrounded
+    expect(roundForActiveCurrency(raw)).toBe(412) // currency-aware rounding for export
+  })
+})
+
+describe('roundForActiveCurrency', () => {
+  it('USD rounds to 2 decimals', async () => {
+    await setActive('USD', 1)
+    expect(roundForActiveCurrency(1.2345)).toBe(1.23)
+    expect(roundForActiveCurrency(1.235)).toBeCloseTo(1.24, 2)
+    expect(roundForActiveCurrency(0.005)).toBe(0.01)
+  })
+
+  it('JPY rounds to whole numbers', async () => {
+    await setActive('JPY', 150)
+    expect(roundForActiveCurrency(412.37)).toBe(412)
+    expect(roundForActiveCurrency(412.5)).toBe(413)
+    expect(roundForActiveCurrency(0.4)).toBe(0)
+  })
+
+  it('KRW rounds to whole numbers', async () => {
+    await setActive('KRW', 1300)
+    expect(roundForActiveCurrency(15999.7)).toBe(16000)
+  })
+
+  it('EUR rounds to 2 decimals like USD', async () => {
+    await setActive('EUR', 0.92)
+    expect(roundForActiveCurrency(1.2345)).toBe(1.23)
+  })
+
+  it('matches the display contract: roundForActiveCurrency(convertCost(x)) is what users see', async () => {
+    await setActive('JPY', 150)
+    // Dashboard displays via formatCost which uses getFractionDigits=0 for JPY.
+    // CSV exports must produce the same integer value, not a 2-decimal float.
+    expect(roundForActiveCurrency(convertCost(2.75))).toBe(413)
+    expect(roundForActiveCurrency(convertCost(2.745))).toBe(412)
+  })
+})
+
+describe('getFractionDigits', () => {
+  it('returns 0 for zero-fraction currencies', () => {
+    expect(getFractionDigits('JPY')).toBe(0)
+    expect(getFractionDigits('KRW')).toBe(0)
+    expect(getFractionDigits('CLP')).toBe(0)
+  })
+
+  it('returns 2 for typical currencies', () => {
+    expect(getFractionDigits('USD')).toBe(2)
+    expect(getFractionDigits('EUR')).toBe(2)
+    expect(getFractionDigits('GBP')).toBe(2)
+    expect(getFractionDigits('INR')).toBe(2)
+  })
+})
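The PR describes roundForActiveCurrency as consulting Intl.NumberFormat for each currency's natural precision. A minimal sketch of that approach (an illustration under that assumption, not the project's actual implementation):

```typescript
// Sketch: derive a currency's natural decimal places from Intl.NumberFormat.
// JPY/KRW/CLP resolve to 0 fraction digits, USD/EUR/GBP to 2.
function fractionDigitsFor(currency: string): number {
  return new Intl.NumberFormat('en-US', { style: 'currency', currency })
    .resolvedOptions().maximumFractionDigits ?? 2
}

// A currency-aware round then becomes:
function roundForCurrency(value: number, currency: string): number {
  const factor = 10 ** fractionDigitsFor(currency)
  return Math.round(value * factor) / factor
}

console.log(fractionDigitsFor('JPY')) // 0
console.log(fractionDigitsFor('USD')) // 2
console.log(roundForCurrency(412.365, 'JPY')) // 412
```

Leaning on Intl means the zero-fraction list never has to be hand-maintained; the runtime's CLDR data already knows it.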

@@ -56,6 +56,24 @@ describe('parseDateRangeFlags', () => {
       .toThrow('Invalid date format')
   })

+  it('rejects month/day overflow instead of silently rolling forward', () => {
+    // Without overflow validation, JS Date silently turns Feb 31 into Mar 3
+    // and 13/32 into 02/01 of the following year. That made `--from
+    // 2026-02-31 --to 2026-03-15` quietly drop sessions on Feb 28 - Mar 2.
+    expect(() => parseDateRangeFlags('2026-02-31', '2026-03-15'))
+      .toThrow('Invalid date "2026-02-31"')
+    expect(() => parseDateRangeFlags('2026-13-01', undefined))
+      .toThrow('Invalid date "2026-13-01"')
+    expect(() => parseDateRangeFlags('2026-04-31', undefined))
+      .toThrow('Invalid date "2026-04-31"')
+    expect(() => parseDateRangeFlags(undefined, '2026-02-30'))
+      .toThrow('Invalid date "2026-02-30"')
+    // Leap-day check: 2024 is a leap year, 2025 is not.
+    expect(parseDateRangeFlags('2024-02-29', '2024-03-01')).not.toBeNull()
+    expect(() => parseDateRangeFlags('2025-02-29', undefined))
+      .toThrow('Invalid date "2025-02-29"')
+  })
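The overflow rejection these tests exercise can be sketched standalone with a round-trip check (an illustrative re-derivation; parseLocalDate's real implementation may differ):

```typescript
// JS Date rolls overflow forward (Feb 31 -> Mar 3), so after constructing the
// Date we confirm it round-trips to the same year/month/day; any component
// mismatch means the input was out of range.
function parseStrictLocalDate(raw: string): Date {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(raw)
  if (!m) throw new Error(`Invalid date format "${raw}"`)
  const y = Number(m[1]), mo = Number(m[2]), d = Number(m[3])
  const date = new Date(y, mo - 1, d)
  if (date.getFullYear() !== y || date.getMonth() !== mo - 1 || date.getDate() !== d) {
    throw new Error(`Invalid date "${raw}"`)
  }
  return date
}

parseStrictLocalDate('2024-02-29') // valid: leap year
// parseStrictLocalDate('2026-02-31') // throws: Invalid date "2026-02-31"
```

The round-trip trick also catches month 13 (rolled into January of the next year) without any explicit per-month day tables.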

   it('same day is valid (start midnight, end 23:59:59)', () => {
     const range = parseDateRangeFlags('2026-04-10', '2026-04-10')
     expect(range).not.toBeNull()
@@ -310,4 +310,65 @@ describe('codex provider - JSONL parsing', () => {
     expect(calls[0]!.inputTokens).toBe(500)
     expect(calls[1]!.inputTokens).toBe(300)
   })
+
+  it('does not drop the first event when total_token_usage is omitted (cumulativeTotal=0)', async () => {
+    // Regression for the prevCumulativeTotal-initialized-to-0 bug. Sessions
+    // that emit only last_token_usage (no total_token_usage) report
+    // cumulativeTotal=0 on every event. With a 0-initialized prev, the first
+    // event matched the dedup guard and was silently dropped, losing the
+    // session's opening turn. The null sentinel fixes this.
+    const filePath = await writeSession(tmpDir, '2026-04-14', 'rollout-zero-total.jsonl', [
+      sessionMeta(),
+      tokenCount({
+        timestamp: '2026-04-14T10:01:00Z',
+        last: { input: 500, output: 200 },
+        // No `total` — info.total_token_usage will be undefined.
+      }),
+      tokenCount({
+        timestamp: '2026-04-14T10:01:01Z',
+        last: { input: 100, output: 50 },
+      }),
+    ])
+
+    const provider = createCodexProvider(tmpDir)
+    const source = { path: filePath, project: 'test', provider: 'codex' }
+    const parser = provider.createSessionParser(source, new Set())
+    const calls: ParsedProviderCall[] = []
+    for await (const call of parser.parse()) {
+      calls.push(call)
+    }
+
+    // At minimum the FIRST event must survive with input=500; with the buggy
+    // 0-init it was the one silently dropped.
+    expect(calls.length).toBeGreaterThanOrEqual(1)
+    expect(calls[0]!.inputTokens).toBe(500)
+  })
+
+  it('still dedups consecutive zero-cumulative duplicates', async () => {
+    // The other half of the regression: two consecutive events with the
+    // same cumulativeTotal (here both 0 because total_token_usage is
+    // omitted) and identical last_token_usage must NOT both ingest. The
+    // second is a duplicate.
+    const filePath = await writeSession(tmpDir, '2026-04-14', 'rollout-zero-dup.jsonl', [
+      sessionMeta(),
+      tokenCount({
+        timestamp: '2026-04-14T10:01:00Z',
+        last: { input: 500, output: 200 },
+      }),
+      tokenCount({
+        timestamp: '2026-04-14T10:01:01Z',
+        last: { input: 500, output: 200 },
+      }),
+    ])
+
+    const provider = createCodexProvider(tmpDir)
+    const source = { path: filePath, project: 'test', provider: 'codex' }
+    const parser = provider.createSessionParser(source, new Set())
+    const calls: ParsedProviderCall[] = []
+    for await (const call of parser.parse()) {
+      calls.push(call)
+    }
+    expect(calls).toHaveLength(1)
+  })
+})
@ -126,6 +126,47 @@ describe('copilot provider - JSONL parsing', () => {
|
|||
    expect(calls[0]!.tools).toEqual(['Bash', 'Read', 'Edit'])
  })

  it('does not crash on malformed toolRequests (string / null / missing)', async () => {
    // Regression guard: a corrupt session previously aborted the whole file's
    // parse loop because .map was called on a non-array. The fix coerces any
    // non-array shape (string, null, missing) to []. We mix two corrupt events
    // between two healthy events and assert both healthy events still parse.
    const corruptToolRequestsString = JSON.stringify({
      type: 'assistant.message',
      timestamp: '2026-04-15T10:00:15Z',
      data: { messageId: 'corrupt-string', outputTokens: 50, toolRequests: 'not an array' },
    })
    const corruptToolRequestsNull = JSON.stringify({
      type: 'assistant.message',
      timestamp: '2026-04-15T10:00:16Z',
      data: { messageId: 'corrupt-null', outputTokens: 50, toolRequests: null },
    })
    const eventsPath = await createSessionDir('sess-corrupt', [
      modelChange('gpt-4.1'),
      assistantMessage({ messageId: 'msg-before', outputTokens: 100 }),
      corruptToolRequestsString,
      corruptToolRequestsNull,
      assistantMessage({ messageId: 'msg-after', outputTokens: 200 }),
    ])

    const source = { path: eventsPath, project: 'test', provider: 'copilot' }
    const calls: ParsedProviderCall[] = []
    for await (const call of copilot.createSessionParser(source, new Set()).parse()) calls.push(call)

    // The healthy messages BEFORE and AFTER the corrupt events both parse —
    // proving that the corrupt event no longer aborts the per-file parse loop.
    // Pre-fix, .map on a non-array threw and we'd see < 4 calls.
    expect(calls).toHaveLength(4)
    expect(calls.find(c => c.outputTokens === 100)).toBeDefined() // msg-before
    expect(calls.find(c => c.outputTokens === 200)).toBeDefined() // msg-after
    // Corrupt events produce calls with empty tools, not crashes.
    const corruptCalls = calls.filter(c => c.outputTokens === 50)
    expect(corruptCalls.length).toBe(2)
    for (const c of corruptCalls) {
      expect(c.tools).toEqual([])
    }
  })

  it('skips assistant messages with zero outputTokens', async () => {
    const eventsPath = await createSessionDir('sess-004', [
      modelChange('gpt-4.1'),
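The coercion the malformed-toolRequests test above exercises amounts to a one-line `Array.isArray` guard before the `.map`. A standalone sketch, with the caveat that `extractTools` and the event shape are illustrative, not the provider's real API:

```typescript
// Illustrative sketch of the Array.isArray guard: any non-array
// toolRequests shape (string, null, undefined, object) coerces to [],
// so one corrupt event can no longer abort the per-file parse loop.
type RawEvent = { toolRequests?: unknown }

function extractTools(event: RawEvent): string[] {
  const reqs = Array.isArray(event.toolRequests) ? event.toolRequests : []
  // .map is now always called on a real array; corrupt shapes yield [].
  return reqs.map(r => String((r as { name?: unknown }).name ?? 'unknown'))
}
```

The pre-fix behavior was equivalent to calling `.map` on `event.toolRequests` directly, which throws on a string or null and tears down the whole file's parse loop.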

tests/providers/cursor-bubble-dedup.test.ts (176 lines, new file)
@@ -0,0 +1,176 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest'
import { mkdtemp, rm, writeFile } from 'fs/promises'
import { tmpdir } from 'os'
import { join } from 'path'

import { isSqliteAvailable, openDatabase } from '../../src/sqlite.js'
import { getAllProviders } from '../../src/providers/index.js'
import type { Provider, ParsedProviderCall } from '../../src/providers/types.js'

/// Pinned regression for the v3 bubble-dedup fix. The previous (v2) code used
/// the bubble row's mutable token counts as part of the deduplication key, so
/// the same bubble was counted twice once Cursor wrote the streaming-complete
/// final token totals on top of the streaming-in-progress row. v3 switched to
/// the SQLite primary `key` column (which is the stable bubbleId:<id>:<id>
/// path) so re-parsing the same DB after token updates produces zero new
/// calls. This test:
/// 1. Builds a tmp SQLite DB with the cursorDiskKV schema and one bubble row
///    with low token counts (the streaming-in-progress shape).
/// 2. Parses it through the cursor provider. Asserts one call.
/// 3. Mutates the row in place to higher token counts (the streaming-complete
///    shape) without changing the SQLite key.
/// 4. Re-parses with the SAME seenKeys set. Asserts zero new calls.
/// If a future refactor brings back token-count-based dedup, the second parse
/// will produce a duplicate call and this test will fail.

const skipReason = isSqliteAvailable()
  ? null
  : 'node:sqlite not available — needs Node 22+; skipping'

let tmpDir: string

beforeEach(async () => {
  tmpDir = await mkdtemp(join(tmpdir(), 'cursor-dedup-'))
})

afterEach(async () => {
  await rm(tmpDir, { recursive: true, force: true })
})

function buildBubbleValue(opts: {
  conversationId: string
  text: string
  inputTokens: number
  outputTokens: number
  type: 1 | 2
  createdAt?: string
}): string {
  return JSON.stringify({
    type: opts.type,
    conversationId: opts.conversationId,
    text: opts.text,
    tokenCount: {
      inputTokens: opts.inputTokens,
      outputTokens: opts.outputTokens,
    },
    createdAt: opts.createdAt ?? new Date().toISOString(),
    modelId: 'gpt-5',
    capabilityType: 'composer',
  })
}

async function createCursorTestDb(): Promise<string> {
  // Cursor uses a non-extension state DB filename (state.vscdb in the real app);
  // any path works for openDatabase as long as we set up the schema and the
  // directory layout the parser expects. The parser only checks the DB
  // contents — discovery is bypassed because we hand it the path directly.
  const dbPath = join(tmpDir, 'state.vscdb')
  await writeFile(dbPath, '')
  // Use the underlying node:sqlite to create the schema.
  // We need cursorDiskKV with key + value columns.
  const Module = await import('node:module')
  const requireForSqlite = Module.createRequire(import.meta.url)
  const { DatabaseSync } = requireForSqlite('node:sqlite') as {
    DatabaseSync: new (path: string) => {
      exec(sql: string): void
      prepare(sql: string): { run(...p: unknown[]): unknown }
      close(): void
    }
  }
  const db = new DatabaseSync(dbPath)
  db.exec('CREATE TABLE cursorDiskKV (key TEXT PRIMARY KEY, value TEXT)')

  // Single assistant bubble (type=2). The parser yields one ParsedProviderCall
  // per bubbleId:% row, so a multi-row fixture would muddy the dedup count;
  // we keep the test surface minimal — one bubble through one parse, then
  // the same bubble again after token mutation.
  const bubbleKey = 'bubbleId:abc-123:bubble-xyz'
  db.prepare('INSERT INTO cursorDiskKV (key, value) VALUES (?, ?)').run(
    bubbleKey,
    buildBubbleValue({
      conversationId: 'abc-123',
      text: 'def hello(): pass',
      inputTokens: 100,
      outputTokens: 20,
      type: 2,
    })
  )

  db.close()
  return dbPath
}

async function updateAssistantBubbleTokens(dbPath: string, inputTokens: number, outputTokens: number): Promise<void> {
  const Module = await import('node:module')
  const requireForSqlite = Module.createRequire(import.meta.url)
  const { DatabaseSync } = requireForSqlite('node:sqlite') as {
    DatabaseSync: new (path: string) => {
      prepare(sql: string): { run(...p: unknown[]): unknown }
      close(): void
    }
  }
  const db = new DatabaseSync(dbPath)
  db.prepare('UPDATE cursorDiskKV SET value = ? WHERE key = ?').run(
    buildBubbleValue({
      conversationId: 'abc-123',
      text: 'def hello(): pass',
      inputTokens,
      outputTokens,
      type: 2,
    }),
    'bubbleId:abc-123:bubble-xyz'
  )
  db.close()
}

async function getCursorProvider(): Promise<Provider> {
  const all = await getAllProviders()
  const p = all.find(p => p.name === 'cursor')
  if (!p) throw new Error('cursor provider not registered')
  return p
}

describe.skipIf(skipReason !== null)('cursor bubble dedup (regression for v3 fix)', () => {
  it('does not double-count when bubble token counts mutate between parses', async () => {
    const dbPath = await createCursorTestDb()
    const provider = await getCursorProvider()

    // First parse: streaming-in-progress shape.
    const seenKeys = new Set<string>()
    const source = { path: dbPath, project: 'test-project', provider: 'cursor' }
    const firstRunCalls: ParsedProviderCall[] = []
    for await (const call of provider.createSessionParser(source, seenKeys).parse()) {
      firstRunCalls.push(call)
    }
    expect(firstRunCalls.length).toBe(1)

    // Cursor mutates the same bubble row to its final token totals when the
    // stream completes. Simulate by updating in place. The SQLite primary
    // key stays the same.
    await updateAssistantBubbleTokens(dbPath, 250, 80)

    // Second parse with the SAME seenKeys: must yield zero new calls. If the
    // dedup key were derived from token counts (the v2 bug), this would
    // produce a duplicate.
    const secondRunCalls: ParsedProviderCall[] = []
    for await (const call of provider.createSessionParser(source, seenKeys).parse()) {
      secondRunCalls.push(call)
    }
    expect(secondRunCalls.length).toBe(0)
  })

  it('does not yield the same bubble twice within a single parser run', async () => {
    const dbPath = await createCursorTestDb()
    const provider = await getCursorProvider()
    const seenKeys = new Set<string>()
    const source = { path: dbPath, project: 'test-project', provider: 'cursor' }
    const calls: ParsedProviderCall[] = []
    for await (const call of provider.createSessionParser(source, seenKeys).parse()) {
      calls.push(call)
    }
    // One bubble in the DB → one call. (The user message row at type=1 is
    // not surfaced as a separate ParsedProviderCall; it's threaded into the
    // assistant call's userMessage field.)
    expect(calls.length).toBe(1)
  })
})
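The stable-key dedup these two tests pin down reduces to a seen-set over the immutable SQLite `key` column rather than over mutable token counts. A minimal sketch, assuming a simplified row shape; `BubbleRow` and `parseNewBubbles` are illustrative names, not the cursor provider's real API:

```typescript
// Illustrative: dedup keyed on the stable SQLite primary key
// ('bubbleId:<conversationId>:<bubbleId>'), never on token counts.
// Re-parsing after Cursor rewrites a row's token totals yields nothing new.
type BubbleRow = { key: string; inputTokens: number; outputTokens: number }

function parseNewBubbles(rows: BubbleRow[], seenKeys: Set<string>): BubbleRow[] {
  const fresh: BubbleRow[] = []
  for (const row of rows) {
    if (seenKeys.has(row.key)) continue // already ingested, even if tokens changed
    seenKeys.add(row.key)
    fresh.push(row)
  }
  return fresh
}

// First parse ingests the streaming-in-progress row; the second parse sees
// the same key with mutated token totals and ingests nothing.
const seen = new Set<string>()
const first = parseNewBubbles([{ key: 'bubbleId:abc-123:bubble-xyz', inputTokens: 100, outputTokens: 20 }], seen)
const second = parseNewBubbles([{ key: 'bubbleId:abc-123:bubble-xyz', inputTokens: 250, outputTokens: 80 }], seen)
```

The v2 bug corresponds to folding `inputTokens`/`outputTokens` into the set's key string, in which case the second parse would have produced a duplicate.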