- Stop tracking a separate streamError flag. createReadStream's default
64 KiB highWaterMark means the stream may already be reading chunk 2
when we break out of the loop after yielding the first line; if that
later chunk errors, the flag could reject an otherwise-valid line.
readline's async iterator already re-throws stream errors on Node 16+,
which the existing catch handles.
- Test: 120 KB session_meta line forces multi-chunk line assembly.
- Test: truncated mid-write first line is rejected, not parsed as half
an object.
- Cap createReadStream at 1 MiB so a malformed file with no newline
cannot make readline buffer indefinitely (real session_meta lines
are 22-27 KB).
- Capture stream errors explicitly; readline's async iterator does
not always re-throw underlying stream errors per Node docs.
- Test: assert project is extracted from the >16 KB session_meta to
prove the line was actually parsed, not just discovered.
- Test: session_meta line with no trailing newline is still accepted.
- Test: empty rollout file is silently skipped.
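
One way to express the 1 MiB cap is through the `start`/`end` options of `createReadStream`. This is a minimal sketch, not the project's actual code: the helper name `openCapped` and the constant name are hypothetical, and only the 1 MiB figure comes from the commit.

```typescript
import { createReadStream } from "node:fs";

// 1 MiB, the cap named in the commit; the constant name is illustrative.
const MAX_FIRST_LINE_BYTES = 1024 * 1024;

// Hypothetical helper: opens a file so that at most MAX_FIRST_LINE_BYTES
// bytes are ever read, however large the file is.
export function openCapped(path: string) {
  return createReadStream(path, {
    encoding: "utf8",
    start: 0,
    // `end` is an inclusive byte offset, hence the -1.
    end: MAX_FIRST_LINE_BYTES - 1,
  });
}
```

Passing such a stream to readline bounds how much text a newline-free file can force it to buffer.
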
`readFirstLine` allocated a fixed 16 KB buffer, but Codex CLI 0.128+
embeds the entire base_instructions / system prompt in the
`session_meta` line, pushing it past 20 KB. When the buffer contains no
newline, `isValidCodexSession` rejects the session, so every recent
Codex session is silently excluded from totals.
Switch to a streaming readline read so the first line is captured
regardless of length, and add a regression test that creates a
40 KB session_meta payload.
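
The streaming read described above could look roughly like the following. This is a sketch under assumptions, not the code in this change: the signature, the `null` return for empty or unreadable files, and the `catch` behaviour are all illustrative.

```typescript
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

// Sketch of a streaming first-line reader. Returns the first line without
// its line terminator, or null if the file is empty or reading fails.
export async function readFirstLine(path: string): Promise<string | null> {
  const stream = createReadStream(path, { encoding: "utf8" });
  const rl = createInterface({ input: stream, crlfDelay: Infinity });
  try {
    for await (const line of rl) {
      // readline assembles the line across as many chunks as it needs,
      // so its length is not limited by a fixed buffer size.
      return line;
    }
    return null; // the stream ended before any line was produced
  } catch {
    return null; // assumption: treat read errors as "no usable first line"
  } finally {
    // Stop reading as soon as we have what we need.
    rl.close();
    stream.destroy();
  }
}
```

A caller would then parse the returned string as JSON and validate it, so the length of the `session_meta` line no longer matters.
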
Locally, this changes my 30-day Codex total from €267 (only ~half
of sessions parsed) to €878 (all sessions parsed).
Codex Desktop on Windows uses "Codex Desktop" as the originator
string instead of "codex_cli" or "codex_vscode". The startsWith
check was case-sensitive, rejecting these sessions silently.
Fixes #1 (comment by @JiglioNero).
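
A case-insensitive version of the originator check might be sketched as below. The helper name `isCodexOriginator` and the `"codex"` prefix are assumptions for illustration; only the three originator strings come from the description above.

```typescript
// Hypothetical helper: accepts any originator whose name starts with
// "codex", ignoring case. The exact prefix is an assumption.
export function isCodexOriginator(originator: string): boolean {
  return originator.toLowerCase().startsWith("codex");
}
```

Lower-casing before `startsWith` lets "Codex Desktop" match alongside "codex_cli" and "codex_vscode".
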
- Remove agent-*.jsonl exclusion filter that was dropping ~46% of API calls
- Scan subagents/ directories for subagent session files
- Normalize Codex token semantics: OpenAI counts cached tokens inside
input_tokens, so subtract them to match Anthropic's separate reporting
- Fixes cost double-counting and 100% cache hit display for Codex users
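
The token normalization in the list above can be illustrated with a small sketch. All type and field names here (`OpenAIUsage`, `cached_input_tokens`, `NormalizedUsage`) are hypothetical stand-ins, not the project's actual types.

```typescript
// Hypothetical shape of an OpenAI-style usage record, where the cached
// tokens are already included in input_tokens.
interface OpenAIUsage {
  input_tokens: number;
  cached_input_tokens: number;
  output_tokens: number;
}

// Hypothetical Anthropic-style shape, where cached reads are reported
// separately from fresh input.
interface NormalizedUsage {
  inputTokens: number;
  cacheReadTokens: number;
  outputTokens: number;
}

export function normalizeCodexUsage(u: OpenAIUsage): NormalizedUsage {
  // Clamp defensively so inconsistent data cannot yield a negative count.
  const cached = Math.min(u.cached_input_tokens, u.input_tokens);
  return {
    inputTokens: u.input_tokens - cached, // fresh (uncached) input only
    cacheReadTokens: cached,
    outputTokens: u.output_tokens,
  };
}
```

For example, a record with 1000 input tokens of which 800 were cached normalizes to 200 fresh input tokens and 800 cache reads, so the cached portion is priced once rather than twice.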