feat(optimize): MCP tool coverage detector with cache-aware costing

Adds a per-tool optimizer finding for MCP servers whose schema is loaded
on every turn but whose tools are rarely invoked. Builds on the existing
server-level `detectUnusedMcp` (zero invocations) by reporting
partial-use cases: "loaded 54 tools, called 0" or "loaded 26 tools,
called 2 (8% coverage)".

Inventory comes from Claude Code's JSONL entries whose `attachment.type`
is `deferred_tools_delta`: `addedNames` lists the exact tools available
at that turn, including every fully-qualified `mcp__<server>__<tool>`
name. We union across all delta entries in a session (not just the
first) because tool availability can change mid-session when the user
reloads MCP config or a subagent inherits a different tool set. Names
that don't match the `mcp__<server>__<tool>` shape with both segments
non-empty are rejected at extraction so downstream `split('__')`
consumers can't be poisoned.

Token-savings estimates are cache-aware. MCP tool schemas live in the
cached prefix of the system prompt: a session pays the full input price
on each cache-creation turn (rebuilds happen after roughly 5 minutes of
inactivity) and the cache-read discount on subsequent turns. Each call's
contribution is capped at its observed `cacheCreationInputTokens` /
`cacheReadInputTokens` so we never claim more MCP overhead than the
call's own cache buckets could contain.

When multiple servers are flagged, costing happens in a single combined
pass: the per-call cap applies to the total unused-schema budget across
all flagged servers, not per server. Two flagged servers therefore
cannot both independently claim the same call's cache bucket, which
would otherwise overstate `tokensSaved` and misclassify findings as high
impact.

A session counts toward `loadedSessions` (and toward the cost estimate)
only if its observed inventory included the server. Invocation-only
sessions, where the server appears in `mcpBreakdown` or `call.mcpTools`
without any matching `deferred_tools_delta`, do not satisfy the
`>= 2 sessions` threshold on their own.
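
The combined-cap rule can be illustrated with a minimal sketch. This is
not the code in src/optimize.ts: the function names, the
`TOKENS_PER_TOOL_SCHEMA` constant, and the `CACHE_READ_DISCOUNT` value
are assumptions chosen for illustration; only the two cache field names
come from the description above.

```typescript
// Hypothetical per-call usage shape; only the two field names are taken
// from the commit message.
interface CallUsage {
  cacheCreationInputTokens: number
  cacheReadInputTokens: number
}

// Hypothetical per-server input: how many inventoried tools went unused.
interface FlaggedServer {
  server: string
  unusedTools: number
}

// Assumed constants, for illustration only.
const TOKENS_PER_TOOL_SCHEMA = 400 // assumed average schema size per tool
const CACHE_READ_DISCOUNT = 0.1    // assumed cache-read price relative to input

// Effective tokens one call can attribute to a schema budget: full price
// for the cache-creation bucket, discounted price for the cache-read
// bucket, each capped at what the call actually recorded.
function effectiveTokensForCall(call: CallUsage, budget: number): number {
  const written = Math.min(budget, call.cacheCreationInputTokens)
  const read = Math.min(budget, call.cacheReadInputTokens)
  return written + read * CACHE_READ_DISCOUNT
}

// Combined pass: sum the unused-schema budget across all flagged servers
// first, then apply the per-call cap once.
function estimateCombined(calls: CallUsage[], flagged: FlaggedServer[]): number {
  const budget = flagged.reduce((sum, s) => sum + s.unusedTools * TOKENS_PER_TOOL_SCHEMA, 0)
  const total = calls.reduce((sum, c) => sum + effectiveTokensForCall(c, budget), 0)
  return Math.round(total)
}

// Naive per-server pass, shown only for contrast: each server applies the
// cap on its own, so two servers can both claim the same small bucket.
function estimatePerServerNaive(calls: CallUsage[], flagged: FlaggedServer[]): number {
  let total = 0
  for (const s of flagged) {
    const budget = s.unusedTools * TOKENS_PER_TOOL_SCHEMA
    for (const c of calls) total += effectiveTokensForCall(c, budget)
  }
  return Math.round(total)
}
```

With a call whose cache-read bucket is smaller than either server's
budget, the naive pass counts that bucket once per server while the
combined pass counts it once, which is the overstatement the combined
cap is meant to prevent.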

`estimateMcpSchemaCost` applies the same observed-inventory invariant,
so the two passes agree.

Coverage is computed against the inventory only: invocations of names
not present in any observed inventory (older config, hallucinated tool,
typo) do not inflate `toolsInvoked` and cannot drive `unusedCount`
negative. `toolsInvoked` is derived as
`inventory.size - unusedTools.length` to keep both numbers consistent.

`detectUnusedMcp` and the new detector are explicitly disjoint:
`detectUnusedMcp` skips only the servers that the coverage detector will
report, not every server that happens to appear in an inventory, so a
small inventoried-but-uninvoked server below the coverage thresholds
still gets flagged as "configured but never called."

Thresholds for the coverage finding:
- > 10 tools available (small servers are noise)
- < 20% coverage
- >= 2 sessions with observed inventory
- High impact when total effective tokens >= 200_000 or >= 3 servers
  flagged

Smoke-tested on a real account: 7 servers flagged across 93 sessions
(`office-word-mcp` 0/54, `notebooklm-mcp` 0/38, `office-ppt-mcp` 0/37,
`excel-mcp-server` 0/25, `github-mcp-server` 2/26, `peekaboo` 3/22, plus
`claude_ai_Asana`). Combined-cap costing keeps `tokensSaved` honest.

Changes:
- src/types.ts: optional `mcpInventory: string[]` on `SessionSummary`.
  Provider-agnostic field; currently populated only by the Claude
  parser.
- src/parser.ts: `extractMcpInventory` walks all entries, validates
  fully-qualified names, returns a sorted unique list.
  `buildSessionSummary` passes it through; the field is omitted when
  empty so JSON exports stay clean.
- src/optimize.ts: `aggregateMcpCoverage`, `estimateMcpSchemaCost`
  (single- and multi-server signatures), `detectMcpToolCoverage`. Wired
  into `scanAndDetect`. `detectUnusedMcp` updated to be disjoint from
  the new detector.
- tests/mcp-coverage.test.ts: 23 cases covering aggregation, costing,
  combined-cap behaviour, threshold gates, invocation-only-session
  filtering, foreign-tool invocations, cache rebuild events, write+read
  on the same call, multi-server pluralisation.
- tests/parser-mcp-inventory.test.ts: 12 cases for the JSONL extractor
  including malformed name rejection and tolerant attachment parsing.
- CHANGELOG.md: entry under Unreleased / Added (CLI).

Closes #2
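
The coverage rule and threshold gates described in this message can be
sketched as follows. This is an illustrative reading of the prose, not
the code in src/optimize.ts: `CoverageSketch`, `computeCoverage`,
`passesCoverageThresholds`, and `isHighImpact` are hypothetical names,
and only `toolsInvoked`, `unusedTools`, and `loadedSessions` are taken
from the message.

```typescript
// Hypothetical result shape for one server.
interface CoverageSketch {
  server: string
  toolsAvailable: number
  toolsInvoked: number
  unusedTools: string[]
  coverage: number // fraction in [0, 1]
  loadedSessions: number
}

// Coverage is measured against the observed inventory only. Invoked names
// that are not in the inventory are ignored, so they can neither inflate
// `toolsInvoked` nor push the unused count below zero.
function computeCoverage(
  server: string,
  inventory: Set<string>,
  invokedNames: string[],
  loadedSessions: number,
): CoverageSketch {
  const invoked = new Set(invokedNames)
  const unusedTools = Array.from(inventory).filter(name => !invoked.has(name)).sort()
  // Derived from the inventory so both numbers stay consistent.
  const toolsInvoked = inventory.size - unusedTools.length
  const coverage = inventory.size === 0 ? 0 : toolsInvoked / inventory.size
  return {
    server,
    toolsAvailable: inventory.size,
    toolsInvoked,
    unusedTools,
    coverage,
    loadedSessions,
  }
}

// Threshold gates as listed in the commit message.
function passesCoverageThresholds(c: CoverageSketch): boolean {
  return c.toolsAvailable > 10 && c.coverage < 0.2 && c.loadedSessions >= 2
}

// High impact when the combined estimate or the number of flagged servers
// crosses the stated limits.
function isHighImpact(totalEffectiveTokens: number, flaggedServers: number): boolean {
  return totalEffectiveTokens >= 200_000 || flaggedServers >= 3
}
```

Deriving `toolsInvoked` from the inventory, rather than counting
invocations directly, is what keeps a typo or a stale tool name from
skewing the reported coverage.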
2026-05-19 07:43:09 +00:00 · 2026-05-05 04:13:04 +03:00 · 2026-05-05 04:13:04 +03:00 · 1a080a006f
commit 1a080a006f
parent 18335a1f9d
6 changed files with 970 additions and 1 deletions
--- a/src/parser.ts
+++ b/src/parser.ts
@@ -203,10 +203,54 @@ function groupIntoTurns(entries: JournalEntry[], seenMsgIds: Set<string>): Parse
  return turns
 }

+// Fully-qualified MCP tool name shape: `mcp__<server>__<tool>`. Both server
+// and tool segments must be non-empty. Names like `mcp__server` (no tool
+// segment) or `mcp__server__` (trailing empty tool) would silently pollute
+// the inventory and break downstream `split('__')` consumers, so they're
+// rejected here.
+function isMcpToolName(name: string): boolean {
+  if (!name.startsWith('mcp__')) return false
+  const rest = name.slice(5) // strip `mcp__`
+  const sep = rest.indexOf('__')
+  if (sep <= 0) return false                   // missing or empty server
+  if (sep >= rest.length - 2) return false     // missing or empty tool
+  return true
+}
+
+/**
+ * Extract MCP tool inventory observed across a session's JSONL entries.
+ *
+ * Claude Code emits `attachment.type === "deferred_tools_delta"` entries whose
+ * `addedNames` array lists every tool currently available at that turn (built-in
+ * tools plus all `mcp__<server>__<tool>` names exposed by configured MCP
+ * servers). Tool inventory can change mid-session if the user reloads MCP
+ * config, so we union every occurrence rather than trusting only the first.
+ *
+ * Built-in tools are filtered out: only `mcp__*` identifiers survive.
+ */
+export function extractMcpInventory(entries: JournalEntry[]): string[] {
+  const inventory = new Set<string>()
+  for (const entry of entries) {
+    const att = entry['attachment']
+    if (!att || typeof att !== 'object') continue
+    const a = att as { type?: unknown; addedNames?: unknown }
+    if (a.type !== 'deferred_tools_delta') continue
+    if (!Array.isArray(a.addedNames)) continue
+    for (const name of a.addedNames) {
+      if (typeof name !== 'string') continue
+      if (!isMcpToolName(name)) continue
+      inventory.add(name)
+    }
+  }
+  if (inventory.size === 0) return []
+  return Array.from(inventory).sort()
+}
+
 function buildSessionSummary(
  sessionId: string,
  project: string,
  turns: ClassifiedTurn[],
+  mcpInventory?: string[],
 ): SessionSummary {
  const modelBreakdown: SessionSummary['modelBreakdown'] = Object.create(null)
  const toolBreakdown: SessionSummary['toolBreakdown'] = Object.create(null)
@@ -311,6 +355,7 @@ function buildSessionSummary(
    bashBreakdown,
    categoryBreakdown,
    skillBreakdown,
+    ...(mcpInventory && mcpInventory.length > 0 ? { mcpInventory } : {}),
  }
 }

@@ -362,7 +407,14 @@ async function parseSessionFile(
  }
  const classified = turns.map(classifyTurn)

-  return buildSessionSummary(sessionId, project, classified)
+  // Inventory is extracted from the full entry stream, not just the
+  // turns we kept after date filtering: tool availability is set up
+  // once at the start of a session (with possible mid-session reloads),
+  // and we want to reflect what was loaded even if the user only ran
+  // turns inside a narrow date window.
+  const mcpInventory = extractMcpInventory(entries)
+
+  return buildSessionSummary(sessionId, project, classified, mcpInventory)
 }

 async function collectJsonlFiles(dirPath: string): Promise<string[]> {