Mirror of https://github.com/AgentSeal/codeburn.git — synced 2026-05-16 19:44:14 +00:00
Quiet routine pricing warnings + menubar recovery from stuck-loading (#266)
* Quiet routine pricing warnings + menubar recovery from stuck-loading
CLI:
- Default `codeburn` invocation no longer prints "no pricing data for model"
warnings on every run. Greeting a fresh user with three lines of stderr
before the dashboard even draws made the tool look broken on first
launch. The warning now requires --verbose, and the suppressed pricing
miss still results in $0 cost (correct for unmapped models).
- Local-model heuristic skips the warning entirely for Ollama tags
(`qwen3.6:35b-a3b-bf16`), GGUF/quantized fingerprints, and similar names
that will never have public pricing. The "update codeburn" hint was
actively misleading there; a sketch of what the heuristic catches follows
this list.
- When the warning does fire (with --verbose), it points users at
`codeburn model-alias <model> <known-model>` as the actual escape hatch
alongside the package update suggestion.
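
For reference, a runnable sketch of what the local-model heuristic catches.
The function body mirrors `looksLikeLocalModel` in the pricing diff below;
every model name except `qwen3.6:35b-a3b-bf16` is hypothetical, chosen only
to illustrate each branch:

    // TypeScript sketch — same patterns as looksLikeLocalModel below.
    function looksLikeLocalModel(name: string): boolean {
      // Ollama / LM Studio tags carry a `:tag` suffix; URLs are excluded.
      if (name.includes(':') && !name.startsWith('http')) return true
      // GGUF / quantized fingerprints (q2…q8, bf16, fp16, gguf, f16, f32).
      return /[-_](q[2-8](_[a-z0-9]+)?|bf16|fp16|gguf|f16|f32)$/i.test(name)
    }

    looksLikeLocalModel('qwen3.6:35b-a3b-bf16') // true  — Ollama :tag
    looksLikeLocalModel('mistral-7b_q5_km')     // true  — quant suffix (hypothetical name)
    looksLikeLocalModel('gpt-4o')               // false — cloud model; may warn under --verbose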
Menubar:
- Replace the perpetual "Loading…" spinner with a FetchErrorOverlay when the
per-key fetch fails and the cache is empty. The user sees the error and a
Retry button instead of an infinite hang (decision sketched after this list).
- Add diagnostic breadcrumbs (NSLog, invisible to normal users — Console.app
/ `log stream --process CodeBurnMenubar` only) for the four states that
produce a stuck loading overlay:
- subprocess timeout after 45s
- fetch result dropped due to Task cancellation (rapid tab switch)
- fetch result dropped due to mid-fetch calendar rollover
- retry attempt where the last successful fetch is >2 min stale
- Track lastSuccessByKey separately from cache freshness so the staleness
diagnostic survives day-rollover cache wipes.
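
Condensed, the popover's cold-cache decision reduces to the following. This
is an illustrative TypeScript sketch only — the shipped change is the SwiftUI
code in the MenuBarContent hunk below; the three flags correspond to the
`AppStore` properties it reads:

    type Overlay = 'content' | 'error-with-retry' | 'loading'

    function overlayFor(hasCachedData: boolean, lastError: string | null, isLoading: boolean): Overlay {
      if (hasCachedData) return 'content'                     // warm cache: never cover the popover
      if (lastError !== null && !isLoading) return 'error-with-retry' // fetch failed, nothing cached yet
      return 'loading'                                        // first fetch for this key still in flight
    }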
* Stop flashing the compare-view loading screen on background refresh
When the 30s CLI tick updated `projects` while the user was reading the
model comparison results, the projects-watching effect always fired
setLoadTrigger, which flipped phase to 'loading' and re-ran the slow
scanSelfCorrections walk over every provider's session directory. The
user lost their scroll position and saw a loading flash mid-read.
Recompute the comparison rows in place when:
- the user is already on the results phase, AND
- both picked models still exist in the new aggregate.
Skip the corrections rescan on these in-place refreshes — corrections
drift slowly enough that holding the previous value until the user
re-enters compare is acceptable, and the rescan is the slow part of the
load. Initial selection and post-selection load still run the full
pipeline.
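
The effect's decision tree, condensed into a pure-function sketch (the full
TypeScript is in the CompareView hunk below; the parameter names mirror the
component's state):

    type RefreshAction = 'noop' | 'backToSelect' | 'inPlaceRecompute' | 'fullReload'

    function refreshAction(
      phase: 'select' | 'loading' | 'results',
      picked: [string, string] | null,
      available: Set<string>,
    ): RefreshAction {
      if (!picked) return 'noop'
      if (!available.has(picked[0]) || !available.has(picked[1])) return 'backToSelect'
      if (phase === 'results') return 'inPlaceRecompute' // skip scanSelfCorrections
      return 'fullReload'                                // initial or post-selection load
    }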
Parent: eafc8eb9f0
Commit: 8208cf8ff5
5 changed files with 165 additions and 24 deletions
@@ -46,6 +46,17 @@ final class AppStore {
     private var cache: [PayloadCacheKey: CachedPayload] = [:]
     private var cacheDate: String = ""
     private var switchTask: Task<Void, Never>?
+    /// Tracks the last successful fetch timestamp per key for stuck-loading
+    /// diagnostics. NOT used for cache-freshness logic — `CachedPayload.fetchedAt`
+    /// is authoritative there. This map persists across cache wipes (day
+    /// rollover, etc.) so we can distinguish "fresh install, never fetched"
+    /// from "cache was wiped 10 minutes ago and we still haven't refilled".
+    private var lastSuccessByKey: [PayloadCacheKey: Date] = [:]
+
+    private func staleSecondsForKey(_ key: PayloadCacheKey) -> TimeInterval {
+        guard let last = lastSuccessByKey[key] else { return .infinity }
+        return Date().timeIntervalSince(last)
+    }

     private var currentKey: PayloadCacheKey {
         PayloadCacheKey(period: selectedPeriod, provider: selectedProvider)
@@ -148,19 +159,41 @@ final class AppStore {
         if didShowLoading {
             loadingCount += 1
         }
+        // Diagnostic anchor: if this key has been empty for a long time (the
+        // popover would currently be showing "Loading..."), log how stale the
+        // miss is so the next time a user reports a stuck-loading bug we have
+        // a concrete data point — "no successful fetch for (today, claude)
+        // in 14 minutes" beats squinting at unified-log noise. We deliberately
+        // skip the first-attempt case (no prior success ever, finite check
+        // below filters .infinity) — that's just the cold path, not a bug.
+        let staleSeconds = staleSecondsForKey(key)
+        if staleSeconds.isFinite, staleSeconds > 120 {
+            NSLog("CodeBurn: refresh attempt for stale key \(key.period.rawValue)/\(key.provider.rawValue) — last success was \(Int(staleSeconds))s ago")
+        }
         defer {
             inFlightKeys.remove(key)
             if didShowLoading { loadingCount = max(loadingCount - 1, 0) }
         }
         do {
             let fresh = try await DataClient.fetch(period: key.period, provider: key.provider, includeOptimize: includeOptimize)
-            guard !Task.isCancelled else { return }
+            if Task.isCancelled {
+                // Distinguish cancellation (user switched tabs mid-fetch) from
+                // the silent-no-result path. Without this log, a cancelled
+                // fetch leaves cache empty + lastError nil and the user sees
+                // perpetual loading with nothing in the diagnostics.
+                NSLog("CodeBurn: fetch for \(key.period.rawValue)/\(key.provider.rawValue) cancelled before result was applied")
+                return
+            }
             // Day-rollover race guard: if the calendar date changed during the
             // fetch, this payload was computed against yesterday's date and
             // would pollute today's freshly-cleared cache. Drop it; the next
             // tick will refetch with today's data.
-            if cacheDate != cacheDateAtStart { return }
+            if cacheDate != cacheDateAtStart {
+                NSLog("CodeBurn: dropping fetch result for \(key.period.rawValue)/\(key.provider.rawValue) — calendar rolled mid-fetch")
+                return
+            }
             cache[key] = CachedPayload(payload: fresh, fetchedAt: Date())
+            lastSuccessByKey[key] = Date()
             lastError = nil
         } catch {
             if Task.isCancelled { return }
@@ -171,6 +204,7 @@ final class AppStore {
             guard !Task.isCancelled else { return }
             if cacheDate != cacheDateAtStart { return }
             cache[key] = CachedPayload(payload: fallback, fetchedAt: Date())
+            lastSuccessByKey[key] = Date()
             lastError = nil
             return
         } catch {
@@ -62,9 +62,16 @@ struct DataClient {
         }

         // Wall-clock timeout: if the CLI hangs (parser stuck, disk stall), kill it.
+        // Log when this fires so a recurring stuck-popover state has an actual
+        // diagnostic — historically users saw "Loading..." forever with no signal
+        // about what failed; the only way to debug was to read process state at
+        // the wrong time. The log line names the subcommand so we can correlate
+        // with a specific period/provider combination.
         let timeoutTask = Task.detached(priority: .utility) {
             try? await Task.sleep(nanoseconds: spawnTimeoutSeconds * 1_000_000_000)
             if process.isRunning {
+                NSLog("CodeBurn: CLI subprocess timed out after %llus for %@ — terminating",
+                      spawnTimeoutSeconds, subcommand.joined(separator: " "))
                 process.terminate()
             }
         }
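
The same wall-clock-timeout pattern, sketched in Node for anyone reading the
CLI side. This is illustrative only — the shipped code is the Swift
`Task.detached` block above; `spawnWithTimeout` and its 45 s default are
stand-ins, not part of the commit:

    import { spawn } from 'node:child_process'

    // Kill a child process that exceeds a wall-clock budget, logging which
    // subcommand hung so the failure can be correlated with a period/provider.
    function spawnWithTimeout(cmd: string, args: string[], timeoutMs = 45_000) {
      const child = spawn(cmd, args)
      const timer = setTimeout(() => {
        if (child.exitCode === null) { // still running: parser stuck, disk stall…
          console.error(`subprocess timed out after ${timeoutMs}ms for ${args.join(' ')} — terminating`)
          child.kill()                 // SIGTERM, the analog of process.terminate()
        }
      }, timeoutMs)
      child.on('exit', () => clearTimeout(timer))
      return child
    }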
@@ -43,15 +43,21 @@ struct MenuBarContent: View {

             // Overlay fires only on cold cache for the current key. This
             // avoids the 1-frame `$0.00` flash on first-time period/provider
-            // switches (the body would otherwise render the empty payload
-            // for the runloop tick before the overlay slides in). With the
-            // cache no longer being wiped on every wake/manual-refresh,
-            // hasCachedData==false now means "we have never fetched this
-            // key before in this session", which is the right time to
-            // cover the popover.
+            // switches. When the fetch fails (CLI subprocess timeout, parse
+            // error, etc.), surface a retry card instead of leaving the
+            // user stuck on a perpetual "Loading..." spinner.
             if !store.hasCachedData {
-                BurnLoadingOverlay(periodLabel: store.selectedPeriod.rawValue)
+                if let err = store.lastError, !store.isLoading {
+                    FetchErrorOverlay(
+                        error: err,
+                        periodLabel: store.selectedPeriod.rawValue,
+                        retry: { Task { await store.refresh(includeOptimize: false, force: true, showLoading: true) } }
+                    )
+                    .transition(.opacity)
+                } else {
+                    BurnLoadingOverlay(periodLabel: store.selectedPeriod.rawValue)
+                        .transition(.opacity)
+                }
             }
         }
         .frame(height: 520)
@@ -126,6 +132,49 @@ private struct EmptyProviderState: View {
     }
 }

+/// Shown when a fetch failed and the cache is still empty for this key. The
+/// user previously sat on the "Loading…" spinner forever — the popover had
+/// no path to recover beyond the next 30s tick (which would just re-fail).
+/// Now they see what broke and can retry directly.
+private struct FetchErrorOverlay: View {
+    let error: String
+    let periodLabel: String
+    let retry: () -> Void
+
+    var body: some View {
+        ZStack {
+            Rectangle().fill(.ultraThinMaterial)
+            VStack(spacing: 12) {
+                Image(systemName: "exclamationmark.triangle.fill")
+                    .font(.system(size: 28))
+                    .foregroundStyle(Theme.brandAccent)
+                Text("Couldn't load \(periodLabel)")
+                    .font(.system(size: 12.5, weight: .semibold))
+                    .foregroundStyle(.primary)
+                Text(displayError)
+                    .font(.system(size: 10.5))
+                    .foregroundStyle(.secondary)
+                    .multilineTextAlignment(.center)
+                    .frame(maxWidth: 280)
+                    .lineLimit(3)
+                Button("Retry", action: retry)
+                    .buttonStyle(.borderedProminent)
+                    .tint(Theme.brandAccent)
+                    .controlSize(.small)
+            }
+            .padding(.horizontal, 20)
+        }
+    }
+
+    /// Trim whitespace and cap the message at 240 characters so a giant
+    /// subprocess/NSError description can't blow up the popover layout.
+    private var displayError: String {
+        let trimmed = error.trimmingCharacters(in: .whitespacesAndNewlines)
+        if trimmed.count <= 240 { return trimmed }
+        return String(trimmed.prefix(240)) + "…"
+    }
+}
+
 /// Translucent overlay that blurs whatever's behind it (the previous tab/period content)
 /// and centers an animated burning flame -- the brand mark filling up bottom-to-top in
 /// yellow→orange→red, looping.
@@ -331,16 +331,40 @@ export function CompareView({ projects, onBack }: CompareViewProps) {
     const newModels = aggregateModelStats(projects)
     setModels(newModels)

-    if (pickedNames) {
-      const hasA = newModels.some(m => m.model === pickedNames[0])
-      const hasB = newModels.some(m => m.model === pickedNames[1])
-      if (hasA && hasB) {
-        setLoadTrigger(t => t + 1)
-      } else {
-        setPickedNames(null)
-        setPhase('select')
-      }
-    }
+    if (!pickedNames) return
+    const hasA = newModels.some(m => m.model === pickedNames[0])
+    const hasB = newModels.some(m => m.model === pickedNames[1])
+    if (!hasA || !hasB) {
+      setPickedNames(null)
+      setPhase('select')
+      return
+    }
+
+    // When the periodic CLI refresh updates `projects` while the user is
+    // reading the results page, recompute the comparison rows IN PLACE rather
+    // than flipping to a loading screen. Previously every 30s tick bounced the
+    // user to a loading flash and reset their scroll position; the slow part
+    // (scanSelfCorrections, which walks every provider's session dir) is
+    // skipped on these refreshes — corrections drift slowly enough that
+    // staying with the existing values until the user re-enters compare from
+    // scratch is fine.
+    if (phase === 'results') {
+      const a = newModels.find(m => m.model === pickedNames[0])
+      const b = newModels.find(m => m.model === pickedNames[1])
+      if (!a || !b) return
+      const aCopy = { ...a, selfCorrections: selectedA?.selfCorrections ?? 0 }
+      const bCopy = { ...b, selfCorrections: selectedB?.selfCorrections ?? 0 }
+      setSelectedA(aCopy)
+      setSelectedB(bCopy)
+      setRows(computeComparison(aCopy, bCopy))
+      setCategories(computeCategoryComparison(projects, a.model, b.model))
+      setStyle(computeWorkingStyle(projects, a.model, b.model))
+      return
+    }
+
+    // Initial load (or returning from select after picking) — full pipeline,
+    // including scanSelfCorrections.
+    setLoadTrigger(t => t + 1)
   }, [projects])

   useEffect(() => {
@@ -235,6 +235,36 @@ export function getModelCosts(model: string): ModelCosts | null {
 // session that used it, hiding real spend until the user noticed.
 const warnedUnknownModels = new Set<string>()

+/// Heuristic for "this looks like a local model that will never be in LiteLLM's
+/// pricing JSON". We suppress the unknown-model warning for these because the
+/// "update codeburn" advice can't help — local Ollama models, llama.cpp tags,
+/// LM Studio loads, etc. are billed locally and don't have public pricing.
+/// Users still get $0 in cost reports for them (correct — local inference is
+/// effectively free); the warning was just noise.
+function looksLikeLocalModel(name: string): boolean {
+  // Ollama and LM Studio tags include `:tag` (e.g. qwen3.6:35b-a3b-bf16).
+  if (name.includes(':') && !name.startsWith('http')) return true
+  // GGUF / quantized fingerprints commonly seen in local inference.
+  if (/[-_](q[2-8](_[a-z0-9]+)?|bf16|fp16|gguf|f16|f32)$/i.test(name)) return true
+  return false
+}
+
+function shouldWarnAboutUnknownModel(name: string): boolean {
+  if (!name || name === '<synthetic>') return false
+  if (warnedUnknownModels.has(name)) return false
+  // Suppress for local/quantized models — the "update codeburn" hint is
+  // actively misleading there. Users who need cost visibility for local
+  // inference can still set an alias via `codeburn model-alias`.
+  if (looksLikeLocalModel(name)) return false
+  // The warning fired on every CLI invocation (including the default
+  // dashboard) which made first launches look broken — three "no pricing
+  // data" lines greet a user before the dashboard even draws. Now opt-in
+  // via --verbose. The unknown model still costs $0 in reports; users who
+  // suspect missing models run `codeburn --verbose` to see the list.
+  if (process.env['CODEBURN_VERBOSE'] !== '1') return false
+  return true
+}
+
 export function calculateCost(
   model: string,
   inputTokens: number,
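Illustrative outcomes for the gate above (`acme-chat-large` is a hypothetical
unmapped name, not from this commit):

    process.env['CODEBURN_VERBOSE'] = '1'
    shouldWarnAboutUnknownModel('acme-chat-large')      // true  — unmapped, verbose on
    shouldWarnAboutUnknownModel('qwen3.6:35b-a3b-bf16') // false — local Ollama tag
    shouldWarnAboutUnknownModel('<synthetic>')          // false — placeholder, never priced

    delete process.env['CODEBURN_VERBOSE']
    shouldWarnAboutUnknownModel('acme-chat-large')      // false — quiet by default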
@@ -246,19 +276,16 @@ export function calculateCost(
 ): number {
   const costs = getModelCosts(model)
   if (!costs) {
     // Skip the synthetic placeholder and the auto-router pseudo-models that
     // intentionally have no direct pricing entry; calculateCost callers
     // resolve those through aliasing first, so an unknown here is genuinely
     // an unmapped real model.
-    if (model && model !== '<synthetic>' && !warnedUnknownModels.has(model)) {
+    if (shouldWarnAboutUnknownModel(model)) {
       warnedUnknownModels.add(model)
       // Strip control characters and cap length: model names come from JSONL
       // payloads written by external tools, so a hostile or corrupt file
       // could embed terminal escape sequences here.
       const safeName = model.replace(/[\x00-\x1F\x7F-\x9F]/g, '?').slice(0, 200)
+      const aliasHint = `Map it with: codeburn model-alias "${safeName}" <known-model>`
       process.stderr.write(
         `codeburn: no pricing data for model "${safeName}" — costs for this model will show $0. ` +
-        `Update with: npx codeburn@latest, or report at https://github.com/getagentseal/codeburn/issues.\n`
+        `${aliasHint}, or update with: npx codeburn@latest.\n`
       )
     }
     return 0