From 5dcdbfabf08f84f37513be3ec81ae112e7ff8e15 Mon Sep 17 00:00:00 2001
From: rcourtman
Date: Sun, 10 May 2026 23:16:47 +0100
Subject: [PATCH] Document chat-side cost recording in ai-runtime contract

Follow-up to a0b3bc7ed which closed the chat.Service cost-ledger gap.
ai-runtime.md gains a Current State paragraph documenting:

- The pre-fix bug (chat accumulated tokens via SSE done envelope but
  never recorded a cost.UsageEvent server-side; chat is the bulk of AI
  token spend so the dashboard was dramatically understating cost).
- The fix shape (recordChatTurnCost runs after every loop return,
  success or error since the operator was billed regardless).
- The threading path (chat.Config.CostStore wired by the router from
  AISettingsHandler.GetAIService.CostStore()).
- The double-recording invariant (ExecutePatrolStream is deliberately
  not changed; its caller patrol_ai.go records via its own helper).
- UseCase="chat" matches the canonical taxonomy noted on
  cost.UsageEvent.UseCase ("chat" or "patrol").
---
 .../v6/internal/subsystems/ai-runtime.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/docs/release-control/v6/internal/subsystems/ai-runtime.md b/docs/release-control/v6/internal/subsystems/ai-runtime.md
index 897120ae7..3c450e90e 100644
--- a/docs/release-control/v6/internal/subsystems/ai-runtime.md
+++ b/docs/release-control/v6/internal/subsystems/ai-runtime.md
@@ -1876,3 +1876,21 @@
 selection, cost ledger (report_narrative / report_narrative_fleet
 use-cases), and budget gate the report PDF endpoint already enforces —
 there is exactly one canonical synthesis path for both surfaces.
+
+The same canonical AI runtime now also records user-chat token
+usage to the cost ledger. `chat.Service.ExecuteStream` was a
+long-standing gap: the agentic loop accumulated token counts via
+stream callbacks and surfaced them in the SSE done envelope, but
+nothing on the server side recorded a `cost.UsageEvent`. Chat is
+the bulk of AI token spend, so the operator AI usage dashboard
+was understating cost dramatically. `recordChatTurnCost` now runs
+after every `loop.ExecuteWithTools` return — success or error,
+since the operator was billed regardless of whether the loop
+produced a clean response. It emits a `cost.UsageEvent` with
+`UseCase="chat"` in the same shape the rest of the runtime uses.
+The store is threaded through `chat.Config.CostStore`, wired by
+the router from the per-tenant `AISettingsHandler.GetAIService`
+via `Service.CostStore()`. `ExecutePatrolStream` deliberately
+does NOT record here — its caller (`patrol_ai.go`) records via
+its own helper, so cost is never double-counted on the
+patrol-via-chat path.
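
Reviewer sketch (not part of the patch): the "record after every loop return, success or error" shape described in the added paragraph could look roughly like the Go below. Every name here is a hypothetical stand-in: `UsageEvent`, `CostStore`, `memoryStore`, `turnUsage`, `recordTurnCost`, and `executeTurn` are illustrative and are not the real `cost.UsageEvent`, `chat.Config.CostStore`, or `recordChatTurnCost`, whose fields and signatures this patch does not show. The nil-store guard is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// UsageEvent is a hypothetical stand-in for cost.UsageEvent.
// The real struct's fields are not shown in the patch.
type UsageEvent struct {
	UseCase      string
	InputTokens  int
	OutputTokens int
}

// CostStore is a hypothetical ledger interface.
type CostStore interface {
	Record(ev UsageEvent)
}

// memoryStore is an in-memory ledger used only for this illustration.
type memoryStore struct{ events []UsageEvent }

func (m *memoryStore) Record(ev UsageEvent) { m.events = append(m.events, ev) }

// turnUsage holds the token counts the loop accumulates during one turn.
type turnUsage struct{ InputTokens, OutputTokens int }

// recordTurnCost writes one "chat" event to the ledger.
// Assumption: an unset (nil) store means cost tracking is simply skipped.
func recordTurnCost(store CostStore, u turnUsage) {
	if store == nil {
		return
	}
	store.Record(UsageEvent{
		UseCase:      "chat",
		InputTokens:  u.InputTokens,
		OutputTokens: u.OutputTokens,
	})
}

// executeTurn runs the loop and then records usage whether or not the
// loop failed, because tokens spent before an error were still billed.
func executeTurn(store CostStore, loop func(u *turnUsage) error) error {
	var u turnUsage
	err := loop(&u)
	recordTurnCost(store, u)
	return err
}

// errLoopFailed simulates a loop that errors after consuming tokens.
var errLoopFailed = errors.New("loop failed after partial output")

func main() {
	store := &memoryStore{}

	// A clean turn and a failed turn both land in the ledger.
	_ = executeTurn(store, func(u *turnUsage) error {
		u.InputTokens, u.OutputTokens = 120, 40
		return nil
	})
	err := executeTurn(store, func(u *turnUsage) error {
		u.InputTokens = 80
		return errLoopFailed
	})

	fmt.Println(len(store.events), err)
}
```

In this sketch the recording call sits in the wrapper rather than inside the loop, which mirrors the invariant in the paragraph: a caller that records through its own helper (as the patrol path is described as doing) would invoke the loop directly and skip the wrapper, so no turn is counted twice.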