From f74add827128e5d35cc0185da36cb38ccab4fadc Mon Sep 17 00:00:00 2001 From: rcourtman Date: Sun, 10 May 2026 16:34:34 +0100 Subject: [PATCH] Auto-seed Patrol preflight cache on Pulse startup MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the cold-start gap in the preflight observability layer: every Pulse restart blanked the cached "last verified" indicator until the next save or manual click, which meant operators saw "never verified" on every upgrade or process restart even when the configured Patrol model was working fine. NewAISettingsHandler now reuses the existing aiSettingsUpdateRequiresPatrolPreflight predicate with a nil "prior config" — semantically "no in-memory cache yet, just booted." When the loaded config has assistant enabled and a Patrol model, the handler dispatches the same async TriggerPatrolPreflightAsync the save path uses. Routine boots where assistant is disabled (or no Patrol model is selected) skip the dispatch so we never write a misleading "Pulse Assistant is not enabled" entry into the cache. Live verified: after Pulse restart, /api/settings/ai surfaces a fresh patrol_preflight with success=true within ~6s of boot, no operator action required. Predicate test extended with two named cases that document the dual-purpose use (startup seed + skip-when-disabled). ai-runtime, api-contracts, agent-lifecycle (dep), and storage-recovery (dep) contracts updated. --- .../v6/internal/subsystems/agent-lifecycle.md | 2 +- .../v6/internal/subsystems/ai-runtime.md | 2 +- .../v6/internal/subsystems/api-contracts.md | 2 +- .../internal/subsystems/storage-recovery.md | 2 +- .../src/api/__tests__/patrolPreflight.test.ts | 26 +++++++++++++++++++ internal/api/ai_handlers.go | 9 +++++++ internal/api/ai_handlers_test.go | 23 ++++++++++++++++ 7 files changed, 62 insertions(+), 4 deletions(-) diff --git a/docs/release-control/v6/internal/subsystems/agent-lifecycle.md b/docs/release-control/v6/internal/subsystems/agent-lifecycle.md index e9b1e8c4a..1f69f02b7 100644 --- a/docs/release-control/v6/internal/subsystems/agent-lifecycle.md +++ b/docs/release-control/v6/internal/subsystems/agent-lifecycle.md @@ -743,7 +743,7 @@ profile and assignment columns, but embedded table framing must route through ## Completion Obligations -1. Update this contract when agent lifecycle ownership changes. Routes added under the shared `internal/api/` extension point that are clearly outside lifecycle ownership (for example `POST /api/ai/patrol/preflight`, the `patrol_preflight` snapshot field added to `/api/settings/ai`, and the auto-trigger preflight dispatch on settings save — all owned by ai-runtime) do not extend this subsystem's contract; they live in their owning subsystem. +1. Update this contract when agent lifecycle ownership changes. Routes added under the shared `internal/api/` extension point that are clearly outside lifecycle ownership (for example `POST /api/ai/patrol/preflight`, the `patrol_preflight` snapshot field added to `/api/settings/ai`, the auto-trigger preflight dispatch on settings save, and the startup-seed dispatch in `NewAISettingsHandler` — all owned by ai-runtime) do not extend this subsystem's contract; they live in their owning subsystem. 2. Keep shared API proof routing aligned whenever install, register, or profile payloads change. 3. Update runtime and settings tests in the same slice when lifecycle behavior changes. 4. Keep host-agent test hooks, command-client factories, and timing overrides diff --git a/docs/release-control/v6/internal/subsystems/ai-runtime.md b/docs/release-control/v6/internal/subsystems/ai-runtime.md index 8e2db4e91..aef07e7fb 100644 --- a/docs/release-control/v6/internal/subsystems/ai-runtime.md +++ b/docs/release-control/v6/internal/subsystems/ai-runtime.md @@ -108,7 +108,7 @@ runtime cost control, and shared AI transport surfaces. ## Completion Obligations -1. Update this contract when canonical AI runtime or transport entry points move, including transport-level provider request-shape changes such as DeepSeek `tool_choice` coercion, runtime-failure classification splits (for example separating forced tool selection rejection, no tool-capable endpoint, and generic model-level lack of tool support into distinct causes), Patrol-specific verification surfaces such as `POST /api/ai/patrol/preflight` that exercise the full chat-completions path with a minimal tool definition rather than only listing models, Patrol-preflight cache observability where the AI Service caches the most recent preflight outcome (success, soft warning, or classified failure) and the AI settings response surfaces it as `patrol_preflight` so the UI can hydrate a "last verified" indicator without forcing operators to re-run preflight on every page load, and the auto-trigger contract on `HandleUpdateAISettings` where the save handler runs `TriggerPatrolPreflightAsync` only when the change actually moved Patrol transport (model swap, provider key for that model changed, or assistant just enabled with a Patrol model) so routine settings saves do not burn provider tokens +1. Update this contract when canonical AI runtime or transport entry points move, including transport-level provider request-shape changes such as DeepSeek `tool_choice` coercion, runtime-failure classification splits (for example separating forced tool selection rejection, no tool-capable endpoint, and generic model-level lack of tool support into distinct causes), Patrol-specific verification surfaces such as `POST /api/ai/patrol/preflight` that exercise the full chat-completions path with a minimal tool definition rather than only listing models, Patrol-preflight cache observability where the AI Service caches the most recent preflight outcome (success, soft warning, or classified failure) and the AI settings response surfaces it as `patrol_preflight` so the UI can hydrate a "last verified" indicator without forcing operators to re-run preflight on every page load, the auto-trigger contract on `HandleUpdateAISettings` where the save handler runs `TriggerPatrolPreflightAsync` only when the change actually moved Patrol transport (model swap, provider key for that model changed, or assistant just enabled with a Patrol model) so routine settings saves do not burn provider tokens, and the startup-seed contract where the AI Service handler dispatches the same async preflight on Pulse boot when assistant is enabled and a Patrol model is configured so the cache is populated for the first `/api/settings/ai` poll after a restart instead of blanking back to "never verified" 2. Keep AI runtime and shared API proof routing aligned in `registry.json` 3. Preserve explicit coverage for chat, Patrol, remediation, and cost-control behavior when AI runtime changes Patrol runtime failures are part of that runtime contract: provider, model, diff --git a/docs/release-control/v6/internal/subsystems/api-contracts.md b/docs/release-control/v6/internal/subsystems/api-contracts.md index 3fbeb4bb6..af4783389 100644 --- a/docs/release-control/v6/internal/subsystems/api-contracts.md +++ b/docs/release-control/v6/internal/subsystems/api-contracts.md @@ -894,7 +894,7 @@ the canonical monitored-system blocked payload. ## Completion Obligations -1. Update contract tests when payloads change, including admin verification endpoints such as `POST /api/ai/patrol/preflight` whose response shape (`tool_call_observed`, `duration_ms`, classified `cause`/`summary`/`recommendation`, plus `recorded_at`/`recorded_at_unix` for the cached snapshot) is part of the canonical Patrol diagnostic surface, the `patrol_preflight` snapshot field on `/api/settings/ai` that hydrates the Verify Patrol panel on page load, and the auto-trigger contract on `POST/PUT /api/settings/ai` whose handler dispatches preflight in the background only when the change actually moved Patrol transport so routine saves do not write a new `patrol_preflight` snapshot +1. Update contract tests when payloads change, including admin verification endpoints such as `POST /api/ai/patrol/preflight` whose response shape (`tool_call_observed`, `duration_ms`, classified `cause`/`summary`/`recommendation`, plus `recorded_at`/`recorded_at_unix` for the cached snapshot) is part of the canonical Patrol diagnostic surface, the `patrol_preflight` snapshot field on `/api/settings/ai` that hydrates the Verify Patrol panel on page load, the auto-trigger contract on `POST/PUT /api/settings/ai` whose handler dispatches preflight in the background only when the change actually moved Patrol transport so routine saves do not write a new `patrol_preflight` snapshot, and the startup-seed contract where `NewAISettingsHandler` dispatches the same async preflight after `LoadConfig()` succeeds so the first `/api/settings/ai` poll after a Pulse restart already carries a populated `patrol_preflight` snapshot 2. Update frontend API types in the same slice 3. Route runtime changes through the explicit API-contract proof policies in `registry.json`; default fallback proof routing is not allowed 4. Update this contract when canonical payload ownership changes diff --git a/docs/release-control/v6/internal/subsystems/storage-recovery.md b/docs/release-control/v6/internal/subsystems/storage-recovery.md index 7640fe98e..29b17204a 100644 --- a/docs/release-control/v6/internal/subsystems/storage-recovery.md +++ b/docs/release-control/v6/internal/subsystems/storage-recovery.md @@ -762,7 +762,7 @@ bypass the API fail-closed execution gate. ## Completion Obligations -1. Update this contract when canonical storage or recovery entry points move. Routes added under the shared `internal/api/` extension point that are clearly outside storage/recovery ownership (for example `POST /api/ai/patrol/preflight`, the `patrol_preflight` snapshot field added to `/api/settings/ai`, and the auto-trigger preflight dispatch on settings save — all owned by ai-runtime) do not extend this subsystem's contract; they live in their owning subsystem. +1. Update this contract when canonical storage or recovery entry points move. Routes added under the shared `internal/api/` extension point that are clearly outside storage/recovery ownership (for example `POST /api/ai/patrol/preflight`, the `patrol_preflight` snapshot field added to `/api/settings/ai`, the auto-trigger preflight dispatch on settings save, and the startup-seed dispatch in `NewAISettingsHandler` — all owned by ai-runtime) do not extend this subsystem's contract; they live in their owning subsystem. 2. Keep recovery store/runtime changes aligned with the storage and recovery frontend proofs in `registry.json` 3. Tighten guardrails when legacy storage or recovery presentation paths are removed 4. Preserve the dependency split: API payload ownership stays in `api-contracts`, settings shell ownership stays in `frontend-primitives`, and canonical resource truth stays in `unified-resources` diff --git a/frontend-modern/src/api/__tests__/patrolPreflight.test.ts b/frontend-modern/src/api/__tests__/patrolPreflight.test.ts index c61d4090b..3e706c444 100644 --- a/frontend-modern/src/api/__tests__/patrolPreflight.test.ts +++ b/frontend-modern/src/api/__tests__/patrolPreflight.test.ts @@ -102,6 +102,32 @@ describe('runPatrolPreflight', () => { expect(result.recorded_at_unix).toBe(1778421251); }); + it('rehydrates from a startup-seeded snapshot the same way as a manual preflight result', async () => { + // Pulse boots with NewAISettingsHandler dispatching async preflight + // when assistant + Patrol model are configured. The first + // /api/settings/ai poll after boot already carries patrol_preflight + // populated by that startup goroutine. The runPatrolPreflight + // client surface stays the same; tests guard that the shape we + // accept does not regress when the cache was populated server-side + // without an explicit POST from the frontend. + vi.mocked(apiFetchJSON).mockResolvedValueOnce({ + success: true, + provider: 'deepseek', + model: 'deepseek-v4-flash', + tool_call_observed: true, + duration_ms: 1820, + message: 'Provider accepted the preflight request and the model emitted a tool call.', + recorded_at: '2026-05-10T15:30:42Z', + recorded_at_unix: 1778430642, + }); + + const result = await runPatrolPreflight(); + + expect(result.success).toBe(true); + expect(result.tool_call_observed).toBe(true); + expect(result.recorded_at_unix).toBe(1778430642); + }); + it('exposes the soft-warning shape when the model accepted the request but did not call the tool', async () => { vi.mocked(apiFetchJSON).mockResolvedValueOnce({ success: false, diff --git a/internal/api/ai_handlers.go b/internal/api/ai_handlers.go index 9efcc79bf..16d2a348a 100644 --- a/internal/api/ai_handlers.go +++ b/internal/api/ai_handlers.go @@ -232,6 +232,15 @@ func NewAISettingsHandler(mtp *config.MultiTenantPersistence, mtm *monitoring.Mu } if err := defaultAIService.LoadConfig(); err != nil { log.Warn().Err(err).Msg("Failed to load AI config on startup") + } else if aiSettingsUpdateRequiresPatrolPreflight(nil, defaultAIService.GetConfig()) { + // Seed the Patrol preflight cache on startup so the UI's + // "last verified" indicator is populated on first load after + // a Pulse restart, without forcing operators to save + // settings or click Verify Patrol just to recover the + // observability they had before the restart. The dispatch + // is async with its own timeout, so it cannot block boot. + log.Info().Msg("Auto-seeding Patrol preflight cache for startup observability") + defaultAIService.TriggerPatrolPreflightAsync("", "") } } handler.defaultAIService = defaultAIService diff --git a/internal/api/ai_handlers_test.go b/internal/api/ai_handlers_test.go index 70bfb6067..ba86f5910 100644 --- a/internal/api/ai_handlers_test.go +++ b/internal/api/ai_handlers_test.go @@ -2923,6 +2923,29 @@ func TestAISettingsUpdateRequiresPatrolPreflight(t *testing.T) { new: enabled("deepseek:deepseek-v4-flash", "sk-new"), want: true, }, + { + // Same predicate also drives the startup-seed path in + // NewAISettingsHandler — passing nil for the prior config + // represents "no in-memory cache yet, just booted." When + // the loaded config has assistant + Patrol model, we should + // preflight to populate the cache before the first + // /api/settings/ai poll arrives so the UI's "last verified" + // indicator is not blank after every restart. + name: "startup seed (no prior cache, assistant enabled with patrol model) → trigger", + old: nil, + new: enabled("deepseek:deepseek-v4-flash", "sk-loaded-from-disk"), + want: true, + }, + { + // Startup-seed must NOT fire when assistant is disabled — + // otherwise we'd write a misleading "Pulse Assistant is not + // enabled" entry into the cache for an operator who simply + // hasn't enabled assistant yet. + name: "startup seed when assistant disabled → skip", + old: nil, + new: &config.AIConfig{Enabled: false, PatrolModel: "deepseek:deepseek-v4-flash"}, + want: false, + }, { name: "assistant just enabled → trigger", old: &config.AIConfig{Enabled: false, PatrolModel: "deepseek:deepseek-v4-flash", DeepSeekAPIKey: "sk-same"},