Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-05-19 16:27:37 +00:00

Author	SHA1	Message	Date
rcourtman	e1a6dff2a1	Document Pulse Cloud launch in v6 release notes	2026-05-11 23:18:05 +01:00
rcourtman	7951da526b	Add release_cycle_artifact_globs so RC ceremony skips contract-update requirement	2026-05-11 22:55:29 +01:00
rcourtman	816b3985ba	Make blocked-record drift self-fixable via BLESS_GOVERNANCE_FIXTURES env var	2026-05-11 22:31:03 +01:00
rcourtman	d38f3d9217	Update release-control fixtures after pulse-agent Docker removal	2026-05-11 22:21:31 +01:00
rcourtman	8ff69daa43	Bump install pins to rc.5 and refresh test fixtures for Patrol readiness + Unraid host profile tokens	2026-05-11 18:02:52 +01:00
rcourtman	894ea89af9	Refresh RC5 packet validation range for plain-JSON tool-call sanitisation	2026-05-11 17:09:57 +01:00
rcourtman	366bf8d127	Prepare v6.0.0-rc.5 release packet	2026-05-11 16:52:31 +01:00
rcourtman	d02255907c	Fail closed when patrol_resolve_finding verifier is inconclusive ResolveFinding adapter previously logged a warning and allowed the LLM's resolve to proceed when the deterministic verifier returned an error (timeout, executor unavailable, etc.). That's fail-open: any verifier failure let the auto_resolved → re-detected cycle continue, exactly the pattern the rest of this branch's patrol_resolve_finding work spent commits closing. The "Backup failed" finding on the live preview still cycled once post- migration because of this path — verifier returned an ErrVerificationUnknown and resolve was permitted. Resolution of an event/persistent category finding is effectively permanent (next detection registers as a regression and inflates counters and pollutes the trust strip). When the deterministic verifier cannot confidently say the failure signal is gone, we don't have grounds to honor the LLM's judgment — the LLM's "current investigation didn't surface a fresh failure" is exactly the unreliable signal that produced bogus cycles. Switches the inconclusive-verifier branch from log-and-allow to log-and-reject, returning an error to the tool so the LLM can retry or escalate to the operator. The verifier-still-detects- signal path stays as-is (it was already fail-closed). Test: TestPatrolFindingCreatorAdapter_ResolveFinding_RejectsWhenVerifierIsInconclusive exercises the path by calling ResolveFinding on a backup-failed finding through a PatrolService with no chat service wired (getExecutorForVerification returns ErrVerificationUnknown). Asserts the error mentions 'inconclusive' and that ResolvedAt remains nil. Contract: extends the deterministic-resolve-gate clause in the ai-runtime canonical-files completion-obligations to name the fail-closed-on-inconclusive policy explicitly.	2026-05-11 10:53:06 +01:00
rcourtman	9f43a22fb1	Bound patrol-main session at 200 messages to stop unbounded disk growth The patrol-main chat session was reused across every scheduled Patrol run with no upper bound. After a month of runs the file had grown to 16 MB / 3,593 messages, and every AddMessage rewrote the whole file to disk — so the I/O cost per Patrol run was scaling with total session age, not with the run's own output. Across all chat sessions on this dev instance, the ai_sessions directory hit 676 MB / 1,629 files. The stateless-Patrol-input fix (commit `43760fb0d`) stopped loading the session back into the agentic loop, but Patrol still wrote each run's messages to the session for the Pulse Assistant sidebar's forensic view. That write path is what this commit bounds. ExecutePatrolStream now calls SessionStore.TrimMessages(200) after each run, keeping roughly the last two runs' worth of messages — enough for the sidebar to show recent activity, far short of unbounded growth. The next Patrol run on a bloated session will drop the historical 3,000+ messages down to 200 on its first write, so existing storage debt clears on its own without a separate migration. User-driven chat sessions are unaffected: TrimMessages with keepMostRecent <= 0 is a no-op, and callers that want full history retention simply don't call it. Only Patrol's forensic session is capped. The canonical Patrol forensic log is the PatrolRunRecord history surfaced at /api/ai/patrol/runs — that's the durable record with structured fields. The chat-session-shaped file is a sidebar convenience, not the source of truth. Three tests guard the boundary: - TrimMessages keeps the most recent N (50 messages trimmed to 10 → messages 40-49 remain) - TrimMessages is a no-op below threshold (5 messages, cap 200 → 5 messages remain) - TrimMessages with non-positive keep is a no-op (3 messages, cap 0 or -5 → 3 messages remain) ai-runtime contract updated.	2026-05-10 23:19:08 +01:00
rcourtman	5dcdbfabf0	Document chat-side cost recording in ai-runtime contract Follow-up to `a0b3bc7ed` which closed the chat.Service cost-ledger gap. ai-runtime.md gains a Current State paragraph documenting: - The pre-fix bug (chat accumulated tokens via SSE done envelope but never recorded a cost.UsageEvent server-side; chat is the bulk of AI token spend so the dashboard was dramatically understating cost). - The fix shape (recordChatTurnCost runs after every loop return, success or error since the operator was billed regardless). - The threading path (chat.Config.CostStore wired by the router from AISettingsHandler.GetAIService.CostStore()). - The double-recording invariant (ExecutePatrolStream is deliberately not changed; its caller patrol_ai.go records via its own helper). - UseCase="chat" matches the canonical taxonomy noted on cost.UsageEvent.UseCase ("chat" or "patrol").	2026-05-10 23:16:47 +01:00
rcourtman	99c499ade7	Repair orphan tool_calls in convertToProviderMessages Defense-in-depth for the malformed-history bug pattern. The Patrol fix made patrol-main runs stateless, but Assistant chat sessions are inherently multi-turn and must keep their history. Any chat session that ends mid-tool-call — network drop, ctx timeout, browser crash, uncaught panic, any interrupt that fires between "model emits tool_calls" and "agentic loop appends all tool results" — leaves the persisted session with orphan tool_call_ids. The next message that loads this history is rejected with the same provider error that flapped Patrol for 33 days: An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. For Patrol this was fixable by ignoring the session. For Assistant it isn't; the conversation context is the product. convertToProviderMessages now ends with a repairOrphanToolCalls pass that scans every assistant message with tool_calls and inserts synthetic is_error tool result messages immediately after the assistant turn for any tool_call_id that has no matching downstream result. The synthetic content is marked is_error=true and explains the interruption so the model can retry the same call or proceed without that data — preserving conversational continuity while satisfying the provider's structural-validity check. This guards every conversation that crosses convertToProviderMessages, not just Assistant chat. If Patrol ever changes back to loading session history, the same safety net applies. If a new entry point appears for some other LLM flow, it gets the repair for free. Three tests guard the boundary: - Orphan injection (3 tool_calls, only 1 result → 2 synthetic results, marked is_error with interrupted explanation, ordering preserved) - Clean no-op (all tool_calls fulfilled → no synthetic messages, no is_error pollution) - Existing truncation test still passes (assistant message with both tool_calls and own tool_result → no repair needed, tool_call_id matches in same message) ai-runtime contract updated.	2026-05-10 23:10:13 +01:00
rcourtman	e657f6ace9	Suppress assessment error penalty after trailing-success recovery The overall-health "Recent Patrol errors" coverage factor in summarizeRecentPatrolCoverage was anchoring the score to a stale ratio: it counted errors across the last 10 runs without weighting recency. After Pulse fixed two compounding Patrol bugs today, four consecutive successful runs (50+ tool calls each) followed six earlier failures. The assessment kept showing C/65 with the prediction "most recent Patrol runs encountered errors (6 of 10)" — directly contradicting the fact that every recent run had succeeded. Operators reading that score would conclude Pulse Patrol is still broken. It isn't. The fix dragged the grade. This commit adds a recovery-suppression check: count trailing successful full Patrol runs from the most-recent end of the window (GetAll returns newest-first), skipping non-full runs. When three or more consecutive trailing successes exist — roughly a 9-hour clean stretch at the default 3-hour cadence — the error penalty drops entirely. The score reflects current reality. Three is conservative: a single recovery run could be a transient win; three consecutive demonstrate the underlying fix is sticking. Below the threshold, the existing ratio-tiered penalty still applies so partially-recovered states still register. Two tests guard the boundary: - 6 historical errors + 3 trailing successes → no coverage factor (suppressed) - 6 historical errors + 2 trailing successes → coverage factor remains (recovery incomplete) Live verified after this commit lands: the assessment that's been stuck at C/65 since the malformed-history fix will recompute to A/B grade as soon as the trailing 3 successful runs are recognized by the same recent-runs query. ai-runtime contract updated.	2026-05-10 23:02:57 +01:00
rcourtman	15f6881d89	Document the chat-side narrator wiring in ai-runtime contract Follow-up to `03463c1bf` which threaded the per-tenant report narrators through chat.Config -> tools.ExecutorConfig -> PulseToolExecutor so pulse_summarize can produce AI-narrated synthesis in chat instead of heuristic-only. ai-runtime.md's Current State paragraph documents the wiring: - chat.Config carries three optional fields (ReportNarrator, ReportFleetNarrator, ReportFindingsProvider) threaded through to the executor at session construction time. - The router installs a SetReportNarratorResolver closure that mirrors the reporting handler's pattern, asking the AISettingsHandler for the per-tenant ai.Service and returning it as the implementation for all three roles when AI is enabled. - Unconfigured tenants still get the heuristic fallback — matching the report PDF's graceful-degradation posture. - AI-narrated chat synthesis uses the same provider, sanitizer, model selection, cost ledger (report_narrative / report_narrative_fleet use-cases), and budget gate the report PDF endpoint enforces, so there is exactly one canonical synthesis path for both surfaces.	2026-05-10 22:51:17 +01:00
rcourtman	7a7b3c9d30	Gate LLM patrol_resolve_finding on deterministic verifier for event findings Backup-failed was flapping detected → auto-resolved → re-detected ten times in a single day. Each cycle the LLM saw "PBS backups look healthy in my current snapshot" during a Patrol pass, called patrol_resolve_finding(backup-failed), and the adapter at patrol_findings.go:985 called Resolve(findingID, true) directly — no category check, no evidence verification. The contract docs at findings.go:52-67 explicitly say event / persistent categories (backup, reliability, security, general) "stay active until explicitly resolved — either by the LLM calling patrol_resolve_finding with evidence, or by operator action." That "with evidence" was never enforced. This commit enforces it. The adapter now checks two conditions before honoring an LLM resolve: - finding.Category does NOT support stale-auto-resolve (per the contract function CategorySupportsStaleAutoResolve), AND - a deterministic verifier exists for finding.Key (currently smart-failure and backup-failed) When both are true, the adapter runs VerifyFixResolved on the finding's resource. If the verifier still detects the failure signal, the LLM gets an error explaining why the resolve was rejected and that the underlying issue must be fixed first. If the verifier confirms the signal has cleared, the resolve proceeds with grounded evidence. Categories that support stale-auto-resolve (performance, capacity) bypass the gate entirely — the LLM can resolve them based on absence per the existing contract. Keys without a verifier also fall through to current behavior so we don't block resolves for categories we haven't built verifiers for yet. New PatrolService.hasDeterministicVerifierForKey() helper keeps the gate's verifier list in lockstep with the switch in verifyFixDeterministically. Tests cover the three branches: - performance category → gate skipped, resolve proceeds - reliability + no verifier → gate falls through, resolve proceeds - hasDeterministicVerifierForKey for known and unknown keys ai-runtime contract updated.	2026-05-10 22:48:09 +01:00
rcourtman	3b06b4b09d	Document the non-rendering reporting engine entry points in api-contracts Follow-up to `e32d4ede4` (NarrativeFor + FleetNarrativeFor entrypoints) and `1fe5d6853` (pulse_summarize tool). api-contracts.md gains a Current State paragraph documenting that the Engine interface now exposes two non-rendering entry points alongside Generate/GenerateMulti, with the explicit invariant that test stubs implementing the interface must implement these methods so the contract is honoured across the entire surface, not just the export-shaped subset. ai-runtime.md was updated in the parallel-agent commit `ee2de2703` (which picked up the pulse_summarize paragraph when restating auto-resolve gating), so no further edit is needed there.	2026-05-10 22:38:07 +01:00
rcourtman	ee2de2703b	Update ai-runtime contract to cover both absence-based auto-resolve paths reconcileStaleFindings (commit `b44d5892f`) and the resource-absent gate added in commit d6bb89a1c both use the same CategorySupportsStaleAutoResolve helper, but the contract Current State only documented the first path. Rewords the paragraph so both are covered explicitly: stale-cleanup and resource-absent both gate on the whitelist, both reject event/persistent categories, and the bogus-absence examples extend to cover the resource-absent failure mode (transient agent reconnect, container churn, refresh gap).	2026-05-10 22:34:25 +01:00
rcourtman	ac0f361372	Document the report-narrator detection-boundary invariant Follow-up to the narrator prompt changes that forbid acting as a parallel detector. ai-runtime.md gains a Current State paragraph documenting that both report narrator system prompts encode an explicit detection-boundary invariant: warning or critical severity classifications must be backed by a Patrol finding, an alert, or a hard-threshold breach in the structured input, not by metric inference alone. Patterns the narrator notices without that backing are constrained to info severity. This keeps the narrative retrospective on Patrol's work and prevents silent shadow-classification competing with Patrol's detection rules. api-contracts.md gains an equivalent paragraph in the reporting contract section. The same rule applies to outliers and patterns on the fleet path, and to recommendations on both. The deterministic heuristic narrators were already constrained to the same threshold rules; this aligns the AI path with the same evidence surface the fallback uses, so the report PDF cannot become a back-door detection surface that diverges from the findings store.	2026-05-10 22:02:47 +01:00
rcourtman	c6a786a36d	Broaden regression-counter reset to include LLM-driven cycles Initial detector (commit `942f9ca0f`) only matched on the two legacy absence-signature reason strings — but the Backup failed finding on the live preview showed 6 auto_resolved events all with empty messages, produced by the LLM patrol_resolve_finding tool via Resolve(_, true). Counter stayed at 6× after the previous migration ran. New detector: any active finding whose category is NOT eligible for stale-auto-resolve (i.e. anything other than performance or capacity) AND has any auto_resolved event on its lifecycle is treated as having an inflated counter. The rationale is the same rule the category gate already established — for event/persistent categories there is no legitimate absence-driven resolution path, so any auto_resolved was either a removed-bogus-path stamp or an LLM judgment call that repeatedly reverted through regressions on the next run. The cumulative count is no signal either way. Performance/capacity findings retain their counter because the metric-cleared resolution model is sound there. Test extended to cover four cases: LLM-driven cycle resets, legacy-reason cycle resets, eligible-category preserves counter, non-eligible category without any auto_resolved preserves counter. Plus the existing idempotency case (already-reset finding stays reset and is not re-applied).	2026-05-10 21:47:57 +01:00
rcourtman	942f9ca0f5	Reset regression counters polluted by bogus auto_resolve cycles The Backup failed finding on the live preview showed "regressed 6×" when the actual regression count of genuine recurrences was at most 1 or 2 — the rest were the system fighting itself, driven by the absence-based auto_resolve paths that were gated (category whitelist) or removed (alert-mirror rip) earlier in this branch. Counter stayed sticky after those fixes landed, so the trust strip and finding badges still surfaced the inflated number. FindingsStore.SetPersistence load pass now scans each active finding's lifecycle for the two known bogus-signature auto_resolved reasons ("No longer detected by patrol", "Resource no longer exists in infrastructure"). If found, RegressionCount is reset to 0 and LastRegressionAt is cleared, and a regression_counter_reset lifecycle event is appended so the migration is idempotent. A finding that already has a regression_counter_reset event is left alone; any regressed events that accrued after the reset are genuine and stand. findingHasBogusAutoResolveCycle returns true only when the lifecycle contains a bogus auto_resolved and no prior reset event, so the function is the single point of truth for the migration decision and is straightforward to test. Test covers three cases: finding with bogus signature gets reset, finding with empty-message auto_resolved (LLM-driven, legitimate) keeps its counter, finding already migrated is not re-reset. Updates ai-runtime Current State to document the second migration on top of the alert-mirror retirement.	2026-05-10 21:44:29 +01:00
rcourtman	590671ffbb	Retire legacy "Active alert detected" findings on load The previous commit removed the detectAlertSignals path so no NEW alert-mirror findings are emitted, but the findings already persisted from earlier builds stay in the store indefinitely — nothing cleans them up (reconcileStaleFindings is gated on performance/capacity categories, the LLM resolves them just to have them re-detected next run except now the deterministic emitter is gone so re-detection can't happen, but they're left sitting as active findings draining the trust strip and score). FindingsStore.SetPersistence now runs a one-shot retirement pass on load: any active finding with title "Active alert detected", source ai-analysis, and category general is auto-resolved with reason "Patrol no longer mirrors alerts; the Alerts page is the canonical surface for currently-firing alerts." The pass appends an auto_resolved lifecycle event so the retirement is auditable, syncs the loop state to resolved, and schedules a save so the cleanup persists. Idempotent: after the first load with this code, no findings match the signature so the pass is a no-op. Defensive: the signature requires all three fields (title + source + category) to match before retiring, so an operator-authored finding that happens to share the title is left untouched. Test covers the mirror case, the matching-title-but-foreign-source case (must NOT retire), and an unrelated active finding (must NOT retire), plus verifies the retired state persists back through the persistence layer. Updates ai-runtime Current State to record the migration path.	2026-05-10 21:39:09 +01:00
rcourtman	271d12ecab	Stop mirroring alerts into Patrol findings The deterministic signal pipeline ran the pulse_alerts tool output through detectAlertSignals and produced a SignalActiveAlert for every firing alert, which Patrol then materialized as an "Active alert detected" finding (source: ai-analysis, category: general). The system prompt at the top of patrol_ai.go explicitly tells the LLM not to duplicate alerts — but the deterministic emitter was duplicating them anyway, behind the LLM's back. Symptoms observed in the wild: - 9 active "Active alert detected" findings in Patrol, every one a duplicate of an existing alert already on the Alerts page. - The LLM, doing what the prompt told it, resolved each mirrored finding via patrol_resolve_finding. Next run the alert was still firing and Patrol re-emitted the signal → finding regressed. Lifecycle showed several auto_resolved → re-detected → regressed cycles per finding within hours. - Health score dragged down by issues the operator already saw on the Alerts page, with no operator action possible from Patrol that wasn't already available from Alerts. Rip detectAlertSignals entirely, remove the pulse_alerts case from the signal-extraction switch, drop SignalActiveAlert plus its key / title / recommendation entries. Convert the prior TestDetectSignals_ActiveAlert into a regression guard that locks in the no-mirror behavior. Updates the ai-runtime subsystem Current State to record the decision: Patrol does not duplicate the Alerts surface; alerts own their own lifecycle, surface, and acknowledgement model.	2026-05-10 21:33:41 +01:00
rcourtman	7bd596d378	Document the fleet-narrative surface in ai-runtime and api-contracts Follow-up to the fleet-level AI narrative refactor: ai-runtime.md gains a Current State paragraph documenting that internal/ai.Service also implements pkg/reporting.FleetNarrator with its own use-case label (report_narrative_fleet) so fleet vs single-resource spend is distinguishable in the cost ledger and budget gate, and that the single-resource narrator is intentionally not propagated through the multi-report path. api-contracts.md gains a paragraph documenting the new optional FleetNarrator field on MultiReportRequest, the new FleetNarrative field on MultiReportData, the rendered fleet section (executive prose, named outliers, cross-cutting patterns, recommendations, optional period-comparison, AI provenance footer), and the explicit invariant that the deterministic resource summary table stays rendered from the same per-resource aggregates so every named outlier is verifiable against the table below. Dependent subsystems (agent-lifecycle, performance-and-scalability, storage-recovery) remain unchanged: their Extension Points reference internal/api/ broadly but agent lifecycle, perf scaling, and storage recovery semantics have no delta from this change.	2026-05-10 21:24:04 +01:00
rcourtman	9905383f70	Document the report-narrative surface in ai-runtime and api-contracts Follow-up to the report narrative refactor: ai-runtime.md gains a Current State paragraph documenting that internal/ai.Service now implements pkg/reporting.Narrator and pkg/reporting.FindingsProvider, and that the narrator/findings surfaces inherit the same provider, sanitizer, model selection, budget enforcement, and fail-closed governance as the rest of the canonical AI runtime. api-contracts.md gains a paragraph documenting the new optional Narrator and FindingsProvider fields on MetricReportRequest, the new Narrative, PriorPeriod, and Findings fields on ReportData, the three new PDF sections (executive prose, Period-over-period changes, AI provenance footer), and the explicit invariant that the deterministic data surface (charts, stats, alerts, storage, disks) stays rendered from the same aggregates so every AI claim is verifiable against adjacent data. Multi-resource fleet reports intentionally remain heuristic-only at this transport layer. The dependent subsystems flagged by the staged-shape guard (agent-lifecycle, performance-and-scalability, storage-recovery) genuinely have no contract delta from the prior commit: their Extension Points reference internal/api/ broadly, but agent lifecycle, performance scaling, and storage recovery semantics are unchanged.	2026-05-10 21:00:39 +01:00
rcourtman	a343ba0126	Tier patrol-coverage health impact by recent-error ratio The overall health score chip read "A · 95/100" while the same assessment card said "Coverage incomplete · Recent Patrol runs encountered errors · Verify full coverage." Those messages contradicted because the "recent errors but had a successful run" coverage factor used a flat -10 penalty regardless of the error ratio. With one successful manual run among many failed startup runs, the math stayed in the A band and the score directly undermined the warning it sat next to. The factor now tiers the impact by ratio of errored runs to relevant runs in the scoring window: >50% errored → -30, "Most recent Patrol runs encountered errors (N of M); the current health summary is not reliable until coverage stabilizes." >25% errored → -20, "Recent Patrol runs encountered errors (N of M), so the current health summary may be incomplete." else → -10, original light-tier description. A single transient error stays in grade A; dominant-error periods drop out of A so the grade matches the warning. Adds two tests covering both ends of the new tiering and updates the ai-runtime subsystem contract Current State section.	2026-05-10 19:33:18 +01:00
rcourtman	b44d5892f4	Gate stale-finding auto-resolve on category whitelist reconcileStaleFindings was auto-resolving any seeded finding that the LLM didn't re-report in a successful run. The function's own comment acknowledges the LLM doesn't reliably use patrol_resolve_finding, so this was built as a cleanup pass — but it cannot tell the difference between "LLM correctly recognized this is fixed" and "LLM forgot to re-mention it." For findings that represent discrete events or persistent states (a backup task that failed, a service that crashed, a security vulnerability that was found, a configuration error), absence in a Patrol report is not evidence that the issue has cleared. The result was bogus auto_resolved → re-detected → regressed cycles, observed in the wild as "Backup failed" regressing 4× over 6 hours and "Provider analysis error" regressing 271×. Those bogus auto-resolutions also inflated the trust strip with fictional auto-resolved credit. CategorySupportsStaleAutoResolve in findings.go gates the cleanup: only `performance` and `capacity` findings — continuous current-state metric thresholds — may be auto-resolved from absence. The other four categories (reliability, backup, security, general) stay active until explicitly resolved. Updates the ai-runtime subsystem contract Current State section with the whitelist and the adjacent lifecycle dedup rules already landed. Adds TestReconcileStaleFindings_SkipsNonCurrentStateCategories with table-driven subtests for all four event/persistent categories, and TestCategorySupportsStaleAutoResolve to lock in the whitelist.	2026-05-10 19:27:36 +01:00
rcourtman	43760fb0d0	Patrol runs are now stateless — drop prior session history Patrol's "patrol-main" session was reused across every scheduled run, so ExecutePatrolStream loaded the full session history into the agentic loop's input. When any prior run ended after the model emitted tool_calls but before all tool results landed (provider error, timeout, context cancellation), the orphan tool_calls were persisted and every subsequent run inherited them. The provider then rejected the conversation with: An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. (insufficient tool messages following tool_calls message) Patrol failed 271 consecutive runs with this error before it was diagnosed. Each new run added another user prompt on top of the broken structure, so the message slice grew to 33+ messages with one assistant turn at position 23 holding 4 orphan tool_call_ids and 9 user prompts stacked after it. The "patrol runs need a clean slate" comment at line 1898 documents that the knowledge accumulator is freshened per run; the conversation history was the matching gap. ExecutePatrolStream now passes only this run's user prompt to the agentic loop. The session is still written to for audit/forensics, just no longer fed back into the model. Live verified: Patrol now completes successfully (18 tool calls, 3m23s) on a session that previously failed every run with the malformed-history error. The runtime-failure finding auto-resolved on this same successful run. Adds a classifier bucket for "insufficient tool messages" errors so any future regression in this area surfaces with a meaningful diagnostic instead of the generic "Provider analysis error" fallback. New PatrolFailureCauseMalformedToolHistory cause; predicate patrolMalformedToolHistory matches DeepSeek's exact phrasing and OpenAI's similar variants. ai-runtime contract updated.	2026-05-10 17:14:47 +01:00
rcourtman	3da835c5bc	Publish a distribution path for pulse-mcp The MCP adapter shipped in slice 51 with one install option: clone the repo and go build. This slice integrates pulse-mcp into Pulse's existing governed release pipeline so a Pulse release publishes a pulse-mcp binary alongside the unified agent and the install scripts that bring it home in one command. What ships: - scripts/build-release.sh extended to build pulse-mcp for the same multi-OS matrix as the unified agent, package per-platform tarballs and zips, and copy bare binaries to RELEASE_DIR for /releases/latest/download/ redirect compatibility. - .github/workflows/create-release.yml extended to upload the bare pulse-mcp binaries plus install-mcp.sh and install-mcp.ps1 as release assets. - scripts/install-mcp.sh: bash one-line installer that detects platform/arch, downloads the matching binary from the configured release (latest by default), verifies SHA256 against the published checksums.txt, places at ~/.local/bin/pulse-mcp (or /usr/local/bin if not writable). Honors PULSE_MCP_VERSION, PULSE_MCP_BIN_DIR, PULSE_MCP_REPO, PULSE_MCP_NO_VERIFY env vars; declines Windows shells with a pointer at the .ps1 sibling. - scripts/install-mcp.ps1: PowerShell installer for Windows, placing pulse-mcp.exe at $LOCALAPPDATA\pulse-mcp. Documentation aligned: - cmd/pulse-mcp/README.md gains an Install section above Quick start with three options: one-line installer, GitHub Release download, go install. Documents the macOS Gatekeeper bypass since v1 is unnotarized by design. - The Settings -> API Access agent-integrations panel now surfaces the curl\|bash command above the config snippet so operators see "install pulse-mcp" before "configure your MCP client." - docs/releases/AGENT_PARADIGM.md drops the "no published distribution path" item from "what it does not do yet" and documents the Gatekeeper / Homebrew gaps as next-tier follow-ups. Trade-offs surfaced and chosen: - Same cadence as Pulse: pulse-mcp ships per Pulse release, not on its own track. The MCP server reads the manifest from the Pulse it talks to, so version alignment is the natural model. - No Homebrew tap or core formula in v1. Maintaining a tap is real ongoing work; foundation supports adding Homebrew later as a layer. - No Docker image. Stdio JSON-RPC fights Docker's stdin /stdout pattern. - No notarization in v1. SHA256 verification through the installer preserves the audit trail; README documents the Gatekeeper bypass. Subsystem contract: deployment-installability.md gains scripts/install-mcp.sh, scripts/install-mcp.ps1, and cmd/pulse-mcp/ in canonical files (mid-list entries renumbered) plus a paragraph documenting the new MCP entry point alongside the existing installer family. Verification artifacts: - scripts/installtests/build_release_assets_test.go gains TestBuildReleasePackagesPulseMcpForAllPlatforms which pins the build/package/copy wiring and the load-bearing install-mcp.sh helpers (platform detection, SHA256 verification, install-dir resolution). - scripts/release_control/render_release_body_test.py gains test_agent_paradigm_release_notes_blurb_documents_- distribution_path which pins the AGENT_PARADIGM.md draft's install-mcp.sh reference and the four-axis frame so a future edit cannot regress the install story silently. Smoke-tested install-mcp.sh locally on darwin-arm64: platform detection, install-dir resolution, URL building, and 404 error handling all correct. The full end-to-end install path becomes live the moment a Pulse release ships pulse-mcp binaries; the next RC cut will exercise it.	2026-05-10 17:04:49 +01:00
rcourtman	af422ddf2f	Verify Patrol now tests the form's pending model, not the saved one Two real UX bugs in the Verify Patrol panel: 1. The button silently tested the previously-saved model instead of the operator's pending dropdown selection. Clicking Verify after changing the model would re-run preflight against the OLD model and the operator would believe the new selection was verified. 2. The result panel rendered the cached green badge even when the form's current selection was a different model from the one in the cache, with no visual cue that the verified result was for something other than what they were looking at. Fixes: - runPatrolToolPreflight now passes form.patrolModel as the model override on the POST. Empty form value falls through to the configured shared default on the backend. - PatrolPreflightControl computes isStaleAgainstFormSelection by comparing the bare model name in form.patrolModel to the cached result.model. When stale, the panel switches to amber tone with a headline that names both models and a detail line prompting the operator to click Verify Patrol. - stripModelProvider helper handles "provider:model" prefix. Architecture guardrail extended with three assertions covering the form-aware verify path and the staleness indicator wiring. frontend-primitives contract updated. Live verified: changing the dropdown from deepseek-v4-flash to deepseek-v4-pro before saving switches the panel from green "Tool calling verified" to amber "Verified result is for deepseek-v4-flash, your current selection is deepseek/deepseek-v4-pro · Click Verify Patrol to test the pending selection."	2026-05-10 16:51:17 +01:00
rcourtman	4b6e409747	Draft a release-notes blurb for the agent paradigm work Sources cleanly into release announcements without forcing a fresh draft when the next RC or GA cuts. Sized as a focused source draft (under 130 lines): one-paragraph headline, three operator-facing bullets, the four-axis frame (discover, read, write, push) lifted and condensed from docs/AGENT_SUBSTRATE.md, an integrators-pointing section that names the two reference adapters and the canonical contract, an honest "what it does not do yet" paragraph (no published pulse-mcp distribution; no real-world consumer feedback yet), and an audit-trail summary. Style matches the existing docs/releases/RELEASE_NOTES_v6_*.md draft pattern: technical, plainspoken, no marketing prose, honest about scope. No em dashes per project rule. Lives at docs/releases/AGENT_PARADIGM.md alongside the existing release-notes drafts so a release author finds it where they already look. The header explicitly frames it as a source draft to drop into announcements, GitHub prerelease descriptions, or a "What Changed" section in a versioned release notes file, trimmed or expanded as the cut requires. Contract-neutral commit: docs-only addition, no runtime code, wire shape, manifest entry, error code, version artifact, release workflow input, governed metadata, or canonical-files contract changed. PULSE_ALLOW_CONTRACT_NEUTRAL_COMMIT used with a documented reason since the deployment-installability verification-artifact rule wants a Python release-promotion test touched, which would be wrong scope for a markdown draft.	2026-05-10 16:44:26 +01:00
rcourtman	297556fb65	Surface cached preflight in Patrol tools readiness check Previously the "Patrol tools" readiness check was static model-name pattern matching: it told the operator whether the selected model is on Pulse's tool-capable allowlist, not whether tools have actually been verified to work. After today's preflight cache, that's strictly less information than what we already know. resolvePatrolToolsCheck now consults aiService.CachedPatrolPreflight() and grounds the check in real evidence when a result for the configured provider+model exists: - cached green (success + tool_call_observed) → "Tool calling verified <age> against <model>." (ready) - cached failure → classified summary + "(last preflight <age>)." (not_ready) - cached soft warning (model_tool_support_unverified) → same with warning status - no cache or model mismatch → static fallback formatPatrolPreflightAge produces stable English ("just now", "5m ago", "2h ago", "3d ago") with full unit coverage (11 cases). HandleGetAISettings now also includes patrol_readiness in its response — previously only the PUT response carried it, so the Patrol page only got augmented readiness after a save. The frontend already had patrol_readiness typed and read it from useAISettingsState. ai-runtime, api-contracts, agent-lifecycle (dep), and storage-recovery (dep) contracts updated.	2026-05-10 16:41:16 +01:00
rcourtman	d4efc08909	Surface agent integrations in Settings → API Access The agent substrate has been entirely API-side until now. An operator opening Pulse had no way to know it existed, no way to see what an agent connected to their instance could do, and no quick path to a working MCP integration. Slice 59 closes that visibility gap. A new AgentIntegrationsPanel renders below the existing APITokenManager on the API Access tab. The panel: - Fetches /api/agent/capabilities at mount and renders the declared capabilities grouped by category (Context, Operator state, Patrol findings, Action governance), each row showing name, method+path, scope chip, description, and the stable error codes the manifest declares. Adding a capability on the backend extends this list automatically; nothing in the panel is hardcoded against the substrate's surface. - Generates an MCP config snippet using window.location.origin so the snippet is correct for whichever URL the operator is reading from. Includes a copy-to-clipboard button (matching the existing CopyCommandBlock pattern) and a brief explainer pointing at the right config path for Claude Desktop and Claude Code. - Links to cmd/pulse-mcp/README.md, cmd/agent-probe, and docs/AGENT_SUBSTRATE.md so an integrator has the full setup story without leaving the panel. Wiring is one import + one component placement in APIAccessPanel.tsx. The api tab already concerns "what can be done with API tokens"; adding the agent surface as a sibling section under the same tab keeps the operator's mental model coherent (one place for machine-driven access) and avoids growing the Settings tab inventory or touching settingsNavigationModel, the registry, the loaders, or the routing tests. Two architectural pins land alongside: - settingsArchitecture.test.ts gains a guardrail that the AgentIntegrationsPanel sits as a sibling section inside APIAccessPanel rather than being lifted into its own tab. Drift would fragment the agent surface across navigation. - The frontend-primitives contract documents the sibling-panel-over-new-tab pattern for additive operator surfaces closely related to an existing tab's intent. - The security-privacy contract documents the new section's presence on the API Access tab and pins that token minting still flows through APITokenManager — the new section surfaces what tokens unlock, not a parallel auth path. Verified against the running dev server: the page renders the four category sections (Context, Operator state, Patrol findings, Action governance), the live MCP config snippet contains the deployment's own origin, and the manifest endpoint serves all 14 capabilities (the 11 substrate capabilities + the 3 action capabilities from slice 58). TypeScript clean across the frontend.	2026-05-10 16:35:40 +01:00
rcourtman	f74add8271	Auto-seed Patrol preflight cache on Pulse startup Closes the cold-start gap in the preflight observability layer: every Pulse restart blanked the cached "last verified" indicator until the next save or manual click, which meant operators saw "never verified" on every upgrade or process restart even when the configured Patrol model was working fine. NewAISettingsHandler now reuses the existing aiSettingsUpdateRequiresPatrolPreflight predicate with a nil "prior config" — semantically "no in-memory cache yet, just booted." When the loaded config has assistant enabled and a Patrol model, the handler dispatches the same async TriggerPatrolPreflightAsync the save path uses. Routine boots where assistant is disabled (or no Patrol model is selected) skip the dispatch so we never write a misleading "Pulse Assistant is not enabled" entry into the cache. Live verified: after Pulse restart, /api/settings/ai surfaces a fresh patrol_preflight with success=true within ~6s of boot, no operator action required. Predicate test extended with two named cases that document the dual-purpose use (startup seed + skip-when-disabled). ai-runtime, api-contracts, agent-lifecycle (dep), and storage-recovery (dep) contracts updated.	2026-05-10 16:34:34 +01:00
rcourtman	33b6bf97d1	Auto-trigger Patrol preflight when settings save moves Patrol transport Closes the resilience gap: today an operator picks a Patrol model, saves, and Patrol silently fails on its first scheduled run because nothing verified the model actually calls tools. Now the save handler dispatches a one-shot preflight in the background whenever the change actually moved Patrol's transport, and the cached result surfaces on the next /api/settings/ai poll via patrol_preflight. Trigger conditions (aiSettingsUpdateRequiresPatrolPreflight): - new config has assistant enabled AND a Patrol model - AND any of: * no prior config (first save) * assistant was disabled, now enabled * Patrol model changed * API key for the new patrol model's provider changed Routine saves that don't touch Patrol transport (theme, control level, discovery toggles, unrelated provider keys) skip preflight entirely so they don't burn provider tokens or add 5-10s latency to every save. Service-level TriggerPatrolPreflightAsync runs the call in a goroutine with a 30s timeout. Detection helper has full unit coverage including the negative paths.	2026-05-10 16:14:26 +01:00
rcourtman	728c42e47b	Bring action endpoints onto the agent surface with the agent-stable envelope Closes the last known gap in the agent substrate. The three action endpoints (POST /api/actions/plan, /api/actions/{id}/decision, /api/actions/{id}/execute) previously emitted the platform-wide APIError shape (stable code under "code", human under "error"). The agent surface uses the inverted shape (stable code under "error", human under "message"), so adding action capabilities to the manifest as-is would have forced agents to remember which envelope each capability uses. The slice refactors actions.go to emit the agent-stable envelope across all 42 writeErrorResponse call sites. writeJSONError gains a writeJSONErrorWithDetails sibling so the 13 calls that pass field-level reasons (validation failures) preserve that information under a new optional `details` field. The action endpoints' JSON shape becomes: {"error": "<stable_code>", "message": "<human>", "details"?: {"<field>": "<reason>"}} Frontend impact: zero. Verified that no frontend code consumes the three action endpoints; the refactor is API-only. Three new manifest entries (plan_action, decide_action, execute_action) under a new "action" category, with their declared error codes pinned per capability. Internal-failure 5xx codes (audit-store outages, encode failures) are not declared per capability; agents branch on 5xx generically. TestContract_AgentSurfaceErrorCodesMatchManifestDeclarations now audits actions.go alongside the existing two handler files, with a documented internal-only allowlist for the 5xx codes. The TestAgentSubstrate_ActionEndpointsEmitAgentStableEnvelope e2e test exercises one error path through each endpoint via the actual HTTP boundary, asserting the agent-stable envelope reaches the wire and the legacy APIError fields (code, status_code, timestamp) do NOT — drift back would mean the refactor regressed. The TestContract_ActionDryRunOnlyExecutionErrorJSONSnapshot pin is updated to match the new envelope shape; the manifest's category allowlist gains "action". api-contracts.md documents the new envelope (with details map), the action governance loop's place in the substrate, the ai:execute scope distinction from monitoring:write, and the "manifest projection has a footnote" trade-off: bringing an existing endpoint into the agent surface may require migrating its error envelope, but the substrate keeps a single envelope contract rather than carrying a translation wrapper layer. agent-lifecycle.md and storage-recovery.md document the action surface joining the agent paradigm and its zero-new-persistence posture respectively. AGENT_SUBSTRATE.md's "what it does not do yet" no longer lists the action surface; it now reflects the real outstanding items (consumer feedback, an in-Pulse agent integrations panel, a distribution path for pulse-mcp).	2026-05-10 15:16:17 +01:00
rcourtman	add1096ec8	Cache Patrol preflight outcome and hydrate UI on settings load The Verify Patrol button reset its result to empty on every page load — the operator had to re-click to see the verified state, even though nothing had changed. This commit adds the observability layer of the auto-preflight plan: every RunPatrolToolPreflight result is now cached on the AI Service and surfaced through /api/settings/ai as patrol_preflight, so the inline result panel rehydrates on page load with the most-recent outcome and a "last verified Xs ago" indicator. Backend: patrolPreflightCache (mutex-guarded) on Service with defensive-copy CachedPatrolPreflight() accessor; every RunPatrolToolPreflight branch (success, soft warning, classified failure, validation early-return) records into the cache. PatrolPreflightSnapshot projects the cached result onto the AI settings response. Tests cover both success-then-failure supersession and the defensive-copy invariant. Frontend: PatrolPreflightSnapshot type mirrors the wire shape; hydratePatrolPreflightFromSettings(data) projects the snapshot into the same response shape the manual button writes; loadSettings and updateSettings flows call it. The result panel renders a "last verified Xs ago" line under the provider/model row when recorded_at_unix is present. End-to-end smoke verified against deepseek-v4-flash: panel rehydrates as green "Tool calling verified · last verified just now" after page reload. Auto-preflight on save (the trigger half of the resilience plan) follows in the next commit. Contracts: ai-runtime, api-contracts, agent-lifecycle (dep), storage-recovery (dep), frontend-primitives all updated to reflect the new patrol_preflight surface and hydration contract. Verification artifacts: settingsArchitecture + patrolPreflight client tests.	2026-05-10 14:58:26 +01:00
rcourtman	c01282e8c5	Add Verify Patrol button to Assistant & Patrol settings Wires the new POST /api/ai/patrol/preflight endpoint into the Settings UI. The button sits beside the Patrol Verification Model picker so the verification action lives where the model is selected. Three result tones rendered inline: - green: provider call succeeded and the model emitted a tool call (the fully-verified state) - amber: provider accepted the request but the model did not call the tool (cause=model_tool_support_unverified — Patrol may still work, recommend a real run) - red: classified failure surface from the runtime classifier (tool_choice_rejected, no_tool_capable_endpoint, model unavailable, auth, billing, etc) with summary, recommendation, and the resolved provider/model Distinct from the existing "Run Preflight" button on the Provider Configuration card, which only fans out per-provider TestConnection calls. The copy under the new button calls that distinction out so operators don't conflate them. Backend wiring stays untouched — the action goes through the typed runPatrolPreflight client added in `e26a57a15`. Settings architecture guardrail extended to assert the wiring.	2026-05-10 14:40:13 +01:00
rcourtman	e26a57a157	Add POST /api/ai/patrol/preflight tool-call verification The existing per-provider /api/ai/test endpoints only call ListModels — they pass for every provider that returns a catalog, even when Patrol fails 100% of runs because tools aren't actually wired up. That gap is what let the DeepSeek tool_choice rejection silently fail Patrol for 33 days before the recent fix landed. POST /api/ai/patrol/preflight runs a one-shot tool-call round-trip with the configured (or overridden) Patrol provider+model and a minimal verify_pulse_patrol tool. Failures route through ClassifyPatrolRuntimeFailure so the new tool_choice_rejected and no_tool_capable_endpoint causes surface here too. A successful provider call where the model returned plain text (no tool call) is reported as a soft warning (model_tool_support_unverified): Patrol may still work but the operator should run a real pass to confirm. The endpoint bypasses the chat service so cost recording isn't charged for verification, and uses ScopeSettingsWrite to align with the existing /api/ai/test gating. Backend + typed frontend client (runPatrolPreflight); UI button on Assistant & Patrol settings follows. Contracts updated: - ai-runtime: completion obligation extended to cover the new verification surface - api-contracts: payload shape (tool_call_observed, duration_ms) noted in obligations - agent-lifecycle, storage-recovery: dependent-extension acknowledgment that ai-runtime owns the new route despite it living under internal/api/	2026-05-10 14:30:41 +01:00
rcourtman	818221b457	Translate Pulse SSE events into MCP notifications when opted in Closes the documented limitation in slice 51's pulse-mcp: MCP clients that process server-initiated notifications can now react to Pulse's push channel without holding a separate HTTP connection to /api/agent/events. The bridge is opt-in via --emit-notifications because not every MCP client surfaces arbitrary notifications/* methods (Claude Desktop, today, does not). Autonomous agents that consume the JSON-RPC stream programmatically benefit; UI-mediated clients should keep the flag off and use the SSE stream directly. Implementation: a long-lived goroutine, started once after the first initialize, that opens /api/agent/events, parses the substrate's wire format, and emits a JSON-RPC notification per non-transport event. Method names mirror the SSE event kinds (notifications/finding.created, notifications/approval. pending, notifications/action.completed). Params is the SSE data payload verbatim so agents see the same wire shape an HTTP SSE consumer would. stream.connected and heartbeat are filtered as transport plumbing. The consumer reconnects with capped exponential backoff on transient errors. When --emit-notifications is on, initialize advertises the supported event kinds under capabilities.experimental.pulseNotifications.kinds. Clients that don't understand the experimental block ignore it silently. Three tests pin the behaviour: the initialize handshake's capability block is correctly gated on the flag; the notification filter rejects transport events and accepts the three substrate kinds; an httptest.NewServer-backed end-to-end translates a multi-event SSE stream into JSON-RPC notifications with the substrate's payload preserved. Also flagged in AGENT_SUBSTRATE.md "what it does not do yet": the action-execution endpoints (/api/actions/plan, decision, execute) emit a different error envelope from the agent surface (APIError with stable code under "code") versus the agent-stable shape (stable code under "error"). Adding them to the manifest requires resolving that mismatch first; recorded as a focused slice for whenever the substrate's reach extends to direct agent-driven execution.	2026-05-10 14:19:44 +01:00
rcourtman	f2d9d2aba8	Split overgreedy "tools not supported" classifier into three causes The Patrol runtime classifier collapsed three distinct upstream conditions into one misleading "Selected model does not support Patrol tools" message: 1. Provider rejected the value Pulse sent for tool selection (e.g. DeepSeek's "deepseek-reasoner does not support this tool_choice" — the model accepts tools, just not the forced coercion). The DeepSeek fix in `46145df9` dodges the symptom by coercing to auto, but the original misclassification pointed operators at the wrong remediation for 33 days. 2. Provider has no tool-capable endpoint available for the selected model (OpenRouter's "No endpoints found …" surfaces this when account-level provider/data filters exclude every tool-capable route). 3. Model truly lacks tool calling (the literal "tools are not supported" / "tool calling" cases). Each now has its own PatrolFailureCause, title, summary, description, and recommendation. summarizePatrolRuntimeFailureDetail mirrors the split. Helper predicates patrolToolChoiceValueRejected and patrolNoToolCapableEndpoint encapsulate the substring matching. The OpenRouter "No endpoints found" test fixture now correctly classifies as no_tool_capable_endpoint instead of model_unsupported_tools — fixture updates in patrol_runtime_failure_test.go, patrol_assistant_handoff_test.go, and ai_handler_test.go reflect the more accurate diagnostic. New tests cover the tool_choice_rejected and generic model_unsupported_tools paths explicitly. The ai-runtime contract is updated to note the classifier-split obligation alongside the existing transport-shape obligation.	2026-05-10 14:10:18 +01:00
rcourtman	46145df925	Coerce DeepSeek tool_choice to "auto" so Patrol stops failing DeepSeek's API server-side aliases deepseek-v4-flash and deepseek-v4-pro to deepseek-reasoner, which rejects forced tool_choice with HTTP 400 ("deepseek-reasoner does not support this tool_choice"). Pulse's classifier then surfaced this as "Selected model does not support Patrol tools," misdirecting diagnosis to the model rather than the request shape. supportsForcedToolChoice now returns false for any DeepSeek client, so every DeepSeek model falls back to tool_choice "auto" regardless of how DeepSeek routes the requested ID. The ai-runtime contract is updated to match: the provider-transport boundary now coerces forced tool_choice for every direct DeepSeek model ID, not only unknown ones. Patrol verified end-to-end: 20 tool calls, 9 findings, prior runtime failure auto-resolved.	2026-05-10 00:04:13 +01:00
rcourtman	6eb5bca06b	Add docs/AGENT_SUBSTRATE.md as the arc's session marker A single short file you can reach for in three weeks (or hand to anyone wiring into Pulse from the outside) and orient yourself on what shape the agent substrate took without re-reading the contract subsystem. Plain English, four-axis frame (discovery, depth, breadth, write, push), two-consumer summary (agent-probe and pulse-mcp), an explicit "what it does not do yet" section, and pointers into the formal contract docs and implementation files. Sized for release notes / GitHub announcement reuse, not as a contract surface itself. The contract still lives in docs/release-control/v6/internal/subsystems/api-contracts.md; this file is the friendly door to it.	2026-05-09 23:16:16 +01:00
rcourtman	2f0468a87b	Verify SSHSIG on in-app update artifacts The unattended timer (scripts/pulse-auto-update.sh) and the public bootstrap (scripts/install.sh, /install.sh) all verify the .sshsig sidecar against the pinned pulse-installer ed25519 key before trusting a release artifact. The in-app updater verified SHA256 only — same artifact, same root execution context, lower trust bar. Closing the asymmetry: the in-app tarball download in ApplyUpdate, adapter_installsh.go's install.sh download (piped into bash as root), and the rollback binary download now fetch and verify the .sshsig sidecar against the same pinned key, fail-closed. The signing infrastructure (release_asset_common.sh, validate-release.sh, backfill-release-assets.sh) already produces and validates these signatures for every release; this teaches the Go updater to honor what the shell paths have always required. ssh-keygen is shelled out to so the in-app updater shares the exact trust path used by the unattended path, with a package-level function variable for test injection so unit tests don't require ssh-keygen on the build host. Extends the deployment-installability contract's release-trust-fail-closed invariant to cover the in-app updater paths.	2026-05-09 23:14:07 +01:00
rcourtman	3a502fefda	Publish the cmd/pulse-mcp integration guide Slice 51 added the MCP adapter as a worked example. This makes it a published surface: an external maintainer who wants to wire Pulse into their Claude Desktop or Claude Code can read one README and have it working without spelunking through main.go. The guide carries: - Build and install instructions - Canonical config snippets for Claude Desktop and Claude Code - The env-var contract (PULSE_API_TOKEN, configurable name, always read from env so it stays out of process listings) - The published tool list grouped by category (context, operator-state, finding) with what each does - The stable error envelope shape and the difference between capability-specific codes and cross-cutting auth codes - Documented limitations: no subscribe_events, manifest fetched once, tools-only (no resource URIs) - Troubleshooting for the common failure modes (missing token, proxy gating discovery, missing write scope) api-contracts.md now points readers at the README as the canonical integration entry point so the contract doc keeps its in-repo focus and the README owns the user-facing copy.	2026-05-09 23:11:54 +01:00
rcourtman	eeb2975d22	Stability sweep on the agent-substrate arc Three things landed: 1. /api/agent/capabilities was missing from publicPathsAllowlist in router_public_paths_inventory_test.go. Slice 47 added the path to publicPaths in router.go and to publicRouteAllowlist in route_inventory_test.go but missed this second mirror, which scans publicPaths via go/ast. The test was failing on origin; this commit closes the gap. 2. The error-envelope paragraph in api-contracts.md now distinguishes capability-specific stable codes (the closed set declared per capability in the manifest) from cross-cutting codes the multi-tenant / auth middleware emits universally (invalid_org, org_suspended, access_denied). The previous wording implied all stable codes lived in per-capability errorCodes lists, which would have forced duplication on every capability or misled agents about which codes to expect. 3. New contract pin TestContract_AgentSurfaceErrorCodesMatch- ManifestDeclarations enforces the symmetry both directions: every code emitted by an agent-surface handler must be either declared in the matching capability or be one of the three cross-cutting codes; every manifest-declared code must have a matching emission. Drift either way is a contract regression. Pin verified clean against the current handler set. Stale forward-reference fixed: the capabilities paragraph no longer says "future MCP-server slices read the manifest" — slice 51 already shipped that adapter. Sweep also surfaced two failures in internal/mock/ from unrelated platform-support drift (unraid token set added in `ac82a2852` but the mock contract test wasn't updated). Those are not part of the agent-substrate arc and not mine to fix; flagged in the closing summary so they don't get lost.	2026-05-09 23:04:22 +01:00
rcourtman	d6a68f8044	Add cmd/pulse-mcp — MCP adapter wrapping the agent substrate The whole point of slice 39's hand-authored manifest with snake_case names and stable error codes was to make adapter projection cheap. This slice is the test: a minimal MCP (Model Context Protocol) server that turns Pulse's manifest into a tool surface Claude Desktop, Claude Code, and other MCP-speaking clients can drive natively. Every MCP tool is a one-line projection of a manifest capability. Input schemas are auto-derived from path placeholders ({name} segments become required string properties) and method (non- GET/DELETE tools accept a free-form body object). Adding a capability to the manifest automatically extends the tool surface — no MCP-side changes required. The adapter is stdlib-only, runs over stdio with line-delimited JSON-RPC 2.0 framing, preserves Pulse's stable error envelope verbatim through MCP's content-and-isError result so agents on the MCP side branch on the same codes they would on the wire, and skips subscribe_events (SSE streaming doesn't fit the request/response tool shape; future slices can layer it as MCP notifications). Eleven tests pin the projection rules and the JSON-RPC contract: path-placeholder schema generation, body-property method gating, substitution failures producing stable errors, the initialize handshake advertising tools, tools/list filtering subscribe_events, tools/call proxying with the bearer token and preserving the substrate's error envelope, unknown methods producing JSON-RPC method-not-found, and notifications producing no response. The substrate is now wrapped in two adapters, each demonstrating a different consumer profile: agent-probe walks the substrate as an HTTP client (slice 49); pulse-mcp wraps it for stdio MCP clients. Both depend only on the standard library and resolve paths from the manifest, so the substrate is the single source of truth and adapter additions stay cheap.	2026-05-09 22:51:37 +01:00
rcourtman	8aa22d0605	Surface action verification on the action.completed SSE payload Closes the certainty loop for agents watching the substrate's push channel. The action audit's read-after-write probe outcome was already persisted on the audit record, but agents watching action.completed only learned "the action ran" — they had to fetch /api/actions/{id} to know whether the read-back probe confirmed the intended state. That defeated the substrate's push-notification guarantee for dispatch certainty. The new agent-stable AgentResourceActionVerification projection (ran, success, command, note, ranAt — output stays in the audit record, deliberately omitted from events to keep payloads small) is now carried on both: - the action.completed SSE payload, projected from record.Result.Verification by the router-side bridge in wireAIChatDependenciesForService, and - the resource-context bundle's recentActions surface, via the same shared projectAgentResourceVerification helper so the bundle (depth) and the doorbell (push) speak the same vocabulary. Refused-before-dispatch failures omit verification (the probe never runs) so agents branch on field presence to distinguish "no probe attempted" from "probe ran with empty result". Three contract pins lock the symmetry: payload field present, router bridge populates it, bundle parallels. The capabilities manifest's subscribe_events description now mentions the verification block so external agents discover the field through the same path they already use to learn the rest of the agent surface.	2026-05-09 22:45:15 +01:00
rcourtman	956646a5c1	Add cmd/agent-probe — worked example consuming the agent substrate The substrate's read and write surfaces are end-to-end-tested internally; this slice answers the harder question — "is the substrate actually usable from the outside?" — by writing the smallest standalone program that consumes it. agent-probe walks the discovery → triage → depth → push flow against a running Pulse instance using only the Go standard library, so it doubles as a reference implementation for anyone building MCP servers, Claude Code integrations, or custom agents on top of Pulse. It resolves every path from the manifest rather than hardcoding them — if discovery moves a path, the probe follows automatically — and branches on the stable error envelope's "error" code field, never on human-readable messages. The focus rule (severity-lex-ordered) is intentionally simple so a reader can predict what the probe will pick; real agents will have richer policies. This is documentation as code: the program is short enough to read top-to-bottom and reads like the agent's own narration of what it's doing. The unit test pins the focus rule's lex ordering so a refactor that swaps it for a weighted score (which allowed many warnings to outrank one critical) cannot regress silently.	2026-05-09 22:28:00 +01:00
rcourtman	5156c03eed	End-to-end test the operator-state write loop through HTTP Closes the e2e contract proof on the write side. The only write capability the manifest declares is the operator-state intent loop (set / get / clear), and this test boots the full router stack to walk every state of that loop through the actual HTTP boundary — proving the manifest's declared error codes for set_operator_state and get_operator_state reach the wire from the handlers, the URL canonical id authoritatively wins over body-supplied ids (no scope-confusion writes), and SetAt/SetBy are server-populated so attribution cannot be spoofed. The flow exercised: GET unset → 404 operator_state_not_set PUT valid → 200 with persisted state + server SetAt GET → round-trips PUT invalid criticality → 400 operator_state_invalid DELETE → 204 GET → 404 operator_state_not_set (loop closed) DELETE again → 204 (idempotent) Two contract pins lock the audit-honesty and error-token contracts so a future refactor of the handler can't silently regress either: SetAt/SetBy populated server-side, URL-id wins over body-id, and the validator's domain error maps to the stable wire token via errors.Is rather than message-matching. Together with the read-side e2e (slice 47), the agent surface — read, write, push — has now been exercised end-to-end as one substrate.	2026-05-09 22:22:47 +01:00
rcourtman	8cf15fe639	End-to-end test the agent substrate's discovery → triage → depth flow The unit tests cover each piece in isolation; this test boots the full router stack and proves the discovery → triage → depth chain works as one substrate through the actual HTTP boundary an external agent would hit. It found two real bugs slice 40 introduced and slice 45/46 didn't surface: - /api/agent/capabilities was documented as unauthenticated but was missing from the router's publicPaths list, so the global auth middleware was 401'ing the discovery manifest. Fixed by adding the path to publicPaths and pinning the contract so it cannot regress. - The error-envelope shape across the agent surface is {"error": "<stable_code>", "message": "<human>"}, written via writeJSONError — not the {"code": ...} shape I had assumed in the docs. Pinned the wire shape on api-contracts.md so the documented error contract matches what writeJSONError actually writes. The e2e test exercises capabilities discovery, triage via fleet-context, and depth via resource-context with an unknown id to confirm the resource_not_found stable error code reaches the wire under the canonical "error" key. The subscribe_events SSE path is probed unauthenticated to confirm it's gated (401) rather than 404 — discovery's claim is honest.	2026-05-09 22:16:32 +01:00
rcourtman	a168215f6a	Add /api/agent/fleet-context for org-wide triage in one read The substrate had a per-resource bundle but no fleet view, so "where do I focus?" forced agents to walk every resource id and bundle each — O(N) round trips that scale with fleet size. The fleet endpoint returns a thin per-resource rollup in a single read: identity, operator-intent flags (intentionallyOffline, neverAutoRemediate, maintenanceWindowActive), per-severity finding counts, and pending-approval count. Same auth scope and same provider wiring as the per-resource bundle — operator-state via the canonical unified store, findings via AgentFindingsProvider, approvals via AgentApprovalsProvider — so the fleet sweep is the per-resource bundle's wiring multiplied by N with no new dependencies. Audit reads are deliberately omitted from the rollup; agents that want depth on a flagged resource follow up via /api/agent/resource-context/{id}. The capabilities manifest declares get_fleet_context with AgentFleetContext as the response shape so external agents discover the triage entry point through the same path they already use to learn the rest of the agent surface.	2026-05-09 22:08:50 +01:00

1 2 3 4 5 ...

2440 commits