Commit graph

279 commits

Author SHA1 Message Date
rcourtman
d5b4850715 Harden AI session storage paths 2026-03-28 13:50:55 +00:00
rcourtman
4b61746f3b Adapt Patrol retry budgets to provider context limits (#1370) 2026-03-27 10:57:14 +00:00
rcourtman
608f184666 Retry Patrol with reduced seed context on provider window errors (#1370) 2026-03-26 23:16:28 +00:00
rcourtman
c12394c17f Route patrol investigations through patrol model (#1360) 2026-03-26 09:16:38 +00:00
rcourtman
4ba888b450 Fix Pulse Assistant startup for legacy OpenAI-compatible configs (#1339) 2026-03-25 23:54:17 +00:00
rcourtman
1de1392c9b Preserve provider metadata in AI model lists (#1320) 2026-03-25 13:08:15 +00:00
rcourtman
5f372e257f Respect patrol model provider in quick analysis 2026-03-25 13:01:43 +00:00
rcourtman
73786a9e27 Skip patrol triggers when patrol is disabled (#1258) 2026-03-25 11:33:34 +00:00
rcourtman
f9bf42498f Fix Gemini cost estimation tiers (#1360) 2026-03-25 09:55:17 +00:00
rcourtman
ae2edbde20 fix(ai): complete wiring on first-time configure; guard Ollama fallback
Three follow-up fixes:

1. RestartAIChat() now performs the full post-start wiring (MCP providers,
   patrol adapter, investigation orchestrator) when the service starts for
   the first time via Restart(). Previously these were only wired via
   StartAIChat(), leaving first-time configure with a partially wired service.

2. The Ollama→OpenAI-compatible fallback in createProviderForModel is now
   guarded by !strings.HasPrefix(modelStr, "ollama:") so explicit
   "ollama:llama3" models are never silently rerouted to a different provider.

3. Windows install script registration check now uses the $Hostname override
   (if set) instead of always looking up $env:COMPUTERNAME, so post-install
   verification works correctly when a custom hostname is specified.
2026-03-13 12:06:08 +00:00
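The Ollama fallback guard in point 2 can be sketched as follows. This is a minimal illustration of the described prefix check, not Pulse's actual `createProviderForModel`; the function and parameter names here are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// providerFor sketches the guarded fallback: bare model names may fall back to
// an OpenAI-compatible endpoint, but an explicit "ollama:" model must never be
// silently rerouted to a different provider.
func providerFor(modelStr string, ollamaConfigured, customOpenAIBase bool) string {
	if strings.HasPrefix(modelStr, "ollama:") {
		return "ollama" // explicit ollama: models always stay on Ollama
	}
	if !ollamaConfigured && customOpenAIBase {
		return "openai-compatible" // fallback only for bare/ambiguous names
	}
	return "ollama"
}

func main() {
	fmt.Println(providerFor("ollama:llama3", false, true)) // ollama
	fmt.Println(providerFor("qwen3-omni", false, true))    // openai-compatible
}
```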
rcourtman
e137f3fbf7 fix(ai): start chat service on first-time configure without restart
When Pulse starts before AI is configured, legacyService is nil.
Saving AI settings called Restart() which bailed immediately on the
nil check, leaving the service unstarted (503 on /api/ai/sessions)
until a full process restart.

Merged the nil and !IsRunning checks so first-time configure now
starts the service inline, same as the already-handled stopped case.

Also: bare model names that ParseModelString routes to Ollama (e.g.
"qwen3-omni") now fall back to a configured custom OpenAI base URL
when Ollama is not explicitly configured — handles manually-typed
model names on self-hosted OpenAI-compatible endpoints.

Fixes #1339, #1296
2026-03-13 11:13:27 +00:00
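The merged nil/`!IsRunning` check can be sketched as below. The `Service` type and `restart` helper are illustrative stand-ins; Pulse's real `Restart()` also wires MCP providers and the patrol adapter.

```go
package main

import "fmt"

type Service struct{ running bool }

// IsRunning is nil-safe so callers need not special-case an unconfigured service.
func (s *Service) IsRunning() bool { return s != nil && s.running }

// restart sketches the fix: a nil service (AI never configured) and a stopped
// service now take the same inline start path, instead of bailing on the nil
// check and leaving /api/ai/sessions returning 503 until a process restart.
func restart(s *Service) *Service {
	if s == nil || !s.IsRunning() {
		s = &Service{running: true} // start inline, first-time or stopped
	}
	return s
}

func main() {
	fmt.Println(restart(nil).IsRunning()) // true: first-time configure starts the service
}
```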
rcourtman
82c615b3b9 Filter virtual disks from SMART checks to prevent false positives (#1329)
ZFS zvols (zd*), device-mapper, virtio disks, and other virtual block
devices don't support SMART and were being reported as FAILED. Use lsblk
JSON metadata to filter by device prefix, transport, subsystem, and
vendor/model. Also treat missing smart_status as unknown rather than
failed, and ignore UNKNOWN health in Patrol/AI signals.
2026-03-08 22:16:24 +00:00
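A sketch of the lsblk-based filter, assuming a simplified subset of `lsblk -J` output; the exact prefixes and fields Pulse inspects may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lsblkDevice mirrors a subset of `lsblk -J -o NAME,TRAN,SUBSYSTEMS` output.
type lsblkDevice struct {
	Name       string `json:"name"`
	Tran       string `json:"tran"`
	Subsystems string `json:"subsystems"`
}

// isVirtualDisk sketches the filter: zvols (zd*), device-mapper, loop, and
// virtio devices don't support SMART, so reporting them as FAILED is a false
// positive. They should be skipped, and absent smart_status treated as unknown.
func isVirtualDisk(d lsblkDevice) bool {
	for _, p := range []string{"zd", "dm-", "loop", "md"} {
		if strings.HasPrefix(d.Name, p) {
			return true
		}
	}
	return d.Tran == "virtio" || strings.Contains(d.Subsystems, "virtio")
}

func main() {
	raw := `{"blockdevices":[
	  {"name":"zd0","tran":"","subsystems":"block"},
	  {"name":"nvme0n1","tran":"nvme","subsystems":"block:nvme:pci"}]}`
	var out struct {
		Blockdevices []lsblkDevice `json:"blockdevices"`
	}
	_ = json.Unmarshal([]byte(raw), &out)
	for _, d := range out.Blockdevices {
		fmt.Println(d.Name, isVirtualDisk(d))
	}
}
```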
rcourtman
499ab812e3 Fix post-release regressions and lock v5 to single-tenant runtime 2026-03-05 23:46:35 +00:00
rcourtman
5bd0563283 test(providers): update Ollama integration tests for timeout parameter 2026-03-01 23:28:16 +00:00
rcourtman
d46b5fc84b fix(ai): route OpenRouter slash-delimited models to OpenAI provider (#1296)
createProviderForModel() only handled "provider:model" colon format.
Models like "google/gemini-2.5-flash" or "google/gemini-2.0-flash:free"
(OpenRouter format) failed because the colon split produced invalid
provider names.

Now uses config.ParseModelString() which correctly detects slash-
delimited models as OpenRouter (routed via OpenAI-compatible API).
2026-03-01 22:29:45 +00:00
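The parsing distinction can be sketched as follows. This mirrors the described behavior of `config.ParseModelString`, not its actual code: slash-delimited ids must be detected before any colon split, because OpenRouter variants like `:free` put a colon inside the model id.

```go
package main

import (
	"fmt"
	"strings"
)

// parseModelString sketches the routing fix: slash-delimited ids are OpenRouter
// models (served via the OpenAI-compatible API) and are kept whole; only
// slash-free strings use the "provider:model" colon format.
func parseModelString(s string) (provider, model string) {
	if strings.Contains(s, "/") {
		return "openrouter", s // keep full id, including ":free" variants
	}
	if i := strings.Index(s, ":"); i > 0 {
		return s[:i], s[i+1:]
	}
	return "ollama", s // bare names route to Ollama by default
}

func main() {
	p, m := parseModelString("google/gemini-2.0-flash:free")
	fmt.Println(p, m) // openrouter google/gemini-2.0-flash:free
}
```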
rcourtman
2fcddecf80 feat(api): add POST /api/ai/patrol/undismiss endpoint to revert suppressed findings (#1300)
The Undismiss() method existed on FindingsStore but was never exposed
via the API. Users who dismissed findings as "not_an_issue" had no way
to revert them.

- Add HandleUndismissFinding handler and route
- Add Undismiss() to UnifiedStore for parity with FindingsStore
- Also remove matching explicit suppression rules on undismiss
2026-03-01 22:29:36 +00:00
rcourtman
d852964696 fix(ai): record patrol and QuickAnalysis token usage in cost store for budget enforcement
Patrol runs, evaluation passes, and QuickAnalysis calls were consuming
LLM tokens without recording them in the cost store. This made the
cost_budget_usd_30d budget setting ineffective since enforceBudget()
never saw patrol spend.

- Add RecordUsage() to ai.Service for thread-safe cost recording
- Add recordPatrolUsage() helper to PatrolService, called on both
  success and error paths for main patrol and evaluation pass
- Record QuickAnalysis token usage in cost store
- Return partial PatrolResponse (with token counts) on error instead
  of nil, so callers can always record consumed tokens
- Propagate partial response through chat_service_adapter on error
2026-03-01 19:19:47 +00:00
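A sketch of thread-safe cost recording under these assumptions: the type, method, and pricing parameters below are illustrative, not Pulse's actual `RecordUsage` signature.

```go
package main

import (
	"fmt"
	"sync"
)

// CostStore sketches the idea: every LLM call (patrol, evaluation, QuickAnalysis)
// records its token spend under a mutex so budget enforcement sees all of it.
type CostStore struct {
	mu       sync.Mutex
	spendUSD float64
}

// RecordUsage converts token counts to dollars at per-million-token rates.
func (c *CostStore) RecordUsage(inTok, outTok int, inPer1M, outPer1M float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.spendUSD += float64(inTok)/1e6*inPer1M + float64(outTok)/1e6*outPer1M
}

// OverBudget is what an enforceBudget-style check would consult.
func (c *CostStore) OverBudget(budgetUSD float64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.spendUSD >= budgetUSD
}

func main() {
	s := &CostStore{}
	s.RecordUsage(1_000_000, 0, 3.0, 15.0) // 1M input tokens at $3/1M
	fmt.Println(s.OverBudget(2.5))         // true: $3.00 >= $2.50
}
```

Recording on error paths too (via a partial response carrying token counts) is what keeps the store honest: tokens are spent whether or not the call succeeds.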
rcourtman
c575c7e295 fix(patrol): rename wearout JSON field to ssd_life_remaining_pct (#1300)
The AI also receives disk data via tool calls (pulse_metrics type="disks"),
not just the patrol context table. The raw JSON field "wearout" was
ambiguous — rename to "ssd_life_remaining_pct" so the field name itself
communicates that 100 = healthy.
2026-02-27 23:12:27 +00:00
rcourtman
3006f51b60 fix(patrol): clarify wearout semantics so AI knows 100% = healthy (#1300)
The patrol context table header said "Wearout" and the tool returned a raw
"wearout" JSON field with no indication that 100 = full life remaining.
The AI interpreted "wearout: 100" as fully worn out and raised false
"100% Disk Wearout" findings on healthy NVMe drives.

Rename the patrol table column to "SSD Life Remaining (100%=new)" and
update the data type comment to clarify the semantics.
2026-02-27 23:05:02 +00:00
rcourtman
9aee8fa293 fix(ui): add Pro badge to Reporting tab and reduce patrol trigger log noise (#1285, #1258)
Show "Pro" badge on the Reporting settings tab so users know upfront
that advanced reporting requires a Pro license, rather than discovering
it after filling out the form.

Downgrade patrol trigger queue-full and rejection messages from Warn to
Debug — these are normal rate-limiting behavior, not actionable warnings.
2026-02-26 21:09:13 +00:00
rcourtman
24f5b1cb31 fix(patrol): cap per-run tokens and reset patrol session history 2026-02-24 11:29:47 +00:00
rcourtman
706502c22d fix(alerts): default NotifyOnResolve to true and prevent patrol queue spam (#1259, #1258)
Recovery notifications were silently disabled for users with pre-5.1.12
configs because the NotifyOnResolve bool field defaults to false when
absent from JSON. Use a *bool probe to detect missing field and default
to true.

Patrol trigger queue filled with warnings when the patrol loop wasn't
running. Gate TriggerPatrolForAlert on p.running and clear the flag
via defer when the loop exits.
2026-02-20 17:56:41 +00:00
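The `*bool` probe technique can be sketched as below: a plain `bool` field decodes a missing JSON key as `false`, which is indistinguishable from an explicit opt-out, while a pointer decodes to `nil` when the key is absent. Field and function names are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// notifyOnResolve sketches the fix: pre-5.1.12 configs lack the field entirely,
// so a nil pointer means "absent" and defaults to enabled, while an explicit
// false is respected.
func notifyOnResolve(raw []byte) bool {
	var probe struct {
		NotifyOnResolve *bool `json:"notify_on_resolve"`
	}
	if err := json.Unmarshal(raw, &probe); err != nil || probe.NotifyOnResolve == nil {
		return true // field missing: default to enabled
	}
	return *probe.NotifyOnResolve
}

func main() {
	fmt.Println(notifyOnResolve([]byte(`{}`)))                          // true (legacy config)
	fmt.Println(notifyOnResolve([]byte(`{"notify_on_resolve":false}`))) // false (explicit opt-out)
}
```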
rcourtman
5666d6a9e8 fix(ai): fsync knowledge store temp file before rename to prevent empty reads
saveToDisk used os.WriteFile which doesn't sync to disk before the
atomic rename. On CI runners with aggressive filesystem caching this
can leave the destination file with zero bytes, causing
TestKnowledgeStore_SaveLoad to fail with "unexpected end of JSON input".
2026-02-18 13:27:47 +00:00
rcourtman
7efcec3120 fix(agents,ai): host URL field, AI Docker routing, Proxmox registration logging (#1197, #1210, #1267)
#1197: Add Custom URL input to the expanded host row in Settings → Agents.
Loads existing URL via HostMetadataAPI on row expand; saves on button click.
Only shown for host-type agent rows.

#1210: Fix agent_connected always false for Docker hosts on Proxmox VMs.
connectedAgentHostnames now also marks Docker host hostnames reachable when
their matching VM/LXC has a node with a connected Proxmox agent, mirroring
the routing logic already used in the control path.

#1267/#1269: Improve Proxmox auto-registration failure logging. Response body
is now included in the error message, and the warning directs users to delete
the state file to force re-registration rather than claiming the node exists.

(cherry picked from commit 305f6d3c94f0da4fc970450a6304da57d6d7fe80)
2026-02-18 12:57:09 +00:00
rcourtman
43af70ca1f fix(patrol): skip alert triggers when Patrol is disabled
TriggerPatrolForAlert was enqueuing into adHocTrigger regardless of
whether Patrol was enabled. With patrolLoop not running (disabled),
nothing drained the channel — it filled on the 10th alert and spammed
"Patrol trigger queue full, dropping trigger" on every subsequent alert.

Read p.config.Enabled in the same RLock as triggerManager and return
early when disabled.

Fixes #1258

(cherry picked from commit 69f399469538f0c9cd59084f6429fed8a793c042)
2026-02-18 12:53:12 +00:00
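The gating described above can be sketched as below, with illustrative names: read the enabled flag under the same `RLock`, return early when disabled, and use a non-blocking send so a full queue drops rather than blocks.

```go
package main

import (
	"fmt"
	"sync"
)

type PatrolService struct {
	mu           sync.RWMutex
	enabled      bool
	adHocTrigger chan string
}

// TriggerPatrolForAlert sketches the fix: when patrol is disabled the loop
// isn't running and nothing drains adHocTrigger, so enqueuing would only fill
// the channel and spam "queue full" warnings on every alert.
func (p *PatrolService) TriggerPatrolForAlert(alertID string) bool {
	p.mu.RLock()
	enabled := p.enabled
	p.mu.RUnlock()
	if !enabled {
		return false // no consumer: don't enqueue at all
	}
	select {
	case p.adHocTrigger <- alertID:
		return true
	default:
		return false // queue full: drop quietly
	}
}

func main() {
	p := &PatrolService{adHocTrigger: make(chan string, 1)}
	fmt.Println(p.TriggerPatrolForAlert("a1")) // false: patrol disabled
	p.enabled = true
	fmt.Println(p.TriggerPatrolForAlert("a1")) // true: enqueued
}
```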
rcourtman
42c01c1be5 fix: probe all guest IPs for reachability, not just first
Patrol only pinged the first IP address of each VM/container, causing
false "unreachable" reports for guests with multiple IPs (common with
Windows VMs that have IPv6 or multi-adapter setups). Now probes all
IPs and marks reachable if any responds.

Fixes #1215
2026-02-10 21:46:11 +00:00
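The any-IP-responds rule can be sketched as follows; the probe function stands in for Pulse's ICMP prober and the names are illustrative.

```go
package main

import "fmt"

// anyReachable sketches the fix: a guest counts as reachable if ANY of its
// addresses responds, instead of only probing the first one (which falsely
// flagged multi-IP Windows guests whose first address was an unpingable IPv6).
func anyReachable(ips []string, probe func(string) bool) bool {
	for _, ip := range ips {
		if probe(ip) {
			return true
		}
	}
	return false
}

func main() {
	// Multi-adapter guest: link-local IPv6 first, reachable IPv4 second.
	probe := func(ip string) bool { return ip == "192.168.1.50" }
	fmt.Println(anyReachable([]string{"fe80::1", "192.168.1.50"}, probe)) // true
}
```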
rcourtman
8bb89c4031 test: add memory regression coverage for AI stores 2026-02-04 19:56:12 +00:00
rcourtman
d2604a6859 test: add AI memory regression coverage 2026-02-04 19:46:20 +00:00
rcourtman
526fb21076 Add tests for guest intelligence and reachability signals
Cover gatherGuestIntelligence (discovery matching, instance fallback,
reachability via mock prober, edge cases), parsePingOutput parsing,
DetectReachabilitySignals, enriched seed context (Service/Reachable
columns, quiet mode variants, health issues fallback), and extend
signal helper tests for SignalGuestUnreachable.
2026-02-04 14:12:50 +00:00
rcourtman
34ca427458 Add unified guest intelligence to patrol seed context
Enrich the patrol seed context with service identity (from discovery
store) and network reachability (via ICMP ping through host agents).
The guest metrics table now includes Service and Reachable columns,
and a Service Health Issues section highlights running-but-unreachable
guests. A new SignalGuestUnreachable signal type creates deterministic
findings for unreachable guests.

New files:
- patrol_intelligence.go: GuestProber interface, GuestIntelligence
  type, gatherGuestIntelligence() with concurrent per-node probing
- patrol_prober.go: agentExecProber implementation using batch ping
  commands via connected host agents
2026-02-04 14:08:57 +00:00
rcourtman
098a722e03 Cover blocked AI fetch hosts 2026-02-04 13:54:32 +00:00
rcourtman
dd3e9fc4a8 Cover loopback override in AI fetch guard 2026-02-04 13:53:29 +00:00
rcourtman
2d29b3dcd7 Unify Proxmox discovery and integrate PMG Patrol
- Unified Proxmox VE discovery by redirecting Node requests to linked Host Agents.
- Added smart deduplication and legacy fallback for Proxmox discovery results.
- Integrated Proxmox Mail Gateway (PMG) into AI Patrol system.
- Added comprehensive tests for discovery redirection and deduplication.
2026-02-04 13:52:36 +00:00
rcourtman
634594a168 Unify Proxmox discovery results
- Redirect PVE node lookups to linked Host Agent ID when available.
- Implement deduplication in discovery lists to prefer Host Agent data over redundant Node entries.
- Add fallback mechanism to the original Node ID for discovery retrieval, ensuring compatibility with legacy data.
- Update data adapters and add comprehensive unit tests for redirection and deduplication logic.
2026-02-04 13:46:56 +00:00
rcourtman
a6f2a674eb fix: resolve test failures blocking release
- KnowledgeStore: use atomic write (temp+rename) to prevent file
  corruption from concurrent async saves
- Change password tests: add auth headers since endpoint now requires
  authentication
- ClearSession test: expect 2 cookies (pulse_session + pulse_csrf)
  matching updated clearSession behavior
- API token test: update to match current behavior where query-string
  tokens are accepted (needed for WebSocket connections)
- Host agent config: allow ScopeHostManage to resolve any host, not
  just token-bound hosts
2026-02-03 23:53:54 +00:00
rcourtman
2ebe65bbc5 security: add scope checks to AI Patrol and agent profile endpoints
- AI Patrol mutation endpoints (acknowledge, dismiss, suppress, snooze, resolve,
  findings/note, suppressions/*) now require ai:execute scope to prevent
  low-privilege tokens from blinding patrol by hiding/suppressing findings

- Agent profile admin endpoints (/api/admin/profiles/*) now require
  settings:write scope to prevent low-privilege tokens from modifying
  fleet-wide agent behavior
2026-02-03 19:29:56 +00:00
rcourtman
69e3286e5e security: fix AI OAuth scope bypass, approval replay attacks, and approval endpoint scope gating
- OAuth endpoints now require settings:write scope (not just admin)
- Approval endpoints now require ai:execute scope
- Added CommandHash to approvals for replay protection
- Approvals are now single-use (consumed on first use)
- consumeApprovalWithValidation validates command matches approval
2026-02-03 19:15:15 +00:00
rcourtman
60f9e6f07f security: fix multiple vulnerabilities (SAML, SSRF, Auth)
Addressed several security findings:
- SAML: Sanitized RelayState to prevent open redirects
- SAML: Fixed logout to properly invalidate server-side sessions
- Auth: Added auth, rate limiting, and logout checks to password change endpoint
- AI: Added admin/scope gating (ai:execute) for command execution
- AI: Blocked private IP ranges in fetch_url to prevent SSRF
- Config: Enforced settings:read/write scopes for export/import
- Agent: Added agent:exec scope requirement for WebSockets
2026-02-03 18:39:15 +00:00
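The fetch_url SSRF guard can be sketched as below (see also the loopback-override coverage added later). This is a minimal illustration, not Pulse's actual guard; a real implementation must also resolve hostnames and re-check every resulting IP, since DNS can point anywhere.

```go
package main

import (
	"fmt"
	"net"
)

// isBlockedFetchAddr sketches the policy: refuse private, link-local, and
// unspecified addresses, and loopback unless explicitly allowed (e.g. for
// self-hosted endpoints the operator has opted into).
func isBlockedFetchAddr(addr string, allowLoopback bool) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return true // not a literal IP: block until resolved and re-checked
	}
	if ip.IsLoopback() {
		return !allowLoopback
	}
	return ip.IsPrivate() || ip.IsLinkLocalUnicast() || ip.IsUnspecified()
}

func main() {
	fmt.Println(isBlockedFetchAddr("10.0.0.5", false)) // true: RFC 1918
	fmt.Println(isBlockedFetchAddr("8.8.8.8", false))  // false: public
	fmt.Println(isBlockedFetchAddr("127.0.0.1", true)) // false: loopback override
}
```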
rcourtman
f8bb14977d fix(discovery): include IPAddresses in state adapter for URL suggestion
The discovery state adapter was not copying IPAddresses from the models
when converting VM/Container state. This caused getResourceExternalIP()
to return empty strings, preventing URL suggestion from working.
2026-02-03 17:05:01 +00:00
rcourtman
935326ebb7 fix(api/ai): resolve critical auth, agent download, and lifecycle issues
- Fix API-only mode to accept Bearer tokens and query params
- Fix data race in API token validation using fine-grained locking
- Fix unified agent download serving wrong binary for invalid arch
- Fix AI infra discovery running when AI disabled and missing stop mechanism
2026-02-03 16:35:12 +00:00
rcourtman
3d8374e527 Fix AI investigation context and UI settings
- Ensure correct org context is used for AI chat service resolution
- Fix AI adapter tests
- Update AI Intelligence page UI for advanced settings
2026-02-03 16:24:56 +00:00
rcourtman
8720708e70 fix: address AI patrol concurrency and streaming issues
- HIGH: Create per-request AgenticLoop instead of sharing one across
  concurrent sessions. This prevents race conditions where ExecuteStream
  calls would overwrite each other's FSM, knowledge accumulator, and
  other session-specific state.

- MEDIUM: TriggerManager.GetStatus now recomputes adaptive interval after
  pruning old events. Previously, currentInterval could remain stuck in
  busy/quiet mode after events aged out of the window.

- MEDIUM: Patrol stream phases are now broadcast to subscribers. Fixed
  setStreamPhase() to emit phase events and SubscribeToStream() to send
  phase events to late joiners. UI was stuck on 'Starting patrol...'
  because phase events were never emitted.

- LOW: Fixed TriggerStatus.CurrentInterval JSON serialization. Changed
  from time.Duration (serializes as nanoseconds) to int64 milliseconds
  to match the 'current_interval_ms' tag.
2026-02-03 14:39:00 +00:00
rcourtman
86a7c2283c Revert "Detect incompatible models that don't support function calling"
This reverts commit 11a72ee263.
2026-02-03 13:36:30 +00:00
rcourtman
c6318a8484 Revert "Simplify incompatible model error message"
This reverts commit c58fe81700.
2026-02-03 13:36:30 +00:00
rcourtman
c58fe81700 Simplify incompatible model error message 2026-02-03 13:30:54 +00:00
rcourtman
11a72ee263 Detect incompatible models that don't support function calling
When local LLM servers (LM Studio, llama.cpp) receive tool definitions
but the model doesn't support function calling, they output internal
control tokens like <|channel|>, <|im_start|>, etc. instead of proper
responses.

This change detects these control tokens during streaming and returns
a clear error message explaining that the model doesn't support function
calling and recommending compatible models (Llama 3.1+, Mistral, Qwen).

This is better than the previous approach of offering a "disable tools"
option, which would have crippled Pulse Assistant/Patrol functionality.
Users need to use compatible models for the AI features to work properly.

Related to #1154
2026-02-03 13:28:37 +00:00
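Although this change was later reverted, the detection it describes can be sketched as follows. The token list is an assumption about what LM Studio and llama.cpp emit; the real check ran against the accumulated stream.

```go
package main

import (
	"fmt"
	"strings"
)

// controlTokens are markers local servers emit when a model that lacks
// function calling receives tool definitions (illustrative list).
var controlTokens = []string{"<|channel|>", "<|im_start|>", "<|constrain|>", "<|message|>"}

// hasControlToken sketches the streaming check: scan accumulated output for
// any control token so the service can return a clear "model doesn't support
// function calling" error instead of streaming garbage to the user.
func hasControlToken(buf string) bool {
	for _, t := range controlTokens {
		if strings.Contains(buf, t) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasControlToken("<|channel|>analysis<|message|>hi")) // true
	fmt.Println(hasControlToken("The disk looks healthy."))          // false
}
```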
rcourtman
a55ae78715 Revert "Add config option to disable tools for OpenAI-compatible endpoints"
This reverts commit 81229f206f.
2026-02-03 13:26:26 +00:00
rcourtman
81229f206f Add config option to disable tools for OpenAI-compatible endpoints
Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible
APIs but don't support function calling. When tools are sent to these
models, they output raw control tokens instead of proper responses.

This change adds:
- openai_tools_disabled config field in AIConfig
- AreToolsDisabledForProvider() method to check at runtime
- API support to get/set the new setting
- Tests for the new functionality

When enabled and using a custom OpenAI base URL, the chat service will
skip sending tools to the model, allowing basic chat functionality to
work even with models that don't support function calling.

Fixes #1154
2026-02-03 13:21:44 +00:00
rcourtman
e3556455c6 Revert "Sanitize LLM control tokens from OpenAI-compatible responses"
This reverts commit e5eb15918e.
2026-02-03 13:14:33 +00:00
rcourtman
e5eb15918e Sanitize LLM control tokens from OpenAI-compatible responses
Some local models (llama.cpp, LM Studio) output internal control tokens
like <|channel|>, <|constrain|>, <|message|> instead of using proper
function calling. These tokens leak into the UI creating a poor UX.

This adds sanitization to strip these control tokens from both streaming
and non-streaming responses before they reach the user.
2026-02-03 13:12:17 +00:00
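This change was reverted shortly after, but the sanitization it describes can be sketched as a single regex pass over each response chunk. The pattern is an assumption about which `<|...|>` tokens local servers emit; the reverted code may have matched differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// controlToken matches <|...|> markers such as <|channel|>, <|constrain|>,
// and <|message|> that some local models leak into their output.
var controlToken = regexp.MustCompile(`<\|[A-Za-z0-9_]+\|>`)

// sanitize strips control tokens before the text reaches the UI, for both
// streaming and non-streaming responses.
func sanitize(s string) string {
	return controlToken.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(sanitize("<|message|>All nodes are healthy.")) // All nodes are healthy.
}
```

A boundary-aware version would also have to buffer partial tokens split across stream chunks, which is one reason a simple strip was arguably the weaker approach compared with the error-on-detect change that followed.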