Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-04-30 12:30:17 +00:00

Author	SHA1	Message	Date
rcourtman	1de1392c9b	Preserve provider metadata in AI model lists (#1320)	2026-03-25 13:08:15 +00:00
rcourtman	5f372e257f	Respect patrol model provider in quick analysis	2026-03-25 13:01:43 +00:00
rcourtman	d852964696	fix(ai): record patrol and QuickAnalysis token usage in cost store for budget enforcement Patrol runs, evaluation passes, and QuickAnalysis calls were consuming LLM tokens without recording them in the cost store. This made the cost_budget_usd_30d budget setting ineffective since enforceBudget() never saw patrol spend. - Add RecordUsage() to ai.Service for thread-safe cost recording - Add recordPatrolUsage() helper to PatrolService, called on both success and error paths for main patrol and evaluation pass - Record QuickAnalysis token usage in cost store - Return partial PatrolResponse (with token counts) on error instead of nil, so callers can always record consumed tokens - Propagate partial response through chat_service_adapter on error	2026-03-01 19:19:47 +00:00
rcourtman	24f5b1cb31	fix(patrol): cap per-run tokens and reset patrol session history	2026-02-24 11:29:47 +00:00
rcourtman	60f9e6f07f	security: fix multiple vulnerabilities (SAML, SSRF, Auth) Addressed several security findings: - SAML: Sanitized RelayState to prevent open redirects - SAML: Fixed logout to properly invalidate server-side sessions - Auth: Added auth, rate limiting, and logout checks to password change endpoint - AI: Added admin/scope gating (ai:execute) for command execution - AI: Blocked private IP ranges in fetch_url to prevent SSRF - Config: Enforced settings:read/write scopes for export/import - Agent: Added agent:exec scope requirement for WebSockets	2026-02-03 18:39:15 +00:00
rcourtman	935326ebb7	fix(api/ai): resolve critical auth, agent download, and lifecycle issues - Fix API-only mode to accept Bearer tokens and query params - Fix data race in API token validation using fine-grained locking - Fix unified agent download serving wrong binary for invalid arch - Fix AI infra discovery running when AI disabled and missing stop mechanism	2026-02-03 16:35:12 +00:00
rcourtman	0c802e7083	fix(patrol): improve service lifecycle, graceful shutdown, and concurrency	2026-02-01 16:27:25 +00:00
rcourtman	95a0d7a6bd	feat(backend): implement AI Patrol, Investigation, and system-wide refactors	2026-01-30 19:02:14 +00:00
rcourtman	03b5586ac8	refactor(ai): update patrol and service to use chat service adapter - Update patrol.go to use chat service for AI execution - Update service.go with chat service provider integration - Add patrol streaming endpoint to router	2026-01-28 21:24:34 +00:00
rcourtman	e194e17159	Update AI core services and adapters AI module improvements: Patrol System: - Better trigger handling - Improved history persistence - Enhanced coverage testing Knowledge Store: - Extended functionality - Better test coverage Adapters: - Discovery adapter updates - Investigation adapter improvements Unified Bridge: - Setup improvements - Better test coverage Alert handling and service updates.	2026-01-28 16:51:53 +00:00
rcourtman	7f7edfceb4	test: expand backend coverage	2026-01-25 21:08:44 +00:00
rcourtman	ff2841a5c6	Fix patrol scoping and config propagation	2026-01-24 23:07:55 +00:00
rcourtman	27f1a11acb	feat: add AI Intelligence system with investigation and forecasting Major new AI capabilities for infrastructure monitoring: Investigation System: - Autonomous finding investigation with configurable autonomy levels - Investigation orchestrator with rate limiting and guardrails - Safety checks for read-only mode enforcement - Chat-based investigation with approval workflows Forecasting & Remediation: - Trend forecasting for resource capacity planning - Remediation engine for generating fix proposals - Circuit breaker for AI operation protection Unified Findings: - Unified store bridging alerts and AI findings - Correlation and root cause analysis - Incident coordinator with metrics recording New Frontend: - AI Intelligence page with patrol controls - Investigation drawer for finding details - Unified findings panel with actions Supporting Infrastructure: - Learning store for user preference tracking - Proxmox event ingestion and correlation - Enhanced patrol with investigation triggers	2026-01-24 22:41:43 +00:00
rcourtman	37e7aebc98	feat: enhance AI patrol with streaming and improved findings - Add streaming support to patrol operations - Improve finding detection and reporting - Enhance agentic chat capabilities - Add alert integration improvements	2026-01-22 22:30:35 +00:00
rcourtman	4fe3d7df77	feat(ai): Add streaming support and notable models to AI providers - Add ChatStream method to all providers (Anthropic, OpenAI, Gemini, Ollama) for real-time streaming of AI responses with tool call support - Add StreamingProvider interface with StreamEvent types for content, thinking, tool_start, tool_end, done, and error events - Add notable models feature that fetches model metadata from models.dev to identify recent/recommended models (within last 3 months) - Add Notable field to ModelInfo struct to flag "latest and greatest" models - Add SupportsThinking method to check for extended reasoning capability The streaming support enables real-time AI chat responses instead of waiting for complete responses. The notable models feature helps users identify which models are current and recommended.	2026-01-19 19:10:58 +00:00
rcourtman	c26f0e6e6c	feat(ai): improve OpenCode integration and control level handling OpenCode client improvements: - Fix session listing with proper timestamp parsing - Model selection with provider inference (anthropic, google, etc) - Add session management APIs (summarize, diff, fork, revert) - Generated session titles from first user message Control level refactoring: - IsAutonomous() helper for cleaner checks - Legacy autonomous_mode maps to control_level for backwards compat - Simplified system instructions (rely on tool descriptions instead) Includes tests for model provider inference.	2026-01-17 14:43:28 +00:00
rcourtman	035436ad6e	fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking The agent was crashing with 'fatal error: concurrent map writes' when handleCheckUpdatesCommand spawned a goroutine that called collectOnce concurrently with the main collection loop. Both code paths access a.prevContainerCPU without synchronization. Added a.cpuMu mutex to protect all accesses to prevContainerCPU in: - pruneStaleCPUSamples() - collectContainer() delete operation - calculateContainerCPUPercent() Related to #1063	2026-01-15 21:10:55 +00:00
rcourtman	9cd53814a3	feat(alerts): add per-volume disk thresholds for host agents Allow users to set custom disk usage thresholds per mounted filesystem on host agents, rather than applying a single threshold to all volumes. This addresses NAS/NVR use cases where some volumes (e.g., NVR storage) intentionally run at 99% while others need strict monitoring. Backend: - Check for disk-specific overrides before using HostDefaults.Disk - Override key format: host:<hostId>/disk:<mountpoint> - Support both custom thresholds and disable per-disk Frontend: - Add 'hostDisk' resource type - Add "Host Disks" collapsible section in Thresholds → Hosts tab - Group disks by host for easier navigation Closes #1103	2026-01-13 23:38:20 +00:00
rcourtman	b177812fd3	revert: remove accidentally committed WIP OpenCode changes Reverts unintended changes from `4e064aa0` that broke the frontend build. The workflow fix for cmd/pulse package build remains intact.	2026-01-13 09:15:42 +00:00
rcourtman	4e064aa0cc	fix: build entire cmd/pulse package, not just main.go The static binary build was only compiling main.go, missing bootstrap.go and config.go which define osExit, bootstrapTokenCmd, and configCmd.	2026-01-13 09:06:21 +00:00
rcourtman	b2a6cd0fa3	fix(agent): add FreeBSD platform support to agent download and UI (#1051) - Add freebsd-amd64 and freebsd-arm64 to normalizeUnifiedAgentArch() so the download endpoint serves FreeBSD binaries when requested - Add FreeBSD/pfSense/OPNsense platform option to agent setup UI with note about bash installation requirement - Add FreeBSD test cases to unified_agent_test.go Fixes installation on pfSense/OPNsense where users were getting 404 errors because the backend didn't recognize the freebsd-amd64 arch parameter from install.sh.	2026-01-11 23:51:12 +00:00
rcourtman	ed78509f92	Fix flaky tests and improve coverage across alerts, api, and config packages - Fix deadlock and race conditions in internal/alerts - Add comprehensive error path tests for internal/config - Fix 401 handling in internal/api - Fix Docker Swarm task filtering test logic	2026-01-03 18:36:17 +00:00
rcourtman	9e339957c6	fix: Update runtime config when toggling Docker update actions setting The DisableDockerUpdateActions setting was being saved to disk but not updated in h.config, causing the UI toggle to appear to revert on page refresh since the API returned the stale runtime value. Related to #1023	2026-01-03 11:14:17 +00:00
rcourtman	31c704c7a7	refactor: fix lint issues in internal/ai package - Remove redundant nil checks before len() calls - Mark unused parameters with underscore - Convert if/else chains to switch statements for cleaner code - Add test assertions to resolve unused write warnings in patrol_test.go	2026-01-02 19:53:01 +00:00
rcourtman	180cddb55b	refactor: use license package constants for Pro features in AI service	2026-01-02 14:11:56 +00:00
rcourtman	c2de1b256b	fix(pro): add cleanup goroutine for alert analyzer memory leak - Add Start/Stop lifecycle methods to AlertTriggeredAnalyzer - Periodic cleanup of lastAnalyzed map every 30 minutes - Prevents memory growth from stale cooldown entries - Document that ai package feature constants are aliases of license constants - Call Start() in StartPatrol and Stop() in StopPatrol - Add tests for Start/Stop lifecycle	2026-01-02 13:12:24 +00:00
rcourtman	ae1c39960f	fix: Remove duplicate AI chat response streaming (issue #947) Content was being streamed twice: 1. During each iteration of the tool loop (intended for intermediate feedback) 2. Again after the loop ended with finalContent (redundant) This caused duplicate responses when using Ollama and other providers.	2025-12-29 09:18:05 +00:00
rcourtman	3040800e7b	fix: AI Patrol now respects exact user-configured thresholds BREAKING CHANGE: AI Patrol now uses EXACT alert thresholds by default instead of warning 5-15% before the threshold. Changes: - Default behavior: Patrol warns at your configured threshold (e.g., 96% = warns at 96%) - New setting: 'use_proactive_thresholds' enables the old early-warning behavior - API: Added use_proactive_thresholds to GET/PUT /api/settings/ai - Backend: Added SetProactiveMode/GetProactiveMode to PatrolService - Backend: Added GetThresholds to PatrolService for UI display - Tests: Updated and added tests for both exact and proactive modes - Also fixed unused imports in dockeragent/agent.go When proactive mode is disabled (default): - Watch: threshold - 5% (slight buffer) - Warning: exact threshold When proactive mode is enabled: - Watch: threshold - 15% - Warning: threshold - 5% Related to #951	2025-12-29 08:40:34 +00:00
rcourtman	fe3b4ed5b6	fix: require Pro license for auto-fix and autonomous mode - patrol.go: Auto-fix now requires both config flag AND ai_autofix license - service.go: IsAutonomous() checks for ai_autofix license before enabling - ai_handlers.go: API returns 402 if enabling auto-fix/autonomous without license	2025-12-25 21:26:46 +00:00
rcourtman	d74eae3a3e	fix(demo): support patrol analysis mock Adds structured XML finding responses to the demo mock AI service. This prevents the background patrol service from failing with 'Analysis failed' when running in demo mode without a real LLM provider.	2025-12-23 18:48:50 +00:00
rcourtman	03d7147615	fix(ai): force enabled state in demo mode Ensures the AI settings endpoint reports enabled=true and configured=true when running in demo mode (PULSE_MOCK_MODE=true), even if no provider is configured. This unlocks the frontend UI to allow interaction with the mock AI assistant.	2025-12-23 18:39:34 +00:00
rcourtman	ead3e9ec7e	feat(ai): add mock chat response for demo mode Allows the AI Assistant to provide realistic canned responses on the live demo server without needing a real API key. Handled automatically when PULSE_MOCK_MODE=true and no provider is configured.	2025-12-23 18:34:38 +00:00
rcourtman	b75728922c	feat: add demo AI findings for mock mode When MOCK_ENABLED=true, Pulse now injects realistic AI patrol findings to showcase the AI features without requiring actual LLM API calls. This enables the demo instance to demonstrate: - Critical/warning/info findings with realistic content - Patrol run history - Actionable recommendations Also includes refinements to dismissal logic from earlier work: - Only 'not_an_issue' creates permanent suppression - 'expected_behavior' and 'will_fix_later' just acknowledge	2025-12-22 17:16:26 +00:00
rcourtman	28ac86c8ab	fix: reduce WebSocket reconnection log noise in host agent Addresses #866 - agents were logging 'WebSocket connection failed' warnings even during normal reconnection scenarios (server restart, network blip, etc). Changes: - Normal close errors (1000, 1001, connection reset) now log at Debug level - Only log Warning after 3+ consecutive failures - Changed 'Connecting to Pulse' from Info to Debug to reduce noise - Successful connections still log at Info level The WebSocket is only used for AI command execution, not metrics, so transient disconnections don't affect monitoring functionality.	2025-12-22 14:11:23 +00:00
rcourtman	4e893117cd	fix: correct patrol interval logging The log was showing QuickCheckInterval (deprecated, always 0) instead of the actual Interval field. This caused confusing 'interval: 0' logs.	2025-12-21 21:52:57 +00:00
rcourtman	d9f1f7accd	feat(ai): add real-time anomaly detection endpoint Add /api/ai/intelligence/anomalies endpoint that compares live metrics against learned baselines to surface deviations - all deterministic (no LLM required). Backend: - Add AnomalyReport struct with severity classification - Add CheckResourceAnomalies method to baseline store - Add HandleGetAnomalies API handler - Add GetStateProvider getter to AI service Frontend: - Add AnomalyReport and AnomaliesResponse types - Add getAnomalies API function - Add AnomalySeverity type This is the first step toward surfacing deterministic intelligence directly in the UI without requiring LLM interaction.	2025-12-21 10:52:54 +00:00
rcourtman	96573f4aca	feat: enhance AI baseline context visibility and incident timeline improvements Backend: - Enhanced buildEnrichedResourceContext to ALWAYS show learned baselines with status indicators (normal/elevated/anomaly) instead of only when anomalous - This makes Pulse Pro's 'moat' visible - users can see the AI understands their infrastructure's normal behavior patterns - Added baseline import to service.go Frontend (user changes): - Added incident event type filtering with toggle buttons - Added resource incident panel to view all incidents for a resource - Added timeline expand/collapse functionality in alert history - Added incident note saving with proper incidentId tracking - Added startedAt parameter for proper incident timeline loading	2025-12-21 00:14:20 +00:00
rcourtman	5173fc3162	fix: normalize guest ID fallbacks to canonical instance:node:vmid format Multiple frontend components were using - as a fallback when guest.id was falsy. This format drops the node component, which is critical for clustered setups where the same VMID can exist on different nodes. Changes: - GuestDrawer.tsx: Updated guestId() and handleAskAI() to use canonical format - GuestRow.tsx: Updated buildGuestId() to use canonical format - Dashboard.tsx: Updated handleGuestRowClick() and guest rendering loop, also fixed legacy metadata fallback to use consistent keying - ThresholdsTable.tsx: Updated guestsGroupedByNode() to use canonical format Backend changes: - Removed temporary debug logging added during investigation - Added alert history section to AI buildEnrichedResourceContext() function The backend generates VM/Container IDs in instance:node:vmid format (e.g., delly:delly:101) via makeGuestID(). This format is now consistently used across all frontend fallbacks to prevent AI context, metadata, overrides, and metrics from colliding or desyncing in clustered environments.	2025-12-20 22:11:35 +00:00
rcourtman	ae522c9a2b	fix: Allow all threshold types (Storage, Temperature, Host Agent) to be set to 0 to disable alerting - Fixed normalizeStorageDefaults to allow Trigger=0 - Fixed normalizeNodeDefaults (Temperature) to allow Trigger=0 - Added comprehensive tests for all threshold normalization patterns - Updated existing test that expected old behavior Related to #864	2025-12-20 20:42:23 +00:00
rcourtman	781442cdd0	test: Add comprehensive tests for Host Agent threshold normalization with Trigger=0. Related to #864	2025-12-20 20:32:59 +00:00
rcourtman	db5e79bb37	fix: Allow Host Agent thresholds to be set to 0 to disable alerting. Related to #864	2025-12-20 20:25:20 +00:00
rcourtman	7f05d87809	fix: add missing HandleLicenseFeatures method and related changes - Add HandleLicenseFeatures handler that was missing from license_handlers.go - Add /api/license/features route to router - Update AI service and metadata provider - Update frontend license API and components - Fix CI build failure caused by tests referencing unimplemented method	2025-12-19 22:59:52 +00:00
rcourtman	4d1138793d	feat(license): add initial license implementation structure to fix build	2025-12-19 17:01:57 +00:00
rcourtman	0d6aaff253	fix: AI Patrol frequency not obeying settings Fixes #858 The patrol interval setting was not being properly applied due to: 1. ReconfigurePatrol() was setting the deprecated QuickCheckInterval field instead of the preferred Interval field 2. SetConfig() was comparing raw field values instead of using GetInterval() to compare effective intervals, causing change detection to fail 3. The API response was missing interval_ms, preventing the frontend from displaying the correct interval Changes: - Update StartPatrol() and ReconfigurePatrol() to use the Interval field - Fix SetConfig() to use GetInterval() for interval comparison - Add IntervalMs to PatrolStatusResponse and include it in the API response	2025-12-18 21:33:50 +00:00
rcourtman	c91307be94	fix: guest URL icon now appears/disappears immediately after AI sets/removes it The issue was a SolidJS reactivity problem in the Dashboard component. When guestMetadata signal was accessed inside a For loop callback and assigned to a plain variable, SolidJS lost reactive tracking. Changed from: const metadata = guestMetadata()[guestId] || ... customUrl={metadata?.customUrl} To: const getMetadata = () => guestMetadata()[guestId] || ... customUrl={getMetadata()?.customUrl} This ensures SolidJS properly tracks the signal dependency when the getter function is called directly in JSX props.	2025-12-18 14:42:47 +00:00
rcourtman	54fc259221	fix(ai): improve AI settings UX with validation and smart fallbacks Backend: - Add smart provider fallback when selected model's provider isn't configured - Automatically switch to a model from a configured provider instead of failing - Log warning when fallback occurs for visibility Frontend (AISettings.tsx): - Add helper functions to check if model's provider is configured - Group model dropdown: configured providers first, unconfigured marked with ⚠️ - Add inline warning when selecting model from unconfigured provider - Validate on save that model's provider is configured (or being added) - Warn before clearing last configured provider (would disable AI) - Warn before clearing provider that current model uses - Add patrol interval validation (must be 0 or >= 10 minutes) - Show red border + inline error for invalid patrol intervals 1-9 - Update patrol interval hint: '(0=off, 10+ to enable)' These changes prevent confusing '500 Internal Server Error' and 'AI is not enabled or configured' errors when model/provider mismatch.	2025-12-17 18:30:19 +00:00
rcourtman	7acff2215c	style: remove emojis from AI context formatting and prompts Replaced emoji indicators with text equivalents for better cross-platform compatibility and cleaner LLM prompts.	2025-12-13 21:26:49 +00:00
rcourtman	26802cd7bf	feat(backend): Implement remaining TODOs 1. resources/store.go: Implement sorting in Query.Execute() - Added sortResources function with support for common fields - Supports: name, type, status, cpu, memory, disk, last_seen - Both ascending and descending order supported 2. ai/service.go: Implement hasAgentForTarget properly - Now maps target to specific agent based on hostname/node - Uses ResourceProvider lookup for container→host mapping - Supports cluster peer routing for Proxmox clusters - Properly handles single-agent vs multi-agent scenarios	2025-12-13 13:21:23 +00:00
rcourtman	23a27b5b93	fix: correct AI tool description for guest resource ID format The set_resource_url tool had an incorrect example ID format ('pve1-delly-101') which caused the AI to save URLs with wrong IDs that didn't match the actual guest IDs used by Pulse ('instance-VMID' format like 'delly-150'). This fix updates the tool description to clearly document the correct format, so URLs saved by the AI will now properly appear in the dashboard.	2025-12-12 21:28:34 +00:00
rcourtman	8b077f69ce	feat: AI security and policy improvements for 5.0 - Add DOMPurify sanitization for AI chat markdown rendering (XSS fix) - Configure DOMPurify to add target=_blank and rel=noopener to links - Update system prompt to align with command approval policy - Clarify safe vs destructive commands in prompt - Improve patrol auto-fix mode guidance with safe operation list - Add verification requirements for auto-fix actions - Update observe-only mode to be clearer about read-only restrictions	2025-12-12 17:38:55 +00:00
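
Commit 54fc259221 describes the patrol interval rule as "must be 0 or >= 10 minutes", with values 1-9 rejected. The rule can be sketched in Go as below. This is only an illustration: the commit places the check in the frontend (AISettings.tsx), and the names `validatePatrolIntervalMinutes`, `minPatrolIntervalMinutes`, and `errPatrolIntervalTooShort` are hypothetical, not identifiers from the Pulse codebase. Treating negative values as invalid is an assumption that follows from the "0 or >= 10" wording.

```go
package main

import (
	"errors"
	"fmt"
)

// minPatrolIntervalMinutes is the smallest non-zero interval the commit
// message says is accepted. The constant name is hypothetical.
const minPatrolIntervalMinutes = 10

// errPatrolIntervalTooShort is a hypothetical error value for rejected input.
var errPatrolIntervalTooShort = errors.New("patrol interval must be 0 (off) or at least 10 minutes")

// validatePatrolIntervalMinutes accepts 0 (patrol disabled) or any value of
// at least 10 minutes, and rejects everything else.
func validatePatrolIntervalMinutes(minutes int) error {
	if minutes == 0 || minutes >= minPatrolIntervalMinutes {
		return nil
	}
	return errPatrolIntervalTooShort
}

func main() {
	for _, m := range []int{0, 5, 10, 30} {
		fmt.Printf("%d minutes -> error: %v\n", m, validatePatrolIntervalMinutes(m))
	}
}
```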

75 commits
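
The threshold arithmetic stated in commit 3040800e7b (exact mode: watch at threshold - 5%, warning at the exact threshold; proactive mode: watch at threshold - 15%, warning at threshold - 5%) can be sketched in Go as follows. This is a minimal illustration of the offsets quoted in the commit message, not the PatrolService implementation. The type `patrolThresholds` and the function `derivePatrolThresholds` are hypothetical names, and the sketch assumes the offsets are plain percentage points subtracted from the configured alert threshold.

```go
package main

import "fmt"

// patrolThresholds holds the two levels at which a patrol would flag a
// metric. The type is hypothetical and exists only for this sketch.
type patrolThresholds struct {
	Watch   float64
	Warning float64
}

// derivePatrolThresholds applies the offsets listed in the commit message.
// Assumption: offsets are percentage points subtracted from the threshold.
func derivePatrolThresholds(alertThreshold float64, proactive bool) patrolThresholds {
	// Exact mode (default): watch 5 points early, warn at the threshold.
	watchOffset, warningOffset := 5.0, 0.0
	if proactive {
		// Proactive mode: the older early-warning behavior.
		watchOffset, warningOffset = 15.0, 5.0
	}
	return patrolThresholds{
		Watch:   alertThreshold - watchOffset,
		Warning: alertThreshold - warningOffset,
	}
}

func main() {
	fmt.Printf("exact:     %+v\n", derivePatrolThresholds(96, false))
	fmt.Printf("proactive: %+v\n", derivePatrolThresholds(96, true))
}
```

With a configured threshold of 96%, exact mode gives a watch level of 91% and a warning level of 96%, while proactive mode gives 81% and 91%.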