Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-05-22 03:02:35 +00:00

Author	SHA1	Message	Date
rcourtman	2ebe65bbc5	security: add scope checks to AI Patrol and agent profile endpoints - AI Patrol mutation endpoints (acknowledge, dismiss, suppress, snooze, resolve, findings/note, suppressions/) now require ai:execute scope to prevent low-privilege tokens from blinding patrol by hiding/suppressing findings - Agent profile admin endpoints (/api/admin/profiles/) now require settings:write scope to prevent low-privilege tokens from modifying fleet-wide agent behavior	2026-02-03 19:29:56 +00:00
rcourtman	1733bea15c	feat(ui): show backup permission warnings on Backups page When PVE backup polling detects permission errors (403/401/permission denied), track them per instance and surface them via the scheduler health endpoint. The Backups page now fetches instance warnings and displays a banner when backup permission issues are detected, telling users exactly how to fix the problem. Related to #1139	2026-02-03 19:27:10 +00:00
rcourtman	69e3286e5e	security: fix AI OAuth scope bypass, approval replay attacks, and approval endpoint scope gating - OAuth endpoints now require settings:write scope (not just admin) - Approval endpoints now require ai:execute scope - Added CommandHash to approvals for replay protection - Approvals are now single-use (consumed on first use) - consumeApprovalWithValidation validates command matches approval	2026-02-03 19:15:15 +00:00
rcourtman	316a56299c	fix(agent): grant PVEDatastoreAdmin for backup visibility The unified agent's Proxmox setup was missing the PVEDatastoreAdmin permission on /storage, causing local PVE backups to not appear in Pulse's backup overview for users who set up nodes via the agent. The UI-generated setup script already included this permission, but the agent path (--enable-proxmox) did not, creating an inconsistency. Related to #1139	2026-02-03 19:11:25 +00:00
rcourtman	43c696896f	security: fix high severity authz issues (AI chat, patrol autonomy, discovery, host config)	2026-02-03 19:00:56 +00:00
rcourtman	4fdc0cae64	feat(reporting): enrich metric reports with detailed resource info	2026-02-03 18:51:27 +00:00
rcourtman	225da6eb39	security: strengthen public URL capture to enforce scope and admin checks	2026-02-03 18:49:42 +00:00
rcourtman	83382ee251	security: enforce scope checks on admin diagnostics endpoint	2026-02-03 18:44:55 +00:00
rcourtman	8f92273e33	security: enforce scope checks for AI approvals and config management	2026-02-03 18:40:31 +00:00
rcourtman	60f9e6f07f	security: fix multiple vulnerabilities (SAML, SSRF, Auth) Addressed several security findings: - SAML: Sanitized RelayState to prevent open redirects - SAML: Fixed logout to properly invalidate server-side sessions - Auth: Added auth, rate limiting, and logout checks to password change endpoint - AI: Added admin/scope gating (ai:execute) for command execution - AI: Blocked private IP ranges in fetch_url to prevent SSRF - Config: Enforced settings:read/write scopes for export/import - Agent: Added agent:exec scope requirement for WebSockets	2026-02-03 18:39:15 +00:00
rcourtman	442d29e9b9	feat(reporting): enhance PDF reports with Executive Summary and actionable insights - Add professional cover page with branding and report period - Add Executive Summary page with health status banner (HEALTHY/WARNING/CRITICAL) - Add Quick Stats section with color-coded metrics and trend indicators - Add Key Observations with automated analysis of CPU, memory, disk, and disk wear - Add Recommended Actions section with prioritized, actionable items - Add Resource Details page with hardware info, storage pools, physical disks - Add color-coded tables for alerts, storage, and disk health - Add performance charts with area fills and proper scaling - Improve overall visual design with consistent color scheme - Fix SAML session invalidation to use correct SessionStore method	2026-02-03 18:17:31 +00:00
rcourtman	2e4e7b06a8	fix(tests): update reporting handlers tests to match new signature NewReportingHandlers now requires a MultiTenantMonitor parameter. Pass nil since the tests don't need the monitor functionality.	2026-02-03 17:48:44 +00:00
rcourtman	d716bbfdeb	fix(security): add proper authorization to sensitive endpoints - /api/agent-install-command: require admin + settings:write scope Previously only RequireAuth, allowing any authenticated user to mint high-privilege API tokens (host-agent:manage) - /api/system/ssh-config: require settings:write scope Previously any authenticated token could modify ~/.ssh/config - /api/system/verify-temperature-ssh: require settings:write scope Previously any authenticated token could trigger SSH connection attempts to arbitrary nodes (network scanning risk) - /api/diagnostics: require admin privileges Previously exposed API token metadata (IDs, hints, usage mapping) to any authenticated token, enabling enumeration attacks	2026-02-03 17:47:40 +00:00
rcourtman	12a5a98117	fix: SSE race conditions, alert user spoofing, and security status oracle SSE Broadcaster: - Add per-client mutex to prevent concurrent writes to ResponseWriter - Fix data race in cleanupLoop reading LastActive without synchronization - Update LastActive in SendHeartbeat so clients aren't incorrectly pruned after 5 minutes of idle heartbeat traffic Alert Acknowledgements: - Extract authenticated user from X-Authenticated-User header instead of hardcoding 'admin' or trusting request body's User field - Prevents audit log spoofing and ensures accurate user attribution Security Status Endpoint: - Remove ?token= query param validation from public /api/security/status - Prevents endpoint from acting as a token validity oracle for attackers - Authentication still works via session cookies and X-API-Token header	2026-02-03 17:40:58 +00:00
rcourtman	beae4c860c	fix: address 6 security and reliability issues Security fixes: - Auto-register now requires settings:write scope for API tokens - X-Forwarded-For in auto-register only trusted from verified proxies - Public URL capture requires authentication (no loopback bypass) - Lockout reset now uses RequireAdmin for session users Reliability fixes: - Docker stop command expiration clears PendingUninstall flag - Cancelled notifications get completed_at set and are cleaned up	2026-02-03 17:32:44 +00:00
rcourtman	b2639ed5a5	Fix security vulnerabilities and critical bugs - Fix WebSocket CORS bypass by strictly verifying origin - Fix OIDC refresh token persistence by encrypting at rest - Fix grouped webhook data mutation by cloning alerts - Fix host agent uninstall authorization and config fetch logic - Fix notification queue recovery for stuck sending items - Fix ignored update history limit parameter - Fix ineffective break statement in WebSocket write pump	2026-02-03 17:16:27 +00:00
rcourtman	c7f4030c29	fix(monitoring): prevent memory leak from stale metrics history and rate tracker entries MetricsHistory.Cleanup() was defined but never called, and even if called, it only removed old data points without deleting map entries for deleted containers/VMs. Each stale entry leaked ~224KB (7 pre-allocated slices). Changes: - Call metricsHistory.Cleanup() and rateTracker.Cleanup() in maintenance loop - Delete map entries entirely when all data points have expired - Return nil instead of empty slice in cleanupMetrics() to release backing arrays - Add Cleanup() method to RateTracker with 24-hour stale threshold - Add debug logging to track cleanup activity Related to #1153	2026-02-03 17:16:06 +00:00
rcourtman	f8bb14977d	fix(discovery): include IPAddresses in state adapter for URL suggestion The discovery state adapter was not copying IPAddresses from the models when converting VM/Container state. This caused getResourceExternalIP() to return empty strings, preventing URL suggestion from working.	2026-02-03 17:05:01 +00:00
rcourtman	bd030c7c87	security: fix webhook SSRF, rate limit spoofing, metrics retention, and url poisoning - Fix SSRF and rate limit bypass in SendEnhancedWebhook by validating the rendered URL. - Fix rate limit spoofing in updates API by using secure IP extraction (trusted proxies). - Fix memory leak in metrics history by correctly clearing fully stale data series. - Fix public URL poisoning by preventing overwrites when explicitly configured.	2026-02-03 16:58:13 +00:00
rcourtman	3ea3f0f827	feat(discovery): auto-suggest web interface URLs for discovered services Add deterministic URL suggestion based on service type and external IP: - Add SuggestedURL field to ResourceDiscovery type (Go + TypeScript) - Create url_suggestion.go with 60+ service defaults (Jellyfin, Plex, Home Assistant, Grafana, Proxmox, etc.) - Support HTTPS services, custom paths (/web, /dashboard/, /admin) - Fall back to discovered ports for unknown services - Add UI in DiscoveryTab with "Use this" button to populate URL input - Add comprehensive unit tests for URL suggestion logic Suggestion only appears when no custom URL is saved. User clicks "Use this" to populate the input, then "Save" to confirm.	2026-02-03 16:49:57 +00:00
rcourtman	4f40c3d751	fix: resolve critical stability and auth issues - Fix data race in webhook notifications by removing shared state - Fix duplicate monitors on config reload by stopping old instances - Prevent metrics ID deletion on transient startup errors - Support Bearer auth header for config export/import endpoints	2026-02-03 16:46:27 +00:00
rcourtman	935326ebb7	fix(api/ai): resolve critical auth, agent download, and lifecycle issues - Fix API-only mode to accept Bearer tokens and query params - Fix data race in API token validation using fine-grained locking - Fix unified agent download serving wrong binary for invalid arch - Fix AI infra discovery running when AI disabled and missing stop mechanism	2026-02-03 16:35:12 +00:00
rcourtman	a1b9de8f10	Enhance discovery UI and table consistency - Fix visual flash in discovery tab - Standardize table column widths and UI across Docker, Hosts, Storage, etc. - Add support for new K8s and Host charts - Fix Service Discovery tests	2026-02-03 16:25:09 +00:00
rcourtman	3d8374e527	Fix AI investigation context and UI settings - Ensure correct org context is used for AI chat service resolution - Fix AI adapter tests - Update AI Intelligence page UI for advanced settings	2026-02-03 16:24:56 +00:00
rcourtman	aeca5e39fa	Fix multi-tenant persistence and backend stability - Initialize Alert and Notification managers with tenant-specific data directories - Add panic recovery to WebSocket safeSend for stability - Record host metrics to history for sparkline support	2026-02-03 16:24:42 +00:00
rcourtman	bea3bbe5f6	Fix API token authentication and multi-tenancy logic - Fix AuthContextMiddleware to use tenant-specific config for token validation - Resolve data race in token LastUsedAt update - Fix invalid org IDs returning 501/402 instead of 400 - Prevent unauthenticated organization directory creation (DoS protection)	2026-02-03 16:24:28 +00:00
rcourtman	88d95f40be	feat: add Discovery Transparency & Trust features - Add AI provider indicator showing local (Ollama) vs cloud (Anthropic/OpenAI) analysis - Add "What Discovery Does" explanation section before first scan - Show commands preview before scan so users know what will run - Add scan details section showing raw command outputs for admins - Filter sensitive Docker labels (passwords, secrets, tokens) before AI analysis - Add comprehensive tests for label filtering This improves sysadmin confidence by making discovery transparent about what it does, what data it collects, and where that data goes.	2026-02-03 14:59:27 +00:00
rcourtman	8720708e70	fix: address AI patrol concurrency and streaming issues - HIGH: Create per-request AgenticLoop instead of sharing one across concurrent sessions. This prevents race conditions where ExecuteStream calls would overwrite each other's FSM, knowledge accumulator, and other session-specific state. - MEDIUM: TriggerManager.GetStatus now recomputes adaptive interval after pruning old events. Previously, currentInterval could remain stuck in busy/quiet mode after events aged out of the window. - MEDIUM: Patrol stream phases are now broadcast to subscribers. Fixed setStreamPhase() to emit phase events and SubscribeToStream() to send phase events to late joiners. UI was stuck on 'Starting patrol...' because phase events were never emitted. - LOW: Fixed TriggerStatus.CurrentInterval JSON serialization. Changed from time.Duration (serializes as nanoseconds) to int64 milliseconds to match the 'current_interval_ms' tag.	2026-02-03 14:39:00 +00:00
rcourtman	c2ed6067f1	Fix: discovery routing, host identification, and UX feedback - Fix routing for POST/PUT/DELETE on /api/discovery/host/ endpoints (Go's http.ServeMux was matching the longer prefix before method-specific routes) - Add HOST-specific AI prompt that focuses on identifying the host OS rather than services/containers running on it - Add success message UI after discovery completes - Fix timing so success appears after data is visible (not during refetch) - Add error handling and display for failed discoveries	2026-02-03 14:10:54 +00:00
rcourtman	896b5bfc89	Fix: enable backup monitoring for PVE instances via config migration Adds a config migration that ensures MonitorBackups is enabled for PVE instances, matching the existing PBS migration from issue #411. This fixes issue #1139 where local PVE backups weren't appearing in the backup overview because the MonitorBackups field defaulted to false when not explicitly set. Fixes #1139	2026-02-03 13:38:41 +00:00
rcourtman	86a7c2283c	Revert "Detect incompatible models that don't support function calling" This reverts commit `11a72ee263`.	2026-02-03 13:36:30 +00:00
rcourtman	c6318a8484	Revert "Simplify incompatible model error message" This reverts commit `c58fe81700`.	2026-02-03 13:36:30 +00:00
rcourtman	c58fe81700	Simplify incompatible model error message	2026-02-03 13:30:54 +00:00
rcourtman	11a72ee263	Detect incompatible models that don't support function calling When local LLM servers (LM Studio, llama.cpp) receive tool definitions but the model doesn't support function calling, they output internal control tokens like <\|channel\|>, <\|im_start\|>, etc. instead of proper responses. This change detects these control tokens during streaming and returns a clear error message explaining that the model doesn't support function calling and recommending compatible models (Llama 3.1+, Mistral, Qwen). This is better than the previous approach of offering a "disable tools" option, which would have crippled Pulse Assistant/Patrol functionality. Users need to use compatible models for the AI features to work properly. Related to #1154	2026-02-03 13:28:37 +00:00
rcourtman	a55ae78715	Revert "Add config option to disable tools for OpenAI-compatible endpoints" This reverts commit `81229f206f`.	2026-02-03 13:26:26 +00:00
rcourtman	81229f206f	Add config option to disable tools for OpenAI-compatible endpoints Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible APIs but don't support function calling. When tools are sent to these models, they output raw control tokens instead of proper responses. This change adds: - openai_tools_disabled config field in AIConfig - AreToolsDisabledForProvider() method to check at runtime - API support to get/set the new setting - Tests for the new functionality When enabled and using a custom OpenAI base URL, the chat service will skip sending tools to the model, allowing basic chat functionality to work even with models that don't support function calling. Fixes #1154	2026-02-03 13:21:44 +00:00
rcourtman	e3556455c6	Revert "Sanitize LLM control tokens from OpenAI-compatible responses" This reverts commit `e5eb15918e`.	2026-02-03 13:14:33 +00:00
rcourtman	e5eb15918e	Sanitize LLM control tokens from OpenAI-compatible responses Some local models (llama.cpp, LM Studio) output internal control tokens like <\|channel\|>, <\|constrain\|>, <\|message\|> instead of using proper function calling. These tokens leak into the UI creating a poor UX. This adds sanitization to strip these control tokens from both streaming and non-streaming responses before they reach the user.	2026-02-03 13:12:17 +00:00
rcourtman	71f80c8a99	Fix: alert resolution now records incident timeline during quiet hours - Fixed early return in handleAlertResolved that skipped incident recording when quiet hours suppressed recovery notifications - Added Host Agent alert delay configuration (backend + UI) - Host Agents now have dedicated time threshold settings like other resource types Related to #1179	2026-02-03 12:49:41 +00:00
rcourtman	900e05025a	Fix OpenAI-compatible endpoint support for chat Two issues fixed: 1. Custom base URL wasn't being passed to the OpenAI client in createProviderForModel() - requests went to api.openai.com instead of the configured endpoint (e.g., LM Studio, llama.cpp) 2. Tool schemas were missing the "properties" field when tools had no parameters. OpenAI API requires "properties" to always be present as an object, even if empty. Fixes #1154	2026-02-03 12:03:06 +00:00
rcourtman	8495878553	Fix: improve mock metrics sampler startup performance - Reduce minimum seed duration from 7 days to 1 hour for faster startup on resource-constrained systems (like demo server 1GB droplet) - Reduce sleep times from 200ms to 50ms between resource processing - Add diagnostic logging throughout mock metrics seeding to help debug issues where sparklines show no data - Add progress logging for nodes, VMs, containers, storage, docker hosts	2026-02-03 12:03:06 +00:00
rcourtman	a61f1b387a	Fix: data race in Docker detection test mock — add mutex for concurrent calls	2026-02-03 00:12:16 +00:00
rcourtman	744eeb0270	Chore: clean up staged changes for release - Remove standalone pulse-assistant architecture doc (content lives in CLAUDE.md) - Add CountdownTimer component for patrol schedule display - Rewrite patrol handler test to focus on interval persistence - Extract MockStateProvider to shared test file	2026-02-02 23:17:40 +00:00
rcourtman	c8483f8116	Fix: PBS backup verification status not updating after cache populated The PBS backup snapshot cache only compared BackupCount and LastBackup timestamp to decide whether to re-fetch. When PBS verify jobs complete, neither field changes — only the Verification field on individual snapshots changes — so the cache served stale data indefinitely. Add a 10-minute TTL per backup group so verification status changes are picked up periodically. Also add panic recovery to PBS and PVE backup goroutines, and use runtimeCtx for PBS backup polling to respect monitor shutdown. Closes #1174	2026-02-02 23:12:26 +00:00
rcourtman	02946d45ec	test: expand api handler coverage	2026-02-02 23:01:29 +00:00
rcourtman	eed80e2883	Fix: patrol interval not applied — omitempty caused preset to persist across reloads The "Every" dropdown on the Patrol page was not being respected. Setting 15 min would show "Runs every 6 hours" and the countdown timer was wrong. Root cause: PatrolSchedulePreset and PatrolIntervalMinutes had omitempty JSON tags. When the API handler cleared the preset to "", json.Marshal dropped the field. On reload, NewDefaultAIConfig() re-introduced "6hr" as the preset, which took priority over the user's custom minutes. Additional fixes in the same area: - Track nextScheduledAt explicitly in the patrol loop so next_patrol_at reflects the actual ticker schedule, not a stale lastPatrol + interval calculation that diverges when the interval changes mid-cycle. - Refetch patrol status in the frontend after an interval change so the countdown timer updates immediately. - Seed lastPatrol from persisted run history on startup so the header countdown timer appears immediately after a backend restart.	2026-02-02 22:53:24 +00:00
rcourtman	bfa648ddd5	Test: expand api feature test coverage Add tests for AI intelligence, Docker/K8s agents, log redaction, and general router helper functions.	2026-02-02 22:02:22 +00:00
rcourtman	43d7fffeef	Test: add coverage for auth and security handlers Add additional tests for OIDC, SAML, and tenant middleware to improve coverage of security-critical paths.	2026-02-02 22:02:11 +00:00
rcourtman	97a985efb8	Test: improve frontend embedding coverage Enhance tests for frontend embedding to cover filesystem overrides, dev proxy configuration, and SPA header handling.	2026-02-02 22:01:46 +00:00
rcourtman	eb2d07e48f	Chore: enhance core api and metrics testability Refactor Router to allow HTTP client injection for install script proxying. Add tests for unified agent install mechanism and additional metrics store coverage.	2026-02-02 22:01:36 +00:00

1 2 3 4 5 ...

1390 commits