Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-05-28 01:17:00 +00:00

Author	SHA1	Message	Date
rcourtman	07b4765b8d	fix: respect quiet hours for recovery notifications (#1068) Recovery notifications were bypassing the quiet hours check, causing users to receive recovery alerts during their configured quiet hours window even though the original "down" alerts were suppressed. - Add ShouldSuppressResolvedNotification() to alert manager - Check quiet hours before sending recovery notifications in monitor - Recovery notifications now follow same suppression rules as alerts	2026-01-09 21:47:36 +00:00
rcourtman	2a8f55d719	feat(enterprise): add Advanced Reporting and Audit Webhooks integration This commit adds enterprise-grade reporting and audit capabilities: Reporting: - Refactored metrics store from internal/ to pkg/ for enterprise access - Added pkg/reporting with shared interfaces for report generation - Created API endpoint: GET /api/admin/reports/generate - New ReportingPanel.tsx for PDF/CSV report configuration Audit Webhooks: - Extended pkg/audit with webhook URL management interface - Added API endpoint: GET/POST /api/admin/webhooks/audit - New AuditWebhookPanel.tsx for webhook configuration - Updated Settings.tsx with Reporting and Webhooks tabs Server Hardening: - Enterprise hooks now execute outside mutex with panic recovery - Removed dbPath from metrics Stats API to prevent path disclosure - Added storage metrics persistence to polling loop Documentation: - Updated README.md feature table - Updated docs/API.md with new endpoints - Updated docs/PULSE_PRO.md with feature descriptions - Updated docs/WEBHOOKS.md with audit webhooks section	2026-01-09 21:31:49 +00:00
rcourtman	92c150e979	feat(rbac): add OIDC group mapping tests and audit logging for RBAC actions	2026-01-09 19:25:33 +00:00
rcourtman	6ed1fdf806	feat(rbac): implement RBAC UI, OIDC group mapping, and API standard auth - Added Roles and Users settings panels - Implemented OIDC group-to-role mappings in config and auth flow - Standardized API token context handling via pkg/auth - Added Pulse Pro branding and upgrade banners to RBAC features - Cleanup: Removed empty code blocks and fixed lint errors	2026-01-09 19:16:34 +00:00
rcourtman	3e2824a7ff	feat: remove Enterprise badges, simplify Pro upgrade prompts - Replace barrel import in AuditLogPanel.tsx to fix ad-blocker crash - Remove all Enterprise/Pro badges from nav and feature headers - Simplify upgrade CTAs to clean 'Upgrade to Pro' links - Update docs: PULSE_PRO.md, API.md, README.md, SECURITY.md - Align terminology: single Pro tier, no separate Enterprise tier Also includes prior refactoring: - Move auth package to pkg/auth for enterprise reuse - Export server functions for testability - Stabilize CLI tests	2026-01-09 16:51:08 +00:00
rcourtman	22059210f7	fix(frontend): remove unused import and variable to satisfy hooks	2026-01-09 14:46:15 +00:00
rcourtman	5c4399d69f	feat(agent): add DisableCeph toggle, report_ip remote config, and improved IP detection (#929)	2026-01-09 14:45:29 +00:00
rcourtman	6019e3e77e	fix: normalize custom OpenAI-compatible API URLs (#1067) Users providing base URLs like "https://openrouter.ai/api/v1" were getting HTML error responses because the client used the URL directly without appending "/chat/completions". - Normalize baseURL in NewOpenAIClient to ensure it ends with /chat/completions - Fix modelsEndpoint() to derive /models from the normalized baseURL - Add tests for URL normalization with various endpoint formats	2026-01-09 09:13:36 +00:00
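
The normalization described in 6019e3e77e can be sketched roughly as follows. This is an illustration of the idea only, not the code from the commit: `normalizeChatURL` and `deriveModelsURL` are hypothetical names, and trimming whitespace and trailing slashes is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// chatSuffix is the path the commit message says the base URL must end with.
const chatSuffix = "/chat/completions"

// normalizeChatURL (hypothetical name) makes sure a user-supplied base URL
// ends with /chat/completions exactly once.
func normalizeChatURL(raw string) string {
	u := strings.TrimRight(strings.TrimSpace(raw), "/")
	if strings.HasSuffix(u, chatSuffix) {
		return u
	}
	return u + chatSuffix
}

// deriveModelsURL (hypothetical name) derives the /models endpoint from an
// already normalized chat URL.
func deriveModelsURL(normalized string) string {
	return strings.TrimSuffix(normalized, chatSuffix) + "/models"
}

func main() {
	for _, raw := range []string{
		"https://openrouter.ai/api/v1",
		"https://openrouter.ai/api/v1/",
		"https://openrouter.ai/api/v1/chat/completions",
	} {
		n := normalizeChatURL(raw)
		fmt.Println(n, deriveModelsURL(n))
	}
}
```

All three inputs in `main` normalize to the same chat URL, so the models URL is the same whichever form the user entered.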
rcourtman	020553a12d	fix: use flexible subnet matching instead of fixed /24 The previous implementation assumed /24 subnets, which failed for larger networks (e.g., /16 or /20). Now uses progressive subnet matching that tries /24, /20, and /16 to handle various network sizes. Example: If connection IP is 10.1.1.5 and a node has 10.1.2.6, it now correctly identifies them as being on the same network.	2026-01-08 23:24:50 +00:00
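
A minimal sketch of the progressive matching described in 020553a12d, assuming IPv4 only and using the standard `net/netip` package. The function name `matchedPrefixBits` is hypothetical, as is the choice to return the matched prefix length rather than a boolean.

```go
package main

import (
	"fmt"
	"net/netip"
)

// matchedPrefixBits (hypothetical name) tries /24, then /20, then /16 and
// returns the first prefix length under which both addresses fall in the
// same network, or 0 if none match or an input is not a valid IPv4 address.
func matchedPrefixBits(connIP, nodeIP string) int {
	a, err := netip.ParseAddr(connIP)
	if err != nil || !a.Is4() {
		return 0
	}
	b, err := netip.ParseAddr(nodeIP)
	if err != nil || !b.Is4() {
		return 0
	}
	for _, bits := range []int{24, 20, 16} {
		if netip.PrefixFrom(a, bits).Masked().Contains(b) {
			return bits
		}
	}
	return 0
}

func main() {
	// The example from the commit message: different /24, same /20.
	fmt.Println(matchedPrefixBits("10.1.1.5", "10.1.2.6")) // 20
}
```

Returning the prefix length lets a caller prefer the tightest match when a node has several candidate addresses.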
rcourtman	bd1df9f942	feat: automatic subnet preference for cluster node discovery When discovering cluster nodes, Pulse now automatically prefers IPs on the same subnet as the initial connection. This fixes the common issue where Pulse used internal cluster network IPs (e.g., 172.x.x.x) instead of management network IPs (e.g., 10.x.x.x). How it works: 1. Extract subnet from initial connection URL (assumes /24 for IPv4) 2. For each discovered node, query /nodes/{node}/network for all IPs 3. If cluster-reported IP is on a different subnet, find an IP on the preferred subnet and set it as IPOverride 4. Manual IPOverride settings are preserved and take precedence This eliminates the need for manual IPOverride configuration in most multi-network Proxmox setups. Refs #929, #1066	2026-01-08 23:12:30 +00:00
rcourtman	d5c93fd226	fix: add cluster endpoint IP override and Windows agent download support 1. Add IPOverride field to ClusterEndpoint struct - Allows users to specify a custom IP that takes precedence over auto-discovered IPs - Fixes #929 and #1066 where Pulse used internal cluster IPs instead of management IPs - Added EffectiveIP() method to cleanly handle the override logic 2. Update connection code to use EffectiveIP() - monitor.go: Use override when building endpoint URLs - temperature_proxy.go: Use override for proxy connections 3. Add bare Windows EXE files to GitHub releases - Fixes #1064 where LXC/barebone installs couldn't download Windows agents - Modified build-release.sh to copy EXEs alongside ZIPs - Added EXEs to checksum generation	2026-01-08 23:04:25 +00:00
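
The override logic from d5c93fd226 might look roughly like this. `ClusterEndpoint`, `IPOverride`, and `EffectiveIP()` are named in the commit message; the `IP` field holding the auto-discovered address and the value receiver are assumptions made for illustration.

```go
package main

import "fmt"

// ClusterEndpoint is reduced here to the two fields the sketch needs.
// IP (assumed field name) is the auto-discovered address; IPOverride is
// the user-supplied address that takes precedence when set.
type ClusterEndpoint struct {
	IP         string
	IPOverride string
}

// EffectiveIP returns the override when present, otherwise the discovered IP.
func (e ClusterEndpoint) EffectiveIP() string {
	if e.IPOverride != "" {
		return e.IPOverride
	}
	return e.IP
}

func main() {
	ep := ClusterEndpoint{IP: "172.16.0.10", IPOverride: "10.0.0.10"}
	fmt.Println(ep.EffectiveIP()) // 10.0.0.10
}
```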
rcourtman	568aac6bd0	fix: multiple triage fixes for stability and correctness 1. Use correct mutex (diagMu) in cleanupDiagnosticSnapshots to prevent "concurrent map iteration and map write" panics (Fixes #1063) 2. Use cluster name for storage instance comparison in UpdateStorageForInstance to prevent storage duplication in clustered Proxmox setups (Fixes #1062) 3. Fix KUBECONFIG unbound variable error in install.sh by using ${KUBECONFIG:-} default parameter expansion (Fixes #1065)	2026-01-08 22:54:33 +00:00
rcourtman	06ebaf50b2	fix: use consistent ID for shared storage to prevent duplication (#1049) Shared storage was duplicating across polling cycles because the ID included the node name of whichever node reported it first. When a different node reported first on the next cycle, a new ID was created. This fix updates the shared storage aggregation to use a consistent ID format (instance-cluster-storageName) that doesn't include the node name. Closes #1049. Thanks to @siccous for the report and initial investigation.	2026-01-08 21:29:24 +00:00
rcourtman	5f0214b949	fix: support ReportIP override in Proxmox auto-registration (#1061)	2026-01-08 21:20:51 +00:00
rcourtman	33bb0a95bb	docs: Fix formatting in API reference	2026-01-08 20:15:25 +00:00
rcourtman	6de1c660b1	chore: Improve pre-commit data validation and ignore patterns	2026-01-08 20:04:02 +00:00
rcourtman	3801b7ad7a	chore: Ignore husky internal directory	2026-01-08 19:37:04 +00:00
rcourtman	73c5128a87	feat(audit): Add audit log API endpoints and UI with signature verification - Add GET /api/audit endpoint for listing events with filters - Add GET /api/audit/:id/verify endpoint for signature verification - Add AuditLogPanel UI component with filtering and verification - Update docs with audit API documentation - Add localStorage utils for persisting UI state - Update gitignore patterns	2026-01-08 19:19:57 +00:00
rcourtman	7342191075	docs: fix Helm chart install commands to use GitHub Pages repo The GHCR OCI registry (ghcr.io/rcourtman/pulse-chart) is returning 403/404 errors for unauthenticated users. Updated all Helm references to use the working GitHub Pages Helm repository at https://rcourtman.github.io/Pulse Fixes install issues reported by customers trying to deploy via Helm. Files updated: - docs/KUBERNETES.md - docs/INSTALL.md - docs/DEPLOYMENT_MODELS.md - docs/UPGRADE_v5.md	2026-01-08 14:27:45 +00:00
rcourtman	22e01e2244	feat: Add centralized agent configuration management (Pro) Allows administrators to create configuration profiles and assign them to agents for centralized fleet management. - Configuration profiles with customizable settings (Docker, K8s, Proxmox monitoring, log level, reporting interval) - Profile assignment to agents by ID - Agent-side remote config client to fetch settings on startup - Full CRUD API at /api/admin/profiles - Settings UI panel in Settings → Agents → Agent Profiles - Automatic cleanup of assignments when profiles are deleted	2026-01-08 12:06:36 +00:00
rcourtman	7db6b3e47d	feat: Add AI chat session sync across devices Implements server-side persistence for AI chat sessions, allowing users to continue conversations across devices and browser sessions. Related to #1059. Backend: - Add chat session CRUD API endpoints (GET/PUT/DELETE) - Add persistence layer with per-user session storage - Support session cleanup for old sessions (90 days) - Multi-user support via auth context Frontend: - Rewrite aiChat store with server sync (debounced) - Add session management UI (new conversation, switch, delete) - Local storage as fallback/cache - Initialize sync on app startup when AI is enabled	2026-01-08 10:47:45 +00:00
rcourtman	695ced6273	docs: Add API token scopes and kiosk mode documentation Documents all available token scopes, UI presets, and step-by-step instructions for setting up kiosk mode with read-only dashboard tokens. Related to #1055	2026-01-08 10:27:15 +00:00
rcourtman	f29badbd1f	feat: Add kiosk mode support with read-only dashboard tokens - Add "Kiosk / Dashboard" preset in API token manager for easy token creation - Backend returns token scopes in /api/security/status when authenticated via token - Frontend hides Settings tab when token lacks settings:read scope - URL-based token auth via ?token=xxx now properly reports scopes Users can now create a monitoring:read token and use it in kiosk displays without exposing settings or requiring cookie persistence. Related to #1055	2026-01-08 10:18:27 +00:00
rcourtman	49272bd48c	fix: Show usable RAIDZ capacity instead of raw pool size For RAIDZ/mirror pools, zpool list SIZE reports raw capacity (sum of all disks), but users expect usable capacity (accounting for parity). The dataset stats from statfs give the correct usable capacity. Now uses dataset Total when it's smaller than zpool Size, indicating RAIDZ/mirror overhead. Related to #1052	2026-01-08 09:38:18 +00:00
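
The capacity rule from 49272bd48c fits in a few lines. This is a sketch under assumptions: `usableCapacity` is a hypothetical name, sizes are in bytes, and the guard against a zero dataset total is an addition not mentioned in the commit.

```go
package main

import "fmt"

// usableCapacity (hypothetical name) prefers the dataset total reported by
// statfs when it is smaller than the raw zpool size, which indicates
// RAIDZ/mirror overhead. A zero dataset total is treated as "unknown"
// (an assumption) and the zpool size is kept.
func usableCapacity(zpoolSize, datasetTotal uint64) uint64 {
	if datasetTotal > 0 && datasetTotal < zpoolSize {
		return datasetTotal
	}
	return zpoolSize
}

func main() {
	// Illustrative numbers only: 16 units raw, 12 units usable.
	fmt.Println(usableCapacity(16, 12)) // 12
}
```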
rcourtman	8c4bef27f0	docs: improve reverse proxy HTTPS detection and Swarm troubleshooting - Add detailed HTTPS detection troubleshooting to REVERSE_PROXY.md - Explain X-Forwarded-Proto header requirement for nginx/Caddy/Apache - Add Docker Swarm troubleshooting section to UNIFIED_AGENT.md - Document how to force Docker runtime if auto-detection fails Based on customer feedback.	2026-01-07 18:23:48 +00:00
rcourtman	e4c17777d0	feat: Add deployment strategy configuration to Helm chart Added strategy.type option to values.yaml (default: RollingUpdate) to allow users to configure the deployment strategy. Users with ReadWriteOnce (RWO) persistent volumes should set this to "Recreate" to avoid Multi-Attach errors during upgrades. Related to #1057	2026-01-07 17:57:41 +00:00
rcourtman	95fb896a03	fix: Agent 405 errors when reverse proxy redirects HTTP to HTTPS When a user's reverse proxy redirects HTTP to HTTPS, Go's default HTTP client behavior converts POST requests to GET on 301/302 redirects (per HTTP specification). This causes the Pulse server to return 405 "Only POST is allowed" errors. Added CheckRedirect to all agent HTTP clients (host, docker, kubernetes) that returns a clear error message guiding users to use the correct protocol in their --url flag instead of silently following redirects. Related to #1058	2026-01-07 17:56:07 +00:00
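
The redirect handling in 95fb896a03 relies on the standard `http.Client.CheckRedirect` hook. The sketch below shows the general technique, not the agent's actual code: `newAgentClient`, `errRedirect`, the error wording, and the local test server are all illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// errRedirect is a hypothetical sentinel; the real agent's wording may differ.
var errRedirect = errors.New("server responded with a redirect; check the scheme (http vs https) in --url")

// newAgentClient returns a client that refuses to follow redirects, so a
// POST is never silently turned into a GET on a 301/302.
func newAgentClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return fmt.Errorf("%w (redirect target: %s)", errRedirect, req.URL)
		},
	}
}

// isRedirectError reports whether err was produced by the hook above.
func isRedirectError(err error) bool { return errors.Is(err, errRedirect) }

// postThroughRedirect simulates a reverse proxy that redirects every request.
func postThroughRedirect() error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "https://example.invalid/report", http.StatusMovedPermanently)
	}))
	defer srv.Close()

	resp, err := newAgentClient().Post(srv.URL, "application/json", strings.NewReader("{}"))
	if resp != nil {
		resp.Body.Close()
	}
	return err
}

func main() {
	fmt.Println(postThroughRedirect())
}
```

Because the hook's error is wrapped in a `*url.Error`, callers can still detect it with `errors.Is` and print a targeted hint instead of a bare 405.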
rcourtman	3f0808e9f9	docs: comprehensive core and Pro documentation overhaul - Major updates to README.md and docs/README.md for Pulse v5 - Added technical deep-dives for Pulse Pro (docs/PULSE_PRO.md) and AI Patrol (docs/AI.md) - Updated Prometheus metrics documentation and Helm schema for metrics separation - Refreshed security, installation, and deployment documentation for unified agent models - Cleaned up legacy summary files	2026-01-07 17:38:27 +00:00
rcourtman	9cfcdbb247	fix: Use per-node shared flag for storage deduplication The storage deduplication logic only checked cluster config's Shared flag, but this required the cluster config API call to succeed. When the per-node storage API already returns shared=1 (as the user verified), we should use that directly. Now we check three sources for shared storage detection: 1. Per-node API shared flag (storage.Shared) 2. Cluster config shared flag (if available) 3. Storage type heuristics (NFS, RBD, PBS, etc.) Related to #1049	2026-01-07 10:16:23 +00:00
rcourtman	dcdbee3c5c	feat: Add in-app help system with HelpIcon component Add contextual help icons throughout the UI to improve feature discoverability. Users can click (?) icons to see explanations with examples for settings they might not understand. - HelpIcon component with click-to-open popover - Centralized help content registry in /content/help/ - FeatureTip component for dismissible contextual tips - Help added to: alert delay, AI endpoints, update channel	2026-01-07 09:22:23 +00:00
rcourtman	b75b33b9fe	fix: Read form values from DOM for password manager compatibility Password managers may fill form fields programmatically without triggering input events, causing SolidJS signals to remain empty. This fix reads values directly from the DOM on submit, ensuring credentials filled by password managers are properly captured. Related to #1036	2026-01-06 22:25:11 +00:00
rcourtman	73e6a8edc5	fix: Add missing UI for physical disk polling interval setting The previous commit (`06261627`) added backend support for configurable physical disk polling intervals but didn't include the UI to configure it. Adds a dropdown selector (5/15/30/60 minutes) that appears when physical disk monitoring is enabled. Related to #1007	2026-01-06 20:32:24 +00:00
rcourtman	96d06da0d7	fix: Deduplicate shared storages (NFS, RBD, PBS, etc) in cluster view Shared storages were appearing multiple times (once per node) because the deduplication logic only checked the Proxmox `Shared` flag. Many storage types are inherently cluster-wide but don't set this flag: - RBD (Ceph block storage) - CephFS - PBS (Proxmox Backup Server) - GlusterFS - NFS - CIFS/SMB - iSCSI Now we detect shared storage based on both the Shared flag AND the storage type. Inherently shared storage types are deduplicated and shown once with a "cluster" node designation. Related to #1049	2026-01-06 17:44:52 +00:00
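
The detection rule from 96d06da0d7 (Shared flag or inherently shared type) can be sketched as below. The function name `isSharedStorage` is hypothetical, and the lowercase type identifiers are assumed spellings of the storage types listed in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// inherentlySharedTypes lists the storage types the commit message calls
// cluster-wide. The exact identifier strings are assumptions.
var inherentlySharedTypes = map[string]bool{
	"rbd":       true,
	"cephfs":    true,
	"pbs":       true,
	"glusterfs": true,
	"nfs":       true,
	"cifs":      true,
	"iscsi":     true,
}

// isSharedStorage (hypothetical name) treats a storage as shared when the
// Shared flag is set or when its type is inherently cluster-wide.
func isSharedStorage(sharedFlag bool, storageType string) bool {
	return sharedFlag || inherentlySharedTypes[strings.ToLower(storageType)]
}

func main() {
	fmt.Println(isSharedStorage(false, "nfs")) // true
	fmt.Println(isSharedStorage(false, "dir")) // false
}
```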
rcourtman	d3116defe3	fix: Prevent panic from send on closed websocket channel Add atomic `closed` flag to Client struct and `safeSend()` helper method to prevent race condition when sending to client channels. The race occurred when a client disconnected while a goroutine was trying to send initial state - the channel could be closed between the registration check and the actual send. All sends to client.send now go through safeSend() which checks the closed flag first. The flag is set atomically before closing the channel in all code paths (unregister, dispatchToClients, broadcast, shutdown). Related to #1048	2026-01-06 17:41:25 +00:00
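
A rough sketch of the closed-flag pattern described in d3116defe3. `Client`, `closed`, and `safeSend()` are named in the commit message; everything else is assumed, including the non-blocking send and the deferred `recover`, which is added here as a backstop for a close that lands between the flag check and the send.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Client is reduced to the two fields the sketch needs.
type Client struct {
	send   chan []byte
	closed atomic.Bool
}

func newClient(buffer int) *Client {
	return &Client{send: make(chan []byte, buffer)}
}

// safeSend checks the closed flag first and never blocks. It reports
// whether the message was queued.
func (c *Client) safeSend(msg []byte) (ok bool) {
	if c.closed.Load() {
		return false
	}
	// Backstop (an assumption of this sketch): swallow the panic if the
	// channel is closed after the flag check but before the send.
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	select {
	case c.send <- msg:
		return true
	default:
		return false // buffer full
	}
}

// close sets the flag before closing the channel and is safe to call twice.
func (c *Client) close() {
	if c.closed.CompareAndSwap(false, true) {
		close(c.send)
	}
}

func main() {
	c := newClient(1)
	fmt.Println(c.safeSend([]byte("state"))) // true
	c.close()
	fmt.Println(c.safeSend([]byte("late"))) // false
}
```

Using `CompareAndSwap` in `close` means every shutdown path can call it without coordinating which one runs first.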
rcourtman	48fdff3efb	fix: Preserve ackState for old acknowledged alerts during restore When LoadActiveAlerts skipped acknowledged alerts older than 1 hour, it was also not populating ackState. This meant that when the same alert (e.g., backup-age) was recreated on the next poll cycle, preserveAlertState couldn't find any acknowledgement record and the alert would retrigger notifications. Now ackState is populated even for skipped old acknowledged alerts, so if they reappear, the acknowledgement will be restored. Related to #1043	2026-01-06 11:00:36 +00:00
rcourtman	74ea90e4b3	fix: Podman sockets not prioritized when --docker-runtime=podman When --docker-runtime=podman is explicitly set, the agent should try Podman-specific sockets first before falling back to environment defaults (which try /var/run/docker.sock). Also adds /var/run/podman/podman.sock as a candidate socket path, which is used by CoreOS and some Fedora configurations. Related to #1045	2026-01-06 10:56:37 +00:00
rcourtman	d7000fafb6	fix: Empty array expansion fails on macOS bash 3.2 with set -u macOS ships with bash 3.2 (GPLv2) which has a bug where expanding an empty array like ${array[@]} with set -u enabled throws an "unbound variable" error, even when the array is initialized. Use ${arr[@]+"${arr[@]}"} pattern to safely handle empty arrays. Related to #1046	2026-01-06 10:52:44 +00:00
rcourtman	cfcba70b2b	chore: Bump version to 5.0.12	2026-01-05 23:48:57 +00:00
rcourtman	d0191d136f	fix: Add configurable poll timeout and handle external Ceph storage Changes: 1. Add MAX_POLL_TIMEOUT env var for large Proxmox clusters that need more than 3 minutes for polling (default: 3m, minimum: 30s) 2. Handle external Ceph storage gracefully - don't mark nodes unhealthy when Proxmox returns 'binary not installed' (e.g., for Ceph not managed by Proxmox) Related to #965	2026-01-05 23:34:33 +00:00
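
Reading a bounded timeout such as `MAX_POLL_TIMEOUT` from d0191d136f could look like the sketch below. The default (3m) and minimum (30s) come from the commit message. The function name, the Go duration syntax for the value, falling back to the default on a parse error, and clamping to the minimum are all assumptions.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

const (
	defaultPollTimeout = 3 * time.Minute
	minPollTimeout     = 30 * time.Second
)

// pollTimeoutFrom (hypothetical name) parses a duration string such as
// "10m". Empty or unparsable input yields the default; values below the
// minimum are raised to the minimum.
func pollTimeoutFrom(raw string) time.Duration {
	if raw == "" {
		return defaultPollTimeout
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return defaultPollTimeout
	}
	if d < minPollTimeout {
		return minPollTimeout
	}
	return d
}

func main() {
	fmt.Println(pollTimeoutFrom(os.Getenv("MAX_POLL_TIMEOUT")))
}
```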
rcourtman	c6182b2ed3	feat: Add FreeBSD/OPNsense support for the Pulse agent Added FreeBSD amd64 and arm64 build targets to the release process: - Build host-agent and unified agent binaries for FreeBSD - Package FreeBSD tarballs in releases - Include FreeBSD binaries in universal tarball for download endpoint Updated agent install script with FreeBSD support: - Fixed architecture detection (FreeBSD reports 'amd64' not 'x86_64') - Added FreeBSD rc.d service handler with proper daemon management - Automatic service enabling via rc.conf This enables users to run the Pulse agent on FreeBSD-based systems like OPNsense, pfSense, and vanilla FreeBSD. Fixes #1041	2026-01-05 18:18:06 +00:00
rcourtman	0826c4ddb2	fix: Show linked agents in Managed Agents table with badge Previously, agents linked to Proxmox nodes were hidden from the Settings > Agents > Managed Agents table, which confused users who couldn't find their installed agents. Now all agents are shown in the table, with linked agents displaying an indigo 'Linked' badge that explains they're also merged with Proxmox nodes in the Dashboard. Fixes #1038	2026-01-05 17:57:11 +00:00
rcourtman	0b6bceb96f	fix: Hide non-functional edit button for Docker hosts in thresholds table. Related to discussion #1040	2026-01-05 17:13:43 +00:00
rcourtman	e4d7f6fd3d	fix: Allow querying non-PBS backup storage with Active=0 Previously, only PBS-type storages were queried when Active=0 because querying inactive storage can return 500 errors. However, this caused backups from datacenter backup tasks on shared storage (NFS, CIFS, etc.) to not appear when the storage reported Active=0 on some nodes. Now any storage with backup content is queried regardless of Active status. If the storage is truly unavailable, GetStorageContent returns an error which is already handled gracefully (logged and skipped). Related to #1037	2026-01-05 14:53:40 +00:00
rcourtman	2cc9214336	feat: Make container update alerts a free feature Update alerts for Docker containers are now available to all users, not just Pro license holders. The feature alerts when container image updates have been pending for longer than the configured delay (default: 24 hours). - Remove Pro license gating from update alerts - Add FeatureUpdateAlerts to free tier features - Remove obsolete license gating tests Related to #1031	2026-01-04 23:59:29 +00:00
rcourtman	f210ef5517	Auto-update Helm chart version to 5.0.11	2026-01-04 20:01:07 +00:00
rcourtman	9388a13718	Auto-update Helm chart documentation	2026-01-04 20:01:06 +00:00
rcourtman	3b70e29b87	test: add PULSE_DATA_DIR to TestMainCmd TestMainCmd was missing PULSE_DATA_DIR setup, causing it to try to access /etc/pulse which fails in CI.	2026-01-04 19:15:38 +00:00
rcourtman	21a819f6dc	test: use t.Setenv for safer test cleanup t.Setenv ensures environment variables are restored after test completion, preventing race conditions where background goroutines (like config watchers) might access unset env vars during cleanup.	2026-01-04 19:08:45 +00:00
rcourtman	fdba559167	test: skip tests requiring /etc/pulse in CI Tests that use the default /etc/pulse data directory fail in CI where the directory doesn't exist and can't be created.	2026-01-04 18:59:48 +00:00
rcourtman	1731489709	test: remove obsolete EnsureDirError test The test was checking an error path that no longer exists - NewConfigPersistence now falls back to /etc/pulse when directory creation fails, and calls log.Fatal() only when that also fails.	2026-01-04 18:51:02 +00:00

4826 commits