vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-05-05 15:23:27 +00:00

Author	SHA1	Message	Date
rcourtman	8f05fc0a57	Improve backup-age alerts to show VM/CT names in multi-cluster setups (related to #668) This change fixes backup-age alert notifications to display VM/CT names instead of just "VMID XXX" in multi-cluster environments where backups are stored on PBS. Changes: - Store all guests per VMID (not just first match) to handle VMID collisions across clusters - Persist last-known guest names/types in metadata store for deleted VMs - Enrich backup correlation with persisted metadata when live inventory is empty - Update CheckBackups to handle multiple VMID matches intelligently The fix addresses two scenarios: 1. Multiple PVE clusters with same VMID backing up to one PBS 2. VMs deleted from Proxmox but backups still exist on PBS Backup-age alerts will now show proper VM/CT names when: - A unique guest exists with that VMID (live or persisted) - Multiple guests share a VMID (uses first match, consistent with current behavior) When truly ambiguous (multiple live VMs, same VMID, no way to determine origin), the alert gracefully falls back to showing "VMID XXX".	2025-11-08 18:24:04 +00:00
rcourtman	4891f06e76	Fix webhook alerts persisting when DisableAll* flags are enabled The original fix in `c6c0ac63e` only handled per-resource overrides when thresholds were disabled (trigger <= 0 or Disabled=true). It did not handle global DisableAll* flags (DisableAllStorage, DisableAllNodes, DisableAllGuests, etc.). When a user toggled a DisableAll* flag from false to true: - Check* functions returned early without processing - Existing active alerts remained in m.activeAlerts map - Those alerts continued generating webhook notifications - reevaluateActiveAlertsLocked didn't check DisableAll* flags This commit fixes the issue by: 1. Updating reevaluateActiveAlertsLocked to check all DisableAll* flags and resolve alerts for those resource types during config updates 2. Adding alert cleanup to Check* functions before early returns: - CheckStorage: clears usage and offline alerts - CheckNode: clears cpu/memory/disk/temperature and offline alerts - CheckPMG: clears queue/message alerts and offline alerts - CheckPBS: clears cpu/memory and offline alerts - CheckHost: calls existing cleanup helpers 3. Adding comprehensive test coverage for DisableAllStorage scenario Related to #561	2025-11-06 21:17:56 +00:00
rcourtman	b44084af3c	Skip false health alerts for Samsung 980/990 SSDs and improve Docker CPU calculation Related to #547 and #622 ## Samsung SSD Fix (#547) Samsung 980 and 990 series SSDs have known firmware bugs that cause them to report incorrect health status (typically FAILED or critical warnings) even when the drives are actually healthy. This is commonly due to incorrect temperature threshold reporting in the firmware. This change adds special handling to detect these drives and skip health status alerts while still monitoring wearout metrics, which remain reliable. The fix also clears any existing false alerts for these drives. Users experiencing these false alerts should update their Samsung SSD firmware to the latest version from Samsung, which typically resolves the issue. ## Docker Agent CPU Fix (#622) Addresses issue where Docker container CPU usage shows 0%. The Docker agent uses ContainerStatsOneShot which typically doesn't populate PreCPUStats, requiring manual delta tracking between collection cycles. Changes: - Fix logic bug where prevContainerCPU was updated before checking if previous sample existed, causing incorrect delta calculations - Add comprehensive debug logging showing which calculation method succeeded (PreCPUStats, system delta, or time-based fallback) - Add warning after 10 PreCPUStats failures to inform about manual tracking mode (normal for one-shot stats) - Add detailed failure logging when CPU calculation cannot complete Expected behavior: First collection cycle returns 0% (no previous sample), subsequent cycles show accurate CPU metrics.	2025-11-05 19:33:16 +00:00
rcourtman	fb22469eb0	Add disk usage threshold support for Docker containers Extends the Docker monitoring and alerting system to track writable layer usage as a percentage of the container's root filesystem. This helps identify containers with bloated copy-on-write layers before they consume excessive disk space. - Add disk threshold to DockerThresholdConfig (default: 85% trigger, 80% clear) - Evaluate disk alerts for running containers when RootFilesystemBytes > 0 - Include disk metadata (writable layer, total filesystem, block I/O stats) - Update frontend to display and configure disk thresholds - Add test coverage for disk usage alert hysteresis - Document disk monitoring in DOCKER_MONITORING.md Per-container and per-host overrides apply to disk thresholds the same way they do for CPU and memory.	2025-10-29 14:52:25 +00:00
rcourtman	b3285c05c8	Consolidate pending changes - Add Docker metadata test comment - Update alerts configuration and thresholds - Enhance config file watcher - Update documentation - Refine settings UI	2025-10-28 23:20:44 +00:00
rcourtman	68ce8e7520	feat: finalize swarm service monitoring (#598)	2025-10-26 09:35:49 +00:00
rcourtman	77282bd3a6	Implement Pulse tag overrides and alert clear persistence	2025-10-25 14:28:32 +00:00
rcourtman	d643dcf0bc	perf: reduce polling allocations and guest metadata load	2025-10-25 13:12:47 +00:00
rcourtman	be26f957c0	Add snapshot size alert thresholds (#585)	2025-10-22 13:30:40 +00:00
Pulse Automation Bot	cfdfe896be	Adjust backup and snapshot alert handling	2025-10-18 20:11:01 +00:00
Pulse Automation Bot	80b9d0602a	Add Apprise notification integration (#570)	2025-10-18 16:39:39 +00:00
rcourtman	219fcc6de5	Stop disabled metrics from sending webhooks Refs #561	2025-10-16 08:57:12 +00:00
rcourtman	4838793677	feat: enhance alerts system with tests and improved thresholds - Add comprehensive test coverage for alerts package with 285+ new tests - Implement ThresholdsTable component with metric thresholds display - Enhance Alerts page UI with improved layout and metric filtering - Add frontend component tests for Alerts page and ThresholdsTable - Set up Vitest testing infrastructure for SolidJS components - Improve config persistence with better validation - Expand discovery tests with 333+ test cases - Update API, configuration, and Docker monitoring documentation	2025-10-15 22:25:04 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00

14 commits
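
Commit fb22469eb0 describes a Docker disk threshold with a default 85% trigger and 80% clear, evaluated only when RootFilesystemBytes > 0. The sketch below illustrates how such trigger/clear hysteresis could behave. It is not taken from the Pulse source: the type `HysteresisThreshold`, the functions `Next` and `DiskUsagePercent`, and the exact boundary comparisons (`>=` to trigger, `<=` to clear) are all assumptions made for illustration.

```go
package main

import "fmt"

// HysteresisThreshold is a hypothetical type (not from the Pulse codebase)
// holding a trigger level and a lower clear level, both in percent.
type HysteresisThreshold struct {
	Trigger float64
	Clear   float64
}

// Next returns the new alert state given the previous state and a new sample.
// Assumption: an inactive alert fires at or above Trigger, and an active
// alert clears only once usage falls to or below Clear.
func (h HysteresisThreshold) Next(active bool, usagePercent float64) bool {
	if !active {
		return usagePercent >= h.Trigger
	}
	return usagePercent > h.Clear
}

// DiskUsagePercent expresses the writable layer as a percentage of the root
// filesystem. It reports ok=false when the root filesystem size is zero,
// mirroring the "RootFilesystemBytes > 0" condition in the commit message.
func DiskUsagePercent(writableLayerBytes, rootFilesystemBytes uint64) (pct float64, ok bool) {
	if rootFilesystemBytes == 0 {
		return 0, false
	}
	return float64(writableLayerBytes) / float64(rootFilesystemBytes) * 100, true
}

func main() {
	th := HysteresisThreshold{Trigger: 85, Clear: 80}
	active := false
	// Walk a few samples through the threshold and print the alert state.
	for _, sample := range []float64{70, 86, 83, 79} {
		active = th.Next(active, sample)
		fmt.Printf("usage=%.0f%% active=%t\n", sample, active)
	}
}
```

With these assumed semantics, a sample of 83% keeps an already active alert raised even though it is below the trigger level, which is the point of keeping the clear level lower than the trigger: usage hovering around 85% does not cause the alert to flap.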