Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-05-19 07:54:10 +00:00

Author	SHA1	Message	Date
rcourtman	0dbd39d671	Add memory balloon/swap badges	2025-10-02 12:44:56 +00:00
rcourtman	6de5684d38	Trim memory tooltip to extras only	2025-10-02 12:43:53 +00:00
rcourtman	ac0b23021f	Add memory tooltip detail	2025-10-02 12:42:17 +00:00
rcourtman	3949dfdd52	Remove memory extras row per design	2025-10-02 12:39:46 +00:00
rcourtman	fc08dbf61c	Mock memory swap and varied balloon	2025-10-02 12:38:36 +00:00
rcourtman	55375e0b2a	Restore mock per-disk data	2025-10-02 12:34:31 +00:00
rcourtman	532cca296f	Tidy memory column extras	2025-10-02 12:28:01 +00:00
rcourtman	1192d51416	Expose guest agent network info and extended memory stats	2025-10-02 12:26:32 +00:00
rcourtman	655842ba9d	Fix disabled alert webhook delivery	2025-10-02 12:09:46 +00:00
rcourtman	d9079f7bbb	Refine dashboard disk column layout	2025-10-02 12:08:19 +00:00
rcourtman	9e31d68207	Fix wearout parsing for Proxmox disks (fixes #449)	2025-10-02 11:57:06 +00:00
rcourtman	5a2fb939de	Handle non-numeric disk RPM values	2025-10-02 11:42:08 +00:00
rcourtman	cf365b5f80	fix: add default TimeThresholds to prevent hair-trigger alerts (fixes #491) Root cause: NewManager() was missing TimeThresholds initialization, causing all alert types to use 0-second delay. This meant alerts fired immediately on the first sample exceeding threshold, with no debouncing. Impact: LXC containers with brief CPU spikes to ~100% (normal for single-core saturation) triggered constant alerts instead of only alerting on sustained high CPU usage. Fix: Add default TimeThresholds: - guest: 10s delay (prevents alerts from brief CPU spikes) - node: 15s delay - storage: 30s delay - pbs: 30s delay This ensures CPU must stay above threshold for the configured duration before an alert fires, preventing noise from momentary spikes. Fixes #491	2025-10-02 11:31:56 +00:00
rcourtman	ad10d43542	refactor: create reusable NodeGroupHeader component and improve styling - Create shared NodeGroupHeader component to eliminate code duplication - Replace vertical line indicator with circular dot matching guest rows - Update online indicator to use bg-green-500 (matching guest indicators) - Reduce node row padding from py-2 to py-1 for more compact layout - Set background to dark:bg-gray-900 to match search bar styling - Apply changes consistently across Dashboard and Storage tabs	2025-10-02 08:29:29 +00:00
rcourtman	e15f54f851	Polish node row styling and restore disk detail support	2025-10-01 21:33:59 +00:00
rcourtman	bd658c33fa	feat: add powered-off alert toggle for guests Addresses #485 - adds UI controls for disabling powered-off alerts on a per-guest basis. Changes: - Add "Alert Powered-Off" / "No Powered-Off" toggle button for VMs/LXCs - Extend toggleNodeConnectivity() to handle guests in addition to nodes/PBS - Add disableConnectivity field to guest resource mapping - Update hasOverride logic to track connectivity state Previously, users could only disable ALL alerts for a guest or none. Now they can independently control resource metric alerts vs powered-off alerts, matching the functionality already available for nodes and PBS servers. User impact: - Enabled + Alert Powered-Off: All alerts including power state (default) - Enabled + No Powered-Off: Only resource alerts, ignore power state - Disabled: No alerts at all Backend already supports this via DisableConnectivity flag.	2025-10-01 20:56:44 +00:00
rcourtman	346cb19da6	feat(frontend): make node summary table sortable	2025-10-01 20:43:28 +00:00
rcourtman	d3c9313e3e	Refine dashboard node header styling	2025-10-01 20:32:33 +00:00
rcourtman	3f1fd2b36e	chore: bump version to v4.18.0	2025-10-01 19:24:20 +00:00
rcourtman	ce4c784769	feat: add powered-off VM/container alerting Implements alerts for powered-off VMs and containers as requested in GitHub discussion #487. Changes: - Modified CheckGuest to generate "powered-off" alerts for stopped guests - Added checkGuestPoweredOff() and clearGuestPoweredOffAlert() functions - Uses 2-poll confirmation (~10 seconds) to prevent false positives - Alert level is Warning (not Critical) by default - Alerts are automatically cleared when guest returns to running state - Respects existing disableConnectivity flag for per-guest configuration - Clears only metric alerts for non-running guests, preserves powered-off alerts - Updated DisableConnectivity comment to include powered-off alerts Configuration: - Can be disabled globally or per-guest via alert overrides - Uses existing disableConnectivity toggle (same as node offline alerts) - No frontend changes needed - types already support this Testing: - Build successful - Tests pass - Webhooks and emails will handle new alert type automatically	2025-10-01 19:19:49 +00:00
rcourtman	49be28bcae	Add Pushover webhook custom field handling	2025-10-01 19:09:06 +00:00
rcourtman	c3deb6170e	fix: prevent catastrophic data loss from encryption key regeneration CRITICAL FIX: Prevents nodes.enc configuration from being permanently lost when decryption fails due to encryption key regeneration or corruption. Root Cause Analysis: 1. If .encryption.key is deleted/regenerated, existing .enc files become unreadable 2. Previous code would fail to decrypt, try backup (also fails), then return error 3. This left NO nodes.enc file on disk 4. Next startup would see no .enc files and happily generate a new encryption key 5. User's node configuration was permanently lost Changes Made: 1. persistence.go (lines 600-645): When decryption fails for BOTH main file and backup, instead of returning error and leaving no file: - Log CRITICAL error with clear message about encryption key issue - Move corrupted file to timestamped .corrupted file for forensics - Create EMPTY but VALID encrypted nodes.enc file - Return empty config so system can start - This prevents encryption key regeneration on next startup 2. crypto.go (lines 93-121): Enhanced encryption key generation checks: - Now checks for nodes.enc* (including .backup, .corrupted files) - Uses glob patterns to find ANY encrypted file remnants - Refuses to generate new key if ANY .enc* files exist - Provides clear error message listing all found files - Forces manual intervention before allowing key regeneration Benefits: - System can still start even if decryption fails - Corrupted files are preserved with timestamps for forensic analysis - Encryption key cannot be silently regenerated if ANY encrypted data exists - Clear, prominent error logging helps diagnose the root cause - User is forced to manually address the issue rather than silently losing data This should prevent the recurring issue where node configurations mysteriously disappear, requiring manual reconfiguration through the UI.	2025-10-01 18:52:10 +00:00
rcourtman	b8ec92858b	fix: standardize ID formats across mock data and frontend to match production Mock data was using inconsistent ID formats that didn't match production code. This caused alert matching and fallback ID generation to fail. Backend changes (mock generator): - VMs: Use conditional logic matching production - standalone nodes use "node-vmid", clusters use "instance-node-vmid" (generator.go:503-509, 584-590) - Containers: Same conditional logic as VMs (generator.go:584-590) - Storage: Always use "instance-node-name" format matching production (generator.go:875, 922, 947, 982) - Shared storage: Use "shared" as node name and correct instance (generator.go:1007-1008) Frontend changes: - Dashboard.tsx: Guest ID fallback now matches backend conditional logic (Dashboard.tsx:964-969) - Storage.tsx: Storage ID fallback now uses "instance-node-name" format (Storage.tsx:603) Production format (from monitor.go): - Guest IDs: Standalone uses "node-vmid", cluster uses "instance-node-vmid" - Storage IDs: Always "instance-node-name" - Node IDs: Always "instance-node" This ensures: 1. Alert resourceId matching works correctly 2. Frontend fallbacks (if ever needed) generate correct IDs 3. Mock data accurately represents production behavior 4. Consistent filtering by instance+node works across all resource types	2025-10-01 18:38:42 +00:00
rcourtman	541cb12d18	fix: correct storage Instance field to match node.Instance in mock data Previously, storage Instance fields were set to `fmt.Sprintf("pve-%s", node.Name)`, creating values like "pve-pve1" that didn't match the parent node's Instance field ("mock-cluster"). This caused storage filtering and counting to fail when matching by instance + node, similar to the backup/snapshot issue fixed earlier. Changes: - Set storage.Instance = node.Instance for local storage (generator.go:862) - Set storage.Instance = node.Instance for local-zfs storage (generator.go:909) - Set storage.Instance = node.Instance for random storage (generator.go:934) - PBS storage already correctly used node.Instance (generator.go:969) This ensures storage counts display correctly on the Storage tab node summary cards and that filtering by instance + node works consistently across all resource types. Note: This is part of the broader pattern fix where all resources must match by both instance AND node name to handle duplicate hostnames across clusters correctly.	2025-10-01 18:30:57 +00:00
rcourtman	1fc905efdd	fix: add Instance field to mock backup and snapshot generation Previously, mock-generated backups and snapshots had empty Instance fields, causing the backups tab node summary counts to show 0. The frontend filters backups by both instance and node name (b.instance === node.instance && b.node === node.name), but without the Instance field populated, no matches were found. Changes: - Set Instance field on VM backups (generator.go:1030) - Set Instance field on container backups (generator.go:1068) - Set Instance field on VM snapshots (generator.go:1323) - Set Instance field on container snapshots (generator.go:1355) This ensures node backup counts display correctly across all tabs.	2025-10-01 18:21:09 +00:00
rcourtman	54a9c8f7d1	fix: complete node.id to instance+name matching across all tabs Fixed remaining instances where resources were matched using node.id instead of matching by both instance and node name: - Dashboard.tsx: VM and container counts in grouped view - UnifiedNodeSelector.tsx: Backup and snapshot counts This ensures all tabs (Dashboard, Storage, Backups) correctly count resources for nodes in mock mode where node.id format differs from instance format.	2025-10-01 18:13:30 +00:00
rcourtman	5100a8f335	fix: improve node matching in summary table with dual field comparison Match resources by both instance and node name for more robust duplicate hostname handling in the node summary cards.	2025-10-01 18:10:58 +00:00
rcourtman	dc561f009f	feat: improve ESC key reset behavior in dashboard Make ESC key a complete reset button that clears all active filters: - First press: Clears search, sorting, node selection, view mode, and status mode - Second press: Toggles filter section visibility (collapse/expand) This provides a quick way to reset the entire dashboard view to defaults.	2025-10-01 18:05:50 +00:00
rcourtman	a3a7ef1a56	chore: improve mock data realism for metrics Adjust mock data generation to produce more realistic resource usage patterns: - VMs: Lower typical CPU usage (0-25%), mean reversion toward 15% - Containers: Even lower CPU (0-12%), mean reversion toward 8% - Memory: More realistic distribution with mean reversion - Metrics updates: Smaller fluctuations with natural mean reversion - I/O patterns: Less frequent changes for more stability	2025-10-01 17:24:13 +00:00
rcourtman	3253c3bdd3	fix: correct instance vs node.name usage for duplicate hostname support Fix remaining issues where node.name was used for counting/grouping instead of instance/node.id, causing incorrect counts with duplicate hostnames. Changes: - NodeSummaryTable: Use node.id for counting VMs/containers/storage/disks - UnifiedNodeSelector: Use node.id for counting backups and snapshots - DiskList: Display node.name in empty state message instead of node ID The pattern is now consistent: - User-facing filtering: use node.name (what users see/search) - Counting/grouping: use instance/node.id (handles duplicates) - Display: convert node.id to node.name for readability	2025-10-01 17:14:03 +00:00
rcourtman	62c7aa19d1	fix: restore node row filtering functionality across all tabs Fix node summary card row selection to properly filter the tables below by implementing independent selection state instead of search box filters. Changes: - Clicking a node row now filters by node name (visual selection state) - Selection is independent of search, allowing both to work together - Toggle selection by clicking the same row again - Clear selection with ESC key - Fixes filtering in Dashboard, Storage, and Backups tabs The previous implementation had a mismatch between node IDs and instance fields. Now using simple node.name matching for reliable filtering.	2025-10-01 17:08:01 +00:00
rcourtman	fe01b72541	Refine search tips popovers	2025-10-01 16:57:43 +00:00
rcourtman	03f823868d	feat: refactor search tips into reusable popover component - Extract SearchTipsPopover as shared component - Improved visual design with better typography and spacing - Consistent search help across Dashboard, Backups, and Storage tabs - Better UX with clickable button instead of hover-only tooltip	2025-10-01 16:44:01 +00:00
rcourtman	abd0b67faa	fix: correct node summary counts for VMs, containers, storage, and backups	2025-10-01 16:40:38 +00:00
rcourtman	35c08b9066	fix: remove 'guests' from search placeholder	2025-10-01 16:33:33 +00:00
rcourtman	0e97303431	feat: consistent filter UX across all tabs - Deselectable radio toggles on all filter tabs - Blue reset button when filters are active - Clean search placeholders with help tooltips - Working tooltips with proper styling on Dashboard tab - Better placeholder text: "Search or filter guests..."	2025-10-01 16:33:18 +00:00
rcourtman	a236244730	feat: improve dashboard filter toggles UX - Make radio toggles deselectable by clicking active option - Reset button turns blue when filters are active - Add auto-start hot-dev in development environment	2025-10-01 16:23:01 +00:00
rcourtman	f8b0d21c32	chore: add claude.md to .gitignore	2025-10-01 15:54:48 +00:00
rcourtman	31317738be	chore: remove claude.md from repository	2025-10-01 15:54:43 +00:00
rcourtman	49311b1e39	fix: resolve multiple issues from #485 This commit addresses all issues reported in GitHub issue #485: 1. SMART Status Recognition - Fix disk health check to accept both "PASSED" and "OK" status - Previously only "PASSED" was recognized as healthy - Location: internal/monitoring/monitor.go:1255 2. ZFS Spare Device False Alerts - Skip ZFS SPARE devices unless they have actual errors - SPARE devices are intentional and should not trigger alerts - Updated in two locations: - pkg/proxmox/zfs.go:154 (device filtering) - internal/alerts/alerts.go:1077 (alert generation) 3. Memory Display Granularity - Increase byte formatting precision from 0 to 1 decimal place - Improves accuracy (e.g., "1.7 GB" instead of "1 GB" for 86% of 2GB) - Location: frontend-modern/src/utils/format.ts:3 4. Custom Alert Rules Evaluation - Add ReevaluateGuestAlert() method for proper threshold reevaluation - Add comments explaining custom rules evaluation limitations - Next poll cycle will properly clear stale alerts with new thresholds Additional improvements: - Fix ZFS pool alert locking to prevent deadlocks - Prevent discovery service from running in mock mode - Restore discovery service when exiting mock mode Fixes #485	2025-10-01 15:53:42 +00:00
rcourtman	b0f68933dd	docs: clarify SSH temperature usage	2025-10-01 15:26:00 +00:00
rcourtman	fa2656c8f0	docs: clarify SSH temperature usage	2025-10-01 15:23:41 +00:00
rcourtman	bd9c6444d6	Handle string wearout values from Proxmox disks	2025-10-01 15:06:35 +00:00
rcourtman	dc065e75f7	fix: auto-resolve alerts when thresholds increase Fixes #484 When users increase alert thresholds (either global defaults or resource-specific overrides), active alerts are now automatically re-evaluated and resolved if the current metric value is below the new threshold. Previously, alerts would remain active even after increasing the threshold above the current value, requiring manual resolution or waiting for the metric to drop below the original threshold and then rise again. Changes: - Add reevaluateActiveAlertsLocked() method to check all active alerts against updated thresholds - Call re-evaluation automatically in UpdateConfig() - Resolve alerts when current value is below new trigger/clear threshold - Handle all resource types: guests (qemu/lxc), nodes, PBS, storage - Add comprehensive unit tests for threshold update scenarios	2025-10-01 15:02:27 +00:00
rcourtman	a6e5a24a77	fix: load mock.env files during config initialization Ensures PULSE_MOCK_MODE environment variable is set before the mock package's init() function runs. This allows mock mode to work correctly when enabled via mock.env or mock.env.local files without requiring an explicit environment variable to be set at startup.	2025-10-01 14:45:52 +00:00
rcourtman	42f2213932	fix: correct mock node ID format to match real system Fix mock node IDs to use instance-nodename format (e.g., 'mock-cluster-pve1') instead of 'node/pve1' format. This matches the real system ID format used at monitoring/monitor.go:936 and fixes the grouped/list toggle in the dashboard. Before: - Clustered: node/pve1, node/pve2, etc. - Standalone: node/standalone1, node/standalone2 After: - Clustered: mock-cluster-pve1, mock-cluster-pve2, etc. - Standalone: standalone1-standalone1, standalone2-standalone2 This allows the dashboard grouping logic to properly match nodes by instance and display them correctly in grouped view.	2025-10-01 13:47:59 +00:00
rcourtman	1c2431fcf6	refactor: add mock.env to repository with local override support Make mock mode configuration part of the repository instead of a local-only file. This ensures consistent mock mode behavior across all environments (development, CI/CD, demo server) and makes it work out of the box for new contributors. Changes: - Add mock.env to repository with sensible defaults (mock mode OFF by default) - Support mock.env.local for personal overrides (gitignored) - Update .gitignore to allow mock.env but exclude .local variants - Backend loads mock.env then merges mock.env.local overrides - hot-dev.sh loads both files in correct order Benefits: - New developers can clone and use mock mode immediately - Demo server gets consistent mock configuration - Personal preferences stay private in .local file - No surprises - mock mode disabled by default in fresh clones - CI/CD can use mock mode without custom configuration Documentation: - Updated README.md to explain mock.env is in repo - Enhanced MOCK_MODE.md with local override instructions - Updated claude.md with new configuration strategy - Added mock.env.local.example for quick setup Example workflow: git clone <repo> npm run mock:on # Works immediately with repo defaults # Or create personal config: cp docs/development/mock.env.local.example mock.env.local # Edit mock.env.local with your preferences	2025-10-01 13:38:39 +00:00
rcourtman	6f2b6268a4	perf: optimize mock mode state retrieval and JSON encoding Improve performance when serving /api/state in mock mode by optimizing alert handling and JSON serialization. Changes: - Add UpdateAlertSnapshots() to cache alerts without blocking - Use lazy population of alert snapshots to avoid lock contention - Switch to json.Marshal for better performance with large payloads - Add debug logging to track /api/state performance - Simplify GetState() logic in mock mode Performance improvements: - Eliminates alert manager lock during /api/state requests - Reduces JSON encoding overhead for large mock datasets - Ensures sub-second response times even with 7 nodes and 90+ guests Testing: - Mock mode returns state instantly without blocking - Alert snapshots populate correctly on first request - Debug logs confirm fast execution path	2025-10-01 13:35:49 +00:00
rcourtman	67fc5977d1	feat: add hot-reloadable mock mode with auto-detection Implement a hot-reloadable mock mode system that works seamlessly in both development and production environments without requiring manual restarts or port changes. Key Features: - Backend watches mock.env and auto-reloads when changed (via fsnotify + polling) - npm commands for easy toggling: mock:on, mock:off, mock:status, mock:edit - Works in both hot-dev mode and systemd deployments - Reload completes in 2-5 seconds with no manual intervention - No port changes or process restarts required Implementation: - Extended ConfigWatcher to monitor both .env and mock.env - Added callback system to trigger ReloadableMonitor.Reload() - Enhanced toggle-mock.sh to support both hot-dev and systemd modes - Updated hot-dev.sh banner to show mock status and commands - Created comprehensive documentation in docs/development/MOCK_MODE.md Testing: - Backend builds successfully - Watcher initializes and monitors both files - npm run mock:on/off toggles successfully - mock.env updates correctly - Scripts work in both hot-dev and systemd modes Documentation: - Added Mock Mode section to README.md - Created detailed guide in docs/development/MOCK_MODE.md - Updated claude.md with mock mode architecture and usage Mock mode continues to return cached data instantly from memory (no API calls, no locks, no timeouts), ensuring fast /api/state responses.	2025-10-01 13:35:17 +00:00
rcourtman	f30e57e36d	feat: add GitHub Actions workflow to auto-update demo server on release	2025-10-01 11:34:53 +00:00

1163 commits
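
Commit b8ec92858b lists the production ID formats: guest IDs are "node-vmid" on standalone nodes and "instance-node-vmid" in clusters, storage IDs are always "instance-node-name", and node IDs are always "instance-node". A rough sketch of those rules follows. The function names are hypothetical, and the test for "standalone" (instance name equal to node name) is an assumption inferred from the `standalone1-standalone1` node IDs in commit 42f2213932. The real check in monitor.go may differ.

```go
package main

import "fmt"

// nodeID builds the "instance-node" form described in the commit message.
func nodeID(instance, node string) string {
	return fmt.Sprintf("%s-%s", instance, node)
}

// storageID builds the "instance-node-name" form.
func storageID(instance, node, name string) string {
	return fmt.Sprintf("%s-%s-%s", instance, node, name)
}

// guestID builds "node-vmid" for standalone nodes and
// "instance-node-vmid" for clustered ones.
// ASSUMPTION: a node is treated as standalone when instance == node.
func guestID(instance, node string, vmid int) string {
	if instance == node {
		return fmt.Sprintf("%s-%d", node, vmid)
	}
	return fmt.Sprintf("%s-%s-%d", instance, node, vmid)
}

func main() {
	fmt.Println(nodeID("mock-cluster", "pve1"))
	fmt.Println(storageID("mock-cluster", "pve1", "local"))
	fmt.Println(guestID("mock-cluster", "pve1", 100))
	fmt.Println(guestID("standalone1", "standalone1", 100))
}
```

Keeping the three builders in one place is one way to avoid the mock-versus-production drift that the commit fixes, since both the generator and any fallback code would share the same formatting.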