Commit graph

950 commits

Author SHA1 Message Date
Pulse Monitor
ce6a76a0f9 fix: preserve PBS alert thresholds when updating node configuration (addresses #440)
When updating PBS nodes through the node configuration UI, alert thresholds
were being reset to defaults. This was because alert overrides are stored
separately from node configuration and weren't being preserved during node updates.

The fix ensures that when a node is updated, the alert configuration (including
any custom threshold overrides) is reloaded and preserved. This applies to both
PBS and PVE nodes to ensure consistent behavior.
2025-09-10 15:12:43 +00:00
Pulse Monitor
e731f954b3 fix: resolve PBS API permission errors and missing parameters (addresses #436)
- handle PBS node status endpoint permission errors gracefully (returns nil instead of error for 403s)
- add required cf and timeframe parameters to RRD endpoint calls
- properly handle nil nodeStatus returns in monitor.go

These API calls now fail silently, because PBS API tokens often lack the permissions these endpoints require; this is expected behavior.
2025-09-10 14:51:52 +00:00
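A minimal sketch of the 403 handling described in e731f954b3. The `NodeStatus` type and the `statusFromResponse` helper are hypothetical names for illustration, not taken from the Pulse source. It shows a forbidden response mapped to a nil status with a nil error, which is why callers have to check for nil before using the result.

```go
package main

import (
	"fmt"
	"net/http"
)

// NodeStatus is a hypothetical stand-in for the PBS node status payload.
type NodeStatus struct {
	CPU float64
}

// statusFromResponse maps an HTTP status code to a result.
// A 403 is treated as "no data available" rather than as an error,
// since PBS API tokens often lack permission for this endpoint.
func statusFromResponse(code int, decoded *NodeStatus) (*NodeStatus, error) {
	switch code {
	case http.StatusOK:
		return decoded, nil
	case http.StatusForbidden:
		return nil, nil
	default:
		return nil, fmt.Errorf("unexpected status %d", code)
	}
}

func main() {
	status, err := statusFromResponse(http.StatusForbidden, nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if status == nil {
		// The caller must handle the nil case explicitly.
		fmt.Println("node status unavailable, skipping")
		return
	}
	fmt.Println("cpu:", status.CPU)
}
```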
Pulse Monitor
670bf4665d fix: improve cluster detection reliability on first add (addresses #437)
- Add retry logic with delays to detectPVECluster function to handle API permission propagation
- Periodically re-check standalone nodes to detect if they're actually part of a cluster
- Increase timeout from 3 to 5 seconds for cluster detection attempts
- Skip retries for definitively standalone nodes (501 not implemented errors)

This addresses the issue where adding a PVE cluster doesn't detect it properly on first attempt,
requiring deletion and re-adding to work correctly. The retry mechanism gives time for
API permissions to fully propagate in Proxmox.
2025-09-10 14:39:01 +00:00
Pulse Monitor
13dc131cfc perf: load all guest metadata in single API call (addresses #398)
Instead of making individual API calls for each guest's metadata,
load all metadata once at the Dashboard level and pass it down as props.
This reduces hundreds of HTTP requests to just one when dealing with
large deployments.

With 800 guests, this changes from 800 individual requests to 1 batch request.
2025-09-10 13:30:55 +00:00
Pulse Monitor
6491c2da14 docs: correct alert description to reflect current capabilities (addresses #431)
The README previously claimed alerts for 'VMs go down' but currently only node down detection is implemented. Updated to accurately reflect that alerts are for nodes, not individual VMs/containers.
2025-09-10 13:23:24 +00:00
Pulse Monitor
402334671d fix: prevent QEMU guest agent errors from marking cluster nodes unhealthy (addresses #405)
The cluster client was incorrectly marking nodes as unhealthy when encountering
VM-specific QEMU guest agent errors. This caused storage and backup operations
to fail with "no healthy nodes available" even though the nodes were actually
accessible.

Changes:
- Added broader detection for guest agent errors in executeWithFailover
- Updated recovery logic to ignore VM-specific errors when recovering nodes
- Guest agent errors no longer affect node health status

This fixes the issue where users with clusters would see storage and backup
operations fail after any VM without a guest agent was queried.
2025-09-10 13:19:15 +00:00
Pulse Monitor
c668ac9c48 feat: add ~/.pulse marker file for Community Scripts compatibility
- Creates ~/.pulse marker file after successful install/update
- Addresses vhsdream's request in PR #7519
- Helps Community Scripts track that Pulse has been installed
- Improves compatibility between installation methods
2025-09-10 13:09:41 +00:00
Pulse Monitor
8bf06ee683 refactor: remove update command creation to avoid conflicts with Community Scripts
- Native installations no longer create /usr/local/bin/update
- Avoids conflicts with Community Scripts /bin/update command
- Community installations keep their update mechanism
- Native installations use: curl -fsSL ... | bash for updates
- Each installation method respects the other's update approach
2025-09-10 13:03:42 +00:00
Pulse Monitor
25ed6172a0 refactor: improve service name detection compatibility (addresses #430)
- Use centralized detectServiceName() function instead of duplicate logic
- Automatically detect whether system uses 'pulse' or 'pulse-backend' service
- Improves compatibility between official and community installer scripts
- Reduces confusion when users mix installation methods
2025-09-10 12:33:20 +00:00
Pulse Monitor
80e3ee311b fix: make physical disk progress bars consistent with rest of UI 2025-09-10 10:33:52 +00:00
Pulse Monitor
6f840834f1 feat: improve memory reporting accuracy using available memory (addresses #435)
- Calculate memory as (Total - Available) instead of raw Used value
- Excludes buffer/cache memory that Linux can reclaim when needed
- Prevents false alerts from Linux cache usage
- Falls back to traditional calculation on older Proxmox versions
- VMs already use FreeMem from guest agent when available
- Memory usage will appear lower but more accurate (e.g., 56% instead of 84%)
- Users may need to adjust alert thresholds accordingly
2025-09-10 10:17:07 +00:00
Pulse Monitor
8fbe53406a feat: improve memory reporting by using available memory instead of free (addresses #435)
- Add Available field to MemoryStatus struct to capture memory available for allocation
- Update node memory calculation to use Available memory when present
- This excludes reclaimable cache/buffers from the used memory calculation
- Provides more accurate memory pressure indication, avoiding false alerts
- Falls back to traditional used memory if Available field is missing (older Proxmox versions)
2025-09-09 21:35:09 +00:00
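The calculation described in 6f840834f1 and 8fbe53406a can be sketched as follows. The struct layout is assumed (only the `MemoryStatus` name and the `Available` field appear in the commit messages), and treating a zero `Available` as "not reported" is also an assumption.

```go
package main

import "fmt"

// MemoryStatus mirrors the fields named in the commit messages.
// The exact layout in Pulse is not shown, so this is an assumption.
type MemoryStatus struct {
	Total     uint64
	Used      uint64
	Available uint64 // assumed to be 0 when the API does not report it
}

// usedBytes returns Total - Available when Available is present and
// plausible, and falls back to the raw Used value otherwise.
func usedBytes(m MemoryStatus) uint64 {
	if m.Available > 0 && m.Available <= m.Total {
		return m.Total - m.Available
	}
	return m.Used
}

// usagePercent converts usedBytes into a percentage of Total.
func usagePercent(m MemoryStatus) float64 {
	if m.Total == 0 {
		return 0
	}
	return float64(usedBytes(m)) / float64(m.Total) * 100
}

func main() {
	// Numbers chosen to match the 84% vs 56% example in the commit message.
	withAvail := MemoryStatus{Total: 100, Used: 84, Available: 44}
	legacy := MemoryStatus{Total: 100, Used: 84}
	fmt.Printf("with Available: %.0f%%\n", usagePercent(withAvail))
	fmt.Printf("fallback:       %.0f%%\n", usagePercent(legacy))
}
```

Because reclaimable cache no longer counts as used, the same node reports a lower percentage, which is why existing alert thresholds may need adjusting.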
Pulse Monitor
4f45238f9c Revert "fix: properly handle 100% thresholds to disable alerts (addresses #434)"
This reverts commit ffb744d711.
2025-09-09 21:27:29 +00:00
Pulse Monitor
e169071dac fix: properly handle 100% thresholds to disable alerts (addresses #434)
When a threshold is set to 100%, it now effectively disables alerts for that metric.
This allows users to turn off specific alerts without disabling all alerts for a resource.
Also clears any existing alerts when threshold is changed to 100%.
2025-09-09 21:04:00 +00:00
Pulse Monitor
0270d7cecc fix: always query guest agent for running VMs to ensure accurate disk usage (addresses #414)
- Changed logic to always query guest agent when available, not just when disk is 0
- This fixes issue where Proxmox returns incorrect non-zero values from cluster/resources
- Guest agent data is now preferred over cluster/resources data for all running VMs
- Improved logging to show when we're replacing cluster data with guest agent data

This should resolve the issue reported by FaboulousSan where VMs were showing
host disk space instead of actual VM disk usage.
2025-09-09 17:32:39 +00:00
Pulse Monitor
131a6b3cf8 fix: improve VM disk monitoring to filter network shares and special filesystems (addresses #414)
- Add comprehensive filtering for network filesystems (NFS, CIFS, SMB, FUSE, 9p)
- Skip Docker volumes, snap mounts, and other special mountpoints
- Add detailed logging to track which filesystems are included/excluded
- Add sanity check to detect when reported disk is way larger than allocated
- Improve logging with GB values and more context for debugging

This should prevent Pulse from accidentally including host disk space or
network shares when calculating VM disk usage. Users can use the existing
diagnostics system in the UI to troubleshoot VM disk issues.
2025-09-09 17:05:35 +00:00
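A hedged sketch of the kind of filter 131a6b3cf8 describes. The filesystem type names and mountpoint prefixes below are illustrative guesses, not the lists used in Pulse, and `skipFilesystem` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// skipFilesystem reports whether a guest filesystem should be left out
// of VM disk totals. Both lists are illustrative assumptions.
func skipFilesystem(fsType, mountpoint string) bool {
	t := strings.ToLower(fsType)
	switch {
	case strings.HasPrefix(t, "nfs"),
		strings.HasPrefix(t, "fuse"),
		t == "cifs", t == "smb", t == "smbfs", t == "9p":
		return true // network or passthrough filesystem
	}
	for _, prefix := range []string{"/snap/", "/var/lib/docker/"} {
		if strings.HasPrefix(mountpoint, prefix) {
			return true // special mountpoint
		}
	}
	return false
}

func main() {
	samples := [][2]string{
		{"ext4", "/"},
		{"nfs4", "/mnt/media"},
		{"squashfs", "/snap/core/1234"},
	}
	for _, s := range samples {
		fmt.Printf("%-8s %-18s skip=%v\n", s[0], s[1], skipFilesystem(s[0], s[1]))
	}
}
```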
Pulse Monitor
c9dbae467a chore: add dev environment files to gitignore 2025-09-09 07:23:13 +00:00
Pulse Monitor
ff040b53f8 chore: remove development environment files from repository 2025-09-09 07:20:36 +00:00
Pulse Monitor
b6cfe474b7 chore: remove remaining test files and scripts from repository 2025-09-09 07:07:10 +00:00
Pulse Monitor
64f7253ece chore: clean up repository - remove test, backup and local dev files 2025-09-08 22:02:05 +00:00
Pulse Monitor
6bdad4073e remove: disk health summary widget from dashboard per user request 2025-09-08 21:51:37 +00:00
Pulse Monitor
a1a078dc26 fix: prevent store mutation error by creating array copy before sorting 2025-09-08 21:50:02 +00:00
Pulse Monitor
f3af1f7efa refactor: redesign disk list as compact table to match Pulse's condensed data style 2025-09-08 21:45:33 +00:00
Pulse Monitor
8a6e3755a6 fix: replace emoji indicators with proper status badges in disk monitoring UI 2025-09-08 21:38:14 +00:00
Pulse Monitor
55cde3edc3 fix: add missing physicalDisks handler to WebSocket store (addresses #429)
The disk monitoring backend was working, but the frontend wasn't updating because the WebSocket store was missing the handler for physicalDisks data. Also added the physicalDisks count to broadcast logging for better debugging.
2025-09-08 20:17:10 +00:00
Pulse Monitor
6289a0b5e9 feat: add mock disk data generation for testing UI 2025-09-08 20:15:49 +00:00
Pulse Monitor
76cc140b39 fix: add physicalDisks to WebSocket store initial state 2025-09-08 20:06:34 +00:00
Pulse Monitor
f5334fc413 feat: enhance disk monitoring UI with dashboard widget and node badges (addresses #429)
- Added DiskHealthSummary widget to dashboard showing:
  - Total disk health status overview
  - Healthy/failing/low-life disk counts
  - Average SSD life remaining with visual bar
  - Distribution of disks across nodes
- Added disk count badges to node selector in storage tab
- Shows disk counts next to storage pools count per node
- Webhook notifications automatically trigger for disk alerts via existing system
- Dashboard widget highlights issues with color-coded status indicators
2025-09-08 16:59:01 +00:00
Pulse Monitor
cb08dd85d4 feat: implement S.M.A.R.T. disk monitoring for Proxmox nodes (addresses #429)
- Added disk polling to monitoring cycle using Proxmox API
- Created CheckDiskHealth() alert manager for failing drives and low SSD life
- Added PhysicalDisk model to state with proper serialization
- Implemented DiskList component with health indicators and SSD wearout bars
- Added Physical Disks tab to Storage page with toggle between pools and disks
- Added ZFS health badges to storage cards for degraded/failed pools
- Alerts trigger for health != PASSED and SSD wearout < 10%
- Frontend displays disk model, type, temperature, and usage information
2025-09-08 16:40:05 +00:00
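The alert rule stated in cb08dd85d4 (health != PASSED, SSD wearout < 10%) might be expressed as below. The `PhysicalDisk` fields and the `diskAlerts` helper are hypothetical, and the sketch assumes `Wearout` holds the percentage of SSD life remaining, with a negative value meaning "not reported".

```go
package main

import "fmt"

// PhysicalDisk is a reduced, hypothetical version of the model named
// in the commit message.
type PhysicalDisk struct {
	DevPath string
	Health  string // S.M.A.R.T. health string, e.g. "PASSED"
	Wearout int    // assumed: percent of SSD life remaining, -1 if unknown
}

// minWearout is the threshold from the commit message (10%).
const minWearout = 10

// diskAlerts returns one message per rule the disk violates.
func diskAlerts(d PhysicalDisk) []string {
	var alerts []string
	if d.Health != "PASSED" {
		alerts = append(alerts, fmt.Sprintf("%s: health is %s", d.DevPath, d.Health))
	}
	if d.Wearout >= 0 && d.Wearout < minWearout {
		alerts = append(alerts, fmt.Sprintf("%s: only %d%% SSD life remaining", d.DevPath, d.Wearout))
	}
	return alerts
}

func main() {
	disks := []PhysicalDisk{
		{DevPath: "/dev/sda", Health: "PASSED", Wearout: 87},
		{DevPath: "/dev/sdb", Health: "FAILED", Wearout: 4},
	}
	for _, d := range disks {
		for _, a := range diskAlerts(d) {
			fmt.Println(a)
		}
	}
}
```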
Pulse Monitor
820ee6499d fix: improve cluster detection to handle qdevice configurations (addresses #428)
- Add API validation for cluster nodes to filter out qdevice VMs
- Only include nodes with working Proxmox APIs in cluster endpoints
- Prevent connection failures when cluster has non-Proxmox participants
- Add detailed logging for cluster node validation process

This resolves issues where Proxmox clusters using corosync qdevice
(external quorum device) would fail to connect because Pulse tried
to connect to the qdevice VM which has no Proxmox API.
2025-09-07 21:19:30 +00:00
Pulse Monitor
95987141b9 fix: improve guest URL validation and error handling (addresses #427)
- Add client-side URL validation with instant feedback
- Show validation errors inline below URL input fields
- Prevent saving when URLs have validation errors
- Improve error message extraction in API client
- Handle incomplete URLs like 'https://emby.' gracefully
- Backend already had validation, now frontend shows it properly
2025-09-07 14:27:03 +00:00
Pulse Monitor
eab4c07986 fix: improve error handling for guest URL saving (addresses #427)
- Add more specific error messages when metadata save fails
- Better handling of permission and disk space errors
- This should help diagnose why guest URLs fail to save in some cases
- The atomic write operation was already in place but errors weren't clear
2025-09-07 14:05:22 +00:00
Pulse Monitor
00338fb8f6 fix: completely bypass auth for development environment
- Skip auth check entirely in App.tsx for development
- Add .env.dev file with DISABLE_AUTH=true and PULSE_MOCK_MODE=true
- Update hot-dev.sh to load .env.dev environment variables
- This ensures the app loads immediately without auth issues
- WebSocket and API now work without authentication in dev mode
2025-09-07 13:49:17 +00:00
Pulse Monitor
b3fb480ffb fix: resolve app loading issue by bypassing auth check hang
- Changed initial isLoading state to false to prevent infinite loading
- Initialize WebSocket store immediately on component mount
- Added error handling and debug logging to identify issues
- Added 10-second timeout fallback for auth checks
- The auth check was hanging, preventing the app from ever loading
2025-09-07 12:45:49 +00:00
Pulse Monitor
b571f03269 fix: resolve TypeScript errors in ResourceTable component
- Fixed Resource interface to properly define all used properties
- Added proper optional chaining for potentially undefined values
- Fixed displayValue to always return a number type
- Properly handle undefined thresholds and defaults in event handlers
- Fixed input value handling to work with strict TypeScript checks
2025-09-07 07:52:51 +00:00
Pulse Monitor
e0260cb0d1 fix: resolve PBS alert toggle and offline alert issues (addresses #426)
- Fixed PBS alert toggle not responding in thresholds settings
- PBS servers now use connectivity toggle like nodes instead of disabled toggle
- Added support for disableConnectivity flag on PBS instances in backend
- Fixed PBS ID format mismatch between frontend and backend
- PBS offline alerts now properly respect the disableConnectivity setting
- Prevents spam alerts by checking disableConnectivity flag for PBS offline alerts
2025-09-07 07:13:56 +00:00
Pulse Monitor
e4e4f515c7 fix: resolve VM disk monitoring issues (addresses #414, #416, #425)
- Always query guest agent for running VMs instead of only when disk is 0
- Add duplicate mount point detection to prevent inflated disk totals
- Show allocated disk size as fallback when guest agent unavailable
- Add comprehensive logging for guest agent disk queries
- Include diagnostic script for troubleshooting VM disk issues
2025-09-06 19:59:25 +00:00
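For the duplicate mount point detection in e4e4f515c7, a minimal sketch is to count each mountpoint once while summing. The `Filesystem` type and `sumUnique` helper are hypothetical, and keying on the mountpoint string is an assumption; a real implementation may key on the underlying device instead.

```go
package main

import "fmt"

// Filesystem is a hypothetical, reduced view of one guest agent
// filesystem entry.
type Filesystem struct {
	Mountpoint string
	TotalBytes uint64
	UsedBytes  uint64
}

// sumUnique adds up total and used bytes, counting each mountpoint
// only once so repeated entries cannot inflate the result.
func sumUnique(fss []Filesystem) (total, used uint64) {
	seen := make(map[string]bool, len(fss))
	for _, fs := range fss {
		if seen[fs.Mountpoint] {
			continue // duplicate entry, already counted
		}
		seen[fs.Mountpoint] = true
		total += fs.TotalBytes
		used += fs.UsedBytes
	}
	return total, used
}

func main() {
	fss := []Filesystem{
		{Mountpoint: "/", TotalBytes: 100, UsedBytes: 40},
		{Mountpoint: "/", TotalBytes: 100, UsedBytes: 40}, // reported twice
		{Mountpoint: "/data", TotalBytes: 200, UsedBytes: 50},
	}
	total, used := sumUnique(fss)
	fmt.Println(total, used)
}
```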
Pulse Monitor
3abd6c43ba chore: bump version to v4.15.0-rc.2 2025-09-06 19:52:56 +00:00
Pulse Monitor
5325ef481e fix: comprehensive VM disk usage reporting improvements (addresses #414, #416, #348, #367, #425)
- Always query guest agent for running VMs (cluster/resources API always returns 0)
- Show allocated disk size when guest agent unavailable (instead of misleading 0%)
- Fix duplicate mount point counting issue (#425)
- Add comprehensive logging for guest agent queries
- Include diagnostic script for troubleshooting VM disk issues
- Update both monitor.go and monitor_optimized.go for consistency
2025-09-06 19:52:11 +00:00
Pulse Monitor
11541a1f6d fix: resolve TypeScript type comparison errors in backup VMID handling 2025-09-06 12:42:16 +00:00
Pulse Monitor
bb9bb9371f chore: bump version to v4.15.0-rc.1 2025-09-06 12:39:26 +00:00
Pulse Monitor
dda66c4cd3 security: fix path traversal and malformed token handling vulnerabilities
- Prevent path traversal attacks by cleaning and validating URL paths
- Use secure token comparison to prevent timing attacks
- Return appropriate HTTP status codes for different attack vectors
- Add comprehensive logging for security events
2025-09-06 12:38:46 +00:00
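The two hardening measures in dda66c4cd3 can be illustrated with Go standard library calls. This is a sketch under assumptions, not the Pulse implementation: `tokensEqual` and `safePath` are hypothetical helpers, hashing both tokens before comparison is one common way to avoid leaking the token length, and `safePath` expects an already URL-decoded path.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
	"path"
	"strings"
)

// tokensEqual compares two tokens in constant time. Hashing first gives
// both inputs the same length, so the comparison does not depend on the
// length of the secret.
func tokensEqual(got, want string) bool {
	g := sha256.Sum256([]byte(got))
	w := sha256.Sum256([]byte(want))
	return subtle.ConstantTimeCompare(g[:], w[:]) == 1
}

// safePath rejects any path containing a ".." segment and otherwise
// returns a cleaned, rooted version of it.
func safePath(p string) (string, bool) {
	for _, seg := range strings.Split(p, "/") {
		if seg == ".." {
			return "", false
		}
	}
	return path.Clean("/" + p), true
}

func main() {
	fmt.Println(tokensEqual("secret-token", "secret-token"))
	fmt.Println(tokensEqual("guess", "secret-token"))
	fmt.Println(safePath("/api//state/"))
	fmt.Println(safePath("/api/../../etc/passwd"))
}
```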
Pulse Monitor
961d9c81e3 fix: restore guest agent disk stats in optimized monitor (addresses #414)
The parallel optimization introduced in commit 634e0dd37 accidentally removed
all guest agent filesystem fetching logic from the optimized monitor code.
This caused VMs with guest agents to show no disk stats after v4.12.1.

Added back the guest agent fetching logic to pollVMsWithNodesOptimized:
- Fetches filesystem info when VM disk stats are 0
- Aggregates disk usage from all valid filesystems
- Skips special filesystems and Windows System Reserved partitions
- Uses guest agent data when available to show accurate disk usage

This restores disk stats display for VMs with working QEMU guest agents.
2025-09-06 11:17:04 +00:00
Pulse Monitor
5615662d9e feat: complete ZFS pool monitoring implementation (addresses #423)
- Implement proper API integration with list and detail endpoints
- Add ZFS pool and device status conversion
- Enable by default with PULSE_DISABLE_ZFS_MONITORING opt-out
- Test with real Proxmox nodes and verify functionality
- Add comprehensive error handling and logging
- Document feature configuration and requirements

The feature now properly:
- Fetches ZFS pool status from Proxmox API
- Detects degraded/faulted pools and devices
- Tracks read/write/checksum errors
- Generates appropriate alerts
- Displays issues in the Storage tab UI

Tested and verified working with real Proxmox clusters.
2025-09-06 10:56:17 +00:00
Pulse Monitor
9582afc0b1 fix: comprehensive PMG backup detection with debug mode (addresses #359)
- Added debug mode: localStorage.setItem('debug-pmg', 'true')
- Robust VMID=0 detection handles string and number types
- Debug logging shows exactly what's happening with PMG backups
- Created test suite that verifies all PMG backup scenarios
- All test cases pass including PBS 'ct' type with VMID='0'

Users experiencing issues can enable debug mode to help diagnose:
1. Open browser console
2. Run: localStorage.setItem('debug-pmg', 'true')
3. Reload page and check for [PMG Debug] messages
4. Share debug output if still showing as LXC

Test results:
✓ PBS PMG backup (ct type with VMID 0) → Host
✓ PBS PMG backup (ct type with numeric VMID 0) → Host
✓ Storage PMG backup (host type) → Host
✓ Storage PMG backup (lxc type with VMID 0) → Host
✓ Regular LXC backup → LXC
2025-09-06 10:49:20 +00:00
Pulse Monitor
70ee42468b fix: make PMG backup detection more robust for VMID=0 (addresses #359)
- Handle VMID as both string and number types consistently
- Check for both 'ct' and 'lxc' backup types (PBS uses 'ct')
- Check for both 'vm' and 'qemu' backup types for consistency
- Always check VMID=0 first before checking backup type
- PBS stores PMG backups as 'ct' type with VMID='0' (string)

This should properly identify all PMG host config backups regardless
of whether they come from PBS or regular storage, and regardless
of whether VMID is a string or number.
2025-09-06 10:44:01 +00:00
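The detection order described in 70ee42468b (check VMID=0 first, then the backup type) is sketched below. The actual logic lives in the frontend; this is a Go transliteration for illustration, with hypothetical function names, and the "VM" and "Unknown" labels are assumptions since the commit messages only show "Host" and "LXC".

```go
package main

import (
	"fmt"
	"strings"
)

// isVMIDZero accepts the VMID as a string or a number, since PBS is
// reported to send "0" as a string while other sources send a number.
func isVMIDZero(vmid any) bool {
	switch v := vmid.(type) {
	case string:
		return strings.TrimSpace(v) == "0"
	case int:
		return v == 0
	case float64: // JSON numbers decode to float64
		return v == 0
	}
	return false
}

// backupKind checks VMID=0 before looking at the backup type, so a PMG
// host config backup stored as type "ct" is still classified as Host.
func backupKind(vmid any, backupType string) string {
	if isVMIDZero(vmid) {
		return "Host"
	}
	switch backupType {
	case "ct", "lxc":
		return "LXC"
	case "vm", "qemu":
		return "VM"
	case "host":
		return "Host"
	}
	return "Unknown"
}

func main() {
	fmt.Println(backupKind("0", "ct"))  // PBS PMG backup, string VMID
	fmt.Println(backupKind(0, "lxc"))   // storage PMG backup, numeric VMID
	fmt.Println(backupKind(101, "lxc")) // regular container backup
}
```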
Pulse Monitor
8de7ca69ec debug: add logging to diagnose PMG backup detection issue #359
Added console.log statements to understand why PMG backups with VMID=0
are still showing as LXC in v4.14.0. This will help identify:
- What data type vmid is (string vs number)
- What backup type is being sent
- Whether the checks are being triggered
2025-09-06 10:43:28 +00:00
Pulse Monitor
a7647acc34 fix: make ZFS monitoring experimental and opt-in
- Add PULSE_ENABLE_ZFS_MONITORING env var (disabled by default)
- Fix API field mapping (health vs state, cksum vs checksum)
- Add proper API endpoint structures for list and detail
- Mark feature as experimental due to API complexity
- Simplify conversion to handle basic health status only

This is a safer approach until we can fully test with real Proxmox nodes
2025-09-06 10:41:49 +00:00
Pulse Monitor
c58be6878e feat: add ZFS pool status monitoring (addresses #423)
- Add ZFS pool status data structures to models
- Implement ZFS pool data collection via Proxmox API
- Add ZFS pool health alerts for degraded/faulted states
- Add ZFS device error detection and alerting
- Display ZFS pool status in Storage tab when issues detected
- Add mock data generation for testing ZFS monitoring
- Alert on read/write/checksum errors for pools and devices
2025-09-06 10:35:53 +00:00
Pulse Monitor
160a6eb957 fix: properly identify PMG host config backups (VMID=0) as Host type (addresses #359) 2025-09-06 10:35:02 +00:00