64 commits

Author SHA1 Message Date
Pulse Monitor
5d22ef6a0c fix: integrate mock data system with monitoring
Mock mode now properly returns simulated data including PMG host backups.
The monitor's GetState() method now checks for mock mode and returns
mock data when enabled, allowing full testing of UI features without
real Proxmox nodes.
2025-08-27 19:01:51 +00:00
Pulse Monitor
3995a7dacb fix: properly identify PMG host config backups
Addresses #359 - PMG host config backups with VMID=0 are now correctly
identified as "Host" type instead of being misidentified as LXC containers.
Added purple color scheme for Host type backups in the UI.
2025-08-27 18:52:27 +00:00
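The classification rule described in this commit can be sketched roughly as follows. This is an illustration only, not the project's code: the function name backupType, the guestKind parameter and its "qemu" value are assumptions. Only the rule that VMID 0 denotes a host config backup and the "Host" label come from the commit message.

```go
package main

import "fmt"

// backupType labels a backup for display. Only the "VMID 0 means Host" rule
// is taken from the commit message; the remaining names and values are
// hypothetical.
func backupType(vmid int, guestKind string) string {
	if vmid == 0 {
		// PMG host config backups carry VMID 0 and are not guests at all.
		return "Host"
	}
	if guestKind == "qemu" {
		return "VM"
	}
	return "LXC"
}

func main() {
	fmt.Println(backupType(0, "lxc"))
	fmt.Println(backupType(101, "qemu"))
}
```

Checking the VMID before the guest kind is what keeps a host backup from falling through to the LXC default.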
Pulse Monitor
daf58db4c3 fix: prevent shared storage conflicts between clusters (addresses #355)
When multiple clusters were added, shared storage from different clusters
would use the same ID (e.g., 'shared-local'), causing storage from one
cluster to overwrite storage from another. Now using instance-specific IDs
for shared storage to ensure each cluster's storage is properly tracked.
2025-08-27 14:26:42 +00:00
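A minimal sketch of the idea, assuming a hypothetical ID format: the commit only states that shared storage IDs became instance-specific, so the helper name sharedStorageID and the "<instance>-shared-<storage>" layout are illustrative.

```go
package main

import "fmt"

// sharedStorageID prefixes the storage name with the instance (cluster) name
// so that two clusters exposing "local" no longer collide. The exact format
// is an assumption.
func sharedStorageID(instance, storage string) string {
	return fmt.Sprintf("%s-shared-%s", instance, storage)
}

func main() {
	byID := map[string]string{}
	for _, inst := range []string{"cluster-a", "cluster-b"} {
		byID[sharedStorageID(inst, "local")] = inst
	}
	// Both clusters keep their own entry instead of overwriting each other.
	fmt.Println(len(byID))
}
```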
Pulse Monitor
3d0175a3bd fix: remove mock support from production builds
2025-08-27 13:50:39 +00:00
Pulse Monitor
8f4871b4a2 feat: enhance backup tab with useful metadata
- Changed deduplication display from ratio (14.4:1) to multiplier (14.4x)
- Added encryption indicators for PBS backups (lock icon)
- Added owner column showing who created each PBS backup
- Fixed owner display to use a separate column instead of being cramped next to the node name
- Added owner field to PBSBackup model and populated from PBS API

These improvements make it easier to understand backup status at a glance
2025-08-27 07:38:30 +00:00
Pulse Monitor
8e5b7db949 feat: implement comprehensive alert system for PBS and storage resources
- Add PBS alert monitoring (CPU, memory, offline detection)
- Add storage offline detection with proper cluster awareness
- Remove bulk toggle feature from thresholds UI (unnecessary complexity)
- Add enable/disable buttons for PBS servers in thresholds tab
- Fix storage offline detection to avoid false positives in clusters
  (only alert on truly offline storage, not inactive cluster storage)

Alert improvements:
- PBS instances now properly monitored like nodes
- Storage devices generate offline alerts with confirmation system
- All resource types support custom thresholds and disable toggles
- Consistent alert ID format across all resource types
- Proper hysteresis and confirmation counts to prevent flapping

addresses #123 (if there was an issue about missing PBS alerts)
2025-08-26 20:46:58 +00:00
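The combination of hysteresis and confirmation counts mentioned above can be illustrated with a small state machine. This is a sketch under assumptions: the type alertState, its field names and the example thresholds are hypothetical and not taken from the Pulse source.

```go
package main

import "fmt"

// alertState fires only after `confirmations` consecutive readings at or
// above `trigger`, and clears only once a reading drops to `clearAt` or
// below. The gap between trigger and clearAt is the hysteresis band.
type alertState struct {
	trigger, clearAt float64
	confirmations    int
	breaches         int
	active           bool
}

// observe feeds one reading into the state machine and reports whether the
// alert is active afterwards.
func (s *alertState) observe(v float64) bool {
	switch {
	case !s.active && v >= s.trigger:
		s.breaches++
		if s.breaches >= s.confirmations {
			s.active = true
		}
	case !s.active:
		s.breaches = 0 // streak broken before confirmation
	case v <= s.clearAt:
		s.active = false
		s.breaches = 0
	}
	return s.active
}

func main() {
	s := &alertState{trigger: 85, clearAt: 80, confirmations: 3}
	for _, v := range []float64{90, 90, 90, 82, 79} {
		fmt.Println(v, s.observe(v))
	}
}
```

A value that oscillates around the trigger never accumulates enough consecutive breaches to fire, and an active alert does not clear until the value leaves the band, which is what prevents flapping.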
Pulse Monitor
9bbf6757e3 feat: add PBS deduplication factor display to backup frequency chart
- capture deduplication_factor from PBS API datastore status endpoint
- display average deduplication ratio in backup frequency chart header
- shows as green 'Deduplication: X.X:1' when PBS datastores provide this data
2025-08-26 16:50:50 +00:00
Pulse Monitor
063cd670a6 feat: improve filter UX with full-width search fields and dynamic node summary filtering
- Remove max-width constraint on search fields to utilize available space
- Node summary table now updates based on search/filter criteria
- Only show nodes with matching guests when filtering is active
- Calculate node metrics based on filtered guests only
- Show matched guest count in node summary when filtering
- Provides better visual feedback on what the filters are affecting
2025-08-26 11:18:44 +00:00
Pulse Monitor
6988c2b0c8 feat: add mock data system for UI testing (partial integration)
- Created comprehensive mock data generator for nodes, VMs, containers
- Added toggle scripts for easy switching between real and mock mode
- Integrated with backend-watch.sh for auto-rebuild with mock support
- Modified monitor to skip polling when mock mode is enabled
- Added CLAUDE.md documentation for future sessions

Note: Mock system initializes but data isn't fully integrated with GetState() yet.
Currently shows mixed real + mock data. Works for UI testing purposes.
2025-08-26 07:56:15 +00:00
Pulse Monitor
da745b0d88 fix: correct VM disk monitoring documentation for PVE 9
TESTED AND CONFIRMED: API tokens CAN access guest agent data on PVE 9!
- Created test tokens and verified they work
- Guest agent API returns proper disk usage data
- The cluster/resources endpoint shows disk=0 but that's not what Pulse uses
- Pulse correctly fetches data via /nodes/{node}/qemu/{vmid}/agent/get-fsinfo

The earlier claim that this does not work on PVE 9 was wrong. It works when the token is configured with the PVEAuditor role, which includes the VM.GuestAgent.Audit permission.
2025-08-25 15:25:10 +00:00
Pulse Monitor
35cecd475a docs: provide honest assessment of PVE 9 VM disk monitoring
Stop making definitive claims about what works or doesn't work. The reality:
- Some users (like you) have it working fine in cluster configs
- Others report 0% disk usage
- The exact conditions that make it work are unclear
- Results vary between different setups

Updated all docs and messages to reflect this uncertainty rather than making false claims about non-existent workarounds or absolute limitations.
2025-08-25 15:20:34 +00:00
Pulse Monitor
6f86ad5b7f fix: correct the misinformation about PVE 9 VM disk monitoring
Previous advice was completely wrong. The facts:
- VM.Monitor permission doesn't exist in PVE 9 (was removed)
- It was replaced with VM.GuestAgent.Audit
- But even with correct permissions, API tokens CANNOT access guest agent data on PVE 9
- This is Proxmox bug #1373 with NO working workaround for API tokens
- Users must accept 0% VM disk usage on PVE 9 until Proxmox fixes it upstream

Updated all documentation and error messages to reflect this reality instead of giving false hope about non-existent workarounds.
2025-08-25 15:04:41 +00:00
Pulse Monitor
6fd96d7bed fix: remove misleading root@pam authentication advice
The root@pam suggestion doesn't actually work since it requires the Linux system root password, not a Proxmox-specific password. Most users don't know or have disabled their Linux root password for security.

Updated all documentation and error messages to correctly advise users to grant VM.Monitor permission to their API token user instead.
2025-08-25 14:59:37 +00:00
Pulse Monitor
7592c95021 fix: prevent duplicate node names in alert IDs for single-node setups
When the instance name equals the node name (common in single-node setups),
avoid generating redundant IDs like "pve-pve-100" by using just "pve-100".
This fixes alert acknowledgment issues where the UI couldn't match alert
IDs due to the duplicate node name pattern.

Addresses #353
2025-08-25 14:11:58 +00:00
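The ID rule in this commit can be sketched as below. The function name guestAlertID and the general "<instance>-<node>-<vmid>" layout are assumptions inferred from the "pve-pve-100" example; the real format may differ.

```go
package main

import "fmt"

// guestAlertID drops the instance prefix when it equals the node name, so a
// single-node setup yields "pve-100" rather than "pve-pve-100".
func guestAlertID(instance, node string, vmid int) string {
	if instance == node {
		return fmt.Sprintf("%s-%d", node, vmid)
	}
	return fmt.Sprintf("%s-%s-%d", instance, node, vmid)
}

func main() {
	fmt.Println(guestAlertID("pve", "pve", 100))
	fmt.Println(guestAlertID("homelab", "pve1", 100))
}
```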
Pulse Monitor
9dea067521 improve: add better diagnostics for guest agent issues
- Add verification steps for qemu-guest-agent service status
- Clarify that the service is socket-activated (it does not need systemctl enable)
- Add diagnostic commands users can run to verify agent is working
- Update FAQ with correct troubleshooting steps for agent issues

This helps users like @RLSinRFV who were trying to enable the service
when it's actually socket-activated and should start automatically.
2025-08-25 09:12:25 +00:00
Pulse Monitor
04809119d7 fix: correct VM disk monitoring guidance for PVE 8 users
The real issue for PVE 8 users seeing 0% disk usage:
- Users who added nodes BEFORE v4.7 don't have VM.Monitor permission
- The setup script always created tokens with privsep=0, so that wasn't the issue
- Solution: Re-run the setup script or manually add VM.Monitor permission

Updated error messages and documentation to reflect the actual cause
and provide the correct fix for users experiencing this issue.
2025-08-25 09:07:22 +00:00
Pulse Monitor
4675b5bf92 improve: clearer VM disk monitoring error messages (addresses #348, #344)
- Add detailed logging when VM disk monitoring fails due to permissions
- Explain Proxmox 9 limitation: API tokens cannot access guest agent data (PVE bug #1373)
- Explain Proxmox 8 requirements: VM.Monitor permission and privsep=0 for tokens
- Update setup script to show appropriate warnings for each PVE version
- Update FAQ with troubleshooting steps for 0% disk usage on VMs
- Log messages now clearly indicate workarounds for each scenario

The core issue: Proxmox 9 removed VM.Monitor permission and the replacement
permissions don't allow API tokens to access guest agent filesystem info.
This is a Proxmox upstream bug that affects their own web UI as well.

For users experiencing this issue:
- PVE 9: Use root@pam credentials or wait for Proxmox to fix upstream
- PVE 8: Ensure token has VM.Monitor and privsep=0
- All versions: QEMU guest agent must be installed in VMs
2025-08-25 09:00:40 +00:00
Pulse Monitor
6c4a931a65 fix: document PVE 9 VM disk monitoring limitation properly
addresses #348

After extensive testing and research:

CONFIRMED: This is a Proxmox 9 API limitation, not a configuration issue
- Guest agent get-fsinfo works when called as root (qm agent <vmid> get-fsinfo)
- API tokens CANNOT access this data even with VM.GuestAgent.Audit permission
- Proxmox's own web UI also shows 0% for VM disk usage (bug #1373)

Updated:
- Setup script now clearly explains this is a known Proxmox limitation
- Changed log level from Warn to Debug for permission errors (expected on PVE 9)
- Added references to Proxmox bug #1373

Workarounds for users:
1. Use root@pam credentials instead of API tokens for full VM disk monitoring
2. Container (LXC) disk usage works correctly with tokens
3. Wait for Proxmox to fix this upstream

The guest agent returns the data (total-bytes, used-bytes) but Proxmox's
API doesn't allow token access to it. This is not something we can fix
in Pulse - it needs to be addressed in Proxmox itself.
2025-08-24 22:44:16 +00:00
Pulse Monitor
2ae72e2490 fix: improve PVE 9 guest agent permissions handling
addresses #348

- Updated setup script to properly detect and handle Proxmox 9 where VM.Monitor was removed
- For PVE 9+, now creates custom role with Sys.Audit permissions (replaces VM.Monitor)
- Attempts to add VM.Agent or Sys.Modify permissions for better guest agent access
- Added better error logging to identify permission issues with guest agent API
- Warns users about PVE 9 permission requirements if disk usage shows 0%

The setup script now:
1. Properly detects PVE version using pveversion command
2. Creates appropriate roles based on PVE version (VM.Monitor for PVE 8, Sys.Audit for PVE 9)
3. Provides clear instructions if guest agent access still doesn't work
2025-08-24 22:24:34 +00:00
Pulse Monitor
4c4b89431d feat: add comprehensive diagnostics for VM guest agent disk usage issues
Improved logging to help users diagnose why VM disk usage might not be showing:
- Clearly identify when agent is enabled in config but not running in guest OS
- Detect timeout issues with unresponsive agents
- Log when agent returns no filesystem info
- Show which filesystems are included/excluded from calculations
- Distinguish between no agent, agent not running, and agent working

This will help users understand exactly why their VM disk usage isn't showing
and what steps they need to take to fix it (install qemu-guest-agent, restart
the service, etc).

addresses discussion #344
2025-08-24 08:04:13 +00:00
Pulse Monitor
2fd1eecc39 fix: VM disk usage not showing when QEMU Guest Agent is enabled
The agent field in Proxmox can have values other than just 0 or 1 when features are enabled, causing the strict equality check (== 1) to fail. Changed to check for any value > 0 to properly detect when the agent is enabled.

addresses discussion #344
2025-08-24 07:56:04 +00:00
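The change from a strict equality check to a greater-than check amounts to the following. The function name agentEnabled and the integer parameter are illustrative assumptions about how the value is represented.

```go
package main

import "fmt"

// agentEnabled treats any positive value of the agent field as "enabled".
// The previous check (agent == 1) missed configurations where the field
// carries a different positive value.
func agentEnabled(agent int) bool {
	return agent > 0
}

func main() {
	for _, v := range []int{0, 1, 2} {
		fmt.Println(v, agentEnabled(v))
	}
}
```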
Pulse Monitor
a67390d019 fix: make setup script endpoint public to address authentication errors
- Setup script no longer requires authentication (uses setup codes instead)
- Fixed discovery service not starting when toggled via settings
- Addresses #347 and discussion #344
2025-08-23 07:16:31 +00:00
Pulse Monitor
a8b7d2748e feat: encrypt webhook data at rest for improved security
Webhooks now stored encrypted (webhooks.enc) instead of plain text:
- Automatic migration from webhooks.json to webhooks.enc
- Uses same AES-256-GCM encryption as nodes and email configs
- Original file backed up as webhooks.json.backup
- Protects sensitive webhook URLs and authentication headers

This addresses the security concern where webhook URLs containing API tokens
(like Telegram bot tokens) were stored in plain text.
2025-08-22 10:19:42 +00:00
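For readers unfamiliar with AES-256-GCM, a minimal round trip using the Go standard library looks roughly like this. It is a sketch, not Pulse's implementation: the function names, the nonce-prefixed layout and the all-zero demo key are assumptions, and key management is out of scope here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AEAD from a key; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext (the layout is an assumption).
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and authenticates the rest.
func decrypt(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(data) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32) // demo key only; never use an all-zero key
	enc, err := encrypt(key, []byte(`{"url":"https://example.invalid/hook"}`))
	if err != nil {
		panic(err)
	}
	dec, err := decrypt(key, enc)
	fmt.Println(string(dec), err)
}
```

Because GCM is authenticated, a tampered or truncated file fails to decrypt instead of yielding garbage.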
Pulse Monitor
e0900ac006 feat: add VM disk usage monitoring via QEMU guest agent
- Add GetVMFSInfo method to fetch filesystem data from guest agent
- Integrate guest agent disk stats for VMs in both polling modes
- Aggregate real disk usage from all filesystems (skip special mounts)
- Fall back gracefully to allocated size when agent unavailable
- Add VM.Monitor permission to auto-negotiation script via PulseMonitor role
- Update frontend NodeModal with new permission instructions

VMs with QEMU guest agent now show actual disk usage like LXCs do.
Addresses #344
2025-08-21 23:25:59 +00:00
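The aggregation step described above can be sketched as follows. The struct fsInfo, the set of skipped filesystem types and the function name are assumptions for illustration; the commit log only establishes that usage is summed across filesystems, that special mounts are skipped, and that the agent reports total and used bytes.

```go
package main

import "fmt"

// fsInfo is a hypothetical, trimmed-down view of one filesystem entry
// reported by the guest agent.
type fsInfo struct {
	Mountpoint string
	Type       string
	TotalBytes uint64
	UsedBytes  uint64
}

// skippedTypes lists filesystem types treated as "special" here. The exact
// list is an assumption.
var skippedTypes = map[string]bool{
	"tmpfs": true, "devtmpfs": true, "squashfs": true, "overlay": true,
}

// aggregateDisk sums real filesystems. ok is false when nothing usable was
// reported, in which case the caller can fall back to the allocated size.
func aggregateDisk(fss []fsInfo) (total, used uint64, ok bool) {
	for _, fs := range fss {
		if skippedTypes[fs.Type] || fs.TotalBytes == 0 {
			continue
		}
		total += fs.TotalBytes
		used += fs.UsedBytes
	}
	return total, used, total > 0
}

func main() {
	total, used, ok := aggregateDisk([]fsInfo{
		{"/", "ext4", 100, 40},
		{"/run", "tmpfs", 10, 1},
		{"/data", "xfs", 200, 50},
	})
	fmt.Println(total, used, ok)
}
```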
Pulse Monitor
4a2e7b4547 feat: add toggle to disable network discovery
Addresses #343 - users can now disable Proxmox/PBS server discovery through:
- UI toggle in Settings > System > Network Settings
- Environment variable DISCOVERY_ENABLED=false
- system.json configuration

Discovery runs by default but can be completely disabled for environments where automatic scanning causes issues (e.g., shared hosting networks).
2025-08-21 21:13:29 +00:00
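Reading the DISCOVERY_ENABLED variable with a default of "on" might look like the sketch below. The function name, the injected lookup parameter and the choice to treat unparsable values as enabled are assumptions; only the variable name and the on-by-default behavior come from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// discoveryEnabled reports whether discovery should run. The lookup function
// is injected so the logic can be exercised without touching the real
// environment; pass os.LookupEnv in production code.
func discoveryEnabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup("DISCOVERY_ENABLED")
	if !ok {
		return true // discovery runs by default
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true // assumption: ignore unparsable values
	}
	return enabled
}

func main() {
	fmt.Println(discoveryEnabled(os.LookupEnv))
}
```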
Pulse Monitor
12cdf8d369 feat: add disable alerts option for individual guests
- Add ability to completely disable alerts for specific guests in Custom Overrides
- Refactor override editing to use single form instead of inline editing
- Add dashboard indicators for guests with custom overrides (blue cog for custom thresholds, grey bell-slash for disabled)
- Remove complex Proxmox tag-based alert control system in favor of simpler UI controls
- Improve layout and UX for alert override management
2025-08-20 18:51:22 +00:00
Pulse Monitor
7445cf7055 feat: auto-hash plain text credentials from environment variables
- Automatically hash plain text API tokens (SHA3-256) and passwords (bcrypt) when loaded from env vars
- Remove unnecessary PULSE_SETUP_TOKEN feature in favor of simpler env var approach
- Remove HandleInitialSetup endpoint - not needed with env var configuration
- Update authentication to always use hashed comparisons (no plain text warnings)
- Update documentation to clearly explain auto-hashing capability
- Maintain backward compatibility with pre-hashed credentials

This makes Pulse secure by default while keeping deployment simple - users can
provide plain text credentials via environment variables and Pulse automatically
hashes them for security.
2025-08-19 14:58:01 +00:00
Pulse Monitor
40e6ed89a7 chore: reorganize repository structure for better maintainability
- Move development scripts to scripts/ directory (dev.sh, hot-dev.sh, build.sh, etc.)
- Move UPGRADE_NOTICE to docs/ directory
- Remove empty 2025-08-14 file
- Update all references to moved scripts in documentation
2025-08-18 21:57:40 +00:00
Pulse Monitor
bb2320a857 fix: prevent syslog spam on standalone Proxmox nodes
- Only check cluster status during initial configuration, not during polling
- Cache cluster membership in config to avoid repeated API calls
- Skip cluster/resources endpoint entirely for standalone nodes
- Change cluster detection failure from WARN to DEBUG (expected for standalone)

This addresses #322 where standalone PVE nodes were causing certificate
lookup errors in syslog every minute during polling.
2025-08-18 09:43:04 +00:00
Pulse Monitor
fc1bd556b6 fix: add missing Type field when creating VMs/containers from cluster resources
addresses #329 - VMs were being displayed as LXC containers because the Type field wasn't being set when using the efficient cluster/resources polling method
2025-08-18 07:46:35 +00:00
Pulse Monitor
0629d3bbcb fix: prevent cluster/resources calls on non-clustered nodes
Non-clustered Proxmox nodes were getting certificate verification errors
when Pulse tried to use the cluster/resources endpoint. Now checks if
the node is actually in a cluster before attempting efficient polling.
2025-08-17 20:09:45 +00:00
Pulse Monitor
c85eebc165 fix: set PBS status to offline when connection fails (addresses #326)
When both GetVersion and GetDatastores fail for PBS, properly set the
Status field to 'offline' and ConnectionHealth to 'error'. This prevents
the red dot from appearing when the instance state is undefined.
2025-08-17 19:06:06 +00:00
Pulse Monitor
afcf9e4772 fix: properly address syslog spam on non-clustered nodes (#322)
- Mark deprecated poll functions that cause duplicate GetNodes() calls
- Add warnings when deprecated functions are called directly
- Previous fix only created WithNodes versions but didn't prevent the originals from being called
- This completes the fix started in commit fcd782370
- Reduces API calls and prevents certificate verification spam in syslog

The deprecated functions (pollVMs, pollContainers, pollStorage, pollStorageBackups)
still exist for backward compatibility but log warnings if called.
2025-08-17 17:03:48 +00:00
Pulse Monitor
09846faeb1 fix: guest alerts and webhook notifications working properly
- Fixed double CPU percentage multiplication for containers/VMs
- Added CheckGuest calls to efficient polling path
- Fixed newline escaping in grouped webhook notifications
- Guest alerts now properly trigger for containers and VMs

These changes address issues where guest alerts weren't being triggered
at all due to the efficient polling path not calling CheckGuest, and
webhook notifications were failing due to unescaped newlines in grouped
alert messages breaking JSON templates.
2025-08-17 08:07:39 +00:00
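The newline problem can be reproduced and avoided with the standard encoding/json package. This sketch shows one way to escape a message before substituting it into a JSON template; the helper name jsonEscape and the template shape are assumptions, not the project's actual fix.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonEscape returns s escaped for use inside a JSON string literal.
// json.Marshal produces the quoted form; the surrounding quotes are trimmed.
// Note that Marshal also escapes <, > and & by default.
func jsonEscape(s string) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b[1 : len(b)-1]), nil
}

func main() {
	msg := "CPU high on vm-100\nMemory high on ct-200"

	raw := `{"content": "` + msg + `"}`
	fmt.Println(json.Valid([]byte(raw))) // raw newline breaks the payload

	escaped, err := jsonEscape(msg)
	if err != nil {
		panic(err)
	}
	fixed := `{"content": "` + escaped + `"}`
	fmt.Println(json.Valid([]byte(fixed)))
}
```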
Pulse Monitor
9698290fd0 fix: reduce API calls to prevent syslog spam on non-clustered nodes (#322)
- Cache nodes list in pollPVEInstance and pass to sub-functions
- Prevents multiple GetNodes() calls per polling cycle
- Reduces API calls from ~5 per cycle to 1 per cycle
- Fixes syslog spam on standalone PVE nodes trying to find cluster certificates
- Fixes PBS 'Transport endpoint not connected' errors from excessive polling

Previously we were calling GetNodes() in:
- pollPVEInstance (main)
- pollVMs
- pollContainers
- pollStorage
- pollStorageBackups

Now we call it once and pass the list to avoid duplicate API calls that trigger
certificate checks on non-clustered nodes.
2025-08-16 12:48:03 +00:00
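The pattern of fetching the node list once and handing it to the sub-pollers can be sketched like this. The interface, the counting client and the stub pollers are hypothetical; the WithNodes suffix follows the naming mentioned in the later commit afcf9e4772.

```go
package main

import "fmt"

// nodeLister is a hypothetical slice of the API client.
type nodeLister interface {
	GetNodes() ([]string, error)
}

// countingClient records how often GetNodes is called.
type countingClient struct {
	nodes []string
	calls int
}

func (c *countingClient) GetNodes() ([]string, error) {
	c.calls++
	return c.nodes, nil
}

// Stub sub-pollers: each receives the cached list instead of fetching it.
func pollVMsWithNodes(nodes []string) int        { return len(nodes) }
func pollContainersWithNodes(nodes []string) int { return len(nodes) }
func pollStorageWithNodes(nodes []string) int    { return len(nodes) }

// pollInstance fetches the node list exactly once per cycle and returns how
// many node visits the sub-pollers performed.
func pollInstance(c nodeLister) (int, error) {
	nodes, err := c.GetNodes()
	if err != nil {
		return 0, err
	}
	visits := pollVMsWithNodes(nodes) +
		pollContainersWithNodes(nodes) +
		pollStorageWithNodes(nodes)
	return visits, nil
}

func main() {
	c := &countingClient{nodes: []string{"pve1", "pve2"}}
	visits, err := pollInstance(c)
	fmt.Println(visits, c.calls, err)
}
```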
Pulse Monitor
a01dff8514 fix: resolve WebSocket metric updates and improve polling efficiency
- Fix alternating zero I/O metrics by implementing rate caching for stale data from Proxmox
- Hardcode polling interval to 10 seconds (matching Proxmox cluster/resources update cycle)
- Remove polling interval settings from UI (no longer user-configurable)
- Implement efficient VM/container polling using single cluster/resources API call
- Remove 'Remove Password' feature (auth is now mandatory)
- Fix CSRF validation for Basic Auth (exempt from CSRF checks)
- Fix Generate API Token modal and authentication
- Remove redundant 'Active' status from Authentication section
- Remove Connection Timeout setting from frontend (backend-only)
- Clean up frontend console logging (reduce verbosity)
- Remove PBS polling interval setting (fixed at 10s)
- Add frontend rebuild detection to backend-watch script
- Improve first-run setup flow and error handling
2025-08-16 12:12:10 +00:00
Pulse Monitor
1a994ba3c6 chore: add debug logging for notification troubleshooting
2025-08-14 20:46:52 +00:00
Pulse Monitor
53c6fc89a3 fix: improve cluster handling with offline nodes and fix node card border styling
- Cluster now handles offline nodes gracefully without marking endpoints unhealthy
- Fixed error 595 (node unreachable) not being treated as node-specific failure
- Added parallel health checks with shorter timeouts for better performance
- Fixed inconsistent border width on offline node cards (removed conflicting border-l-4)
- Switched to ring utility for consistent outline on offline/alert nodes
- Improved logout functionality with proper CSRF token handling

addresses #312, #315
2025-08-14 15:46:37 +00:00
Pulse Monitor
2c5a56c046 fix: update cluster node online indicators based on actual status
- tracks online/offline status for individual cluster nodes
- updates ClusterEndpoint.Online field during node polling
- fixes issue where all cluster nodes showed green indicator regardless of status

fixes #312
2025-08-14 14:24:19 +00:00
Pulse Monitor
8492b0932d fix: dashboard now uses actual configured host URLs for node links
addresses #306 - The dashboard and storage views were hardcoding port 8006 for node links,
but now they properly use the host URLs from the node configuration. This ensures users
are redirected to the correct URL when clicking on node names, respecting custom ports
and protocols configured in the settings.

- Added host field to Node struct in Go models
- Updated monitor.go to populate host field from instance config
- Added host field to TypeScript Node interface
- Modified Dashboard and Storage components to use nodeHostMap for correct URLs
- Falls back to old behavior if host field is not available
2025-08-12 14:28:19 +00:00
Pulse Monitor
378ebcb250 Major improvements to security, alerts, and ease of use
Security enhancements:
- Fixed critical issue: PBS tokens no longer logged in plaintext
- PVE tokens now properly masked in all log outputs
- Enhanced token security documentation

Alert system fixes:
- Fixed storage alerts not working due to threshold being 0
- Added automatic defaults preservation for alert thresholds
- Storage alerts now properly trigger at 85% usage

Node management improvements:
- Fixed node deletion causing 'Node not found' errors
- Added instant discovery refresh when nodes are deleted
- Added manual refresh buttons for discovery
- Fixed PBS token cleanup in auto-registration scripts
- Fixed /dev/tty errors when running scripts in Docker containers

Bug fixes:
- Fixed CPU MHz field type mismatch causing JSON unmarshal errors
- Suppressed non-critical container snapshot API errors
- Fixed auto-registration using Docker internal IPs instead of actual host IPs

Documentation updates:
- Added comprehensive security documentation
- Streamlined setup documentation focusing on ease of use
- Removed marketing language and consolidated repetitive content

Frontend improvements:
- Added WebSocket support for real-time node updates
- Added discovery refresh buttons in Settings
- Improved node deletion feedback
2025-08-11 13:59:58 +00:00
Pulse Monitor
161bbf5ec4 fix: exclude ct templates and isos from backup tab (fixes #265)
- filter out vztmpl (container templates) from backup list
- filter out iso files from backup list
- only show actual vm/container backups in the backup tab
- remove unnecessary checks for template/iso content types
2025-08-11 10:13:12 +00:00
Pulse Monitor
0491abf885 fix: prevent cpu alerts for non-running vms and containers (fixes #273)
- check if vm/container status is "running" before using cpu value
- set cpu to 0 for stopped, paused, suspended states
- prevents false high cpu alerts for offline vms
- handles all non-running states, not just "stopped"
2025-08-11 10:08:41 +00:00
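The guard described in this commit reduces to a status check before the CPU value is used. The function name effectiveCPU and the float representation are illustrative assumptions.

```go
package main

import "fmt"

// effectiveCPU returns the CPU value used for alert evaluation. Any status
// other than "running" (stopped, paused, suspended, ...) yields 0 so stale
// readings cannot trigger a high-CPU alert.
func effectiveCPU(status string, cpu float64) float64 {
	if status != "running" {
		return 0
	}
	return cpu
}

func main() {
	fmt.Println(effectiveCPU("running", 0.93))
	fmt.Println(effectiveCPU("paused", 0.93))
}
```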
Pulse Monitor
2f2ab19c0b fix: QEMU guest agent VMs now show memory usage correctly
- Fall back to vmStatus.Mem when guest agent doesn't report FreeMem
- Fixes issue where VMs with guest agent showed 0% memory usage
- Addresses issue #294
2025-08-11 08:45:49 +00:00
Pulse Monitor
068322bb45 feat: add DISCOVERY_SUBNET environment variable support for Docker network discovery configuration
2025-08-10 19:44:31 +00:00
Pulse Monitor
fbf8e5f1ce fix: RAM usage calculation and webhook test functionality
- Fixed incorrect RAM usage display for VMs without guest agent (issue #280)
  - VMs without guest agent now show 0% usage instead of 100%
  - Only show actual usage when guest agent provides FreeMem data
  - Containers continue to show accurate usage as before

- Fixed webhook test functionality (issue #279)
  - Added proper webhook ID handling in test notification endpoint
  - Created SendTestWebhook method to test specific webhooks
  - Frontend can now successfully trigger webhook tests
2025-08-10 10:59:26 +00:00
Pulse Monitor
f8ef3f9259 fix: multiple critical issues in monitoring and notifications
- PBS instances now show as online when datastores are accessible even if version endpoint fails
- Email sending now uses proper STARTTLS support for compatibility with providers like SMTP2GO
- Email recipient input no longer filters entries while typing
- Auto-update setting now properly persists and loads from config
- Fixed CPU usage alerts for offline VMs (already addressed in previous commits)
2025-08-09 23:26:12 +00:00
Pulse Monitor
0ebfb8ec01 hotfix: backup type detection for PBS backups
- Added format field checking for pbs-ct and pbs-vm
- Changed unknown type fallback from VM to LXC (more common)
- Fixes issue where all backups showed as VM type
2025-08-09 22:42:04 +00:00
Pulse Monitor
a368d3b3c9 attempt to address: Discord webhooks, backup types, storage duplicates, alert issues
- Added service field to WebhookConfig to identify Discord webhooks
- Use Discord-specific template when sending Discord webhooks
- Fixed backup type detection for PBS backups (vm/ct)
- Fixed shared storage duplicate IDs across instances
- Fixed alert acknowledge/clear response format to match frontend expectations
2025-08-09 22:27:10 +00:00
Pulse Monitor
311ef7619e fix: critical production issues for v4.1.0-rc.5
- Fixed Discord/Slack/Teams webhooks not persisting (Issue #272)
- Fixed email recipients not saving and Enter key issue (Issue #270)
- Fixed auto-update toggle not saving (Issue #269)
- Fixed false CPU alerts for stopped VMs/containers (Issue #273)
- Automatic alert clearing for stopped guests
- Preserve passwords when updating email config

chore: bump version to v4.1.0-rc.5
2025-08-09 18:27:30 +00:00