- Create shared NodeGroupHeader component to eliminate code duplication
- Replace vertical line indicator with circular dot matching guest rows
- Update online indicator to use bg-green-500 (matching guest indicators)
- Reduce node row padding from py-2 to py-1 for more compact layout
- Set background to dark:bg-gray-900 to match search bar styling
- Apply changes consistently across Dashboard and Storage tabs
This commit addresses all issues reported in GitHub issue #485:
1. **SMART Status Recognition**
- Fix disk health check to accept both "PASSED" and "OK" status
- Previously only "PASSED" was recognized as healthy
- Location: internal/monitoring/monitor.go:1255
2. **ZFS Spare Device False Alerts**
- Skip ZFS SPARE devices unless they have actual errors
- SPARE devices are intentional and should not trigger alerts
- Updated in two locations:
- pkg/proxmox/zfs.go:154 (device filtering)
- internal/alerts/alerts.go:1077 (alert generation)
3. **Memory Display Granularity**
- Increase byte formatting precision from 0 to 1 decimal place
- Improves accuracy (e.g., "1.7 GB" instead of "1 GB" for 86% of 2GB)
- Location: frontend-modern/src/utils/format.ts:3
4. **Custom Alert Rules Evaluation**
- Add ReevaluateGuestAlert() method for proper threshold reevaluation
- Add comments explaining custom rules evaluation limitations
- Next poll cycle will properly clear stale alerts with new thresholds
Additional improvements:
- Fix ZFS pool alert locking to prevent deadlocks
- Prevent discovery service from running in mock mode
- Restore discovery service when exiting mock mode
Fixes #485
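The SMART and ZFS spare checks above can be sketched roughly as follows. This is only an illustration: `isSMARTHealthy`, `zfsDevice`, and `shouldAlertOnZFSDevice` are hypothetical names, and the case-insensitive matching and the handling of non-spare devices are assumptions, not the code at the locations listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// isSMARTHealthy reports whether a SMART status string counts as healthy.
// Both "PASSED" and "OK" are accepted. Trimming and case folding are
// assumptions of this sketch.
func isSMARTHealthy(status string) bool {
	switch strings.ToUpper(strings.TrimSpace(status)) {
	case "PASSED", "OK":
		return true
	}
	return false
}

// zfsDevice is a hypothetical, simplified view of a ZFS pool device.
type zfsDevice struct {
	State          string
	ReadErrors     int64
	WriteErrors    int64
	ChecksumErrors int64
}

// shouldAlertOnZFSDevice skips SPARE devices unless they report errors.
// The rule for other states (alert when not ONLINE or when errors exist)
// is an assumption made to keep the example complete.
func shouldAlertOnZFSDevice(d zfsDevice) bool {
	hasErrors := d.ReadErrors > 0 || d.WriteErrors > 0 || d.ChecksumErrors > 0
	if strings.EqualFold(d.State, "SPARE") {
		return hasErrors
	}
	return !strings.EqualFold(d.State, "ONLINE") || hasErrors
}

func main() {
	fmt.Println(isSMARTHealthy("PASSED"), isSMARTHealthy("OK"), isSMARTHealthy("FAILED"))
	fmt.Println(shouldAlertOnZFSDevice(zfsDevice{State: "SPARE"}))
	fmt.Println(shouldAlertOnZFSDevice(zfsDevice{State: "SPARE", ChecksumErrors: 2}))
}
```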
Make mock mode configuration part of the repository instead of a local-only
file. This ensures consistent mock mode behavior across all environments
(development, CI/CD, demo server) and makes it work out of the box for
new contributors.
Changes:
- Add mock.env to repository with sensible defaults (mock mode OFF by default)
- Support mock.env.local for personal overrides (gitignored)
- Update .gitignore to allow mock.env but exclude .local variants
- Backend loads mock.env then merges mock.env.local overrides
- hot-dev.sh loads both files in correct order
Benefits:
- New developers can clone and use mock mode immediately
- Demo server gets consistent mock configuration
- Personal preferences stay private in .local file
- No surprises - mock mode disabled by default in fresh clones
- CI/CD can use mock mode without custom configuration
Documentation:
- Updated README.md to explain mock.env is in repo
- Enhanced MOCK_MODE.md with local override instructions
- Updated claude.md with new configuration strategy
- Added mock.env.local.example for quick setup
Example workflow:
    git clone <repo>
    npm run mock:on  # Works immediately with repo defaults
    # Or create personal config:
    cp docs/development/mock.env.local.example mock.env.local
    # Edit mock.env.local with your preferences
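The load order described above (repo defaults first, then local overrides) could look something like the sketch below. `parseEnv` and `mergeEnv` are hypothetical helpers, and `PULSE_MOCK_NODES` is an invented key used only for illustration. The real backend may parse these files differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv reads KEY=VALUE lines, ignoring blank lines and # comments.
func parseEnv(content string) map[string]string {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

// mergeEnv returns a new map holding base, with every key present in
// overrides replacing the base value.
func mergeEnv(base, overrides map[string]string) map[string]string {
	merged := make(map[string]string, len(base)+len(overrides))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range overrides {
		merged[k] = v
	}
	return merged
}

func main() {
	// Stand-ins for the contents of mock.env and mock.env.local.
	repo := parseEnv("# repo defaults\nPULSE_MOCK_MODE=false\nPULSE_MOCK_NODES=7\n")
	local := parseEnv("PULSE_MOCK_MODE=true\n")
	merged := mergeEnv(repo, local)
	fmt.Println(merged["PULSE_MOCK_MODE"], merged["PULSE_MOCK_NODES"])
}
```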
Improve performance when serving /api/state in mock mode by optimizing
alert handling and JSON serialization.
Changes:
- Add UpdateAlertSnapshots() to cache alerts without blocking
- Use lazy population of alert snapshots to avoid lock contention
- Switch to json.Marshal for better performance with large payloads
- Add debug logging to track /api/state performance
- Simplify GetState() logic in mock mode
Performance improvements:
- Eliminates alert manager lock during /api/state requests
- Reduces JSON encoding overhead for large mock datasets
- Ensures sub-second response times even with 7 nodes and 90+ guests
Testing:
- Mock mode returns state instantly without blocking
- Alert snapshots populate correctly on first request
- Debug logs confirm fast execution path
Implement a hot-reloadable mock mode system that works seamlessly in both
development and production environments without requiring manual restarts
or port changes.
Key Features:
- Backend watches mock.env and auto-reloads when changed (via fsnotify + polling)
- npm commands for easy toggling: mock:on, mock:off, mock:status, mock:edit
- Works in both hot-dev mode and systemd deployments
- Reload completes in 2-5 seconds with no manual intervention
- No port changes or process restarts required
Implementation:
- Extended ConfigWatcher to monitor both .env and mock.env
- Added callback system to trigger ReloadableMonitor.Reload()
- Enhanced toggle-mock.sh to support both hot-dev and systemd modes
- Updated hot-dev.sh banner to show mock status and commands
- Created comprehensive documentation in docs/development/MOCK_MODE.md
Testing:
- Backend builds successfully
- Watcher initializes and monitors both files
- npm run mock:on/off toggles successfully
- mock.env updates correctly
- Scripts work in both hot-dev and systemd modes
Documentation:
- Added Mock Mode section to README.md
- Created detailed guide in docs/development/MOCK_MODE.md
- Updated claude.md with mock mode architecture and usage
Mock mode continues to return cached data instantly from memory
(no API calls, no locks, no timeouts), ensuring fast /api/state responses.
Additional safeguards to prevent dev/production config conflicts:
1. **hot-dev.sh**: Explicitly export PULSE_DATA_DIR before starting backend
- Ensures backend always uses /opt/pulse/tmp/dev-config in dev mode
- Prevents accidental fallback to /etc/pulse
- Adds logging to show which config directory is being used
2. **sync-production-config.sh**: Smart encryption key handling
- Never overwrites existing dev encryption key
- Warns if production key is newer (unusual scenario)
- Keeps dev key to avoid breaking encrypted configs
- Adds detailed logging of sync decisions
These changes ensure that when Vite restarts:
- Backend always uses the correct dev-config directory
- Sync script never breaks working dev configuration
- All decisions are logged clearly for debugging
Related to previous commit fixing nodes.enc corruption.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses critical issues where nodes configuration was being
lost or corrupted, causing user frustration and data loss.
## Changes:
### 1. Sync Script Protection (sync-production-config.sh)
- Never overwrites newer dev config with older production files
- Validates timestamps before syncing
- Shows detailed logging of sync decisions
- Prevents accidental overwrites of working configuration
### 2. Timestamped Backups (persistence.go)
- Creates timestamped backup before EVERY save (e.g., nodes.enc.backup-20251001-073000)
- Maintains "latest" backup for quick recovery
- Auto-cleans old backups (keeps last 10)
- Ensures we can always recover from corruption
### 3. Empty Config Protection (persistence.go)
- BLOCKS attempts to save empty nodes config when existing nodes exist
- Prevents accidental data wipes
- Returns error with clear message about what was blocked
### 4. Enhanced Corruption Recovery (persistence.go)
- Detects "cipher: message authentication failed" errors
- Automatically attempts recovery from backup files
- Renames corrupted files with timestamps for forensics
- Logs detailed recovery process
### 5. Performance Logging (GuestRow.tsx)
- Added timing for individual metadata API calls
- Helps identify performance bottlenecks
## Why This Matters:
Previous behavior allowed:
- Corrupted files to overwrite working configs
- Empty configs to delete all nodes
- No way to recover from corruption
- Race conditions during rapid restarts
New behavior ensures:
- Multiple backup copies always exist
- Corruption auto-recovers from backups
- Empty saves are blocked
- Sync script validates before overwriting
Corrected widespread misinformation claiming API tokens cannot access guest agent data on Proxmox 9.
Changes:
- Rewrote VM_DISK_MONITORING.md with accurate technical explanation
- Deleted VM_DISK_STATS_TROUBLESHOOTING.md (contained false information)
- Updated FAQ.md with correct quick reference and troubleshooting link
- Added comprehensive VM disk troubleshooting section to TROUBLESHOOTING.md
- Fixed README.md troubleshooting reference
- Updated frontend tooltip to show accurate permission requirements
- Corrected backend log messages to remove "known limitation" language
- Updated test-vm-disk.sh diagnostic script with accurate guidance
Key corrections:
- API tokens work fine for guest agent queries on both PVE 8 and 9
- Proxmox API returning disk=0 is normal behavior, not a bug
- Both tokens and passwords work equally well
- Only requirements: guest agent installed + proper permissions
- Permission issues are config problems, not authentication method limitations
Documentation now provides a clear user journey: FAQ → Troubleshooting → Full Guide.
- Added streaming discovery that shows servers as they're found
- Backend sends WebSocket updates for each discovered server
- Frontend displays servers immediately without waiting for full scan
- Created sync-production-config.sh to preserve nodes when switching modes
- Updated toggle-mock.sh to sync config when disabling mock mode
- Dev environment now maintains separate config that syncs from production
- Enabled discovery service in dev environment by default
Addresses real-time discovery UX and mock/production mode configuration persistence.
- Fix port conflict: backend now uses 7656, frontend uses 7655
- Fix mock mode not loading: use load_env_file for proper export
- Fix pipefail crashes on port checks: disable during lsof checks
- Add error handling for /etc/pulse/.env permission issues
- Update .gitignore to exclude sensitive files and temp scripts
The alert acknowledgment endpoints were hanging because GetState() was called
synchronously to broadcast updates via WebSocket, which could take significant
time with many nodes/guests. This caused the HTTP response to time out, showing
an error to users even though the alert was successfully acknowledged.
Fixed by:
- Sending HTTP response immediately after acknowledging the alert
- Moving WebSocket broadcast to a goroutine to avoid blocking
- Applied fix to all alert endpoints (acknowledge, unacknowledge, clear, bulk ops)
This resolves the issue where users saw 'Failed to acknowledge alert' errors
even though the alert had been acknowledged (it disappeared on refresh).
- Skip auth check entirely in App.tsx for development
- Add .env.dev file with DISABLE_AUTH=true and PULSE_MOCK_MODE=true
- Update hot-dev.sh to load .env.dev environment variables
- This ensures the app loads immediately without auth issues
- WebSocket and API now work without authentication in dev mode
- Fixed PBS alert toggle not responding in thresholds settings
- PBS servers now use connectivity toggle like nodes instead of disabled toggle
- Added support for disableConnectivity flag on PBS instances in backend
- Fixed PBS ID format mismatch between frontend and backend
- PBS offline alerts now properly respect the disableConnectivity setting
- Prevents spam alerts by checking disableConnectivity flag for PBS offline alerts
- Add dynamic metric fluctuations for VMs and containers in mock data
- Fix alert acknowledgment to dim instead of hide alerts
- Implement unacknowledge functionality with backend persistence
- Simplify alert UI to single-click toggle (remove selection system)
- Add proper hysteresis for alert resolution when metrics drop
- Fix SVG icon boundaries in alert displays
- Add webhook disable toggles for testing without notifications
- Fix frontend directory duplication issue (addresses frontend-modern recreation)
- Improve alert sorting to show most recent first
- Make mock system generate realistic metric changes for proper alert lifecycle
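Hysteresis for alert resolution means an alert fires at one threshold and clears only once the metric falls to a lower one, so values hovering near the trigger do not flap. A minimal sketch follows, with a hypothetical `hysteresis` type and example thresholds that are not taken from the codebase:

```go
package main

import "fmt"

// hysteresis tracks an alert that triggers at Trigger and clears at Clear,
// where Clear is lower than Trigger.
type hysteresis struct {
	Trigger float64
	Clear   float64
	active  bool
}

// Update feeds a new metric value and reports whether the alert is active.
func (h *hysteresis) Update(value float64) bool {
	switch {
	case !h.active && value >= h.Trigger:
		h.active = true
	case h.active && value <= h.Clear:
		h.active = false
	}
	return h.active
}

func main() {
	cpu := hysteresis{Trigger: 80, Clear: 75}
	for _, v := range []float64{70, 85, 78, 74} {
		fmt.Printf("%.0f%% -> active=%v\n", v, cpu.Update(v))
	}
}
```

With these thresholds, a reading of 78% after the alert has fired keeps it active, and only a reading at or below 75% resolves it.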
- Changed button type from 'button' to 'submit' so it actually submits the form
- This was the root cause: the button looked clickable but had no onClick handler, so clicking it did nothing
- Now the form properly submits when clicking Change Password
- Ensure disk metrics from /nodes endpoint are preserved when GetNodeStatus fails
- Add better fallback logic to prevent showing 0% or '-' for disk usage
- Improve logging to distinguish between rootfs and /nodes endpoint metrics
- Handle cases where neither rootfs nor valid node disk data is available
This fixes the regression introduced in v4.12.1 where disk stats would show as
'-' when GetNodeStatus failed due to network issues or rate limiting.
- Update screenshot tool to use MacBook Air resolution (2560x1600)
- Remove empty side borders from screenshots
- Use mock data for all screenshots for privacy
- Fix mobile alert buttons overflowing viewport
- Exempt localhost from API rate limiting for better dev experience
- Update documentation to showcase all features with screenshots
- Reorganize README visual tour into feature sections
- Add high-quality screenshots with 3x device scale factor for crisp text
- Implement mock alert history generator spanning 90 days
- Update documentation with detailed screenshot descriptions
- Add visual tour section to README with key screenshots
- Fix mock mode to properly separate from production data
- Clean up screenshot script to use actual mock data instead of DOM injection
- Enhance FAQ and webhooks docs with relevant screenshots
- Fixed escape key not clearing tag filters from search box
- Replaced PVENodeTable/PBSNodeTable with unified NodeSummaryTable
- Column order now correctly shows: Node, Status, Uptime, CPU, Memory, Disk, VMs/Containers
- Fixed alert acknowledgment persistence bug in monitor.go
- Reordered columns in NodeSummaryTable for better logical grouping
- Consolidated development environment with improved hot-dev script
- Updated vite.config.dev.ts for consistent port usage (7656)
- Enhanced Go main.go with better development mode detection
- Removed obsolete development scripts for cleaner repository
- Added comprehensive development documentation in CLAUDE.md
Development improvements ensure consistent port 7655 usage and
eliminate conflicts between different development approaches.
- Shows PVE/PBS version in node summary tables
- Helps quickly identify nodes needing updates
- Extracts just version number from PVE (e.g. 9.0.5)
- Also improved mock mode system for local development
- Add systemd timer for daily update checks (2-6 AM window)
- Create pulse-auto-update.sh script with safe rollback on failure
- Add --enable-auto-updates flag to install script
- Prompt users during fresh install to enable auto-updates
- Respect autoUpdateEnabled flag in system.json
- Only install stable releases, never RCs
- Full logging to systemd journal
- Tested and verified working in container
- Updated test-security.sh to detect and handle DISABLE_AUTH mode
- Fixed test-release.sh to properly check for compiled binary
- Updated test-proxy-scenarios.sh arithmetic operations for bash compatibility
- All tests now properly detect authentication state and adapt accordingly
- Created comprehensive mock data generator for nodes, VMs, containers
- Added toggle scripts for easy switching between real and mock mode
- Integrated with backend-watch.sh for auto-rebuild with mock support
- Modified monitor to skip polling when mock mode is enabled
- Added CLAUDE.md documentation for future sessions
Note: Mock system initializes but data isn't fully integrated with GetState() yet.
Currently shows mixed real + mock data. Works for UI testing purposes.
Replaced the two-step setup code process with a simpler token-in-URL approach:
- Auth token is now embedded directly in the setup URL
- No more prompting users for setup codes
- Same security level with better UX
- Backwards compatible with old setupCode field
The new flow generates a command like:

    curl -sSL "http://pulse/api/setup-script?...&auth_token=TOKEN" | bash

This makes it much easier for users, especially in Proxmox shell where
interactive prompts can be problematic.
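Embedding the token amounts to adding an `auth_token` query parameter to the setup-script URL. The sketch below shows one way to do that with `net/url`. `withAuthToken` is a hypothetical helper, and the `type=pve` parameter in the example is invented for illustration, since the other parameters are elided in the command above.

```go
package main

import (
	"fmt"
	"net/url"
)

// withAuthToken returns rawURL with an auth_token query parameter set,
// keeping any parameters that are already present. Note that Encode
// sorts parameters by key.
func withAuthToken(rawURL, token string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("auth_token", token)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	setupURL, err := withAuthToken("http://pulse/api/setup-script?type=pve", "TOKEN")
	if err != nil {
		panic(err)
	}
	fmt.Printf("curl -sSL %q | bash\n", setupURL)
}
```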
- Added 'pulse --version' and 'pulse version' commands
- Version info embedded at build time (version, commit, build date)
- Added Privacy section to README - no telemetry/analytics
- Added example alert messages to show webhook capabilities
- Build script now properly embeds version information
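Build-time version embedding in Go is commonly done with the linker's `-X` flag, which overrides package-level string variables. The sketch below assumes the variable names `Version`, `Commit`, and `BuildDate` and the output format. The actual build script may use different names.

```go
package main

import (
	"fmt"
	"os"
)

// These defaults are replaced at build time, for example:
//
//	go build -ldflags "-X main.Version=v1.2.3 -X main.Commit=abc1234 -X main.BuildDate=2025-01-01"
var (
	Version   = "dev"
	Commit    = "unknown"
	BuildDate = "unknown"
)

// versionString formats the embedded build information.
func versionString() string {
	return fmt.Sprintf("pulse %s (commit %s, built %s)", Version, Commit, BuildDate)
}

// isVersionRequest reports whether the first argument asks for the version,
// covering both `pulse --version` and `pulse version`.
func isVersionRequest(args []string) bool {
	return len(args) > 0 && (args[0] == "--version" || args[0] == "version")
}

func main() {
	if isVersionRequest(os.Args[1:]) {
		fmt.Println(versionString())
		return
	}
	fmt.Println("starting (run with --version to print build info)")
}
```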
- Added test-release.sh for core functionality testing
- Added test-edge-cases.sh for URL and header edge cases
- Added test-proxy-scenarios.sh for reverse proxy testing
- Added test-security.sh for security vulnerability testing
- Added test-installation-methods.sh for deployment validation
- Added test-all.sh master script to run all tests
- These tests would have caught issue #334 and should help prevent similar issues
- Move development scripts to scripts/ directory (dev.sh, hot-dev.sh, build.sh, etc.)
- Move UPGRADE_NOTICE to docs/ directory
- Remove empty 2025-08-14 file
- Update all references to moved scripts in documentation
- Fix alternating zero I/O metrics by implementing rate caching for stale data from Proxmox
- Hardcode polling interval to 10 seconds (matching Proxmox cluster/resources update cycle)
- Remove polling interval settings from UI (no longer user-configurable)
- Implement efficient VM/container polling using single cluster/resources API call
- Remove 'Remove Password' feature (auth is now mandatory)
- Fix CSRF validation for Basic Auth (exempt from CSRF checks)
- Fix Generate API Token modal and authentication
- Remove redundant 'Active' status from Authentication section
- Remove Connection Timeout setting from frontend (backend-only)
- Clean up frontend console logging (reduce verbosity)
- Remove PBS polling interval setting (fixed at 10s)
- Add frontend rebuild detection to backend-watch script
- Improve first-run setup flow and error handling
- Fix auto-updater to handle single-binary structure
- Fix Docker build to copy frontend before Go compilation
- Add development script for frontend rebuilds
- Remove unnecessary frontend directory copying in updater
The embedded frontend change simplifies deployment but required
updates to various build and update systems.
Replaced sudo-based updater with a cleaner directory-based approach:
- Pulse binary now installs to /opt/pulse/bin/pulse (owned by pulse user)
- Symlink created at /usr/local/bin/pulse for PATH convenience
- Pulse user has full write access to /opt/pulse, enabling self-updates
- Removed sudo dependency and security risks
- Simplified update logic - no special scripts or permissions needed
This is more secure, simpler, and works in all environments (containers, VMs, bare metal).
- Created pulse-updater script that runs with sudo to update root-owned binary
- Modified install.sh to set up sudoers permissions for pulse user
- Updated build-release.sh to include scripts directory in releases
- Install script now installs sudo (if missing) and configures NOPASSWD access
This fixes the 'Failed to apply update' error when Pulse runs as a non-root user
and needs to update the binary at /usr/local/bin/pulse.