Users were abandoning Pulse due to catastrophic temperature monitoring setup failures. This commit addresses the root causes:
**Problem 1: Silent Failures**
- Installations reported "SUCCESS" even when proxy never started
- UI showed green checkmarks with no temperature data
- Zero feedback when things went wrong
**Problem 2: Missing Diagnostics**
- Service failures logged only in journald
- Users saw "Something going on with the proxy" with no actionable guidance
- No way to troubleshoot from error messages
**Problem 3: Standalone Node Issues**
- Proxy daemon logged continuous pvecm errors as warnings
- "ipcc_send_rec" and "Unknown error -1" messages confused users
- These are expected for non-clustered/LXC setups
**Solutions Implemented:**
1. **Health Gate in install.sh (lines 1588-1629)**
- Verify service is running after installation
- Check socket exists on host
- Confirm socket visible inside container via bind mount
- Fail loudly with specific diagnostics if any check fails
2. **Actionable Error Messages in install-sensor-proxy.sh (lines 822-877)**
- When service fails to start: dump full systemctl status + 40 lines of logs
- When socket missing: show permissions, service status, and remediation command
- Include common issues checklist (missing user, permission errors, lm-sensors, etc.)
- Direct link to troubleshooting docs
3. **Better Standalone Node Detection in ssh.go (lines 585-595)**
- Recognize "Unknown error -1" and "Unable to load access control list" as LXC indicators
- Log at INFO level (not WARN) since this is expected behavior
- Clarify message: "using localhost for temperature collection"
**Impact:**
- Eliminates "green checkmark but no temps" scenario
- Users get immediate actionable feedback on failures
- Standalone/LXC installations work silently without error spam
- Reduces support burden from #571 (15+ comments of user frustration)
Related to #571
This fixes two bugs that prevented temperature monitoring from working
after running install-sensor-proxy.sh on LXC deployments:
1. CRITICAL: Pulse service not restarted after systemd override
- The installer wrote PULSE_SENSOR_PROXY_SOCKET env var to systemd
drop-in and ran daemon-reload, but never restarted Pulse service
- Running Pulse instances continued using old environment variables
- Temperatures wouldn't work until manual Pulse restart
- Now: Automatically restart Pulse if running after writing override
2. Added guard to check if Pulse service exists before configuring
- Installer would write systemd override even if Pulse not installed
- Left orphaned drop-in files that confused users
- Now: Check if pulse.service exists, warn and skip if not found
3. MINOR: Fix inconsistent Docker mount instructions
- docker-compose.yml showed :ro (read-only) mount
- Installer output showed :rw (read-write) mount
- Changed installer to match compose file (:ro is correct and secure)
Impact: Users in #600 reported "socketFound=false" even after running
installer successfully. This was because Pulse never picked up the new
socket path without a restart.
Adds build support for 32-bit x86 (i386/i686) and ARMv6 (older Raspberry Pi models) architectures across all agents and install scripts.
Changes:
- Add linux-386 and linux-armv6 to build-release.sh builds array
- Update Dockerfile to build docker-agent, host-agent, and sensor-proxy for new architectures
- Update all install scripts to detect and handle i386/i686 and armv6l architectures
- Add architecture normalization in router download endpoints
- Update update manager architecture mapping
- Update validate-release.sh to expect 24 binaries (was 18)
This enables Pulse agents to run on older/legacy hardware including 32-bit x86 systems and Raspberry Pi Zero/Zero W devices.
Fixes two critical bugs in refresh_smart_cache() that prevented SMART
temperature collection from working:
1. Invalid smartctl parameter: Changed -n standby,after to -n standby
The 'after' parameter is not valid in smartctl 7.4 and causes:
"INVALID ARGUMENT TO -n: standby,after"
Valid syntax is standby[,STATUS[,STATUS2]] where STATUS must be numeric.
2. Broken process detection: Replaced exec -a with lock file approach
The original exec -a pulse-sensor-wrapper-refresh bash line replaced
the subshell with a new bash process that had no script to run, causing
the function to exit immediately without collecting any SMART data.
New approach uses a lock file ($CACHE_DIR/smart-refresh.lock) with
trap-based cleanup to prevent concurrent refresh operations.
Credits to @ZaDarkSide for identifying these issues in PR #672.
The self-heal timer runs 'systemctl list-unit-files | grep -q' every hour.
When grep matches and exits early, systemctl logs "Failed to print table:
Broken pipe" to syslog. This is cosmetic but floods Proxmox logs and
confuses operators.
Changes:
- Redirect stderr from systemctl to /dev/null
- Prevents the broken pipe message from reaching syslog
- Self-heal functionality unchanged
This addresses the concern raised in discussion #628.
Extends temperature monitoring to collect SMART temps for SATA/SAS disks,
addressing issue #652 where physical disk temperatures showed as empty.
Architecture:
- Deploys pulse-sensor-wrapper.sh as SSH forced command on Proxmox nodes
- Wrapper collects both CPU/GPU temps (sensors -j) and disk temps (smartctl)
- Implements 30-min cache with background refresh to avoid performance impact
- Uses smartctl -n standby,after to skip sleeping drives without waking them
- Returns unified JSON: {sensors: {...}, smart: [...]}
Backend changes:
- Add DiskTemp model with device, serial, WWN, temperature, lastUpdated
- Extend Temperature model with SMART []DiskTemp field and HasSMART flag
- Add WWN field to PhysicalDisk for reliable disk matching
- Update parseSensorsJSON to handle both legacy and new wrapper formats
- Rewrite mergeNVMeTempsIntoDisks to match SMART temps by WWN → serial → devpath
- Preserve legacy NVMe temperature support for backward compatibility
Performance considerations:
- SMART data cached for 30 minutes per node to avoid excessive smartctl calls
- Background refresh prevents blocking temperature requests
- Respects drive standby state to avoid spinning up idle arrays
- Staggered disk scanning with 0.1s delay to avoid saturating SATA controllers
Install script:
- Deploys wrapper to /usr/local/bin/pulse-sensor-wrapper.sh
- Updates SSH forced command from "sensors -j" to wrapper script
- Backward compatible - falls back to direct sensors output if wrapper missing
Testing note:
- Requires real hardware with smartmontools installed for full functionality
- Empty smart array returned gracefully when smartctl unavailable
- Legacy sensor-only nodes continue working without changes
Related to #571
This addresses multiple temperature monitoring issues:
1. Fix single-node Proxmox installation failure: Add '|| true' to pvecm status
calls to prevent script exit on standalone (non-clustered) nodes with
'set -euo pipefail'. The script now properly falls through to standalone
node configuration when cluster detection fails.
2. Build pulse-sensor-proxy for all Linux architectures (amd64, arm64, armv7)
in Dockerfile to ensure binaries are available for download on all supported
platforms. This resolves the missing binary issue from v4.23.0.
Note: AMD Tctl sensor support was already implemented in a previous commit.
This addresses GitHub Discussion #605 where users were unclear about
configuring the temperature proxy when running Pulse in Docker.
Changes:
**install-sensor-proxy.sh:**
- Add Docker-specific post-install instructions when --standalone flag is used
- Show required docker-compose.yml bind mount configuration
- Provide verification commands for Docker deployments
- Link to full documentation for troubleshooting
**TEMPERATURE_MONITORING.md:**
- Add prominent "Quick Start for Docker Deployments" section at the top
- Move Docker instructions earlier in the document for better visibility
- Provide complete 4-step setup process with verification commands
These changes ensure Docker users immediately see:
1. How to install the proxy on the Proxmox host
2. What bind mount to add to docker-compose.yml
3. How to restart and verify the setup
4. Where to find detailed troubleshooting
The installer now provides actionable next steps instead of just
confirming installation, reducing confusion for containerized deployments.
Major improvements to the install script based on comprehensive review:
## 1. Temperature Monitoring - No Restart Required ✨
- Ask about temperature monitoring BEFORE container creation (not after)
- Add bind mount during `pct create` instead of requiring restart later
- Quick mode defaults to "yes", Advanced mode asks user
- Host path: /run/pulse-sensor-proxy → /mnt/pulse-proxy in container
- Support --skip-restart flag in install-sensor-proxy.sh
- Eliminates disruptive container restart on fresh installs
## 2. Shell Injection Prevention 🔒
- Replace `eval pct create` with array-based command building
- Prevents quoting bugs with special characters in hostnames/nameservers
- Safer handling of user input in container creation
## 3. Non-Interactive Install Support 🤖
- Replace bare `read` with `safe_read_with_default` in prompts
- Prevents hangs when running `curl | bash` non-interactively
- Proper fallback to sensible defaults
## 4. Cleanup on Interrupt 🧹
- Track container ID globally during creation
- Properly cleanup orphaned containers on Ctrl+C/SIGTERM
- New handle_install_interrupt() function
- Prevents leftover containers after cancelled installs
## 5. Air-Gapped Network Support 🌐
- Replace 8.8.8.8 ping check with `hostname -I` IP detection
- Supports restricted/firewalled networks where external ping fails
- More reliable for DHCP-only environments
Changes:
- install.sh: Refactor temperature prompt timing and mount setup
- install.sh: Convert pct create to array-based args (lines 1018-1055)
- install.sh: Add handle_install_interrupt trap (lines 38-48)
- install.sh: Replace ping check with IP detection (line 1082)
- scripts/install-sensor-proxy.sh: Add --skip-restart flag support
- scripts/install-sensor-proxy.sh: Improve mount detection and updates
Impact:
- Fresh installs now complete without any container restarts
- Temperature monitoring works immediately after first boot
- Safer and more robust for automation/CI scenarios
- Better experience on restricted networks
Co-authored-by: Codex AI
The installer was configuring pulse-backend.service.d but the actual
service is pulse.service, so the PULSE_SENSOR_PROXY_SOCKET environment
variable wasn't being set.
Changed: pulse-backend.service → pulse.service
This ensures Pulse actually uses the proxy socket for temperature
monitoring instead of attempting SSH connections.
Changed curl flags from -fsSL to -fSL to enable error output.
The -s flag was silencing all curl errors including SSL/TLS issues,
making it impossible to diagnose download failures.
With -S (show errors), stderr now captures meaningful error messages
like certificate problems, connection failures, etc.
- Back up container config before making mount modifications
- Restore original config if socket verification fails
- Clean up backup file on success or when verification is skipped
- Leave host-level resources (user, binary, service) in place for idempotency
This ensures failed installations don't leave containers in an
inconsistent state while keeping successfully installed host services
for faster re-runs.
Setup script improvements (config_handlers.go):
- Remove redundant mount configuration and container restart logic
- Let installer handle all mount/restart operations (single source of truth)
- Eliminate hard-coded mp0 assumption
Installer improvements (install-sensor-proxy.sh):
- Add mount configuration persistence validation via pct config check
- Surface pct set errors instead of silencing with 2>/dev/null
- Capture and display curl download errors with temp files
- Check systemd daemon-reload/enable/restart exit codes
- Show journalctl output when service fails to start
- Make socket verification fatal (was warning)
- Provide clear manual steps when hot-plug fails on running container
This makes the installation fail fast with actionable error messages
instead of silently proceeding with broken configuration.
Added clear messaging to explain why the socket bind mount is configured,
focusing on the security benefits rather than technical implementation.
Changes:
- Add explanatory header "Secure Container Communication Setup"
- Explain the three key benefits:
• Container communicates via Unix socket (not SSH)
• No SSH keys exposed inside container (enhanced security)
• Proxy on host manages all temperature collection
- Update technical messages to be more user-friendly:
• "Configuring socket bind mount" instead of "Ensuring..."
• "Restarting container to activate secure communication"
• "Verifying secure communication channel"
• "✓ Secure socket communication ready"
• "Configuring Pulse to use proxy"
This helps users understand WHY the bind mount exists (security) rather
than just seeing technical implementation details.
Instead of warning the user to restart the container manually, the script
now automatically restarts it when the socket mount configuration is
updated. This ensures the mount is immediately active and temperature
monitoring works right away without user intervention.
Uses 'pct restart' if running, 'pct start' if stopped.
The install-sensor-proxy.sh script now tries GitHub releases first, then falls
back to downloading from the Pulse server if GitHub fails or doesn't have the
binary (common when building from main).
The LXC installer sets PULSE_SENSOR_PROXY_FALLBACK_URL to point to the Pulse
server running inside the newly created LXC, ensuring the proxy binary can be
downloaded from /api/install/pulse-sensor-proxy.
This fixes the issue where installing with --main would fail to install
pulse-sensor-proxy on the host because GitHub releases don't include it yet.
Adds a one-command Docker deployment flow that:
- Detects if running in LXC and installs Docker if needed
- Automatically installs pulse-sensor-proxy on the Proxmox host
- Configures bind mount for proxy socket into LXC
- Generates optimized docker-compose.yml with proxy socket
- Enables temperature monitoring via host-side proxy
The install-docker.sh script handles the complete setup including:
- Docker installation (if needed)
- ACL configuration for container UIDs
- Bind mount setup
- Automatic apparmor=unconfined for socket access
Accessible via: curl -sSL http://pulse:7655/api/install/install-docker.sh | bash
- Cleanup script now detects forced command restriction on standalone nodes
- Logs helpful message explaining limitation (security by design)
- Does not fail when standalone nodes cannot be cleaned up
- Documents that standalone node cleanup is limited by forced command security
- Automatic cleanup works fully for cluster nodes
- Manual cleanup command provided for standalone nodes if needed
- Cleanup script now tries proxy's SSH key first for standalone nodes
- Falls back to default SSH if proxy key not available
- Fixes cleanup failure when Proxmox host doesn't have direct SSH to standalone nodes
- Create cleanup script that removes Pulse SSH keys from nodes
- Add systemd path unit to watch for cleanup requests
- Add systemd service to execute cleanup script
- Update install-sensor-proxy.sh to install cleanup system
- Handles both cluster nodes (pulse-managed-key) and standalone nodes (pulse-proxy-key)
- Cleanup is triggered automatically when nodes are deleted from Pulse
- All cleanup actions are logged via syslog for auditability
Implements automatic temperature monitoring setup for standalone
Proxmox/Pimox nodes without manual SSH key configuration.
Changes:
- Add /api/system/proxy-public-key endpoint to expose proxy's SSH public key
- Setup script now detects standalone nodes (non-cluster)
- Auto-fetches and installs proxy SSH key with forced commands
- Add Raspberry Pi temperature support via cpu_thermal and /sys/class/thermal
- Enhance setup script with better error handling for lm-sensors installation
- Add RPi detection to skip lm-sensors and use native thermal interface
Security:
- Public key endpoint is safe (public keys are meant to be public)
- All installed keys use forced command="sensors -j" with full restrictions
- No shell access, port forwarding, or other SSH features enabled
Fixes two issues with the sensor proxy installation:
1. Local node IP detection now uses exact matching instead of substring matching to avoid false negatives
2. Removes duplicate output filtering in the setup script wrapper
These changes ensure that the proxy SSH key is correctly configured on the local node during cluster installations.
Implements automated cleanup workflow when nodes are deleted from Pulse, removing all monitoring footprint from the host. Changes include a new RPC handler in the sensor proxy for cleanup requests, enhanced node deletion modal with detailed cleanup explanations, and improved SSH key management with proper tagging for atomic updates.
Improvements to pulse-sensor-proxy:
- Fix cluster discovery to use pvecm status for IP addresses instead of node names
- Add standalone node support for non-clustered Proxmox hosts
- Enhanced SSH key push with detailed logging, success/failure tracking, and error reporting
- Add --pulse-server flag to installer for custom Pulse URLs
- Configure www-data group membership for Proxmox IPC access
UI and API cleanup:
- Remove unused "Ensure cluster keys" button from Settings
- Remove /api/diagnostics/temperature-proxy/ensure-cluster-keys endpoint
- Remove EnsureClusterKeys method from tempproxy client
The setup script already handles SSH key distribution during initial configuration,
making the manual refresh button redundant.
Made the setup and installation output more concise and reassuring for users. Less verbosity, clearer messaging.
**Setup script improvements:**
- Changed "Container Detection" → "Enhanced Security"
- Simplified prompts: "Enable secure proxy? [Y/n]"
- Cleaned up success messages: "✓ Secure proxy architecture enabled"
- Removed verbose status messages (node-by-node cleanup output)
- Only show essential information users need to see
**install-sensor-proxy.sh improvements:**
- Added --quiet flag to suppress verbose output
- In quiet mode, only shows: "✓ pulse-sensor-proxy installed and running"
- Full output still available when run manually
- Removed redundant "Installation complete!" banners
- Cleaner legacy key cleanup messaging
**Result:**
Users see a clean, professional installation flow that builds confidence. Technical details are hidden unless needed. Messages are clear and reassuring rather than verbose.
When pulse-sensor-proxy is installed, automatically remove old SSH keys that were stored in the container for security.
Changes:
**install-sensor-proxy.sh:**
- Checks container for SSH private keys (id_rsa, id_ed25519, etc.)
- Removes any found keys from container
- Warns user that legacy keys were cleaned up
- Explains proxy now handles SSH
**Setup script (config_handlers.go):**
- After successful proxy install, removes old SSH keys from all cluster nodes
- Cleans up authorized_keys entries that match the old container-based key
- Keeps only proxy-managed keys (pulse-sensor-proxy comment)
This provides a clean migration path from the old direct-SSH method to the secure proxy architecture. Users upgrading from pre-v4.24 versions get automatic cleanup of insecure container-stored keys.
Complete the pulse-sensor-proxy rename by updating the installer script name and all references to it.
Updated:
- Renamed scripts/install-temp-proxy.sh → scripts/install-sensor-proxy.sh
- Updated all documentation references
- Updated install.sh references
- Updated build-release.sh comments
2025-10-13 13:23:53 +00:00
Renamed from scripts/install-temp-proxy.sh (Browse further)