Fix persistent temperature monitoring issues for standalone Proxmox nodes (addresses #571)

This commit resolves the recurring temperature monitoring failures that have plagued multiple releases:

1. **Fix user mismatch (v4.27.1 regression)**:
   - Changed binary default user from 'pulse-sensor' to 'pulse-sensor-proxy'
   - Aligns with the user created by install-sensor-proxy.sh (line 389)
   - Prevents panic when binary is run outside systemd context
   - Systemd unit already uses User=pulse-sensor-proxy, so this makes manual runs work too

2. **Fix standalone node validation (v4.25.0+ regression)**:
   - pvecm status exits with code 2 on standalone nodes (not in a cluster)
   - This caused validation to fail, rejecting all temperature requests
   - Added discoverLocalHostAddresses() helper that discovers actual host IPs/hostnames
   - On standalone nodes, cluster membership list is populated with host's own addresses
   - Maintains SSRF protection while allowing standalone operation
   - Covered by the new TestDiscoverLocalHostAddresses test (see item 4)

3. **Make installer fail loudly on proxy setup failure**:
   - Previously, failed proxy installation only printed a warning
   - Install script then claimed "Pulse installation complete!" (confusing for users)
   - Now exits with clear error message and remediation steps
   - Forces operators to fix proxy issues before claiming success
   - Users who skip temperature monitoring are unaffected

4. **Add test coverage to prevent future regressions**:
   - Added TestDiscoverLocalHostAddresses to verify local address discovery
   - Asserts that no loopback (127.x.x.x, ::1) or IPv6 link-local (fe80:) addresses are returned
   - All existing tests pass with new changes
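
The standalone-node fallback described in item 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: `exitedWith`, `usableAddr`, `localHostAddresses`, `allowedNodes`, and `nodeAllowed` are hypothetical names, and the real `discoverLocalHostAddresses()` may gather addresses differently. The only facts taken from the message above are that `pvecm status` exits with code 2 on a standalone node and that the allow-list is then filled with the host's own addresses.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"os/exec"
)

// exitedWith reports whether err is a process exit with the given status code.
func exitedWith(err error, code int) bool {
	var exitErr *exec.ExitError
	return errors.As(err, &exitErr) && exitErr.ExitCode() == code
}

// usableAddr reports whether s is an IP literal that is not unspecified,
// loopback, or link-local. Non-IP strings return false.
func usableAddr(s string) bool {
	ip := net.ParseIP(s)
	return ip != nil && !ip.IsUnspecified() && !ip.IsLoopback() && !ip.IsLinkLocalUnicast()
}

// localHostAddresses (hypothetical) returns the hostname plus every usable
// interface IP of the machine it runs on.
func localHostAddresses() ([]string, error) {
	var out []string
	if host, err := os.Hostname(); err == nil && host != "" {
		out = append(out, host)
	}
	addrs, err := net.InterfaceAddrs()
	if err != nil && len(out) == 0 {
		return nil, err
	}
	for _, a := range addrs {
		// Interface addresses are normally *net.IPNet values (IP plus mask).
		if ipNet, ok := a.(*net.IPNet); ok && usableAddr(ipNet.IP.String()) {
			out = append(out, ipNet.IP.String())
		}
	}
	if len(out) == 0 {
		return nil, errors.New("no usable local addresses found")
	}
	return out, nil
}

// allowedNodes (hypothetical) picks the allow-list: the cluster members when
// the status command succeeded, the host's own addresses when it exited with
// code 2 (standalone node), and an error for any other failure.
func allowedNodes(clusterNodes []string, statusErr error) ([]string, error) {
	switch {
	case statusErr == nil:
		return clusterNodes, nil
	case exitedWith(statusErr, 2):
		return localHostAddresses()
	default:
		return nil, fmt.Errorf("pvecm status failed: %w", statusErr)
	}
}

// nodeAllowed keeps the SSRF guard: only exact allow-list entries pass.
func nodeAllowed(allowed []string, node string) bool {
	for _, a := range allowed {
		if a == node {
			return true
		}
	}
	return false
}

func main() {
	// Simulate the standalone case with a shell command that exits with status 2.
	statusErr := exec.Command("sh", "-c", "exit 2").Run()
	allowed, err := allowedNodes(nil, statusErr)
	if err != nil {
		fmt.Println("validation unavailable:", err)
		return
	}
	fmt.Println("allowed nodes:", allowed)
	fmt.Println("203.0.113.7 allowed:", nodeAllowed(allowed, "203.0.113.7"))
}
```

Treating only exit code 2 as "standalone" keeps every other `pvecm` failure an error, so an unexpected failure does not silently widen the allow-list.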

Pattern of failures across releases:
- v4.23.0: Missing proxy binaries in release
- v4.24.0-rc.3: AMD CPU sensor naming (Tctl vs Tdie)
- v4.25.0: Single-node pvecm status exit code
- v4.27.1: User mismatch (pulse-sensor vs pulse-sensor-proxy)

This change fixes the underlying causes (the user mismatch and the standalone-node validation failure) rather than adding another tactical patch.

Related to #571
rcourtman 2025-11-09 16:53:14 +00:00
parent 62a9f40cc7
commit c9d1671afd
4 changed files with 138 additions and 7 deletions

@@ -172,3 +172,34 @@ func TestReadAllWithLimit(t *testing.T) {
		t.Fatalf("expected unlimited read to return full data without exceeding")
	}
}

func TestDiscoverLocalHostAddresses(t *testing.T) {
	// This test verifies that discoverLocalHostAddresses returns valid addresses.
	// Results vary by host but should always include at least a hostname or an IP address.
	addresses, err := discoverLocalHostAddresses()
	if err != nil {
		t.Fatalf("discoverLocalHostAddresses failed: %v", err)
	}

	if len(addresses) == 0 {
		t.Fatal("expected at least one address from discoverLocalHostAddresses")
	}

	// Verify addresses are non-empty and are neither loopback nor IPv6 link-local
	for _, addr := range addresses {
		if addr == "" {
			t.Error("got empty address in results")
		}
		if addr == "127.0.0.1" || addr == "::1" {
			t.Errorf("discoverLocalHostAddresses should not return loopback address: %s", addr)
		}
		if strings.HasPrefix(addr, "127.") {
			t.Errorf("discoverLocalHostAddresses should not return loopback range: %s", addr)
		}
		if strings.HasPrefix(addr, "fe80:") {
			t.Errorf("discoverLocalHostAddresses should not return link-local IPv6: %s", addr)
		}
	}

	t.Logf("Discovered %d local addresses: %v", len(addresses), addresses)
}
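
The string-prefix checks in the test above are case-sensitive and do not cover IPv4 link-local addresses (169.254.0.0/16). As a possible alternative, the same intent could be expressed with the standard `net/netip` package. This is only a sketch: `isLoopbackOrLinkLocal` is a hypothetical helper that is not part of this commit, and it deliberately returns false for hostnames, since those cannot be classified without resolution.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isLoopbackOrLinkLocal (hypothetical) reports whether addr is an IP literal
// in a loopback or link-local unicast range. Hostnames and other non-IP
// strings return false.
func isLoopbackOrLinkLocal(addr string) bool {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false
	}
	// Treat IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) like plain IPv4.
	ip = ip.Unmap()
	return ip.IsLoopback() || ip.IsLinkLocalUnicast()
}

func main() {
	samples := []string{
		"127.0.0.53", "::1", "FE80::1", "fe80::1%eth0",
		"169.254.1.1", "192.168.1.10", "pve1",
	}
	for _, addr := range samples {
		fmt.Printf("%-14s rejected=%v\n", addr, isLoopbackOrLinkLocal(addr))
	}
}
```

Inside the test loop, the three address checks could then collapse into a single `if isLoopbackOrLinkLocal(addr) { t.Errorf(...) }`, assuming such a helper were added alongside the test.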