Pulse/cmd/pulse-sensor-proxy
rcourtman d3875eaae5 Dramatically improve temperature proxy installation robustness
Users were abandoning Pulse due to catastrophic temperature monitoring setup failures. This commit addresses the root causes:

**Problem 1: Silent Failures**
- Installations reported "SUCCESS" even when proxy never started
- UI showed green checkmarks with no temperature data
- Zero feedback when things went wrong

**Problem 2: Missing Diagnostics**
- Service failures logged only in journald
- Users saw "Something going on with the proxy" with no actionable guidance
- No way to troubleshoot from error messages

**Problem 3: Standalone Node Issues**
- Proxy daemon continuously logged pvecm errors as warnings
- "ipcc_send_rec" and "Unknown error -1" messages confused users
- These messages are expected on non-clustered/LXC setups

**Solutions Implemented:**

1. **Health Gate in install.sh (lines 1588-1629)**
   - Verify service is running after installation
   - Check socket exists on host
   - Confirm socket visible inside container via bind mount
   - Fail loudly with specific diagnostics if any check fails

2. **Actionable Error Messages in install-sensor-proxy.sh (lines 822-877)**
   - When service fails to start: dump the full systemctl status output plus 40 lines of logs
   - When socket missing: show permissions, service status, and remediation command
   - Include common issues checklist (missing user, permission errors, lm-sensors, etc.)
   - Direct link to troubleshooting docs

3. **Better Standalone Node Detection in ssh.go (lines 585-595)**
   - Recognize "Unknown error -1" and "Unable to load access control list" as LXC indicators
   - Log at INFO level (not WARN) since this is expected behavior
   - Clarify message: "using localhost for temperature collection"
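
The classification described in item 3 could look roughly like the sketch below. This is not the code from ssh.go: the names `standaloneIndicators`, `isStandaloneOutput`, and `logLevelFor` are hypothetical, and only the two substrings named above are matched. It shows how expected pvecm failures can be routed to INFO while everything else stays at WARN.

```go
package main

import (
	"fmt"
	"strings"
)

// standaloneIndicators holds the two substrings the commit message names as
// signs of a standalone/LXC setup. The variable itself is hypothetical.
var standaloneIndicators = []string{
	"Unknown error -1",
	"Unable to load access control list",
}

// isStandaloneOutput reports whether pvecm output contains one of the
// indicators above, meaning the failure is expected rather than a real fault.
func isStandaloneOutput(output string) bool {
	for _, indicator := range standaloneIndicators {
		if strings.Contains(output, indicator) {
			return true
		}
	}
	return false
}

// logLevelFor returns "INFO" for expected standalone failures and "WARN"
// for anything else.
func logLevelFor(output string) string {
	if isStandaloneOutput(output) {
		return "INFO"
	}
	return "WARN"
}

func main() {
	samples := []string{
		"ipcc_send_rec[1] failed: Unknown error -1",
		"ssh: connect to host: Connection refused",
	}
	for _, s := range samples {
		fmt.Printf("%s: %s\n", logLevelFor(s), s)
	}
}
```

Substring matching keeps the check tolerant of prefixes such as "ipcc_send_rec[1] failed:" that may precede the indicator text.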

**Impact:**
- Eliminates "green checkmark but no temps" scenario
- Users get immediate actionable feedback on failures
- Standalone/LXC installations work silently without error spam
- Reduces support burden from #571 (15+ comments of user frustration)

Related to #571
2025-11-13 10:14:19 +00:00

| File | Last commit | Date |
| --- | --- | --- |
| audit.go | Make pulse-sensor-proxy resilient to read-only filesystems | 2025-11-06 00:18:51 +00:00 |
| audit_test.go | Make pulse-sensor-proxy resilient to read-only filesystems | 2025-11-06 00:18:51 +00:00 |
| auth.go | feat(security): Implement range-based rate limiting | 2025-11-07 17:08:45 +00:00 |
| auth_test.go | feat(security): Implement GID authorization enforcement | 2025-11-07 17:09:16 +00:00 |
| capabilities.go | feat(security): Add capability-based authorization | 2025-11-07 17:09:32 +00:00 |
| cleanup.go | feat: add comprehensive node cleanup system | 2025-10-17 18:53:45 +00:00 |
| config.example.yaml | feat(security): Add node allowlist validation to prevent SSRF attacks | 2025-11-07 17:08:28 +00:00 |
| config.go | feat(security): Add node allowlist validation to prevent SSRF attacks | 2025-11-07 17:08:28 +00:00 |
| main.go | Fix persistent temperature monitoring issues for standalone Proxmox nodes (addresses #571) | 2025-11-09 16:53:14 +00:00 |
| main_test.go | feat(security): Add capability-based authorization | 2025-11-07 17:09:32 +00:00 |
| metrics.go | feat(security): Add node allowlist validation to prevent SSRF attacks | 2025-11-07 17:08:28 +00:00 |
| ssh.go | Dramatically improve temperature proxy installation robustness | 2025-11-13 10:14:19 +00:00 |
| ssh_test.go | Fix persistent temperature monitoring issues for standalone Proxmox nodes (addresses #571) | 2025-11-09 16:53:14 +00:00 |
| throttle.go | feat(security): Implement range-based rate limiting | 2025-11-07 17:08:45 +00:00 |
| throttle_test.go | feat(security): Implement range-based rate limiting | 2025-11-07 17:08:45 +00:00 |
| validation.go | Improve sensor proxy cluster validation (Related to #703) | 2025-11-12 19:17:45 +00:00 |
| validation_fuzz_test.go | security: complete Phase 1 sensor proxy hardening | 2025-10-20 15:13:37 +00:00 |
| validation_test.go | Improve sensor proxy cluster validation (Related to #703) | 2025-11-12 19:17:45 +00:00 |