vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-04-28 11:30:15 +00:00

Author	SHA1	Message	Date
rcourtman	cedf0c8f0f	fix(temperature): parse string sensor values without zeroing readings (#1224)	2026-02-09 14:00:09 +00:00
rcourtman	1e77763870	feat: improve monitoring and temperature handling Temperature Monitoring: - Enhance temperature collection and processing - Add temperature tests Monitor Improvements: - Improve monitor reload handling - Add reload tests Test Coverage: - Add Ceph monitoring tests - Add Docker commands tests - Add host agent temperature tests - Add extra coverage tests	2026-01-24 22:43:31 +00:00
rcourtman	7049f5b43c	refactor: simplify temperature monitoring after sensor proxy removal Remove proxy-related temperature code paths: - temperature.go: remove proxy client integration and fallback logic - config.go: remove SensorProxyEnabled and related config fields - monitor.go: remove proxy client initialization and state Temperature monitoring now relies solely on the unified agent approach.	2026-01-21 12:00:28 +00:00
rcourtman	c81bbba8a3	perf: Use strconv.Itoa instead of fmt.Sprintf for int conversion strconv.Itoa is faster than fmt.Sprintf("%d", ...) because it doesn't need to parse a format string. Changed 4 occurrences in monitoring package where integers are converted to strings.	2025-12-02 15:21:41 +00:00
rcourtman	02a00473f2	fix: Make parseNVMeTemps deterministic by checking Composite before Sensor 1 Go map iteration order is non-deterministic, causing flaky test failures. This fix ensures "Composite" sensor is always preferred over "Sensor 1" by checking them in separate loops rather than relying on iteration order.	2025-12-01 20:03:05 +00:00
rcourtman	b370799988	chore: remove more dead code Remove 330 lines of unreachable code: - internal/monitoring/temperature_service.go: unused temperature service abstraction - internal/monitoring/temperature.go: unused NewTemperatureCollector wrapper - internal/mock/generator.go: unused GenerateAlerts function - internal/mock/integration.go: unused ToggleMockMode wrapper - internal/notifications/notifications.go: unused sendEmailWithContent, generatePayloadFromTemplate, isPrivateRange172, groupAlerts - internal/notifications/email_providers.go: unused GetProviderDefaults	2025-11-27 00:10:55 +00:00
courtmanr@gmail.com	a68f2de3e6	Relax container SSH check for temperature monitoring (ref #727)	2025-11-25 08:00:08 +00:00
courtmanr@gmail.com	d23ab9a7f7	Feat: Add support for Raspberry Pi RP1 ADC temperature sensor (Fixes #745) - Added 'rp1_adc' to the list of recognized CPU temperature chips	2025-11-23 22:33:16 +00:00
courtmanr@gmail.com	78308cbc10	Fix: Prevent single node auth failure from disabling global SSH temperature collection - Removed global legacySSHDisabled flag that was triggered by any single node auth failure - Changed disableLegacySSHOnAuthFailure to only log warnings - Fixed potential context leak in monitor.go - Updated tests to reflect removal of global disable logic	2025-11-23 22:24:15 +00:00
rcourtman	596bdbfb13	Handle standby SMART temps and capture disk identity	2025-11-22 07:35:13 +00:00
rcourtman	a03f8115b6	Improve installer temperature proxy and backup polling	2025-11-18 18:42:33 +00:00
rcourtman	13daa61d1d	Harden turnkey install and proxy auto-registration	2025-11-18 00:24:50 +00:00
rcourtman	f9341ae1fc	Improve temperature proxy workflow	2025-11-17 14:25:46 +00:00
rcourtman	47d5c14aef	Improve temperature proxy control-plane flow	2025-11-15 21:49:51 +00:00
rcourtman	aa357e5013	Fix HTTP mode for pulse-sensor-proxy and improve installer safety ## HTTP Server Fixes - Add source IP middleware to enforce allowed_source_subnets - Fix missing source subnet validation for external HTTP requests - HTTP health endpoint now respects subnet restrictions ## Installer Improvements - Auto-configure allowed_source_subnets with Pulse server IP - Add cluster node hostnames to allowed_nodes (not just IPs) - Fix node validation to accept both hostnames and IPs - Add Pulse server reachability check before installation - Add port availability check for HTTP mode - Add automatic rollback on service startup failure - Add HTTP endpoint health check after installation - Fix config backup and deduplication (prevent duplicate keys) - Fix IPv4 validation with loopback rejection - Improve registration retry logic with detailed errors - Add automatic LXC bind mount cleanup on uninstall ## Temperature Collection Fixes - Add local temperature collection for self-monitoring nodes - Fix node identifier matching (use hostname not SSH host) - Fix JSON double-encoding in HTTP client response Related to #XXX (temperature monitoring fixes)	2025-11-13 18:22:36 +00:00
rcourtman	2ee693cc63	Add HTTP mode to pulse-sensor-proxy for multi-instance temperature monitoring This implements HTTP/HTTPS support for pulse-sensor-proxy to enable temperature monitoring across multiple separate Proxmox instances. Architecture changes: - Dual-mode operation: Unix socket (local) + HTTPS (remote) - Unix socket remains default for security/performance (no breaking change) - HTTP mode enables temps from external PVE hosts Backend implementation: - Add HTTPS server with TLS + Bearer token authentication to sensor-proxy - Add TemperatureProxyURL and TemperatureProxyToken fields to PVEInstance - Add HTTP client (internal/tempproxy/http_client.go) for remote proxy calls - Update temperature collector to prefer HTTP proxy when configured - Fallback logic: HTTP proxy → Unix socket → direct SSH (if not containerized) Configuration: - pulse-sensor-proxy config: http_enabled, http_listen_addr, http_tls_cert/key, http_auth_token - PVEInstance config: temperature_proxy_url, temperature_proxy_token - Environment variables: PULSE_SENSOR_PROXY_HTTP_* for all HTTP settings Security: - TLS 1.2+ with modern cipher suites - Constant-time token comparison (timing attack prevention) - Rate limiting applied to HTTP requests (shared with socket mode) - Audit logging for all HTTP requests Next steps: - Update installer script to support HTTP mode + auto-registration - Add Pulse API endpoint for proxy registration - Generate TLS certificates during installation - Test multi-instance temperature collection Related to #571 (multi-instance architecture)	2025-11-13 16:13:53 +00:00
rcourtman	48fabdd827	Improve Docker temperature monitoring documentation for clarity (related to #600) Updated the Quick Start for Docker section in TEMPERATURE_MONITORING.md to be more user-friendly and address common setup issues: - Added clear explanation of why the proxy is needed (containers can't access hardware) - Provided concrete IP example instead of placeholder - Showed full docker-compose.yml context with proper YAML structure - Added sudo to commands where needed - Updated docker-compose commands to v2 syntax with note about v1 - Expanded verification steps with clearer success indicators - Added reminder to check container name in verification commands These improvements should help users who encounter blank temperature displays due to missing proxy installation or bind mount configuration.	2025-11-07 15:09:42 +00:00
rcourtman	2a79d57f73	Add SMART temperature collection for physical disks (related to #652) Extends temperature monitoring to collect SMART temps for SATA/SAS disks, addressing issue #652 where physical disk temperatures showed as empty. Architecture: - Deploys pulse-sensor-wrapper.sh as SSH forced command on Proxmox nodes - Wrapper collects both CPU/GPU temps (sensors -j) and disk temps (smartctl) - Implements 30-min cache with background refresh to avoid performance impact - Uses smartctl -n standby,after to skip sleeping drives without waking them - Returns unified JSON: {sensors: {...}, smart: [...]} Backend changes: - Add DiskTemp model with device, serial, WWN, temperature, lastUpdated - Extend Temperature model with SMART []DiskTemp field and HasSMART flag - Add WWN field to PhysicalDisk for reliable disk matching - Update parseSensorsJSON to handle both legacy and new wrapper formats - Rewrite mergeNVMeTempsIntoDisks to match SMART temps by WWN → serial → devpath - Preserve legacy NVMe temperature support for backward compatibility Performance considerations: - SMART data cached for 30 minutes per node to avoid excessive smartctl calls - Background refresh prevents blocking temperature requests - Respects drive standby state to avoid spinning up idle arrays - Staggered disk scanning with 0.1s delay to avoid saturating SATA controllers Install script: - Deploys wrapper to /usr/local/bin/pulse-sensor-wrapper.sh - Updates SSH forced command from "sensors -j" to wrapper script - Backward compatible - falls back to direct sensors output if wrapper missing Testing note: - Requires real hardware with smartmontools installed for full functionality - Empty smart array returned gracefully when smartctl unavailable - Legacy sensor-only nodes continue working without changes	2025-11-07 11:46:57 +00:00
rcourtman	dfe960deb4	Fix container SSH detection and improve troubleshooting for issue #617 Related to #617 This fixes a misconfiguration scenario where Docker containers could attempt direct SSH connections (producing [preauth] log spam) instead of using the sensor proxy. Changes: - Fix container detection to check PULSE_DOCKER=true in addition to system.InContainer() heuristics (both temperature.go and config_handlers.go) - Upgrade temperature collection log from Error to Warn with actionable guidance about mounting the proxy socket - Add Info log when dev mode override is active so operators understand the security posture - Add troubleshooting section to docs for SSH [preauth] logs from containers The container detection was inconsistent - monitor.go checked both flags but temperature.go and config_handlers.go only checked InContainer(). Now all locations consistently check PULSE_DOCKER || InContainer().	2025-11-06 09:57:53 +00:00
rcourtman	12dc8693c4	Add NVIDIA GPU temperature monitoring support (nouveau driver) - Add nouveau chip recognition to temperature parser - Implement parseNouveauGPUTemps() for NVIDIA GPU temps via nouveau driver - Map "GPU core" sensor to edge temperature field - Supports systems using open-source nouveau driver This complements the AMD GPU support added previously. Systems using the nouveau driver will now see NVIDIA GPU temperatures in the dashboard. For proprietary nvidia driver users, GPU temps are not available via lm-sensors and would require nvidia-smi integration.	2025-11-06 00:24:42 +00:00
rcourtman	d62259ffa7	Add AMD GPU temperature monitoring support Related to #600 - Add GPU field to Temperature model with edge, junction, and mem sensors - Add amdgpu chip recognition to temperature parser - Implement parseGPUTemps() to extract AMD GPU temperature data - Update frontend TypeScript types to include GPU temperatures - Display GPU temps in node table tooltip alongside CPU temps - Set hasGPU flag when GPU data is available This enables temperature monitoring for AMD GPUs (amdgpu sensors) that was previously being collected via SSH but silently discarded during parsing.	2025-11-06 00:19:04 +00:00
rcourtman	e21a72578f	Add configurable SSH port for temperature monitoring Related to #595 This change adds support for custom SSH ports when collecting temperature data from Proxmox nodes, resolving issues for users who run SSH on non-standard ports. Why SSH is still needed: Temperature monitoring requires reading /sys/class/hwmon sensors on Proxmox nodes, which is not exposed via the Proxmox API. Even when using API tokens for authentication, Pulse needs SSH access to collect temperature data. Changes: - Add `sshPort` configuration to SystemSettings (system.json) - Add `SSHPort` field to Config with environment variable support (SSH_PORT) - Add per-node SSH port override capability for PVE, PBS, and PMG instances - Update TemperatureCollector to accept and use custom SSH port - Update SSH known_hosts manager to support non-standard ports - Add NewTemperatureCollectorWithPort() constructor with port parameter - Maintain backward compatibility with NewTemperatureCollector() (uses port 22) - Update frontend TypeScript types for SSH port configuration Configuration methods: 1. Environment variable: SSH_PORT=2222 2. system.json: {"sshPort": 2222} 3. Per-node override in nodes.enc (future UI support) Default behavior: - Defaults to port 22 if not configured - Maintains full backward compatibility - No changes required for existing deployments The implementation includes proper ssh-keyscan port handling and known_hosts management for non-standard ports using [host]:port notation per SSH standards.	2025-11-05 20:03:29 +00:00
rcourtman	6404b6a5fc	Expand temperature sensor compatibility for SuperIO and AMD CPUs Users with NCT6687 SuperIO chips and AMD processors reporting only chiplet temperatures were unable to see CPU temperature data. Added support for Nuvoton/Winbond/Fintek SuperIO chips and AMD Tccd chiplet temperatures, with debug logging to aid troubleshooting unsupported sensor configurations. Related to discussion #586	2025-11-05 18:47:21 +00:00
rcourtman	10862db4e4	Enhance container detection for temperature SSH safeguards (refs #601)	2025-11-04 22:30:35 +00:00
rcourtman	5c4be1921c	chore: snapshot current changes	2025-11-02 22:47:55 +00:00
rcourtman	dd2beffc8c	Stop legacy temperature SSH retries when auth fails (#595)	2025-10-22 19:35:51 +00:00
rcourtman	30879c3b7b	Handle AMD Tctl temperature readings (refs #586)	2025-10-22 12:58:34 +00:00
rcourtman	524f42cc28	security: complete Phase 1 sensor proxy hardening Implements comprehensive security hardening for pulse-sensor-proxy: - Privilege drop from root to unprivileged user (UID 995) - Hash-chained tamper-evident audit logging with remote forwarding - Per-UID rate limiting (0.2 QPS, burst 2) with concurrency caps - Enhanced command validation with 10+ attack pattern tests - Fuzz testing (7M+ executions, 0 crashes) - SSH hardening, AppArmor/seccomp profiles, operational runbooks All 27 Phase 1 tasks complete. Ready for production deployment.	2025-10-20 15:13:37 +00:00
Richard Courtman	de3bb47930	fix: improve turnkey temperature monitoring for standalone nodes - Fix script input handling to work with standard curl | bash pattern by prioritizing /dev/tty - Add Raspberry Pi temperature sensor support (cpu_thermal chip and generic temp sensors) - Add comprehensive documentation for turnkey standalone node setup - Fix printf formatting error in setup script	2025-10-18 06:51:56 +00:00
Richard Courtman	669d7dc05c	feat: add turnkey temperature monitoring for standalone nodes Implements automatic temperature monitoring setup for standalone Proxmox/Pimox nodes without manual SSH key configuration. Changes: - Add /api/system/proxy-public-key endpoint to expose proxy's SSH public key - Setup script now detects standalone nodes (non-cluster) - Auto-fetches and installs proxy SSH key with forced commands - Add Raspberry Pi temperature support via cpu_thermal and /sys/class/thermal - Enhance setup script with better error handling for lm-sensors installation - Add RPi detection to skip lm-sensors and use native thermal interface Security: - Public key endpoint is safe (public keys are meant to be public) - All installed keys use forced command="sensors -j" with full restrictions - No shell access, port forwarding, or other SSH features enabled	2025-10-17 22:15:50 +00:00
rcourtman	123e0f04ca	feat: add comprehensive node cleanup system Implements automated cleanup workflow when nodes are deleted from Pulse, removing all monitoring footprint from the host. Changes include a new RPC handler in the sensor proxy for cleanup requests, enhanced node deletion modal with detailed cleanup explanations, and improved SSH key management with proper tagging for atomic updates.	2025-10-17 18:53:45 +00:00
rcourtman	dd9bd65a2e	fix: Add hasCPU/hasNVMe flags to prevent false 'no CPU sensor' errors Addresses #101 v4.23.0 introduced a regression where systems with only NVMe temperatures (no CPU sensor) would display "No CPU sensor" in the UI. This was caused by the Available flag being set to true when NVMe temps existed, even without CPU data, triggering the error message in the frontend. Backend changes: - Add HasCPU and HasNVMe boolean fields to Temperature model - Extend CPU sensor detection to support more chip types: zenpower, k8temp, acpitz, it87 (case-insensitive matching) - HasCPU is set based on CPU chip detection (coretemp, k10temp, etc.), not value thresholds - This prevents false negatives when sensors report 0°C during resets - CPU temperature values now accepted even when 0 (checked with !IsNaN instead of > 0) - extractTempInput returns NaN instead of 0 when no data found - Available flag means "any temperature data exists" for backward compatibility - Update mock generator to properly set the new flags - Add unit tests for NVMe-only and 0°C scenarios to prevent regression - Removed amd_energy from CPU chip list (power sensor, not temperature) Frontend changes: - Add hasCPU and hasNVMe optional fields to Temperature interface - Update NodeSummaryTable to check hasCPU flag with fallback to available for backward compatibility with older API responses - Update NodeCard temperature display logic with same fallback pattern - Systems with only NVMe temps now show "-" instead of error message - Fallback ensures UI works with both old and new API responses Testing: - All unit tests pass including NVMe-only and 0°C test cases - Fix prevents false "no CPU sensor" errors when sensors temporarily report 0°C - Fix eliminates false "no CPU sensor" errors for NVMe-only systems	2025-10-13 10:17:17 +00:00
rcourtman	e7bc338891	feat: Implement secure temperature proxy for containerized deployments Addresses #528 Introduces pulse-temp-proxy architecture to eliminate SSH key exposure in containers: Architecture: - pulse-temp-proxy runs on Proxmox host (outside LXC/Docker) - SSH keys stored on host filesystem (/var/lib/pulse-temp-proxy/ssh/) - Pulse communicates via unix socket (bind-mounted into container) - Proxy handles cluster discovery, key rollout, and temperature fetching Components: - cmd/pulse-temp-proxy: Standalone Go binary with unix socket RPC server - internal/tempproxy: Client library for Pulse backend - scripts/install-temp-proxy.sh: Idempotent installer for existing deployments - scripts/pulse-temp-proxy.service: Systemd service for proxy Integration: - Pulse automatically detects and uses proxy when socket exists - Falls back to direct SSH for native installations - Installer automatically configures proxy for new LXC deployments - Existing LXC users can upgrade by running install-temp-proxy.sh Security improvements: - Container compromise no longer exposes SSH keys - SSH keys never enter container filesystem - Maintains forced command restrictions - Transparent to users - no workflow changes Documentation: - Updated TEMPERATURE_MONITORING.md with new architecture - Added verification steps and upgrade instructions - Preserved legacy documentation for native installs	2025-10-12 21:35:35 +00:00
rcourtman	274f36daa8	Improve dashboard responsiveness and temperature handling	2025-10-12 10:34:06 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00

35 commits