Pulse/internal/monitoring
rcourtman c9b4f7e88b Fix incorrect temperature data during cluster initialization
During cluster startup, nodes were temporarily using the primary cluster
endpoint for temperature collection before cluster metadata validation
completed. This caused all nodes to show the same (incorrect) temperature
values for ~4 minutes until validation finished and per-node endpoints
were established.

Example: minipc would show delly's temperature (90°C) instead of its own
(50°C) from startup until cluster validation completed.

Root cause:
- Temperature collection started immediately at startup
- Cluster endpoint validation happened asynchronously
- Code fell back to primary endpoint when ClusterEndpoints was empty
- All nodes used same endpoint, got same temperature data

Fix: Skip temperature collection for cluster nodes until both hold:
1. ClusterEndpoints array is populated (validation complete)
2. Node's specific endpoint is found in the cluster metadata
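
The two conditions above can be illustrated with a minimal Go sketch. It is not the code from monitor.go: the `ClusterEndpoint` struct, its fields, the `temperatureEndpoint` function, and the host values are hypothetical names chosen for illustration; only the idea of an empty-or-populated ClusterEndpoints list comes from the commit message.

```go
package main

import "fmt"

// ClusterEndpoint is a hypothetical stand-in for one validated cluster member.
type ClusterEndpoint struct {
	NodeName string
	Host     string
}

// temperatureEndpoint returns the host to query for a node's temperature and
// whether collection should run at all on this cycle.
// For cluster nodes it never falls back to the primary endpoint: it reports
// "skip" until validation has populated endpoints and the node is among them.
func temperatureEndpoint(isCluster bool, primaryHost string, endpoints []ClusterEndpoint, nodeName string) (string, bool) {
	if !isCluster {
		// Standalone node: the configured host is the node itself.
		return primaryHost, true
	}
	if len(endpoints) == 0 {
		// Condition 1 not met: cluster validation has not completed yet.
		return "", false
	}
	for _, ep := range endpoints {
		if ep.NodeName == nodeName {
			// Condition 2 met: use this node's own endpoint.
			return ep.Host, true
		}
	}
	// Endpoints exist, but none matches this node: skip rather than guess.
	return "", false
}

func main() {
	// Before validation: minipc is skipped instead of reading delly's sensor.
	host, ok := temperatureEndpoint(true, "delly.example", nil, "minipc")
	fmt.Printf("before validation: host=%q collect=%v\n", host, ok)

	// After validation: minipc resolves to its own endpoint.
	validated := []ClusterEndpoint{
		{NodeName: "delly", Host: "delly.example"},
		{NodeName: "minipc", Host: "minipc.example"},
	}
	host, ok = temperatureEndpoint(true, "delly.example", validated, "minipc")
	fmt.Printf("after validation: host=%q collect=%v\n", host, ok)
}
```

Returning a boolean "collect" flag instead of a fallback host means a skipped cycle yields no reading at all, which matches the stated goal that the first temperature shown for a node is its own.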

This ensures correct temperature data from the very first collection,
maintaining data integrity during startup. When persisted config exists,
endpoints are available immediately so no delay occurs. For new clusters,
temperature collection begins once validation completes (~30s).

Preserves Pulse's correctness guarantee: users can trust metrics
immediately after restart without waiting for a "warm-up" period.
2025-11-14 23:38:44 +00:00
backoff.go feat: implement error handling with circuit breakers and backoff (Phase 2 Task 7) 2025-10-20 15:13:37 +00:00
backoff_test.go test: add comprehensive unit tests for backoff and circuit breaker (Phase 2 Task 9a) 2025-10-20 15:13:38 +00:00
backup_guard_test.go Guard PBS backups from failed polls 2025-11-05 19:26:20 +00:00
ceph.go Fix settings security tab navigation 2025-10-11 23:29:47 +00:00
circuit_breaker.go feat: enhance scheduler health API with rich instance metadata 2025-10-20 15:13:38 +00:00
circuit_breaker_test.go test: add comprehensive unit tests for backoff and circuit breaker (Phase 2 Task 9a) 2025-10-20 15:13:38 +00:00
container_disk_usage.go feat: add professional logging with runtime configuration and performance optimization 2025-10-20 15:13:38 +00:00
diagnostic_snapshots.go Refine Proxmox node memory fallback (#582) 2025-10-22 15:36:26 +00:00
docker_commands.go feat: add docker agent command handling 2025-10-15 19:27:19 +00:00
docker_commands_test.go chore: snapshot current changes 2025-11-02 22:47:55 +00:00
fake_executor_integration.go test: add comprehensive integration test harness for adaptive polling (Phase 2 Task 9c) 2025-10-20 15:13:38 +00:00
fs_filters.go Filter read-only filesystems from host agent disk metrics (related to #690) 2025-11-12 09:47:02 +00:00
fs_filters_test.go Ignore read-only guest filesystems in disk aggregation 2025-10-14 16:13:53 +00:00
harness_integration.go Surface LXC interface IPs via PVE interfaces API (#596) 2025-10-23 08:07:32 +00:00
helpers_test.go Related to #692: Skip unsupported guest OS info calls 2025-11-12 19:17:09 +00:00
integration_integration_test.go test: add soak test with runtime instrumentation (Phase 2 Task 9d) 2025-10-20 15:13:38 +00:00
main_test.go Harden setup token flow and enforce encrypted persistence 2025-10-25 16:00:37 +00:00
metrics.go perf: reduce polling allocations and guest metadata load 2025-10-25 13:12:47 +00:00
metrics_history.go Fix settings security tab navigation 2025-10-11 23:29:47 +00:00
metrics_history_concurrency_test.go Fix settings security tab navigation 2025-10-11 23:29:47 +00:00
monitor.go Fix incorrect temperature data during cluster initialization 2025-11-14 23:38:44 +00:00
monitor_docker_test.go Ensure agent ID collisions respect token boundaries (Related to #658) 2025-11-12 22:46:56 +00:00
monitor_health_test.go feat: enhance scheduler health API with rich instance metadata 2025-10-20 15:13:38 +00:00
monitor_host_agents_test.go perf: reduce polling allocations and guest metadata load 2025-10-25 13:12:47 +00:00
monitor_memory_test.go Fix monitoring test panic and goroutine leaks 2025-11-11 23:52:24 +00:00
monitor_pmg_test.go Fix PMG API parameter issues causing 400 errors 2025-11-05 19:28:37 +00:00
monitor_polling.go Fix guest agent disk data regression on Proxmox 8.3+ 2025-11-06 18:42:46 +00:00
monitor_snapshots_test.go Fix inflated RAM usage reporting for LXC containers 2025-11-06 00:16:18 +00:00
monitor_storage_test.go Fix inflated RAM usage reporting for LXC containers 2025-11-06 00:16:18 +00:00
monitor_timeout_test.go monitoring: add poll watchdog to prevent worker leaks (refs #696) 2025-11-14 11:24:59 +00:00
poller.go feat: add professional logging with runtime configuration and performance optimization 2025-10-20 15:13:38 +00:00
ratetracker.go Fix settings security tab navigation 2025-10-11 23:29:47 +00:00
ratetracker_concurrency_test.go Fix settings security tab navigation 2025-10-11 23:29:47 +00:00
reload.go Propagate config updates to settings nodes (#588) 2025-10-22 13:45:13 +00:00
scheduler.go feat: enhance scheduler health API with rich instance metadata 2025-10-20 15:13:38 +00:00
staleness_tracker.go release: prepare v4.25.0 2025-10-22 10:46:18 +00:00
staleness_tracker_test.go test: add comprehensive staleness tracker unit tests (Phase 2 Task 9b) 2025-10-20 15:13:38 +00:00
storage_backup_preserve_test.go Preserve storage backups after partial failures (Related to #704) 2025-11-12 21:10:18 +00:00
task_queue.go perf: reduce polling allocations and guest metadata load 2025-10-25 13:12:47 +00:00
temperature.go Fix HTTP mode for pulse-sensor-proxy and improve installer safety 2025-11-13 18:22:36 +00:00
temperature_service.go Add configurable SSH port for temperature monitoring 2025-11-05 20:03:29 +00:00
temperature_test.go Expand temperature sensor compatibility for SuperIO and AMD CPUs 2025-11-05 18:47:21 +00:00