Related to #547 and #622
## Samsung SSD Fix (#547)
Samsung 980 and 990 series SSDs have known firmware bugs that cause them to
report incorrect health status (typically FAILED or critical warnings) even
when the drives are actually healthy. This is commonly due to incorrect
temperature threshold reporting in the firmware.
This change adds special handling to detect these drives and skip health
status alerts while still monitoring wearout metrics, which remain reliable.
The fix also clears any existing false alerts for these drives.
Users experiencing these false alerts should update their Samsung SSD firmware
to the latest version from Samsung, which typically resolves the issue.
## Docker Agent CPU Fix (#622)
Addresses issue where Docker container CPU usage shows 0%. The Docker
agent uses ContainerStatsOneShot which typically doesn't populate
PreCPUStats, requiring manual delta tracking between collection cycles.
Changes:
- Fix logic bug where prevContainerCPU was updated before checking if
previous sample existed, causing incorrect delta calculations
- Add comprehensive debug logging showing which calculation method
succeeded (PreCPUStats, system delta, or time-based fallback)
- Add warning after 10 PreCPUStats failures to inform about manual
tracking mode (normal for one-shot stats)
- Add detailed failure logging when CPU calculation cannot complete
Expected behavior: First collection cycle returns 0% (no previous
sample), subsequent cycles show accurate CPU metrics.
This commit addresses multiple issues in the Docker/host agent removal flow:
Agent Stop Fix:
- Add systemctl stop command after agent acknowledgement to prevent systemd restart
- Previous behavior: agent disabled but systemd immediately restarted it (Restart=always)
- New behavior: agent disables itself, sends ack, then stops systemd service completely
UX Improvements:
- Add real-time elapsed time counter during removal wait
- Show progress indicators prominently (no longer hidden in dropdown)
- Display expected time range (30-60 seconds) and last heartbeat
- Auto-show timeout warning after 2 minutes with actionable "Force remove" button
- Add contextual help explaining what's happening at each stage
Security Enhancement:
- Automatically revoke API tokens when removing Docker/host agents
- Previous behavior: tokens remained valid after agent removal
- New behavior: tokens are revoked and persisted immediately on removal
- Prevents removed agents from re-authenticating with old credentials
Extends Docker container monitoring with comprehensive disk and storage information:
- Writable layer size and root filesystem usage displayed in new Disk column
- Block I/O statistics (read/write bytes totals) shown in container drawer
- Mount metadata including type, source, destination, mode, and driver details
- Configurable via --collect-disk flag (enabled by default, can be disabled for large fleets)
Also fixes config watcher to consistently use production auth config path instead of following PULSE_DATA_DIR when in mock mode.