Commit graph

33 commits

Author SHA1 Message Date
rcourtman
b4a33c4f2d Fix offline buffering: add tests, remove unused config, fix flaky test
- Add unit tests for internal/buffer package
- Fix misleading "ring buffer" comment (it's a bounded FIFO queue)
- Remove unused BufferCapacity config field from both agents
- Rewrite flaky integration test to use polling instead of fixed sleeps
2025-12-02 22:31:44 +00:00
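
The commit above describes the buffer as a bounded FIFO queue rather than a ring buffer. A minimal sketch of that data structure follows. The Queue type, its method names, and the drop-oldest policy when full are assumptions for illustration, not the actual internal/buffer API.

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a bounded FIFO queue that is safe for concurrent use.
// Hypothetical type for illustration; not the real internal/buffer API.
type Queue[T any] struct {
	mu       sync.Mutex
	items    []T
	capacity int
}

// NewQueue returns a queue holding at most capacity items (minimum 1).
func NewQueue[T any](capacity int) *Queue[T] {
	if capacity < 1 {
		capacity = 1
	}
	return &Queue[T]{capacity: capacity}
}

// Push appends item, evicting the oldest entry first if the queue is full.
// Dropping the oldest entry is an assumed policy.
func (q *Queue[T]) Push(item T) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) >= q.capacity {
		q.items = q.items[1:]
	}
	q.items = append(q.items, item)
}

// Pop removes and returns the oldest item; ok is false when the queue is empty.
func (q *Queue[T]) Pop() (item T, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return item, false
	}
	item = q.items[0]
	q.items = q.items[1:]
	return item, true
}

// Len reports how many items are currently buffered.
func (q *Queue[T]) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.items)
}

func main() {
	q := NewQueue[string](2)
	q.Push("report-1")
	q.Push("report-2")
	q.Push("report-3") // queue is full, so report-1 is evicted

	// Drain in FIFO order, as an agent might after reconnecting.
	for r, ok := q.Pop(); ok; r, ok = q.Pop() {
		fmt.Println(r) // prints report-2, then report-3
	}
}
```

Evicting the oldest report keeps the most recent data when an outage outlasts the buffer. Rejecting new items instead would be an equally valid policy for a bounded queue.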
courtmanr@gmail.com
caf0c10206 feat: Implement offline buffering for host and docker agents
- Add internal/buffer package with generic ring buffer
- Add buffering logic to host agent for failed reports
- Add buffering logic to docker agent for failed reports
- Add BufferCapacity configuration option
- Add integration tests for buffering logic
2025-12-02 22:12:47 +00:00
rcourtman
4f824ab148 style: Apply gofmt to 37 files
Standardize code formatting across test files and monitor.go.
No functional changes.
2025-12-02 17:21:48 +00:00
rcourtman
8360ed8916 Add unit tests for dockeragent runtime detection functions
Test coverage for:
- detectRuntime: 11 test cases covering podman/docker detection via
  endpoint path, InitBinary, ServerVersion, DriverStatus, SecurityOptions
- buildRuntimeCandidates: 6 test cases verifying candidate ordering,
  deduplication, and preference-based filtering
- randomDuration: 5 test cases for boundary conditions and randomness
- determineSelfUpdateArch: validates architecture detection output

Coverage increased from 17.5% to 21.4%.
2025-11-30 08:47:46 +00:00
rcourtman
1fc4807a07 Add unit tests for Docker swarm utility functions (dockeragent)
Test coverage for serviceMode, buildContainerIndex, lookupContainer,
copyStringMap, and isTaskCompletedState. 52 test cases covering service
mode detection, container index building/lookup, and task state classification.
Coverage improved from 14.7% to 17.5%.
2025-11-30 05:32:52 +00:00
rcourtman
943c2b5082 Add unit tests for extractPodmanMetadata and detectHostRemovedError (dockeragent)
- TestExtractPodmanMetadata: 14 test cases covering pod metadata, infra
  containers, compose metadata, auto-update settings, user namespace
  handling, whitespace trimming, and precedence rules
- TestDetectHostRemovedError: 11 test cases covering JSON parsing,
  case-insensitive matching, error code validation, and edge cases

Coverage improved from 11.7% to 14.7% for internal/dockeragent.
2025-11-30 05:03:31 +00:00
rcourtman
b1bc704e3a Consolidate duplicate normalizeVersion functions into shared utility
- Move normalizeVersion to utils.NormalizeVersion for single source of truth
- Update agentupdate and dockeragent packages to use shared function
- Add 14 test cases for version normalization

This prevents bugs like issue #773, where a fix was applied to one copy
but not the other, causing an update loop.
2025-11-29 22:57:33 +00:00
rcourtman
97d3f30a7f Add unit tests for dockeragent utility functions
Tests for calculateCPUPercent, calculateMemoryUsage, safeFloat,
parseTime, trimLeadingSlash, and summarizeBlockIO. 28 test cases
covering edge cases like zero deltas, cache handling, NaN/Inf,
and case-insensitive op matching.

Coverage improved from 8.0% to 11.7%.
2025-11-29 22:18:10 +00:00
rcourtman
04d1e1bcf4 Fix standalone docker agent version comparison prefix mismatch
The unified agent got the version normalization fix (1b866598), but the
standalone docker agent's checkForUpdates() still used direct string
comparison. When server returns "4.34.0" and agent has "v4.34.0", this
caused an infinite self-update loop.

Apply the same normalizeVersion() function used in the unified agent.

Related to #773
2025-11-29 00:04:43 +00:00
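
The prefix mismatch described above can be sketched as follows. The normalizeVersion name comes from the commit message, but its body here is an assumption (trim whitespace and a leading "v"), and needsUpdate is a hypothetical helper standing in for the comparison inside checkForUpdates().

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeVersion strips surrounding whitespace and a leading "v" so that
// "v4.34.0" and "4.34.0" compare equal. Assumed behaviour, not the real code.
func normalizeVersion(v string) string {
	return strings.TrimPrefix(strings.TrimSpace(v), "v")
}

// needsUpdate is a hypothetical helper: it reports whether the agent and
// server versions differ after normalization.
func needsUpdate(agentVersion, serverVersion string) bool {
	return normalizeVersion(agentVersion) != normalizeVersion(serverVersion)
}

func main() {
	agent, server := "v4.34.0", "4.34.0"

	// Direct string comparison sees a difference and would trigger an update.
	fmt.Println(agent != server) // prints true

	// Normalized comparison treats the versions as identical.
	fmt.Println(needsUpdate(agent, server)) // prints false
}
```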
rcourtman
b5798012fc Fix Docker CPU calculation on systemUsage counter reset
When systemUsage counter goes backward (common in unprivileged LXC
containers), the previous code used the absolute value as systemDelta.
This created an artificially small denominator, inflating CPU to ~100%.

Now leaves systemDelta as 0 on counter reset, falling through to the
time-based calculation which produces accurate results.

Related to #770
2025-11-28 15:07:49 +00:00
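
A minimal sketch of the counter-reset handling described above. The sample type, the cpuPercent function, and both formulas are assumptions for illustration. The point is only that systemDelta stays 0 when the system counter goes backward, so the time-based path is used.

```go
package main

import "fmt"

// sample holds cumulative CPU counters in nanoseconds plus the collection
// time in nanoseconds. Hypothetical type for illustration.
type sample struct {
	containerCPU uint64 // CPU time consumed by the container
	systemCPU    uint64 // CPU time consumed by the whole host
	atNanos      int64  // when the sample was taken
}

// cpuPercent returns container CPU usage where 100 means one full core.
func cpuPercent(prev, cur sample, onlineCPUs int) float64 {
	if cur.containerCPU <= prev.containerCPU || onlineCPUs < 1 {
		return 0
	}
	cpuDelta := float64(cur.containerCPU - prev.containerCPU)

	// Leave systemDelta at 0 when the system counter went backward,
	// instead of using the absolute difference.
	var systemDelta float64
	if cur.systemCPU > prev.systemCPU {
		systemDelta = float64(cur.systemCPU - prev.systemCPU)
	}
	if systemDelta > 0 {
		return cpuDelta / systemDelta * float64(onlineCPUs) * 100
	}

	// Time-based fallback: CPU nanoseconds used per wall-clock nanosecond.
	elapsed := cur.atNanos - prev.atNanos
	if elapsed <= 0 {
		return 0
	}
	return cpuDelta / float64(elapsed) * 100
}

func main() {
	prev := sample{containerCPU: 1_000_000_000, systemCPU: 50_000_000_000, atNanos: 0}
	// systemCPU dropped from 50s to 10s: a counter reset.
	cur := sample{containerCPU: 1_500_000_000, systemCPU: 10_000_000_000, atNanos: 10_000_000_000}

	// 0.5s of CPU over 10s of wall time.
	fmt.Printf("%.1f%%\n", cpuPercent(prev, cur, 4)) // prints 5.0%
}
```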
rcourtman
d425bc3df4 fix: multiple agent installation and update issues
- Default enableDocker to false in UI to prevent unintended Docker
  agent activation on host-only installs (Related to #766)
- Deploy agent scripts and binaries during web UI upgrades, not just
  the main binary (Related to #760)
- Apply symlink resolution fix to standalone docker agent self-update
  to prevent cross-device rename failures (Related to #737)
2025-11-27 15:49:03 +00:00
rcourtman
8152197207 fix: mark unused parameters to satisfy unparam linter
Mark intentionally unused parameters with underscore to:
- Silence unparam warnings for legitimate unused parameters
- Keep function signatures intact for API compatibility
- Remove unused req from serveChecksum helper
2025-11-27 10:12:48 +00:00
rcourtman
e1b9c133c3 fix: remove ineffectual assignments
- Fix loop variable reassignment in config_handlers.go
- Remove redundant boolean assignments in swarm.go
2025-11-27 09:48:29 +00:00
rcourtman
f76c1fb43b chore: update to non-deprecated Docker SDK types
- Use container.Summary instead of types.Container
- Use swarmtypes.ServiceListOptions instead of types.ServiceListOptions
- Use swarmtypes.TaskListOptions instead of types.TaskListOptions

These types were deprecated in favor of package-specific types.
2025-11-27 09:36:05 +00:00
rcourtman
dc4669f9f6 security: harden agent installers and auto-update mechanism
Install script (scripts/install.sh):
- Add multi-platform support: Unraid, OpenRC/Alpine, Synology DSM 6/7
- Add input validation for URL, token format, and interval
- Add binary magic verification (ELF/Mach-O/PE)
- Add cleanup trap for temp files
- Wrap script in main() for partial download protection
- Fix shellcheck compliance issues
- Add curl timeouts

Agent auto-update (agentupdate, dockeragent):
- Enforce TLS 1.2 minimum version
- Make SHA256 checksum verification mandatory
- Add 100MB binary size limit
- Add binary magic verification before replacement
- Add Unraid persistent binary update after self-update
- Add 5-minute download timeout

Frontend:
- Update Linux install description to note auto-detection of init systems
2025-11-26 13:14:58 +00:00
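
The auto-update checks listed above (size limit, mandatory SHA256 checksum, binary magic verification) could be combined roughly as in this sketch. The function names, the error messages, and the reading of "100MB" as 100 MiB are assumptions. The magic numbers are the standard ELF, PE, and Mach-O file signatures.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// maxBinarySize is assumed to be 100 MiB; the commit only says "100MB".
const maxBinarySize = 100 << 20

// executableMagics lists the leading bytes of common executable formats.
var executableMagics = [][]byte{
	{0x7f, 'E', 'L', 'F'},    // ELF
	{'M', 'Z'},               // PE (DOS header)
	{0xfe, 0xed, 0xfa, 0xce}, // Mach-O 32-bit
	{0xfe, 0xed, 0xfa, 0xcf}, // Mach-O 64-bit
	{0xce, 0xfa, 0xed, 0xfe}, // Mach-O 32-bit, byte-swapped
	{0xcf, 0xfa, 0xed, 0xfe}, // Mach-O 64-bit, byte-swapped
	{0xca, 0xfe, 0xba, 0xbe}, // Mach-O universal binary
}

// hasExecutableMagic reports whether data starts with a known signature.
func hasExecutableMagic(data []byte) bool {
	for _, magic := range executableMagics {
		if bytes.HasPrefix(data, magic) {
			return true
		}
	}
	return false
}

// checksumHex returns the lowercase hex SHA256 digest of data.
func checksumHex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// verifyBinary is a hypothetical pre-replacement check for a downloaded binary.
func verifyBinary(data []byte, expectedSHA256 string) error {
	if len(data) > maxBinarySize {
		return fmt.Errorf("binary is %d bytes, limit is %d", len(data), maxBinarySize)
	}
	expected := strings.TrimSpace(expectedSHA256)
	if expected == "" {
		return errors.New("checksum is required")
	}
	if !strings.EqualFold(checksumHex(data), expected) {
		return errors.New("checksum mismatch")
	}
	if !hasExecutableMagic(data) {
		return errors.New("not an ELF, Mach-O, or PE binary")
	}
	return nil
}

func main() {
	elf := []byte{0x7f, 'E', 'L', 'F', 2, 1, 1, 0}
	fmt.Println(verifyBinary(elf, checksumHex(elf))) // prints <nil>

	script := []byte("#!/bin/sh\n")
	fmt.Println(verifyBinary(script, checksumHex(script))) // prints not an ELF, Mach-O, or PE binary

	fmt.Println(verifyBinary(elf, "")) // prints checksum is required
}
```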
rcourtman
9daf1d5398 fix: cache daemon ID at init to prevent Podman token binding conflicts
Podman can return unstable or empty daemon IDs across API calls. When
the agent fetched info.ID on every report cycle, this could cause the
agent identity to change mid-session, triggering "token already in use"
errors on the server.

Cache the daemon ID at initialization and use it consistently for all
reports.

Related to #740
2025-11-26 10:23:22 +00:00
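
A minimal sketch of the caching approach described above. The Agent type, NewAgent, ReportID, and the injected fetchDaemonID function are hypothetical names, and rejecting an empty ID at startup is an assumed policy. The sketch only shows the ID being read once and reused for every report.

```go
package main

import (
	"errors"
	"fmt"
)

// Agent keeps the daemon ID captured at initialization.
// Hypothetical type for illustration.
type Agent struct {
	daemonID string
}

// NewAgent fetches the daemon ID exactly once. fetchDaemonID is a
// hypothetical stand-in for reading info.ID from the container runtime.
func NewAgent(fetchDaemonID func() (string, error)) (*Agent, error) {
	id, err := fetchDaemonID()
	if err != nil {
		return nil, fmt.Errorf("fetch daemon ID: %w", err)
	}
	if id == "" {
		return nil, errors.New("daemon returned an empty ID")
	}
	return &Agent{daemonID: id}, nil
}

// ReportID returns the cached ID; it never calls the runtime again.
func (a *Agent) ReportID() string {
	return a.daemonID
}

func main() {
	calls := 0
	// Simulates a runtime whose ID changes on every API call.
	unstable := func() (string, error) {
		calls++
		return fmt.Sprintf("daemon-%d", calls), nil
	}

	agent, err := NewAgent(unstable)
	if err != nil {
		panic(err)
	}
	fmt.Println(agent.ReportID(), agent.ReportID(), calls) // prints daemon-1 daemon-1 1
}
```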
rcourtman
ae3b78d661 fix: propagate unified agent version and improve legacy cleanup
Issues found during scenario testing:

1. Version propagation: The hostagent and dockeragent packages were
   reporting their own Version (0.1.0-dev) instead of the unified
   agent's version. Added AgentVersion config field to pass the
   parent's version down.

2. macOS legacy cleanup: The install.sh script was missing cleanup
   for pulse-docker-agent on macOS.

3. Windows legacy cleanup: The install.ps1 script was missing cleanup
   for legacy PulseHostAgent and PulseDockerAgent services.

These fixes ensure:
- Unified agent reports consistent version across host/docker metrics
- Legacy agents are properly removed on all platforms during upgrade
- Users migrating from legacy agents get a clean transition
2025-11-25 23:39:10 +00:00
rcourtman
ea335546fc feat: improve legacy agent detection and migration UX
Add seamless migration path from legacy agents to unified agent:

- Add AgentType field to report payloads (unified vs legacy detection)
- Update server to detect legacy agents by type instead of version
- Add UI banner showing upgrade command when legacy agents are detected
- Add deprecation notice to install-host-agent.ps1
- Create install-docker-agent.sh stub that redirects to unified installer

Legacy agents (pulse-host-agent, pulse-docker-agent) now show a "Legacy"
badge in the UI with a one-click copy command to upgrade to the unified
agent.
2025-11-25 23:26:22 +00:00
courtmanr@gmail.com
4640633430 Improve agent update logging and installer warnings (related to #737)
2025-11-23 22:07:37 +00:00
rcourtman
6fb839cbdf Add log level control for docker agent
Related to #742
2025-11-22 07:43:48 +00:00
rcourtman
fdcec85931 Fix critical version embedding issues for 4.26 release
Addresses the root cause of issue #631 (infinite Docker agent restart loop)
and prevents similar issues with host-agent and sensor-proxy.

Changes:
- Set dockeragent.Version default to "dev" instead of hardcoded version
- Add version embedding to server build in Dockerfile
- Add version embedding to host-agent builds (all platforms)
- Add version embedding to sensor-proxy builds (all platforms)

This ensures:
1. Server's /api/agent/version endpoint returns correct v4.26.0
2. Downloaded agent binaries have matching embedded versions
3. Dev builds skip auto-update (Version="dev")
4. No version mismatch triggers infinite restart loops

Related to #631
2025-11-06 11:42:52 +00:00
rcourtman
b44084af3c Skip false health alerts for Samsung 980/990 SSDs and improve Docker CPU calculation
Related to #547 and #622

## Samsung SSD Fix (#547)
Samsung 980 and 990 series SSDs have known firmware bugs that cause them to
report incorrect health status (typically FAILED or critical warnings) even
when the drives are actually healthy. This is commonly due to incorrect
temperature threshold reporting in the firmware.

This change adds special handling to detect these drives and skip health
status alerts while still monitoring wearout metrics, which remain reliable.
The fix also clears any existing false alerts for these drives.

Users experiencing these false alerts should update their Samsung SSD firmware
to the latest version from Samsung, which typically resolves the issue.

## Docker Agent CPU Fix (#622)
Addresses an issue where Docker container CPU usage shows 0%. The Docker
agent uses ContainerStatsOneShot which typically doesn't populate
PreCPUStats, requiring manual delta tracking between collection cycles.

Changes:
- Fix logic bug where prevContainerCPU was updated before checking if
  previous sample existed, causing incorrect delta calculations
- Add comprehensive debug logging showing which calculation method
  succeeded (PreCPUStats, system delta, or time-based fallback)
- Add warning after 10 PreCPUStats failures to inform about manual
  tracking mode (normal for one-shot stats)
- Add detailed failure logging when CPU calculation cannot complete

Expected behavior: First collection cycle returns 0% (no previous
sample), subsequent cycles show accurate CPU metrics.
2025-11-05 19:33:16 +00:00
rcourtman
adda6eea38 Update docker CPU metrics and add OpenRC installer support (Refs #255)
2025-11-04 22:16:50 +00:00
rcourtman
6eb1a10d9b Refactor: Code cleanup and localStorage consolidation
This commit includes comprehensive codebase cleanup and refactoring:

## Code Cleanup
- Remove dead TypeScript code (types/monitoring.ts - 194 lines duplicate)
- Remove unused Go functions (GetClusterNodes, MigratePassword, GetClusterHealthInfo)
- Clean up commented-out code blocks across multiple files
- Remove unused TypeScript exports (helpTextClass, private tag color helpers)
- Delete obsolete test files and components

## localStorage Consolidation
- Centralize all storage keys into STORAGE_KEYS constant
- Update 5 files to use centralized keys:
  * utils/apiClient.ts (AUTH, LEGACY_TOKEN)
  * components/Dashboard/Dashboard.tsx (GUEST_METADATA)
  * components/Docker/DockerHosts.tsx (DOCKER_METADATA)
  * App.tsx (PLATFORMS_SEEN)
  * stores/updates.ts (UPDATES)
- Benefits: Single source of truth, prevents typos, better maintainability

## Previous Work Committed
- Docker monitoring improvements and disk metrics
- Security enhancements and setup fixes
- API refactoring and cleanup
- Documentation updates
- Build system improvements

## Testing
- All frontend tests pass (29 tests)
- All Go tests pass (15 packages)
- Production build successful
- Zero breaking changes

Total: 186 files changed, 5825 insertions(+), 11602 deletions(-)
2025-11-04 21:50:46 +00:00
rcourtman
5c4be1921c chore: snapshot current changes
2025-11-02 22:47:55 +00:00
rcourtman
730c6bf864 Fix Docker agent removal and improve security
This commit addresses multiple issues in the Docker/host agent removal flow:

Agent Stop Fix:
- Add systemctl stop command after agent acknowledgement to prevent systemd restart
- Previous behavior: agent disabled but systemd immediately restarted it (Restart=always)
- New behavior: agent disables itself, sends ack, then stops systemd service completely

UX Improvements:
- Add real-time elapsed time counter during removal wait
- Show progress indicators prominently (no longer hidden in dropdown)
- Display expected time range (30-60 seconds) and last heartbeat
- Auto-show timeout warning after 2 minutes with actionable "Force remove" button
- Add contextual help explaining what's happening at each stage

Security Enhancement:
- Automatically revoke API tokens when removing Docker/host agents
- Previous behavior: tokens remained valid after agent removal
- New behavior: tokens are revoked and persisted immediately on removal
- Prevents removed agents from re-authenticating with old credentials
2025-10-29 12:27:36 +00:00
rcourtman
32392d1212 Add disk metrics, block I/O, and mount details to Docker monitoring
Extends Docker container monitoring with comprehensive disk and storage information:
- Writable layer size and root filesystem usage displayed in new Disk column
- Block I/O statistics (read/write bytes totals) shown in container drawer
- Mount metadata including type, source, destination, mode, and driver details
- Configurable via --collect-disk flag (enabled by default, can be disabled for large fleets)

Also fixes the config watcher to consistently use the production auth config path instead of following PULSE_DATA_DIR when in mock mode.
2025-10-29 12:05:36 +00:00
rcourtman
f2acdd59af Normalize docker agent version handling
2025-10-28 08:42:58 +00:00
rcourtman
68ce8e7520 feat: finalize swarm service monitoring (#598)
2025-10-26 09:35:49 +00:00
rcourtman
8e83eaf823 Add container state filtering to Docker agent
2025-10-25 21:40:59 +00:00
rcourtman
79dc620b34 Docker agent: add arch-aware self-update download
Refs #526
2025-10-16 08:43:59 +00:00
rcourtman
91fecacfef feat: add docker agent command handling
2025-10-15 19:27:19 +00:00
rcourtman
f46ff1792b Fix settings security tab navigation
2025-10-11 23:29:47 +00:00