When the agent runs as root, os.Getuid() returns 0 so it only probes
/run/user/0/docker.sock. Rootless Docker installs live at
/run/user/1000/docker.sock (or similar). Glob /run/user/*/docker.sock
and /run/user/*/podman/podman.sock to discover sockets for all users.
Probe ~/.docker/run/docker.sock for RuntimeDocker and RuntimeAuto
before falling back to /var/run/docker.sock. This lets the agent
connect on macOS without requiring DOCKER_HOST to be set manually.
Ref #1200
Docker container URL preserved on update (#1054): container updates
recreate the container with a new runtime ID. The agent now includes
{oldContainerId, newContainerId} in the completion ACK payload; the
server uses this to copy persisted metadata (custom URLs, descriptions,
tags) to the new ID so nothing is lost. Migration is a copy, not a move,
so rollback scenarios still find metadata under the original ID.
Reduce metrics.db write amplification (#1124): add a UNIQUE index on
(resource_type, resource_id, metric_type, timestamp, tier) so rollup
reprocessing after a failed checkpoint uses INSERT OR IGNORE instead of
creating duplicate rows. Existing duplicates are deduplicated once on
startup if the index creation would otherwise fail. Also sets
wal_autocheckpoint(500) to checkpoint the WAL more frequently, preventing
unbounded WAL growth.
Fixes#1054Fixes#1124
FreeBSD auto-update (#1254): determineArch() now includes freebsd in its
OS switch, producing freebsd-amd64/arm64 instead of falling through to
a uname -m fallback that incorrectly returned linux-<arch>. FreeBSD agents
were downloading Linux ELF binaries and failing to exec them.
Docker rootless socket (#1200): buildRuntimeCandidates() now probes
/run/user/<uid>/docker.sock before the system-wide /var/run/docker.sock,
enabling auto-detection of Docker rootless installations.
Duplicate PVE/PBS hosts (#1245, #1252): handleSecureAutoRegister() now
deduplicates by host URL, updating the existing instance's token in-place
instead of appending a duplicate entry on each re-run of the setup script.
Fixes#1254Fixes#1200Fixes#1245Fixes#1252
(cherry picked from commit 0f1d9e9b9fea6c8b9e65872e8a78e25f93653eef)
Docker's one-shot stats API (stream=false) returns PreCPUStats from the
daemon's internal cache, which many Docker versions don't update between
non-streaming reads. This causes every call to return the same stale
PreCPUStats from container start, producing a constant lifetime-average
CPU% (e.g. 3.4%) instead of current usage.
Switch to always using manual delta tracking, which stores the previous
sample from our own reads and computes accurate deltas between collection
cycles. The first cycle returns 0 while establishing a baseline; all
subsequent cycles produce correct current CPU percentages.
The Docker agent was not passing the disk exclusion list to
hostmetricsCollect(), so excluded mounts appeared in the Docker tab
disk totals. Also add server-side fsfilters filtering to Docker
report processing for parity with the host agent path.
- Add PBS/PMG polling interval environment variable overrides in config.go
- Fix temp path expectation in detect_root_test.go using filepath.Join
- Use EvalSymlinks for symlink target comparison in self_update_test.go
- Add Linux-only skip for MAC fallback test in agent_new_test.go
- Add OS-aware RAID/SMART assertions in agent_metrics_test.go
The agent was crashing with 'fatal error: concurrent map writes' when
handleCheckUpdatesCommand spawned a goroutine that called collectOnce
concurrently with the main collection loop. Both code paths access
a.prevContainerCPU without synchronization.
Added a.cpuMu mutex to protect all accesses to prevContainerCPU in:
- pruneStaleCPUSamples()
- collectContainer() delete operation
- calculateContainerCPUPercent()
Related to #1063
When Docker daemon runs inside an LXC container, it may report 0 for
MemTotal because it can't read the cgroup memory limits correctly.
This caused the UI to show "0B / 7GB" and trigger false alerts with
overflow percentages (214748364799.6%).
The fix checks if Docker's info.MemTotal is 0 and falls back to
gopsutil's /proc/meminfo reading (snapshot.Memory.TotalBytes) which
works correctly in LXC environments.
Fixes#1075
When a user's reverse proxy redirects HTTP to HTTPS, Go's default HTTP
client behavior converts POST requests to GET on 301/302 redirects
(per HTTP specification). This causes the Pulse server to return 405
"Only POST is allowed" errors.
Added CheckRedirect to all agent HTTP clients (host, docker, kubernetes)
that returns a clear error message guiding users to use the correct
protocol in their --url flag instead of silently following redirects.
Related to #1058
When --docker-runtime=podman is explicitly set, the agent should try
Podman-specific sockets first before falling back to environment
defaults (which try /var/run/docker.sock).
Also adds /var/run/podman/podman.sock as a candidate socket path,
which is used by CoreOS and some Fedora configurations.
Related to #1045
When a Docker agent tries to register with a token that's already bound
to another agent, the error was logged generically as "Failed to send
docker report". Users had to dig into logs to understand the issue.
Now logs a prominent error message:
"DOCKER REGISTRATION FAILED: This API token is already used by another
Docker agent. Each Docker host requires its own unique token. Generate
a new token in Pulse Settings > Agents and reinstall with the new token."
Related to #1027
When collecting reports, the runtime re-detection was passing RuntimeAuto
instead of the user's configured preference. This caused podman to switch
back to docker on systems like CoreOS where podman provides a docker-
compatible socket at /var/run/docker.sock.
Now the current runtime (set at init from user's --docker-runtime flag)
is passed as the preference, preventing spurious runtime switching.
Related to #1022
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add PULSE_DISABLE_DOCKER_UPDATE_CHECKS environment variable and
--disable-docker-update-checks flag to disable Docker image update
detection. This is useful for:
- Avoiding Docker Hub rate limits
- Users who don't want update notifications in their dashboard
Related to Discussion #982
When running as a unified agent (pulse-agent with --enable-docker), the
Docker module was using a different fallback chain for agent ID than the
host module. In unified mode with empty machineID, the Docker module fell
back to daemonID while the host module fell back to hostname. This caused
the server to reject Docker reports with 'token already in use by agent'
errors because the same API token was bound to different agent IDs.
The fix ensures that in unified mode, the Docker module uses the exact
same fallback chain as the host module: machineID -> hostname. The daemonID
fallback is only used in standalone mode for backward compatibility.
Fixes#985, #986
- Separated metrics collection into internal/dockeragent/collect.go
- Added agent self-update pre-flight check (--self-test)
- Implemented signed binary verification with key rotation for updates
- Added batch update support to frontend with parallel processing
- Cleaned up agent.go and added startup cleanup for backup containers
- Updated documentation for Docker features and agent security
Fixed an issue where all Docker containers were showing 'click to update'
even when they were up to date. The root cause was comparing the wrong
digest types:
- Previously: Compared ImageID (local config hash) vs registry manifest digest
- Now: Uses RepoDigests from image inspect, which is the actual manifest
digest that Docker received from the registry when pulling the image
For multi-arch images, the registry returns a manifest list digest, while
Docker stores the platform-specific image config digest locally. These
will never match, causing false positives for all multi-arch images.
Changes:
- Added ImageInspectWithRaw to dockerClient interface
- Added getImageRepoDigest method to extract RepoDigest from image
- Added matchesImageReference helper for Docker Hub naming conventions
- Added tests for matchesImageReference
Fixes#955
- Add comprehensive test coverage for agent report, flush buffer, and deps
- Expand flow, HTTP, CPU, and swarm test coverage
- Refactor registry access to use deps interface for better testability
- Add container update and self-update test scenarios
Allows specifying which IP address the agent should report, useful for:
- Multi-homed systems with separate management networks
- Systems with private monitoring interfaces
- VPN/overlay network scenarios
Usage:
pulse-agent --report-ip 192.168.1.100
PULSE_REPORT_IP=192.168.1.100 pulse-agent
Content was being streamed twice:
1. During each iteration of the tool loop (intended for intermediate feedback)
2. Again after the loop ended with finalContent (redundant)
This caused duplicate responses when using Ollama and other providers.
- Add container update command handling to unified agent
- Agent can now receive update_container commands from Pulse server
- Pulls latest image, stops container, creates backup, starts new container
- Automatic rollback on failure
- Backup container cleaned up after 5 minutes
- Added comprehensive test coverage for container update logic
BREAKING CHANGE: AI Patrol now uses EXACT alert thresholds by default
instead of warning 5-15% before the threshold.
Changes:
- Default behavior: Patrol warns at your configured threshold (e.g., 96% = warns at 96%)
- New setting: 'use_proactive_thresholds' enables the old early-warning behavior
- API: Added use_proactive_thresholds to GET/PUT /api/settings/ai
- Backend: Added SetProactiveMode/GetProactiveMode to PatrolService
- Backend: Added GetThresholds to PatrolService for UI display
- Tests: Updated and added tests for both exact and proactive modes
- Also fixed unused imports in dockeragent/agent.go
When proactive mode is disabled (default):
- Watch: threshold - 5% (slight buffer)
- Warning: exact threshold
When proactive mode is enabled:
- Watch: threshold - 15%
- Warning: threshold - 5%
Related to #951
The GuestURL field was missing from NodeFrontend and its converter,
causing configured Guest URLs to be ignored when clicking on cluster
node names. The frontend would fall back to the auto-detected IP
instead of using the user-configured Guest URL.
Related to #940
- Add registry checker tests (caching, enable/disable, parsing, concurrency)
- Add alert integration tests for update detection and Pro license gating
- Add API handler tests for /api/infra-updates endpoints
- Test cleanup of tracking maps when containers are removed
- Test threshold-based alerting behavior
- Add GHCR (GitHub Container Registry) token support for public images
- Clean up dockerUpdateFirstSeen tracking when containers are removed
- Improve UpdateIcon tooltip to show digest info
- Add cursor-help to indicate hoverable tooltip
- Add Env field to Container struct in pkg/agents/docker/report.go
- Extract env vars from inspect.Config.Env in Docker agent
- Mask sensitive values (password, secret, key, token, etc.) with ***
- Display env vars in container drawer with green badges (amber for masked)
- Add tests for maskSensitiveEnvVars function
Related to #916
Users can now exclude specific mount points from disk monitoring:
- Via CLI: --disk-exclude /mnt/backup --disk-exclude '/media/*'
- Via env: PULSE_DISK_EXCLUDE=/mnt/backup,*pbs*
Patterns support:
- Exact paths: /mnt/backup
- Prefix patterns: /mnt/ext*
- Contains patterns: *pbs*
This addresses the common case where external disks or
PBS datastores are being monitored but shouldn't be.
Related to #823
- Log payload size (in KB and bytes) at debug level
- Warn when payload approaches 400KB (512KB limit)
- Helps diagnose 'request body too large' errors
- Add unit tests for internal/buffer package
- Fix misleading "ring buffer" comment (it's a bounded FIFO queue)
- Remove unused BufferCapacity config field from both agents
- Rewrite flaky integration test to use polling instead of fixed sleeps
Test coverage for serviceMode, buildContainerIndex, lookupContainer,
copyStringMap, and isTaskCompletedState. 52 test cases covering service
mode detection, container index building/lookup, and task state classification.
Coverage improved from 14.7% to 17.5%.
- TestExtractPodmanMetadata: 14 test cases covering pod metadata, infra
containers, compose metadata, auto-update settings, user namespace
handling, whitespace trimming, and precedence rules
- TestDetectHostRemovedError: 11 test cases covering JSON parsing,
case-insensitive matching, error code validation, and edge cases
Coverage improved from 11.7% to 14.7% for internal/dockeragent.
- Move normalizeVersion to utils.NormalizeVersion for single source of truth
- Update agentupdate and dockeragent packages to use shared function
- Add 14 test cases for version normalization
This prevents bugs like issue #773 where a fix applied to one copy
but not the other caused an update loop.
Tests for calculateCPUPercent, calculateMemoryUsage, safeFloat,
parseTime, trimLeadingSlash, and summarizeBlockIO. 28 test cases
covering edge cases like zero deltas, cache handling, NaN/Inf,
and case-insensitive op matching.
Coverage improved from 8.0% to 11.7%.
The unified agent got the version normalization fix (1b866598), but the
standalone docker agent's checkForUpdates() still used direct string
comparison. When server returns "4.34.0" and agent has "v4.34.0", this
caused an infinite self-update loop.
Apply the same normalizeVersion() function used in the unified agent.
Related to #773
When systemUsage counter goes backward (common in unprivileged LXC
containers), the previous code used the absolute value as systemDelta.
This created an artificially small denominator, inflating CPU to ~100%.
Now leaves systemDelta as 0 on counter reset, falling through to the
time-based calculation which produces accurate results.
Related to #770
- Default enableDocker to false in UI to prevent unintended Docker
agent activation on host-only installs (Related to #766)
- Deploy agent scripts and binaries during web UI upgrades, not just
the main binary (Related to #760)
- Apply symlink resolution fix to standalone docker agent self-update
to prevent cross-device rename failures (Related to #737)
Mark intentionally unused parameters with underscore to:
- Silence unparam warnings for legitimate unused parameters
- Keep function signatures intact for API compatibility
- Remove unused req from serveChecksum helper