The cold-startup race: if Pulse and the agent restart together, the
monitor has no connection-health data when the agent calls
checkRegistrationWithPulse at startup. The server defaults to
registered=true (no known-disconnected entry), so the agent skips
re-registration even though the token is stale. The node stays broken
until the next manual agent restart.
Fix: after the initial runProxmoxSetup call, start a background goroutine
that waits 2 minutes (giving the monitor time to poll PVE and record
failure state), then rechecks every 5 minutes via RunHealthCheck.
RunHealthCheck only acts on types that have a local registration marker.
Types without a marker are skipped so that a temporarily unreachable
Pulse cannot trigger uncontrolled token rotation; those types need a full
startup setup cycle via RunAll.
Together with the two earlier commits this closes all three stale-token
scenarios: install-time 401, long-running stale state, and cold-startup
race.
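A minimal sketch of the delayed periodic recheck described above. The
function name runHealthLoop and the check callback are hypothetical
stand-ins (the callback plays the role of RunHealthCheck), and the sketch
assumes the first recheck runs right after the initial delay:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runHealthLoop waits initialDelay, runs check once, then repeats it every
// interval until ctx is cancelled. check stands in for RunHealthCheck.
func runHealthLoop(ctx context.Context, initialDelay, interval time.Duration, check func(context.Context)) {
	timer := time.NewTimer(initialDelay)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return
	case <-timer.C:
	}
	check(ctx)

	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			check(ctx)
		}
	}
}

// demo runs the loop with millisecond timings and reports how many checks ran.
func demo() int {
	ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
	defer cancel()
	runs := 0
	runHealthLoop(ctx, 10*time.Millisecond, 20*time.Millisecond, func(context.Context) { runs++ })
	return runs
}

func main() {
	// An agent would pass 2*time.Minute and 5*time.Minute and start the
	// loop in its own goroutine after the initial setup call.
	fmt.Println("checks run:", demo())
}
```

Waiting on ctx.Done() in both phases lets the loop exit promptly on agent
shutdown, including during the initial delay.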
Two gaps in the existing flow allowed a disconnected PVE node to stay
broken indefinitely even after the agent restarted:
1. Server-side: autoRegisteredNodeExists checked only that a PVE/PBS
instance existed in the config, not whether its connection was
healthy. A node with a stale token would return registered=true on
every check, causing the agent to skip re-registration forever.
Fixed: also consult GetConnectionStatuses(); return registered=false
when the monitor has a definitive disconnected entry so the agent can
rotate and re-register.
2. Agent-side: the type-specific registration marker was cleared only on
success. If rotation succeeded but the Pulse update failed (e.g. a
transient network error), the marker from a previous successful
registration persisted, so the next startup skipped setup again.
Fixed: clear the marker before entering the token setup/rotation
phase so any failure leaves the system in a retriable state.
Together these two fixes make the stale-token scenario self-healing:
the monitor detects the broken connection, the next agent startup sees
registered=false, clears its marker, rotates the token, and updates
Pulse — without manual intervention.
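The server-side decision can be sketched roughly as follows. This is an
illustration only: nodeRegistered and the map shape are hypothetical, not
the signatures of autoRegisteredNodeExists or GetConnectionStatuses. A
node with no status entry still reports registered=true, matching the
default described above:

```go
package main

import "fmt"

// nodeRegistered reports whether the agent may skip re-registration.
// connected maps a node name to its last known connection state; a missing
// key means the monitor has no data for that node yet.
func nodeRegistered(inConfig bool, connected map[string]bool, node string) bool {
	if !inConfig {
		return false
	}
	// Only a definitive "disconnected" entry flips the answer to false.
	if healthy, known := connected[node]; known && !healthy {
		return false
	}
	return true
}

func main() {
	statuses := map[string]bool{"pve1": true, "pve2": false}
	for _, n := range []string{"pve1", "pve2", "pve3"} {
		fmt.Printf("%s registered=%v\n", n, nodeRegistered(true, statuses, n))
	}
}
```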
When the agent is reinstalled on a Proxmox host, it rotates the PVE API
token in Proxmox. The Pulse server's /api/setup-script-url endpoint
requires the settings:write scope, but agent tokens only have
agent:report, so the request failed with a 401 and the update was
aborted, leaving Pulse with a stale token and a disconnected PVE node.
Three-part fix:
- server: accept agent API tokens on /api/auto-register for updating
existing nodes (new nodes still require setup-token auth)
- agent: fall through instead of aborting when setup token fetch returns
4xx; send X-API-Token header so the server can authenticate via the
agent token instead
- update: allow HTTP auto-update URLs for RFC 1918 private network
addresses (LAN installs without HTTPS no longer block auto-update)
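One way the private-address exception could look, as a sketch under
stated assumptions: allowUpdateURL is a hypothetical name, only literal
IPv4 hosts are considered (hostnames are not resolved), and the real
check may treat loopback or IPv6 differently:

```go
package main

import (
	"fmt"
	"net/netip"
	"net/url"
)

// The three IPv4 blocks reserved for private use by RFC 1918.
var rfc1918 = []netip.Prefix{
	netip.MustParsePrefix("10.0.0.0/8"),
	netip.MustParsePrefix("172.16.0.0/12"),
	netip.MustParsePrefix("192.168.0.0/16"),
}

// allowUpdateURL accepts any HTTPS URL, and plain HTTP only when the host
// is a literal address inside one of the RFC 1918 ranges.
func allowUpdateURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" {
		return false
	}
	switch u.Scheme {
	case "https":
		return true
	case "http":
		addr, err := netip.ParseAddr(u.Hostname())
		if err != nil {
			return false // not an IP literal; this sketch does not resolve names
		}
		addr = addr.Unmap()
		for _, p := range rfc1918 {
			if p.Contains(addr) {
				return true
			}
		}
	}
	return false
}

func main() {
	for _, raw := range []string{
		"http://192.168.1.10:7655/download",
		"http://8.8.8.8/download",
		"https://pulse.example.com/download",
	} {
		fmt.Printf("%-40s allowed=%v\n", raw, allowUpdateURL(raw))
	}
}
```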
Move the guest-agent file-read of /proc/meminfo earlier in the memory
fallback chain so it runs before RRD, giving real-time MemAvailable that
correctly excludes reclaimable buff/cache on Linux VMs. Also add
VM.GuestAgent.FileRead permission for PVE 9 and fix install.sh to use
comma-separated privilege strings.
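For illustration, extracting MemAvailable from the text of /proc/meminfo
might look like the sketch below. parseMemAvailable is a hypothetical
helper, and the guest-agent file-read call that would supply the content
is omitted:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMemAvailable returns the MemAvailable value in bytes from the text
// of /proc/meminfo. The kernel reports the value in kB, so it is
// multiplied by 1024. ok is false when the field is missing or malformed.
func parseMemAvailable(meminfo string) (bytes uint64, ok bool) {
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 || fields[0] != "MemAvailable:" {
			continue
		}
		kb, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return 0, false
		}
		return kb * 1024, true
	}
	return 0, false
}

func main() {
	sample := "MemTotal:       16384000 kB\nMemFree:         1024000 kB\nMemAvailable:    8192000 kB\n"
	if b, ok := parseMemAvailable(sample); ok {
		fmt.Println("available bytes:", b)
	}
}
```

Returning ok=false lets the caller fall through to the next source in the
fallback chain (such as RRD) when the file cannot be parsed.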
Two nodes in the same PVE cluster generated identical Proxmox API token
names, so the second node's setup rotated the shared token and broke the
first node. Include the hostname in the token name so each node gets its
own token. Also refresh the stored cluster credential on the server when
a new endpoint merges into an existing cluster entry.
The agent gate only allowed temperature collection on Linux (lm-sensors).
FreeBSD exposes CPU and ACPI thermal zone temperatures via sysctl
(dev.cpu.N.temperature, hw.acpi.thermal.tzN.temperature). Parse sysctl
output directly in Go without shell involvement.
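A rough sketch of the parsing step, assuming sysctl prints lines of the
form "dev.cpu.0.temperature: 45.0C". parseSysctlTemps is a hypothetical
name, and running sysctl itself is left out:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSysctlTemps extracts "<oid>.temperature: <value>C" lines from sysctl
// output and returns the readings in degrees Celsius keyed by OID. Lines
// that are not temperature OIDs or fail to parse are ignored.
func parseSysctlTemps(out string) map[string]float64 {
	temps := make(map[string]float64)
	for _, line := range strings.Split(out, "\n") {
		key, val, found := strings.Cut(line, ":")
		key = strings.TrimSpace(key)
		if !found || !strings.HasSuffix(key, ".temperature") {
			continue
		}
		val = strings.TrimSuffix(strings.TrimSpace(val), "C")
		c, err := strconv.ParseFloat(val, 64)
		if err != nil {
			continue
		}
		temps[key] = c
	}
	return temps
}

func main() {
	out := "dev.cpu.0.temperature: 45.0C\nhw.acpi.thermal.tz0.temperature: 52.5C\ndev.cpu.0.freq: 2400\n"
	for oid, c := range parseSysctlTemps(out) {
		fmt.Printf("%s = %.1f C\n", oid, c)
	}
}
```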
The --disk-exclude agent flag only filtered local metric collection but
had no effect on server-side Proxmox disk health and SSD wearout alerts,
which poll the Proxmox API directly. Users excluding disks (e.g.
--disk-exclude sda) still received alerts for those disks.
The agent now sends its DiskExclude patterns in each report. The server
stores them on the Host model and consults them during Proxmox disk
polling. Excluded disks get a synthetic healthy status passed to
CheckDiskHealth so any existing alerts clear immediately.
Also adds FreeBSD pseudo-filesystem types (fdescfs, devfs, linprocfs,
linsysfs) to the virtual FS filter and /var/run/ to special mount
prefixes, fixing false disk-full alerts on FreeBSD for fdescfs mounts.
registerWithPulse() was a one-shot call at agent startup. If it failed
(bad timing, a transient network error, Pulse not ready), the agent
silently continued as a generic Host forever. Wrap the HTTP POST in a
retry loop with exponential backoff capped at 60s (5s, 10s, 20s, 40s,
60s), and distinguish 4xx errors (no retry) from 5xx/network errors
(retry).
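The retry policy can be sketched like this. withRetry, statusError, and
simulate are hypothetical names, and the sketch assumes one initial
attempt plus one retry per listed delay, which the description does not
spell out:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// registrationBackoff is the delay schedule from the description above.
var registrationBackoff = []time.Duration{
	5 * time.Second, 10 * time.Second, 20 * time.Second, 40 * time.Second, 60 * time.Second,
}

// statusError carries a non-2xx HTTP status code.
type statusError struct{ code int }

func (e *statusError) Error() string { return fmt.Sprintf("unexpected HTTP status %d", e.code) }

// retryable treats 5xx and non-HTTP (network) errors as transient and
// everything else, notably 4xx, as permanent.
func retryable(err error) bool {
	var se *statusError
	if errors.As(err, &se) {
		return se.code >= 500
	}
	return true
}

// withRetry calls attempt, sleeping delays[i] after the i-th transient
// failure, and gives up once the schedule is exhausted or ctx is cancelled.
func withRetry(ctx context.Context, delays []time.Duration, attempt func() error) error {
	for i := 0; ; i++ {
		err := attempt()
		if err == nil || !retryable(err) || i >= len(delays) {
			return err
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delays[i]):
		}
	}
}

// simulate replays a scripted sequence of status codes (the last one
// repeats) with zero delays and reports how many attempts were made.
func simulate(codes ...int) (attempts int, err error) {
	delays := make([]time.Duration, len(registrationBackoff))
	err = withRetry(context.Background(), delays, func() error {
		i := attempts
		if i >= len(codes) {
			i = len(codes) - 1
		}
		attempts++
		if codes[i] >= 400 {
			return &statusError{code: codes[i]}
		}
		return nil
	})
	return attempts, err
}

func main() {
	n, err := simulate(503, 503, 200)
	fmt.Println("transient then success:", n, "attempts, err =", err)
	n, err = simulate(401)
	fmt.Println("client error:", n, "attempt, err =", err)
}
```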
#1197: Add Custom URL input to the expanded host row in Settings → Agents.
Loads existing URL via HostMetadataAPI on row expand; saves on button click.
Only shown for host-type agent rows.
#1210: Fix agent_connected always being false for Docker hosts on Proxmox VMs.
connectedAgentHostnames now also marks a Docker host's hostname reachable when
its matching VM/LXC runs on a node with a connected Proxmox agent, mirroring
the routing logic already used in the control path.
#1267/#1269: Improve Proxmox auto-registration failure logging. Response body
is now included in the error message, and the warning directs users to delete
the state file to force re-registration rather than claiming the node exists.
(cherry picked from commit 305f6d3c94f0da4fc970450a6304da57d6d7fe80)
Relax the Linux-only gate on SMART collection to also run on FreeBSD.
Add FreeBSD disk discovery via sysctl kern.disks (lsblk is Linux-only).
The smartctl invocation and JSON parsing are already platform-agnostic.
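As an illustration of the discovery step, the output of
"sysctl -n kern.disks" is a whitespace-separated list of device names.
The helper below is hypothetical and assumes each name maps to a node
under /dev; it does not filter out non-disk entries such as optical
drives:

```go
package main

import (
	"fmt"
	"strings"
)

// parseKernDisks turns the output of "sysctl -n kern.disks"
// (for example "ada0 ada1 nvd0") into device paths under /dev.
func parseKernDisks(out string) []string {
	names := strings.Fields(out)
	devices := make([]string, 0, len(names))
	for _, name := range names {
		devices = append(devices, "/dev/"+name)
	}
	return devices
}

func main() {
	fmt.Println(parseKernDisks("ada0 ada1 nvd0\n"))
}
```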
Expand the smartctl collector to capture detailed SMART attributes (SATA
and NVMe), propagate them through the full data pipeline, persist them
as time-series metrics, and display them in an interactive disk detail
drawer with historical sparkline charts.
Backend: add SMARTAttributes struct, writeSMARTMetrics for persistent
storage, "disk" resource type in metrics API with live fallback.
Frontend: enhanced DiskList with Power-On column and SMART warnings,
new DiskDetail drawer matching NodeDrawer styling patterns, generic
HistoryChart metric support with proper tooltip formatting.
The unified agent's Proxmox setup was missing the PVEDatastoreAdmin
permission on /storage, so local PVE backups did not appear in Pulse's
backup overview for users who set up nodes via the agent.
The UI-generated setup script already included this permission, but
the agent path (--enable-proxmox) did not, creating an inconsistency.
Related to #1139
- Add PBS/PMG polling interval environment variable overrides in config.go
- Fix temp path expectation in detect_root_test.go using filepath.Join
- Use EvalSymlinks for symlink target comparison in self_update_test.go
- Add Linux-only skip for MAC fallback test in agent_new_test.go
- Add OS-aware RAID/SMART assertions in agent_metrics_test.go
- Added Roles and Users settings panels
- Implemented OIDC group-to-role mappings in config and auth flow
- Standardized API token context handling via pkg/auth
- Added Pulse Pro branding and upgrade banners to RBAC features
- Cleanup: Removed empty code blocks and fixed lint errors
Users providing base URLs like "https://openrouter.ai/api/v1" were
getting HTML error responses because the client used the URL directly
without appending "/chat/completions".
- Normalize baseURL in NewOpenAIClient to ensure it ends with /chat/completions
- Fix modelsEndpoint() to derive /models from the normalized baseURL
- Add tests for URL normalization with various endpoint formats
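A minimal sketch of the normalization, using hypothetical standalone
helpers rather than the actual NewOpenAIClient and modelsEndpoint code,
which may handle more endpoint formats:

```go
package main

import (
	"fmt"
	"strings"
)

const chatSuffix = "/chat/completions"

// normalizeChatURL ensures the URL ends with /chat/completions, tolerating
// trailing slashes and URLs that already include the suffix.
func normalizeChatURL(base string) string {
	base = strings.TrimRight(base, "/")
	if strings.HasSuffix(base, chatSuffix) {
		return base
	}
	return base + chatSuffix
}

// modelsURL derives the /models endpoint from a normalized chat URL.
func modelsURL(normalized string) string {
	return strings.TrimSuffix(normalized, chatSuffix) + "/models"
}

func main() {
	for _, in := range []string{
		"https://openrouter.ai/api/v1",
		"https://openrouter.ai/api/v1/",
		"https://openrouter.ai/api/v1/chat/completions",
	} {
		chat := normalizeChatURL(in)
		fmt.Println(in, "->", chat, "| models:", modelsURL(chat))
	}
}
```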
Implements server-side persistence for AI chat sessions, allowing users
to continue conversations across devices and browser sessions. Related
to #1059.
Backend:
- Add chat session CRUD API endpoints (GET/PUT/DELETE)
- Add persistence layer with per-user session storage
- Support session cleanup for old sessions (90 days)
- Multi-user support via auth context
Frontend:
- Rewrite aiChat store with server sync (debounced)
- Add session management UI (new conversation, switch, delete)
- Local storage as fallback/cache
- Initialize sync on app startup when AI is enabled
When a user's reverse proxy redirects HTTP to HTTPS, Go's default HTTP
client converts POST requests to GET when following 301/302 redirects
(behavior the HTTP specification permits). The Pulse server then
returns 405 "Only POST is allowed" errors.
Added CheckRedirect to all agent HTTP clients (host, docker, kubernetes)
that returns a clear error message guiding users to use the correct
protocol in their --url flag instead of silently following redirects.
Related to #1058
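The redirect guard could be sketched as below. The error text, the
/api/report path, and the helper names are assumptions for illustration;
only http.Client.CheckRedirect itself is the standard library hook:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

var errRedirectBlocked = errors.New("redirect blocked")

// newAgentClient returns a client that refuses to follow redirects and
// explains how to fix the --url flag instead.
func newAgentClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return fmt.Errorf("%w: server redirected to %s; use that scheme and host in --url",
				errRedirectBlocked, req.URL)
		},
	}
}

// postReport sends a POST and returns any transport or redirect error.
func postReport(target string) error {
	resp, err := newAgentClient().Post(target, "application/json", strings.NewReader("{}"))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}

// redirectBlocked starts a local server that answers every request with a
// 301 to an HTTPS URL and reports whether the client rejected it.
func redirectBlocked() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "https://pulse.example.invalid"+r.URL.Path, http.StatusMovedPermanently)
	}))
	defer srv.Close()
	return errors.Is(postReport(srv.URL+"/api/report"), errRedirectBlocked)
}

func main() {
	fmt.Println("redirect rejected:", redirectBlocked())
}
```

Because the callback returns an error before the redirected request is
sent, the POST is never silently downgraded to a GET.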
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, the PULSE_DISK_EXCLUDE environment variable and --disk-exclude
flag only filtered mount points in the hostmetrics collector. This change
extends the exclusion to SMART data collection.
Changes:
- Updated smartctl.CollectLocal() to accept diskExclude patterns
- Added matchesDeviceExclude() for block device pattern matching
- Patterns support: exact match (sda), prefix (nvme*), contains (*cache*)
- Updated hostagent to pass DiskExclude to SMART collector
- Added comprehensive tests for pattern matching
- Updated documentation
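The three pattern forms can be sketched as follows. This is a hypothetical
re-implementation for illustration, not the matchesDeviceExclude code from
the change, and stripping a leading /dev/ is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDeviceExclude reports whether device matches any pattern.
// Supported forms: exact ("sda"), prefix ("nvme*"), contains ("*cache*").
func matchesDeviceExclude(device string, patterns []string) bool {
	name := strings.TrimPrefix(device, "/dev/")
	for _, p := range patterns {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		switch {
		case len(p) > 2 && strings.HasPrefix(p, "*") && strings.HasSuffix(p, "*"):
			// *text*: substring match
			if strings.Contains(name, p[1:len(p)-1]) {
				return true
			}
		case strings.HasSuffix(p, "*"):
			// text*: prefix match
			if strings.HasPrefix(name, p[:len(p)-1]) {
				return true
			}
		default:
			// text: exact match
			if name == p {
				return true
			}
		}
	}
	return false
}

func main() {
	patterns := []string{"sda", "nvme*", "*cache*"}
	for _, dev := range []string{"sda", "sdab", "nvme0n1", "dm-cache0", "/dev/sda"} {
		fmt.Printf("%-10s excluded=%v\n", dev, matchesDeviceExclude(dev, patterns))
	}
}
```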