The security hardening in beae4c86 added a settings:write scope
requirement to /api/auto-register, but agent install tokens only have
host-agent:report scope. This broke Proxmox auto-registration for all
agent-generated tokens. Accept either settings:write or host-agent:report
scope for auto-registration.
Fixes #1191
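A minimal sketch of the relaxed check, assuming token scopes are available
as a plain string slice. Only the two scope strings come from the message
above; hasAnyScope, canAutoRegister and the "other:scope" value are
hypothetical names used for illustration, not the project's actual API.

```go
package main

import "fmt"

// Scope strings as named in the commit message.
const (
	scopeSettingsWrite   = "settings:write"
	scopeHostAgentReport = "host-agent:report"
)

// hasAnyScope reports whether granted contains at least one accepted scope.
// Hypothetical helper; the real handler may check scopes differently.
func hasAnyScope(granted []string, accepted ...string) bool {
	for _, g := range granted {
		for _, a := range accepted {
			if g == a {
				return true
			}
		}
	}
	return false
}

// canAutoRegister accepts either scope, so agent install tokens
// (host-agent:report only) can call /api/auto-register again.
func canAutoRegister(tokenScopes []string) bool {
	return hasAnyScope(tokenScopes, scopeSettingsWrite, scopeHostAgentReport)
}

func main() {
	fmt.Println(canAutoRegister([]string{scopeHostAgentReport})) // agent install token
	fmt.Println(canAutoRegister([]string{scopeSettingsWrite}))   // admin token
	fmt.Println(canAutoRegister([]string{"other:scope"}))        // hypothetical unrelated scope
}
```

A token with neither scope is still rejected, so the hardening stays in
place for everything except agent-generated tokens.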
Agent tokens created from the Settings UI and the backend install
command handler were missing the agent:exec scope, which was added
as a security requirement in 60f9e6f0. This caused all newly
installed agents to fail registration with "Agent exec token missing
required scope: agent:exec".
Fixes #1191
The in-memory metrics buffer was changed from 1000 to 86400 points per
metric to support 30-day sparklines, but this pre-allocated roughly
18 MB per guest (7 slices × 86400 points × 32 bytes per point). With
50 guests that is about 920 MB, which explains why users needed to
double their LXC memory after upgrading to 5.1.0.
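The arithmetic behind those figures can be checked with a few lines of Go.
The constants are the ones quoted above; preallocBytes is a hypothetical
helper, and the totals match the quoted numbers when read as MiB.

```go
package main

import "fmt"

const (
	slicesPerGuest = 7  // metric slices per guest, from the message above
	bytesPerPoint  = 32 // bytes per stored point, from the message above
)

// preallocBytes returns the memory reserved up front when every slice
// is pre-allocated to pointsPerMetric entries.
func preallocBytes(guests, pointsPerMetric int) int {
	return guests * slicesPerGuest * pointsPerMetric * bytesPerPoint
}

func main() {
	const mib = 1 << 20
	fmt.Printf("1 guest,   86400 points: %.1f MiB\n", float64(preallocBytes(1, 86400))/mib)
	fmt.Printf("50 guests, 86400 points: %.1f MiB\n", float64(preallocBytes(50, 86400))/mib)
	fmt.Printf("50 guests,  1000 points: %.1f MiB\n", float64(preallocBytes(50, 1000))/mib)
}
```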
- Revert in-memory buffer to 1000 points / 24h retention
- Remove eager slice pre-allocation (use append growth instead)
- Add LTTB (Largest Triangle Three Buckets) downsampling algorithm
- Chart endpoints now use a two-tier strategy: in-memory for ranges
  ≤ 2h, SQLite persistent store + LTTB for longer ranges
- Reduce frontend ring buffer from 86400 to 2000 points
Related to #1190
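For reference, a compact sketch of LTTB as it is usually described: keep
the first and last points, split the rest into equal buckets, and from
each bucket keep the point that forms the largest triangle with the
previously kept point and the average of the next bucket. The Point type
and the lttb signature are assumptions for illustration and may not match
the implementation in this change.

```go
package main

import (
	"fmt"
	"math"
)

// Point is a hypothetical (timestamp, value) pair.
type Point struct{ T, V float64 }

// lttb downsamples data to at most threshold points. Inputs that are
// already small enough, or thresholds below 3, are returned as a copy.
func lttb(data []Point, threshold int) []Point {
	n := len(data)
	if threshold >= n || threshold < 3 {
		return append([]Point(nil), data...)
	}
	out := make([]Point, 0, threshold)
	out = append(out, data[0]) // always keep the first point
	every := float64(n-2) / float64(threshold-2)
	a := 0 // index of the most recently kept point
	for i := 0; i < threshold-2; i++ {
		// Average of the next bucket acts as the third triangle vertex.
		avgStart := int(float64(i+1)*every) + 1
		avgEnd := int(float64(i+2)*every) + 1
		if avgEnd > n {
			avgEnd = n
		}
		var avgT, avgV float64
		for j := avgStart; j < avgEnd; j++ {
			avgT += data[j].T
			avgV += data[j].V
		}
		cnt := float64(avgEnd - avgStart)
		avgT, avgV = avgT/cnt, avgV/cnt

		// Pick the point in the current bucket with the largest triangle area.
		start := int(float64(i)*every) + 1
		end := int(float64(i+1)*every) + 1
		best, bestArea := start, -1.0
		for j := start; j < end; j++ {
			area := 0.5 * math.Abs((data[a].T-avgT)*(data[j].V-data[a].V)-
				(data[a].T-data[j].T)*(avgV-data[a].V))
			if area > bestArea {
				best, bestArea = j, area
			}
		}
		out = append(out, data[best])
		a = best
	}
	return append(out, data[n-1]) // always keep the last point
}

func main() {
	data := make([]Point, 1000)
	for i := range data {
		data[i] = Point{T: float64(i), V: math.Sin(float64(i) / 50)}
	}
	fmt.Println(len(lttb(data, 100)))
}
```

Because selection is driven by triangle area, isolated spikes tend to
survive downsampling, which is the property that matters for sparklines
over long ranges.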
When the server doesn't have agent binaries locally (common for
LXC/bare-metal installations), it was redirecting to GitHub releases.
The agent's HTTP client followed the redirect, but GitHub doesn't
provide the X-Checksum-Sha256 header that agents require for security
verification, causing every update attempt to fail silently.
Proxy the download through the server instead, computing and attaching
the checksum header so agents can verify and install the update.
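A rough sketch of the proxying idea, under several assumptions: the
handler name proxyWithChecksum is hypothetical, the checksum is assumed
to be hex-encoded, and the whole binary is buffered in memory so the
header can be set before the body is written. The real handler may
stream, cache, or encode differently.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

const checksumHeader = "X-Checksum-Sha256" // header name from the message above

// proxyWithChecksum fetches upstreamURL server-side and relays the body
// with a SHA-256 checksum header attached (hex encoding is an assumption).
func proxyWithChecksum(client *http.Client, upstreamURL string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, upstreamURL, nil)
		if err != nil {
			http.Error(w, "bad upstream URL", http.StatusInternalServerError)
			return
		}
		resp, err := client.Do(req)
		if err != nil {
			http.Error(w, "upstream fetch failed", http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			http.Error(w, "upstream returned "+resp.Status, http.StatusBadGateway)
			return
		}
		// Read everything first: headers must be sent before the body.
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			http.Error(w, "upstream read failed", http.StatusBadGateway)
			return
		}
		sum := sha256.Sum256(body)
		w.Header().Set(checksumHeader, hex.EncodeToString(sum[:]))
		w.Header().Set("Content-Type", "application/octet-stream")
		w.Write(body)
	}
}

// runDemo wires a fake upstream to the proxy and verifies the header
// the way an agent might.
func runDemo() error {
	payload := []byte("fake agent binary")
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(payload) // like the redirect target, sends no checksum header
	}))
	defer upstream.Close()
	proxy := httptest.NewServer(proxyWithChecksum(upstream.Client(), upstream.URL))
	defer proxy.Close()

	resp, err := http.Get(proxy.URL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	if !bytes.Equal(body, payload) {
		return fmt.Errorf("body changed in transit")
	}
	sum := sha256.Sum256(body)
	if got, want := resp.Header.Get(checksumHeader), hex.EncodeToString(sum[:]); got != want {
		return fmt.Errorf("checksum mismatch: header %q, computed %q", got, want)
	}
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("checksum verified")
}
```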
Host network sparklines were displaying wildly incorrect values (e.g., 147 GB/s
for an idle Raspberry Pi) because cumulative byte counters (total bytes since
boot) were being stored directly instead of being converted to rates.
Changes:
- monitor.go: Use RateTracker to calculate network rates for hosts, matching
  the existing pattern used for VMs and containers. Only record network
  metrics when we have enough samples to calculate valid rates.
- router.go: Remove network metrics from live fallback for hosts since we
  can't calculate rates from a single snapshot. Better to show nothing than
  misleading cumulative totals.
The fix follows the established codebase pattern where:
1. Agent reports cumulative RXBytes/TXBytes
2. RateTracker compares consecutive samples to calculate bytes/second
3. Rates are stored in metrics history for sparkline display
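The three steps above can be sketched as follows. This is not the
project's RateTracker: the type, method name and the use of Unix seconds
as a float64 are assumptions chosen to keep the example small. The point
is that the first sample, or a sample after a counter reset, yields no
rate at all rather than a bogus one.

```go
package main

import "fmt"

// sample holds the last cumulative counters seen for a host.
type sample struct {
	rx, tx uint64
	at     float64 // Unix seconds; an assumption to avoid extra imports
}

// rateTracker is a hypothetical stand-in for the codebase's RateTracker.
type rateTracker struct {
	last map[string]sample
}

func newRateTracker() *rateTracker {
	return &rateTracker{last: make(map[string]sample)}
}

// observe records cumulative RX/TX byte counters and returns bytes/second
// relative to the previous sample. ok is false when no valid rate exists:
// first sample, non-increasing time, or a counter that went backwards.
func (t *rateTracker) observe(host string, rx, tx uint64, at float64) (rxRate, txRate float64, ok bool) {
	prev, seen := t.last[host]
	t.last[host] = sample{rx: rx, tx: tx, at: at}
	if !seen {
		return 0, 0, false
	}
	elapsed := at - prev.at
	if elapsed <= 0 || rx < prev.rx || tx < prev.tx {
		return 0, 0, false
	}
	return float64(rx-prev.rx) / elapsed, float64(tx-prev.tx) / elapsed, true
}

func main() {
	rt := newRateTracker()
	// The first sample only primes the tracker; nothing should be recorded.
	if _, _, ok := rt.observe("pi", 147_000_000_000, 9_000_000_000, 1000); !ok {
		fmt.Println("first sample: no rate yet")
	}
	// Ten seconds later the counters have barely moved on an idle host.
	if rx, tx, ok := rt.observe("pi", 147_000_001_000, 9_000_000_500, 1010); ok {
		fmt.Printf("rx=%.0f B/s tx=%.0f B/s\n", rx, tx)
	}
}
```

Storing the raw 147 GB counter as if it were a rate is exactly the bug
described above; the delta over ten seconds is what belongs in the
sparkline.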
Enrich the patrol seed context with service identity (from discovery
store) and network reachability (via ICMP ping through host agents).
The guest metrics table now includes Service and Reachable columns,
and a Service Health Issues section highlights running-but-unreachable
guests. A new SignalGuestUnreachable signal type creates deterministic
findings for unreachable guests.
New files:
- patrol_intelligence.go: GuestProber interface, GuestIntelligence
  type, gatherGuestIntelligence() with concurrent per-node probing
- patrol_prober.go: agentExecProber implementation using batch ping
  commands via connected host agents
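To illustrate the shape of concurrent per-node probing, here is a small
sketch. Every identifier in it (guest, guestProber, PingBatch,
findUnreachable, staticProber) is hypothetical; the actual GuestProber
interface and gatherGuestIntelligence() signatures are not shown in this
message and may differ.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// guest is a hypothetical, trimmed-down view of a VM or container.
type guest struct {
	Name, Node, IP string
	Running        bool
}

// guestProber is a hypothetical interface: ping a batch of IPs from one node.
type guestProber interface {
	PingBatch(node string, ips []string) map[string]bool
}

// findUnreachable probes running guests, one goroutine per node, and
// returns the sorted names of guests that are running but did not answer.
func findUnreachable(p guestProber, guests []guest) []string {
	byNode := make(map[string][]guest)
	for _, g := range guests {
		if g.Running && g.IP != "" {
			byNode[g.Node] = append(byNode[g.Node], g)
		}
	}
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)
	for node, gs := range byNode {
		wg.Add(1)
		go func(node string, gs []guest) {
			defer wg.Done()
			ips := make([]string, len(gs))
			for i, g := range gs {
				ips[i] = g.IP
			}
			res := p.PingBatch(node, ips) // one batch per node
			mu.Lock()
			defer mu.Unlock()
			for _, g := range gs {
				if !res[g.IP] {
					out = append(out, g.Name)
				}
			}
		}(node, gs)
	}
	wg.Wait()
	sort.Strings(out) // stable order keeps derived findings deterministic
	return out
}

// staticProber is a fake prober for the demo: IP -> reachable.
type staticProber map[string]bool

func (s staticProber) PingBatch(_ string, ips []string) map[string]bool {
	res := make(map[string]bool, len(ips))
	for _, ip := range ips {
		res[ip] = s[ip]
	}
	return res
}

func main() {
	guests := []guest{
		{Name: "web", Node: "pve1", IP: "10.0.0.1", Running: true},
		{Name: "db", Node: "pve2", IP: "10.0.0.2", Running: true},
		{Name: "old", Node: "pve1", IP: "10.0.0.3", Running: false},
	}
	fmt.Println(findUnreachable(staticProber{"10.0.0.1": true}, guests))
}
```

Stopped guests are skipped on purpose, since only running-but-unreachable
guests are meant to raise a finding.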