Two test regressions introduced when agent-report tokens were allowed as
fallback auth for /api/auto-register:
1. Org mismatch was not checked: a token belonging to org-a could authenticate
a request whose context carried org-b. Add an explicit org consistency check
before setting authenticated=true in the fallback path.
2. The security regression test assumed only setup tokens could authenticate
auto-register. That contract has intentionally changed: agent-report tokens
can now authenticate but are restricted to updating existing nodes (403 for
new-node attempts). Update the test to assert the actual security boundary.
The isKnownDisconnected helper was building the key as
instanceType+"-"+instanceName ("pve-delly"), but the PVE
PollProvider's connectionKey function returns the bare instance
name ("delly"). PBS uses "pbs-"+name. The mismatch meant the
disconnected-node check always missed, rendering the server-side
stale-token detection inert.
Fix: use type-specific key construction matching the PollProvider
connectionKey implementations.
Two gaps in the existing flow allowed a disconnected PVE node to stay
broken indefinitely even after the agent restarted:
1. Server-side: autoRegisteredNodeExists checked only that a PVE/PBS
instance existed in the config, not whether its connection was
healthy. A node with a stale token would return registered=true on
every check, causing the agent to skip re-registration forever.
Fixed: also consult GetConnectionStatuses(); return registered=false
when the monitor has a definitive disconnected entry so the agent can
rotate and re-register.
2. Agent-side: the type-specific registration marker was cleared only on
success. If rotation succeeded but the Pulse update failed (e.g.
transient network error), the old marker from a previous successful
registration persisted, leaving next-startup to skip setup again.
Fixed: clear the marker before entering the token setup/rotation
phase so any failure leaves the system in a retriable state.
Together these two fixes make the stale-token scenario self-healing:
the monitor detects the broken connection, the next agent startup sees
registered=false, clears its marker, rotates the token, and updates
Pulse — without manual intervention.
When the agent is reinstalled on a Proxmox host, it rotates the PVE API
token in Proxmox but the Pulse server's /api/setup-script-url endpoint
requires settings:write scope — agent tokens only have agent:report — so
the 401 aborted the update, leaving Pulse with a stale token and a
disconnected PVE node.
Three-part fix:
- server: accept agent API tokens on /api/auto-register for updating
existing nodes (new nodes still require setup-token auth)
- agent: fall through instead of aborting when setup token fetch returns
4xx; send X-API-Token header so the server can authenticate via the
agent token instead
- update: allow HTTP auto-update URLs for RFC 1918 private network
addresses (LAN installs without HTTPS no longer block auto-update)
The /api/agent/version endpoint was returning the git-describe version
(e.g. 6.0.0-rc.2+git.58.g53a9339.dirty) for dev builds, which is always
semantically newer than the binary's embedded version (v6.0.0-rc.1).
This caused agents to loop: check version → see "newer" → self-update →
restart → still same binary version → loop again.
The agent's guard ("skip if server reports 'dev'") only fires for the
literal string "dev". Fix: return "dev" whenever IsDevelopment is true,
which covers all git-built/dirty dev instances.
The free-tier onboarding overflow adds +1 to MaxMonitoredSystems for 14
days after initial setup. Once rc.2 made self-hosted core monitoring
uncapped (MaxMonitoredSystems = 0 on Free), the bonus math silently
converted "unlimited" into a hard cap of 1 — the UI then surfaced
"Over plan by N. N monitored, 1 included." on healthy installs.
Guard the addition on limit > 0 at all three call sites (ledger path,
commercial entitlement payload, runtime capabilities payload) so the
bonus only extends plans that actually have a cap.
Refs #1429