1004 commits

Author SHA1 Message Date
rcourtman
9bac3f421d Fix agent-token fallback to reject cross-org tokens and update security contract test
Two test regressions were introduced when agent-report tokens were allowed
as fallback auth for /api/auto-register:

1. Org mismatch was not checked: a token belonging to org-a could authenticate
   a request whose context carried org-b. Add an explicit org consistency check
   before setting authenticated=true in the fallback path.

2. The security regression test assumed only setup tokens could authenticate
   auto-register. That contract has intentionally changed: agent-report tokens
   can now authenticate but are restricted to updating existing nodes (403 for
   new-node attempts). Update the test to assert the actual security boundary.
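
The boundary described in the two points above can be sketched roughly as follows. This is an illustration only, not the project's code: the `AgentToken` type, the `checkFallbackAuth` function, and both error values are hypothetical names, and the mapping from these errors to HTTP statuses (such as the 403 for new-node attempts) is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// AgentToken is a hypothetical stand-in for an agent-report token record.
type AgentToken struct {
	OrgID string
}

// Hypothetical error values for the two rejection cases.
var (
	ErrOrgMismatch      = errors.New("token org does not match request org")
	ErrNewNodeForbidden = errors.New("agent token may only update existing nodes")
)

// checkFallbackAuth decides whether an agent-report token may authenticate
// an auto-register request. The org check runs first, so a cross-org token
// is rejected before anything else is considered.
func checkFallbackAuth(tok AgentToken, requestOrgID string, nodeExists bool) error {
	if tok.OrgID != requestOrgID {
		return ErrOrgMismatch
	}
	if !nodeExists {
		// Agent tokens are limited to updating nodes that already exist.
		return ErrNewNodeForbidden
	}
	return nil
}

func main() {
	tok := AgentToken{OrgID: "org-a"}
	fmt.Println(checkFallbackAuth(tok, "org-b", true))  // cross-org request
	fmt.Println(checkFallbackAuth(tok, "org-a", false)) // new-node attempt
	fmt.Println(checkFallbackAuth(tok, "org-a", true))  // update of existing node
}
```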
2026-04-18 23:10:50 +01:00
rcourtman
688bdd4246 Fix PVE connection health key in registration check
The isKnownDisconnected helper was building the key as
instanceType+"-"+instanceName ("pve-delly"), but the PVE
PollProvider's connectionKey function returns the bare instance
name ("delly"). PBS uses "pbs-"+name. The mismatch meant the
disconnected-node check always missed, rendering the server-side
stale-token detection inert.

Fix: use type-specific key construction matching the PollProvider
connectionKey implementations.
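
A minimal sketch of type-specific key construction, assuming the connection statuses are available as a map from key to a connected flag. The map shape, the function signatures, the instance name "backup", and the default branch are assumptions made for illustration; only the "delly" and "pbs-"+name key formats come from the message above.

```go
package main

import "fmt"

// connectionKey mirrors the key formats described above: PVE uses the bare
// instance name, PBS uses "pbs-"+name. The default branch is an assumption.
func connectionKey(instanceType, instanceName string) string {
	switch instanceType {
	case "pve":
		return instanceName
	case "pbs":
		return "pbs-" + instanceName
	default:
		return instanceType + "-" + instanceName
	}
}

// isKnownDisconnected reports true only when the monitor has an entry for
// the instance and that entry says it is disconnected. A missing entry is
// treated as "unknown", not as disconnected.
func isKnownDisconnected(statuses map[string]bool, instanceType, instanceName string) bool {
	connected, known := statuses[connectionKey(instanceType, instanceName)]
	return known && !connected
}

func main() {
	statuses := map[string]bool{"delly": false, "pbs-backup": true}
	fmt.Println(isKnownDisconnected(statuses, "pve", "delly"))   // disconnected entry found
	fmt.Println(isKnownDisconnected(statuses, "pbs", "backup"))  // connected
	fmt.Println(isKnownDisconnected(statuses, "pve", "missing")) // no entry
}
```

With the old "pve-delly" key, the first lookup would have missed the "delly" entry and the check would never have reported a disconnected node.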
2026-04-18 22:37:24 +01:00
rcourtman
501c61b82f Fix PVE stale token self-healing after failed registration
Two gaps in the existing flow allowed a disconnected PVE node to stay
broken indefinitely even after the agent restarted:

1. Server-side: autoRegisteredNodeExists checked only that a PVE/PBS
   instance existed in the config, not whether its connection was
   healthy. A node with a stale token would return registered=true on
   every check, causing the agent to skip re-registration forever.
   Fixed: also consult GetConnectionStatuses(); return registered=false
   when the monitor has a definitive disconnected entry so the agent can
   rotate and re-register.

2. Agent-side: the type-specific registration marker was cleared only on
   success. If rotation succeeded but the Pulse update failed (e.g.
   transient network error), the old marker from a previous successful
   registration persisted, leaving next-startup to skip setup again.
   Fixed: clear the marker before entering the token setup/rotation
   phase so any failure leaves the system in a retriable state.

Together these two fixes make the stale-token scenario self-healing:
the monitor detects the broken connection, the next agent startup sees
registered=false, clears its marker, rotates the token, and updates
Pulse — without manual intervention.
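
The agent-side ordering from point 2 can be illustrated with a file-based marker. This is a sketch under assumptions: the marker being a plain file, the file name, and the `registerWithMarker`, `markerExists`, and `demo` helpers are all hypothetical, and the `setup` callback stands in for the token setup/rotation phase plus the Pulse update.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// registerWithMarker clears the marker before running setup, so a failure
// at any later step leaves no marker behind and the next startup retries.
func registerWithMarker(markerPath string, setup func() error) error {
	if err := os.Remove(markerPath); err != nil && !errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("clear marker: %w", err)
	}
	if err := setup(); err != nil {
		return fmt.Errorf("setup: %w", err)
	}
	// The marker is written only after setup has fully succeeded.
	return os.WriteFile(markerPath, []byte("registered\n"), 0o600)
}

func markerExists(markerPath string) bool {
	_, err := os.Stat(markerPath)
	return err == nil
}

// demo starts with a marker left over from an earlier success, runs one
// failing and one successful attempt, and reports the marker state after each.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "marker-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	marker := filepath.Join(dir, "pve-registered")
	if err := os.WriteFile(marker, []byte("registered\n"), 0o600); err != nil {
		return false, false, err
	}

	_ = registerWithMarker(marker, func() error { return errors.New("transient network error") })
	afterFailure := markerExists(marker)

	if err := registerWithMarker(marker, func() error { return nil }); err != nil {
		return afterFailure, false, err
	}
	return afterFailure, markerExists(marker), nil
}

func main() {
	afterFailure, afterSuccess, err := demo()
	fmt.Println(afterFailure, afterSuccess, err)
}
```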
2026-04-18 22:07:30 +01:00
rcourtman
b0b790cf55 Fix PVE token re-registration after agent reinstall
When the agent is reinstalled on a Proxmox host, it rotates the PVE API
token in Proxmox. The Pulse server's /api/setup-script-url endpoint
requires settings:write scope, but agent tokens only have agent:report,
so the request failed with a 401 and aborted the update, leaving Pulse
with a stale token and a disconnected PVE node.

Three-part fix:
- server: accept agent API tokens on /api/auto-register for updating
  existing nodes (new nodes still require setup-token auth)
- agent: fall through instead of aborting when setup token fetch returns
  4xx; send X-API-Token header so the server can authenticate via the
  agent token instead
- update: allow HTTP auto-update URLs for RFC 1918 private network
  addresses (LAN installs without HTTPS no longer block auto-update)
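
The third part of the fix could look something like the check below. It is a sketch, not the project's implementation: `allowUpdateURL` is a hypothetical name, the sample URLs are illustrative, and two behaviours are assumptions, namely that HTTPS URLs are always accepted and that plain-HTTP URLs with a hostname instead of a literal IP are rejected.

```go
package main

import (
	"fmt"
	"net/netip"
	"net/url"
)

// rfc1918 lists the three private IPv4 ranges defined by RFC 1918.
var rfc1918 = []netip.Prefix{
	netip.MustParsePrefix("10.0.0.0/8"),
	netip.MustParsePrefix("172.16.0.0/12"),
	netip.MustParsePrefix("192.168.0.0/16"),
}

// allowUpdateURL accepts any HTTPS URL, and accepts plain HTTP only when
// the host is a literal IP address inside an RFC 1918 range.
func allowUpdateURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	switch u.Scheme {
	case "https":
		return true
	case "http":
		addr, err := netip.ParseAddr(u.Hostname())
		if err != nil {
			return false // not a literal IP (assumption: reject)
		}
		addr = addr.Unmap() // treat IPv4-mapped IPv6 as IPv4
		for _, p := range rfc1918 {
			if p.Contains(addr) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(allowUpdateURL("http://192.168.1.10:7655/update"))
	fmt.Println(allowUpdateURL("http://8.8.8.8/update"))
	fmt.Println(allowUpdateURL("https://example.com/update"))
}
```

The ranges are listed explicitly because `netip.Addr.IsPrivate` also matches the IPv6 unique local range, which is broader than RFC 1918.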
2026-04-18 21:44:42 +01:00
rcourtman
8f133b1be1 Fix agent infinite update loop on dev builds
The /api/agent/version endpoint was returning the git-describe version
(e.g. 6.0.0-rc.2+git.58.g53a9339.dirty) for dev builds, which is always
semantically newer than the binary's embedded version (v6.0.0-rc.1).
This caused agents to loop: check version → see "newer" → self-update →
restart → still same binary version → loop again.

The agent's guard ("skip if server reports 'dev'") only fires for the
literal string "dev". Fix: return "dev" whenever IsDevelopment is true,
which covers all git-built/dirty dev instances.
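
The interaction between the server fix and the agent's guard can be sketched as below. The function names are hypothetical, and the plain string inequality stands in for the real version comparison, which is described above only as a semantic comparison.

```go
package main

import "fmt"

// reportedAgentVersion returns the literal "dev" for any development build,
// so the agent-side guard fires regardless of the git-describe string.
func reportedAgentVersion(isDevelopment bool, version string) string {
	if isDevelopment {
		return "dev"
	}
	return version
}

// shouldSelfUpdate models the agent guard: never update against a "dev"
// server. The inequality test is a simplification of the real comparison.
func shouldSelfUpdate(serverVersion, localVersion string) bool {
	if serverVersion == "dev" {
		return false
	}
	return serverVersion != localVersion
}

func main() {
	local := "v6.0.0-rc.1"
	describe := "6.0.0-rc.2+git.58.g53a9339.dirty"

	// Before the fix: the git-describe string reaches the agent and looks newer.
	fmt.Println(shouldSelfUpdate(describe, local))

	// After the fix: a dev build reports "dev" and the guard stops the loop.
	fmt.Println(shouldSelfUpdate(reportedAgentVersion(true, describe), local))
}
```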
2026-04-18 20:42:12 +01:00
rcourtman
e6b0d47bd6 Gate Docker mutations on authz-plugin posture 2026-04-18 10:45:25 +01:00
rcourtman
5ebb3d9952 Harden session and setup token auth flows 2026-04-18 00:06:50 +01:00
rcourtman
f2d5892aa5 Skip onboarding overflow bonus on uncapped plans
The free-tier onboarding overflow adds +1 to MaxMonitoredSystems for 14
days after initial setup. Once rc.2 made self-hosted core monitoring
uncapped (MaxMonitoredSystems = 0 on Free), the bonus math silently
converted "unlimited" into a hard cap of 1 — the UI then surfaced
"Over plan by N. N monitored, 1 included." on healthy installs.

Guard the addition on limit > 0 at all three call sites (ledger path,
commercial entitlement payload, runtime capabilities payload) so the
bonus only extends plans that actually have a cap.
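
The guard amounts to the following arithmetic. This is a sketch with a hypothetical `effectiveLimit` helper; it assumes, as the message states for the Free plan, that a limit of 0 means uncapped, and it leaves out the 14-day window check.

```go
package main

import "fmt"

// effectiveLimit applies the onboarding bonus only to plans that have a cap.
// A limit of 0 means "unlimited" and must pass through unchanged; adding the
// bonus to it would turn "unlimited" into a hard cap equal to the bonus.
func effectiveLimit(limit, bonus int) int {
	if limit > 0 {
		return limit + bonus
	}
	return limit
}

func main() {
	fmt.Println(effectiveLimit(0, 1)) // uncapped plan stays uncapped
	fmt.Println(effectiveLimit(5, 1)) // capped plan gains the bonus
}
```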

Refs #1429
2026-04-17 12:16:03 +01:00
rcourtman
1bc08f3bc1 Canonicalize self-hosted purchase handoff intent
Refs #1409
2026-04-16 10:17:37 +01:00
rcourtman
5914a4127d Make self-hosted core monitoring uncapped
Refs #1409
2026-04-16 01:21:57 +01:00
rcourtman
d573d3a85f Preserve standalone host continuity across restart
Refs #1402
2026-04-15 16:23:42 +01:00
rcourtman
bd8b2efd1b Add monitored-system admission extension hook 2026-04-15 14:04:21 +01:00
rcourtman
3596acfeb2 Trim stale SAML rebinding coverage 2026-04-15 13:57:40 +01:00
rcourtman
27367da17f Isolate internal/api tests from system data dir 2026-04-15 13:40:24 +01:00
rcourtman
8d703f2371 Explain monitored-system over-plan legitimacy 2026-04-15 13:38:57 +01:00
rcourtman
a33983175b Port v5 SAML public URL rebinding 2026-04-15 13:17:01 +01:00
rcourtman
6c1364ef54 Clarify monitored-system admission freeze posture 2026-04-15 13:15:10 +01:00
rcourtman
429f12decd Recover unavailable Pulse Account handoffs 2026-04-15 10:09:00 +01:00
rcourtman
f3c4d4d83d Grandfather active v5 Pro customers as uncapped 2026-04-15 00:35:24 +01:00
rcourtman
b84e4067e8 Uncap grandfathered lifetime entitlements 2026-04-14 23:34:37 +01:00
rcourtman
7cac04e2ff Track v6 install version through licensing runtime 2026-04-14 11:44:59 +01:00
rcourtman
58e67c7b19 Canonicalize usage-data telemetry reporting 2026-04-14 11:05:10 +01:00
rcourtman
2d8385c1d8 Fix same-host websocket proxy origin checks 2026-04-13 12:04:29 +01:00
rcourtman
51b494b137 Stabilize RC backend test contracts 2026-04-12 09:02:56 +01:00
rcourtman
05fa111ca1 Stabilize backend race tests for v6 RC publish 2026-04-11 22:46:34 +01:00
rcourtman
5a3affbfd9 Rebaseline API chart CI SLOs 2026-04-11 19:43:18 +01:00
rcourtman
b33cc3ac60 Hide admin operations from public demo 2026-04-11 17:20:58 +01:00
rcourtman
1e28a03b57 Stabilize rc1 mock mode and metrics history 2026-04-11 16:47:37 +01:00
rcourtman
1caa34536b Use mock unified state for demo infrastructure charts 2026-04-11 15:35:05 +01:00
rcourtman
347a013e79 Stabilize RC release proof contracts 2026-04-11 14:51:10 +01:00
rcourtman
b64782c083 Align mock state resources with canonical contract 2026-04-11 13:40:01 +01:00
rcourtman
f1713b5fee Stabilize demo resource ordering 2026-04-11 00:24:03 +01:00
rcourtman
efb218958a Fix Kubernetes demo chart history coverage 2026-04-10 23:40:13 +01:00
rcourtman
05b132b110 Fix Kubernetes pod demo history contracts 2026-04-10 23:04:35 +01:00
rcourtman
4ca488101e Optimize compact storage summary chart path 2026-04-10 19:09:12 +01:00
rcourtman
cceca653dc Compact dashboard demo hot path 2026-04-10 18:30:39 +01:00
rcourtman
8d97bc3995 Tighten dashboard summary hot paths 2026-04-10 17:32:30 +01:00
rcourtman
b643d6d255 Unify dashboard storage trend loading 2026-04-10 14:53:00 +01:00
rcourtman
3100ad1b3d Prove IPv6 trusted proxy websocket continuity 2026-04-10 12:54:12 +01:00
rcourtman
b846d66fd0 Gate release mock fixtures behind demo entitlement 2026-04-10 12:33:57 +01:00
rcourtman
4a8c901a29 Calibrate resources load floor for GitHub runners 2026-04-10 00:35:35 +01:00
rcourtman
a831f7252a Calibrate remaining API RC dry-run proofs 2026-04-09 23:28:39 +01:00
rcourtman
b4199a4b96 Fix remaining RC backend proof drift 2026-04-09 23:06:49 +01:00
rcourtman
a6ba806af9 Stabilize remaining RC dry-run proofs 2026-04-09 22:43:09 +01:00
rcourtman
826358eefc Stabilize remaining RC backend proofs 2026-04-09 22:13:02 +01:00
rcourtman
0ec2dec65e Calibrate RC CI SLO envelopes 2026-04-09 21:27:03 +01:00
rcourtman
81fef82bdd Restore RC backend proof regressions 2026-04-09 20:15:17 +01:00
rcourtman
a4a223a7c6 Exclude disabled platform connections from monitored-system counts 2026-04-09 16:35:38 +01:00
rcourtman
441e45e74a Fail closed platform saves on unavailable usage 2026-04-09 15:48:28 +01:00
rcourtman
537cf2713e Guard monitored-system admission limits 2026-04-09 11:17:59 +01:00