Commit graph

5606 commits

Author SHA1 Message Date
rcourtman
8a21162f35 refactor: split host alert checker
Move host-agent identity, metric projection, disk/SMART/RAID/Unraid health handling, cleanup, and offline lifecycle into internal/alerts/host.go.

Keep shared health-assessment evaluation package-level for now because storage ZFS and host SMART/RAID still share that bridge, while recording host.go as the host checker owner in the alerts subsystem contract.

Proof: go test ./internal/alerts/...
2026-05-06 13:50:13 +01:00
rcourtman
3d8cb6c8a5 refactor: split node alert checker
Move Proxmox node metric, temperature, offline lifecycle, host-agent deduplication, and node display-name cache support into internal/alerts/node.go.

Keep the Manager API unchanged while recording the node checker owner in the alerts subsystem contract and adding a focused display-name cache key characterization.

Proof: go test ./internal/alerts/...
2026-05-06 13:47:54 +01:00
rcourtman
a0e8896893 refactor: split PBS and storage alert checkers
Move PBS connectivity and metric evaluation into internal/alerts/pbs.go, and move storage connectivity, usage, and ZFS health evaluation into internal/alerts/storage.go.

Keep the Manager API unchanged while recording PBS and storage as resource-checker owners in the alerts subsystem contract, with focused characterization tests for PBS offline normalization and ZFS device labels.

Proof: go test ./internal/alerts/...
2026-05-06 13:45:16 +01:00
rcourtman
d2ac17fd80 refactor: split Docker alert checker
Move Docker host connectivity, container state and health, metric projection, service gap/update-state checks, image update timing, and Docker tracking cleanup into internal/alerts/docker.go.

Keep the Manager API unchanged while recording Docker as the resource-checker owner and strengthening Docker resource ID normalization proof.

Proof: go test ./internal/alerts/...
2026-05-06 13:31:05 +01:00
rcourtman
8c0261ec43 refactor: split PMG alert checker
Move PMG connectivity, queue, per-node queue, quarantine, and anomaly evaluation into internal/alerts/pmg.go while keeping the Manager API unchanged.

Record PMG as the resource-checker owner in the alerts contract and add a PMG connection-health normalization proof through CheckPMG.

Proof: go test ./internal/alerts/...

Proof: go test ./internal/monitoring
2026-05-06 13:28:19 +01:00
rcourtman
0d642c32ef refactor: split alert read model
Move active-alert projection, sorting, metadata coercion, recently resolved reads, history wrappers, and notify-existing redispatch into internal/alerts/read_model.go.

Keep the Manager API unchanged while recording the read-side alerts contract owner and strengthening resolved-alert clone proof.

Proof: go test ./internal/alerts/...
2026-05-06 13:22:12 +01:00
rcourtman
9d4fabf915 refactor: split alert notification policy
Move alert dispatch, flapping suppression, quiet-hours suppression, monitor-only suppression, cooldown, and rate-limit policy into internal/alerts/notification_policy.go.

Keep Manager behavior and public API unchanged while recording the new alerts contract owner and adding monitor-only dispatch proof.

Proof: go test ./internal/alerts/...; go test -json ./internal/api -count=1; go test ./internal/api ./internal/monitoring ./internal/ai/... ./internal/websocket (the internal/api failure from the broad run did not reproduce on an isolated rerun; the other packages passed in the broad run).
2026-05-06 13:14:18 +01:00
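
The suppression rules named in the commit above (monitor-only, flapping, quiet hours, cooldown) can be pictured as an ordered check. The following is only a sketch under assumptions: the types, field names, and evaluation order are hypothetical and are not taken from internal/alerts/notification_policy.go, and rate limiting is left out for brevity.

```go
package main

import (
	"fmt"
	"time"
)

// alertState and policy are hypothetical types used only for this sketch.
type alertState struct {
	MonitorOnly  bool      // alert is tracked but must never notify
	Flapping     bool      // alert is toggling too quickly to be useful
	LastNotified time.Time // zero value means "never notified"
}

type policy struct {
	QuietStartHour int           // inclusive, 0-23
	QuietEndHour   int           // exclusive, 0-23
	Cooldown       time.Duration // minimum gap between notifications
}

// inQuietHours reports whether now falls in the quiet window,
// including windows that wrap past midnight (for example 22 to 6).
func inQuietHours(p policy, now time.Time) bool {
	h := now.Hour()
	if p.QuietStartHour == p.QuietEndHour {
		return false // empty window
	}
	if p.QuietStartHour < p.QuietEndHour {
		return h >= p.QuietStartHour && h < p.QuietEndHour
	}
	return h >= p.QuietStartHour || h < p.QuietEndHour
}

// suppressionReason returns why a notification is withheld,
// or "" when it may be dispatched. The order is an assumption.
func suppressionReason(p policy, a alertState, now time.Time) string {
	switch {
	case a.MonitorOnly:
		return "monitor-only"
	case a.Flapping:
		return "flapping"
	case inQuietHours(p, now):
		return "quiet-hours"
	case !a.LastNotified.IsZero() && now.Sub(a.LastNotified) < p.Cooldown:
		return "cooldown"
	}
	return ""
}

// at builds a fixed UTC timestamp at the given hour for examples.
func at(hour int) time.Time {
	return time.Date(2026, time.May, 6, hour, 0, 0, 0, time.UTC)
}

func main() {
	p := policy{QuietStartHour: 22, QuietEndHour: 6, Cooldown: 10 * time.Minute}
	fmt.Printf("%q\n", suppressionReason(p, alertState{}, at(23)))
	fmt.Printf("%q\n", suppressionReason(p, alertState{MonitorOnly: true}, at(12)))
}
```

Returning a reason string rather than a bare boolean keeps each suppression path easy to assert on in tests, which is the kind of proof the commit describes for monitor-only dispatch.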
rcourtman
edae6d1edc refactor: split alert config and callbacks
Extract alert config types, normalization, and identity helpers into internal/alerts/config while preserving the existing alerts package API through aliases and wrappers.

Move Manager callback lifecycle state into a same-package callbackBus, keeping public Set/Subscribe methods unchanged.

Harden metrics SQLite artifacts to owner-only permissions and cover permissive umask behavior.

Proof: go test -json ./internal/api -count=1; go test ./internal/alerts/... ./internal/monitoring ./internal/ai/... ./internal/websocket ./internal/config ./pkg/metrics; go test ./internal/alerts/... ./pkg/metrics
2026-05-06 13:01:32 +01:00
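
For the owner-only hardening mentioned in the commit above, one possible shape is an explicit chmod after the files exist, so the result does not depend on the process umask. This is a minimal sketch, not the code in pkg/metrics: the function names and the metrics.db file name are hypothetical. The -wal and -shm suffixes are the standard SQLite sidecar names in WAL mode.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// ownerOnly grants read/write to the owning user and nothing to anyone else.
const ownerOnly fs.FileMode = 0o600

// hardenSQLiteArtifacts (hypothetical helper) restricts the database file
// and its WAL/SHM sidecars, skipping sidecars that do not exist.
func hardenSQLiteArtifacts(dbPath string) error {
	for _, p := range []string{dbPath, dbPath + "-wal", dbPath + "-shm"} {
		if err := os.Chmod(p, ownerOnly); err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				continue // sidecars only appear in WAL mode
			}
			return fmt.Errorf("chmod %s: %w", p, err)
		}
	}
	return nil
}

// hardenedModeDemo creates a world-readable file, as a permissive umask
// would allow, and returns its permission bits after hardening.
func hardenedModeDemo() (fs.FileMode, error) {
	dir, err := os.MkdirTemp("", "metrics-perm")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	db := filepath.Join(dir, "metrics.db") // hypothetical file name
	if err := os.WriteFile(db, nil, 0o644); err != nil {
		return 0, err
	}
	// Force the permissive mode regardless of the current umask.
	if err := os.Chmod(db, 0o644); err != nil {
		return 0, err
	}
	if err := hardenSQLiteArtifacts(db); err != nil {
		return 0, err
	}
	info, err := os.Stat(db)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	mode, err := hardenedModeDemo()
	fmt.Println(mode, err)
}
```

The sketch assumes a Unix-like filesystem; permission bits behave differently on Windows.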
rcourtman
d6ca8b12e6 Add agentless availability targets
Refs #1460
2026-05-06 10:35:34 +01:00
rcourtman
2f8e5184bd Remove navigation guide modal and reopen control
The four-step coachmark over the top tabs was a tour pretending to be
guidance: each step duplicated the tab title in one sentence, and the
Reopen control on /settings/system-general spawned a centered panel with
no spotlight target because the tabs only exist on dashboard routes.

Delete the modal, the localStorage dismissal key, the reopen event, the
Reopen row in General settings, and the matching guardrails so the
shared-primitives tests stop pinning the deleted owner split. Drop the
WhatsNew dismissal helpers and addInitScript bypasses from the
integration suite, and the dedicated tour test in
19-telemetry-disclosure.
2026-05-06 09:49:15 +01:00
rcourtman
0895916283 Fix self-hosted startup web listener fail-fast
Refs #1461
2026-05-06 09:16:54 +01:00
rcourtman
01474a18b6 Fail closed on incomplete OpenAI SSE streams
Keep the buffered EOF compatibility path for OpenAI-compatible streams that omit [DONE] but provide a terminal finish_reason, while rejecting truncated tool-call streams before they can produce executable tool calls.

Refs #1411

Refs #1412
2026-05-05 22:10:50 +01:00
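
The fail-closed rule in the commit above can be sketched as a check that runs when the stream reaches EOF. This is an illustration under assumptions, not the project's implementation: the streamEnd type, its fields, and validateStreamEnd are hypothetical names, and treating syntactically invalid argument JSON as "truncated" is one possible interpretation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// streamEnd (hypothetical) summarizes what the reader saw before EOF.
type streamEnd struct {
	SawDone      bool     // a "data: [DONE]" sentinel was received
	FinishReason string   // last non-empty finish_reason, if any
	ToolCallArgs []string // accumulated argument JSON, one entry per tool call
}

var errIncompleteStream = errors.New("incomplete SSE stream")

// validateStreamEnd accepts a stream that ended with [DONE] or, for
// compatibility, with a terminal finish_reason. It rejects any stream
// whose tool-call arguments are not complete JSON, so a cut-off call
// can never be executed.
func validateStreamEnd(end streamEnd) error {
	if !end.SawDone && end.FinishReason == "" {
		return fmt.Errorf("%w: EOF without [DONE] or finish_reason", errIncompleteStream)
	}
	for i, args := range end.ToolCallArgs {
		// An empty string is not valid JSON, so it is rejected as well.
		if !json.Valid([]byte(args)) {
			return fmt.Errorf("%w: tool call %d has truncated arguments", errIncompleteStream, i)
		}
	}
	return nil
}

func main() {
	ok := streamEnd{FinishReason: "stop"}
	cut := streamEnd{FinishReason: "tool_calls", ToolCallArgs: []string{`{"node":`}}
	fmt.Println(validateStreamEnd(ok))
	fmt.Println(validateStreamEnd(cut))
}
```

Checking the tool-call arguments even when [DONE] was seen keeps the rejection independent of how the stream terminated.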
rcourtman
d6e96ebeca Fix v6 demo release signing key deployment
2026-05-05 21:40:14 +01:00
rcourtman
4aa91f6af3 Refresh RC4 packet after watcher lifecycle fix
2026-05-05 18:30:06 +01:00
rcourtman
7cebe78859 Fix config watcher stop lifecycle race
2026-05-05 18:26:53 +01:00
rcourtman
868239a648 Stabilize TrueNAS poller enable-disable proof
2026-05-05 16:50:10 +01:00
rcourtman
09c8e75f4d Refresh RC4 packet validation metadata
2026-05-05 16:27:49 +01:00
rcourtman
1a3e5ec27d Fix tenant monitor broadcast nil hub panic
2026-05-05 16:25:00 +01:00
rcourtman
96c2e160c9 Fix RC4 release validation blockers
2026-05-05 15:59:23 +01:00
rcourtman
f149c5d643 Prepare v6.0.0-rc.4 release packet
2026-05-05 15:32:32 +01:00
rcourtman
3f16d7845a Improve Patrol header mobile controls
2026-05-05 15:14:43 +01:00
rcourtman
0c21f82f34 Clarify Agent Security docs entry
2026-05-05 15:13:27 +01:00
rcourtman
cd2abe879e Fix mock mode legacy sidecar drift
2026-05-05 15:12:31 +01:00
rcourtman
d7225a45a0 Fix Proxmox guest memory fallbacks
Also fixes Ceph pool threshold resource identity.

Refs #1341
2026-05-05 14:59:29 +01:00
rcourtman
28981e0f9f Fix Ceph pool threshold resource identity
Refs #1341
2026-05-05 14:52:55 +01:00
rcourtman
35b2deebfb Harden Proxmox guest snapshot polling
Refs #1437
2026-05-05 14:51:28 +01:00
rcourtman
ce7b459aa7 Harden runtime Proxmox token ACLs
2026-05-05 14:42:05 +01:00
rcourtman
30180727ad Harden Proxmox setup token ACLs
2026-05-05 14:19:50 +01:00
rcourtman
c61ea4947a Make Proxmox onboarding API-first
2026-05-05 13:25:17 +01:00
rcourtman
cf103ca9fe Harden root agent service defaults
2026-05-05 13:03:13 +01:00
rcourtman
81b31e4d3b Remove monitored-system volume caps
Retire runtime/API/UI monitored-system volume enforcement now that infrastructure monitoring is no longer capped.

Keep only legacy metadata scrubbing and purchase-start compatibility for old max_monitored_systems references.

Rename the remaining preview surface to monitored-system impact and make previews explanatory rather than save-blocking.

Update subsystem contracts and RA7 evidence for the caps-retired invariant.
2026-05-05 12:59:59 +01:00
rcourtman
aa5472553f Fix Workloads empty state source detection
Refs #1456
2026-05-05 09:42:31 +01:00
rcourtman
632f0af7f3 Keep uncapped continuity from writing raw caps
2026-05-05 09:33:44 +01:00
rcourtman
641660dced Fix mdadm RAID fallback discovery
Refs #1455
2026-05-05 09:29:34 +01:00
rcourtman
fed3b776e0 Fail closed on ambiguous email principal resolution
2026-05-05 09:26:10 +01:00
rcourtman
d91c2afedb Fail closed dry-run action execution
2026-05-05 09:22:04 +01:00
rcourtman
53a928ee2d Prevent contact-email principal takeover
2026-05-05 09:19:29 +01:00
rcourtman
fe30ecc81e Fix TrueNAS CORE agent supervisor restart
Refs #1457
2026-05-05 09:13:03 +01:00
rcourtman
04fb02defc Use stable principals in Stripe webhook fixtures
2026-05-05 09:10:44 +01:00
rcourtman
e5d094a3da Use stable principals in Stripe webhook fixtures
2026-05-05 09:07:54 +01:00
rcourtman
df14e5d356 Pin strict organization identity invariants
2026-05-05 09:06:01 +01:00
rcourtman
235e7343b2 Align AI action audits with execution lifecycle
2026-05-04 23:35:39 +01:00
rcourtman
82a2494ffa Add action execution safety contract
2026-05-04 23:19:58 +01:00
rcourtman
34e890ec67 Align workspace owner proof with stable user IDs
2026-05-04 23:19:04 +01:00
rcourtman
ea0b20cd19 Use strict org principals for runtime access
2026-05-04 23:16:15 +01:00
rcourtman
002d68cef7 Require stored principal for checkout magic links
2026-05-04 23:06:47 +01:00
rcourtman
2040285085 Add action decision API
2026-05-04 22:56:55 +01:00
rcourtman
adaad70077 Canonicalize legacy hosted signup principals
2026-05-04 22:52:00 +01:00
rcourtman
7af1276c3b Fail closed on blank magic-link principals
2026-05-04 22:43:35 +01:00
rcourtman
2fa271bbe9 Fix storage primary issue impact handling
Refs #423
2026-05-04 18:42:09 +01:00