Commit graph

3416 commits

Author SHA1 Message Date
rcourtman
d4ff967815 fix: scope shared storage aggregation to per-instance to prevent cross-instance merging
The shared storage deduplication key was just the storage name, causing
storages with the same name from different Proxmox instances (or PVE + PBS)
to be incorrectly merged into a single entry. This made one random host
appear to have all storages from all instances.

Include the instance name in the aggregation key so shared storage is only
merged within the same Proxmox cluster/instance.

Fixes #1246
2026-02-11 09:18:09 +00:00
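The keying change described in d4ff967815 can be illustrated with a minimal Python sketch. This is not the project's code: the dictionary fields (`instance`, `node`, `name`) and the function name are hypothetical, chosen only to show why adding the instance to the key stops same-named storages from merging across instances.

```python
from collections import defaultdict

def aggregate_shared_storage(storages):
    """Group shared storage entries by (instance, name).

    Hypothetical record shape: {"instance": str, "node": str, "name": str}.
    Keying on the name alone would merge entries from unrelated instances.
    """
    groups = defaultdict(list)
    for s in storages:
        key = (s["instance"], s["name"])  # instance-scoped aggregation key
        groups[key].append(s["node"])
    return dict(groups)
```

With this key, two nodes of the same cluster that report `nfs` collapse into one entry, while an `nfs` storage on a different instance stays separate.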
rcourtman
2ba590d994 fix: fall back to SMART attributes 194/190 for disk temperature
When the top-level temperature.current field is 0 or missing (common
on some SATA drives), temperature was reported as 0°C with no fallback.
Now extracts temperature from ATA SMART attribute 194 (Temperature_Celsius)
or 190 (Airflow_Temperature_Cel) as a fallback.

Fixes #1243
2026-02-11 09:09:55 +00:00
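A rough Python sketch of the fallback order in 2ba590d994, written against the JSON shape that `smartctl -j` emits (`temperature.current` and `ata_smart_attributes.table`). The function name is hypothetical, and masking the raw value to its lowest byte is an assumption made here because some drives pack min/max readings into the upper bytes; the commit does not say how the project handles that.

```python
def disk_temperature(smart):
    """Return the disk temperature in °C from smartctl-style JSON, or None."""
    current = (smart.get("temperature") or {}).get("current") or 0
    if current > 0:
        return current

    # Fall back to ATA attribute 194 (Temperature_Celsius),
    # then 190 (Airflow_Temperature_Cel).
    table = (smart.get("ata_smart_attributes") or {}).get("table") or []
    by_id = {attr.get("id"): attr for attr in table}
    for attr_id in (194, 190):
        attr = by_id.get(attr_id)
        if attr is None:
            continue
        raw = (attr.get("raw") or {}).get("value") or 0
        temp = raw & 0xFF  # assumption: current reading sits in the lowest byte
        if temp > 0:
            return temp
    return None
```

Returning `None` instead of 0 when nothing usable is found lets a caller distinguish "no reading" from a genuine 0°C report.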
rcourtman
284bdd7ade fix: prevent metrics.db bloat with automatic vacuum and WAL checkpointing
The metrics database could grow to 5GB+ for modest setups because:
1. Retention deletes rows hourly but SQLite never reclaims the space
2. WAL file grows unbounded without explicit checkpointing
3. No cleanup runs on startup, so restarts accumulate stale data

Fixes:
- Enable auto_vacuum=INCREMENTAL so deleted pages can be reclaimed
- Run incremental_vacuum after each retention cleanup
- Force WAL checkpoint(TRUNCATE) after deletes to prevent WAL bloat
- Run retention on startup to clean stale data immediately

Expected DB size for a 50-resource setup drops from 5GB+ to ~60-70MB.

Ref: GitHub Discussion #1231
2026-02-10 23:13:32 +00:00
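The SQLite settings named in 284bdd7ade can be tried out with Python's standard `sqlite3` module. This is only a sketch under assumptions: the `metrics` table, its columns, and both function names are hypothetical, and the project's own schema and retention logic are not shown in the commit message. The pragmas themselves (`auto_vacuum`, `incremental_vacuum`, `wal_checkpoint(TRUNCATE)`) are standard SQLite.

```python
import sqlite3

def open_metrics_db(path):
    """Open a database with incremental auto-vacuum and WAL journaling."""
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    # On a new database, auto_vacuum must be set before any table exists
    # (an existing database would need a full VACUUM to switch modes).
    conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
    conn.execute("PRAGMA journal_mode = WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS metrics (ts INTEGER, value REAL)")
    return conn

def run_retention(conn, cutoff_ts):
    """Delete rows older than cutoff_ts, then reclaim the freed space."""
    conn.execute("DELETE FROM metrics WHERE ts < ?", (cutoff_ts,))
    # Release free pages back to the filesystem; fetchall() steps the
    # statement to completion.
    conn.execute("PRAGMA incremental_vacuum").fetchall()
    # Move WAL contents into the main file and truncate the WAL to zero bytes.
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
```

Calling `run_retention` once right after `open_metrics_db` would correspond to the "run retention on startup" item in the list above.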
rcourtman
0787388ae7 chore: bump version to 5.1.8 2026-02-10 21:55:08 +00:00
rcourtman
03939c3f9e fix: deduplicate bind-mounted volumes in disk total calculation
The dedup logic only handled btrfs/zfs subvolumes, but Kubernetes
bind-mounts the same device at both pod and plugin paths, causing
xfs/ext4 volumes to be double-counted. Now deduplicates by
device+totalBytes for all filesystem types.

Fixes #1158
2026-02-10 21:52:25 +00:00
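A minimal Python sketch of deduplicating by device plus total size, as 03939c3f9e describes. The mount record fields (`device`, `total_bytes`) and the function name are hypothetical stand-ins, not the agent's real types.

```python
def total_disk_bytes(mounts):
    """Sum total bytes across mounts, counting each (device, size) pair once."""
    seen = set()
    total = 0
    for m in mounts:
        key = (m["device"], m["total_bytes"])
        if key in seen:
            continue  # same device mounted again, e.g. a bind mount
        seen.add(key)
        total += m["total_bytes"]
    return total
```

Because the key ignores the filesystem type and mount point, an ext4 or xfs volume that appears at both a pod path and a plugin path is counted a single time.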
rcourtman
42c01c1be5 fix: probe all guest IPs for reachability, not just first
Patrol only pinged the first IP address of each VM/container, causing
false "unreachable" reports for guests with multiple IPs (common with
Windows VMs that have IPv6 or multi-adapter setups). Now probes all
IPs and marks reachable if any responds.

Fixes #1215
2026-02-10 21:46:11 +00:00
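The "reachable if any IP responds" rule from 42c01c1be5 reduces to a short-circuiting check. In this Python sketch the `probe` callable is a hypothetical stand-in for whatever ping or TCP check is actually used; it is injected so the logic can be exercised without network access.

```python
def guest_reachable(ips, probe):
    """Return True as soon as any address answers the probe.

    `probe` is a caller-supplied function taking an IP string and
    returning True or False. An empty address list yields False.
    """
    return any(probe(ip) for ip in ips)
```

For example, `guest_reachable(["fe80::1", "10.0.0.5"], probe)` is true when only the second address responds, which is the multi-IP case the fix targets.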
rcourtman
6140cb5be4 fix: auto-default discovery interval to 24h when enabled
When users enable AI discovery without setting an interval, the
default of 0 silently stays in manual-only mode. Now normalizes
0 to 24h on save so discovery actually starts automatically.

Fixes #1225
2026-02-10 21:45:59 +00:00
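The save-time normalization in 6140cb5be4 might look roughly like the following Python sketch. The function name and the use of `timedelta` are assumptions for illustration; the commit only states that an interval of 0 becomes 24h when discovery is enabled.

```python
from datetime import timedelta

DEFAULT_DISCOVERY_INTERVAL = timedelta(hours=24)

def normalize_discovery_interval(enabled, interval):
    """Replace a zero interval with 24h when discovery is enabled."""
    if enabled and interval == timedelta(0):
        return DEFAULT_DISCOVERY_INTERVAL
    return interval
```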
rcourtman
ad97f19cbf fix: add trailing slash to profile API URLs to prevent POST→GET redirect (#1212)
Go's ServeMux 301-redirects /api/admin/profiles to /api/admin/profiles/
when the pattern is registered with a trailing slash. The fetch() API
follows a 301 by changing POST to GET (as the Fetch spec prescribes and
HTTP permits), so the create request silently became a list request,
returning 200 OK without actually creating anything.
2026-02-10 21:32:34 +00:00
rcourtman
a6bc55428c fix: load system settings for proxy auth and OIDC users with local theme (#1219)
The proxy auth and OIDC initialization paths skipped loading system
settings when the user had a local theme preference in localStorage.
This meant shouldHideDockerUpdateActions() always returned false,
so the "Update All" button showed even when disabled server-side.

Add the same else branch that the standard login path already has,
which loads system settings asynchronously regardless of theme pref.
2026-02-10 21:20:17 +00:00
rcourtman
ae4632b5b5 fix: correct UpdateAlertDelayHours doc comment (0 normalizes to 24, -1 disables) 2026-02-10 21:13:12 +00:00
rcourtman
a68e0050f8 fix(docker): use manual CPU delta tracking instead of stale PreCPUStats (#1229)
Docker's one-shot stats API (stream=false) returns PreCPUStats from the
daemon's internal cache, which many Docker versions don't update between
non-streaming reads. This causes every call to return the same stale
PreCPUStats from container start, producing a constant lifetime-average
CPU% (e.g. 3.4%) instead of current usage.

Switch to always using manual delta tracking, which stores the previous
sample from our own reads and computes accurate deltas between collection
cycles. The first cycle returns 0 while establishing a baseline; all
subsequent cycles produce correct current CPU percentages.
2026-02-10 20:49:29 +00:00
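Manual delta tracking as described in a68e0050f8 can be sketched in Python as follows. The class and parameter names are hypothetical; the percentage formula (CPU delta over system delta, scaled by the number of online CPUs) is the conventional one for Docker stats and is assumed here rather than taken from the commit.

```python
class CPUDeltaTracker:
    """Keep the previous sample per container and compute CPU% from deltas."""

    def __init__(self):
        self._prev = {}  # container id -> (total_usage, system_usage)

    def percent(self, container_id, total_usage, system_usage, online_cpus):
        prev = self._prev.get(container_id)
        # Always store the sample we just read as the next baseline.
        self._prev[container_id] = (total_usage, system_usage)
        if prev is None:
            return 0.0  # first cycle: baseline only
        cpu_delta = total_usage - prev[0]
        system_delta = system_usage - prev[1]
        if cpu_delta <= 0 or system_delta <= 0:
            return 0.0  # counter reset or no elapsed time
        return cpu_delta / system_delta * online_cpus * 100.0
```

Since the baseline comes from the collector's own earlier read, the result reflects usage between two collection cycles rather than an average since container start.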
rcourtman
47ceffe0c2 fix(smart): parse raw.string instead of raw.value for SATA attributes (#1239)
Seagate drives pack vendor-specific data in the upper bytes of the
48-bit SMART raw value, causing Power_On_Hours to report billions of
years instead of the actual value. Use smartctl's raw.string field
(e.g. "16951 (223 173 0)") and extract the first integer, which is
the correct interpretation. Falls back to raw.value when the string
is empty or non-numeric.
2026-02-10 20:42:15 +00:00
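The parsing rule in 47ceffe0c2 can be shown with a short Python sketch. The function name is hypothetical; the input is the `raw` object of a smartctl JSON attribute, with its `string` and `value` fields. The sketch reads "first integer" as the integer at the start of the string.

```python
import re

_LEADING_INT = re.compile(r"\s*(\d+)")

def smart_raw_value(raw):
    """Prefer the leading integer of raw.string; fall back to raw.value."""
    match = _LEADING_INT.match(raw.get("string") or "")
    if match:
        return int(match.group(1))
    # String is empty or does not start with a number.
    return int(raw.get("value") or 0)
```

For the example in the commit message, a `raw.string` of `"16951 (223 173 0)"` yields 16951 regardless of what the packed 48-bit `raw.value` contains.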
rcourtman
c3c7bdf4a4 fix(ui): wrap NodeSummaryTable and HostsOverview with ScrollableTable for mobile (#1196)
NodeSummaryTable and HostsOverview used raw overflow-x-auto divs with
min-width: 800px, causing column overlap on narrow screens. Wrap both
with ScrollableTable and bump min-width to 1000px to match actual
column totals. Also bump DockerHostSummaryTable min-width from 800px
to 920px to match its column widths.
2026-02-10 19:24:03 +00:00
rcourtman
26776b2075 fix(agent): apply --disk-exclude to Docker agent disk metrics (#1237)
The Docker agent was not passing the disk exclusion list to
hostmetricsCollect(), so excluded mounts appeared in the Docker tab
disk totals. Also add server-side fsfilters filtering to Docker
report processing for parity with the host agent path.
2026-02-10 16:59:35 +00:00
rcourtman
dea68a7521 fix(ui): change memory balloon segment color from yellow to blue (#1193)
Cherry-pick c2daf4b1 from main. Yellow was confusing users as it
matched the warning threshold color.
2026-02-10 12:57:30 +00:00
rcourtman
47adcbd8af feat(agent): add FreeBSD S.M.A.R.T. disk collection support (#1236)
Relax the Linux-only gate on SMART collection to also run on FreeBSD.
Add FreeBSD disk discovery via sysctl kern.disks (lsblk is Linux-only).
The smartctl invocation and JSON parsing are already platform-agnostic.
2026-02-10 12:44:15 +00:00
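On FreeBSD, `sysctl -n kern.disks` prints disk names separated by whitespace. A minimal Python sketch of turning that output into device paths follows; the function name is hypothetical, and the `/dev/` prefixing is an assumption about what gets passed to smartctl, which the commit does not spell out.

```python
def parse_kern_disks(output):
    """Convert `sysctl -n kern.disks` output into /dev paths."""
    return [f"/dev/{name}" for name in output.split()]
```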
rcourtman
120a1032d6 chore(release): bump version to 5.1.7 2026-02-10 09:24:59 +00:00
rcourtman
72ccd2f839 fix(ui): prevent Docker host summary column overlap on mobile (#1223, #1196)
The host summary table used minWidth='100%' on mobile, forcing 8 columns
into ~375px with table-layout:fixed. This made HOST and Uptime headers
overlap into "HOSTIIME". Set mobileMinWidth to 800px so the table scrolls
horizontally instead of crushing columns.
2026-02-09 23:30:04 +00:00
rcourtman
f7a14feb0f fix(mock): align Docker container store type with real monitor
Mock seeding wrote Docker container metrics as "docker" but the real
monitor uses "dockerContainer". This made mock-mode charts miss the
SQLite store path after the API normalization fix in 7336ec2d.
2026-02-09 22:42:08 +00:00
rcourtman
97f844e19e fix(ui): make Show All columns actually show all including IP/OS (#1222)
resetToDefaults() was restoring defaultHidden (which includes os, ip)
instead of clearing the hidden list. The button says "Show all" so it
should set hiddenColumns to [] rather than reverting to defaults.
2026-02-09 22:35:11 +00:00
rcourtman
e06ebb1bb7 fix(ui): set versionInfo on cached update path so agent badges work (#1228)
When checkForUpdates used cached data, it returned without setting
updateStore.versionInfo. This caused getServerVersion() to return
empty, making checkAgentVersion() skip the comparison and report all
agents as up-to-date (green) regardless of their actual version.
2026-02-09 22:33:56 +00:00
rcourtman
7336ec2d87 fix(metrics): normalize docker resource type in metrics history API (#1229)
Frontend sends resourceType="docker" but the SQLite store uses
"dockerContainer". The /api/metrics-store/history handler now
normalizes the alias so queries return the correct historical data
instead of falling back to a single live data point.
2026-02-09 22:33:24 +00:00
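The alias normalization in 7336ec2d87 amounts to a lookup before the store is queried. A Python sketch, with a hypothetical function name and only the one alias the commit mentions:

```python
_RESOURCE_TYPE_ALIASES = {"docker": "dockerContainer"}

def normalize_resource_type(resource_type):
    """Map frontend aliases onto the type names used by the metrics store."""
    return _RESOURCE_TYPE_ALIASES.get(resource_type, resource_type)
```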
rcourtman
79016ec1b0 fix(ui): add missing v prefix to release download URLs (#1232)
UpdateBanner release-tag links were already fixed; this also fixes
the manual download commands in UpdatesSettingsPanel which were
generating 404 URLs (missing v in both the tag path and asset filename).
2026-02-09 22:23:38 +00:00
rcourtman
c92ccc122e fix(state): deduplicate PVE nodes and AI mention resources (#1217, #1214)
Backend: nodes with the same logical identity (cluster+name) are merged
using a health-weighted preference, preserving host-agent links across
node-ID churn.

Frontend: extract buildMentionResources() with alias-based dedup so
docker hosts and standalone host agents sharing an ID/hostname appear
once in the @ mention autocomplete.
2026-02-09 22:19:55 +00:00
rcourtman
815c990e85 fix(proxmox): avoid 403 on apt update checks 2026-02-09 20:28:09 +00:00
rcourtman
9e8d702a18 chore(release): bump version to 5.1.6 2026-02-09 14:00:53 +00:00
rcourtman
721be9bce6 fix(config): honor legacy env aliases for docker update-action toggle (#1219) 2026-02-09 14:00:24 +00:00
rcourtman
2d78cf4e84 fix(dashboard): restore guest URL add/edit controls in table rows (#1221) 2026-02-09 14:00:15 +00:00
rcourtman
cedf0c8f0f fix(temperature): parse string sensor values without zeroing readings (#1224) 2026-02-09 14:00:09 +00:00
rcourtman
0d6fffbb1c fix(servicediscovery): run automatic refresh for changed/stale resources (#1225) 2026-02-09 14:00:02 +00:00
rcourtman
1f74c12ef8 fix(alerts): preserve docker update delay across host identity churn (#1226) 2026-02-09 13:59:52 +00:00
rcourtman
8036d9c3fd Improve issue triage with version-aware automation 2026-02-08 19:28:24 +00:00
rcourtman
8a48acef1d fix: hotfix 5.1.5 — node duplication, alert scrambling, ntfy resolved formatting
- fix(models): filter nodes by instance in UpdateNodesForInstance to prevent
  PVE node duplication across poll cycles (#1214, #1192, #1217)
- fix(alerts): sort GetActiveAlerts output for stable ordering, preventing
  hostname scrambling in frontend (#1218)
- fix(notifications): add ntfy-specific resolved webhook formatting with
  plain-text body and proper headers (#1213)
- fix(frontend): respect "hide Docker update actions" setting in
  DockerFilter Update All button (#1219)
- fix(frontend): add missing v prefix to GitHub release tag URLs (#1195)
- fix(monitoring): reduce disk detection warning from Warn to Debug to
  eliminate log spam for pass-through disks (#1216)
- chore: bump VERSION to 5.1.5
2026-02-08 11:48:22 +00:00
rcourtman
d1e61d8a8a fix: ship alerting hotfixes and prepare 5.1.4 2026-02-07 22:05:55 +00:00
rcourtman
3d0082c07e chore: update dev paths to /Volumes/Development
Migrated hardcoded paths from ~/Development to /Volumes/Development.
2026-02-07 19:20:37 +00:00
rcourtman
839ed5cc1e docs(release): finalize hotfix 5.1.3 checklist and version bump 2026-02-07 14:18:53 +00:00
rcourtman
f253ed2778 fix(license): harden release key validation and fingerprint logging 2026-02-07 14:18:44 +00:00
rcourtman
6909264a02 fix(alerts): reduce swarm alert noise and preserve notification state (#1096) 2026-02-07 14:18:39 +00:00
rcourtman
13af83f3fc fix(monitoring): preserve recent PVE nodes on empty polls (#1094) 2026-02-07 14:18:33 +00:00
rcourtman
c949e9c9f9 Auto-update Helm chart version to 5.1.2 2026-02-04 23:16:55 +00:00
rcourtman
981fc00d4c Auto-update Helm chart documentation 2026-02-04 23:16:54 +00:00
rcourtman
0f961054c6 fix: allow agent tokens to auto-register Proxmox nodes
The security hardening in beae4c86 added a settings:write scope
requirement to /api/auto-register, but agent install tokens only have
host-agent:report scope. This broke Proxmox auto-registration for all
agent-generated tokens. Accept either settings:write or host-agent:report
scope for auto-registration.

Fixes #1191
2026-02-04 22:55:25 +00:00
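The "accept either scope" rule from 0f961054c6 can be expressed as a set intersection. This Python sketch uses the two scope strings from the commit message; the function name and the representation of a token's scopes as a list of strings are assumptions.

```python
AUTO_REGISTER_SCOPES = {"settings:write", "host-agent:report"}

def can_auto_register(token_scopes):
    """Allow auto-registration when the token holds any accepted scope."""
    return bool(AUTO_REGISTER_SCOPES & set(token_scopes))
```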
rcourtman
3d6488d159 fix: add agent:exec to API scope options in token creation UI
Users couldn't manually select agent:exec when creating tokens
via Settings → Security → API Tokens because it wasn't listed
in the scope options.
2026-02-04 22:42:48 +00:00
rcourtman
247cb0baa6 chore: bump version to 5.1.2 2026-02-04 22:34:10 +00:00
rcourtman
f6338f34fa fix: add agent:exec scope to generated agent tokens
Agent tokens created from the Settings UI and the backend install
command handler were missing the agent:exec scope, which was added
as a security requirement in 60f9e6f0. This caused all newly
installed agents to fail registration with "Agent exec token missing
required scope: agent:exec".

Fixes #1191
2026-02-04 22:33:01 +00:00
rcourtman
69d20abc0c Auto-update Helm chart version to 5.1.1 2026-02-04 21:11:14 +00:00
rcourtman
7d3bf20b3e Auto-update Helm chart documentation 2026-02-04 21:11:13 +00:00
rcourtman
5bbc4329bd Remove pprof diagnostics endpoint 2026-02-04 20:44:00 +00:00
rcourtman
a37b59b7e4 Add admin-gated pprof diagnostics endpoint 2026-02-04 20:39:24 +00:00
rcourtman
0635d91581 chore: bump version to 5.1.1 2026-02-04 20:37:10 +00:00