Users can now pass --env KEY=VALUE (repeatable) to the install script to
inject custom environment variables into the agent's service file. Useful
for KUBECONFIG and similar paths not auto-detected by the installer.
The Settings UI adds a textarea for entering env vars that get appended
to the generated install command. Both frontend and script validate key
format and reject unsafe value characters.
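A minimal sketch of the validation described above, written in Go for illustration only. The key pattern and the set of rejected characters are assumptions, as is the name parseEnvArg; the actual rules in the install script and Settings UI may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// envKeyPattern is an assumed key format (POSIX-style identifier).
var envKeyPattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// unsafeValueChars is a hypothetical deny-list of shell-significant characters.
const unsafeValueChars = "`$\"'\\;&|<>\n\r"

// parseEnvArg splits one KEY=VALUE argument and validates both halves.
func parseEnvArg(arg string) (string, string, error) {
	key, value, ok := strings.Cut(arg, "=")
	if !ok {
		return "", "", fmt.Errorf("expected KEY=VALUE, got %q", arg)
	}
	if !envKeyPattern.MatchString(key) {
		return "", "", fmt.Errorf("invalid env key %q", key)
	}
	if strings.ContainsAny(value, unsafeValueChars) {
		return "", "", fmt.Errorf("unsafe character in value for %q", key)
	}
	return key, value, nil
}

func main() {
	for _, arg := range []string{"KUBECONFIG=/etc/kube/config", "1BAD=x", "FOO=$(id)"} {
		k, v, err := parseEnvArg(arg)
		fmt.Println(k, v, err)
	}
}
```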
When a PVE node has a linked host agent (or vice versa), they now merge
into a single mention resource instead of appearing as duplicate entries.
Uses alias cross-referencing via both linkedHostAgentId and linkedNodeId
(node-backend-id) to handle one-way and two-way links.
Two changes to prevent duplicates in Settings > Virtual Environment:
1. Install script: only clear Proxmox state files on fresh installs,
not upgrades. Previously every install forced re-registration.
2. Auto-register dedup: match agent re-registrations by server name
when both the existing entry and new request have Pulse-created
tokens (pulse-monitor@pam!pulse-*). This catches the case where
the agent creates a new token after state files are cleared.
Add pencil icon + link column to the hosts overview table and the
docker unified table (containers and services), matching the existing
VM/guest URL column pattern. Uses the shared UrlEditPopover component
and existing metadata APIs. No backend changes needed.
The agent gate only allowed temperature collection on Linux (lm-sensors).
FreeBSD exposes CPU and ACPI thermal zone temperatures via sysctl
(dev.cpu.N.temperature, hw.acpi.thermal.tzN.temperature). Parse sysctl
output directly in Go without shell involvement.
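As an illustration of parsing sysctl output in Go, the sketch below assumes the usual FreeBSD text form "name: 45.0C" and a hypothetical helper parseSysctlTemps; it is not the agent's actual parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSysctlTemps maps sysctl names ending in ".temperature" to degrees
// Celsius. Lines that do not match the assumed "name: 45.0C" form are skipped.
func parseSysctlTemps(output string) map[string]float64 {
	temps := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		name, raw, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		name = strings.TrimSpace(name)
		if !strings.HasSuffix(name, ".temperature") {
			continue
		}
		raw = strings.TrimSuffix(strings.TrimSpace(raw), "C")
		v, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			continue
		}
		temps[name] = v
	}
	return temps
}

func main() {
	out := "dev.cpu.0.temperature: 45.0C\nhw.acpi.thermal.tz0.temperature: 27.9C\n"
	fmt.Println(parseSysctlTemps(out))
}
```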
Recovery notifications were silently disabled for users with pre-5.1.12
configs because the NotifyOnResolve bool field defaults to false when
absent from JSON. Use a *bool probe to detect a missing field and default
to true in that case.
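The *bool probe can be sketched as follows. The JSON key name, the AlertConfig type and loadAlertConfig are assumptions made for the example; only the technique (a pointer field is nil when the key is absent) is taken from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AlertConfig is a stand-in for the real config struct.
type AlertConfig struct {
	NotifyOnResolve bool `json:"notifyOnResolve"` // key name is assumed
}

// loadAlertConfig decodes the config, then probes the same bytes with a
// *bool so that an absent field can be told apart from an explicit false.
func loadAlertConfig(data []byte) (AlertConfig, error) {
	var cfg AlertConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cfg, err
	}
	var probe struct {
		NotifyOnResolve *bool `json:"notifyOnResolve"`
	}
	if err := json.Unmarshal(data, &probe); err != nil {
		return cfg, err
	}
	if probe.NotifyOnResolve == nil {
		cfg.NotifyOnResolve = true // missing in older configs: default on
	}
	return cfg, nil
}

func main() {
	for _, in := range []string{`{}`, `{"notifyOnResolve":false}`} {
		cfg, err := loadAlertConfig([]byte(in))
		fmt.Println(in, cfg.NotifyOnResolve, err)
	}
}
```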
Patrol trigger queue filled with warnings when the patrol loop wasn't
running. Gate TriggerPatrolForAlert on p.running and clear the flag
via defer when the loop exits.
When editing an existing webhook, header values are masked as
***REDACTED*** for security. The "Test" button in the edit form
sent these redacted values to the webhook endpoint, causing auth
failures (HTTP 403) on services like ntfy.sh that require tokens.
The test button outside the edit form worked because it used the
server-side saved config with real header values.
Fix: frontend now includes the webhook ID in test payloads for
existing webhooks, and the backend TestWebhook handler merges
redacted values with the saved originals before sending the test
(same logic already used by UpdateWebhook).
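The merge step might look roughly like this. mergeRedactedHeaders and the map-based header representation are hypothetical; the real TestWebhook and UpdateWebhook handlers may structure this differently.

```go
package main

import "fmt"

// redactedMarker is the placeholder shown in the edit form.
const redactedMarker = "***REDACTED***"

// mergeRedactedHeaders returns the incoming headers with every redacted
// value replaced by the saved original, when one exists for that key.
func mergeRedactedHeaders(incoming, saved map[string]string) map[string]string {
	merged := make(map[string]string, len(incoming))
	for k, v := range incoming {
		if v == redactedMarker {
			if orig, ok := saved[k]; ok {
				v = orig
			}
		}
		merged[k] = v
	}
	return merged
}

func main() {
	saved := map[string]string{"Authorization": "Bearer tk_example"}
	incoming := map[string]string{"Authorization": redactedMarker, "X-Title": "Pulse"}
	fmt.Println(mergeRedactedHeaders(incoming, saved))
}
```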
The inline URL edit was removed from the hosts table in 5.1.2 and
the only way to set a host URL was buried in the Discovery tab
(which requires AI features and a successful discovery scan).
Adds a Web Interface URL field directly in the System card of the
host drawer's Overview tab - always accessible without AI/Pro.
Users can add, edit, and remove URLs with inline editing.
Also fixes the Discovery tab using GuestMetadataAPI instead of
HostMetadataAPI for host URLs, which caused saved URLs to not
appear in the hosts table.
Two remaining issues from #1255 after the 5.1.10 fixes:
1. OIDC/SAML provider edit fields appeared blank because the GET
endpoint returned a flattened response while the frontend reads
nested oidc/saml objects. Now returns the full provider config
with secrets redacted (client secret, SP private key).
2. SSO users didn't appear in Settings > Users because RBAC entries
were only created when group-role mappings matched. Now ensures
every SSO user is registered in RBAC on login, even without
role mappings.
Also fixes: SAML SP private key and certificate lost on edit (no
preservation logic existed), OIDC client secret preservation
hardened to check actual secret presence not just flag.
Proxmox status.Mem includes page cache as "used" memory, inflating
reported VM usage. The existing fallbacks (balloon meminfo, RRD, linked
host agent) were frequently unavailable, causing most VMs to fall
through to the inflated status-mem source.
Adds a new last-resort fallback that reads /proc/meminfo via the QEMU
guest agent file-read endpoint to get accurate MemAvailable. Results
are cached (60s positive, 5min negative backoff for unsupported VMs).
Also fixes: RRD memavailable fallback missing from traditional polling
path, cache key collisions in multi-PVE setups, FreeMem underflow
guard inconsistency, and integer overflow in kB-to-bytes conversion.
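A sketch of extracting MemAvailable from /proc/meminfo text with an overflow check on the kB-to-bytes conversion. parseMemAvailable is a hypothetical helper, and the guest agent file-read call and the caching layer are left out.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseMemAvailable returns MemAvailable in bytes. The bool is false when
// the field is missing, malformed, or would overflow uint64 once scaled.
func parseMemAvailable(meminfo string) (uint64, bool) {
	for _, line := range strings.Split(meminfo, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != "MemAvailable:" {
			continue
		}
		kb, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil || kb > math.MaxUint64/1024 {
			return 0, false
		}
		return kb * 1024, true
	}
	return 0, false
}

func main() {
	fmt.Println(parseMemAvailable("MemTotal: 16384 kB\nMemAvailable: 8192 kB\n"))
}
```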
The --disk-exclude agent flag only filtered local metric collection but
had no effect on server-side Proxmox disk health and SSD wearout alerts,
which poll the Proxmox API directly. Users excluding disks (e.g.
--disk-exclude sda) still received alerts for those disks.
Agent now sends its DiskExclude patterns in each report. The server
stores them on the Host model and consults them during Proxmox disk
polling — excluded disks get a synthetic healthy status passed to
CheckDiskHealth so any existing alerts clear immediately.
Also adds FreeBSD pseudo-filesystem types (fdescfs, devfs, linprocfs,
linsysfs) to the virtual FS filter and /var/run/ to special mount
prefixes, fixing false disk-full alerts on FreeBSD for fdescfs mounts.
When no guest agent MemInfo or RRD data is available, prefer the linked
Pulse host agent's memory (read from /proc/meminfo via gopsutil, which
excludes page cache) over Proxmox's status.Mem (total - free, inflated
by reclaimable cache). Applied to both efficient and traditional polling
paths. Diagnostic fields added to VMMemoryRaw for visibility.
The --cacert flag was only used for curl during installation. On systems
with custom CA certificates (e.g. TrueNAS CORE with certs in
/etc/certificates/CA), the agent process had no way to trust the custom
CA and users had to fall back to --insecure.
Set SSL_CERT_FILE in the agent's runtime environment when --cacert is
provided. Go's crypto/x509 reads this natively, so the agent trusts the
custom CA without any binary changes. All service types are covered:
systemd, upstart, launchd, FreeBSD rc.d, OpenRC, SysV init, and Unraid.
Also validates the --cacert path at install time: directories and missing
paths now fail early with a clear message instead of silently proceeding.
registerWithPulse() was a one-shot call at agent startup — if it failed
(timing, transient network, Pulse not ready), the agent silently continued
as a generic Host forever. Wrap the HTTP POST in a retry loop with
exponential backoff (5s, 10s, 20s, 40s, 60s) and distinguish 4xx errors
(no retry) from 5xx/network errors (retry).
Probe ~/.docker/run/docker.sock for RuntimeDocker and RuntimeAuto
before falling back to /var/run/docker.sock. This lets the agent
connect on macOS without requiring DOCKER_HOST to be set manually.
Ref #1200
saveToDisk used os.WriteFile which doesn't sync to disk before the
atomic rename. On CI runners with aggressive filesystem caching this
can leave the destination file with zero bytes, causing
TestKnowledgeStore_SaveLoad to fail with "unexpected end of JSON input".
syscall.SysctlRaw is Darwin-only in Go's standard library; FreeBSD
requires the equivalent from golang.org/x/sys/unix. This fixes the
Docker cross-compilation build failure for the freebsd/amd64 target.
(cherry picked from commit 5fe16c75a075b817f90b7192d8270a7bd6677017)
Extends the TrueNAS SCALE installer to also support TrueNAS CORE
(FreeBSD-based). The installer auto-detects the platform and configures
the appropriate service manager: systemd for SCALE, rc.d for CORE.
- Rename is_truenas_scale() to is_truenas() with FreeBSD detection
- Add FreeBSD rc.d service script generation with placeholder substitution
- Add FreeBSD bootstrap script for Init/Shutdown task persistence
- Split install/uninstall paths by OS throughout the TrueNAS block
- Add --cacert <path> flag for custom CA bundles (wired to curl only,
not passed to the agent binary)
- Fix --cacert incorrectly mapping to --insecure in exec args
- Fix missing closing quote on RCSCRIPT_LINK in FreeBSD bootstrap
- Fix unreachable echo after exit 0 in FreeBSD bootstrap
Co-authored-by: wilddev65 <wilddev65@users.noreply.github.com>
(cherry picked from commit affdbaeebaf2b1135431b232593122f464c6bb53)
Backport of v6 commits a87c9950 and 347d7db1.
Part 1 (a87c9950): Wrap the four guest agent c.get() errors with
fmt.Errorf("guest agent ...: %w", err) so isVMSpecificError() correctly
scopes them to the VM rather than the cluster endpoint.
Part 2 (347d7db1): Replace the 20+ pattern blocklist in
executeWithFailover with an allowlist via isEndpointConnectivityError().
Only true TCP/DNS/TLS failures mark an endpoint unhealthy. Any HTTP
response from Proxmox — including 500 — proves the node is reachable
and returns the error without affecting endpoint health.
Windows VMs and VMs without qemu-guest-agent triggered an uncached
GetVMRRDData HTTP call on every poll cycle. Add vmRRDMemCache using the
same read-through cache pattern as nodeRRDMemCache (shared rrdCacheMu,
same TTL, same cleanup path).
(cherry picked from commit 582f16004a0f275de4c458e5d288be70eee613e4)
Add per-container custom URL inputs to the expanded Docker agent row in
Settings → Agents. URLs are stored via DockerMetadataAPI with the key
format {hostId}:container:{containerId}, matching the backend migration
that preserves URLs across container image updates.
(cherry picked from commit 0babb2e546a0a1edaf3210073e3d5b6b7655239b)
#1197: Add Custom URL input to the expanded host row in Settings → Agents.
Loads existing URL via HostMetadataAPI on row expand; saves on button click.
Only shown for host-type agent rows.
#1210: Fix agent_connected always false for Docker hosts on Proxmox VMs.
connectedAgentHostnames now also marks Docker host hostnames reachable when
their matching VM/LXC has a node with a connected Proxmox agent, mirroring
the routing logic already used in the control path.
#1267/#1269: Improve Proxmox auto-registration failure logging. Response body
is now included in the error message, and the warning directs users to delete
the state file to force re-registration rather than claiming the node exists.
(cherry picked from commit 305f6d3c94f0da4fc970450a6304da57d6d7fe80)
#1266: ageFormatted already includes 'ago' from formatTimeDiff(); remove the
duplicate literal suffix from the backup age tooltip in GuestRow.tsx.
#1143: Remove /var/lib/docker from specialMountPrefixes so real block devices
mounted there are visible in disk usage. Container overlay layers (fstype=overlay)
are already filtered by virtualFSTypes and are unaffected.
(cherry picked from commit 5acef3405d4288f627788675123e266d661c2fe3)
Linux VM page cache (#1270): QEMU VM memory now falls back to Proxmox
RRD's memavailable metric (which excludes reclaimable page cache) when
the qemu-guest-agent doesn't provide MemInfo.Available. Previously the
fallback was detailedStatus.Mem (total - MemFree), inflating usage to
80%+ on VMs with normal Linux page cache. Mirrors the existing LXC
rrd-memavailable path.
FreeBSD ZFS ARC (#1264, #1051): The host agent now reads
kstat.zfs.misc.arcstats.size via SysctlRaw on FreeBSD and subtracts
the ARC size from reported memory usage. ZFS ARC is reclaimable under
memory pressure (like Linux SReclaimable) but gopsutil counts it as
wired/non-reclaimable, causing false 90%+ memory alerts on TrueNAS
and FreeBSD hosts. Build-tagged so it compiles cleanly on all platforms.
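The adjustment itself is simple arithmetic; a sketch with an underflow guard follows. adjustUsedForARC is a hypothetical name, and leaving the value unchanged when the ARC size exceeds used memory is an assumption about how inconsistent readings should be handled.

```go
package main

import "fmt"

// adjustUsedForARC subtracts the ZFS ARC size from used memory. If the two
// readings are inconsistent (ARC larger than used), used is returned as-is
// rather than wrapping around.
func adjustUsedForARC(used, arcSize uint64) uint64 {
	if arcSize > used {
		return used
	}
	return used - arcSize
}

func main() {
	const gib = uint64(1) << 30
	total, used, arc := 16*gib, 15*gib, 9*gib
	adj := adjustUsedForARC(used, arc)
	fmt.Printf("raw %.1f%%, adjusted %.1f%%\n",
		float64(used)/float64(total)*100, float64(adj)/float64(total)*100)
}
```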
Fixes #1270
Fixes #1264
Fixes #1051
(cherry picked from commit 94502f83ff9ffc6da28aaadc946a2f7d8b4e9bac)
Full-width setting resets (#1130): loadFromServer() was skipping the
server sync when localStorage already had a value. Server is now always
the source of truth on session load, so the saved preference propagates
across browsers and after reinstalls.
Webhook test 403 after save (#1273): selectService() was unconditionally
resetting headerInputs to the service template defaults, stripping any
user-added auth headers (e.g. Authorization: Bearer for ntfy.sh) when
editing an existing webhook. Now only resets headers when creating a new
webhook (when !editingId()).
Fixes #1130
Fixes #1273
(cherry picked from commit f9b62f0ae726785ef3134c24bc24d99e71d11d7e)
Docker container URL preserved on update (#1054): container updates
recreate the container with a new runtime ID. The agent now includes
{oldContainerId, newContainerId} in the completion ACK payload; the
server uses this to copy persisted metadata (custom URLs, descriptions,
tags) to the new ID so nothing is lost. Migration is a copy, not a move,
so rollback scenarios still find metadata under the original ID.
Reduce metrics.db write amplification (#1124): add a UNIQUE index on
(resource_type, resource_id, metric_type, timestamp, tier) so rollup
reprocessing after a failed checkpoint uses INSERT OR IGNORE instead of
creating duplicate rows. Existing duplicates are deduplicated once on
startup if the index creation would otherwise fail. Also sets
wal_autocheckpoint(500) to checkpoint the WAL more frequently, preventing
unbounded WAL growth.
Fixes #1054
Fixes #1124
FreeBSD auto-update (#1254): determineArch() now includes freebsd in its
OS switch, producing freebsd-amd64/arm64 instead of falling through to
a uname -m fallback that incorrectly returned linux-<arch>. FreeBSD agents
were downloading Linux ELF binaries and failing to exec them.
Docker rootless socket (#1200): buildRuntimeCandidates() now probes
/run/user/<uid>/docker.sock before the system-wide /var/run/docker.sock,
enabling auto-detection of Docker rootless installations.
Duplicate PVE/PBS hosts (#1245, #1252): handleSecureAutoRegister() now
deduplicates by host URL, updating the existing instance's token in-place
instead of appending a duplicate entry on each re-run of the setup script.
Fixes #1254
Fixes #1200
Fixes #1245
Fixes #1252
(cherry picked from commit 0f1d9e9b9fea6c8b9e65872e8a78e25f93653eef)
The enable/disable toggle PUT sends back the flat list-response shape
(no nested oidc/saml objects). handleUpdateSSOProvider was unmarshaling
this directly, leaving OIDC and SAML as nil and overwriting all stored
credentials on every toggle.
Now preserves existing sub-config objects when the incoming payload omits
them, matching the existing ClientSecret preservation behaviour.
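The preservation rule can be sketched as a small merge step. The struct shapes and mergeProviderUpdate are simplified assumptions, not the actual handleUpdateSSOProvider types.

```go
package main

import "fmt"

// Simplified stand-ins for the stored provider config.
type OIDCConfig struct{ ClientID, ClientSecret string }
type SAMLConfig struct{ EntityID string }

type Provider struct {
	ID      string
	Enabled bool
	OIDC    *OIDCConfig
	SAML    *SAMLConfig
}

// mergeProviderUpdate applies incoming over existing, but keeps the stored
// OIDC/SAML sub-configs when the payload omits them (flat toggle shape).
func mergeProviderUpdate(existing, incoming Provider) Provider {
	if incoming.OIDC == nil {
		incoming.OIDC = existing.OIDC
	}
	if incoming.SAML == nil {
		incoming.SAML = existing.SAML
	}
	return incoming
}

func main() {
	stored := Provider{ID: "corp", Enabled: true, OIDC: &OIDCConfig{ClientID: "pulse", ClientSecret: "s3cret"}}
	toggle := Provider{ID: "corp", Enabled: false} // flat payload, no sub-configs
	merged := mergeProviderUpdate(stored, toggle)
	fmt.Println(merged.Enabled, merged.OIDC != nil)
}
```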
Fixes part of #1255
(cherry picked from commit 44868e99d66aa157f5c62d100151a6f8bc940205)
r.ssoConfig was never loaded from persistence in NewRouter(), so on every
restart all SSO providers were silently discarded (handleListSSOProviders
would reinitialize to an empty config on the first request).
Also adds ssoProviders to /api/security/status so the login page can
render SAML/OIDC login buttons for enabled providers.
Fixes part of #1255
(cherry picked from commit 395cd101ff4acb1b7f89ec3d907b84cbec217dc8)
TriggerPatrolForAlert was enqueuing into adHocTrigger regardless of
whether Patrol was enabled. With patrolLoop not running (disabled),
nothing drained the channel — it filled on the 10th alert and spammed
"Patrol trigger queue full, dropping trigger" on every subsequent alert.
Read p.config.Enabled in the same RLock as triggerManager and return
early when disabled.
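A reduced model of the gate, assuming a buffered channel with a non-blocking send. The patrol struct, its field names, and the string alert ID are hypothetical simplifications of the real service.

```go
package main

import (
	"fmt"
	"sync"
)

// patrol is a reduced model: a config flag plus a bounded trigger queue.
type patrol struct {
	mu           sync.RWMutex
	enabled      bool
	adHocTrigger chan string
}

// TriggerPatrolForAlert enqueues an alert ID only when patrol is enabled.
// It returns false when disabled or when the queue is already full.
func (p *patrol) TriggerPatrolForAlert(alertID string) bool {
	p.mu.RLock()
	enabled := p.enabled
	p.mu.RUnlock()
	if !enabled {
		return false // nothing would drain the queue, so do not fill it
	}
	select {
	case p.adHocTrigger <- alertID:
		return true
	default:
		return false
	}
}

func main() {
	p := &patrol{adHocTrigger: make(chan string, 10)}
	for i := 0; i < 20; i++ {
		p.TriggerPatrolForAlert(fmt.Sprintf("alert-%d", i))
	}
	fmt.Println("queued while disabled:", len(p.adHocTrigger))
}
```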
Fixes #1258
(cherry picked from commit 69f399469538f0c9cd59084f6429fed8a793c042)
Recovery (all-clear) notifications were being silently suppressed during
quiet hours for any non-critical alert. Since powered-off alerts default
to Warning level, users who received an alert at 2pm would never get the
recovery notification if the VM came back during quiet hours.
Quiet hours are intended to suppress noisy firing alerts, not to hide
the fact that an issue has resolved. If you got the alert, you should
always get the all-clear.
Remove the ShouldSuppressResolvedNotification gate from handleAlertResolved.
The notifyOnResolve toggle (explicit user preference) is still respected.
Fixes #1259
When syscall.Exec() fails after the binary has already been atomically
replaced on disk, the old process would log an error and keep running
indefinitely with stale code. The next update check (1 hour later) sees
the on-disk version matches the server and skips the update — so the
restart is never retried.
Now the agent exits with code 1 when this happens, allowing systemd (or
any service manager) to restart it with the new binary. This fixes the
"temperature broken after each upgrade" reports where users had to
manually reinstall the agent after every Pulse server upgrade.
Fixes #1247
The SAML route registration (bee3d05f) was incomplete: the auth
middleware uses exact-match for public paths, so /api/saml/{id}/login
etc. would be blocked. Add prefix-based auth bypass for /api/saml/
paths and update route inventory tests for both SSO and SAML routes.
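The difference between exact-match and prefix-based bypass can be shown in a few lines. The contents of the exact-match set and the name isPublicPath are assumptions for the example; only the /api/saml/ prefix comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// publicExact is an assumed set of exact-match public paths.
var publicExact = map[string]bool{
	"/api/security/status": true,
}

// publicPrefixes lists path prefixes that skip authentication.
var publicPrefixes = []string{"/api/saml/"}

// isPublicPath checks exact matches first, then prefixes.
func isPublicPath(path string) bool {
	if publicExact[path] {
		return true
	}
	for _, prefix := range publicPrefixes {
		if strings.HasPrefix(path, prefix) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"/api/saml/corp/login", "/api/state"} {
		fmt.Println(p, isPublicPath(p))
	}
}
```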
VACUUM creates a full copy of the database. Running retention first
deletes stale data (5GB → ~60MB live), so the VACUUM copies far less
data — faster startup and much less temporary disk space needed.
The auto_vacuum(INCREMENTAL) pragma from the previous commit only takes
effect on new databases. SQLite requires a full VACUUM to restructure
existing files when switching from NONE to INCREMENTAL. Without this,
users upgrading from bloated 5GB+ databases would never reclaim space.
Adds a one-time migration on startup that detects the current auto_vacuum
mode and runs VACUUM to convert if needed. Subsequent startups skip the
migration since the mode is already INCREMENTAL.
The SAML handler functions existed but were never registered in
setupRoutes(), causing 404s for all SAML authentication flows.
Adds /api/saml/ prefix route with dispatcher for all 5 endpoints.
The SSO handler functions and frontend were implemented but the HTTP
routes were never registered in setupRoutes(), causing 404 on all
/api/security/sso/providers endpoints.
Fixes #1248
When two standalone (non-clustered) PVE hosts share the same storage (NFS,
etc.), both instances see the same backup files during polling. Each instance
creates its own StorageBackup entry, causing guests with the same VMID on
different hosts to incorrectly show each other's backups.
Detect shared-storage duplicates by checking if the same volid appears across
multiple instances. When it does AND the VMID is ambiguous (exists on multiple
instances), skip the backup in SyncGuestBackupTimes rather than guessing which
instance owns it. This uses the same ambiguity pattern already applied to PBS
backups.
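The ambiguity check described above can be sketched as follows. The Backup and Guest shapes and the function ambiguousVolIDs are simplified assumptions; the real SyncGuestBackupTimes works on richer models.

```go
package main

import "fmt"

// Simplified stand-ins for the polled data.
type Backup struct {
	Instance string
	VolID    string
	VMID     int
}

type Guest struct {
	Instance string
	VMID     int
}

// ambiguousVolIDs returns the volids that must be skipped: seen from more
// than one instance AND belonging to a VMID that exists on several instances.
func ambiguousVolIDs(backups []Backup, guests []Guest) map[string]bool {
	volInstances := map[string]map[string]bool{}
	for _, b := range backups {
		if volInstances[b.VolID] == nil {
			volInstances[b.VolID] = map[string]bool{}
		}
		volInstances[b.VolID][b.Instance] = true
	}
	vmidInstances := map[int]map[string]bool{}
	for _, g := range guests {
		if vmidInstances[g.VMID] == nil {
			vmidInstances[g.VMID] = map[string]bool{}
		}
		vmidInstances[g.VMID][g.Instance] = true
	}
	skip := map[string]bool{}
	for _, b := range backups {
		if len(volInstances[b.VolID]) > 1 && len(vmidInstances[b.VMID]) > 1 {
			skip[b.VolID] = true
		}
	}
	return skip
}

func main() {
	backups := []Backup{
		{"pve-a", "nfs:backup/vzdump-qemu-100.vma", 100},
		{"pve-b", "nfs:backup/vzdump-qemu-100.vma", 100},
	}
	guests := []Guest{{"pve-a", 100}, {"pve-b", 100}}
	fmt.Println(ambiguousVolIDs(backups, guests))
}
```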
Fixes #1177