Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-04-28 03:20:11 +00:00

Author	SHA1	Message	Date
rcourtman	8a43a964b6	fix(ai): wire patrol circuit breaker on first-time configure	2026-03-13 12:10:14 +00:00
rcourtman	ae2edbde20	fix(ai): complete wiring on first-time configure; guard Ollama fallback Three follow-up fixes: 1. RestartAIChat() now performs the full post-start wiring (MCP providers, patrol adapter, investigation orchestrator) when the service starts for the first time via Restart(). Previously these were only wired via StartAIChat(), leaving first-time configure with a partially wired service. 2. The Ollama→OpenAI-compatible fallback in createProviderForModel is now guarded by !strings.HasPrefix(modelStr, "ollama:") so explicit "ollama:llama3" models are never silently rerouted to a different provider. 3. Windows install script registration check now uses the $Hostname override (if set) instead of always looking up $env:COMPUTERNAME, so post-install verification works correctly when a custom hostname is specified.	2026-03-13 12:06:08 +00:00
rcourtman	6b317f08d2	fix(agent): add --hostname support to Windows PowerShell install script Adds $Hostname / $env:PULSE_HOSTNAME parameter so users can set a custom display name at install time, matching the Linux install.sh behaviour. Persists to config.json and passes --hostname to the agent binary args. Closes discussion #818	2026-03-13 11:54:12 +00:00
rcourtman	e137f3fbf7	fix(ai): start chat service on first-time configure without restart When Pulse starts before AI is configured, legacyService is nil. Saving AI settings called Restart() which bailed immediately on the nil check, leaving the service unstarted (503 on /api/ai/sessions) until a full process restart. Merged the nil and !IsRunning checks so first-time configure now starts the service inline, same as the already-handled stopped case. Also: bare model names that ParseModelString routes to Ollama (e.g. "qwen3-omni") now fall back to a configured custom OpenAI base URL when Ollama is not explicitly configured — handles manually-typed model names on self-hosted OpenAI-compatible endpoints. Fixes #1339, #1296	2026-03-13 11:13:27 +00:00
rcourtman	fde4d9124e	fix(frontend): defer discovery tab initialization until opened	2026-03-10 23:14:30 +00:00
rcourtman	d05a00b931	fix(monitoring): smooth transient VM memory fallback spikes	2026-03-10 23:06:17 +00:00
rcourtman	40a85175be	fix(frontend): preserve drawer chart range across live updates	2026-03-10 22:56:30 +00:00
rcourtman	afcfb23a30	fix(monitoring): retain intermittent FreeBSD SMART data	2026-03-10 22:52:25 +00:00
rcourtman	1a582ccc35	fix(diagnostics): honor PVE fingerprint in diagnostics probe	2026-03-10 22:46:12 +00:00
rcourtman	92b6da83ea	Refine tooltip labels: Reclaimable cache, Shown in Proxmox	2026-03-10 10:35:19 +00:00
rcourtman	9601afb44c	Rename Cache to Reclaimable and add Proxmox reconciliation in tooltip Rename the amber segment label from "Cache" to "Reclaimable" to avoid jargon confusion. Add a "Proxmox view: X%" line in the tooltip so users immediately see why the percentage differs from Proxmox (which includes reclaimable cache as used memory).	2026-03-10 10:26:50 +00:00
rcourtman	7dab977d91	Add split memory bar showing Used | Cache | Free segments (#1302) Show reclaimable buff/cache as a distinct amber segment between used (green) and free (gray) in the memory bar. This explains why Pulse's memory percentage differs from Proxmox: Pulse reports cache-aware usage (MemAvailable) while Proxmox includes cache as used (Total-Free). Backend: add Cache field to Memory model, derived from MemInfo (Available - Free). Only uses MemInfo.Free (not FreeMem fallback) to avoid inflating cache by the balloon gap on ballooned VMs. Frontend: StackedMemoryBar renders three segments with tooltip breakdown. Tooltip Free accounts for balloon limit when active. Percentage label and alerts remain cache-aware (unchanged).	2026-03-10 10:16:14 +00:00
rcourtman	5498575b8f	Auto-update Helm chart documentation	2026-03-09 22:25:17 +00:00
rcourtman	83d3e3e95e	Bump version to 5.1.23	2026-03-09 21:49:21 +00:00
rcourtman	7a394ed724	Use explicit success flag for disk carry-forward guard (#1319) Replace the diskUsage <= 0 heuristic with a diskFromAgent bool that is only set when the guest agent actually returns valid filesystem data. Prevents carry-forward from firing on a genuine 0% disk reading.	2026-03-09 18:54:27 +00:00
rcourtman	9c279732f7	Skip disk carry-forward when guest agent is explicitly disabled (#1319) Prevents stale disk data from persisting indefinitely in the efficient poller when a user disables the guest agent after it had been providing data. Matches the fallback poller's agent-disabled exclusion.	2026-03-09 18:37:38 +00:00
rcourtman	abbd0df609	Fix disk metric spikes when guest agent intermittently fails (#1319) Carry forward previous cycle's disk data when the QEMU guest agent times out or errors, instead of falling back to Proxmox cluster/resources which always reports 0 for VM disk usage. Applied to both polling paths (pollVMsAndContainersEfficient and pollVMsWithNodes) with safety guards against uint64 underflow and permanent-failure exclusions.	2026-03-09 18:23:15 +00:00
rcourtman	a4b0771974	Prevent removed host agents from resurrecting via in-flight reports (#1331) Host agents removed from the UI would reappear on the next report cycle because there was no rejection mechanism — unlike Docker agents which already had resurrection prevention. Mirror the Docker agent pattern: - Track removed host IDs in a `removedHosts` map with 24hr TTL - Persist removal records in `State.RemovedHosts` for frontend display - Reject reports from removed hosts in `ApplyHostReport()` - Add `AllowHostReenroll()` + API route to clear the block - Show removed host agents in the Settings UI with "Allow re-enroll" - Sync removed-agent maps from state on startup for all agent types - Fix mock integration snapshot missing `RemovedDockerHosts` field	2026-03-09 17:52:34 +00:00
rcourtman	9b531c547d	Fix recovery notifications silently disabled by config PUT (#1332) Two fixes for missing recovery/resolved notifications: 1. API config PUT handler now preserves notifyOnResolve when the client omits it from the request body. Go decodes a missing bool as false, which silently disabled recovery notifications on older clients. 2. CancelAlert now always cleans up the cooldown record even when the alert has already left the pending buffer, preventing stale cooldown entries from suppressing future alert cycles.	2026-03-09 11:28:28 +00:00
rcourtman	572520ebc6	Promote guest-agent /proc/meminfo fallback for accurate VM memory (#1270) Move the guest-agent file-read of /proc/meminfo earlier in the memory fallback chain so it runs before RRD, giving real-time MemAvailable that correctly excludes reclaimable buff/cache on Linux VMs. Also add VM.GuestAgent.FileRead permission for PVE 9 and fix install.sh to use comma-separated privilege strings.	2026-03-09 10:04:28 +00:00
rcourtman	aa139b73fb	Fix intermittent VM disappearance from dashboard (#555) Two root causes: (1) When Proxmox cluster/resources returns a partial response (e.g. during migration or transient API issue), VMs missing from a responsive node were silently dropped because the node appeared in nodesWithResources, bypassing grace-period preservation. Now preserves recently-seen guests from online nodes for up to the grace window. (2) The task queue allowed overlapping polls for the same PVE instance — a slower stale poll could overwrite a newer complete VM list. Added per-instance execution lock to skip duplicate scheduled tasks.	2026-03-08 22:16:24 +00:00
rcourtman	d560de15ad	Increase alerts test cleanup sleep to fix flaky test under load The 10ms goroutine drain pause was insufficient under full parallel test suite load, causing intermittent failures in TestPulseMonitorOnlySkipsDispatchButRetainsAlert.	2026-03-08 22:16:24 +00:00
rcourtman	98c9de7c91	Fix FreeBSD SMART disk detection for ada/da/nvd devices (#1254) FreeBSD disk discovery now falls back to scanning /dev for ada, da, nvd, nda and other FreeBSD disk names when kern.disks misses them. Probe order prefers the correct device type first (sat for ada, nvme for nvd). Standby disks are preserved as valid results instead of being dropped.	2026-03-08 22:16:24 +00:00
rcourtman	82c615b3b9	Filter virtual disks from SMART checks to prevent false positives (#1329) ZFS zvols (zd*), device-mapper, virtio disks, and other virtual block devices don't support SMART and were being reported as FAILED. Use lsblk JSON metadata to filter by device prefix, transport, subsystem, and vendor/model. Also treat missing smart_status as unknown rather than failed, and ignore UNKNOWN health in Patrol/AI signals.	2026-03-08 22:16:24 +00:00
rcourtman	f66aa66e74	Auto-update Helm chart version to 5.1.22	2026-03-08 12:27:01 +00:00
rcourtman	015a33ba13	Auto-update Helm chart documentation	2026-03-08 12:27:00 +00:00
rcourtman	43864ffb95	Bump version to 5.1.22	2026-03-08 11:51:17 +00:00
rcourtman	45b5c8a861	Restore previous license on persistence failure instead of clearing it If license save fails, the in-memory license was being cleared, which could drop a valid existing license. Now snapshots the current license before activation and restores it if persistence fails.	2026-03-08 11:49:26 +00:00
rcourtman	fe0706f614	Fix cluster double-registration invalidating Proxmox credentials (#1319) Two nodes in the same PVE cluster generated identical Proxmox API token names, so the second node's setup rotated the shared token and broke the first node. Include the hostname in the token name so each node gets its own token. Also refresh the stored cluster credential on the server when a new endpoint merges into an existing cluster entry.	2026-03-07 22:36:01 +00:00
rcourtman	ff1bbe2fb8	Guard per-VM guest agent calls with timeout and panic recovery (#1319) A broken or hung qemu-agent on one VM could stall the entire polling loop, preventing higher-VMID VMs from being detected. Wrap all guest agent work in a 10s per-VM budget with panic recovery, and add a 2s timeout to GetVMStatus in the efficient poller to match the legacy path.	2026-03-07 22:30:18 +00:00
rcourtman	0dd3fc779b	Fix alert disable notification suppression	2026-03-07 18:40:08 +00:00
rcourtman	d6e8bffaeb	pulse/license upgrade safety hardening	2026-03-07 15:13:09 +00:00
rcourtman	a6f6f66078	Improve auto-register auth errors and setup token grace window (#1319) The /api/auto-register endpoint returned a generic "Invalid or expired setup code" for all auth failures, making cluster registration issues impossible to diagnose. Now returns specific errors for expired tokens, wrong scope, invalid API tokens, etc. Also extend the setup token grace window to /api/auto-register so multiple cluster nodes can register with the same token within the 1-minute grace period after first use.	2026-03-07 13:39:26 +00:00
rcourtman	c0b3a0e665	Restart Pulse service after failed auto-update (#1323) The auto-update flow stops the Pulse service before applying updates. If the update fails, the rollback path restored files but never restarted the service. Since the main unit was explicitly stopped (not crashed), systemd's Restart=always didn't rescue it. Add restart-on-failure guards to both pulse-auto-update.sh and install.sh so Pulse is always restarted after a failed update attempt.	2026-03-07 10:46:19 +00:00
rcourtman	64f3bfa922	Bump dompurify to 3.3.2 to fix XSS vulnerability (Dependabot #64) DOMPurify 3.1.3–3.3.1 has an XSS vulnerability via missing rawtext element sanitization. Bump to 3.3.2 which includes the fix.	2026-03-07 10:46:12 +00:00
rcourtman	ddecf6d00c	Guard legacyMonitor typed-nil and add OIDC refresh panic recovery Normalize SystemSettingsMonitor interface assignments via reflect to prevent typed-nil-in-interface (same class as #1324 fix). Also add defer/recover to the background OIDC token refresh goroutine so a panic there cannot take down the process.	2026-03-07 10:21:07 +00:00
rcourtman	23a9fa70da	Fix nil pointer crash when saving settings (#1324) SystemSettingsHandler.mtMonitor was an interface field. A nil MultiTenantMonitor stored in it became a non-nil interface (Go typed-nil-in-interface), bypassing the nil guard in getMonitor() and panicking on every settings save in single-tenant mode. Change mtMonitor to concrete monitoring.MultiTenantMonitor so nil checks work correctly. Also resolve getMonitor() once per request instead of repeated calls to eliminate a TOCTOU race.	2026-03-07 10:21:07 +00:00
rcourtman	4ea2f49771	Auto-update Helm chart version to 5.1.21	2026-03-06 12:15:39 +00:00
rcourtman	c26a96ef51	Auto-update Helm chart documentation	2026-03-06 12:15:38 +00:00
rcourtman	01bf637d0d	Fix QNAP agent duplicate processes during upgrades (#1317) Add singleton watchdog with lock dir, pidfile tracking, and signal traps to prevent multiple pulse-agent instances spawning on QNAP. Tighten procfs matching to avoid killing unrelated processes.	2026-03-06 11:40:53 +00:00
rcourtman	9244498b75	Bump version to 5.1.21	2026-03-06 11:05:01 +00:00
rcourtman	89577fe533	Fix OIDC token refresh bypass and guard AISettingsHandler nil path The applyAuthContextHeaders early-return in CheckAuth skipped the OIDC token refresh block, causing long-lived OIDC sessions to expire instead of auto-refreshing. Move the refresh trigger into extractAndStoreAuthContext so it fires at the middleware level before CheckAuth's early return. Also add a nil guard on mtPersistence in AISettingsHandler.GetAIService for non-default org paths, preventing a potential panic if background code carries a non-default org context in v5 single-tenant mode.	2026-03-06 11:05:01 +00:00
rcourtman	743ef17b79	Fix AI and config profile handlers broken in v5 single-tenant mode The single-tenant lockdown (`499ab812e`) set mtPersistence to nil but only patched AISettingsHandler with a legacy fallback. AIHandler (chat service) and ConfigProfileHandler were missed, so AI features (Patrol, Chat) failed with "chat service not available" and config profiles would panic on nil dereference. Wire legacy persistence into both handlers and add the same fallback to ProfileSuggestionHandler. Fixes #1322	2026-03-06 11:05:01 +00:00
rcourtman	73bf2c1c7b	Auto-update Helm chart version to 5.1.20	2026-03-06 00:33:13 +00:00
rcourtman	4c5fbb0c04	Auto-update Helm chart documentation	2026-03-06 00:33:12 +00:00
rcourtman	6618db7799	Fix v5 single-tenant router test setup	2026-03-05 23:58:11 +00:00
rcourtman	ed8283b223	Bump version to 5.1.20	2026-03-05 23:46:35 +00:00
rcourtman	499ab812e3	Fix post-release regressions and lock v5 to single-tenant runtime	2026-03-05 23:46:35 +00:00
rcourtman	464d3f8486	Fix stale queued notification delivery	2026-03-05 23:46:35 +00:00
rcourtman	74ce77132b	Auto-update Helm chart version to 5.1.19	2026-03-05 11:21:24 +00:00
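
The typed-nil-in-interface behaviour named in the 23a9fa70da and ddecf6d00c messages can be reproduced in a few lines of Go. This is a minimal sketch, not Pulse's code: `Monitor`, `tenantMonitor`, `isNil`, and `normalize` are hypothetical names chosen for illustration, and the reflect-based check is only one plausible reading of "normalize interface assignments via reflect".

```go
package main

import (
	"fmt"
	"reflect"
)

// Monitor is a hypothetical interface standing in for the real monitor type.
type Monitor interface {
	Name() string
}

// tenantMonitor is a hypothetical concrete implementation.
type tenantMonitor struct{ name string }

func (m *tenantMonitor) Name() string { return m.name }

// isNil reports whether v is a nil interface or an interface that wraps
// a nil pointer, map, slice, func, or channel.
func isNil(v any) bool {
	if v == nil {
		return true
	}
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Ptr, reflect.Map, reflect.Slice, reflect.Func, reflect.Chan:
		return rv.IsNil()
	}
	return false
}

// normalize returns a true nil interface when m wraps a nil pointer,
// so that later `m == nil` guards behave as expected.
func normalize(m Monitor) Monitor {
	if isNil(m) {
		return nil
	}
	return m
}

func main() {
	var concrete *tenantMonitor // nil pointer
	var m Monitor = concrete    // interface carries a type but a nil value

	fmt.Println(m == nil)            // false: the plain nil guard is bypassed
	fmt.Println(isNil(m))            // true
	fmt.Println(normalize(m) == nil) // true
}
```

Storing the field as a concrete pointer type, as 23a9fa70da describes, avoids the problem without reflection; the normalizing helper is useful only where an interface-typed field has to stay.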


3186 commits
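
The 9b531c547d message notes that Go decodes a missing bool as false. A common way to tell "omitted" from "explicitly false" is a pointer field, sketched below. The `alertConfig` and `updateRequest` types and the `applyUpdate` function are hypothetical and not taken from the Pulse source; only the `notifyOnResolve` field name comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// alertConfig is a hypothetical stored configuration.
type alertConfig struct {
	NotifyOnResolve bool
}

// updateRequest is a hypothetical PUT body. The pointer stays nil when
// the client omits the field, which a plain bool cannot express.
type updateRequest struct {
	NotifyOnResolve *bool `json:"notifyOnResolve"`
}

// applyUpdate overwrites NotifyOnResolve only when the client sent it.
func applyUpdate(current alertConfig, body []byte) (alertConfig, error) {
	var req updateRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return current, err
	}
	if req.NotifyOnResolve != nil {
		current.NotifyOnResolve = *req.NotifyOnResolve
	}
	return current, nil
}

func main() {
	current := alertConfig{NotifyOnResolve: true}

	kept, _ := applyUpdate(current, []byte(`{}`))
	off, _ := applyUpdate(current, []byte(`{"notifyOnResolve": false}`))

	fmt.Println(kept.NotifyOnResolve, off.NotifyOnResolve) // true false
}
```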