Commit graph

4826 commits

Author SHA1 Message Date
rcourtman
0e8c8d51ca fix(ai): add fallback default model when Ollama model is empty
When model is not explicitly set in config or request, fall back to
llama3 to prevent 'model is required' errors from Ollama.
2025-12-15 16:59:51 +00:00
rcourtman
8687d69242 fix(ai): normalize Ollama base URL to prevent 405 errors
Users sometimes enter URLs with trailing slashes or include the /api path:
- http://host:11434/  -> would become http://host:11434//api/chat
- http://host:11434/api -> would become http://host:11434/api/api/chat

Now we strip trailing slashes and /api suffix during client initialization.

Fixes #847
2025-12-15 16:51:52 +00:00
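A minimal Python sketch of the normalization that 8687d69242 describes. The function names are hypothetical and the project's actual client code may differ; the two example URLs come from the commit message.

```python
def normalize_base_url(raw: str) -> str:
    """Strip trailing slashes and a trailing /api segment from a base URL."""
    url = raw.strip().rstrip("/")
    if url.endswith("/api"):
        url = url[: -len("/api")]
    return url.rstrip("/")


def chat_endpoint(base_url: str) -> str:
    """Build the chat endpoint from a user-supplied base URL."""
    return normalize_base_url(base_url) + "/api/chat"


if __name__ == "__main__":
    for candidate in ("http://host:11434/", "http://host:11434/api"):
        print(candidate, "->", chat_endpoint(candidate))
```

Both inputs resolve to the same endpoint, which is the property the fix relies on.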
rcourtman
47674f1d55 Add sponsor button to repo
2025-12-15 16:27:10 +00:00
rcourtman
3c134ff4b8 fix(ci): pass explicit version to demo server update
Previously the workflow ran install.sh without --version, which caused it
to download the latest stable release instead of the target release tag.

This was causing the demo server to downgrade from RC versions to stable
when triggered via workflow_dispatch.
2025-12-15 16:11:49 +00:00
rcourtman
f7ec1842c0 Auto-update Helm chart version to 5.0.0-rc.2
2025-12-15 14:46:45 +00:00
rcourtman
6931b9bb05 Auto-update Helm chart documentation
2025-12-15 14:46:41 +00:00
rcourtman
0fd5cb4643 perf(ci): use amd64-only for preflight staging images
Skip arm64 QEMU emulation in preflight tests - staging images are only
used for integration tests which run on amd64. This cuts ~20-30 minutes
off the release pipeline.

Multi-arch Docker images are still built in the final release job via
publish-docker.yml.
2025-12-15 14:27:34 +00:00
rcourtman
1f131b8a14 fix(sensor-proxy): correctly skip selfheal regeneration during selfheal runs
The PULSE_SENSOR_PROXY_SELFHEAL env var is set to "1", but the check
was only looking for "true", causing selfheal to regenerate itself on
every run. This meant the cached installer would overwrite the selfheal
script with its (potentially older) version, defeating any fixes in
the selfheal script.

Now correctly checks for both "true" and "1".

Related to #849
2025-12-15 14:06:40 +00:00
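The flag check from 1f131b8a14 can be sketched in Python as follows. The real check lives in the installer script; `is_selfheal_run` is a hypothetical name, and only the variable name and the two accepted values come from the commit message.

```python
import os


def is_selfheal_run(env=None) -> bool:
    """Return True when the selfheal flag is set to either accepted value."""
    env = os.environ if env is None else env
    value = env.get("PULSE_SENSOR_PROXY_SELFHEAL", "")
    # The earlier check accepted only "true", so "1" was treated as unset.
    return value in ("1", "true")
```

Passing the environment as a mapping keeps the check easy to exercise without mutating the process environment.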
rcourtman
d5c20556db chore: bump version to 5.0.0-rc.2
2025-12-15 13:38:45 +00:00
rcourtman
7dd036bcda fix(sensor-proxy): skip selfheal reinstall when service is healthy
The selfheal timer was running the full installer every 5 minutes even when
the service was already running and healthy. This caused unnecessary:
- Pulse service restarts
- Config migrations
- Socket setup
- 172MB memory spikes and 15s CPU usage per run

Now the selfheal exits early after checking service health, only proceeding
to reinstall logic if the service is actually failing.

Fixes #849
2025-12-15 13:31:18 +00:00
rcourtman
f18bf62bd3 fix(ai): use configured provider's default model when no model set
When a user configures only Ollama (or any single provider) via the
multi-provider UI without explicitly selecting a model, GetModel() now
returns that provider's default model instead of falling back to the
legacy Provider field which defaults to "anthropic".

This fixes "API key is required for anthropic" errors when enabling AI
with only Ollama configured.

Related to #847
2025-12-15 11:18:05 +00:00
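A rough Python sketch of the lookup order that f18bf62bd3 describes. The class, its fields, and the Anthropic model name are hypothetical; only the "anthropic" legacy default and the llama3 fallback for Ollama appear in this log. The behaviour with several configured providers is not stated in the commit, so the sketch simply assumes the legacy field still applies there.

```python
from dataclasses import dataclass

# Hypothetical per-provider defaults; only "llama3" is named in the log.
PROVIDER_DEFAULT_MODELS = {
    "anthropic": "example-anthropic-model",
    "ollama": "llama3",
}


@dataclass
class AIConfig:
    model: str = ""
    provider: str = "anthropic"  # legacy single-provider field
    anthropic_api_key: str = ""
    ollama_base_url: str = ""

    def configured_providers(self):
        """List providers that have the setting they need to be usable."""
        providers = []
        if self.anthropic_api_key:
            providers.append("anthropic")
        if self.ollama_base_url:
            providers.append("ollama")
        return providers

    def get_model(self) -> str:
        """Explicit model first, then the single configured provider's default."""
        if self.model:
            return self.model
        configured = self.configured_providers()
        if len(configured) == 1:
            return PROVIDER_DEFAULT_MODELS[configured[0]]
        # Assumption: with zero or several providers, use the legacy field.
        return PROVIDER_DEFAULT_MODELS[self.provider]
```

With only `ollama_base_url` set, `get_model()` returns the Ollama default rather than a model that would require an Anthropic API key.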
rcourtman
3594fc692a fix(sensor-proxy): skip prerelease versions in selfheal update check
The selfheal timer was downloading prerelease versions (e.g., v5.0.0-rc.1)
even when users had the stable channel selected. This happened because
fetch_latest_release_tag() didn't filter out prereleases from the GitHub
releases API response.

Now both the Python-based parser and the grep fallback skip prereleases:
- Python: checks `release.get("prerelease")` field
- Grep fallback: filters out -rc, -alpha, -beta patterns

Related to #849
2025-12-15 11:03:32 +00:00
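The Python-side filter mentioned in 3594fc692a might look roughly like this. It assumes the input is the JSON array returned by the GitHub releases API, ordered newest first, and uses that API's `tag_name` and `prerelease` fields; `latest_stable_tag` is a hypothetical name, not the script's `fetch_latest_release_tag()`.

```python
import json


def latest_stable_tag(releases_json: str):
    """Return the first tag that is not marked as a prerelease, or None."""
    for release in json.loads(releases_json):
        if release.get("prerelease"):
            continue  # skip release candidates, alphas and betas
        tag = release.get("tag_name", "")
        if tag:
            return tag
    return None
```

Returning `None` when every release is a prerelease leaves the caller free to decide whether to skip the update.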
rcourtman
75915a561b fix(ai): send Ollama URL when using default localhost address
The form pre-fills the Ollama URL field with http://localhost:11434 but
compared it against that same default value to detect changes. This meant
users configuring Ollama with the default URL never actually saved it.

Also shows actual backend error messages instead of generic "Failed to update"
when toggling the AI enable switch.

Related to #847
2025-12-15 10:03:15 +00:00
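The bug in 75915a561b is a change-detection check that compares against the wrong baseline. A small Python sketch of the idea follows; the real form is frontend code, so the function names are hypothetical, and comparing against the saved value is an assumed fix rather than the confirmed one. The `ollama_base_url` key is the setting name used elsewhere in this log.

```python
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def build_payload_buggy(field_value: str) -> dict:
    """Old behaviour: the pre-filled default never counts as a change."""
    payload = {}
    if field_value != DEFAULT_OLLAMA_URL:
        payload["ollama_base_url"] = field_value
    return payload


def build_payload_fixed(field_value: str, saved_value: str) -> dict:
    """Assumed fix: compare against what is actually stored."""
    payload = {}
    if field_value != saved_value:
        payload["ollama_base_url"] = field_value
    return payload
```

With nothing saved yet, the buggy variant sends an empty payload for the default URL while the fixed variant sends it.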
rcourtman
fed0a9d89b fix(ai): allow enabling AI when any provider is configured
The enable validation was using the legacy single-provider model which
checked settings.Provider and settings.APIKey. Users configuring Ollama
via the new multi-provider UI (setting ollama_base_url) couldn't enable
AI because settings.Provider defaulted to "anthropic" which required an
API key.

Now checks GetConfiguredProviders() first - if any provider is configured
(Anthropic, OpenAI, DeepSeek, or Ollama), AI can be enabled.

Related to #847
2025-12-15 09:43:17 +00:00
rcourtman
5c6d1798a8 fix(sensor-proxy): prevent installer hang on control-plane sync
Fixes #819

The installer was hanging at 'Pending control-plane sync detected' because
systemctl start ran synchronously, waiting for the selfheal service to complete.
If the control-plane sync failed or took a long time (e.g., Proxmox node not
configured in Pulse yet), the installer would hang indefinitely.

Changed to use 'systemctl start --no-block' to run the selfheal service
asynchronously in the background. The sync will still be attempted, but the
installer will complete immediately and show the success message.
2025-12-15 08:39:06 +00:00
rcourtman
afad679ffd fix(sensor-proxy): add timeouts to pmxcfs operations in installer
The container config backup and pct commands could hang indefinitely
when the Proxmox cluster filesystem (pmxcfs) is slow or unresponsive.
This caused the installer to appear to hang after printing
"Configuring socket bind mount..." with no further output.

Added timeout protection to:
- Container config backup cp operation
- pct status check
- pct config verification
- Config rollback cp operation

Related to #738
2025-12-15 07:04:14 +00:00
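The timeout protection in afad679ffd applies to shell commands in the installer. The same pattern, sketched in Python with the standard `subprocess` module, looks like this; `run_with_timeout` is a hypothetical helper and the commands shown are stand-ins, not the installer's `cp` or `pct` calls.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s: float):
    """Run cmd; return its exit code, or None if it exceeded timeout_s."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return None
    return result.returncode


if __name__ == "__main__":
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, 0.5))  # slow command is abandoned
```

A caller can treat `None` as "filesystem unresponsive" and print a message instead of hanging silently.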
rcourtman
6ca6f34577 fix(agent): stop running agent before TrueNAS reinstall to avoid "text file busy"
On TrueNAS, the runtime binary may be in /root/bin or /var/tmp while
the install script only checked INSTALL_DIR (/data/pulse-agent).
This left the running process using the binary when the script tried
to copy a new version, causing "Text file busy" errors.

Now explicitly stop the service and kill any pulse-agent processes
before modifying binaries on TrueNAS systems.

Related to #846
2025-12-15 04:03:06 +00:00
rcourtman
b15097331b docs: fix Docker socket mount path for standalone sensor proxy
The standalone installer creates the socket at /mnt/pulse-proxy on the host,
not /run/pulse-sensor-proxy. Updated documentation to show the correct mount:
  /mnt/pulse-proxy:/run/pulse-sensor-proxy:ro

Related to #822
2025-12-15 02:08:15 +00:00
rcourtman
97655d3abd fix(ui): add CPU tooltip to Proxmox Overview
Related to #816
2025-12-15 02:06:12 +00:00
rcourtman
204219ab7f fix(agent): use /etc/machine-id in LXC containers to avoid ID collisions
LXC containers share the host's /sys/class/dmi/id/product_uuid, which
causes gopsutil to return identical HostIDs for all LXC containers on
the same physical host. This results in agent ID collisions where
multiple LXC containers appear as a single host in Pulse.

The fix detects LXC containers and prefers /etc/machine-id (which is
unique per container) over gopsutil's HostID.

Related to #773
2025-12-14 23:05:32 +00:00
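The selection rule in 204219ab7f can be sketched in Python as a pure function. How the agent detects an LXC container is not described in the commit, so detection is passed in as a boolean here; `choose_host_id` and its parameters are hypothetical names.

```python
def choose_host_id(in_lxc: bool, machine_id: str, dmi_host_id: str) -> str:
    """Prefer the per-container machine ID inside LXC, else the DMI-based ID.

    machine_id is the content of /etc/machine-id (may be empty);
    dmi_host_id is the ID derived from the host's product_uuid.
    """
    candidate = machine_id.strip()
    if in_lxc and candidate:
        return candidate
    return dmi_host_id
```

Two containers on one physical host share `dmi_host_id` but have different `machine_id` values, so they no longer collapse into a single agent.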
rcourtman
7fac53a7a0 fix(ui): apply threshold-based coloring to memory bars
Fixes #828

Memory bars were always showing green regardless of usage percentage.
Added getMemoryColor() function that applies threshold-based coloring:
- Green: Below 70%
- Yellow/Orange: 70-90%
- Red: Above 90%

This matches the existing disk bar behavior and restores the expected
visual warning for high memory usage.
2025-12-14 22:11:07 +00:00
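The thresholds listed in 7fac53a7a0 translate into a small mapping. This is a Python sketch of the rule, not the frontend's `getMemoryColor()`; the returned color names are placeholders for whatever classes the UI actually uses.

```python
def get_memory_color(usage_percent: float) -> str:
    """Map a memory usage percentage to a warning color."""
    if usage_percent > 90:
        return "red"
    if usage_percent >= 70:
        return "yellow"
    return "green"
```

Exactly 90% stays in the middle band, since the commit describes red as "above 90%".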
rcourtman
383c4e21f3 fix(docker): truncate long image names in container table
Fixes #825

Long Docker image names (like Immich postgres containers) were expanding
the entire table row, requiring horizontal scrolling to see CPU/Memory.

Changed the image cell to use:
- Inline style max-width (more reliable in table layouts)
- Block span with truncate class
- overflow-hidden on parent container

Images now truncate with ellipsis at 200px, with full name in tooltip.
2025-12-14 22:05:38 +00:00
rcourtman
ab73ee85d7 fix(ui): stabilize type badge order in unified agents table
Fixes #773

The type badge order was flapping between 'Host Docker' and 'Docker Host'
because Array.from(new Set([...])) keeps insertion order, and that order
depended on which data source contributed its type first.

Added a sortTypes helper that ensures types are always displayed in a
consistent order: 'host' before 'docker'. This prevents visual flapping
even when the underlying data sources update at slightly different times.
2025-12-14 21:51:06 +00:00
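A Python sketch of the idea behind the `sortTypes` helper in ab73ee85d7: deduplicate, then sort by a fixed rank instead of relying on arrival order. Placing unknown types after the known ones, alphabetically, is an assumption.

```python
# Fixed display rank taken from the commit: 'host' before 'docker'.
TYPE_ORDER = {"host": 0, "docker": 1}


def sort_types(types):
    """Return unique types in a stable, source-independent order."""
    unknown_rank = len(TYPE_ORDER)
    return sorted(set(types), key=lambda t: (TYPE_ORDER.get(t, unknown_rank), t))
```

`sort_types(["docker", "host"])` and `sort_types(["host", "docker"])` give the same result, so the badge order no longer depends on which update arrived first.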
rcourtman
9e847e2a02 fix(install): add retry logic and better error output for script download
Addresses #827

- Added 3-retry logic with 2-second delays between attempts
- Increased timeout from 15s to 30s for slower connections
- Show actual curl error instead of suppressing stderr
- Provide workaround instructions (download manually then run)
- Show the URL being downloaded for easier debugging
2025-12-14 21:38:59 +00:00
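The retry behaviour in 9e847e2a02 is implemented with curl in the install script. A Python sketch of the same loop is shown below, assuming "3-retry" means three attempts in total; `download_with_retry` and `make_flaky_fetch` are hypothetical helpers, and the fetch function is injected so the loop can be exercised without a network.

```python
import sys
import time


def download_with_retry(fetch, url, attempts=3, delay_s=2.0, timeout_s=30.0,
                        sleep=time.sleep):
    """Call fetch(url, timeout_s) up to `attempts` times, pausing between tries."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, timeout_s)
        except OSError as err:
            last_error = err
            # Surface the real error and the URL instead of hiding them.
            print(f"attempt {attempt}/{attempts} failed for {url}: {err}",
                  file=sys.stderr)
            if attempt < attempts:
                sleep(delay_s)
    raise RuntimeError(f"could not download {url}: {last_error}")


def make_flaky_fetch(failures):
    """Test double: fail `failures` times with OSError, then return bytes."""
    state = {"remaining": failures}

    def fetch(url, timeout_s):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise OSError("simulated connection error")
        return b"#!/bin/sh\n"

    return fetch
```

Injecting `sleep` lets a test record the delays instead of waiting for them.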
rcourtman
758560ee69 fix(sensor-proxy): add --proxy-url flag for manual URL override
Closes #826

The error messages suggested using --proxy-url but the flag was never
implemented. This adds the flag so users can manually specify the
proxy URL when:
- Auto IP detection produces malformed results
- The desired IP is not the primary IP
- Multi-homed hosts need a specific interface
2025-12-14 21:15:55 +00:00
rcourtman
d12ab31703 feat(docker-agent): add payload size logging for debugging body-too-large errors
Related to #823

- Log payload size (in KB and bytes) at debug level
- Warn when payload approaches 400KB (512KB limit)
- Helps diagnose 'request body too large' errors
2025-12-14 21:10:06 +00:00
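A Python sketch of the size check that d12ab31703 describes. The 400KB warning point and 512KB limit come from the commit message; the logger name, the function name, and treating "approaches" as "at or above 400KB" are assumptions.

```python
import logging

logger = logging.getLogger("docker-agent-sketch")  # hypothetical logger name

WARN_THRESHOLD_BYTES = 400 * 1024
LIMIT_BYTES = 512 * 1024


def check_payload_size(payload: bytes) -> bool:
    """Log the payload size; return True if it is close to the limit."""
    size = len(payload)
    logger.debug("payload size: %.1f KB (%d bytes)", size / 1024, size)
    if size >= WARN_THRESHOLD_BYTES:
        logger.warning(
            "payload is %.1f KB, approaching the %d KB limit",
            size / 1024, LIMIT_BYTES // 1024,
        )
        return True
    return False
```

Logging the size at debug level on every report makes it possible to see growth before a 'request body too large' error appears.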
rcourtman
40b272bb24 fix: Quick tips banner incorrectly states 0 disables alerts, should be -1
The Quick tips banner on the Alerts page said "Set any threshold to 0" but
the actual code uses -1 to disable alerts. The tooltips on individual fields
were correct; only the top-of-page banner was wrong.

Related to #843
2025-12-14 19:08:24 +00:00
rcourtman
ae70020290 fix: update UnifiedAgents test for 'Force Docker' label change
2025-12-14 18:11:13 +00:00
rcourtman
8bea6c6b99 fix: prevent race conditions in release workflows
- Remove 'release: published' triggers from publish-docker, promote-floating-tags, and helm-pages workflows
- All these workflows now only run via workflow_dispatch, triggered by create-release.yml in sequence
- Add image availability check in promote-floating-tags to wait for Docker images
- create-release.yml now dispatches: publish-docker, promote-floating-tags, helm-pages, update-demo-server
- This prevents the race condition where workflows triggered by release event run before Docker images are ready
2025-12-14 18:07:46 +00:00
rcourtman
130eff34db feat: add draft_only option to release workflow for review before publishing
2025-12-14 17:16:03 +00:00
rcourtman
50246ef5cb fix: add is_prerelease to workflow outputs for downstream jobs
2025-12-14 17:07:18 +00:00
rcourtman
12ef347912 chore: prepare for v5.0.0-rc.1 release
- Update VERSION to 5.0.0-rc.1
- Add prerelease detection to create-release workflow
- Mark RC releases as prereleases on GitHub (not 'latest')
- Update publish-docker workflow to skip :latest tag for RCs
- Support -rc.N, -alpha.N, and -beta.N version suffixes
2025-12-14 16:23:40 +00:00
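The prerelease detection in 12ef347912 runs inside the release workflows. Its rule can be sketched in Python with a regular expression over the three suffixes the commit lists; the function names and the exact tag list are illustrative only.

```python
import re

# Matches the -rc.N, -alpha.N and -beta.N suffixes named in the commit.
PRERELEASE_RE = re.compile(r"-(rc|alpha|beta)\.\d+$")


def is_prerelease(version: str) -> bool:
    """Return True for versions such as 5.0.0-rc.1."""
    return PRERELEASE_RE.search(version) is not None


def docker_tags(version: str):
    """Tag every build with its version; add 'latest' only for stable ones."""
    tags = [version]
    if not is_prerelease(version):
        tags.append("latest")
    return tags
```

Under this rule `docker_tags("5.0.0-rc.1")` omits `latest`, which matches the commit's intent of keeping release candidates off the floating tag.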
rcourtman
2e06f6b966 feat: auto-detect platforms during agent install and allow multi-host tokens
- Install script now auto-detects Docker, Kubernetes, and Proxmox
- Platform monitoring is enabled automatically when detected
- Users can override with --disable-* or --enable-* flags
- Allow same token to register multiple hosts (one per hostname)
- Update tests to reflect new multi-host token behavior
- Improve CompleteStep and UnifiedAgents UI components
- Update UNIFIED_AGENT.md documentation
2025-12-14 16:21:59 +00:00
rcourtman
397871629c fix: cluster-aware guest deduplication and multi-agent token binding
- Add cluster-aware guest ID generation (clusterName-VMID instead of instanceName-VMID)
  to prevent duplicate VMs/containers when multiple cluster nodes are monitored

- Add cluster deduplication at registration time - when a node is added that belongs
  to an already-configured cluster, merge as endpoint instead of creating duplicate

- Add startup consolidation to automatically merge duplicate cluster instances

- Change host agent token binding from agent GUID to hostname, allowing:
  - Multiple host agents to share a token (each bound by hostname)
  - Agent reinstalls on same host without token conflicts

- Remove 12-character password minimum requirement

- Remove emoji from auto-registration success message

- Fix grouped view node lookup to support both cluster-aware node IDs
  (clusterName-nodeName) and legacy guest grouping keys (instance-nodeName)

Fixes duplicate guests appearing when agents are installed on multiple
cluster nodes. Also improves multi-agent UX by allowing shared tokens.
2025-12-14 10:16:17 +00:00
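The guest ID change in 397871629c can be illustrated with a short Python sketch. The `clusterName-VMID` and `instanceName-VMID` formats come from the commit; the function name and the fallback for standalone nodes (empty cluster name) are assumptions.

```python
def guest_id(instance_name: str, cluster_name: str, vmid: int) -> str:
    """Build a guest ID keyed by cluster when one is known."""
    prefix = cluster_name or instance_name
    return f"{prefix}-{vmid}"
```

Two nodes of the same cluster then report the same ID for VM 100, so the guest appears once instead of once per monitored node.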
rcourtman
d9e2f8c80d feat: display linked host agent badge on PVE nodes
- NodeGroupHeader shows '+ Host Agent' badge when node has linked agent
- NodeSummaryTable shows '+Agent' badge in the node row
- Filter out linked hosts from Managed Agents list (prevents duplication)

This provides visual feedback when a PVE node has enhanced monitoring
via an installed host agent.
2025-12-13 23:18:31 +00:00
rcourtman
9b6c13159f feat: frontend support for linked host agents and PVE nodes
- Added linkedHostAgentId to Node interface
- Added linkedNodeId/linkedVmId/linkedContainerId to Host interface
- Filter out linked hosts from 'Managed Agents' list (they show merged with nodes)

Next: Update Dashboard to display linked entity badges
2025-12-13 23:16:48 +00:00
rcourtman
5e2939b6bd feat: link host agents to PVE nodes by hostname to prevent duplication
When a host agent registers, it now searches for a PVE node with a
matching hostname and links them together. Similarly, when PVE nodes
are discovered, they check for existing host agents with matching hostnames.

This prevents the confusion of seeing duplicate entries when users install
agents on PVE cluster nodes that were already discovered via the cluster API.

- Added LinkedHostAgentID field to Node struct
- Added LinkedNodeID/LinkedVMID/LinkedContainerID fields to Host struct
- Added findLinkedProxmoxEntity() to match by hostname (with domain stripping)
- Updated UpdateNodesForInstance() to preserve and auto-set links
2025-12-13 23:14:00 +00:00
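Hostname matching with domain stripping, as described in 5e2939b6bd, might be sketched in Python like this. It is not the Go `findLinkedProxmoxEntity()` itself; the function names, the node mapping shape, and the case-insensitive comparison are assumptions.

```python
def short_hostname(name: str) -> str:
    """Lower-case a hostname and drop everything after the first dot."""
    return name.strip().lower().split(".", 1)[0]


def find_linked_node(agent_hostname: str, nodes: dict):
    """Return the ID of the node whose short hostname matches, or None.

    nodes maps a node ID to that node's hostname.
    """
    target = short_hostname(agent_hostname)
    for node_id, node_name in nodes.items():
        if short_hostname(node_name) == target:
            return node_id
    return None
```

An agent reporting `pve1.lan` would link to a node discovered as `pve1`, which is the duplicate case the commit targets.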
rcourtman
b41c74bbeb feat: auto-generate new token after each install command copy
Users no longer need to manually click 'New Token' - a fresh token is
automatically generated after each copy. The command shown is always
ready for the next host.
2025-12-13 23:00:58 +00:00
rcourtman
a01284427c fix: clarify one-token-per-host requirement and improve stale agent cleanup
- Setup wizard now clearly states 'Use a different token for each host'
- New Token button is always visible and more prominent (blue styling)
- WebSocket store now clears stale agents (>60s since lastSeen) immediately
  when receiving empty host arrays, instead of waiting for 3 consecutive updates
2025-12-13 22:57:00 +00:00
rcourtman
deb940fd7b fix: apply empty agent updates immediately when existing agents are stale
The anti-flapping logic previously required 3 consecutive empty updates
before clearing stale agents from the UI. Now if all existing hosts/dockerHosts
have lastSeen > 60 seconds ago, empty updates are applied immediately.

This fixes the issue where removed agents stayed visible for too long.
2025-12-13 22:41:15 +00:00
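The anti-flapping rule in deb940fd7b can be modelled in Python as a small state machine. The 60-second staleness window and the three-update requirement come from the commit; the class, its fields, and the use of epoch seconds are hypothetical, since the real logic lives in the frontend WebSocket store.

```python
STALE_AFTER_S = 60
REQUIRED_EMPTY_UPDATES = 3


class AgentStore:
    def __init__(self):
        self.agents = {}  # agent ID -> last-seen time in epoch seconds
        self.empty_streak = 0

    def apply_update(self, incoming: dict, now: float) -> None:
        """Apply an update, ignoring brief empty blips unless agents are stale."""
        if incoming:
            self.agents = dict(incoming)
            self.empty_streak = 0
            return
        self.empty_streak += 1
        all_stale = all(
            now - seen > STALE_AFTER_S for seen in self.agents.values()
        )
        if all_stale or self.empty_streak >= REQUIRED_EMPTY_UPDATES:
            self.agents = {}
            self.empty_streak = 0
```

A single empty update still does nothing while agents are fresh, so transient gaps do not clear the list, but removed agents disappear as soon as they pass the staleness window.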
rcourtman
ee659fd645 fix: Unraid uninstall now cleans up legacy agents from go script
The previous fix added legacy cleanup for systemd/macOS but missed the
Unraid-specific section. Now removes pulse-host-agent and pulse-docker-agent
entries from /boot/config/go and cleans up /boot/config/pulse directory.
2025-12-13 22:31:50 +00:00
rcourtman
0198cf1a82 fix: clarify that one token works on multiple hosts
Changed misleading 'Each copy uses a unique token' to 'Use the same
command on multiple servers'. Also toned down the 'Different Token'
button since it's rarely needed.
2025-12-13 22:22:25 +00:00
rcourtman
e7524d0264 feat: thorough uninstall cleans up legacy agents and all artifacts
The --uninstall flag now removes:
- Unified pulse-agent (service, binary, logs)
- Legacy pulse-host-agent (service, binary, logs)
- Legacy pulse-docker-agent (service, binary, logs)
- Agent state directory (/var/lib/pulse-agent)
- All related log files

Works on Linux (systemd), macOS (launchd), and other supported platforms.
2025-12-13 21:44:00 +00:00
rcourtman
2ea2b54738 feat: make uninstall command always visible in Agents settings
Previously the uninstall command was hidden behind the token generation step.
Now it's always visible at the bottom of the 'Add a unified agent' card.
2025-12-13 21:41:27 +00:00
rcourtman
2fce9d2e1e chore: update lockfiles
2025-12-13 21:30:57 +00:00
rcourtman
ed76030b03 style: replace emojis with text indicators in UI components
- SecurityWarning: replace emoji icons with text-based indicators
- NodeModal: minor formatting cleanup
2025-12-13 21:30:44 +00:00
rcourtman
a66a2ea43b fix: add null safety checks for resources() in useResourcesAsLegacy
Prevents errors when resources array is undefined during initial load.
2025-12-13 21:30:36 +00:00
rcourtman
0146495c0e feat: add API token to WebSocket URL for authentication
WebSocket connections can't send custom headers, so the token is passed
as a query parameter. Works with the backend change to support ?token= auth.
2025-12-13 21:30:27 +00:00
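Passing the token as a query parameter, as in 0146495c0e, amounts to URL construction. A Python sketch using the standard `urllib.parse` module follows; the `token` parameter name comes from the commit, while `with_token` and the example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(ws_url: str, token: str) -> str:
    """Append token=<value> to a WebSocket URL, keeping existing parameters."""
    parts = urlsplit(ws_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("token", token))
    # urlencode escapes characters that are unsafe in a query string.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Tokens in URLs can end up in access logs, so this approach suits cases like the one the commit describes, where custom headers are not available.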
rcourtman
e153f329d7 perf: debounce search and optimize filter parsing in Backups and Storage
- UnifiedBackups: memoize parsed search filters, debounce search input
- Storage: debounce search input to prevent jank during rapid typing
2025-12-13 21:29:45 +00:00
rcourtman
e6d07c3294 style: remove emojis from log messages
Replaced emoji icons with plain text for cleaner logs and cross-platform compatibility.
2025-12-13 21:29:11 +00:00