Commit graph

2199 commits

Author SHA1 Message Date
rcourtman
a569aeb1d4 Auto-update Helm chart documentation 2026-01-01 11:12:05 +00:00
rcourtman
0c87357fe4 Prepare v5.0.8 release 2026-01-01 10:27:42 +00:00
rcourtman
9a4ab102e5 fix: Handle 'in_progress' status in command acknowledgements. Related to #988 2026-01-01 10:17:50 +00:00
rcourtman
94717ba867 feat(agent): add --docker-runtime flag for podman/docker selection
On systems where Docker compatibility layer obscures Podman (like CoreOS),
the auto-detection can fail. Users can now force the runtime:

  --docker-runtime podman
  PULSE_DOCKER_RUNTIME=podman

Valid values: auto (default), docker, podman

Related to Discussion #958
2026-01-01 00:24:37 +00:00
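The accepted values above could be validated with something like the following. This is a minimal sketch, not the agent's actual code: `parseDockerRuntime` is a hypothetical helper, and treating an empty value as `auto` is an assumption.

```go
package main

import "fmt"

// parseDockerRuntime is a hypothetical helper (not taken from the Pulse source).
// It accepts the three values named in the commit message. Mapping an empty
// value to "auto" is an assumption about how an unset flag would be handled.
func parseDockerRuntime(value string) (string, error) {
	switch value {
	case "", "auto":
		return "auto", nil
	case "docker", "podman":
		return value, nil
	default:
		return "", fmt.Errorf("invalid docker runtime %q (valid: auto, docker, podman)", value)
	}
}

func main() {
	for _, in := range []string{"", "podman", "containerd"} {
		rt, err := parseDockerRuntime(in)
		fmt.Printf("%q -> %q (err: %v)\n", in, rt, err)
	}
}
```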
rcourtman
e3b3785582 feat(agent): add option to disable Docker update checks
Add PULSE_DISABLE_DOCKER_UPDATE_CHECKS environment variable and
--disable-docker-update-checks flag to disable Docker image update
detection. This is useful for:
- Avoiding Docker Hub rate limits
- Users who don't want update notifications in their dashboard

Related to Discussion #982
2026-01-01 00:20:49 +00:00
rcourtman
567a4ad147 fix(replication): fetch status from per-node endpoint
The /cluster/replication endpoint only returns job configuration (guest,
schedule, source, target), not status data (last_sync, next_sync,
duration, fail_count, state).

This fix enriches each replication job with status from the per-node
endpoint /nodes/{node}/replication/{id}/status to get timing and state
data needed for proper UI display.

Added integration tests to verify:
- Status endpoint is called and data is merged correctly
- Graceful handling when status endpoint fails

Fixes #992
2025-12-31 23:58:06 +00:00
rcourtman
724362504e fix: Add SELinux context restoration for Fedora/RHEL systems. Related to #996
On SELinux-enforcing systems (Fedora, RHEL, CentOS), binaries installed to
non-standard locations need proper security contexts for systemd to execute
them. Without this, systemd fails with 'Permission denied' even when the
binary has correct Unix permissions.

Changes:
- Add restore_selinux_contexts() function to both install scripts
- Uses restorecon (preferred) or chcon (fallback) to set bin_t context
- Only runs when SELinux is detected and enforcing
- Called after binary installation, before systemd service start
2025-12-31 23:12:53 +00:00
rcourtman
c1f4b8f40b feat: PULSE_DISK_EXCLUDE now applies to SMART monitoring. Related to #983
Previously, the PULSE_DISK_EXCLUDE environment variable and --disk-exclude
flag only filtered mount points in the hostmetrics collector. This change
extends the exclusion to SMART data collection.

Changes:
- Updated smartctl.CollectLocal() to accept diskExclude patterns
- Added matchesDeviceExclude() for block device pattern matching
- Patterns support: exact match (sda), prefix (nvme*), contains (*cache*)
- Updated hostagent to pass DiskExclude to SMART collector
- Added comprehensive tests for pattern matching
- Updated documentation
2025-12-31 23:07:01 +00:00
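The three pattern forms listed above (exact, prefix, contains) can be illustrated roughly as follows. This is a sketch under stated assumptions: `matchesExcludeSketch` is a hypothetical stand-in for `matchesDeviceExclude()`, whose real signature is not shown in the commit, and stripping a leading `/dev/` is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesExcludeSketch is a hypothetical stand-in for matchesDeviceExclude().
// It supports the three forms from the commit message:
//   sda      exact match
//   nvme*    prefix match
//   *cache*  substring match
// Stripping "/dev/" from both sides is an assumption made for this sketch.
func matchesExcludeSketch(device string, patterns []string) bool {
	name := strings.TrimPrefix(device, "/dev/")
	for _, p := range patterns {
		p = strings.TrimPrefix(strings.TrimSpace(p), "/dev/")
		switch {
		case p == "":
			continue
		case len(p) > 2 && strings.HasPrefix(p, "*") && strings.HasSuffix(p, "*"):
			if strings.Contains(name, p[1:len(p)-1]) {
				return true
			}
		case strings.HasSuffix(p, "*"):
			if strings.HasPrefix(name, strings.TrimSuffix(p, "*")) {
				return true
			}
		case name == p:
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{"sda", "nvme*", "*cache*"}
	for _, dev := range []string{"/dev/sda", "/dev/sdb", "/dev/nvme0n1", "/dev/dm-cache0"} {
		fmt.Printf("%s excluded: %v\n", dev, matchesExcludeSketch(dev, patterns))
	}
}
```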
rcourtman
3a7e26f42f fix: Temperature text color now respects configured thresholds. Related to #984
Previously, the TemperatureGauge component used hardcoded thresholds
(critical: 80°C, warning: 70°C) for text coloring. Now it uses the
user-configured temperature threshold from alert settings.

Changes:
- Add getTemperatureThreshold() helper to alertsActivation store
- Pass critical/warning props to TemperatureGauge in NodeSummaryTable
- Warning is set to (threshold - 5°C) matching the hysteresis pattern
2025-12-31 23:00:36 +00:00
rcourtman
d804471889 fix: Unified agent Docker module now uses same agent ID as host module
When running as a unified agent (pulse-agent with --enable-docker), the
Docker module was using a different fallback chain for agent ID than the
host module. In unified mode with empty machineID, the Docker module fell
back to daemonID while the host module fell back to hostname. This caused
the server to reject Docker reports with 'token already in use by agent'
errors because the same API token was bound to different agent IDs.

The fix ensures that in unified mode, the Docker module uses the exact
same fallback chain as the host module: machineID -> hostname. The daemonID
fallback is only used in standalone mode for backward compatibility.

Fixes #985, #986
2025-12-31 10:35:00 +00:00
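The fallback chains described above might look like this in outline. It is only a sketch: `resolveAgentID` is a hypothetical function, and the exact order of the standalone chain (machineID, then daemonID, then hostname) is an assumption; the commit only states that daemonID is used in standalone mode.

```go
package main

import "fmt"

// resolveAgentID is a hypothetical illustration of the fallback chains.
// Unified mode: machineID -> hostname (as stated in the commit message).
// Standalone mode: machineID -> daemonID -> hostname (order is an assumption).
func resolveAgentID(unified bool, machineID, hostname, daemonID string) string {
	if machineID != "" {
		return machineID
	}
	if unified {
		return hostname
	}
	if daemonID != "" {
		return daemonID
	}
	return hostname
}

func main() {
	// With an empty machineID, unified mode now agrees with the host module.
	fmt.Println("unified:   ", resolveAgentID(true, "", "nas01", "docker-daemon-id"))
	fmt.Println("standalone:", resolveAgentID(false, "", "nas01", "docker-daemon-id"))
}
```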
rcourtman
652854af00 fix: Reduce Docker image size by avoiding duplicate binary copies
The runtime stage was copying both amd64 and arm64 pulse binaries to /tmp/,
then selecting one based on TARGETARCH and deleting the rest. Due to Docker's
immutable layers, the deleted binaries were still counted toward image size.

Changed to copy directly using TARGETARCH variable substitution, which only
copies the needed binary for the target architecture.

This saves ~34MB per architecture in the final image.

Note: The agent_runtime stage and /opt/pulse/bin/ download binaries still have
room for optimization, but require more complex changes.

Related to #981
2025-12-31 10:29:38 +00:00
rcourtman
3796408f04 fix: Preserve alert acknowledgement for long-standing alerts during backup
When a powered-off VM is backed up by Proxmox, the alert briefly disappears
as the VM status changes. The previous fix (3830e701) preserved ackState when
alerts were removed, but the cleanup TTL was measured from the acknowledgement
time. For alerts acknowledged > 1 hour ago (common for intentionally powered-off
VMs), the ackState was immediately considered stale and deleted when cleanup ran.

The fix adds an inactiveAt timestamp to track when an alert was removed, and
uses this time for the cleanup TTL instead of the acknowledgement time. This
ensures acknowledgement state is preserved for at least 1 hour after the alert
disappears, regardless of when it was originally acknowledged.

Related to #980
2025-12-31 09:49:11 +00:00
rcourtman
efe3fca534 Auto-update Helm chart version to 5.0.7 2025-12-31 00:27:48 +00:00
rcourtman
c46887f89b Auto-update Helm chart documentation 2025-12-31 00:27:47 +00:00
rcourtman
2b68bfe6ee fix: Update tests for RAID alerting md0/md1 skip and AI gating message change
- RAID tests now use /dev/md2 since md0/md1 are skipped for Synology compatibility
- AI handler tests now expect 'AI is not enabled' message after AI gating change
2025-12-30 23:39:55 +00:00
rcourtman
336714c610 chore: Bump version to 5.0.7 2025-12-30 23:26:59 +00:00
rcourtman
dc65b96e6d fix: Improve light theme contrast for low I/O values. Related to #976
Changed text-gray-300 to text-gray-500 for I/O values under 1 MB/s in light mode.
The previous color was barely visible against the white background.
2025-12-30 22:35:39 +00:00
rcourtman
ed6c3d9c93 fix: Prevent acknowledged alerts from retriggering notifications. Related to #975
dispatchAlert() now checks if an alert is already acknowledged before sending
notifications. Previously, acknowledged alerts (especially backup-age alerts)
would continue to dispatch notifications every poll cycle because the
acknowledgement check was missing from the dispatch path.

The fix adds an early return in dispatchAlert() when alert.Acknowledged is true,
matching the existing checks for flapping, activation state, and quiet hours.
2025-12-30 22:20:47 +00:00
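The early return described above amounts to a guard at the top of the dispatch path. The following is a simplified sketch: the `alert` type and the `notify` callback are hypothetical, and the other checks the commit mentions (flapping, activation state, quiet hours) are omitted.

```go
package main

import "fmt"

// alert is a hypothetical, trimmed-down alert type for illustration.
type alert struct {
	ID           string
	Acknowledged bool
}

// dispatchAlert returns false without notifying when the alert has already
// been acknowledged. The real function performs further checks that are
// left out of this sketch.
func dispatchAlert(a alert, notify func(alert)) bool {
	if a.Acknowledged {
		return false
	}
	notify(a)
	return true
}

func main() {
	sent := 0
	notify := func(alert) { sent++ }
	dispatchAlert(alert{ID: "backup-age", Acknowledged: true}, notify)
	dispatchAlert(alert{ID: "cpu-high"}, notify)
	fmt.Println("notifications sent:", sent)
}
```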
rcourtman
5c2db10780 fix: Gate AI subsystems and intelligence endpoints on AI enabled state. Related to #885
- Add IsAIEnabled() method to AISettingsHandler for consistent checks
- Gate baseline learning, pattern detector, and correlation detector initialization
  in StartPatrol() on AI being enabled
- Add AI enabled checks to all /api/ai/intelligence/* endpoints as defense-in-depth
- Return empty results with "AI is not enabled" message when AI is disabled

This ensures no AI-related data is collected, persisted, or returned when AI is disabled,
preventing the "undismissable alerts" issue where old AI findings would appear.
2025-12-30 22:06:07 +00:00
rcourtman
59eca65ff6 fix: Wire up LOG_FILE, LOG_MAX_SIZE, LOG_MAX_AGE, LOG_COMPRESS config options. Related to #979
The logging config options were defined but never passed to logging.Init(),
making the documented file-based log rotation non-functional.
2025-12-30 21:49:26 +00:00
rcourtman
83b2e7c8e2 fix: Add Docker log rotation to prevent unbounded log growth. Related to #979 2025-12-30 21:40:19 +00:00
rcourtman
421ddf027a Fix guest URL editing inconsistencies: unify icons and use Portal for popover 2025-12-30 20:09:58 +00:00
rcourtman
0caf39456d fix: Show copy buttons always visible in Updates settings. Fixes #960
Changed from opacity-0 (hover to show) to opacity-60 (always visible, brighten on hover) for better discoverability.
2025-12-30 12:41:24 +00:00
rcourtman
d744b6f495 fix: Show unsaved changes warning immediately when typing in Network settings. Fixes #959
Changed Public URL and CORS fields to use onInput instead of onChange for immediate feedback.
2025-12-30 12:36:51 +00:00
rcourtman
c22dac5d8d fix: Docker container memory now subtracts inactive_file on cgroup v2 systems
Fixes container memory reporting to match 'docker stats' output by excluding reclaimable filesystem cache. Related to #435
2025-12-30 12:31:57 +00:00
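The arithmetic behind this fix is a single subtraction with a guard. In this sketch, `memoryUsedBytes` and the flat `stats` map are hypothetical; falling back to raw usage when `inactive_file` is missing or not smaller than usage is an assumption chosen to avoid unsigned underflow.

```go
package main

import "fmt"

// memoryUsedBytes subtracts the reclaimable page cache (inactive_file) from
// the raw cgroup v2 usage figure. The function name and the stats map are
// hypothetical; the fallback to raw usage is an assumption of this sketch.
func memoryUsedBytes(usage uint64, stats map[string]uint64) uint64 {
	if inactive, ok := stats["inactive_file"]; ok && inactive < usage {
		return usage - inactive
	}
	return usage
}

func main() {
	stats := map[string]uint64{"inactive_file": 300 << 20} // 300 MiB of cache
	used := memoryUsedBytes(1<<30, stats)                  // 1 GiB raw usage
	fmt.Printf("reported usage: %d MiB\n", used>>20)
}
```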
rcourtman
f855625f65 feat: Add full-width mode toggle for wider views on large monitors. Related to #974 2025-12-30 12:20:44 +00:00
rcourtman
a62d7dc78d fix: Improve host agent matching for temperature transport badge. Related to #971
The findMatchingHostAgent function now uses the proper linkedHostAgentId
relationship from the backend state, falling back to hostname matching only
if no linked agent is found. This fixes the 'Agent required' badge showing
incorrectly when agents are running and collecting data.
2025-12-30 09:50:40 +00:00
rcourtman
1cbe0691e3 feat(ui): implement alert overrides for backups/snapshots and add unsaved changes warning (fixes #961, fixes #959) 2025-12-30 08:46:48 +00:00
rcourtman
5cd6224997 fix(install): align bootstrap token path and data directory config (fixes #962) 2025-12-30 00:28:16 +00:00
rcourtman
b17891fd02 feat(ui): add 'Checking updates...' indicator for Docker hosts 2025-12-30 00:28:10 +00:00
rcourtman
065a59316f fix(alerts): respect per-guest backup and snapshot overrides (fixes #961) 2025-12-30 00:28:05 +00:00
rcourtman
56cb913a51 fix: Improve Kubernetes detection and add --kubeconfig flag to installer
- Search for kubeconfig in /home/*/.kube/config in addition to /root/.kube/config
- Add --kubeconfig installer flag to specify custom kubeconfig path
- Auto-detect and pass kubeconfig path to agent when Kubernetes is enabled
- Respect KUBECONFIG environment variable when kubectl is working

Related to discussion #968
2025-12-29 23:48:17 +00:00
rcourtman
6b2ec32ab3 chore: Ignore local agent startup script 2025-12-29 23:39:18 +00:00
rcourtman
df3ff171b9 fix: Honor DisableAutoUpdate config and disable Docker disk metrics by default 2025-12-29 23:37:30 +00:00
rcourtman
6ac5e3ebfe chore: Clean up build scripts and remove unused Docker agent entry point 2025-12-29 23:37:16 +00:00
rcourtman
e42cbe38f0 test: Improve discovery and Docker agent test coverage 2025-12-29 23:37:10 +00:00
rcourtman
4225f905b0 feat: Add manual Docker update check button. Related to #955 2025-12-29 23:37:05 +00:00
rcourtman
03e9f98ab6 fix: Exclude autofs mount type from disk counts. Related to #942 2025-12-29 23:36:58 +00:00
rcourtman
e6477a6998 fix: Resolve manifest lists for correct update detection. Related to #955 2025-12-29 17:36:16 +00:00
rcourtman
060334375f Fix docker hosts toast usage 2025-12-29 17:26:47 +00:00
rcourtman
c6bd8cb74c Improve internal package test coverage 2025-12-29 17:25:21 +00:00
rcourtman
d07b471e40 Refactor Docker agent: metrics collection, security checks, and batch updates
- Separated metrics collection into internal/dockeragent/collect.go
- Added agent self-update pre-flight check (--self-test)
- Implemented signed binary verification with key rotation for updates
- Added batch update support to frontend with parallel processing
- Cleaned up agent.go and added startup cleanup for backup containers
- Updated documentation for Docker features and agent security
2025-12-29 17:20:18 +00:00
rcourtman
d38a37fe3d fix: Improve useResourcesAsLegacy reactivity with array spreading
When returning legacy arrays directly from store state, the createMemo
wouldn't notify dependents because the same array reference was returned
even when its contents changed (via reconcile). The spread operator
creates a new array reference each time, ensuring downstream memos
like sortedHosts() properly re-run.

This is a defense-in-depth fix complementing commit f5c0af77 which
fixed HostRow destructuring. Together these ensure reliable real-time
updates on the Hosts page.

Related to #949
2025-12-29 16:27:58 +00:00
rcourtman
0c3ebef312 feat: Add Fahrenheit temperature unit option. Related to #957
- Add temperature unit preference in Settings > General
- Store preference in localStorage (temperatureUnit: 'celsius' | 'fahrenheit')
- Create formatTemperature() utility that respects the preference
- Update all temperature displays across the UI:
  - TemperatureGauge component
  - HostsOverview page (sensors, disk temps)
  - DiskList (SMART temperatures)
  - ThresholdSlider
  - ThresholdBadge
  - ThresholdsTable
  - ProxmoxNodesSection
  - NodeSummaryTable tooltips
  - Alert formatters
2025-12-29 16:16:30 +00:00
rcourtman
5ad1f5e847 feat: Merge linked host agent SMART temps into Physical Disks
When a host agent is running on a Proxmox node (linked host agent),
merge the agent's SMART disk temperature data into the Physical Disks
view for that node. This allows disk temps collected by pulse-agent
to populate the Physical Disks page without requiring Proxmox SMART
monitoring to be enabled.

Matching is done by WWN (most reliable), serial number, or device path.

Closes part of issue #909 (follow-up from MichiFr)
2025-12-29 15:39:20 +00:00
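The matching priority above (WWN, then serial number, then device path) could be expressed as an ordered list of key extractors. This is a sketch only: the `disk` type and `findSMARTMatch` are hypothetical, and exact string comparison is an assumption.

```go
package main

import "fmt"

// disk is a hypothetical, minimal disk record for illustration.
type disk struct {
	WWN, Serial, DevPath string
	TempC                int
}

// findSMARTMatch tries each identifier in priority order (WWN, serial,
// device path) and returns the first candidate that matches on it.
// Empty identifiers on the target are skipped.
func findSMARTMatch(target disk, candidates []disk) (disk, bool) {
	keys := []func(disk) string{
		func(d disk) string { return d.WWN },
		func(d disk) string { return d.Serial },
		func(d disk) string { return d.DevPath },
	}
	for _, key := range keys {
		want := key(target)
		if want == "" {
			continue
		}
		for _, c := range candidates {
			if key(c) == want {
				return c, true
			}
		}
	}
	return disk{}, false
}

func main() {
	agentDisks := []disk{
		{Serial: "S2", DevPath: "/dev/sdb", TempC: 40},
		{WWN: "0x5000c500a1b2c3d4", DevPath: "/dev/sda", TempC: 35},
	}
	node := disk{WWN: "0x5000c500a1b2c3d4", Serial: "S2"}
	if m, ok := findSMARTMatch(node, agentDisks); ok {
		fmt.Printf("matched %s at %d°C\n", m.DevPath, m.TempC)
	}
}
```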
rcourtman
d377a5c464 fix: Pass SMART disk temperatures to frontend. Related to #941
The SMART disk temperature data was being collected by the agent but not
passed through to the frontend. Fixed by:

1. Added SMART field to HostSensorSummaryFrontend and created
   HostDiskSMARTFrontend type in models_frontend.go
2. Updated hostSensorSummaryToFrontend() in converters.go to include
   SMART data conversion
3. Added HostDiskSMART interface to frontend TypeScript types
4. Updated HostTemperatureCell to display disk temperatures in tooltip
   with a 'Disk Temperatures' section and fallback to SMART temps when
   no CPU/sensor temps are available
2025-12-29 15:27:46 +00:00
rcourtman
0a20eed07a fix: Normalize URL to prevent double-slash in agent download. Related to #956
Strip trailing slashes from PULSE_URL to prevent URLs like
http://host:7655//download/pulse-agent which incorrectly match
the frontend route instead of the download endpoint.
2025-12-29 14:57:28 +00:00
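A minimal sketch of the normalization, assuming it is applied before the download path is appended. The helper names `normalizeBaseURL` and `agentDownloadURL` are hypothetical; the `/download/pulse-agent` path is the one quoted in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeBaseURL strips trailing slashes so that joining a path does not
// produce "//". The name is hypothetical; trimming surrounding whitespace is
// an extra assumption.
func normalizeBaseURL(raw string) string {
	return strings.TrimRight(strings.TrimSpace(raw), "/")
}

// agentDownloadURL joins the normalized base with the download path.
func agentDownloadURL(base string) string {
	return normalizeBaseURL(base) + "/download/pulse-agent"
}

func main() {
	fmt.Println(agentDownloadURL("http://host:7655/"))
	fmt.Println(agentDownloadURL("http://host:7655"))
}
```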
rcourtman
277aca3e4e fix: Only log 'Migration complete' when inline allowed_nodes actually migrated. Related to Discussion #946
The sensor proxy self-heal script runs every 5 minutes and calls migrate-to-file.
Previously it would print 'Migration complete' every time, even when already in
file mode with nothing to migrate.

Now migrateInlineToFile returns a boolean indicating if migration actually
occurred, and the CLI only prints the message when work was done.
2025-12-29 14:15:57 +00:00
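The pattern of returning a boolean so the caller can decide whether to log might be sketched as follows. Everything here is an in-memory stand-in: `proxyConfig`, its fields, and `migrateInline` are hypothetical and do not reflect how the real migrateInlineToFile reads or writes configuration.

```go
package main

import "fmt"

// proxyConfig is a hypothetical in-memory stand-in for the sensor proxy
// configuration; the real code works with files on disk.
type proxyConfig struct {
	InlineAllowedNodes []string
	FileAllowedNodes   []string
}

// migrateInline moves inline entries to the file-backed list and reports
// whether any work was actually done.
func migrateInline(cfg *proxyConfig) bool {
	if len(cfg.InlineAllowedNodes) == 0 {
		return false // already in file mode; nothing to migrate
	}
	cfg.FileAllowedNodes = append(cfg.FileAllowedNodes, cfg.InlineAllowedNodes...)
	cfg.InlineAllowedNodes = nil
	return true
}

func main() {
	cfg := &proxyConfig{InlineAllowedNodes: []string{"pve1", "pve2"}}
	// The second run finds nothing to do, so the message is printed once.
	for run := 1; run <= 2; run++ {
		if migrateInline(cfg) {
			fmt.Println("Migration complete")
		}
	}
}
```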
rcourtman
4ce1d551e4 fix: Deduplicate disks by device+total to fix Synology storage overcounting. Related to #953
Synology NAS creates multiple shared folders (e.g., /volume1/docker, /volume1/photos)
that are all mount points on the same underlying BTRFS volume. Each reported the same
16TB total, causing Pulse to show 64TB+ instead of 16TB.

The fix tracks device+total combinations and only counts each unique pair once.
When duplicates are found, the shallowest mountpoint (e.g., /volume1) is preferred.

Added a unit test to verify the deduplication works correctly.
2025-12-29 14:03:32 +00:00
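The deduplication rule described above (one entry per device+total pair, preferring the shallowest mountpoint) can be sketched like this. The `mount` type, `dedupeMounts`, and the depth heuristic of counting path separators are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// mount is a hypothetical, minimal mount record.
type mount struct {
	Device, Mountpoint string
	Total              uint64
}

// depth counts path separators after trimming, so "/volume1" is shallower
// than "/volume1/docker". This heuristic is an assumption of the sketch.
func depth(path string) int {
	return strings.Count(strings.Trim(path, "/"), "/")
}

// dedupeMounts keeps one entry per (device, total) pair and prefers the
// shallowest mountpoint when duplicates are found.
func dedupeMounts(mounts []mount) []mount {
	type key struct {
		device string
		total  uint64
	}
	index := map[key]int{} // position of the kept entry in out
	var out []mount
	for _, m := range mounts {
		k := key{m.Device, m.Total}
		if i, seen := index[k]; seen {
			if depth(m.Mountpoint) < depth(out[i].Mountpoint) {
				out[i] = m
			}
			continue
		}
		index[k] = len(out)
		out = append(out, m)
	}
	return out
}

func main() {
	const tb = uint64(1) << 40
	mounts := []mount{
		{"/dev/vg1/volume_1", "/volume1/docker", 16 * tb},
		{"/dev/vg1/volume_1", "/volume1/photos", 16 * tb},
		{"/dev/vg1/volume_1", "/volume1", 16 * tb},
	}
	for _, m := range dedupeMounts(mounts) {
		fmt.Printf("%s on %s (%d TB)\n", m.Device, m.Mountpoint, m.Total/tb)
	}
}
```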
rcourtman
fd1f94babf fix: AI Commands toggle now updates immediately in UI. Related to #952
Previously, toggling AI Commands in the Agents view would show a pending state
and wait for the agent to confirm the change (up to 2 minutes). If the agent
was slow to report or the WebSocket update was missed, the toggle would appear
stuck.

Now, UpdateHostAgentConfig also updates the Host model in state immediately,
providing instant UI feedback. The agent will still receive the config on its
next report, but users see the change right away.

Added SetHostCommandsEnabled function to models.State for this purpose.
2025-12-29 13:56:29 +00:00