4826 commits

Author SHA1 Message Date
rcourtman
cf44b0cca6 polish: Improve update detection edge cases and UX
- Add GHCR (GitHub Container Registry) token support for public images
- Clean up dockerUpdateFirstSeen tracking when containers are removed
- Improve UpdateIcon tooltip to show digest info
- Add cursor-help to indicate hoverable tooltip
2025-12-27 18:14:27 +00:00
rcourtman
5148040ac4 feat: Wire up /api/infra-updates endpoints for infrastructure update detection
- Add routes for infrastructure update detection API:
  - GET /api/infra-updates - list all container updates with filtering
  - GET /api/infra-updates/summary - aggregated stats per host
  - GET /api/infra-updates/host/{hostId} - updates for specific host
  - GET /api/infra-updates/{resourceId} - specific resource update status
  - POST /api/infra-updates/check - trigger update check (placeholder)

- Update handlers to query Docker container updates from monitor state
- Protected by auth and monitoring_read scope
2025-12-27 18:07:10 +00:00
rcourtman
b50872b686 feat: Implement unified update detection system (Phase 1)
Docker container image update detection with full stack implementation:

Backend:
- Add internal/updatedetection package with types, store, registry checker, manager
- Add registry checking to Docker agent (internal/dockeragent/registry.go)
- Add ImageDigest and UpdateStatus fields to container reports
- Add /api/infra-updates API endpoints for querying updates
- Integrate with alert system - fires after 24h of pending updates

Frontend:
- Add UpdateBadge and UpdateIcon components for update indicators
- Add updateStatus to DockerContainer TypeScript interface
- Display blue update badges in Docker unified table image column
- Add 'has:update' search filter support

Features:
- Registry digest comparison for Docker Hub, GHCR, private registries
- Auth token handling for Docker Hub public images
- Caching with 6h TTL (15min for errors)
- Configurable alert delay via UpdateAlertDelayHours (default: 24h)
- Alert metadata includes digests, pending time, image info
2025-12-27 17:58:38 +00:00
rcourtman
39941a3927 fix(agent): use IP that can reach Pulse for registration
When a Proxmox host has multiple network interfaces (management, Ceph,
cluster ring), the agent would use heuristic scoring to pick an IP,
which could select an isolated network instead of the management network.

Now the agent first determines which local IP is actually used to connect
to the Pulse server, ensuring registration uses a reachable IP. Falls back
to the heuristic scoring if connection-based detection fails.

Related to #929
2025-12-27 17:06:20 +00:00
rcourtman
eff4adda49 fix: deduplicate Ceph clusters by FSID before sending to frontend
When the same Ceph cluster is reported from multiple sources (PVE API
and host agent), it showed up twice in the UI. Now we deduplicate by
FSID before converting to frontend format, keeping the cluster entry
with the most complete data (most monitors/managers/pools reported).

Related to #928
2025-12-27 17:03:17 +00:00
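The deduplication in eff4adda49 can be sketched as below. The `cephCluster` type and the completeness score (a plain sum of the three counts) are assumptions made for illustration; the commit only says the entry with the most monitors/managers/pools is kept.

```go
package main

// cephCluster is a hypothetical, trimmed-down cluster record.
type cephCluster struct {
	FSID     string
	Monitors int
	Managers int
	Pools    int
}

// completeness is an assumed scoring rule: more reported items wins.
func completeness(c cephCluster) int {
	return c.Monitors + c.Managers + c.Pools
}

// dedupeByFSID keeps one entry per FSID, preferring the most complete one,
// and preserves the order in which each FSID was first seen.
func dedupeByFSID(clusters []cephCluster) []cephCluster {
	index := make(map[string]int, len(clusters))
	out := make([]cephCluster, 0, len(clusters))
	for _, c := range clusters {
		if i, seen := index[c.FSID]; seen {
			if completeness(c) > completeness(out[i]) {
				out[i] = c
			}
			continue
		}
		index[c.FSID] = len(out)
		out = append(out, c)
	}
	return out
}
```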
rcourtman
81718fcdaa fix(agent): use specific distro name instead of family for osName
Ubuntu was showing as "debian 24.04" because we used PlatformFamily
(which is "debian" for all Debian derivatives) instead of Platform
(which is "ubuntu" for Ubuntu).

Now uses Platform first, falling back to PlatformFamily only if empty.

Related to #927
2025-12-27 15:59:03 +00:00
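The fallback order from 81718fcdaa amounts to a few lines. In this sketch the function name `osName` and its plain-string parameters are assumptions; the real agent presumably reads Platform and PlatformFamily from its host-info source.

```go
package main

import "strings"

// osName prefers the specific distro name (platform) and only falls back
// to the family name when platform is empty.
func osName(platform, platformFamily, version string) string {
	name := platform
	if name == "" {
		name = platformFamily
	}
	return strings.TrimSpace(name + " " + version)
}
```

Under these assumptions, an Ubuntu 24.04 host yields "ubuntu 24.04" rather than "debian 24.04".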
rcourtman
174a6cad5e Auto-update Helm chart version to 5.0.5
2025-12-27 13:36:27 +00:00
rcourtman
0d387a4718 Auto-update Helm chart documentation
2025-12-27 13:36:26 +00:00
rcourtman
e0325e5cf9 fix(ci): test multi-arch Docker build in preflight before releasing
Previously, preflight only built amd64 images, so multi-arch failures
(like the QEMU timeout in 5.0.5) weren't caught until after the
release was published.

Now preflight builds linux/amd64,linux/arm64 staging images. If
multi-arch build fails, the release pipeline stops before publishing.

Combined with the Dockerfile fix (forcing amd64 for build stages),
this ensures Docker build issues are caught early.
2025-12-27 13:25:01 +00:00
rcourtman
f4cf28e75d fix(docker): force amd64 platform for build stages to avoid QEMU timeout
The multi-arch Docker build was timing out after 1 hour because
the backend-builder stage was running under QEMU emulation for arm64.
Building 31 Go binaries under QEMU is extremely slow.

Since Go cross-compiles all target architectures anyway, and the
frontend build produces platform-independent JS, there's no need
to run these stages under QEMU. Force them to linux/amd64 and let
Go handle cross-compilation natively.

Only the runtime stages need to be multi-arch (for the correct
Alpine base image and binary selection).
2025-12-27 13:21:53 +00:00
rcourtman
fc6d6d22d3 ui: enhance Docker update documentation in Settings
Show the actual update commands with copy buttons for Docker installations,
making it easier for Docker users to update. Includes both docker run and
docker-compose variants.

This addresses the audit finding that Docker updates need better documentation
since they can't auto-update like LXC/systemd deployments.
2025-12-27 11:13:36 +00:00
rcourtman
b562a77f84 chore: bump version to 5.0.5
2025-12-27 11:07:53 +00:00
rcourtman
1dff90817f fix: detect duplicate nodes by IP resolution during agent auto-register. Related to #924
When an agent registers using an IP address, check if any existing node's
hostname resolves to that same IP. This prevents duplicates when a node
was manually configured via hostname and later the agent is installed
which registers using the host's IP.

Changes:
- Add extractHostIP() to extract IP from URL if present
- Add resolveHostnameToIP() with 2s timeout for DNS resolution
- During agent auto-registration, check if existing hostname-based
  configs resolve to the new IP and update instead of creating duplicates
- Add test for extractHostIP helper function
2025-12-27 11:02:00 +00:00
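A rough sketch of the helpers named in 1dff90817f follows. The function names and the 2s timeout come from the commit message; the signatures, return types, and the `sameHost` comparison helper are assumptions.

```go
package main

import (
	"context"
	"net"
	"net/url"
	"time"
)

// extractHostIP returns the host part of rawURL if it is an IP literal,
// or "" when the URL uses a hostname or cannot be parsed.
func extractHostIP(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	if ip := net.ParseIP(u.Hostname()); ip != nil {
		return ip.String()
	}
	return ""
}

// resolveHostnameToIP resolves hostname with a 2s timeout and returns nil
// on failure, so a slow or broken DNS server cannot stall registration.
func resolveHostnameToIP(hostname string) []string {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	addrs, err := net.DefaultResolver.LookupHost(ctx, hostname)
	if err != nil {
		return nil
	}
	return addrs
}

// sameHost reports whether existingHost resolves to newIP (hypothetical helper).
func sameHost(existingHost, newIP string) bool {
	want := net.ParseIP(newIP)
	if want == nil {
		return false
	}
	for _, addr := range resolveHostnameToIP(existingHost) {
		if got := net.ParseIP(addr); got != nil && got.Equal(want) {
			return true
		}
	}
	return false
}
```

During auto-registration, a match from `sameHost` would mean updating the existing hostname-based entry instead of creating a new one.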
rcourtman
861be84f8c fix(agent): improve backward compat for PBS-only hosts. Related to #925
The legacy state file could represent either PVE or PBS registration,
depending on what was installed at the time. Now we check what's
currently installed to determine the correct behavior:
- If PVE is installed: legacy file means PVE was registered
- If PBS-only (no PVE): legacy file means PBS was registered
2025-12-27 10:46:51 +00:00
rcourtman
e0b6c12736 fix(install): clear all Proxmox state files on reinstall. Related to #925
2025-12-27 10:44:35 +00:00
rcourtman
0865ca3512 feat(agent): detect and register both PVE and PBS on same host. Related to #925
When PBS is installed directly on a PVE host (an officially supported
configuration), the agent now detects and registers BOTH products instead
of only detecting PVE.

Changes:
- Add detectProxmoxTypes() to detect all Proxmox products on a host
- Add RunAll() method to register each detected product separately
- Use per-type state files (proxmox-pve-registered, proxmox-pbs-registered)
  to track registration status for each product independently
- Maintain backward compatibility with legacy single state file
- Add tests for new state file path logic
2025-12-27 10:41:44 +00:00
rcourtman
b27b76ae46 feat: implement agent self-unregistration and UI improvements
- Add DELETE /api/agents/unregister endpoint for agent self-unregistration
- Agent now unregisters itself from Pulse server when uninstalled
- Add clarifying note in UnifiedAgents explaining linked agents behavior
- Linked agents are managed via their PVE node; the UI now explains this
- Add LastSeen field to HostAgent model for better agent status tracking
2025-12-26 23:20:55 +00:00
rcourtman
8c440b6f54 feat: notify server during agent uninstallation
- Add /api/agents/host/uninstall endpoint for agent self-unregistration
- Update install.sh to notify server during --uninstall (reads agent ID from disk)
- Update install.ps1 with same logic for Windows
- Update frontend uninstall command to include URL/token flags

This ensures that when an agent is uninstalled, the host record is
immediately removed from Pulse and any linked PVE nodes have their
+Agent badge cleared.
2025-12-26 22:38:46 +00:00
rcourtman
22d6c1d8a5 fix: Redirect to GitHub releases for agent binary when not available locally
When the unified agent binary isn't found locally (happens on LXC/barebone
installations that update via web UI which only updates the pulse binary),
redirect to GitHub releases using HTTP 307.

This complements the install.sh GitHub proxy fallback from 7b6613bb.

Related to #909
2025-12-26 20:16:15 +00:00
rcourtman
80cc9b30a1 fix: Add GitHub fallback for install scripts on LXC/barebone updates
When install.sh or install.ps1 don't exist locally (happens on LXC/barebone
installations that were updated via web UI which only updates the binary),
fall back to fetching from GitHub raw content.

Related to #909
2025-12-26 19:49:38 +00:00
rcourtman
4a7306f6b8 fix: Auto-clear stale LinkedHostAgentID references during node updates
When nodes are updated, now validates that LinkedHostAgentID points to
an existing host agent. References to deleted host agents are automatically
cleared, fixing the 'Agent' tag persistence for users who removed agent
entries before commit c394d24.

Related to #920
2025-12-26 19:45:31 +00:00
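The validation described in 4a7306f6b8 can be sketched as below. `LinkedHostAgentID` is the field named in the commit; the `node` type, the ID set, and `clearStaleLinks` are hypothetical.

```go
package main

// node is a hypothetical, trimmed-down PVE node record.
type node struct {
	Name              string
	LinkedHostAgentID string
}

// clearStaleLinks blanks LinkedHostAgentID on every node whose linked agent
// is not in agentIDs and returns how many references were cleared.
func clearStaleLinks(nodes []node, agentIDs map[string]struct{}) int {
	cleared := 0
	for i := range nodes {
		id := nodes[i].LinkedHostAgentID
		if id == "" {
			continue
		}
		if _, ok := agentIDs[id]; !ok {
			nodes[i].LinkedHostAgentID = ""
			cleared++
		}
	}
	return cleared
}
```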
rcourtman
3435a80503 fix: Docker healthcheck fails with HTTPS enabled
The healthcheck was hardcoded to use HTTP, which fails when
HTTPS_ENABLED=true. Now uses a script that detects the protocol
and uses --no-check-certificate for self-signed certs.

Related to #922
2025-12-26 18:18:16 +00:00
rcourtman
2dc144f8dd feat: Add kiosk mode for clean dashboard display
Add ?kiosk=1 URL parameter to hide the filter panel and section nav
for a cleaner dashboard view. Useful for dedicated LCD displays or
wallboard setups.

Related to #917
2025-12-26 18:06:28 +00:00
rcourtman
cf577e715f fix: Clear node host agent link when agent is removed
When a host agent is deleted via the UI, the LinkedHostAgentID on any
PVE nodes that were linked to it was not being cleared. This caused
the "Agent" tag to persist in the UI after uninstalling the agent.

Related to #920
2025-12-26 17:52:32 +00:00
rcourtman
7f5ea636db fix: Skip webhook re-notifications for acknowledged alerts
Acknowledged alerts were still triggering repeated webhook notifications
because the re-notification logic only checked cooldown period, not
acknowledgment status. Now acknowledged alerts are skipped entirely.

Related to #921
2025-12-26 17:47:28 +00:00
rcourtman
c2cac8e2f0 Auto-update Helm chart version to 5.0.4
2025-12-26 17:29:12 +00:00
rcourtman
ffcc19fdba Auto-update Helm chart documentation
2025-12-26 17:29:11 +00:00
rcourtman
9bd7e31843 fix: Handle 404 response in release existence check
2025-12-26 16:49:37 +00:00
rcourtman
4bcad25433 fix: Make release workflow idempotent for re-runs
- Check if tag exists before creating (skip if pointing to HEAD, fail with
  helpful message if pointing elsewhere)
- Check if draft release exists before creating (update existing draft)
- Add --clobber to all asset uploads to allow re-uploading on retry
2025-12-26 16:26:45 +00:00
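The tag check in 4bcad25433 is a three-way decision. The real workflow is presumably shell inside a CI job; this sketch only restates the decision rule, and all names in it are hypothetical.

```go
package main

// tagAction is the outcome of checking an existing release tag.
type tagAction int

const (
	createTag tagAction = iota // tag does not exist yet
	skipTag                    // tag already points at HEAD, safe to re-run
	failTag                    // tag points elsewhere, needs manual attention
)

// decideTag compares the commit an existing tag points to (empty if the tag
// is absent) with the current HEAD commit.
func decideTag(existingTagSHA, headSHA string) tagAction {
	switch existingTagSHA {
	case "":
		return createTag
	case headSHA:
		return skipTag
	default:
		return failTag
	}
}
```

Making the "already done" case a no-op is what lets a failed release run be retried without manual cleanup.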
rcourtman
0b79fcdfd8 chore: bump version to 5.0.4
2025-12-26 13:28:16 +00:00
rcourtman
03d680365c feat(ui): optimize mobile view for Alerts, Storage, and Navigation
- Implement compact mobile navigation with label truncation on xs screens
- Optimize Alerts Overview with tighter spacing and better description truncation
- Enhance Storage table mobile view: consolidate Shared column, use StatusDots, and increase bar thickness
- Increase Node Summary table row height and column min-widths for readability
- Add xs (400px) breakpoint for granular mobile styling
2025-12-26 13:26:21 +00:00
rcourtman
3eedbff6e6 fix(storage): correct column priority types and setup pre-push hook
- Fix Storage.tsx using number priorities instead of string literals
- Move husky configuration to repository root for proper git hook support
- Add package.json/lock.json to root (un-ignore in .gitignore)
- Configure pre-push hook to run type-check before push
2025-12-26 12:21:37 +00:00
rcourtman
a5d92d5359 fix: Add --disk-exclude support to install script
Users can now pass disk exclusion patterns during agent installation:

  curl ... | bash -s -- --disk-exclude '/mnt/*' --url ... --token ...

The flag is repeatable for multiple exclusion patterns.

Related to #896
2025-12-26 12:11:18 +00:00
rcourtman
beb33883b8 fix(storage): add toggleable flag to column definitions
The columns weren't appearing in the ColumnPicker because they
were missing the 'toggleable: true' property.
2025-12-26 11:57:29 +00:00
rcourtman
f94c11ca34 fix: Allow clearing AI findings when AI is disabled
Users who accumulated AI findings before the patrol-without-AI bug was
fixed (24c4bb0b) could not dismiss them because the AI Insights tab
and Clear All button were only visible when patrol was enabled.

Now the AI Insights tab and Clear All button are visible whenever there
are findings to clear, even if AI/patrol is not enabled.

Related to #885
2025-12-26 11:43:34 +00:00
rcourtman
4d03319566 ci: add pre-push hook to prevent TypeScript CI failures
Adds husky with a pre-push hook that runs type-check before allowing
pushes. This catches the TypeScript errors locally that were causing
repeated CI failures and email spam.

Skip with: git push --no-verify
2025-12-26 11:37:37 +00:00
rcourtman
c1393fbaa7 refactor(dashboard): merge Needs Backup into Problems filter
- Removed redundant 'Needs Backup' toggle from filter bar
- Problems filter already includes backup issues (stale, critical, never)
- Reduces UI clutter while preserving functionality
- Also includes user's lint fixes for Backups tests
2025-12-26 11:13:41 +00:00
rcourtman
83790a4bfc fix(frontend): resolve TypeScript errors causing CI failures
- Remove unused setUseRelativeTime in UnifiedBackups.tsx (TS6133)
- Remove unused testing-library imports in PBSEnhancementBanner.test.ts (TS6133)
- Fix type comparison in test by using union type instead of 'as const' (TS2367)
2025-12-26 11:06:42 +00:00
rcourtman
5c600912f7 refactor(dashboard): smart Reset button matches pattern
Reset button now only shows when filters are active,
consistent with Backups and Storage pages.
2025-12-26 11:03:01 +00:00
rcourtman
c14740a22c refactor(storage): match filter bar to Backups pattern
- Full-width search bar on its own row
- Smart Reset button (only shows when filters active)
- Simplified sort options dropdown
- Consistent flex-wrap layout with gap-x-2 gap-y-2

Maintains consistency across Proxmox pages.
2025-12-26 10:53:11 +00:00
rcourtman
d474e08828 refactor(backups): simplify filter bar layout
- Made search bar full-width and prominent
- Removed redundant Status Filter (can use search)
- Removed redundant Type Filter (overlaps with Source)
- Removed Time Format toggle (rarely used)
- Integrated ColumnPicker inline with filter groups
- Layout now matches Dashboard/Overview pattern

Reduces filter controls from ~10 groups to 5 for cleaner UX.
2025-12-26 10:40:41 +00:00
rcourtman
7e64c2c4b7 feat(backups): add column visibility toggle with Comment column
- Add useColumnVisibility hook to Backups table
- Add ColumnPicker component for toggling columns
- Add Comment column for PBS backups (hidden by default)
- Owner, Size, Storage, Verified columns now toggleable
- Column visibility persisted to localStorage

PBS backups with comments now show the comment when the column
is enabled via the Columns picker.
2025-12-26 10:30:45 +00:00
rcourtman
ca4c2383b6 docs(pbs): add Password Setup method for Docker PBS users
2025-12-26 10:12:49 +00:00
rcourtman
4277aa753c feat(pbs): turnkey PBS setup with password auth
When adding a PBS node with username/password credentials, Pulse now
automatically:
1. Connects to PBS using the provided credentials
2. Creates a 'pulse-monitor@pbs' user with Audit permissions
3. Generates an API token
4. Stores the token instead of the password

This enables one-click PBS setup for Docker/containerized deployments
where you can't easily run the agent installer. Simply enter root@pam
credentials in the UI and Pulse handles the rest.

Falls back to password auth if token creation fails (e.g., old PBS
version or permission issues).
2025-12-26 10:12:04 +00:00
rcourtman
3d671c1824 feat(pbs): add API-based token creation for turnkey PBS setup
- Added PBS client methods: CreateUser, SetUserACL, CreateUserToken
- Added SetupMonitoringAccess() turnkey method that creates user + token
- Updated handleSecureAutoRegister to use PBS API for token creation
- Enables one-click PBS setup for Docker/containerized deployments

When users provide PBS root credentials, Pulse can now create the
monitoring user and API token remotely via the PBS API, eliminating
the need to SSH/exec into the container manually.
2025-12-26 10:08:41 +00:00
rcourtman
325713c8db test: add unit tests for PBS enhancement banner and data source indicator
- Tests hasPBSViaPassthrough detection logic
- Tests hasDirectPBS detection logic
- Tests showPBSEnhancementBanner combined logic (4 scenarios)
- Tests localStorage persistence for banner dismissal
- Tests 'via PVE' indicator logic for PBS backups

All 13 tests passing.
2025-12-26 09:53:29 +00:00
rcourtman
48478e3dcc fix: Host Agent toggle visual state not updating after click
The toggle button in Alerts > Thresholds > Host Agents was working
functionally (alerts were actually disabled) but the visual state wasn't
updating. This was because the hostAgentsWithOverrides memo was missing
the 'disabled' field in its returned Resource objects.

Related to #893
2025-12-26 09:51:53 +00:00
rcourtman
f891b6217e docs: add comprehensive PBS integration guide
- Created new PBS.md with setup instructions for direct PBS connection
- Explains difference between direct PBS API vs PVE passthrough
- Documents three setup methods: agent install, one-click script, manual
- Includes permissions reference and troubleshooting section
- Added link in docs/README.md under Monitoring & Agents
2025-12-26 09:45:42 +00:00
rcourtman
cf9012e385 feat: Add copy uninstall button to Managed Agents table
Users can now copy the uninstall command from the Actions column of any
registered agent, without needing to start the token creation flow.

Related to #914
2025-12-26 09:43:01 +00:00
rcourtman
b1e44c5611 chore: set mock mode default to false
2025-12-26 00:22:12 +00:00