Commit graph

1821 commits

Author SHA1 Message Date
rcourtman
fdb2a07f56 fix(agent): find zpool binary on TrueNAS SCALE (#718)
Enhanced zpool binary lookup to try common paths when exec.LookPath fails.
This fixes issue #718 where TrueNAS SCALE reports inflated storage because
the agent runs with a restricted PATH that doesn't include /usr/sbin.

Changes:
- Added findZpool() helper that tries common paths like /usr/sbin/zpool,
  /sbin/zpool, /usr/local/sbin/zpool for TrueNAS/FreeBSD/Linux systems
- Added commonZpoolPaths variable listing typical zpool locations
- Added tests for the new findZpool function

This ensures zpool list is used for accurate pool-level capacity instead
of falling back to dataset-level summation.
2025-12-18 16:23:56 +00:00
rcourtman
0182cc8310 feat(thresholds): add collapsible accordion sections and UX improvements
- Add CollapsibleSection component with animated expand/collapse
- Wrap all 6 resource sections (Nodes, VMs, PBS, Storage, Backups, Snapshots) with accordion UI
- Add section icons and resource counts in headers
- Add expand all / collapse all buttons for quick navigation
- Make help banner dismissible with localStorage persistence
- Add Ctrl/Cmd+F keyboard shortcut to focus search
- Add keyboard shortcut hint badge on search input
- Add icons to tab navigation for quick identification
- Improve mobile tab labels with shorter text on small screens
- Create reusable components: ThresholdBadge, ResourceCard, GlobalDefaultsRow
- Create useCollapsedSections hook with localStorage persistence
- Default less-used sections (Storage, Backups, Snapshots, PBS) to collapsed
2025-12-18 15:47:44 +00:00
rcourtman
c91307be94 fix: guest URL icon now appears/disappears immediately after AI sets/removes it
The issue was a SolidJS reactivity problem in the Dashboard component.
When guestMetadata signal was accessed inside a For loop callback and
assigned to a plain variable, SolidJS lost reactive tracking.

Changed from:
  const metadata = guestMetadata()[guestId] || ...
  customUrl={metadata?.customUrl}

To:
  const getMetadata = () => guestMetadata()[guestId] || ...
  customUrl={getMetadata()?.customUrl}

This ensures SolidJS properly tracks the signal dependency when the
getter function is called directly in JSX props.
2025-12-18 14:42:47 +00:00
rcourtman
5c9bbf33b6 Merge pull request #856 from BTLzdravtech/wildcard
Adds wildcard support for kube namespace filtering.
2025-12-18 11:30:18 +00:00
rcourtman
cf57cfcb03 Merge pull request #855 from BTLzdravtech/main
Fixes inverted boolean logic in isProblemPod/isProblemDeployment checks and improves init container exit code handling.
2025-12-18 11:30:11 +00:00
rcourtman
f13e9eac69 fix(ui): add CPU tooltip to VM/LXC rows in Proxmox Overview. Related to #816
The previous fix (3b4c77de) only addressed PVE/PBS nodes but missed
guest rows. Now VMs and LXC containers also show CPU details tooltip
on hover using EnhancedCPUBar.
2025-12-18 11:28:09 +00:00
Tomas Hruska
a419b6237a support wildcards --kube-include-namespace/--kube-exclude-namespace 2025-12-18 00:00:30 +01:00
Tomas Hruska
69d693f346 Fix kubernetes logic and init containers detection 2025-12-17 23:47:03 +01:00
rcourtman
210a6f7cc0 monitoring: keep host IDs stable via token+hostname binding 2025-12-17 20:16:27 +00:00
rcourtman
ebc3474647 hostagent: avoid identity collisions with MAC fallback (Related to #836) 2025-12-17 20:09:55 +00:00
rcourtman
3623395549 test(api): allow printable alert IDs for acknowledge (Related to #852) 2025-12-17 20:09:51 +00:00
rcourtman
5338ab580c Stabilize core E2E tests
- Preserve alerts activation state when saving thresholds
- Use compliant default E2E password and deterministic bootstrap token seeding
- Harden Playwright selectors, waits, and diagnostics gating
2025-12-17 19:36:48 +00:00
rcourtman
54fc259221 fix(ai): improve AI settings UX with validation and smart fallbacks
Backend:
- Add smart provider fallback when selected model's provider isn't configured
- Automatically switch to a model from a configured provider instead of failing
- Log warning when fallback occurs for visibility

Frontend (AISettings.tsx):
- Add helper functions to check if model's provider is configured
- Group model dropdown: configured providers first, unconfigured marked with ⚠️
- Add inline warning when selecting model from unconfigured provider
- Validate on save that model's provider is configured (or being added)
- Warn before clearing last configured provider (would disable AI)
- Warn before clearing provider that current model uses
- Add patrol interval validation (must be 0 or >= 10 minutes)
- Show red border + inline error for invalid patrol intervals 1-9
- Update patrol interval hint: '(0=off, 10+ to enable)'

These changes prevent confusing '500 Internal Server Error' and
'AI is not enabled or configured' errors when model/provider mismatch.
2025-12-17 18:30:19 +00:00
rcourtman
c4b893e257 Fix agent download serving wrong architecture binary
When a specific architecture is requested (e.g., linux-arm64), don't fall
back to the generic pulse-agent binary if the requested arch isn't found.
This was causing ARM64 machines to receive x86-64 binaries that can't run.

Now returns 404 with helpful error message if requested architecture
binary is not available.
2025-12-17 17:22:51 +00:00
rcourtman
81c876b786 Simplify agent installation UI - remove Force Docker/K8s/Proxmox checkboxes
- Remove confusing Force Docker, Force Kubernetes, Force Proxmox checkboxes
- Auto-detection handles these platforms; checkboxes were redundant
- Keep Skip TLS verification checkbox (commonly needed for self-signed certs)
- Add Troubleshooting section with --enable-* and --disable-* flags for edge cases
- Update tests to reflect simplified UI
2025-12-17 17:14:41 +00:00
rcourtman
30f01771ac Add meaningful tests for host agent and exec websocket 2025-12-17 17:02:01 +00:00
rcourtman
ab480ca489 fix: Prevent orphaned encrypted data when encryption key is deleted
- crypto.go: Add runtime validation to Encrypt() that verifies the key file
  still exists on disk before encrypting. If the key was deleted while Pulse
  is running, encryption now fails with a clear error instead of creating
  orphaned data that can never be decrypted.

- hot-dev.sh: Auto-generate encryption key for production data directory
  (/etc/pulse) when HOT_DEV_USE_PROD_DATA=true and key is missing. This
  prevents startup failures and ensures encrypted data can be created.

- Added test TestEncryptRefusesAfterKeyDeleted to verify the protection works.
2025-12-17 17:00:53 +00:00
rcourtman
d663ba4342 hostagent: avoid host ID collisions and prefer LAN IP 2025-12-17 16:29:59 +00:00
rcourtman
e44a6fdadd test(envdetect): cover environment detection decisions 2025-12-17 16:08:10 +00:00
rcourtman
71e1b5dc86 test: expand AI provider test coverage with HTTP mocks 2025-12-17 15:53:56 +00:00
rcourtman
47dfa5d703 test: expand cmd and agent update coverage 2025-12-17 13:28:17 +00:00
rcourtman
0ee6e50c8b fix(config): avoid deadlock saving empty nodes config 2025-12-17 13:28:06 +00:00
rcourtman
969fa0e509 test: add unit tests for AI, Kubernetes agent, and clients 2025-12-17 12:47:36 +00:00
rcourtman
b562580764 fix: Skip deprecated pulse-sensor-proxy for v5+ installations
The unified agent now handles temperature monitoring in v5+, making
pulse-sensor-proxy unnecessary. This commit:

1. Adds INSTALLER_MAJOR_VERSION constant to declare bundled version
2. Skips 'Temperature Monitoring Setup' prompts for v5+ installs
3. Skips sensor proxy installation entirely for v5+
4. Updates help text to mark --proxy as deprecated for v5+
5. Removes outdated sensor proxy instructions from completion message

Fixes the 'pct pull TASK ERROR: failed to open /opt/pulse/bin/pulse-sensor-proxy-linux-amd64'
error reported by users installing v5.0.0-rc.3.

Reported-by: RLSinRFV (GitHub Discussion #845)
2025-12-17 12:20:03 +00:00
rcourtman
67bde72c93 Improve test coverage 2025-12-17 12:00:59 +00:00
rcourtman
667119269d feat: Add tool/function calling support to Ollama provider
Fixes issue where Ollama users get 'I'm a large language model, I can't do XYZ'
responses when trying to use the AI assistant. The problem was that the
Ollama provider was not passing tool definitions to the API.

Changes:
- Add Tools field to ollamaRequest struct
- Add ollamaTool, ollamaToolFunction, ollamaToolCall structs
- Convert tools from ChatRequest to Ollama format in Chat()
- Parse tool_calls from Ollama response
- Set StopReason to 'tool_use' when model requests tool execution
- Handle tool results in multi-turn conversations

Requires Ollama v0.3.0+ and a tool-capable model (llama3.1+, mistral-nemo, etc.)

Closes: Discussion #845 comment by misterlegend
2025-12-17 11:54:32 +00:00
rcourtman
b6317fe7b1 feat: add Linux agent archives to releases and improve version mismatch detection
Release improvements:
- Add standalone Linux agent archives (amd64, arm64, armv7, armv6, 386) to releases
- Previously only Darwin/Windows had standalone agent downloads

Agent version mismatch detection:
- New agentVersion.ts utility compares agent versions against server version
- DockerHostSummaryTable now shows accurate 'outdated' warnings when
  agent version is older than the connected Pulse server
- Better tooltips explain version status and upgrade instructions
2025-12-16 22:51:10 +00:00
rcourtman
07c993bfe8 fix: backup matching uses instance+VMID to prevent cross-instance collisions
Previously, SyncGuestBackupTimes matched backups to guests using only VMID.
This caused newly created containers to incorrectly show old backup times
from different containers on other Proxmox instances that happened to have
the same VMID.

Now uses composite key (instance+VMID) for PVE storage backups to ensure
proper isolation. PBS backups still use VMID matching (since they aggregate
from multiple sources) but only as a fallback.

Fixes issue where ollama LXC showed 'last backup 3 months ago' despite
being created yesterday.
2025-12-16 22:19:19 +00:00
rcourtman
a115af6906 feat: Improve cluster endpoint error messages for users
- Add sanitizeEndpointError() to transform raw Go errors into user-friendly messages
- Transform 'context deadline exceeded' into helpful messages mentioning possible causes
- Storage timeout errors now suggest checking PBS/NFS/Ceph backend connectivity
- Connection refused, certificate errors, and auth errors get actionable hints
- Apply sanitization everywhere cluster endpoint lastError is stored
- Add comprehensive tests for all error transformations
2025-12-16 21:50:02 +00:00
rcourtman
df48961d04 test: add regression test for OIDC env vars with nil config (#853)
Adds TestOIDCEnvVarsWithNilConfig to catch the case where OIDC_* env
vars were silently ignored when no oidc.enc file existed. This documents
the proper pattern of initializing OIDCConfig before calling MergeFromEnv.
2025-12-16 21:02:47 +00:00
rcourtman
cdbaa6cbf8 fix: AI model selector dropdown overflow on small screens
Add max-height and overflow-y-auto to the model selector dropdown in
the AI chat panel to prevent it from extending past the screen when
there are many models available (e.g., with Ollama).

Fixes feedback from discussion #845
2025-12-16 20:59:42 +00:00
rcourtman
5591b3006f fix: OIDC env vars ignored when no oidc.enc file exists
When OIDC_* environment variables were set but no oidc.enc config file
existed, cfg.OIDC was nil and MergeFromEnv would silently return without
applying the env vars (due to nil receiver check).

Fix: Initialize cfg.OIDC to default values before merging env vars if
it's nil. This ensures OIDC can be configured purely through environment
variables without requiring a pre-existing config file.

Related to #853
2025-12-16 20:25:56 +00:00
rcourtman
e157aa04ad fix: Allow printable ASCII in alert IDs for acknowledge requests
Reverts overly strict alert ID validation that was rejecting valid
alert IDs containing special characters. Docker host IDs can contain
user-supplied data like hostnames which may include parentheses,
brackets, or other printable ASCII characters.

The previous validation only allowed alphanumeric + limited punctuation,
which caused 400 errors when acknowledging alerts from Docker hosts
with special characters in their identifiers.

Related to #852
2025-12-16 20:10:31 +00:00
rcourtman
944911a4cd Add PNG version of logo for GitHub App 2025-12-16 19:35:43 +00:00
rcourtman
cf44352c83 feat: configurable backup freshness thresholds for dashboard indicator
Adds FreshHours and StaleHours settings to control when the dashboard
backup indicator shows green (fresh), amber (stale), or red (critical).

- Backend: Added FreshHours/StaleHours to BackupAlertConfig (default 24/72 hours)
- Frontend: getBackupInfo() now accepts optional thresholds parameter
- Dashboard/GuestRow components use thresholds from alert config
- Settings saved/loaded with alert configuration

Closes #839
2025-12-16 16:36:08 +00:00
rcourtman
b79d04f734 Add comprehensive AI test coverage
- Add integration tests for Ollama provider (17 tests against real API)
- Add unit tests for baseline, correlation, patterns, memory, knowledge, cost packages
- Add context formatter and builder tests
- Add factory tests for provider initialization
- Add Makefile targets: test-integration, test-all
- Clean up test theatre (removed struct field tests)

Integration tests require Ollama at OLLAMA_URL (default: 192.168.0.124:11434)
Run with: make test-integration
2025-12-16 12:33:06 +00:00
rcourtman
dc156d097c fix: Retry-After header now uses actual limiter window
Previously the Retry-After header was hardcoded to "60" seconds
regardless of the rate limiter's actual window duration. Now uses
the limiter's configured window (e.g., 600 seconds for recovery
endpoints, 300 for exports).

Related to #579
2025-12-16 10:07:16 +00:00
rcourtman
a0b4a981b8 Auto-update Helm chart version to 5.0.0-rc.3 2025-12-16 09:52:51 +00:00
rcourtman
c3a72b24d2 Auto-update Helm chart documentation 2025-12-16 09:52:51 +00:00
rcourtman
e3e8e544f8 chore: bump version to 5.0.0-rc.3 2025-12-16 09:22:50 +00:00
rcourtman
47ced7c97e feat(ui): make AI settings page more compact and user-friendly
- Replace verbose info banner with streamlined layout
- Add collapsible 'Advanced Model Selection' accordion for Chat/Patrol models
- Make AI Patrol Settings section collapsible with inline summary badges
- Compact Cost Controls into single-row inline layout
- Reduce form spacing for tighter presentation
- Remove unused formHelpText import

Also includes:
- OpenAI provider fixes for max_tokens parameters
- Security setup CSRF and 401 fixes
- Minor UI tweaks
2025-12-16 09:20:09 +00:00
rcourtman
5b7445f758 fix(sensor-proxy): remove pct pull fallback that causes PVE task errors
The installer had a fallback that tried to copy the sensor-proxy binary from
inside the Pulse container using `pct pull`. However, the paths it tried
(/opt/pulse/bin/pulse-sensor-proxy-linux-amd64) don't exist in Pulse containers.

When pct pull fails, Proxmox logs a task error visible in the UI, even though
the error is harmless and expected. This caused recurring "TASK ERROR: failed
to open /opt/pulse/bin/pulse-sensor-proxy-linux-amd64: No such file or directory"
messages every time the selfheal timer ran and couldn't download from GitHub.

The pct pull approach was never going to succeed for LXC-installed Pulse instances
since the binary doesn't exist at those paths inside the container. Removing it
eliminates the spurious error messages.

Related to #817
2025-12-16 00:02:34 +00:00
rcourtman
e23accf096 fix(security): show actual backend error instead of misleading DISABLE_AUTH message
The frontend was incorrectly assuming any 401/403 response from the
quick-setup endpoint meant the legacy DISABLE_AUTH flag was present.

This was wrong - the real causes are typically:
- Missing bootstrap token (for first-run setup)
- Invalid bootstrap token
- Authentication required for existing setups

Now the frontend properly displays the actual error message from the
backend, which includes helpful instructions for retrieving the
bootstrap token.
2025-12-15 22:30:24 +00:00
rcourtman
fb01d87b00 fix: strip provider prefix from all AI provider models and instant URL refresh
Backend fixes:
- Strip provider prefix (anthropic:, openai:, deepseek:, ollama:) in all
  provider Chat methods and constructors for robust handling
- Models are now correctly parsed regardless of caller format

Frontend fixes:
- Tool cards now persist in AI chat after approval execution by adding
  to streamEvents array
- Dashboard now listens for pulse:metadata-changed custom event
- AI Chat emits this event when set_resource_url tool completes
- Guest URL icons now update instantly when AI sets them
2025-12-15 19:18:09 +00:00
rcourtman
3d6da91ac0 feat(ai): improve AI settings first-time setup UX
- Add setup modal that appears when enabling AI without configured provider
- Modal allows selecting provider (Anthropic, OpenAI, DeepSeek, Ollama)
- Enter API key/URL and enable AI in one smooth flow
- Reorder backend to apply API keys before enabled check
- Fix Ollama to strip 'ollama:' prefix from model names
- Simplify backend error message for unconfigured providers
2025-12-15 18:59:19 +00:00
rcourtman
0e8c8d51ca fix(ai): add fallback default model when Ollama model is empty
When model is not explicitly set in config or request, fall back to
llama3 to prevent 'model is required' errors from Ollama.
2025-12-15 16:59:51 +00:00
rcourtman
8687d69242 fix(ai): normalize Ollama base URL to prevent 405 errors
Users sometimes enter URLs with trailing slashes or include the /api path:
- http://host:11434/  -> would become http://host:11434//api/chat
- http://host:11434/api -> would become http://host:11434/api/api/chat

Now we strip trailing slashes and /api suffix during client initialization.

Fixes #847
2025-12-15 16:51:52 +00:00
rcourtman
47674f1d55 Add sponsor button to repo 2025-12-15 16:27:10 +00:00
rcourtman
3c134ff4b8 fix(ci): pass explicit version to demo server update
Previously the workflow ran install.sh without --version, which caused it
to download the latest stable release instead of the target release tag.

This was causing the demo server to downgrade from RC versions to stable
when triggered via workflow_dispatch.
2025-12-15 16:11:49 +00:00
rcourtman
f7ec1842c0 Auto-update Helm chart version to 5.0.0-rc.2 2025-12-15 14:46:45 +00:00