When PVE backup polling detects permission errors (403/401/permission
denied), track them per instance and surface them via the scheduler
health endpoint.
The Backups page now fetches instance warnings and displays a banner
when backup permission issues are detected, telling users exactly how
to fix the problem.
Related to #1139
- OAuth endpoints now require settings:write scope (not just admin)
- Approval endpoints now require ai:execute scope
- Added CommandHash to approvals for replay protection
- Approvals are now single-use (consumed on first use)
- consumeApprovalWithValidation validates command matches approval
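The replay protection described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the approvalStore type, its methods, and the use of SHA-256 are assumptions; only the CommandHash field name comes from the notes above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// approval is a hypothetical record; only the CommandHash name is taken from the notes.
type approval struct {
	CommandHash string
	Consumed    bool
}

// approvalStore is a hypothetical in-memory store guarded by a mutex.
type approvalStore struct {
	mu        sync.Mutex
	approvals map[string]*approval
}

func newApprovalStore() *approvalStore {
	return &approvalStore{approvals: map[string]*approval{}}
}

// hashCommand uses SHA-256 as an assumption; the real hash function is not stated.
func hashCommand(cmd string) string {
	sum := sha256.Sum256([]byte(cmd))
	return hex.EncodeToString(sum[:])
}

// approve records an approval bound to the exact command text.
func (s *approvalStore) approve(id, cmd string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.approvals[id] = &approval{CommandHash: hashCommand(cmd)}
}

// consume checks that cmd matches the approved command and marks the
// approval as used, so a second call with the same id fails.
func (s *approvalStore) consume(id, cmd string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	a, ok := s.approvals[id]
	switch {
	case !ok:
		return errors.New("approval not found")
	case a.Consumed:
		return errors.New("approval already used")
	case a.CommandHash != hashCommand(cmd):
		return errors.New("command does not match approval")
	}
	a.Consumed = true
	return nil
}

func main() {
	s := newApprovalStore()
	s.approve("a1", "systemctl restart nginx")
	fmt.Println(s.consume("a1", "rm -rf /tmp/x"))           // mismatched command is rejected
	fmt.Println(s.consume("a1", "systemctl restart nginx")) // first valid use succeeds
	fmt.Println(s.consume("a1", "systemctl restart nginx")) // replay is rejected
}
```

In this sketch a mismatched command does not consume the approval; whether the real code behaves the same way is not stated.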
The unified agent's Proxmox setup was missing the PVEDatastoreAdmin
permission on /storage, so local PVE backups did not appear in
Pulse's backup overview for users who set up nodes via the agent.
The UI-generated setup script already included this permission, but
the agent path (--enable-proxmox) did not, creating an inconsistency.
Related to #1139
- Add professional cover page with branding and report period
- Add Executive Summary page with health status banner (HEALTHY/WARNING/CRITICAL)
- Add Quick Stats section with color-coded metrics and trend indicators
- Add Key Observations with automated analysis of CPU, memory, disk, and disk wear
- Add Recommended Actions section with prioritized, actionable items
- Add Resource Details page with hardware info, storage pools, physical disks
- Add color-coded tables for alerts, storage, and disk health
- Add performance charts with area fills and proper scaling
- Improve overall visual design with consistent color scheme
- Fix SAML session invalidation to use correct SessionStore method
SSE Broadcaster:
- Add per-client mutex to prevent concurrent writes to ResponseWriter
- Fix data race in cleanupLoop reading LastActive without synchronization
- Update LastActive in SendHeartbeat so clients aren't incorrectly pruned
after 5 minutes of idle heartbeat traffic
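A rough sketch of the per-client locking idea from the SSE Broadcaster items above. The sseClient type and its fields are hypothetical stand-ins, and an io.Writer stands in for the ResponseWriter; the point is only that writes and LastActive updates share one mutex per client.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
	"time"
)

// sseClient is a hypothetical per-connection record.
type sseClient struct {
	mu         sync.Mutex
	w          io.Writer // stands in for http.ResponseWriter
	lastActive time.Time
}

// send serializes writes to the client and refreshes lastActive under
// the same lock, so heartbeats also count as activity.
func (c *sseClient) send(event, data string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, err := fmt.Fprintf(c.w, "event: %s\ndata: %s\n\n", event, data); err != nil {
		return err
	}
	c.lastActive = time.Now()
	return nil
}

// idleFor reads lastActive under the lock, as a cleanup loop would.
func (c *sseClient) idleFor(now time.Time) time.Duration {
	c.mu.Lock()
	defer c.mu.Unlock()
	return now.Sub(c.lastActive)
}

// concurrentHeartbeats sends n heartbeats from n goroutines and counts
// how many complete, non-interleaved events reached the writer.
func concurrentHeartbeats(n int) int {
	var buf bytes.Buffer
	c := &sseClient{w: &buf}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.send("heartbeat", "{}")
		}()
	}
	wg.Wait()
	return strings.Count(buf.String(), "event: heartbeat\ndata: {}\n\n")
}

func main() {
	fmt.Println(concurrentHeartbeats(50))
}
```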
Alert Acknowledgements:
- Extract authenticated user from X-Authenticated-User header instead of
hardcoding 'admin' or trusting request body's User field
- Prevents audit log spoofing and ensures accurate user attribution
Security Status Endpoint:
- Remove ?token= query param validation from public /api/security/status
- Prevents endpoint from acting as a token validity oracle for attackers
- Authentication still works via session cookies and X-API-Token header
Security fixes:
- Auto-register now requires settings:write scope for API tokens
- X-Forwarded-For in auto-register only trusted from verified proxies
- Public URL capture requires authentication (no loopback bypass)
- Lockout reset now uses RequireAdmin for session users
Reliability fixes:
- Docker stop command expiration clears PendingUninstall flag
- Cancelled notifications get completed_at set and are cleaned up
MetricsHistory.Cleanup() was defined but never called, and even if called,
it only removed old data points without deleting map entries for deleted
containers/VMs. Each stale entry leaked ~224KB (7 pre-allocated slices).
Changes:
- Call metricsHistory.Cleanup() and rateTracker.Cleanup() in maintenance loop
- Delete map entries entirely when all data points have expired
- Return nil instead of empty slice in cleanupMetrics() to release backing arrays
- Add Cleanup() method to RateTracker with 24-hour stale threshold
- Add debug logging to track cleanup activity
Related to #1153
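The cleanup behaviour described above (drop expired points, return nil for empty series, delete the map entry) could look roughly like this. The history and point types, the 24-hour retention constant, and the demo data are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// retention is an assumed value for this sketch.
const retention = 24 * time.Hour

// point is a hypothetical stand-in for one metric sample.
type point struct {
	At    time.Time
	Value float64
}

// history is a simplified, hypothetical per-resource metrics store.
type history struct {
	series map[string][]point
}

// trim keeps points newer than cutoff. It returns nil (not an empty
// slice) when nothing survives, so the old backing array can be freed.
func trim(pts []point, cutoff time.Time) []point {
	var kept []point
	for _, p := range pts {
		if p.At.After(cutoff) {
			kept = append(kept, p)
		}
	}
	return kept
}

// Cleanup trims every series and deletes map entries whose data has
// fully expired. It returns the number of entries removed.
func (h *history) Cleanup(now time.Time, keep time.Duration) int {
	cutoff := now.Add(-keep)
	removed := 0
	for id, pts := range h.series {
		kept := trim(pts, cutoff)
		if kept == nil {
			delete(h.series, id) // deleting during range is safe in Go
			removed++
			continue
		}
		h.series[id] = kept
	}
	return removed
}

// demoHistory builds one live and one stale series around a fixed time.
func demoHistory() (*history, time.Time) {
	now := time.Date(2025, 1, 10, 12, 0, 0, 0, time.UTC)
	h := &history{series: map[string][]point{
		"vm-live":    {{At: now.Add(-48 * time.Hour), Value: 1}, {At: now.Add(-time.Hour), Value: 2}},
		"vm-deleted": {{At: now.Add(-72 * time.Hour), Value: 3}},
	}}
	return h, now
}

func main() {
	h, now := demoHistory()
	fmt.Println(h.Cleanup(now, retention), len(h.series))
}
```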
The discovery state adapter was not copying IPAddresses from the models
when converting VM/Container state. This caused getResourceExternalIP()
to return empty strings, preventing URL suggestion from working.
- Fix SSRF and rate limit bypass in SendEnhancedWebhook by validating the rendered URL.
- Fix rate limit spoofing in updates API by using secure IP extraction (trusted proxies).
- Fix memory leak in metrics history by correctly clearing fully stale data series.
- Fix public URL poisoning by preventing overwrites when explicitly configured.
Add deterministic URL suggestion based on service type and external IP:
- Add SuggestedURL field to ResourceDiscovery type (Go + TypeScript)
- Create url_suggestion.go with 60+ service defaults (Jellyfin, Plex,
Home Assistant, Grafana, Proxmox, etc.)
- Support HTTPS services, custom paths (/web, /dashboard/, /admin)
- Fall back to discovered ports for unknown services
- Add UI in DiscoveryTab with "Use this" button to populate URL input
- Add comprehensive unit tests for URL suggestion logic
Suggestion only appears when no custom URL is saved. User clicks
"Use this" to populate the input, then "Save" to confirm.
- Fix data race in webhook notifications by removing shared state
- Fix duplicate monitors on config reload by stopping old instances
- Prevent metrics ID deletion on transient startup errors
- Support Bearer auth header for config export/import endpoints
- Fix API-only mode to accept Bearer tokens and query params
- Fix data race in API token validation using fine-grained locking
- Fix unified agent download serving wrong binary for invalid arch
- Fix AI infra discovery running when AI is disabled, and add the missing stop mechanism
- Fix visual flash in discovery tab
- Standardize table column widths and UI across Docker, Hosts, Storage, etc.
- Add support for new K8s and Host charts
- Fix Service Discovery tests
- Initialize Alert and Notification managers with tenant-specific data directories
- Add panic recovery to WebSocket safeSend for stability
- Record host metrics to history for sparkline support
- Add AI provider indicator showing local (Ollama) vs cloud (Anthropic/OpenAI) analysis
- Add "What Discovery Does" explanation section before first scan
- Show commands preview before scan so users know what will run
- Add scan details section showing raw command outputs for admins
- Filter sensitive Docker labels (passwords, secrets, tokens) before AI analysis
- Add comprehensive tests for label filtering
This improves sysadmin confidence by making discovery transparent about
what it does, what data it collects, and where that data goes.
- HIGH: Create per-request AgenticLoop instead of sharing one across
concurrent sessions. This prevents race conditions where ExecuteStream
calls would overwrite each other's FSM, knowledge accumulator, and
other session-specific state.
- MEDIUM: TriggerManager.GetStatus now recomputes adaptive interval after
pruning old events. Previously, currentInterval could remain stuck in
busy/quiet mode after events aged out of the window.
- MEDIUM: Patrol stream phases are now broadcast to subscribers. Fixed
setStreamPhase() to emit phase events and SubscribeToStream() to send
phase events to late joiners. UI was stuck on 'Starting patrol...'
because phase events were never emitted.
- LOW: Fixed TriggerStatus.CurrentInterval JSON serialization. Changed
from time.Duration (serializes as nanoseconds) to int64 milliseconds
to match the 'current_interval_ms' tag.
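To illustrate the LOW item above: encoding/json marshals a time.Duration as an integer count of nanoseconds, so a field tagged 'current_interval_ms' ends up off by a factor of one million. The struct names below are hypothetical; only the JSON tag comes from the notes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// statusBefore reproduces the problem: the tag says milliseconds,
// but time.Duration marshals as nanoseconds.
type statusBefore struct {
	CurrentInterval time.Duration `json:"current_interval_ms"`
}

// statusAfter stores milliseconds explicitly as int64.
type statusAfter struct {
	CurrentIntervalMs int64 `json:"current_interval_ms"`
}

// encode marshals v and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func before() string {
	return encode(statusBefore{CurrentInterval: 5 * time.Minute})
}

func after() string {
	return encode(statusAfter{CurrentIntervalMs: (5 * time.Minute).Milliseconds()})
}

func main() {
	fmt.Println(before()) // {"current_interval_ms":300000000000}
	fmt.Println(after())  // {"current_interval_ms":300000}
}
```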
- Fix routing for POST/PUT/DELETE on /api/discovery/host/ endpoints
(Go's http.ServeMux was matching the longer prefix before method-specific routes)
- Add HOST-specific AI prompt that focuses on identifying the host OS
rather than services/containers running on it
- Add success message UI after discovery completes
- Fix timing so success appears after data is visible (not during refetch)
- Add error handling and display for failed discoveries
Adds a config migration that ensures MonitorBackups is enabled for PVE
instances, matching the existing PBS migration from issue #411. This fixes
issue #1139 where local PVE backups weren't appearing in the backup overview
because the MonitorBackups field defaulted to false when not explicitly set.
Fixes #1139
When local LLM servers (LM Studio, llama.cpp) receive tool definitions
but the model doesn't support function calling, they output internal
control tokens like <|channel|>, <|im_start|>, etc. instead of proper
responses.
This change detects these control tokens during streaming and returns
a clear error message explaining that the model doesn't support function
calling and recommending compatible models (Llama 3.1+, Mistral, Qwen).
This is better than the previous approach of offering a "disable tools"
option, which would have crippled Pulse Assistant/Patrol functionality.
Users need to use compatible models for the AI features to work properly.
Related to #1154
Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible
APIs but don't support function calling. When tools are sent to these
models, they output raw control tokens instead of proper responses.
This change adds:
- openai_tools_disabled config field in AIConfig
- AreToolsDisabledForProvider() method to check at runtime
- API support to get/set the new setting
- Tests for the new functionality
When enabled and using a custom OpenAI base URL, the chat service will
skip sending tools to the model, allowing basic chat functionality to
work even with models that don't support function calling.
Fixes #1154
Some local models (llama.cpp, LM Studio) output internal control tokens
like <|channel|>, <|constrain|>, <|message|> instead of using proper
function calling. These tokens leak into the UI, creating a poor UX.
This adds sanitization to strip these control tokens from both streaming
and non-streaming responses before they reach the user.
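A minimal sketch of this kind of sanitization, assuming the tokens all have the form <|name|> with lowercase letters and underscores. The regular expression and the sanitize function name are assumptions, not the actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// controlToken matches tokens such as <|channel|>, <|im_start|> and
// <|message|>. The exact pattern is an assumption for this sketch.
var controlToken = regexp.MustCompile(`<\|[a-z_]+\|>`)

// sanitize strips control tokens from a complete response string.
func sanitize(s string) string {
	return controlToken.ReplaceAllString(s, "")
}

func main() {
	fmt.Printf("%q\n", sanitize("<|channel|>Hello<|message|> world"))
}
```

For streaming responses a token can be split across two chunks, so a real implementation would also need to buffer a possible partial token at the end of each chunk; that part is left out here.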
- Fixed early return in handleAlertResolved that skipped incident recording
when quiet hours suppressed recovery notifications
- Added Host Agent alert delay configuration (backend + UI)
- Host Agents now have dedicated time threshold settings like other resource types
Related to #1179
Two issues fixed:
1. Custom base URL wasn't being passed to the OpenAI client in
createProviderForModel() - requests went to api.openai.com instead
of the configured endpoint (e.g., LM Studio, llama.cpp)
2. Tool schemas were missing the "properties" field when tools had no
parameters. OpenAI API requires "properties" to always be present
as an object, even if empty.
Fixes #1154
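For the second issue, one way a "properties" field can go missing in Go is an omitempty tag combined with a nil or empty map. Whether that was the actual cause is an assumption; the struct and function names below are hypothetical. The sketch only shows how to guarantee an empty object is emitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// paramsOmit drops "properties" entirely when the map is nil or empty.
type paramsOmit struct {
	Type       string         `json:"type"`
	Properties map[string]any `json:"properties,omitempty"`
}

// paramsAlways always emits "properties"; a nil map would still
// serialize as null, so newParams substitutes an empty map.
type paramsAlways struct {
	Type       string         `json:"type"`
	Properties map[string]any `json:"properties"`
}

// newParams builds an object schema whose properties are never nil.
func newParams(props map[string]any) paramsAlways {
	if props == nil {
		props = map[string]any{}
	}
	return paramsAlways{Type: "object", Properties: props}
}

// toJSON marshals v and returns the JSON text.
func toJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(toJSON(paramsOmit{Type: "object"})) // {"type":"object"}
	fmt.Println(toJSON(newParams(nil)))             // {"type":"object","properties":{}}
}
```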
- Reduce minimum seed duration from 7 days to 1 hour for faster startup
on resource-constrained systems (such as the 1GB demo server droplet)
- Reduce sleep times from 200ms to 50ms between resource processing
- Add diagnostic logging throughout mock metrics seeding to help debug
issues where sparklines show no data
- Add progress logging for nodes, VMs, containers, storage, docker hosts
- Remove standalone pulse-assistant architecture doc (content lives in CLAUDE.md)
- Add CountdownTimer component for patrol schedule display
- Rewrite patrol handler test to focus on interval persistence
- Extract MockStateProvider to shared test file
The PBS backup snapshot cache only compared BackupCount and LastBackup
timestamp to decide whether to re-fetch. When PBS verify jobs complete,
neither field changes — only the Verification field on individual
snapshots changes — so the cache served stale data indefinitely.
Add a 10-minute TTL per backup group so verification status changes are
picked up periodically. Also add panic recovery to PBS and PVE backup
goroutines, and use runtimeCtx for PBS backup polling to respect
monitor shutdown.
Closes #1174
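The cache decision described above can be sketched as follows. The groupCacheEntry type, the shouldRefetch function, and the demo helper are hypothetical; only the compared fields (BackupCount, LastBackup) and the 10-minute TTL come from the description.

```go
package main

import (
	"fmt"
	"time"
)

// groupTTL is the per-group TTL from the description above.
const groupTTL = 10 * time.Minute

// groupCacheEntry is a hypothetical cache record for one backup group.
type groupCacheEntry struct {
	BackupCount int
	LastBackup  time.Time
	FetchedAt   time.Time
}

// shouldRefetch re-fetches when the group visibly changed, or when the
// cached copy is older than the TTL, so changes that touch neither
// field (such as verification status) are still picked up eventually.
func shouldRefetch(c groupCacheEntry, count int, last, now time.Time) bool {
	if c.BackupCount != count || !c.LastBackup.Equal(last) {
		return true
	}
	return now.Sub(c.FetchedAt) >= groupTTL
}

// refetchUnchangedAfter reports the decision for a group whose count and
// last-backup time are unchanged and whose cache entry is minutes old.
func refetchUnchangedAfter(minutes int) bool {
	now := time.Date(2025, 1, 10, 12, 0, 0, 0, time.UTC)
	last := now.Add(-6 * time.Hour)
	c := groupCacheEntry{
		BackupCount: 3,
		LastBackup:  last,
		FetchedAt:   now.Add(-time.Duration(minutes) * time.Minute),
	}
	return shouldRefetch(c, 3, last, now)
}

func main() {
	fmt.Println(refetchUnchangedAfter(5), refetchUnchangedAfter(10))
}
```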
The "Every" dropdown on the Patrol page was not being respected. Setting
15 min still showed "Runs every 6 hours", and the countdown timer was wrong.
Root cause: PatrolSchedulePreset and PatrolIntervalMinutes had omitempty
JSON tags. When the API handler cleared the preset to "", json.Marshal
dropped the field. On reload, NewDefaultAIConfig() re-introduced "6hr"
as the preset, which took priority over the user's custom minutes.
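The root cause can be reproduced in a few lines. The struct names and JSON keys below are hypothetical; the behaviour shown is that of encoding/json: omitempty drops an empty string, and json.Unmarshal leaves fields untouched when their key is absent, so a pre-populated default survives the reload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aiConfigOmit mimics the buggy tags (JSON keys are assumed names).
type aiConfigOmit struct {
	Preset  string `json:"patrol_schedule_preset,omitempty"`
	Minutes int    `json:"patrol_interval_minutes,omitempty"`
}

// aiConfigKeep always writes both fields, including the empty preset.
type aiConfigKeep struct {
	Preset  string `json:"patrol_schedule_preset"`
	Minutes int    `json:"patrol_interval_minutes"`
}

// reloadOmit saves a cleared preset with 15 custom minutes, then loads
// it on top of defaults. The empty preset is dropped on save, so the
// default "6hr" comes back.
func reloadOmit() (string, int) {
	saved, err := json.Marshal(aiConfigOmit{Preset: "", Minutes: 15})
	if err != nil {
		panic(err)
	}
	loaded := aiConfigOmit{Preset: "6hr", Minutes: 360} // defaults set before decoding
	if err := json.Unmarshal(saved, &loaded); err != nil {
		panic(err)
	}
	return loaded.Preset, loaded.Minutes
}

// reloadKeep does the same without omitempty; the empty preset is
// written out and overrides the default on load.
func reloadKeep() (string, int) {
	saved, err := json.Marshal(aiConfigKeep{Preset: "", Minutes: 15})
	if err != nil {
		panic(err)
	}
	loaded := aiConfigKeep{Preset: "6hr", Minutes: 360}
	if err := json.Unmarshal(saved, &loaded); err != nil {
		panic(err)
	}
	return loaded.Preset, loaded.Minutes
}

func main() {
	fmt.Println(reloadOmit()) // 6hr 15
	fmt.Println(reloadKeep()) //  15
}
```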
Additional fixes in the same area:
- Track nextScheduledAt explicitly in the patrol loop so next_patrol_at
reflects the actual ticker schedule, not a stale lastPatrol + interval
calculation that diverges when the interval changes mid-cycle.
- Refetch patrol status in the frontend after an interval change so the
countdown timer updates immediately.
- Seed lastPatrol from persisted run history on startup so the header
countdown timer appears immediately after a backend restart.
Refactor Router to allow HTTP client injection for install script
proxying. Add tests for the unified agent install mechanism and
additional metrics store coverage.