Three follow-up fixes:
1. RestartAIChat() now performs the full post-start wiring (MCP providers,
patrol adapter, investigation orchestrator) when the service starts for
the first time via Restart(). Previously these were only wired via
StartAIChat(), leaving first-time configure with a partially wired service.
2. The Ollama→OpenAI-compatible fallback in createProviderForModel is now
guarded by !strings.HasPrefix(modelStr, "ollama:") so explicit
"ollama:llama3" models are never silently rerouted to a different
provider (see the sketch after this list).
3. Windows install script registration check now uses the $Hostname override
(if set) instead of always looking up $env:COMPUTERNAME, so post-install
verification works correctly when a custom hostname is specified.
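A minimal sketch of the guarded fallback from fix 2, with illustrative
config fields and a simplified parse — not the actual Pulse internals:

    package main

    import (
        "fmt"
        "strings"
    )

    // Illustrative config stand-in; the real Pulse fields differ.
    type aiConfig struct {
        ollamaHost    string // empty when Ollama is not configured
        openAIBaseURL string // custom OpenAI-compatible endpoint, if any
    }

    // routeModel mirrors the guarded fallback: bare names default to
    // Ollama but may fall back to a custom OpenAI-compatible endpoint,
    // while an explicit "ollama:" prefix is never silently rerouted.
    func routeModel(modelStr string, cfg aiConfig) string {
        provider := "ollama"
        if p, _, ok := strings.Cut(modelStr, ":"); ok {
            provider = p
        }
        if provider == "ollama" && cfg.ollamaHost == "" &&
            !strings.HasPrefix(modelStr, "ollama:") && cfg.openAIBaseURL != "" {
            return "openai-compatible @ " + cfg.openAIBaseURL
        }
        return provider
    }

    func main() {
        cfg := aiConfig{openAIBaseURL: "http://llm.local:8080/v1"}
        fmt.Println(routeModel("qwen3-omni", cfg))    // falls back to the custom endpoint
        fmt.Println(routeModel("ollama:llama3", cfg)) // stays on ollama
    }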
When Pulse starts before AI is configured, legacyService is nil.
Saving AI settings called Restart() which bailed immediately on the
nil check, leaving the service unstarted (503 on /api/ai/sessions)
until a full process restart.
Merged the nil and !IsRunning checks so first-time configure now
starts the service inline, same as the already-handled stopped case.
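A rough sketch of the merged check with stand-in types; the real service
wiring in Pulse is more involved:

    package ai

    import "sync"

    // Stand-ins for the real types; only the control flow matters here.
    type legacy struct{ running bool }

    func (l *legacy) IsRunning() bool { return l.running }

    type Manager struct {
        mu            sync.Mutex
        legacyService *legacy
    }

    func (m *Manager) startLocked() error {
        if m.legacyService == nil {
            m.legacyService = &legacy{}
        }
        m.legacyService.running = true
        return nil
    }

    // Restart treats "never started" (nil) the same as "stopped": both
    // start the service inline instead of bailing on the nil check.
    func (m *Manager) Restart() error {
        m.mu.Lock()
        defer m.mu.Unlock()
        if m.legacyService == nil || !m.legacyService.IsRunning() {
            return m.startLocked()
        }
        m.legacyService.running = false // stop, then start fresh
        return m.startLocked()
    }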
Also: bare model names that ParseModelString routes to Ollama (e.g.
"qwen3-omni") now fall back to a configured custom OpenAI base URL
when Ollama is not explicitly configured — handles manually-typed
model names on self-hosted OpenAI-compatible endpoints.
Fixes #1339, #1296
ZFS zvols (zd*), device-mapper, virtio disks, and other virtual block
devices don't support SMART and were being reported as FAILED. Use lsblk
JSON metadata to filter by device prefix, transport, subsystem, and
vendor/model. Also treat missing smart_status as unknown rather than
failed, and ignore UNKNOWN health in Patrol/AI signals.
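A sketch of the lsblk-based filter; the exact prefix list and the lsblk
columns Pulse queries may differ:

    package smartfilter

    import (
        "encoding/json"
        "os/exec"
        "strings"
    )

    // Shape of `lsblk --json -d -o KNAME,TRAN,SUBSYSTEMS,VENDOR,MODEL`.
    type lsblkOutput struct {
        Blockdevices []struct {
            Kname      string `json:"kname"`
            Tran       string `json:"tran"`
            Subsystems string `json:"subsystems"`
            Vendor     string `json:"vendor"`
            Model      string `json:"model"`
        } `json:"blockdevices"`
    }

    // Kernel-name prefixes of virtual devices that never expose SMART.
    var virtualPrefixes = []string{"zd", "dm-", "loop", "md", "ram", "vd"}

    // smartCapableDevices returns device names worth probing with
    // smartctl, skipping virtual devices by prefix, transport, and
    // subsystem instead of reporting them FAILED.
    func smartCapableDevices() ([]string, error) {
        out, err := exec.Command("lsblk", "--json", "-d",
            "-o", "KNAME,TRAN,SUBSYSTEMS,VENDOR,MODEL").Output()
        if err != nil {
            return nil, err
        }
        var parsed lsblkOutput
        if err := json.Unmarshal(out, &parsed); err != nil {
            return nil, err
        }
        var devices []string
        for _, d := range parsed.Blockdevices {
            skip := strings.Contains(d.Subsystems, "virtio") || d.Tran == "virtio"
            for _, p := range virtualPrefixes {
                if strings.HasPrefix(d.Kname, p) {
                    skip = true
                    break
                }
            }
            if !skip {
                devices = append(devices, d.Kname)
            }
        }
        return devices, nil
    }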
createProviderForModel() only handled "provider:model" colon format.
Models like "google/gemini-2.5-flash" or "google/gemini-2.0-flash:free"
(OpenRouter format) failed because the colon split produced invalid
provider names.
Now uses config.ParseModelString(), which correctly detects slash-
delimited models as OpenRouter (routed via the OpenAI-compatible API).
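A minimal sketch of the detection rule (not config.ParseModelString's
actual implementation) — the slash check must come first because
OpenRouter IDs can contain both delimiters:

    package main

    import (
        "fmt"
        "strings"
    )

    func parseModelString(s string) (provider, model string) {
        if strings.Contains(s, "/") {
            return "openrouter", s // routed via the OpenAI-compatible API
        }
        if p, m, ok := strings.Cut(s, ":"); ok {
            return p, m // classic "provider:model"
        }
        return "ollama", s // bare names default to Ollama
    }

    func main() {
        fmt.Println(parseModelString("google/gemini-2.0-flash:free")) // openrouter
        fmt.Println(parseModelString("anthropic:claude-sonnet-4"))    // anthropic
    }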
The Undismiss() method existed on FindingsStore but was never exposed
via the API. Users who dismissed findings as "not_an_issue" had no way
to revert them.
- Add HandleUndismissFinding handler and route
- Add Undismiss() to UnifiedStore for parity with FindingsStore
- Also remove matching explicit suppression rules on undismiss
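A sketch of the undismiss flow; the handler shape, route, and helper
names are illustrative, not Pulse's actual routing:

    package api

    import "net/http"

    type FindingsStore interface {
        Undismiss(id string) error
    }

    type SuppressionStore interface {
        RemoveForFinding(id string) // drop the matching explicit rule
    }

    type Server struct {
        findings     FindingsStore
        suppressions SuppressionStore
    }

    // HandleUndismissFinding reverts a "not_an_issue" dismissal and
    // removes any suppression rule created alongside it.
    func (s *Server) HandleUndismissFinding(w http.ResponseWriter, r *http.Request) {
        id := r.PathValue("id") // Go 1.22 routing, e.g. POST /findings/{id}/undismiss
        if err := s.findings.Undismiss(id); err != nil {
            http.Error(w, err.Error(), http.StatusNotFound)
            return
        }
        s.suppressions.RemoveForFinding(id)
        w.WriteHeader(http.StatusNoContent)
    }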
Patrol runs, evaluation passes, and QuickAnalysis calls were consuming
LLM tokens without recording them in the cost store. This made the
cost_budget_usd_30d budget setting ineffective since enforceBudget()
never saw patrol spend.
- Add RecordUsage() to ai.Service for thread-safe cost recording
- Add recordPatrolUsage() helper to PatrolService, called on both
success and error paths for main patrol and evaluation pass
- Record QuickAnalysis token usage in cost store
- Return partial PatrolResponse (with token counts) on error instead
of nil, so callers can always record consumed tokens
- Propagate partial response through chat_service_adapter on error
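An illustrative shape for the thread-safe recording; the real ai.Service
and cost store carry more state:

    package ai

    import "sync"

    type Usage struct {
        InputTokens, OutputTokens int
        CostUSD                   float64
    }

    type Service struct {
        mu        sync.Mutex
        spend30d  float64
        budgetUSD float64
    }

    // RecordUsage is safe to call concurrently from patrol runs,
    // evaluation passes, and QuickAnalysis — including on error paths,
    // since a failed call may still have consumed tokens (hence the
    // partial PatrolResponse).
    func (s *Service) RecordUsage(u Usage) {
        s.mu.Lock()
        defer s.mu.Unlock()
        s.spend30d += u.CostUSD
    }

    // OverBudget is what enforceBudget-style checks consult; before this
    // fix, patrol spend never reached it.
    func (s *Service) OverBudget() bool {
        s.mu.Lock()
        defer s.mu.Unlock()
        return s.budgetUSD > 0 && s.spend30d >= s.budgetUSD
    }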
The AI also receives disk data via tool calls (pulse_metrics type="disks"),
not just the patrol context table. The raw JSON field "wearout" was
ambiguous — rename to "ssd_life_remaining_pct" so the field name itself
communicates that 100 = healthy.
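Illustratively, the renamed field now documents its own semantics (the
surrounding disk type in Pulse has many more fields):

    package metrics

    type DiskSMART struct {
        // 100 = full SSD life remaining (new drive), 0 = worn out.
        SSDLifeRemainingPct int `json:"ssd_life_remaining_pct"`
    }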
The patrol context table header said "Wearout" and the tool returned a raw
"wearout" JSON field with no indication that 100 = full life remaining.
The AI interpreted "wearout: 100" as fully worn out and raised false
"100% Disk Wearout" findings on healthy NVMe drives.
Rename the patrol table column to "SSD Life Remaining (100%=new)" and
update the data type comment to clarify the semantics.
Show "Pro" badge on the Reporting settings tab so users know upfront
that advanced reporting requires a Pro license, rather than discovering
it after filling out the form.
Downgrade patrol trigger queue-full and rejection messages from Warn to
Debug — these are normal rate-limiting behavior, not actionable warnings.
Recovery notifications were silently disabled for users with pre-5.1.12
configs because the NotifyOnResolve bool field defaults to false when
absent from JSON. Use a *bool probe to detect the missing field and default
to true.
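A minimal sketch of the *bool probe pattern (the JSON key and helper
name are assumptions):

    package notifications

    import "encoding/json"

    // notifyProbe mirrors only the field needed to distinguish "absent"
    // from "explicitly false".
    type notifyProbe struct {
        NotifyOnResolve *bool `json:"notifyOnResolve"`
    }

    // effectiveNotifyOnResolve returns the value to use: absent (pre-
    // 5.1.12 config) defaults to true, present keeps the stored value.
    func effectiveNotifyOnResolve(raw []byte) bool {
        var p notifyProbe
        if err := json.Unmarshal(raw, &p); err != nil || p.NotifyOnResolve == nil {
            return true // field missing: keep recovery notifications on
        }
        return *p.NotifyOnResolve
    }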
Patrol trigger queue filled with warnings when the patrol loop wasn't
running. Gate TriggerPatrolForAlert on p.running and clear the flag
via defer when the loop exits.
saveToDisk used os.WriteFile which doesn't sync to disk before the
atomic rename. On CI runners with aggressive filesystem caching this
can leave the destination file with zero bytes, causing
TestKnowledgeStore_SaveLoad to fail with "unexpected end of JSON input".
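A sketch of the fixed write path using only standard os calls:

    package store

    import "os"

    // saveAtomic writes via a temp file, fsyncs, then renames into
    // place. os.WriteFile alone skips the Sync, which is what let CI
    // runners rename a not-yet-flushed file and read back zero bytes.
    func saveAtomic(path string, data []byte) error {
        tmp := path + ".tmp"
        f, err := os.Create(tmp)
        if err != nil {
            return err
        }
        if _, err := f.Write(data); err != nil {
            f.Close()
            os.Remove(tmp)
            return err
        }
        if err := f.Sync(); err != nil { // flush to disk before the rename
            f.Close()
            os.Remove(tmp)
            return err
        }
        if err := f.Close(); err != nil {
            os.Remove(tmp)
            return err
        }
        return os.Rename(tmp, path)
    }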
#1197: Add Custom URL input to the expanded host row in Settings → Agents.
Loads existing URL via HostMetadataAPI on row expand; saves on button click.
Only shown for host-type agent rows.
#1210: Fix agent_connected always false for Docker hosts on Proxmox VMs.
connectedAgentHostnames now also marks Docker host hostnames reachable when
their matching VM/LXC has a node with a connected Proxmox agent, mirroring
the routing logic already used in the control path.
#1267/#1269: Improve Proxmox auto-registration failure logging. Response body
is now included in the error message, and the warning directs users to delete
the state file to force re-registration rather than claiming the node exists.
(cherry picked from commit 305f6d3c94f0da4fc970450a6304da57d6d7fe80)
TriggerPatrolForAlert was enqueuing into adHocTrigger regardless of
whether Patrol was enabled. With patrolLoop not running (disabled),
nothing drained the channel — it filled on the 10th alert and spammed
"Patrol trigger queue full, dropping trigger" on every subsequent alert.
Read p.config.Enabled in the same RLock as triggerManager and return
early when disabled.
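A sketch of the gated trigger; field names are illustrative:

    package patrol

    import "sync"

    type PatrolService struct {
        mu           sync.RWMutex
        enabled      bool
        adHocTrigger chan string // buffered; drained only by patrolLoop
    }

    // TriggerPatrolForAlert reads the enabled flag under the same RLock
    // used for the trigger manager and returns early when Patrol is
    // disabled — otherwise nothing drains adHocTrigger and it fills.
    func (p *PatrolService) TriggerPatrolForAlert(alertID string) {
        p.mu.RLock()
        enabled := p.enabled
        p.mu.RUnlock()
        if !enabled {
            return
        }
        select {
        case p.adHocTrigger <- alertID:
        default:
            // Queue full: normal rate limiting, logged at Debug, not Warn.
        }
    }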
Fixes #1258
(cherry picked from commit 69f399469538f0c9cd59084f6429fed8a793c042)
Patrol only pinged the first IP address of each VM/container, causing
false "unreachable" reports for guests with multiple IPs (common with
Windows VMs that have IPv6 or multi-adapter setups). Now probes all
IPs and marks reachable if any responds.
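A minimal sketch of the any-IP-responds rule (the prober interface is a
stand-in for the host-agent ping path):

    package patrol

    import "context"

    type Pinger interface {
        Ping(ctx context.Context, ip string) bool
    }

    // guestReachable probes every reported IP and treats the guest as
    // reachable if any one responds — a Windows VM with a dead IPv6
    // address but a live IPv4 address is no longer reported unreachable.
    func guestReachable(ctx context.Context, p Pinger, ips []string) bool {
        for _, ip := range ips {
            if p.Ping(ctx, ip) {
                return true
            }
        }
        return false
    }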
Fixes#1215
Enrich the patrol seed context with service identity (from discovery
store) and network reachability (via ICMP ping through host agents).
The guest metrics table now includes Service and Reachable columns,
and a Service Health Issues section highlights running-but-unreachable
guests. A new SignalGuestUnreachable signal type creates deterministic
findings for unreachable guests.
New files:
- patrol_intelligence.go: GuestProber interface, GuestIntelligence
type, gatherGuestIntelligence() with concurrent per-node probing
- patrol_prober.go: agentExecProber implementation using batch ping
commands via connected host agents
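A sketch of the interface and the per-node fan-out; signatures are
illustrative, not the exact ones in patrol_intelligence.go:

    package patrol

    import (
        "context"
        "sync"
    )

    // GuestProber is implemented by agentExecProber, which batches ping
    // commands through connected host agents.
    type GuestProber interface {
        // ProbeNode pings all guest IPs on one node, returning
        // reachability keyed by guest ID.
        ProbeNode(ctx context.Context, node string, ipsByGuest map[string][]string) map[string]bool
    }

    // gatherReachability fans out one goroutine per node, mirroring the
    // concurrent per-node probing in gatherGuestIntelligence().
    func gatherReachability(ctx context.Context, p GuestProber,
        nodes map[string]map[string][]string) map[string]bool {

        var (
            mu      sync.Mutex
            wg      sync.WaitGroup
            results = make(map[string]bool)
        )
        for node, guests := range nodes {
            wg.Add(1)
            go func(node string, guests map[string][]string) {
                defer wg.Done()
                reach := p.ProbeNode(ctx, node, guests)
                mu.Lock()
                for id, ok := range reach {
                    results[id] = ok
                }
                mu.Unlock()
            }(node, guests)
        }
        wg.Wait()
        return results
    }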
- Unified Proxmox VE discovery by redirecting Node requests to linked Host Agents.
- Added smart deduplication and legacy fallback for Proxmox discovery results.
- Integrated Proxmox Mail Gateway (PMG) into AI Patrol system.
- Added comprehensive tests for discovery redirection and deduplication.
- Redirect PVE node lookups to linked Host Agent ID when available.
- Implement deduplication in discovery lists to prefer Host Agent data over redundant Node entries.
- Add a fallback to the original Node ID for discovery retrieval, ensuring compatibility with legacy data.
- Update data adapters and add comprehensive unit tests for redirection and deduplication logic.
- KnowledgeStore: use atomic write (temp+rename) to prevent file
corruption from concurrent async saves
- Change password tests: add auth headers since endpoint now requires
authentication
- ClearSession test: expect 2 cookies (pulse_session + pulse_csrf)
matching updated clearSession behavior
- API token test: update to match current behavior where query-string
tokens are accepted (needed for WebSocket connections)
- Host agent config: allow ScopeHostManage to resolve any host, not
just token-bound hosts
- OAuth endpoints now require settings:write scope (not just admin)
- Approval endpoints now require ai:execute scope
- Added CommandHash to approvals for replay protection
- Approvals are now single-use (consumed on first use)
- consumeApprovalWithValidation validates command matches approval
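A sketch of the single-use, hash-validated consumption; types and field
names are illustrative:

    package approvals

    import (
        "crypto/sha256"
        "encoding/hex"
        "errors"
        "sync"
    )

    // Approval carries the hash of the exact command it authorizes.
    type Approval struct {
        CommandHash string
    }

    type Store struct {
        mu      sync.Mutex
        pending map[string]Approval
    }

    // consumeApprovalWithValidation enforces both properties at once:
    // the command must hash to the approved CommandHash (replay and
    // substitution protection), and the approval is deleted on first
    // use (single-use).
    func (s *Store) consumeApprovalWithValidation(id, command string) error {
        s.mu.Lock()
        defer s.mu.Unlock()
        a, ok := s.pending[id]
        if !ok {
            return errors.New("approval not found or already consumed")
        }
        sum := sha256.Sum256([]byte(command))
        if hex.EncodeToString(sum[:]) != a.CommandHash {
            return errors.New("command does not match approved command")
        }
        delete(s.pending, id)
        return nil
    }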
The discovery state adapter was not copying IPAddresses from the models
when converting VM/Container state. This caused getResourceExternalIP()
to return empty strings, preventing URL suggestion from working.
- Fix API-only mode to accept Bearer tokens and query params
- Fix data race in API token validation using fine-grained locking
- Fix unified agent download serving wrong binary for invalid arch
- Fix AI infra discovery running when AI disabled and missing stop mechanism
- HIGH: Create per-request AgenticLoop instead of sharing one across
concurrent sessions. This prevents race conditions where ExecuteStream
calls would overwrite each other's FSM, knowledge accumulator, and
other session-specific state.
- MEDIUM: TriggerManager.GetStatus now recomputes adaptive interval after
pruning old events. Previously, currentInterval could remain stuck in
busy/quiet mode after events aged out of the window.
- MEDIUM: Patrol stream phases are now broadcast to subscribers. Fixed
setStreamPhase() to emit phase events and SubscribeToStream() to send
phase events to late joiners. UI was stuck on 'Starting patrol...'
because phase events were never emitted.
- LOW: Fixed TriggerStatus.CurrentInterval JSON serialization. Changed
from time.Duration (serializes as nanoseconds) to int64 milliseconds
to match the 'current_interval_ms' tag.
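The fix for the LOW item, sketched with illustrative surrounding code:

    package patrol

    import "time"

    // An int64 millisecond count matches the current_interval_ms tag;
    // a time.Duration field would serialize as raw nanoseconds.
    type TriggerStatus struct {
        CurrentInterval int64 `json:"current_interval_ms"`
    }

    func statusFor(interval time.Duration) TriggerStatus {
        return TriggerStatus{CurrentInterval: interval.Milliseconds()}
    }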
When local LLM servers (LM Studio, llama.cpp) receive tool definitions
but the model doesn't support function calling, they output internal
control tokens like <|channel|>, <|im_start|>, etc. instead of proper
responses.
This change detects these control tokens during streaming and returns
a clear error message explaining that the model doesn't support function
calling and recommending compatible models (Llama 3.1+, Mistral, Qwen).
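A sketch of the detector; the token list is illustrative and the real
streaming plumbing differs:

    package chat

    import (
        "fmt"
        "regexp"
    )

    // controlToken matches the <|...|> markers local servers emit when a
    // model is handed tool definitions it cannot follow.
    var controlToken = regexp.MustCompile(`<\|(?:channel|im_start|im_end|constrain|message)\|>`)

    // checkStreamChunk aborts the stream with an actionable error instead
    // of letting raw control tokens reach the user.
    func checkStreamChunk(chunk string) error {
        if controlToken.MatchString(chunk) {
            return fmt.Errorf("model emitted control tokens instead of a tool call; " +
                "it likely does not support function calling — try Llama 3.1+, Mistral, or Qwen")
        }
        return nil
    }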
This is better than the previous approach of offering a "disable tools"
option, which would have crippled Pulse Assistant/Patrol functionality.
Users need to use compatible models for the AI features to work properly.
Related to #1154
Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible
APIs but don't support function calling. When tools are sent to these
models, they output raw control tokens instead of proper responses.
This change adds:
- openai_tools_disabled config field in AIConfig
- AreToolsDisabledForProvider() method to check at runtime
- API support to get/set the new setting
- Tests for the new functionality
When enabled and using a custom OpenAI base URL, the chat service will
skip sending tools to the model, allowing basic chat functionality to
work even with models that don't support function calling.
Fixes #1154
Some local models (llama.cpp, LM Studio) output internal control tokens
like <|channel|>, <|constrain|>, <|message|> instead of using proper
function calling. These tokens leak into the UI creating a poor UX.
This adds sanitization to strip these control tokens from both streaming
and non-streaming responses before they reach the user.
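A minimal sketch of the sanitizer, intentionally broader than the
detector above since leakage variants differ between servers:

    package chat

    import "regexp"

    // llmControlToken strips any <|token|> marker.
    var llmControlToken = regexp.MustCompile(`<\|[^|>]*\|>`)

    // sanitizeResponse is applied to both streaming chunks and complete
    // non-streaming responses before they reach the UI.
    func sanitizeResponse(s string) string {
        return llmControlToken.ReplaceAllString(s, "")
    }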