Three follow-up fixes:
1. RestartAIChat() now performs the full post-start wiring (MCP providers,
patrol adapter, investigation orchestrator) when the service starts for
the first time via Restart(). Previously these were only wired via
StartAIChat(), leaving first-time configure with a partially wired service.
2. The Ollama→OpenAI-compatible fallback in createProviderForModel is now
guarded by !strings.HasPrefix(modelStr, "ollama:") so explicit
"ollama:llama3" models are never silently rerouted to a different provider.
3. Windows install script registration check now uses the $Hostname override
(if set) instead of always looking up $env:COMPUTERNAME, so post-install
verification works correctly when a custom hostname is specified.
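A minimal sketch of the guard from item 2, with the surrounding provider selection reduced to a hypothetical pickProvider helper; only the strings.HasPrefix check is taken from the actual change:

```go
package main

import (
	"fmt"
	"strings"
)

// pickProvider illustrates the guarded fallback: only models that were NOT
// explicitly requested as "ollama:<name>" may be rerouted to a custom
// OpenAI-compatible endpoint when Ollama itself is not configured.
func pickProvider(modelStr string, ollamaConfigured bool, customBaseURL string) string {
	explicitOllama := strings.HasPrefix(modelStr, "ollama:")
	if !ollamaConfigured && !explicitOllama && customBaseURL != "" {
		return "openai-compatible via " + customBaseURL
	}
	return "ollama"
}

func main() {
	fmt.Println(pickProvider("ollama:llama3", false, "http://lmstudio:1234/v1")) // stays on ollama
	fmt.Println(pickProvider("qwen3-omni", false, "http://lmstudio:1234/v1"))    // falls back
}
```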
When Pulse starts before AI is configured, legacyService is nil.
Saving AI settings called Restart() which bailed immediately on the
nil check, leaving the service unstarted (503 on /api/ai/sessions)
until a full process restart.
Merged the nil and !IsRunning checks so first-time configure now
starts the service inline, same as the already-handled stopped case.
Also: bare model names that ParseModelString routes to Ollama (e.g.
"qwen3-omni") now fall back to a configured custom OpenAI base URL
when Ollama is not explicitly configured — handles manually-typed
model names on self-hosted OpenAI-compatible endpoints.
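A minimal sketch of the merged check, using illustrative service and method names rather than the real Pulse types:

```go
package main

import "fmt"

// Illustrative stand-ins; the real service, fields, and post-start wiring differ.
type chatService struct{ running bool }

func (s *chatService) IsRunning() bool { return s.running }

type aiManager struct{ legacyService *chatService }

func (m *aiManager) startAIChat() error {
	m.legacyService = &chatService{running: true} // full post-start wiring happens here
	return nil
}

func (m *aiManager) restartInPlace() error { return nil }

// Restart merges the nil check with the !IsRunning check: a never-started
// service (first-time configure) takes the same start path as a stopped one.
func (m *aiManager) Restart() error {
	if m.legacyService == nil || !m.legacyService.IsRunning() {
		return m.startAIChat()
	}
	return m.restartInPlace()
}

func main() {
	var m aiManager // legacyService is nil: Pulse started before AI was configured
	if err := m.Restart(); err == nil {
		fmt.Println("running:", m.legacyService.IsRunning())
	}
}
```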
Fixes #1339, #1296
ZFS zvols (zd*), device-mapper, virtio disks, and other virtual block
devices don't support SMART and were being reported as FAILED. Use lsblk
JSON metadata to filter by device prefix, transport, subsystem, and
vendor/model. Also treat missing smart_status as unknown rather than
failed, and ignore UNKNOWN health in Patrol/AI signals.
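A rough sketch of the lsblk-based filter; the JSON keys match lsblk -J output, but the specific prefix, transport, and vendor rules below are illustrative rather than the exact list used:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Trimmed view of `lsblk -J -o NAME,TRAN,SUBSYSTEMS,VENDOR,MODEL` output.
type lsblkDevice struct {
	Name       string `json:"name"`
	Tran       string `json:"tran"`
	Subsystems string `json:"subsystems"`
	Vendor     string `json:"vendor"`
	Model      string `json:"model"`
}

type lsblkOutput struct {
	Blockdevices []lsblkDevice `json:"blockdevices"`
}

// supportsSMART filters out zvols (zd*), device-mapper, virtio and similar
// virtual block devices that can never report SMART data.
func supportsSMART(d lsblkDevice) bool {
	switch {
	case strings.HasPrefix(d.Name, "zd"), strings.HasPrefix(d.Name, "dm-"),
		strings.HasPrefix(d.Name, "loop"), strings.HasPrefix(d.Name, "vd"):
		return false
	case d.Tran == "" && strings.Contains(d.Subsystems, "virtio"):
		return false
	case strings.EqualFold(strings.TrimSpace(d.Vendor), "QEMU"):
		return false
	}
	return true
}

func main() {
	raw := `{"blockdevices":[{"name":"sda","tran":"sata"},{"name":"zd0"},{"name":"vda","subsystems":"block:virtio:pci"}]}`
	var out lsblkOutput
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		panic(err)
	}
	for _, d := range out.Blockdevices {
		fmt.Printf("%s smart-capable=%v\n", d.Name, supportsSMART(d))
	}
}
```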
createProviderForModel() only handled "provider:model" colon format.
Models like "google/gemini-2.5-flash" or "google/gemini-2.0-flash:free"
(OpenRouter format) failed because the colon split produced invalid
provider names.
Now uses config.ParseModelString() which correctly detects slash-
delimited models as OpenRouter (routed via OpenAI-compatible API).
Patrol runs, evaluation passes, and QuickAnalysis calls were consuming
LLM tokens without recording them in the cost store. This made the
cost_budget_usd_30d budget setting ineffective since enforceBudget()
never saw patrol spend.
- Add RecordUsage() to ai.Service for thread-safe cost recording (sketched below)
- Add recordPatrolUsage() helper to PatrolService, called on both
success and error paths for main patrol and evaluation pass
- Record QuickAnalysis token usage in cost store
- Return partial PatrolResponse (with token counts) on error instead
of nil, so callers can always record consumed tokens
- Propagate partial response through chat_service_adapter on error
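A minimal sketch of the RecordUsage idea from the first bullet, with an invented Usage/costStore shape rather than the real ai.Service and cost store types:

```go
package main

import (
	"fmt"
	"sync"
)

type Usage struct {
	Source       string // "patrol", "evaluation", "quick_analysis", ...
	Model        string
	InputTokens  int
	OutputTokens int
	CostUSD      float64
}

type costStore struct {
	mu      sync.Mutex
	entries []Usage
}

// RecordUsage is safe to call from concurrent patrol and chat goroutines and
// is invoked on both success and error paths, so partially consumed tokens
// still count against the 30-day budget.
func (c *costStore) RecordUsage(u Usage) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries = append(c.entries, u)
}

func (c *costStore) Total30dUSD() float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	var total float64
	for _, u := range c.entries {
		total += u.CostUSD
	}
	return total
}

func main() {
	var store costStore
	store.RecordUsage(Usage{Source: "patrol", Model: "gpt-4o-mini", InputTokens: 1200, OutputTokens: 300, CostUSD: 0.002})
	fmt.Printf("spend: $%.4f\n", store.Total30dUSD())
}
```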
- HIGH: Create per-request AgenticLoop instead of sharing one across
concurrent sessions. This prevents race conditions where ExecuteStream
calls would overwrite each other's FSM, knowledge accumulator, and
other session-specific state.
- MEDIUM: TriggerManager.GetStatus now recomputes adaptive interval after
pruning old events. Previously, currentInterval could remain stuck in
busy/quiet mode after events aged out of the window.
- MEDIUM: Patrol stream phases are now broadcast to subscribers. Fixed
setStreamPhase() to emit phase events and SubscribeToStream() to send
phase events to late joiners. UI was stuck on 'Starting patrol...'
because phase events were never emitted.
- LOW: Fixed TriggerStatus.CurrentInterval JSON serialization. Changed
from time.Duration (serializes as nanoseconds) to int64 milliseconds
to match the 'current_interval_ms' tag.
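The LOW fix amounts to storing milliseconds explicitly so the JSON matches its tag; a small sketch with an invented struct of the same shape:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Before: a time.Duration field marshaled as raw nanoseconds, contradicting
// the "current_interval_ms" tag. Storing int64 milliseconds matches the tag.
type TriggerStatus struct {
	CurrentInterval int64 `json:"current_interval_ms"`
}

func main() {
	interval := 90 * time.Second
	b, _ := json.Marshal(TriggerStatus{CurrentInterval: interval.Milliseconds()})
	fmt.Println(string(b)) // {"current_interval_ms":90000}
}
```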
Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible
APIs but don't support function calling. When tools are sent to these
models, they output raw control tokens instead of proper responses.
This change adds:
- openai_tools_disabled config field in AIConfig
- AreToolsDisabledForProvider() method to check at runtime
- API support to get/set the new setting
- Tests for the new functionality
When enabled and using a custom OpenAI base URL, the chat service will
skip sending tools to the model, allowing basic chat functionality to
work even with models that don't support function calling.
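A minimal sketch of how such a check can gate tool submission; the method name comes from the change above, but its parameter and the config field shapes here are guesses:

```go
package main

import "fmt"

// Illustrative config shape; the real field is the openai_tools_disabled setting.
type AIConfig struct {
	OpenAIBaseURL       string
	OpenAIToolsDisabled bool
}

// AreToolsDisabledForProvider mirrors the runtime check described above:
// tools are only suppressed for custom OpenAI-compatible endpoints.
func (c AIConfig) AreToolsDisabledForProvider(provider string) bool {
	return provider == "openai" && c.OpenAIBaseURL != "" && c.OpenAIToolsDisabled
}

func main() {
	cfg := AIConfig{OpenAIBaseURL: "http://localhost:1234/v1", OpenAIToolsDisabled: true}
	if cfg.AreToolsDisabledForProvider("openai") {
		fmt.Println("skipping tools for this request") // model still gets plain chat
	}
}
```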
Fixes #1154
Two issues fixed:
1. Custom base URL wasn't being passed to the OpenAI client in
createProviderForModel() - requests went to api.openai.com instead
of the configured endpoint (e.g., LM Studio, llama.cpp)
2. Tool schemas were missing the "properties" field when tools had no
parameters. OpenAI API requires "properties" to always be present
as an object, even if empty.
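A small sketch of the schema fix from item 2, using a generic map-based parameter builder rather than the project's actual tool types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolParameters always emits a "properties" object, even for tools that
// take no parameters, as the OpenAI API requires.
func toolParameters(props map[string]any) map[string]any {
	if props == nil {
		props = map[string]any{} // marshals as {} rather than being omitted
	}
	return map[string]any{
		"type":       "object",
		"properties": props,
	}
}

func main() {
	b, _ := json.Marshal(toolParameters(nil))
	fmt.Println(string(b)) // {"properties":{},"type":"object"}
}
```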
Fixes #1154
- Sync UserNote, AcknowledgedAt, SnoozedUntil, DismissedReason, Suppressed,
and TimesRaised from ai.Finding to unified store in both callback and
startup sync paths. Mirror note writes to unified store immediately.
- Dim acknowledged findings (opacity-60), add "Acknowledged" badge, hide
acknowledge button once acknowledged, sort below unacknowledged in
severity mode.
- Pass finding_id through frontend chat API → backend ChatRequest →
ExecuteRequest. Look up full finding from unified store (mutex-guarded)
and prepend structured context to the prompt.
- Inject wrap-up nudges/escalations after token/turn thresholds are met
- Update compaction logic to include key accumulated facts in summaries
- Refine knowledge extraction and accumulation tests
- Update main entry point for revised AI configuration
- Introduce KnowledgeAccumulator to persist facts across turns
- Enhance AgenticLoop to support knowledge injection and final text summaries
- Update chat service to wire up knowledge components
- Frontend updates to support enhanced chat capabilities
- Discovery: classify transient errors (429, timeout, connection refused, etc.)
and return IsError:true so models stop retrying rate-limited calls
- Agentic loop: detect identical tool calls repeated >3 times and block with
LOOP_DETECTED error, forcing the model to try a different approach (sketched after this list)
- OpenAI provider: skip tool_choice for DeepSeek Reasoner which doesn't support it
- Read-only classifier: fix curl -I case sensitivity (uppercase flags are now lowercased before matching),
add iostat/vmstat/mpstat/sar/lxc-ls/lxc-info/nc -z to allowlist,
fix 2>&1 false positive in input redirect detection
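A compact sketch of the loop-detection guard from the agentic-loop bullet; the hashing key and the LOOP_DETECTED wording are illustrative:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// ErrLoopDetected is returned instead of executing a tool call that has
// already been issued identically more than three times in one session.
var ErrLoopDetected = errors.New("LOOP_DETECTED: identical tool call repeated; try a different approach")

type loopGuard struct{ seen map[string]int }

func newLoopGuard() *loopGuard { return &loopGuard{seen: map[string]int{}} }

// check keys on the tool name plus its raw argument JSON, so only truly
// identical calls are counted as repeats.
func (g *loopGuard) check(tool, argsJSON string) error {
	sum := sha256.Sum256([]byte(tool + "\x00" + argsJSON))
	key := hex.EncodeToString(sum[:])
	g.seen[key]++
	if g.seen[key] > 3 {
		return ErrLoopDetected
	}
	return nil
}

func main() {
	g := newLoopGuard()
	for i := 0; i < 5; i++ {
		if err := g.check("pulse_list_vms", `{"node":"pve1"}`); err != nil {
			fmt.Println("call", i+1, "blocked:", err)
		}
	}
}
```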
When pruning older messages to fit context limits, we may cut off
a user message that preceded an assistant message with tool calls.
This leaves an orphaned tool call sequence at the start.
Extend pruneMessagesForModel to:
- Skip leading assistant messages with tool calls
- Also skip their following tool results
- Ensures clean message sequence for all providers
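A simplified sketch of the leading-orphan skip; the message shape is reduced to roles plus a tool-call flag, not the real chat message type:

```go
package main

import "fmt"

type message struct {
	Role     string // "user", "assistant", "tool"
	HasTools bool   // assistant message that issued tool calls
}

// dropOrphanedToolCalls skips leading assistant messages that carry tool
// calls, along with the tool results that follow them, so a pruned history
// never starts mid tool-call sequence.
func dropOrphanedToolCalls(msgs []message) []message {
	i := 0
	for i < len(msgs) {
		if msgs[i].Role == "assistant" && msgs[i].HasTools {
			i++
			for i < len(msgs) && msgs[i].Role == "tool" {
				i++
			}
			continue
		}
		break
	}
	return msgs[i:]
}

func main() {
	pruned := []message{
		{Role: "assistant", HasTools: true}, // its user turn was pruned away
		{Role: "tool"},
		{Role: "user"},
		{Role: "assistant"},
	}
	fmt.Println(dropOrphanedToolCalls(pruned)) // history now starts at the user message
}
```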
- Replace output-parsing approach with tool-based finding creation
- PatrolService now uses runAIAnalysis with proper scope handling
- Add tool event streaming (tool_start, tool_end) to patrol events
- Expose GetExecutor() on chat.Service for patrol integration
- Remove regex-based finding extraction in favor of patrol tools
The patrol now uses the same agentic loop as chat, with the LLM calling
patrol_report_finding to create findings rather than outputting JSON
that gets parsed. This is more reliable and consistent with the tool model.
Remove files that were consolidated into other modules:
- chat/patrol.go, patrol_test.go → moved to chat/service.go
- tools_infrastructure.go → merged into tools_storage.go
- tools_intelligence.go → merged into tools_metrics.go
- tools_patrol.go → merged into tools_alerts.go
- tools_profiles.go, tools_profiles_test.go → removed (unused)
Update related test file references.
- Add ExecutePatrolStream method to chat.Service for patrol-specific execution
- Create chat_service_adapter.go to bridge chat.Service to ai.ChatServiceProvider
- Remove standalone patrol.go and patrol_test.go from chat package
- Add PatrolRequest/PatrolResponse types to chat service
- Add context injection for recent message context
This allows patrol to use an isolated agentic loop with its own system prompt
while leveraging the common chat infrastructure.
Implement ResolvedContext to track pinned resources during chat sessions:
- ResolvedTarget captures resource ID, type, node, and provenance info
- Provenance tracking records how targets were resolved (user mention,
tool result, or implicit context)
- Session maintains pinned targets that persist across conversation turns
Add routing contract tests to verify:
- Commands routed to correct container vs host targets
- Provenance properly recorded for different resolution methods
- Context maintained across multi-turn conversations
This provides audit trail for which resources were accessed and how
they were identified, supporting safety verification and debugging.
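A minimal sketch of the pinned-target bookkeeping described above; the field and constant names are illustrative, not the actual ResolvedContext API:

```go
package main

import "fmt"

type Provenance string

const (
	ProvenanceUserMention Provenance = "user_mention"
	ProvenanceToolResult  Provenance = "tool_result"
	ProvenanceImplicit    Provenance = "implicit_context"
)

type ResolvedTarget struct {
	ResourceID string
	Type       string // "vm", "container", "host", ...
	Node       string
	Provenance Provenance
}

// ResolvedContext pins targets for the life of a session, so later turns
// reuse the same resources and the audit trail records how each was chosen.
type ResolvedContext struct {
	Pinned map[string]ResolvedTarget
}

func (c *ResolvedContext) Pin(t ResolvedTarget) { c.Pinned[t.ResourceID] = t }

func main() {
	ctx := ResolvedContext{Pinned: map[string]ResolvedTarget{}}
	ctx.Pin(ResolvedTarget{ResourceID: "ct/105", Type: "container", Node: "pve1", Provenance: ProvenanceUserMention})
	fmt.Printf("%+v\n", ctx.Pinned["ct/105"])
}
```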
Implement a state machine that enforces structural safety guarantees:
- RESOLVING: Initial state, must discover resources before writing
- READING: Read tools allowed after discovery
- WRITING: Transitions to VERIFYING after any write operation
- VERIFYING: Must perform read verification before next write
This prevents:
- Write operations without resource discovery
- Consecutive writes without verification
- Final answers without post-write verification
The FSM is enforced at the tool execution layer, providing defense-in-depth
that doesn't rely on prompt instructions alone.
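A compact sketch of the FSM; the states mirror the list above, but the transition function and enforcement hook are illustrative:

```go
package main

import (
	"errors"
	"fmt"
)

type state int

const (
	Resolving state = iota // must discover resources first
	Reading                // read tools allowed after discovery
	Writing                // transient: a write immediately requires verification
	Verifying              // must read-verify before the next write
)

type fsm struct{ s state }

// onTool is the kind of check applied at the tool execution layer.
func (f *fsm) onTool(isWrite bool) error {
	switch {
	case isWrite && f.s == Resolving:
		return errors.New("write blocked: no resource discovery yet")
	case isWrite && f.s == Verifying:
		return errors.New("write blocked: previous write not yet verified by a read")
	case isWrite:
		f.s = Verifying // any write must be followed by read verification
	default:
		f.s = Reading // reads serve as discovery and as post-write verification
	}
	return nil
}

func main() {
	var f fsm
	fmt.Println(f.onTool(true))  // blocked: still RESOLVING
	fmt.Println(f.onTool(false)) // discovery read -> READING
	fmt.Println(f.onTool(true))  // write allowed -> must verify next
	fmt.Println(f.onTool(true))  // blocked: consecutive write without verification
}
```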
Major new AI capabilities for infrastructure monitoring:
Investigation System:
- Autonomous finding investigation with configurable autonomy levels
- Investigation orchestrator with rate limiting and guardrails
- Safety checks for read-only mode enforcement
- Chat-based investigation with approval workflows
Forecasting & Remediation:
- Trend forecasting for resource capacity planning
- Remediation engine for generating fix proposals
- Circuit breaker for AI operation protection
Unified Findings:
- Unified store bridging alerts and AI findings
- Correlation and root cause analysis
- Incident coordinator with metrics recording
New Frontend:
- AI Intelligence page with patrol controls
- Investigation drawer for finding details
- Unified findings panel with actions
Supporting Infrastructure:
- Learning store for user preference tracking
- Proxmox event ingestion and correlation
- Enhanced patrol with investigation triggers
- Restore 'mini' mode for StackedDiskBar.
- Restore layout fixes (fixed table layout, mobile columns) for Docker and Hosts tables.
- Remove 'Ask AI' and AI context selection features.
- Docker: Use compact 'Cube' icon for Podman pods to prevent name obstruction.
- Docker: Show concise image names (strip registry URL).
- Backend: Include pending fixes for AI providers.
- Refactored tool execution to handle tenant-scoped contexts
- Added new tests for infrastructure, control, and kubernetes tools
- Improved test coverage for agentic chat and approval store
- Add comprehensive tests for internal/api/config_handlers.go (Phases 1-3)
- Improve test coverage for AI tools, chat service, and session management
- Enhance alert and notification tests (ResolvedAlert, Webhook)
- Add frontend unit tests for utils (searchHistory, tagColors, temperature, url)
- Add Proxmox client API tests
Replace the OpenCode sidecar with a native chat service that handles:
- Real-time streaming responses from AI providers
- Multi-turn conversation sessions with history
- Tool execution with automatic function calling
- Agentic workflows for autonomous task completion
- Patrol integration for automated health analysis
The chat service directly communicates with AI providers using the
new StreamingProvider interface, eliminating the need for an external
sidecar process. Sessions are managed in-memory with configurable
history limits.
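A hypothetical sketch of a streaming-provider contract of this general shape; the real StreamingProvider interface (tool calls, usage accounting, error events) is richer, and none of the names below are taken from the codebase:

```go
package main

import (
	"context"
	"fmt"
)

type StreamEvent struct {
	Delta string // incremental text
	Done  bool
}

// StreamingProvider is the kind of interface that lets the chat service talk
// to AI providers directly instead of going through a sidecar process.
type StreamingProvider interface {
	StreamChat(ctx context.Context, prompt string, events chan<- StreamEvent) error
}

type echoProvider struct{}

func (echoProvider) StreamChat(ctx context.Context, prompt string, events chan<- StreamEvent) error {
	defer close(events)
	for _, word := range []string{"echo:", prompt} {
		events <- StreamEvent{Delta: word + " "}
	}
	events <- StreamEvent{Done: true}
	return nil
}

func main() {
	events := make(chan StreamEvent, 8)
	p := echoProvider{}
	go p.StreamChat(context.Background(), "hello", events)
	for ev := range events {
		fmt.Print(ev.Delta)
		if ev.Done {
			fmt.Println()
		}
	}
}
```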
Key components:
- service.go: Main chat service with provider integration
- session.go: Session management and message history
- agentic.go: Agentic loop for autonomous tool execution
- patrol.go: Patrol-specific chat context and analysis
- tools.go: Tool execution bridge to tools package
- types.go: Chat message and event type definitions