Commit graph

83 commits

Author SHA1 Message Date
rcourtman
1de1392c9b Preserve provider metadata in AI model lists (#1320) 2026-03-25 13:08:15 +00:00
rcourtman
c12f5fb5a4 Restart AI chat on provider and patrol model changes (#1339) (#1360) 2026-03-25 10:58:12 +00:00
rcourtman
89577fe533 Fix OIDC token refresh bypass and guard AISettingsHandler nil path
The applyAuthContextHeaders early-return in CheckAuth skipped the OIDC
token refresh block, causing long-lived OIDC sessions to expire instead
of auto-refreshing. Move the refresh trigger into extractAndStoreAuthContext
so it fires at the middleware level before CheckAuth's early return.

Also add a nil guard on mtPersistence in AISettingsHandler.GetAIService
for non-default org paths, preventing a potential panic if background
code carries a non-default org context in v5 single-tenant mode.
2026-03-06 11:05:01 +00:00
rcourtman
499ab812e3 Fix post-release regressions and lock v5 to single-tenant runtime 2026-03-05 23:46:35 +00:00
rcourtman
10a4e994b6 fix(api): return 404 from undismiss endpoint for invalid finding IDs (#1300)
HandleUndismissFinding now checks both patrol and unified stores
before returning. Returns 404 with error message when the finding
is not found or not dismissed, instead of silently returning success.
2026-03-02 11:48:23 +00:00
rcourtman
2fcddecf80 feat(api): add POST /api/ai/patrol/undismiss endpoint to revert suppressed findings (#1300)
The Undismiss() method existed on FindingsStore but was never exposed
via the API. Users who dismissed findings as "not_an_issue" had no way
to revert them.

- Add HandleUndismissFinding handler and route
- Add Undismiss() to UnifiedStore for parity with FindingsStore
- Also remove matching explicit suppression rules on undismiss
2026-03-01 22:29:36 +00:00
rcourtman
6140cb5be4 fix: auto-default discovery interval to 24h when enabled
When users enable AI discovery without setting an interval, the
default of 0 silently stays in manual-only mode. Now normalizes
0 to 24h on save so discovery actually starts automatically.

Fixes #1225
2026-02-10 21:45:59 +00:00
rcourtman
25285e64bc Require proxy admin for AI test endpoints 2026-02-04 16:30:22 +00:00
rcourtman
8f92273e33 security: enforce scope checks for AI approvals and config management 2026-02-03 18:40:31 +00:00
rcourtman
3d8374e527 Fix AI investigation context and UI settings
- Ensure correct org context is used for AI chat service resolution

- Fix AI adapter tests

- Update AI Intelligence page UI for advanced settings
2026-02-03 16:24:56 +00:00
rcourtman
a55ae78715 Revert "Add config option to disable tools for OpenAI-compatible endpoints"
This reverts commit 81229f206f.
2026-02-03 13:26:26 +00:00
rcourtman
81229f206f Add config option to disable tools for OpenAI-compatible endpoints
Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible
APIs but don't support function calling. When tools are sent to these
models, they output raw control tokens instead of proper responses.

This change adds:
- openai_tools_disabled config field in AIConfig
- AreToolsDisabledForProvider() method to check at runtime
- API support to get/set the new setting
- Tests for the new functionality

When enabled and using a custom OpenAI base URL, the chat service will
skip sending tools to the model, allowing basic chat functionality to
work even with models that don't support function calling.

Fixes #1154
2026-02-03 13:21:44 +00:00
rcourtman
d1f76982ec fix: finding drawer actions (notes persist, acknowledge visual, discuss context)
- Sync UserNote, AcknowledgedAt, SnoozedUntil, DismissedReason, Suppressed,
  and TimesRaised from ai.Finding to unified store in both callback and
  startup sync paths. Mirror note writes to unified store immediately.
- Dim acknowledged findings (opacity-60), add "Acknowledged" badge, hide
  acknowledge button once acknowledged, sort below unacknowledged in
  severity mode.
- Pass finding_id through frontend chat API → backend ChatRequest →
  ExecuteRequest. Look up full finding from unified store (mutex-guarded)
  and prepend structured context to the prompt.
2026-02-02 15:18:51 +00:00
rcourtman
3c3b041368 Improve patrol autonomy error messages for clarity
- Update license_required error to mention 'Auto-fix' instead of
  'Assisted and Full autonomy' for clearer user messaging
- Update full_mode_locked error to reference the UI toggle label
  'Auto-fix critical issues' instead of internal field name
2026-02-01 22:25:32 +00:00
rcourtman
44f9a36d5c feat(license): implement free Patrol / pro Auto-Fix tiering strategy 2026-02-01 16:27:10 +00:00
rcourtman
95a0d7a6bd feat(backend): implement AI Patrol, Investigation, and system-wide refactors 2026-01-30 19:02:14 +00:00
rcourtman
c457358c63 fix(api): flush SSE headers immediately on patrol stream connect
Send an SSE comment immediately when a client connects to the patrol
stream endpoint. This flushes HTTP headers so clients receive the
200 response right away, rather than blocking until the first event.

This fixes eval tests where the stream connection would time out
waiting for headers while patrol was still initializing.
2026-01-29 08:20:25 +00:00
rcourtman
9dcd859056 Update API handlers for AI and discovery endpoints
API layer updates:

AI Handlers:
- Better streaming response handling
- Improved error responses
- Session management improvements

Discovery Handlers:
- New discovery endpoint handlers
- Storage config handler
- Better router organization

Removed deprecated aidiscovery handlers in favor of unified approach.
2026-01-28 16:51:35 +00:00
rcourtman
7f7edfceb4 test: expand backend coverage 2026-01-25 21:08:44 +00:00
rcourtman
f2648891a9 chore: remove unused utility functions
Remove createColumn helper from responsive types and getPulsePort
from URL utilities. Both were exported but never used anywhere.
2026-01-24 23:23:27 +00:00
rcourtman
ff2841a5c6 Fix patrol scoping and config propagation 2026-01-24 23:07:55 +00:00
rcourtman
de2cb7a29b chore: remove deprecated GetAvailableModels and ModelInfo
- Remove deprecated config.ModelInfo type (use providers.ModelInfo)
- Remove deprecated GetAvailableModels function (always returned nil)
- Remove associated test
- Update AISettingsResponse to use providers.ModelInfo
2026-01-24 23:00:16 +00:00
rcourtman
27f1a11acb feat: add AI Intelligence system with investigation and forecasting
Major new AI capabilities for infrastructure monitoring:

Investigation System:
- Autonomous finding investigation with configurable autonomy levels
- Investigation orchestrator with rate limiting and guardrails
- Safety checks for read-only mode enforcement
- Chat-based investigation with approval workflows

Forecasting & Remediation:
- Trend forecasting for resource capacity planning
- Remediation engine for generating fix proposals
- Circuit breaker for AI operation protection

Unified Findings:
- Unified store bridging alerts and AI findings
- Correlation and root cause analysis
- Incident coordinator with metrics recording

New Frontend:
- AI Intelligence page with patrol controls
- Investigation drawer for finding details
- Unified findings panel with actions

Supporting Infrastructure:
- Learning store for user preference tracking
- Proxmox event ingestion and correlation
- Enhanced patrol with investigation triggers
2026-01-24 22:41:43 +00:00
rcourtman
8bf31214f5 feat: enhance API handlers and router with new endpoints
- Add new AI handler endpoints
- Enhance diagnostics API
- Improve router configuration
2026-01-22 22:31:24 +00:00
rcourtman
f2541b0d6c Refactor: Multi-tenancy support for API and License handlers
- Updated LicenseHandlers and LicenseService to be context/tenant aware
- Refactored API router and middleware to support tenant-scoped license checks
- Updated associated tests for context-aware handlers
2026-01-22 16:42:39 +00:00
rcourtman
289d95374f feat: add multi-tenancy foundation (directory-per-tenant)
Implements Phase 1-2 of multi-tenancy support using a directory-per-tenant
strategy that preserves existing file-based persistence.

Key changes:
- Add MultiTenantPersistence manager for org-scoped config routing
- Add TenantMiddleware for X-Pulse-Org-ID header extraction and context propagation
- Add MultiTenantMonitor for per-tenant monitor lifecycle management
- Refactor handlers (ConfigHandlers, AlertHandlers, AIHandlers, etc.) to be
  context-aware with getConfig(ctx)/getMonitor(ctx) helpers
- Add Organization model for future tenant metadata
- Update server and router to wire multi-tenant components

All handlers maintain backward compatibility via legacy field fallbacks
for single-tenant deployments using the "default" org.
2026-01-22 13:39:06 +00:00
rcourtman
ecc31730f6 Remove OpenCode references 2026-01-20 16:56:41 +00:00
rcourtman
ffb8928dbf refactor(api): Update handlers for native AI chat service
Adapts API handlers to use the new native chat service:

ai_handler.go:
- Replace opencode.Service with chat.Service
- Add AIService interface for testability
- Add factory function for service creation (mockable)
- Update provider wiring to use tools package types

ai_handlers.go:
- Add Notable field to model list response
- Simplify command approval - execution handled by agentic loop
- Remove inline command execution from approval endpoint

router.go:
- Update imports: mcp -> tools, opencode -> chat
- Add monitor wrapper types for cleaner dependency injection
- Update patrol wiring for new chat service

agent_profiles:
- Rename agent_profiles_mcp.go -> agent_profiles_tools.go
- Update imports for tools package

monitor_wrappers.go:
- New file with wrapper types for alert/notification monitors
- Enables interface-based dependency injection
2026-01-19 19:20:00 +00:00
rcourtman
4cea85ec97 feat(mcp): expand MCP tools and add session management APIs
New API endpoints:
- POST /api/ai/sessions/{id}/summarize - Compress context
- GET /api/ai/sessions/{id}/diff - Get file changes
- POST /api/ai/sessions/{id}/fork - Branch conversation
- POST /api/ai/sessions/{id}/revert - Undo changes
- POST /api/ai/sessions/{id}/unrevert - Restore reverted changes

MCP provider wiring:
- Storage, backup, disk health providers
- Metrics history, baseline, pattern detection
- Findings manager and metadata updater

Tool improvements:
- pulse_get_topology: Unified infrastructure view
- Improved tool descriptions with usage examples
- Better license checking with logging
2026-01-17 14:43:58 +00:00
rcourtman
035436ad6e fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking
The agent was crashing with 'fatal error: concurrent map writes' when
handleCheckUpdatesCommand spawned a goroutine that called collectOnce
concurrently with the main collection loop. Both code paths access
a.prevContainerCPU without synchronization.

Added a.cpuMu mutex to protect all accesses to prevContainerCPU in:
- pruneStaleCPUSamples()
- collectContainer() delete operation
- calculateContainerCPUPercent()

Related to #1063
2026-01-15 21:10:55 +00:00
rcourtman
8c7581d32c feat(profiles): add AI-assisted profile suggestions
Add ability for users to describe what kind of agent profile they need
in natural language, and have AI generate a suggestion with name,
description, config values, and rationale.

- Add ProfileSuggestionHandler with schema-aware prompting
- Add SuggestProfileModal component with example prompts
- Update AgentProfilesPanel with suggest button and description field
- Streamline ValidConfigKeys to only agent-supported settings
- Update profile validation tests for simplified schema
2026-01-15 13:24:18 +00:00
rcourtman
9cd53814a3 feat(alerts): add per-volume disk thresholds for host agents
Allow users to set custom disk usage thresholds per mounted filesystem
on host agents, rather than applying a single threshold to all volumes.

This addresses NAS/NVR use cases where some volumes (e.g., NVR storage)
intentionally run at 99% while others need strict monitoring.

Backend:
- Check for disk-specific overrides before using HostDefaults.Disk
- Override key format: host:<hostId>/disk:<mountpoint>
- Support both custom thresholds and disable per-disk

Frontend:
- Add 'hostDisk' resource type
- Add "Host Disks" collapsible section in Thresholds → Hosts tab
- Group disks by host for easier navigation

Closes #1103
2026-01-13 23:38:20 +00:00
rcourtman
d73e57af86 Initialize SQLite audit logger for Pro license with audit_logging feature
The audit logging feature was showing the UI for Pro users but the
SQLiteLogger was never actually initialized - it fell back to the
ConsoleLogger which only writes to console and returns empty arrays
for queries.

This fix:
- Adds initAuditLoggerIfLicensed() helper to license_handlers.go
- Calls it when loading a persisted license at startup
- Calls it when activating a new license via API
- Creates SQLiteLogger with 90-day default retention when audit_logging
  feature is enabled

The audit.db will be created in {dataDir}/audit/ when Pro is licensed.
2026-01-13 10:06:48 +00:00
rcourtman
b2a6cd0fa3 fix(agent): add FreeBSD platform support to agent download and UI (#1051)
- Add freebsd-amd64 and freebsd-arm64 to normalizeUnifiedAgentArch()
  so the download endpoint serves FreeBSD binaries when requested
- Add FreeBSD/pfSense/OPNsense platform option to agent setup UI
  with note about bash installation requirement
- Add FreeBSD test cases to unified_agent_test.go

Fixes installation on pfSense/OPNsense where users were getting 404
errors because the backend didn't recognize the freebsd-amd64 arch
parameter from install.sh.
2026-01-11 23:51:12 +00:00
rcourtman
55f5f071ed fix: replace hallucinated upgrade URLs with correct pulserelay.pro
Previous LLM sessions incorrectly inserted fake URLs (pulse.sh/pro and
yourpulse.io/pro) for the Pro upgrade links. Neither domain exists.

Replaced all 34 instances with the correct URL: https://pulserelay.pro/

Fixes #1077
2026-01-10 22:45:40 +00:00
rcourtman
6019e3e77e fix: normalize custom OpenAI-compatible API URLs (#1067)
Users providing base URLs like "https://openrouter.ai/api/v1" were
getting HTML error responses because the client used the URL directly
without appending "/chat/completions".

- Normalize baseURL in NewOpenAIClient to ensure it ends with /chat/completions
- Fix modelsEndpoint() to derive /models from the normalized baseURL
- Add tests for URL normalization with various endpoint formats
2026-01-09 09:13:36 +00:00
rcourtman
7db6b3e47d feat: Add AI chat session sync across devices
Implements server-side persistence for AI chat sessions, allowing users
to continue conversations across devices and browser sessions. Related
to #1059.

Backend:
- Add chat session CRUD API endpoints (GET/PUT/DELETE)
- Add persistence layer with per-user session storage
- Support session cleanup for old sessions (90 days)
- Multi-user support via auth context

Frontend:
- Rewrite aiChat store with server sync (debounced)
- Add session management UI (new conversation, switch, delete)
- Local storage as fallback/cache
- Initialize sync on app startup when AI is enabled
2026-01-08 10:47:45 +00:00
rcourtman
3fdf753a5b Enhance devcontainer and CI workflows
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 22:29:15 +00:00
rcourtman
5c2db10780 fix: Gate AI subsystems and intelligence endpoints on AI enabled state. Related to #885
- Add IsAIEnabled() method to AISettingsHandler for consistent checks
- Gate baseline learning, pattern detector, and correlation detector initialization
  in StartPatrol() on AI being enabled
- Add AI enabled checks to all /api/ai/intelligence/* endpoints as defense-in-depth
- Return empty results with "AI is not enabled" message when AI is disabled

This ensures no AI-related data is collected, persisted, or returned when AI is disabled,
preventing the "undismissable alerts" issue where old AI findings would appear.
2025-12-30 22:06:07 +00:00
rcourtman
3040800e7b fix: AI Patrol now respects exact user-configured thresholds
BREAKING CHANGE: AI Patrol now uses EXACT alert thresholds by default
instead of warning 5-15% before the threshold.

Changes:
- Default behavior: Patrol warns at your configured threshold (e.g., 96% = warns at 96%)
- New setting: 'use_proactive_thresholds' enables the old early-warning behavior
- API: Added use_proactive_thresholds to GET/PUT /api/settings/ai
- Backend: Added SetProactiveMode/GetProactiveMode to PatrolService
- Backend: Added GetThresholds to PatrolService for UI display
- Tests: Updated and added tests for both exact and proactive modes
- Also fixed unused imports in dockeragent/agent.go

When proactive mode is disabled (default):
- Watch: threshold - 5% (slight buffer)
- Warning: exact threshold

When proactive mode is enabled:
- Watch: threshold - 15%
- Warning: threshold - 5%

Related to #951
2025-12-29 08:40:34 +00:00
rcourtman
fe3b4ed5b6 fix: require Pro license for auto-fix and autonomous mode
- patrol.go: Auto-fix now requires both config flag AND ai_autofix license
- service.go: IsAutonomous() checks for ai_autofix license before enabling
- ai_handlers.go: API returns 402 if enabling auto-fix/autonomous without license
2025-12-25 21:26:46 +00:00
rcourtman
0feb039775 fix: Allow dismissing AI findings without Pro license. Related to #885
Users who accumulated AI patrol findings before the patrol-without-AI
bug was fixed (24c4bb0b) could not dismiss them because the dismiss and
resolve endpoints required a Pro license.

Changes:
- Remove Pro license requirement from /api/ai/patrol/dismiss endpoint
- Remove Pro license requirement from /api/ai/patrol/resolve endpoint
- Add ClearAll() method to FindingsStore for bulk clearing
- Add DELETE /api/ai/patrol/findings endpoint for clearing all findings
- Add "Clear All" button to AI Insights UI

Users can now dismiss or resolve any findings they can see, and admins
can clear all findings at once if needed.
2025-12-25 05:06:26 +00:00
rcourtman
e5c7cf7d1e fix: AI request timeout setting not persisting. Related to #884
Added request_timeout_seconds field to:
- AISettingsResponse struct (for GET responses)
- AISettingsUpdateRequest struct (for PUT requests)
- HandleUpdateAISettings handler logic (validation + persistence)
- HandleGetAISettings response builder

The frontend was already sending request_timeout_seconds but the
backend was ignoring it. Now the setting persists correctly.
2025-12-24 16:05:07 +00:00
rcourtman
03d7147615 fix(ai): force enabled state in demo mode
Ensures the AI settings endpoint reports enabled=true and configured=true
when running in demo mode (PULSE_MOCK_MODE=true), even if no provider is
configured. This unlocks the frontend UI to allow interaction with the
mock AI assistant.
2025-12-23 18:39:34 +00:00
rcourtman
28ac86c8ab fix: reduce WebSocket reconnection log noise in host agent
Addresses #866 - agents were logging 'WebSocket connection failed' warnings
even during normal reconnection scenarios (server restart, network blip, etc).

Changes:
- Normal close errors (1000, 1001, connection reset) now log at Debug level
- Only log Warning after 3+ consecutive failures
- Changed 'Connecting to Pulse' from Info to Debug to reduce noise
- Successful connections still log at Info level

The WebSocket is only used for AI command execution, not metrics, so
transient disconnections don't affect monitoring functionality.
2025-12-22 14:11:23 +00:00
rcourtman
07c5880b0a fix: AI settings persistence and UI improvements
Bug Fixes:
- Fix boolean fields with 'omitempty' not persisting false values
  - AlertTriggeredAnalysis, PatrolAnalyzeNodes/Guests/Docker/Storage
  - omitempty causes Go to skip false (zero value) when marshaling JSON
  - On reload, NewDefaultAIConfig() sets true, and missing field stays true

- Fix model dropdown losing selection after save (SolidJS reactivity issue)
  - Added explicit 'selected' attribute to option elements
  - Ensures browser maintains selection with optgroups during re-renders

Improvements:
- Change patrol type label from 'Quick' to 'Patrol' in history table
- Add chat_model and patrol_model to AI settings update log
- Add alert_triggered_analysis to AI config load log for debugging
2025-12-21 21:48:09 +00:00
rcourtman
586bf96e03 refactor: remove runbooks feature entirely
Runbooks were a half-built feature that provided no value:
- Only 3 runbooks existed
- AI dynamic remediation already covers the same ground
- Added UI complexity without benefit

Removed:
- runbooks.go and runbooks_test.go
- Handler functions in ai_handlers.go
- Routes in router.go
- Test cases in ai_handlers_test.go
- Auto-fix call in patrol.go

Kept (dead code but harmless):
- Frontend types/API calls (will 404)
- RecordIncidentRunbook function (unused)

Less code = easier to maintain.
2025-12-21 17:48:07 +00:00
rcourtman
417a523d85 feat(ai): add unified Intelligence orchestrator
- Create Intelligence struct that aggregates all AI subsystems
- Add /api/ai/intelligence endpoint for system-wide and per-resource insights
- Wire Intelligence into PatrolService as a facade (not replacement)
- Add TypeScript types and API client for frontend
- Add unit tests for Intelligence orchestrator
- Fix pre-existing test failures using diagnostic commands instead of actionable ones

The Intelligence orchestrator provides:
- System-wide health scoring (A-F grades)
- Aggregated findings, predictions, correlations
- Per-resource context generation for AI prompts
- Learning progress tracking

This unifies access to AI subsystems without replacing existing code paths.
2025-12-21 10:32:02 +00:00
rcourtman
96573f4aca feat: enhance AI baseline context visibility and incident timeline improvements
Backend:
- Enhanced buildEnrichedResourceContext to ALWAYS show learned baselines with
  status indicators (normal/elevated/anomaly) instead of only when anomalous
- This makes Pulse Pro's 'moat' visible - users can see the AI understands
  their infrastructure's normal behavior patterns
- Added baseline import to service.go

Frontend (user changes):
- Added incident event type filtering with toggle buttons
- Added resource incident panel to view all incidents for a resource
- Added timeline expand/collapse functionality in alert history
- Added incident note saving with proper incidentId tracking
- Added startedAt parameter for proper incident timeline loading
2025-12-21 00:14:20 +00:00
rcourtman
db5e79bb37 fix: Allow Host Agent thresholds to be set to 0 to disable alerting. Related to #864 2025-12-20 20:25:20 +00:00