4826 commits

Author SHA1 Message Date
rcourtman
8b077f69ce feat: AI security and policy improvements for 5.0
- Add DOMPurify sanitization for AI chat markdown rendering (XSS fix)
- Configure DOMPurify to add target=_blank and rel=noopener to links
- Update system prompt to align with command approval policy
- Clarify safe vs destructive commands in prompt
- Improve patrol auto-fix mode guidance with safe operation list
- Add verification requirements for auto-fix actions
- Update observe-only mode to be clearer about read-only restrictions
2025-12-12 17:38:55 +00:00
rcourtman
6b511d1733 fix(ui): Move Auto-Fix Model selector next to Auto-Fix toggle
UX improvement: The Auto-Fix Model dropdown was too far from the
Patrol Auto-Fix toggle, making it hard to find.

Now the flow is:
1. Scroll to 'AI Patrol Behavior' section
2. Check the acknowledgement checkbox
3. Enable 'Patrol Auto-Fix' toggle
4. Model selector appears RIGHT BELOW the toggle

The model dropdown only appears when auto-fix is enabled (since
it's irrelevant otherwise).
2025-12-12 15:17:44 +00:00
rcourtman
840fa741cd feat(ui): Add AI Insights Panel component
Add collapsible panel to display AI-learned intelligence:

Features:
- Failure predictions with time estimates
- Color-coded severity (overdue=red, <3 days=amber, etc.)
- Human-readable event types and confidence percentages
- Resource dependency/correlation display
- Shows source → target relationships with avg delay
- Expandable/collapsible design to save space

Styling:
- Purple gradient theme consistent with AI branding
- Responsive with dark mode support
- Clean card-based layout for predictions
- Badge showing total insight count

Ready to integrate into Alerts page or resource details.
2025-12-12 14:55:08 +00:00
rcourtman
effa33ac09 feat(frontend): Add AI intelligence API types and methods
Add TypeScript types and API methods for AI intelligence data:

Types (aiIntelligence.ts):
- FailurePattern - Detected recurring patterns
- FailurePrediction - Predicted failures with confidence
- ResourceCorrelation - Detected resource dependencies
- InfrastructureChange - Recent config/state changes
- ResourceBaseline - Learned normal behavior baselines

API Methods (ai.ts):
- getPatterns(resourceId?) - Fetch failure patterns
- getPredictions(resourceId?) - Fetch failure predictions
- getCorrelations(resourceId?) - Fetch resource correlations
- getRecentChanges(hours?) - Fetch infrastructure changes
- getBaselines(resourceId?) - Fetch learned baselines

All methods support optional resource_id filtering.
2025-12-12 14:53:53 +00:00
rcourtman
6f0379f879 feat(api): Add AI intelligence API endpoints
Expose learned AI intelligence data via REST API:

New endpoints:
- GET /api/ai/intelligence/patterns - Detected failure patterns
- GET /api/ai/intelligence/predictions - Failure predictions
- GET /api/ai/intelligence/correlations - Resource correlations
- GET /api/ai/intelligence/changes - Recent infrastructure changes
- GET /api/ai/intelligence/baselines - Learned baselines

All endpoints support ?resource_id filter for per-resource queries.
Changes endpoint supports ?hours filter (default: 24).

Backend additions:
- ai_intelligence_handlers.go - Handler implementations
- baseline.Store.GetAllBaselines() - Flat baseline export
- patrol.GetChangeDetector() - Access change detector

This enables frontend to display:
- 'OOM expected in 3 days based on pattern'
- 'When storage-1 is full, database VM restarts'
- 'VM memory baseline: 60-75%'

All tests passing.
2025-12-12 14:49:46 +00:00
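The `?hours` filter with its default of 24 (commit 6f0379f879) could be parsed along these lines. This is a minimal sketch, not the handler from `ai_intelligence_handlers.go`: `parseHours` and `defaultChangeHours` are hypothetical names, and falling back to the default on invalid input is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// defaultChangeHours mirrors the "default: 24" stated in the commit message.
const defaultChangeHours = 24

// parseHours converts the raw ?hours value into a positive look-back window.
// Empty, non-numeric, or non-positive input falls back to the default; that
// fallback behaviour is an assumption, not something the commit specifies.
func parseHours(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return defaultChangeHours
	}
	return n
}

func main() {
	// Hypothetical query string for GET /api/ai/intelligence/changes.
	q, err := url.ParseQuery("resource_id=vm-101&hours=6")
	if err != nil {
		panic(err)
	}
	fmt.Println(q.Get("resource_id"), parseHours(q.Get("hours")))
	fmt.Println(parseHours("")) // no ?hours supplied
}
```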
rcourtman
d36ad0945f feat(settings): Add separate Auto-Fix Model setting for remediation
Add configurable model specifically for automatic remediation actions:

Backend (internal/config/ai.go):
- Add AutoFixModel field to AIConfig
- Add GetAutoFixModel() getter with fallback chain:
  AutoFixModel -> PatrolModel -> Model

Frontend (AISettings.tsx, types/ai.ts):
- Add auto_fix_model to AISettings types
- Add Auto-Fix Model dropdown (only shows when patrol_auto_fix enabled)
- Falls back to patrol model if not set

API (ai_handlers.go):
- Add auto_fix_model to response and update request
- Handle saving/loading the new field

Rationale:
- Auto-fix takes real actions, may warrant a more capable model
- Patrol observation can use cheaper models for cost savings
- Gives users granular control over model costs vs reliability
- Model hierarchy: Chat > AutoFix > Patrol > Default
2025-12-12 14:35:28 +00:00
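The fallback chain described in d36ad0945f (AutoFixModel -> PatrolModel -> Model) can be sketched as follows. The struct is a pared-down stand-in for the real `AIConfig`; the exact field set and the value receiver are assumptions, and the model strings are placeholders.

```go
package main

import "fmt"

// AIConfig is a simplified stand-in: only the three model fields that the
// fallback chain mentions are modelled here.
type AIConfig struct {
	Model        string // default model
	PatrolModel  string // optional override for patrol runs
	AutoFixModel string // optional override for auto-fix remediation
}

// GetAutoFixModel returns the first non-empty value in the chain
// AutoFixModel -> PatrolModel -> Model.
func (c AIConfig) GetAutoFixModel() string {
	if c.AutoFixModel != "" {
		return c.AutoFixModel
	}
	if c.PatrolModel != "" {
		return c.PatrolModel
	}
	return c.Model
}

func main() {
	// No auto-fix model set, so the patrol model is used.
	cfg := AIConfig{Model: "provider:default", PatrolModel: "provider:cheap"}
	fmt.Println(cfg.GetAutoFixModel())
}
```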
rcourtman
f7513cf592 docs: Mark Phase 6 (Multi-Resource Correlation) as complete
ALL PHASES COMPLETE! 🎉

Pulse AI now has the full 'moat' architecture:

- Phase 1: Historical Context Integration 
- Phase 2: Anomaly Detection 
- Phase 3: Operational Memory 
- Phase 4: Remediation Integration 
- Phase 5: Predictive Intelligence 
- Phase 6: Multi-Resource Correlation 

The AI becomes more valuable the longer Pulse runs by learning:
- Metric trends and baselines
- Infrastructure changes
- Past remediation actions
- Failure patterns
- Resource dependencies
2025-12-12 14:27:14 +00:00
rcourtman
9539ddaa6b feat(ai): Add multi-resource correlation detection (Phase 6)
Create internal/ai/correlation package:

1. Correlation Detector (detector.go):
   - Tracks events across resources
   - Detects when events on one resource follow events on another
   - Calculates average delay between correlated events
   - Confidence scoring based on occurrence count
   - Persists to ai_correlations.json

2. Features:
   - GetCorrelations() - All detected relationships
   - GetCorrelationsForResource() - Relationships for one resource
   - GetDependencies() - What resources depend on this one
   - GetDependsOn() - What this resource depends on
   - PredictCascade() - Predict what will be affected
   - FormatForContext() - AI-consumable summary

3. Integration:
   - Wire to alert history in router startup
   - Map alert types to correlation event types
   - Add correlation context to enriched AI context

Example AI context now includes:
'When local-zfs experiences high usage, database often follows within 5 minutes'

This enables the AI to understand infrastructure dependencies
and predict cascade failures.

All tests passing.
2025-12-12 14:26:10 +00:00
rcourtman
07baf5532d docs: Mark Phase 5 (Predictive Intelligence) as complete
Updated implementation status:
- Phase 1: Historical Context Integration 
- Phase 2: Anomaly Detection 
- Phase 3: Operational Memory 
- Phase 4: Remediation Integration 
- Phase 5: Predictive Intelligence (NEW)
- Phase 6: Multi-Resource Correlation (PLANNED)

Pulse AI now has a complete 'moat' - it becomes more
valuable the longer it runs by learning from:
- Historical metric trends
- Baseline behavior patterns
- Infrastructure changes
- Past remediation actions
- Alert patterns and failure cycles
2025-12-12 14:16:41 +00:00
rcourtman
9c92bb49df feat(ai): Wire alert history to pattern detector for event tracking
Connect alert system to failure prediction:

1. Add AlertCallback to HistoryManager:
   - OnAlert() method to register callbacks
   - Callbacks invoked when alerts are added
   - Called outside lock to prevent deadlocks

2. Expose OnAlertHistory() on alerts.Manager:
   - Pass-through to HistoryManager.OnAlert()
   - Enables external systems to track alerts

3. Wire pattern detector in router startup:
   - Register callback when pattern detector is created
   - Convert alert types to trackable events
   - Pattern detector now learns from production alerts

Now every alert (memory_warning, cpu_critical, etc.) is recorded as
a historical event for pattern analysis. The AI can predict:
'High memory usage typically occurs every ~3 days (next expected in ~1 day)'

All tests passing.
2025-12-12 14:16:03 +00:00
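The "called outside lock to prevent deadlocks" point in 9c92bb49df is a common Go pattern: copy the callback list while holding the mutex, release it, then invoke. The sketch below assumes a simplified `HistoryManager`; `AddAlert`, `Len`, and the `Alert` fields are hypothetical, and only `OnAlert` is named in the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// Alert and AlertCallback are simplified stand-ins for the real types.
type Alert struct {
	ID   string
	Type string
}

type AlertCallback func(Alert)

type HistoryManager struct {
	mu        sync.Mutex
	history   []Alert
	callbacks []AlertCallback
}

// OnAlert registers a callback to run whenever an alert is added.
func (h *HistoryManager) OnAlert(cb AlertCallback) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.callbacks = append(h.callbacks, cb)
}

// AddAlert stores the alert, then invokes callbacks after the lock has been
// released so that a callback may call back into the manager.
func (h *HistoryManager) AddAlert(a Alert) {
	h.mu.Lock()
	h.history = append(h.history, a)
	cbs := make([]AlertCallback, len(h.callbacks))
	copy(cbs, h.callbacks) // snapshot taken under the lock
	h.mu.Unlock()

	for _, cb := range cbs {
		cb(a)
	}
}

// Len takes the same mutex; calling it from a callback would deadlock if
// callbacks ran while AddAlert still held the lock.
func (h *HistoryManager) Len() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return len(h.history)
}

func main() {
	h := &HistoryManager{}
	h.OnAlert(func(a Alert) {
		fmt.Println(a.Type, h.Len())
	})
	h.AddAlert(Alert{ID: "1", Type: "memory_warning"})
}
```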
rcourtman
e76e86b298 feat(ai): Add failure pattern detection for predictive intelligence (Phase 5)
Create internal/ai/patterns package:

1. Pattern Detector (detector.go):
   - Records historical events (high memory, OOM, restarts, etc.)
   - Detects recurring failure patterns
   - Calculates average interval between occurrences
   - Computes confidence based on pattern consistency
   - Predicts when failures will occur again
   - Persists to ai_patterns.json

2. Event types tracked:
   - high_memory, high_cpu, disk_full
   - oom, restart, unresponsive
   - backup_failed

3. Integration:
   - Wire PatternDetector into router startup
   - Add to AI context in buildEnrichedContext
   - FormatForContext generates failure predictions

Example AI context now includes:
'OOM events typically occur every ~10 days (next expected in ~3 days)'

This enables proactive alerts before problems recur.

All tests passing.
2025-12-12 14:11:28 +00:00
rcourtman
4abce54d0b docs: Update AI architecture doc with implemented phases
Mark Phases 1-4 as complete:
- Phase 1: Historical Context Integration 
- Phase 2: Anomaly Detection 
- Phase 3: Operational Memory 
- Phase 4: Remediation Integration 

Update future phases (5 & 6) with remaining work.

The AI moat is now built: trends, baselines, anomaly detection,
change tracking, and remediation learning are all operational.
2025-12-12 14:03:50 +00:00
rcourtman
6a8745c7b3 feat(ai): Log command executions and show remediation history in prompts
Phase 4 - Remediation logging integration:

1. logRemediation hook after tool execution:
   - Only logs run_command tools (main remediation action)
   - Records resourceID, resourceType, findingID
   - Extracts problem summary from user prompt
   - Truncates output for storage (max 1000 chars)
   - Distinguishes automatic (patrol) vs manual (chat) actions

2. buildRemediationContext for system prompts:
   - Shows 'Past Successful Fixes for Similar Issues' section
   - Uses keyword matching to find relevant past fixes
   - Shows 'Remediation History for This Resource' section
   - Includes timestamps and outcomes

This enables the AI to say things like:
- 'This worked before: apt clean to free 6GB (resolved)'
- 'Last time on this resource: restarted nginx (resolved)'

All tests passing.
2025-12-12 14:02:14 +00:00
rcourtman
c63d7828a0 feat(ai): Wire operational memory into router startup
Complete Phase 3 integration:

- Initialize ChangeDetector and RemediationLog in StartPatrol
- Add SetChangeDetector/SetRemediationLog to handler chain:
  Router -> AISettingsHandler -> Service -> PatrolService
- Persist change history to ai_changes.json
- Persist remediation log to ai_remediations.json
- Both use the Pulse config directory for storage

Operational memory is now fully integrated:
- Change detector tracks infrastructure changes on each patrol
- Recent changes (24h) are appended to AI context
- Remediation log ready for command execution logging

All tests passing.
2025-12-12 13:54:38 +00:00
rcourtman
58e7091666 feat(ai): Wire change detection into patrol service
Integrate operational memory into patrol context:

- Add changeDetector and remediationLog fields to PatrolService
- Add SetChangeDetector and SetRemediationLog methods
- Integrate change detection into buildEnrichedContext
- Convert state to ResourceSnapshots for change tracking
- Append recent changes summary to AI context

The AI now sees a 'Recent Infrastructure Changes (24h)' section
showing events like:
- VM 'web-server' status changed: running → stopped (2h ago)
- 'db-server' migrated from node1 to node2 (4h ago)
- 'web-server' memory increased: 4 GB → 8 GB (1d ago)

All tests passing.
2025-12-12 13:53:04 +00:00
rcourtman
7ed985a690 feat(ai): Add operational memory (Phase 3) - change detection and remediation logging
Phase 3 of Pulse AI differentiation:

Create internal/ai/memory package with:

1. Change Detection (changes.go):
   - Tracks infrastructure changes: creation, deletion, config changes
   - Detects status changes (started, stopped)
   - Detects VM/container migrations between nodes
   - Detects CPU/memory configuration changes
   - Detects backup completions
   - Persists change history to ai_changes.json
   - GetChangesSummary for AI context

2. Remediation Logging (remediation.go):
   - Records actions taken to fix problems
   - Tracks command, output, and outcome
   - Links to AI findings via findingID
   - GetSimilar finds past similar problems
   - GetSuccessfulRemediations for learning
   - Persists to ai_remediations.json

3. Type exports (memory_exports.go):
   - Clean re-exports from ai package

This enables the AI to say things like:
- 'This VM was migrated 2 hours ago'
- 'Memory was increased from 4GB to 8GB yesterday'
- 'Last time this happened, restarting nginx resolved it'

All tests passing.
2025-12-12 13:49:37 +00:00
rcourtman
0d539ed0a5 Make AI cost sparklines more informative 2025-12-12 13:37:30 +00:00
rcourtman
e1b37c8acc fix(ui): Show single-point sparkline as horizontal line
When there's only 1 day of AI usage data, the sparkline was invisible
because a single point draws at x=0 with no width. Now draws a
horizontal line across the full width so users can see the value.

This happens when AI has just been enabled and there's only one
day of recorded usage so far.
2025-12-12 13:21:54 +00:00
rcourtman
21abb6ef01 Clarify AI cost estimates with pricing coverage 2025-12-12 13:19:03 +00:00
rcourtman
4aea5ed730 Unify provider/model normalization for AI cost export 2025-12-12 13:04:42 +00:00
rcourtman
c598069da3 Add AI cost export and top target rollups 2025-12-12 12:55:39 +00:00
rcourtman
b014cc9418 Move AI cost budget setting into AI settings 2025-12-12 12:23:06 +00:00
rcourtman
39eb94e067 Backup AI usage history on reset 2025-12-12 12:14:13 +00:00
rcourtman
54a3c3c47d Persist AI cost budget and allow history reset 2025-12-12 12:10:58 +00:00
rcourtman
b3f283e7f5 Improve AI cost dashboard ranges and breakdowns 2025-12-12 11:35:41 +00:00
rcourtman
8310974634 feat(ai): Wire baseline learning loop into router startup
Complete Phase 2 baseline integration:

- Add baseline_exports.go for clean type aliasing
- Wire baseline store initialization into StartPatrol
- Implement startBaselineLearning background loop
  - Runs initial learning after 5 min delay
  - Updates baselines every hour from metrics history
  - Learns from 7 days of data for nodes, VMs, containers
- Add SetBaselineStore methods throughout the chain
  (Router -> AIHandler -> Service -> PatrolService)
- Persists baselines to data directory as JSON

The baseline learning loop:
1. Starts automatically when AI patrol starts
2. Queries metrics history for all resources
3. Computes mean, stddev, percentiles for cpu/memory/disk
4. Saves baselines to disk for durability
5. Anomaly detection uses these baselines in context builder

All tests passing.
2025-12-12 11:29:47 +00:00
rcourtman
5a77fab633 feat(ai): Add baseline learning and anomaly detection (Phase 2)
Phase 2 of Pulse AI differentiation:

- Create internal/ai/baseline package for learned baselines
- Implement statistical baseline learning with mean, stddev, percentiles
- Add z-score based anomaly detection with severity classification
  (low, medium, high, critical based on standard deviations)
- Integrate baseline provider into context builder
- Wire baseline store into patrol service with adapters
- Add anomaly enrichment to resource contexts

Key features:
- Learn computes baseline from historical metric data points
- IsAnomaly and CheckAnomaly detect deviations from normal
- Persists baselines to disk as JSON for durability
- Formatted anomaly descriptions for AI consumption
  Example: 'Memory is high above normal (85.2% vs typical 42.1% ± 8.3%)'

The baseline store needs to be initialized and triggered to learn
from metrics history. Next step is adding the learning loop.

All tests passing.
2025-12-12 11:26:31 +00:00
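The z-score approach from 5a77fab633 can be sketched as below. `Learn` and `CheckAnomaly` are named in the commit, but the signatures here are guesses, and the severity cut-offs (2, 2.5, 3, and 4 standard deviations) are placeholders rather than the values the real package uses. Percentiles are omitted.

```go
package main

import (
	"fmt"
	"math"
)

// Baseline holds the learned normal range for one metric.
type Baseline struct {
	Mean   float64
	StdDev float64
}

// Learn computes a baseline from historical data points using the
// population standard deviation.
func Learn(points []float64) (Baseline, bool) {
	if len(points) < 2 {
		return Baseline{}, false
	}
	var sum float64
	for _, p := range points {
		sum += p
	}
	mean := sum / float64(len(points))
	var sq float64
	for _, p := range points {
		sq += (p - mean) * (p - mean)
	}
	return Baseline{Mean: mean, StdDev: math.Sqrt(sq / float64(len(points)))}, true
}

// CheckAnomaly returns a severity label ("" when the value is normal) and
// the z-score. The thresholds are assumptions made for this sketch.
func (b Baseline) CheckAnomaly(value float64) (severity string, z float64) {
	if b.StdDev == 0 {
		return "", 0 // no spread learned, so deviations cannot be scored
	}
	z = (value - b.Mean) / b.StdDev
	switch a := math.Abs(z); {
	case a >= 4:
		return "critical", z
	case a >= 3:
		return "high", z
	case a >= 2.5:
		return "medium", z
	case a >= 2:
		return "low", z
	}
	return "", z
}

func main() {
	// Values taken from the example line in the commit message.
	b := Baseline{Mean: 42.1, StdDev: 8.3}
	sev, z := b.CheckAnomaly(85.2)
	fmt.Printf("severity=%s z=%.1f\n", sev, z)
}
```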
rcourtman
5f81a451d4 Add 1d range to AI cost dashboard 2025-12-12 11:09:44 +00:00
rcourtman
60c980a921 Show AI cost refresh errors and harden log redaction 2025-12-12 11:05:24 +00:00
rcourtman
716a0b8c4d Fix DeepSeek cost attribution and pricing 2025-12-12 10:49:56 +00:00
rcourtman
50c171e3b5 Add estimated USD to AI cost dashboard 2025-12-12 10:43:07 +00:00
rcourtman
4221592458 Add AI usage dashboard 2025-12-12 09:59:59 +00:00
rcourtman
88d419dd5b feat(ai): Add enriched context with historical trends and predictions
Phase 1 of Pulse AI differentiation:

- Create internal/ai/context package with types, trends, builder, formatter
- Implement linear regression for trend computation (growing/declining/stable/volatile)
- Add storage capacity predictions (predicts days until 90% and 100%)
- Wire MetricsHistory from monitor to patrol service
- Update patrol to use buildEnrichedContext instead of basic summary
- Update patrol prompt to reference trend indicators and predictions

This gives the AI awareness of historical patterns, enabling it to:
- Identify resources with concerning growth rates
- Predict capacity exhaustion before it happens
- Distinguish between stable high usage vs growing problems
- Provide more actionable, time-aware insights

All tests passing. Falls back to basic summary if metrics history unavailable.
2025-12-12 09:45:57 +00:00
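For the trend and capacity prediction described in 88d419dd5b, a least-squares fit plus a linear extrapolation is enough to illustrate the idea. `fitLine` and `daysUntil` are hypothetical helpers, the sample data is invented, and the real package also classifies trends (growing/declining/stable/volatile), which is not shown.

```go
package main

import "fmt"

// fitLine returns the least-squares slope and intercept of ys over xs.
func fitLine(xs, ys []float64) (slope, intercept float64, ok bool) {
	if len(xs) < 2 || len(xs) != len(ys) {
		return 0, 0, false
	}
	n := float64(len(xs))
	var sumX, sumY, sumXY, sumXX float64
	for i := range xs {
		sumX += xs[i]
		sumY += ys[i]
		sumXY += xs[i] * ys[i]
		sumXX += xs[i] * xs[i]
	}
	denom := n*sumXX - sumX*sumX
	if denom == 0 {
		return 0, 0, false // all samples share one x value
	}
	slope = (n*sumXY - sumX*sumY) / denom
	intercept = (sumY - slope*sumX) / n
	return slope, intercept, true
}

// daysUntil extrapolates linearly from the current usage to a threshold.
// It reports false when usage is not growing or is already at the threshold.
func daysUntil(threshold, current, slopePerDay float64) (float64, bool) {
	if slopePerDay <= 0 || current >= threshold {
		return 0, false
	}
	return (threshold - current) / slopePerDay, true
}

func main() {
	days := []float64{0, 1, 2, 3, 4}
	usedPct := []float64{70, 72, 74, 76, 78} // invented storage usage samples
	slope, _, ok := fitLine(days, usedPct)
	if !ok {
		return
	}
	current := usedPct[len(usedPct)-1]
	d90, _ := daysUntil(90, current, slope)
	d100, _ := daysUntil(100, current, slope)
	fmt.Printf("growth %.1f%%/day, 90%% in %.0f days, 100%% in %.0f days\n", slope, d90, d100)
}
```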
rcourtman
cbb89c4b6a feat: Docker agent retry, UI column improvements, and IP tooltip enhancements
- Add exponential backoff retry for Docker agent startup (main.go)
- Fix Docker resource/image column widths with proper truncation
- Unify IP tooltip styling across hosts and guests with detailed network info
- Improve column visibility defaults and sticky column handling
- Various component refinements for Dashboard, Storage, and Backups views
2025-12-12 08:26:36 +00:00
rcourtman
fcea3c11ee feat(ai): Replace patrol frequency dropdown with custom minutes input
- Changed patrol schedule from preset dropdown to freeform number input
- Users can now set any interval (min 10 minutes, max 7 days, or 0 to disable)
- Added patrol_interval_minutes to API request/response (preset is now deprecated)
- Backend validates: min 10 minutes when enabled, max 10080 (7 days)
- Frontend shows human-readable duration next to input (e.g., '6h', '2h 30m')

Also improved Auto-Fix Mode safety:
- Removed '(recommended)' from preset options (was subjective)
- Added 'I understand the risks' acknowledgement checkbox
- Toggle is disabled until user explicitly acknowledges the risks
- Shows prominent warning when Auto-Fix is enabled
- Acknowledgement is session-based (must re-acknowledge on page reload)
2025-12-11 23:24:33 +00:00
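The limits stated in fcea3c11ee (0 disables, otherwise 10 to 10080 minutes) translate into a small validation function. The function names and error wording below are hypothetical. `formatInterval` mimics the '6h' / '2h 30m' display examples, although the commit places that formatting in the frontend, and it does not attempt a days unit.

```go
package main

import "fmt"

const (
	minPatrolMinutes = 10
	maxPatrolMinutes = 10080 // 7 days
)

// validatePatrolInterval accepts 0 (patrol disabled) or any value between
// the minimum and maximum, inclusive.
func validatePatrolInterval(minutes int) error {
	switch {
	case minutes == 0:
		return nil
	case minutes < minPatrolMinutes:
		return fmt.Errorf("patrol interval must be at least %d minutes (or 0 to disable), got %d",
			minPatrolMinutes, minutes)
	case minutes > maxPatrolMinutes:
		return fmt.Errorf("patrol interval must be at most %d minutes, got %d",
			maxPatrolMinutes, minutes)
	}
	return nil
}

// formatInterval renders minutes as hours and minutes, e.g. "6h" or "2h 30m".
func formatInterval(minutes int) string {
	if minutes <= 0 {
		return "disabled"
	}
	h, m := minutes/60, minutes%60
	switch {
	case h == 0:
		return fmt.Sprintf("%dm", m)
	case m == 0:
		return fmt.Sprintf("%dh", h)
	default:
		return fmt.Sprintf("%dh %dm", h, m)
	}
}

func main() {
	for _, v := range []int{0, 5, 150, 360, 20000} {
		fmt.Println(v, formatInterval(v), validatePatrolInterval(v))
	}
}
```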
rcourtman
fa13919987 fix(ai-chat): Display messages chronologically in AI chatbot
- Add 'content' type to StreamDisplayEvent for tracking text chunks
- Track content events in streamEvents array for chronological display
- Update render to use Switch/Match for cleaner conditional rendering
- Interleave thinking, tool calls, and content as they stream in
- Add fallback for old messages without streamEvents for backwards compat

Previously, tool/command outputs stayed at top while AI text responses
accumulated at the bottom. Now all events appear in order like a
normal chatbot.
2025-12-11 23:02:59 +00:00
rcourtman
37c6654daa fix: Remove hardcoded model names from UI to prevent staleness
- Remove model references from provider labels ('OpenAI' not 'OpenAI (GPT-4)')
- Remove DEFAULT_MODELS usage in form initialization
- Use generic placeholders instead of specific model names
- Models are now fetched dynamically from each provider's API
- UI won't become outdated when new models are released
2025-12-11 18:38:59 +00:00
rcourtman
7e121eeb15 feat: Improve AI settings status indicator
- Show number of configured providers and available models
- Display friendly model name (without provider prefix)
- Better status message: 'Ready • 1 provider • 10 models'
2025-12-11 18:30:04 +00:00
rcourtman
15033cb345 feat: Add refresh models button to AI settings
- Adds 'Refresh Models' button next to Default Model label
- Spinning icon animation during loading
- Allows manual refresh after configuring new providers
2025-12-11 18:28:33 +00:00
rcourtman
b3f3fc95c4 feat: Add clear credentials button for each AI provider
- Add clear_anthropic_key, clear_openai_key, clear_deepseek_key, clear_ollama_url flags to API
- Backend handles clearing with confirmation prompt
- Each provider accordion shows Test and Clear buttons when configured
- Clear button requires confirmation before removing credentials
- Frontend automatically refreshes settings after clearing
2025-12-11 18:24:25 +00:00
rcourtman
df2e36e5e4 feat: Add per-provider test buttons and documentation links
- Add /api/ai/test/{provider} endpoint for testing individual providers
- Add 'Test' button to each provider accordion (visible when configured)
- Shows test result inline (success/error message)
- Update help links with direct URLs to API key pages:
  - Anthropic: console.anthropic.com/settings/keys
  - OpenAI: platform.openai.com/api-keys
  - DeepSeek: platform.deepseek.com/api_keys
  - Ollama: ollama.ai
2025-12-11 18:11:31 +00:00
rcourtman
d078f5f0f6 fix: Ollama should only show as configured when URL is explicitly set
Previously Ollama always showed as 'Available' even if not set up.
Now it only shows as configured when user has entered an OllamaBaseURL.
2025-12-11 17:12:01 +00:00
rcourtman
e842f523b7 feat: Implement multi-provider AI support
Backend:
- Add per-provider API key fields to AIConfig (AnthropicAPIKey, OpenAIAPIKey, DeepSeekAPIKey, OllamaBaseURL, OpenAIBaseURL)
- Add NewForProvider() and NewForModel() factory functions for multi-provider instantiation
- Update ListModels() to aggregate models from all configured providers with provider:model format
- Update Execute/ExecuteStream to dynamically create provider based on selected model
- Update TestConnection to use multi-provider aware provider creation
- Add helper functions: HasProvider(), GetConfiguredProviders(), GetAPIKeyForProvider(), GetBaseURLForProvider(), ParseModelString(), FormatModelString()

Frontend:
- Remove legacy single-provider UI (provider grid, single API key input, single base URL)
- Add accordion-style UI for configuring all providers independently
- Add model grouping by provider in selectors using optgroup
- Update AIChat model dropdown with grouped provider sections
- Add helper functions for parsing provider from model ID and grouping models

API:
- Add multi-provider fields to AISettingsResponse and AISettingsUpdateRequest
- Add /api/ai/models endpoint for dynamic model listing
- Update settings handlers for per-provider credential management
2025-12-11 16:00:45 +00:00
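The `provider:model` format introduced in e842f523b7 implies a pair of helpers like the ones below. `ParseModelString` and `FormatModelString` are named in the commit, but these signatures are guesses. Splitting on the first colon is an assumption chosen so that model names which themselves contain a colon survive a round trip; an unprefixed name that contains a colon would be misread, which this sketch does not try to solve.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseModelString splits "provider:model" on the first colon. A string
// without a colon is returned as a bare model with an empty provider.
func ParseModelString(s string) (string, string) {
	provider, model, found := strings.Cut(s, ":")
	if !found {
		return "", s
	}
	return provider, model
}

// FormatModelString joins a provider and model; an empty provider yields
// the bare model name.
func FormatModelString(provider, model string) string {
	if provider == "" {
		return model
	}
	return provider + ":" + model
}

func main() {
	// Placeholder identifiers, not real model names.
	for _, id := range []string{"openai:some-model", "ollama:some-model:8b", "bare-model"} {
		p, m := ParseModelString(id)
		fmt.Printf("provider=%q model=%q roundtrip=%q\n", p, m, FormatModelString(p, m))
	}
}
```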
rcourtman
c8b0438894 feat(ai): Add Suppression Rules UI to Alerts page
Users can now:
1. View all active suppression rules in a collapsible section
2. Add new rules manually with resource ID, category, and description
3. Delete rules to re-enable alerts
4. See whether rules came from dismissed findings or were manually created

The UI shows:
- 🔇 Suppression Rules (N active) header with expand/collapse
- + Add Rule button to open the form
- Each rule shows resource, category, origin (Manual/From Finding), and description
- Delete button to remove rules
2025-12-11 00:15:35 +00:00
rcourtman
40236317fb feat(ai): Add suppression rules management API and UI
Users can now:
1. View all suppression rules (both from dismissed findings and manually created)
2. Create manual rules like 'ignore performance issues on debian-go'
3. Delete rules when they want alerts to come back

Backend:
- Added SuppressionRule type for user-defined rules
- Added suppressionRules storage to FindingsStore
- Added AddSuppressionRule/GetSuppressionRules/DeleteSuppressionRule methods
- Added isSuppressedInternal check for manual rules
- Added API handlers and routes for /api/ai/patrol/suppressions

Frontend:
- Added SuppressionRule interface
- Added getSuppressionRules/addSuppressionRule/deleteSuppressionRule API functions
- Added getDismissedFindings for viewing dismissed findings

Example usage:
POST /api/ai/patrol/suppressions
{
  "resource_id": "debian-go",
  "category": "performance",
  "description": "Dev container runs hot - expected"
}
2025-12-11 00:12:18 +00:00
rcourtman
33af1627f4 fix(ai): Make LLM finding IDs stable across patrol runs
The main issue was that finding IDs included the title, which the LLM
generates differently each time. 'High CPU on minipc' vs 'Node minipc
experiencing high CPU load' got different IDs, making dismissals useless.

Changes:
1. LLM findings now get IDs based on resource+category only, not title
2. Add() now checks if finding is suppressed before adding as new
3. Add() now checks dismissed findings and only reactivates on severity escalation
4. IsSuppressed() now matches by resource+category only, not title
5. Added isSuppressedInternal() for use when lock is already held

Now when you dismiss 'performance issues on minipc', any future patrol finding
about performance on minipc will be recognized as the same issue and stay dismissed.
2025-12-11 00:03:17 +00:00
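The fix in 33af1627f4 amounts to deriving the finding ID from fields the LLM cannot rephrase. One way to do that is sketched below; `findingID` is a hypothetical helper, and the SHA-256 hash, NUL separator, and 16-character truncation are all assumptions rather than the scheme the project uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// findingID derives an ID from resource and category only. The title is
// deliberately not an input, so rewording by the LLM cannot change the ID.
func findingID(resourceID, category string) string {
	// The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
	sum := sha256.Sum256([]byte(resourceID + "\x00" + category))
	return hex.EncodeToString(sum[:8])
}

func main() {
	// Two differently worded titles for the same underlying issue.
	titles := []string{"High CPU on minipc", "Node minipc experiencing high CPU load"}
	for _, title := range titles {
		fmt.Println(findingID("minipc", "performance"), title)
	}
}
```

Both lines print the same ID, so a dismissal recorded against the first wording also covers the second.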
rcourtman
9a32c4fdae fix(ai): Use context.Background() for forced patrol runs
The ForcePatrol() function was using the HTTP request context, which gets
cancelled immediately when the API response is sent. This caused LLM analysis
to fail with 'context canceled' before it could complete.

Now uses context.Background() so the goroutine runs independently of the
HTTP request lifecycle.

Also fixed dropdown hover gap issue in the dismiss menu.
2025-12-10 23:31:21 +00:00
rcourtman
04cb8bc964 feat(ai): Add per-resource notes to patrol context
Knowledge store notes are now included in the patrol LLM prompt. When users
save notes about resources (e.g., 'This VM intentionally runs hot'), the patrol
AI will see these notes and avoid flagging documented behavior as issues.

Changes:
- Added knowledge store reference to PatrolService
- Added SetKnowledgeStore() method to configure the store
- Enhanced buildPatrolPrompt() to include knowledge context
- Connected knowledge store to patrol in service.go SetStateProvider()

This complements the dismissed findings context to give the LLM a complete
picture of what the user considers normal/expected behavior.
2025-12-10 23:03:01 +00:00
rcourtman
7350e64f3e feat(ai): Add LLM memory system for patrol findings
Implements a comprehensive feedback system that allows the LLM to 'remember'
user decisions about findings, preventing repetitive/annoying alerts.

Backend changes:
- Extended Finding struct with dismissed_reason, user_note, times_raised, suppressed
- Added Dismiss(), Suppress(), SetUserNote(), IsSuppressed() methods to FindingsStore
- Added GetDismissedForContext() to format dismissed findings for LLM context
- Enhanced buildPatrolPrompt() to inject user feedback context
- Added POST /api/ai/patrol/dismiss and /api/ai/patrol/suppress endpoints
- Updated IsActive() to exclude suppressed findings

Frontend changes:
- Added Dismiss dropdown with options: Not an Issue, Expected Behavior, Will Fix Later
- Added Never Alert Again option for permanent suppression
- Expected Behavior prompts for optional note to help LLM understand context
- Added visual badges: recurrence count (×N), dismissed status, suppressed indicator
- Display user notes in expanded finding view

Also fixes:
- Fixed 403 error on Run Patrol (compilation errors from partial refactoring)
- Removed non-LLM patrol checks - patrol now uses LLM analysis only
- Fixed function signature mismatches in alert_triggered.go

The LLM now receives context about previously dismissed findings and is
instructed not to re-raise them unless severity has significantly worsened.
2025-12-10 22:55:34 +00:00
rcourtman
1e3fdb6f63 feat(ai): Enhanced AI patrol system with alert triggers and history persistence
- Add alert-triggered AI analysis for real-time incident response
- Implement patrol history persistence across restarts
- Add patrol schedule configuration UI in AI Settings
- Enhance AIChat with patrol status and manual trigger controls
- Add resource store improvements for AI context building
- Expand Alerts page with AI-powered analysis integration
- Add Vite proxy config for AI API endpoints
- Support both Anthropic and OpenAI providers with streaming
2025-12-10 21:08:22 +00:00