Commit graph

400 commits

Author SHA1 Message Date
rcourtman
7f05d87809 fix: add missing HandleLicenseFeatures method and related changes
- Add HandleLicenseFeatures handler that was missing from license_handlers.go
- Add /api/license/features route to router
- Update AI service and metadata provider
- Update frontend license API and components
- Fix CI build failure caused by tests referencing unimplemented method
2025-12-19 22:59:52 +00:00
rcourtman
65e38fac91 test: improve test coverage for AI, license, config, and monitoring packages
New test files:
- internal/ai/providers/gemini_test.go: Comprehensive Gemini provider tests
- internal/api/ai_intelligence_handlers_test.go: AI intelligence endpoint tests
- internal/api/ai_patrol_handlers_test.go: AI patrol endpoint tests
- internal/api/license_handlers_test.go: License API handler tests
- internal/api/security_oidc_response_test.go: OIDC response formatting tests
- internal/config/ai_config_test.go: AI configuration function tests
- internal/config/persistence_ai_test.go: AI config persistence tests
- internal/config/persistence_extended_test.go: Extended persistence tests
- internal/license/persistence_test.go: License persistence tests
- internal/license/pubkey_test.go: Public key handling tests
- internal/monitoring/host_agent_temps_test.go: Temperature processing tests

Enhanced existing files:
- internal/api/updates_test.go: Added update handler tests
- internal/license/license_test.go: Added Service method tests

Coverage improvements:
- ai/providers: 57.3% -> 73.0% (+15.7%)
- license: 78.3% -> 85.9% (+7.6%)
- config: 49.7% -> 53.9% (+4.2%)
- monitoring: 49.8% -> 50.8% (+1.0%)
- api: 28.4% -> 29.8% (+1.4%)
2025-12-19 22:49:30 +00:00
rcourtman
1d64b4c31a fix: show Removed Docker Hosts section in UI for re-enrollment
The 'Removed Docker Hosts' section was not appearing in Settings -> Agents
even when hosts were blocked from re-enrolling. This prevented users from
using the 'Allow re-enroll' button to unblock their Docker agents.

Root cause: The WebSocket store was missing:
1. The 'removedDockerHosts' property in its initial state
2. A handler to process removedDockerHosts data from WebSocket messages

This meant the backend was correctly sending the data, but the frontend
was completely ignoring it.

Changes:
- Add removedDockerHosts to WebSocket store initial state and message handler
- Add removedDockerHosts to App.tsx fallback state for consistency
- Add missing BroadcastState call after AllowDockerHostReenroll succeeds

Also includes previous fixes from this session:
- Add PULSE_AGENT_URL as alias for PULSE_AGENT_CONNECT_URL (config.go)
- Add runtime Docker/Podman auto-detection in pulse-agent (main.go)

Fixes issue reported by darthrater78 in discussion #845
2025-12-19 17:57:04 +00:00
rcourtman
3a9df35ae1 fix(ai): improve patrol timing accuracy and status reporting 2025-12-19 17:04:14 +00:00
rcourtman
4d1138793d feat(license): add initial license implementation structure to fix build 2025-12-19 17:01:57 +00:00
rcourtman
13af682ce1 fix(config): add PULSE_AGENT_CONNECT_URL and improve Docker detection
- Add AgentConnectURL config option to override public URL for agents
- Improve install.sh to diagnose docker detection failures
- Update router to prioritize AgentConnectURL for agent install commands
2025-12-19 16:43:14 +00:00
rcourtman
a93148105f fix: exclude WebSocket from rate limiting to prevent UI lockout
The /ws endpoint was rate limited to 30 connections/minute. After
prolonged use with WebSocket reconnections (network hiccups, browser
tab throttling, etc.), users with many Docker containers would hit
this limit and get stuck with a 'Connecting...' UI.

WebSocket connections are already authenticated via session/API token
and reconnections are normal behavior, so rate limiting is not needed.

Fixes #859 (second report about WebSocket rate limiting after hours of use).
2025-12-19 14:51:52 +00:00
rcourtman
16f143d925 fix: respect X-Forwarded-Proto header for hasHTTPS in /api/security/status
Fixes issue where /api/security/status reports hasHTTPS=false when accessed
via HTTPS through a reverse proxy like Caddy.

Resolves feedback from discussion #845 (clar2242).
2025-12-19 14:40:23 +00:00
rcourtman
0d11da74e2 refactor(ui): standardize URL editing with shared UrlEditPopover component
- Create reusable UrlEditPopover component with fixed positioning
- Add createUrlEditState hook for managing editing state
- Update DockerHostSummaryTable to use new popover
- Update DockerUnifiedTable (containers & services) to use new popover
- Update GuestRow (Proxmox VMs/containers) to use new popover
- Update HostsOverview (Proxmox hosts) to use new popover
- Add Docker host metadata API for custom URLs
- Consistent styling with save, delete, cancel buttons and keyboard shortcuts
2025-12-18 22:22:55 +00:00
rcourtman
65829983b5 v5: gate legacy sensor-proxy and prune dev docs 2025-12-18 21:51:25 +00:00
rcourtman
0d6aaff253 fix: AI Patrol frequency not obeying settings
Fixes #858

The patrol interval setting was not being properly applied due to:

1. ReconfigurePatrol() was setting the deprecated QuickCheckInterval field
   instead of the preferred Interval field

2. SetConfig() was comparing raw field values instead of using GetInterval()
   to compare effective intervals, causing change detection to fail

3. The API response was missing interval_ms, preventing the frontend from
   displaying the correct interval

Changes:
- Update StartPatrol() and ReconfigurePatrol() to use the Interval field
- Fix SetConfig() to use GetInterval() for interval comparison
- Add IntervalMs to PatrolStatusResponse and include it in the API response
2025-12-18 21:33:50 +00:00
rcourtman
2b48b0a459 feat: add --kube-include-all-deployments flag for Kubernetes agent
Adds IncludeAllDeployments option to show all deployments, not just
problem ones (where replicas don't match desired). This provides parity
with the existing --kube-include-all-pods flag.

- Add IncludeAllDeployments to kubernetesagent.Config
- Add --kube-include-all-deployments flag and PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS env var
- Update collectDeployments to respect the new flag
- Add test for IncludeAllDeployments functionality
- Update UNIFIED_AGENT.md documentation

Addresses feedback from PR #855
2025-12-18 20:58:30 +00:00
rcourtman
c91307be94 fix: guest URL icon now appears/disappears immediately after AI sets/removes it
The issue was a SolidJS reactivity problem in the Dashboard component.
When guestMetadata signal was accessed inside a For loop callback and
assigned to a plain variable, SolidJS lost reactive tracking.

Changed from:
  const metadata = guestMetadata()[guestId] || ...
  customUrl={metadata?.customUrl}

To:
  const getMetadata = () => guestMetadata()[guestId] || ...
  customUrl={getMetadata()?.customUrl}

This ensures SolidJS properly tracks the signal dependency when the
getter function is called directly in JSX props.
2025-12-18 14:42:47 +00:00
rcourtman
3623395549 test(api): allow printable alert IDs for acknowledge (Related to #852) 2025-12-17 20:09:51 +00:00
rcourtman
c4b893e257 Fix agent download serving wrong architecture binary
When a specific architecture is requested (e.g., linux-arm64), don't fall
back to the generic pulse-agent binary if the requested arch isn't found.
This was causing ARM64 machines to receive x86-64 binaries that can't run.

Now returns 404 with helpful error message if requested architecture
binary is not available.
2025-12-17 17:22:51 +00:00
rcourtman
969fa0e509 test: add unit tests for AI, Kubernetes agent, and clients 2025-12-17 12:47:36 +00:00
rcourtman
e157aa04ad fix: Allow printable ASCII in alert IDs for acknowledge requests
Reverts overly strict alert ID validation that was rejecting valid
alert IDs containing special characters. Docker host IDs can contain
user-supplied data like hostnames which may include parentheses,
brackets, or other printable ASCII characters.

The previous validation only allowed alphanumeric + limited punctuation,
which caused 400 errors when acknowledging alerts from Docker hosts
with special characters in their identifiers.

Related to #852
2025-12-16 20:10:31 +00:00
rcourtman
dc156d097c fix: Retry-After header now uses actual limiter window
Previously the Retry-After header was hardcoded to "60" seconds
regardless of the rate limiter's actual window duration. Now uses
the limiter's configured window (e.g., 600 seconds for recovery
endpoints, 300 for exports).

Related to #579
2025-12-16 10:07:16 +00:00
rcourtman
47ced7c97e feat(ui): make AI settings page more compact and user-friendly
- Replace verbose info banner with streamlined layout
- Add collapsible 'Advanced Model Selection' accordion for Chat/Patrol models
- Make AI Patrol Settings section collapsible with inline summary badges
- Compact Cost Controls into single-row inline layout
- Reduce form spacing for tighter presentation
- Remove unused formHelpText import

Also includes:
- OpenAI provider fixes for max_tokens parameters
- Security setup CSRF and 401 fixes
- Minor UI tweaks
2025-12-16 09:20:09 +00:00
rcourtman
3d6da91ac0 feat(ai): improve AI settings first-time setup UX
- Add setup modal that appears when enabling AI without configured provider
- Modal allows selecting provider (Anthropic, OpenAI, DeepSeek, Ollama)
- Enter API key/URL and enable AI in one smooth flow
- Reorder backend to apply API keys before enabled check
- Fix Ollama to strip 'ollama:' prefix from model names
- Simplify backend error message for unconfigured providers
2025-12-15 18:59:19 +00:00
rcourtman
fed0a9d89b fix(ai): allow enabling AI when any provider is configured
The enable validation was using the legacy single-provider model which
checked settings.Provider and settings.APIKey. Users configuring Ollama
via the new multi-provider UI (setting ollama_base_url) couldn't enable
AI because settings.Provider defaulted to "anthropic" which required an
API key.

Now checks GetConfiguredProviders() first - if any provider is configured
(Anthropic, OpenAI, DeepSeek, or Ollama), AI can be enabled.

Related to #847
2025-12-15 09:43:17 +00:00
rcourtman
397871629c fix: cluster-aware guest deduplication and multi-agent token binding
- Add cluster-aware guest ID generation (clusterName-VMID instead of instanceName-VMID)
  to prevent duplicate VMs/containers when multiple cluster nodes are monitored

- Add cluster deduplication at registration time - when a node is added that belongs
  to an already-configured cluster, merge as endpoint instead of creating duplicate

- Add startup consolidation to automatically merge duplicate cluster instances

- Change host agent token binding from agent GUID to hostname, allowing:
  - Multiple host agents to share a token (each bound by hostname)
  - Agent reinstalls on same host without token conflicts

- Remove 12-character password minimum requirement

- Remove emoji from auto-registration success message

- Fix grouped view node lookup to support both cluster-aware node IDs
  (clusterName-nodeName) and legacy guest grouping keys (instance-nodeName)

Fixes duplicate guests appearing when agents are installed on multiple
cluster nodes. Also improves multi-agent UX by allowing shared tokens.
2025-12-14 10:16:17 +00:00
rcourtman
1dc573e209 feat: support token query parameter for WebSocket authentication
Allows WebSocket connections to authenticate via ?token= query parameter
when headers cannot be sent (browser WebSocket limitations).
2025-12-13 21:28:50 +00:00
rcourtman
8919281718 fix: clear agents that connected during unauthenticated setup window
When no auth is configured (fresh install), CheckAuth allows all requests.
This creates a race condition where existing agents from a previous setup
can report data before the wizard completes security configuration.

This fix clears all host agents and docker hosts when /api/security/quick-setup
is called, ensuring the wizard shows a clean state after security is configured.

Added:
- State.ClearAllHosts() - removes all host agents
- State.ClearAllDockerHosts() - removes all docker hosts
- Monitor.ClearUnauthenticatedAgents() - clears both and resets token bindings
- Call to ClearUnauthenticatedAgents() in handleQuickSecuritySetupFixed()
2025-12-13 21:22:04 +00:00
rcourtman
5c4069cdbf feat: add persistent metrics history API endpoint
- Add GET /api/metrics-store/history endpoint for querying SQLite-backed metrics
- Support flexible time ranges: 1h, 6h, 12h, 24h, 7d, 30d, 90d
- Return aggregated data with min/max values for longer time ranges
- Add TypeScript types and ChartsAPI.getMetricsHistory() client method

This enables frontend charts to visualize long-term trends using the
tiered retention system (raw → minute → hourly → daily averages).
2025-12-13 14:18:16 +00:00
rcourtman
a259b67348 feat: add Kubernetes platform support 2025-12-12 21:31:11 +00:00
rcourtman
8b077f69ce feat: AI security and policy improvements for 5.0
- Add DOMPurify sanitization for AI chat markdown rendering (XSS fix)
- Configure DOMPurify to add target=_blank and rel=noopener to links
- Update system prompt to align with command approval policy
- Clarify safe vs destructive commands in prompt
- Improve patrol auto-fix mode guidance with safe operation list
- Add verification requirements for auto-fix actions
- Update observe-only mode to be clearer about read-only restrictions
2025-12-12 17:38:55 +00:00
rcourtman
6f0379f879 feat(api): Add AI intelligence API endpoints
Expose learned AI intelligence data via REST API:

New endpoints:
- GET /api/ai/intelligence/patterns - Detected failure patterns
- GET /api/ai/intelligence/predictions - Failure predictions
- GET /api/ai/intelligence/correlations - Resource correlations
- GET /api/ai/intelligence/changes - Recent infrastructure changes
- GET /api/ai/intelligence/baselines - Learned baselines

All endpoints support ?resource_id filter for per-resource queries.
Changes endpoint supports ?hours filter (default: 24).

Backend additions:
- ai_intelligence_handlers.go - Handler implementations
- baseline.Store.GetAllBaselines() - Flat baseline export
- patrol.GetChangeDetector() - Access change detector

This enables frontend to display:
- 'OOM expected in 3 days based on pattern'
- 'When storage-1 is full, database VM restarts'
- 'VM memory baseline: 60-75%'

All tests passing.
2025-12-12 14:49:46 +00:00
rcourtman
d36ad0945f feat(settings): Add separate Auto-Fix Model setting for remediation
Add configurable model specifically for automatic remediation actions:

Backend (internal/config/ai.go):
- Add AutoFixModel field to AIConfig
- Add GetAutoFixModel() getter with fallback chain:
  AutoFixModel -> PatrolModel -> Model

Frontend (AISettings.tsx, types/ai.ts):
- Add auto_fix_model to AISettings types
- Add Auto-Fix Model dropdown (only shows when patrol_auto_fix enabled)
- Falls back to patrol model if not set

API (ai_handlers.go):
- Add auto_fix_model to response and update request
- Handle saving/loading the new field

Rationale:
- Auto-fix takes real actions, may warrant a more capable model
- Patrol observation can use cheaper models for cost savings
- Gives users granular control over model costs vs reliability
- Model hierarchy: Chat > AutoFix > Patrol > Default
2025-12-12 14:35:28 +00:00
rcourtman
9539ddaa6b feat(ai): Add multi-resource correlation detection (Phase 6)
Create internal/ai/correlation package:

1. Correlation Detector (detector.go):
   - Tracks events across resources
   - Detects when events on one resource follow events on another
   - Calculates average delay between correlated events
   - Confidence scoring based on occurrence count
   - Persists to ai_correlations.json

2. Features:
   - GetCorrelations() - All detected relationships
   - GetCorrelationsForResource() - Relationships for one resource
   - GetDependencies() - What resources depend on this one
   - GetDependsOn() - What this resource depends on
   - PredictCascade() - Predict what will be affected
   - FormatForContext() - AI-consumable summary

3. Integration:
   - Wire to alert history in router startup
   - Map alert types to correlation event types
   - Add correlation context to enriched AI context

Example AI context now includes:
'When local-zfs experiences high usage, database often follows within 5 minutes'

This enables the AI to understand infrastructure dependencies
and predict cascade failures.

All tests passing.
2025-12-12 14:26:10 +00:00
rcourtman
9c92bb49df feat(ai): Wire alert history to pattern detector for event tracking
Connect alert system to failure prediction:

1. Add AlertCallback to HistoryManager:
   - OnAlert() method to register callbacks
   - Callbacks invoked when alerts are added
   - Called outside lock to prevent deadlocks

2. Expose OnAlertHistory() on alerts.Manager:
   - Pass-through to HistoryManager.OnAlert()
   - Enables external systems to track alerts

3. Wire pattern detector in router startup:
   - Register callback when pattern detector is created
   - Convert alert types to trackable events
   - Pattern detector now learns from production alerts

Now every alert (memory_warning, cpu_critical, etc.) is recorded as
a historical event for pattern analysis. The AI can predict:
'High memory usage typically occurs every ~3 days (next expected in ~1 day)'

All tests passing.
2025-12-12 14:16:03 +00:00
rcourtman
e76e86b298 feat(ai): Add failure pattern detection for predictive intelligence (Phase 5)
Create internal/ai/patterns package:

1. Pattern Detector (detector.go):
   - Records historical events (high memory, OOM, restarts, etc.)
   - Detects recurring failure patterns
   - Calculates average interval between occurrences
   - Computes confidence based on pattern consistency
   - Predicts when failures will occur again
   - Persists to ai_patterns.json

2. Event types tracked:
   - high_memory, high_cpu, disk_full
   - oom, restart, unresponsive
   - backup_failed

3. Integration:
   - Wire PatternDetector into router startup
   - Add to AI context in buildEnrichedContext
   - FormatForContext generates failure predictions

Example AI context now includes:
'OOM events typically occurs every ~10 days (next expected in ~3 days)'

This enables proactive alerts before problems recur.

All tests passing.
2025-12-12 14:11:28 +00:00
rcourtman
c63d7828a0 feat(ai): Wire operational memory into router startup
Complete Phase 3 integration:

- Initialize ChangeDetector and RemediationLog in StartPatrol
- Add SetChangeDetector/SetRemediationLog to handler chain:
  Router -> AISettingsHandler -> Service -> PatrolService
- Persist change history to ai_changes.json
- Persist remediation log to ai_remediations.json
- Both use the Pulse config directory for storage

Operational memory is now fully integrated:
- Change detector tracks infrastructure changes on each patrol
- Recent changes (24h) are appended to AI context
- Remediation log ready for command execution logging

All tests passing.
2025-12-12 13:54:38 +00:00
rcourtman
21abb6ef01 Clarify AI cost estimates with pricing coverage 2025-12-12 13:19:03 +00:00
rcourtman
4aea5ed730 Unify provider/model normalization for AI cost export 2025-12-12 13:04:42 +00:00
rcourtman
c598069da3 Add AI cost export and top target rollups 2025-12-12 12:55:39 +00:00
rcourtman
39eb94e067 Backup AI usage history on reset 2025-12-12 12:14:13 +00:00
rcourtman
54a3c3c47d Persist AI cost budget and allow history reset 2025-12-12 12:10:58 +00:00
rcourtman
8310974634 feat(ai): Wire baseline learning loop into router startup
Complete Phase 2 baseline integration:

- Add baseline_exports.go for clean type aliasing
- Wire baseline store initialization into StartPatrol
- Implement startBaselineLearning background loop
  - Runs initial learning after 5 min delay
  - Updates baselines every hour from metrics history
  - Learns from 7 days of data for nodes, VMs, containers
- Add SetBaselineStore methods throughout the chain
  (Router -> AIHandler -> Service -> PatrolService)
- Persists baselines to data directory as JSON

The baseline learning loop:
1. Starts automatically when AI patrol starts
2. Queries metrics history for all resources
3. Computes mean, stddev, percentiles for cpu/memory/disk
4. Saves baselines to disk for durability
5. Anomaly detection uses these baselines in context builder

All tests passing.
2025-12-12 11:29:47 +00:00
rcourtman
60c980a921 Show AI cost refresh errors and harden log redaction 2025-12-12 11:05:24 +00:00
rcourtman
88d419dd5b feat(ai): Add enriched context with historical trends and predictions
Phase 1 of Pulse AI differentiation:

- Create internal/ai/context package with types, trends, builder, formatter
- Implement linear regression for trend computation (growing/declining/stable/volatile)
- Add storage capacity predictions (predicts days until 90% and 100%)
- Wire MetricsHistory from monitor to patrol service
- Update patrol to use buildEnrichedContext instead of basic summary
- Update patrol prompt to reference trend indicators and predictions

This gives the AI awareness of historical patterns, enabling it to:
- Identify resources with concerning growth rates
- Predict capacity exhaustion before it happens
- Distinguish between stable high usage vs growing problems
- Provide more actionable, time-aware insights

All tests passing. Falls back to basic summary if metrics history unavailable.
2025-12-12 09:45:57 +00:00
rcourtman
fcea3c11ee feat(ai): Replace patrol frequency dropdown with custom minutes input
- Changed patrol schedule from preset dropdown to freeform number input
- Users can now set any interval (min 10 minutes, max 7 days, or 0 to disable)
- Added patrol_interval_minutes to API request/response (preset is now deprecated)
- Backend validates: min 10 minutes when enabled, max 10080 (7 days)
- Frontend shows human-readable duration next to input (e.g., '6h', '2h 30m')

Also improved Auto-Fix Mode safety:
- Removed '(recommended)' from preset options (was subjective)
- Added 'I understand the risks' acknowledgement checkbox
- Toggle is disabled until user explicitly acknowledges the risks
- Shows prominent warning when Auto-Fix is enabled
- Acknowledgement is session-based (must re-acknowledge on page reload)
2025-12-11 23:24:33 +00:00
rcourtman
b3f3fc95c4 feat: Add clear credentials button for each AI provider
- Add clear_anthropic_key, clear_openai_key, clear_deepseek_key, clear_ollama_url flags to API
- Backend handles clearing with confirmation prompt
- Each provider accordion shows Test and Clear buttons when configured
- Clear button requires confirmation before removing credentials
- Frontend automatically refreshes settings after clearing
2025-12-11 18:24:25 +00:00
rcourtman
df2e36e5e4 feat: Add per-provider test buttons and documentation links
- Add /api/ai/test/{provider} endpoint for testing individual providers
- Add 'Test' button to each provider accordion (visible when configured)
- Shows test result inline (success/error message)
- Update help links with direct URLs to API key pages:
  - Anthropic: console.anthropic.com/settings/keys
  - OpenAI: platform.openai.com/api-keys
  - DeepSeek: platform.deepseek.com/api_keys
  - Ollama: ollama.ai
2025-12-11 18:11:31 +00:00
rcourtman
e842f523b7 feat: Implement multi-provider AI support
Backend:
- Add per-provider API key fields to AIConfig (AnthropicAPIKey, OpenAIAPIKey, DeepSeekAPIKey, OllamaBaseURL, OpenAIBaseURL)
- Add NewForProvider() and NewForModel() factory functions for multi-provider instantiation
- Update ListModels() to aggregate models from all configured providers with provider:model format
- Update Execute/ExecuteStream to dynamically create provider based on selected model
- Update TestConnection to use multi-provider aware provider creation
- Add helper functions: HasProvider(), GetConfiguredProviders(), GetAPIKeyForProvider(), GetBaseURLForProvider(), ParseModelString(), FormatModelString()

Frontend:
- Remove legacy single-provider UI (provider grid, single API key input, single base URL)
- Add accordion-style UI for configuring all providers independently
- Add model grouping by provider in selectors using optgroup
- Update AIChat model dropdown with grouped provider sections
- Add helper functions for parsing provider from model ID and grouping models

API:
- Add multi-provider fields to AISettingsResponse and AISettingsUpdateRequest
- Add /api/ai/models endpoint for dynamic model listing
- Update settings handlers for per-provider credential management
2025-12-11 16:00:45 +00:00
rcourtman
40236317fb feat(ai): Add suppression rules management API and UI
Users can now:
1. View all suppression rules (both from dismissed findings and manually created)
2. Create manual rules like 'ignore performance issues on debian-go'
3. Delete rules when they want alerts to come back

Backend:
- Added SuppressionRule type for user-defined rules
- Added suppressionRules storage to FindingsStore
- Added AddSuppressionRule/GetSuppressionRules/DeleteSuppressionRule methods
- Added isSuppressedInternal check for manual rules
- Added API handlers and routes for /api/ai/patrol/suppressions

Frontend:
- Added SuppressionRule interface
- Added getSuppressionRules/addSuppressionRule/deleteSuppressionRule API functions
- Added getDismissedFindings for viewing dismissed findings

Example usage:
POST /api/ai/patrol/suppressions
{
  'resource_id': 'debian-go',
  'category': 'performance',
  'description': 'Dev container runs hot - expected'
}
2025-12-11 00:12:18 +00:00
rcourtman
7350e64f3e feat(ai): Add LLM memory system for patrol findings
Implements a comprehensive feedback system that allows the LLM to 'remember'
user decisions about findings, preventing repetitive/annoying alerts.

Backend changes:
- Extended Finding struct with dismissed_reason, user_note, times_raised, suppressed
- Added Dismiss(), Suppress(), SetUserNote(), IsSuppressed() methods to FindingsStore
- Added GetDismissedForContext() to format dismissed findings for LLM context
- Enhanced buildPatrolPrompt() to inject user feedback context
- Added POST /api/ai/patrol/dismiss and /api/ai/patrol/suppress endpoints
- Updated IsActive() to exclude suppressed findings

Frontend changes:
- Added Dismiss dropdown with options: Not an Issue, Expected Behavior, Will Fix Later
- Added Never Alert Again option for permanent suppression
- Expected Behavior prompts for optional note to help LLM understand context
- Added visual badges: recurrence count (×N), dismissed status, suppressed indicator
- Display user notes in expanded finding view

Also fixes:
- Fixed 403 error on Run Patrol (compilation errors from partial refactoring)
- Removed non-LLM patrol checks - patrol now uses LLM analysis only
- Fixed function signature mismatches in alert_triggered.go

The LLM now receives context about previously dismissed findings and is
instructed not to re-raise them unless severity has significantly worsened.
2025-12-10 22:55:34 +00:00
rcourtman
1e3fdb6f63 feat(ai): Enhanced AI patrol system with alert triggers and history persistence
- Add alert-triggered AI analysis for real-time incident response
- Implement patrol history persistence across restarts
- Add patrol schedule configuration UI in AI Settings
- Enhance AIChat with patrol status and manual trigger controls
- Add resource store improvements for AI context building
- Expand Alerts page with AI-powered analysis integration
- Add Vite proxy config for AI API endpoints
- Support both Anthropic and OpenAI providers with streaming
2025-12-10 21:08:22 +00:00
rcourtman
ae7b66ecff refactor(ai): Remove over-engineered URL discovery service
Keep only the simple AI-powered approach:
- set_resource_url tool lets AI save discovered URLs
- Users ask AI directly: 'Find URLs for my containers'
- AI uses its intelligence to discover and set URLs

Removed:
- URLDiscoveryService (rigid port scanning)
- Bulk discovery API endpoints
- Frontend discovery button

The AI itself is smart enough to iterate through resources
and discover URLs when asked.
2025-12-10 08:35:24 +00:00
rcourtman
5a8791d97d feat(ai): Add bulk URL discovery service
- Add URLDiscoveryService for scanning all resources at once
- Scans common web ports (80, 443, 8080, 8096, 3000, etc.)
- Automatically saves discovered URLs to resource metadata
- Add API endpoints for start/status/cancel discovery
- Progress tracking with results reporting

Endpoints:
- POST /api/ai/discover-urls/start - Start bulk discovery
- GET /api/ai/discover-urls/status - Check progress
- POST /api/ai/discover-urls/cancel - Cancel discovery
2025-12-10 08:30:56 +00:00