Commit graph

11 commits

Author SHA1 Message Date
rcourtman
d2604a6859 test: add AI memory regression coverage 2026-02-04 19:46:20 +00:00
rcourtman
7f7edfceb4 test: expand backend coverage 2026-01-25 21:08:44 +00:00
rcourtman
27f1a11acb feat: add AI Intelligence system with investigation and forecasting
Major new AI capabilities for infrastructure monitoring:

Investigation System:
- Autonomous finding investigation with configurable autonomy levels
- Investigation orchestrator with rate limiting and guardrails
- Safety checks for read-only mode enforcement
- Chat-based investigation with approval workflows

Forecasting & Remediation:
- Trend forecasting for resource capacity planning
- Remediation engine for generating fix proposals
- Circuit breaker for AI operation protection

Unified Findings:
- Unified store bridging alerts and AI findings
- Correlation and root cause analysis
- Incident coordinator with metrics recording

New Frontend:
- AI Intelligence page with patrol controls
- Investigation drawer for finding details
- Unified findings panel with actions

Supporting Infrastructure:
- Learning store for user preference tracking
- Proxmox event ingestion and correlation
- Enhanced patrol with investigation triggers
2026-01-24 22:41:43 +00:00
rcourtman
3fdf753a5b Enhance devcontainer and CI workflows
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 22:29:15 +00:00
rcourtman
a4611739a9 fix: Hosts page not updating in real-time (SolidJS reactivity bug)
Fixed a critical reactivity bug in HostsOverview.tsx where the HostRow
component was destructuring props.host in the function body. In SolidJS,
this breaks reactivity because the destructured value is a static snapshot
captured at component creation time.

Changes:
- Removed 'const { host } = props' destructuring in HostRow
- Changed all 'host.' references to 'props.host.' to maintain reactivity
- Converted cpuPercent and diskStats to reactive getters (functions)
- Added documentation comment explaining why destructuring breaks reactivity

This fixes Issue #949 where CPU, memory, and disk values on the Hosts
page would stay stale until manual page refresh.

Related to #949
2025-12-29 11:45:45 +00:00
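
Note on the reactivity fix above: the stale-value behaviour can be reproduced without SolidJS. This is a minimal sketch in plain TypeScript, assuming only that reactive props behave like an object whose `host` property is a getter. `makeProps`, `update`, and the `Host` shape are hypothetical stand-ins, not code from HostsOverview.tsx.

```typescript
// Hypothetical stand-in for a reactive props object: `host` is a getter,
// so every read of `props.host` returns the latest value.
interface Host {
  name: string;
  cpu: number;
}

function makeProps(initial: Host) {
  let current = initial;
  const props = {
    get host(): Host {
      return current;
    },
  };
  // Simulates the parent pushing a new host object down to the row.
  const update = (next: Host): void => {
    current = next;
  };
  return { props, update };
}

const { props, update } = makeProps({ name: "pve1", cpu: 10 });

// Broken pattern: destructuring calls the getter once and keeps a snapshot.
const { host } = props;
const staleCpu = (): number => host.cpu;

// Fixed pattern: go through `props` on every read, wrapped in a function.
const liveCpu = (): number => props.host.cpu;

update({ name: "pve1", cpu: 85 });

console.log(staleCpu()); // 10 (snapshot taken before the update)
console.log(liveCpu()); // 85 (reads the current host)
```

The fixed accessor is a function for the same reason the commit converts cpuPercent and diskStats to getters: the value is looked up again each time it is read.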
rcourtman
db5e79bb37 fix: Allow Host Agent thresholds to be set to 0 to disable alerting. Related to #864 2025-12-20 20:25:20 +00:00
rcourtman
54fc259221 fix(ai): improve AI settings UX with validation and smart fallbacks
Backend:
- Add smart provider fallback when selected model's provider isn't configured
- Automatically switch to a model from a configured provider instead of failing
- Log warning when fallback occurs for visibility

Frontend (AISettings.tsx):
- Add helper functions to check if model's provider is configured
- Group model dropdown: configured providers first, unconfigured marked with ⚠️
- Add inline warning when selecting model from unconfigured provider
- Validate on save that model's provider is configured (or being added)
- Warn before clearing last configured provider (would disable AI)
- Warn before clearing provider that current model uses
- Add patrol interval validation (must be 0 or >= 10 minutes)
- Show red border + inline error for invalid patrol intervals 1-9
- Update patrol interval hint: '(0=off, 10+ to enable)'

These changes prevent confusing '500 Internal Server Error' and
'AI is not enabled or configured' errors when model/provider mismatch.
2025-12-17 18:30:19 +00:00
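
Note on the two rules described above: the patrol interval check (0 or at least 10 minutes) and the provider fallback can be sketched as below. This is an illustration under stated assumptions, not the code from AISettings.tsx or the backend. The function names, the `AiModel` shape, the error strings, and the rejection of negative or fractional values are all hypothetical.

```typescript
// Hypothetical model descriptor; the real project may represent models differently.
interface AiModel {
  id: string;
  provider: string;
}

// Returns an error message, or null when the interval is acceptable.
// From the commit message: 0 means off, 1-9 is invalid, 10+ enables patrol.
// Rejecting negative and fractional values is an assumption of this sketch.
function validatePatrolInterval(minutes: number): string | null {
  if (!Number.isInteger(minutes) || minutes < 0) {
    return "Patrol interval must be a whole number of minutes (0 or 10+)";
  }
  if (minutes > 0 && minutes < 10) {
    return "Patrol interval must be 0 (off) or at least 10 minutes";
  }
  return null;
}

// Keeps the selected model when its provider is configured; otherwise falls
// back to the first candidate whose provider is configured and logs a warning.
// Returns null when no configured provider offers any model.
function resolveModel(
  selected: AiModel,
  candidates: AiModel[],
  configuredProviders: Set<string>,
): AiModel | null {
  if (configuredProviders.has(selected.provider)) {
    return selected;
  }
  const fallback =
    candidates.find((m) => configuredProviders.has(m.provider)) ?? null;
  if (fallback !== null) {
    console.warn(
      `provider "${selected.provider}" is not configured; using ${fallback.id} instead`,
    );
  }
  return fallback;
}

// Usage with placeholder provider and model names.
const candidateModels: AiModel[] = [
  { id: "model-a", provider: "provider-x" },
  { id: "model-b", provider: "provider-y" },
];
const resolved = resolveModel(
  candidateModels[0],
  candidateModels,
  new Set(["provider-y"]),
);
console.log(resolved?.id); // model-b
console.log(validatePatrolInterval(5) !== null); // true (1-9 is rejected)
```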
rcourtman
b79d04f734 Add comprehensive AI test coverage
- Add integration tests for Ollama provider (17 tests against real API)
- Add unit tests for baseline, correlation, patterns, memory, knowledge, cost packages
- Add context formatter and builder tests
- Add factory tests for provider initialization
- Add Makefile targets: test-integration, test-all
- Clean up test theatre (removed struct field tests)

Integration tests require Ollama at OLLAMA_URL (default: 192.168.0.124:11434)
Run with: make test-integration
2025-12-16 12:33:06 +00:00
rcourtman
7acff2215c style: remove emojis from AI context formatting and prompts
Replaced emoji indicators with text equivalents for better cross-platform
compatibility and cleaner LLM prompts.
2025-12-13 21:26:49 +00:00
rcourtman
8b077f69ce feat: AI security and policy improvements for 5.0
- Add DOMPurify sanitization for AI chat markdown rendering (XSS fix)
- Configure DOMPurify to add target=_blank and rel=noopener to links
- Update system prompt to align with command approval policy
- Clarify safe vs destructive commands in prompt
- Improve patrol auto-fix mode guidance with safe operation list
- Add verification requirements for auto-fix actions
- Update observe-only mode to be clearer about read-only restrictions
2025-12-12 17:38:55 +00:00
rcourtman
9539ddaa6b feat(ai): Add multi-resource correlation detection (Phase 6)
Create internal/ai/correlation package:

1. Correlation Detector (detector.go):
   - Tracks events across resources
   - Detects when events on one resource follow events on another
   - Calculates average delay between correlated events
   - Confidence scoring based on occurrence count
   - Persists to ai_correlations.json

2. Features:
   - GetCorrelations() - All detected relationships
   - GetCorrelationsForResource() - Relationships for one resource
   - GetDependencies() - What resources depend on this one
   - GetDependsOn() - What this resource depends on
   - PredictCascade() - Predict what will be affected
   - FormatForContext() - AI-consumable summary

3. Integration:
   - Wire to alert history in router startup
   - Map alert types to correlation event types
   - Add correlation context to enriched AI context

Example AI context now includes:
'When local-zfs experiences high usage, database often follows within 5 minutes'

This enables the AI to understand infrastructure dependencies
and predict cascade failures.

All tests passing.
2025-12-12 14:26:10 +00:00
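
Note on the correlation detection described above: the idea of "events on one resource follow events on another" can be sketched roughly as follows. The real package is Go (detector.go) and its internals are not shown in this log. The TypeScript below is only an illustration, and the 10-minute window, the class and method names, and the `count / (count + 2)` confidence formula are assumptions made for the example.

```typescript
interface ResourceEvent {
  resourceId: string;
  timestampMs: number;
}

interface Correlation {
  source: string; // resource whose event came first
  target: string; // resource whose event followed
  occurrences: number;
  avgDelayMs: number;
  confidence: number;
}

interface PairStats {
  source: string;
  target: string;
  count: number;
  totalDelayMs: number;
}

class CorrelationSketch {
  private readonly windowMs: number;
  private readonly events: ResourceEvent[] = [];
  private readonly stats = new Map<string, PairStats>();

  // Assumed default: a follow-up counts only if it lands within 10 minutes.
  constructor(windowMs: number = 10 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  // Pairs the new event with every earlier event on a different resource
  // that happened within the window. Old events are never pruned here.
  record(ev: ResourceEvent): void {
    for (const prev of this.events) {
      const delay = ev.timestampMs - prev.timestampMs;
      if (prev.resourceId === ev.resourceId) continue;
      if (delay <= 0 || delay > this.windowMs) continue;
      const key = `${prev.resourceId}->${ev.resourceId}`;
      const s = this.stats.get(key) ?? {
        source: prev.resourceId,
        target: ev.resourceId,
        count: 0,
        totalDelayMs: 0,
      };
      s.count += 1;
      s.totalDelayMs += delay;
      this.stats.set(key, s);
    }
    this.events.push(ev);
  }

  correlations(): Correlation[] {
    return Array.from(this.stats.values()).map((s) => ({
      source: s.source,
      target: s.target,
      occurrences: s.count,
      avgDelayMs: s.totalDelayMs / s.count,
      // Assumed scoring: grows with occurrence count and approaches 1.
      confidence: s.count / (s.count + 2),
    }));
  }
}

// Usage: local-zfs has an event, database follows 5 minutes later, twice.
const minute = 60 * 1000;
const detector = new CorrelationSketch();
detector.record({ resourceId: "local-zfs", timestampMs: 0 });
detector.record({ resourceId: "database", timestampMs: 5 * minute });
detector.record({ resourceId: "local-zfs", timestampMs: 60 * minute });
detector.record({ resourceId: "database", timestampMs: 65 * minute });

const found = detector.correlations();
console.log(found.length); // 1
console.log(found[0].occurrences, found[0].avgDelayMs / minute); // 2 5
```

A production detector would also need to bound the event history and persist its state, which the commit message says the Go implementation does via ai_correlations.json.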