Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-04-28 11:30:15 +00:00

Author	SHA1	Message	Date
rcourtman	95a0d7a6bd	feat(backend): implement AI Patrol, Investigation, and system-wide refactors	2026-01-30 19:02:14 +00:00
rcourtman	27f1a11acb	feat: add AI Intelligence system with investigation and forecasting Major new AI capabilities for infrastructure monitoring: Investigation System: - Autonomous finding investigation with configurable autonomy levels - Investigation orchestrator with rate limiting and guardrails - Safety checks for read-only mode enforcement - Chat-based investigation with approval workflows Forecasting & Remediation: - Trend forecasting for resource capacity planning - Remediation engine for generating fix proposals - Circuit breaker for AI operation protection Unified Findings: - Unified store bridging alerts and AI findings - Correlation and root cause analysis - Incident coordinator with metrics recording New Frontend: - AI Intelligence page with patrol controls - Investigation drawer for finding details - Unified findings panel with actions Supporting Infrastructure: - Learning store for user preference tracking - Proxmox event ingestion and correlation - Enhanced patrol with investigation triggers	2026-01-24 22:41:43 +00:00
rcourtman	3fdf753a5b	Enhance devcontainer and CI workflows - Add persistent volume mounts for Go/npm caches (faster rebuilds) - Add shell config with helpful aliases and custom prompt - Add comprehensive devcontainer documentation - Add pre-commit hooks for Go formatting and linting - Use go-version-file in CI workflows instead of hardcoded versions - Simplify docker compose commands with --wait flag - Add gitignore entries for devcontainer auth files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 22:29:15 +00:00
rcourtman	46a7b7d10a	docs: Add prominent VERSION file warning to GEMINI.md - Add critical release section at top of file - Make VERSION file requirement impossible to miss - Explain that version_guard job will fail if VERSION doesn't match	2025-12-29 10:11:38 +00:00
rcourtman	b65f42acb5	chore: AI patrol and baseline improvements - Enhanced patrol finding display in Alerts.tsx - Improved baseline store with better error handling - Added clean thinking test coverage - Updated patrol logic for better finding management	2025-12-22 23:12:11 +00:00
rcourtman	185f1ef682	Improve AI test coverage - baseline/store_test.go: Add tests for CheckResourceAnomalies, formatAnomalyDescription, formatRatio, GetAllAnomalies, floatToStr (67.9% -> 92.2%) - memory/incidents_test.go: Add tests for RecordAlertUnacknowledged, RecordRunbook, ListIncidentsByResource, FormatForAlert, FormatForResource, FormatForPatrol (66.8% -> 81.1%) - intelligence_test.go: Add tests for SetStateProvider, FormatGlobalContext, RecordLearning, severityOrder, CheckBaselinesForResource with baselines (61.4% -> 63.1%)	2025-12-21 20:22:47 +00:00
rcourtman	b90076f086	feat(ai): implement metric-specific anomaly thresholds Smarter anomaly detection to reduce false positives: Learning Window: 7 days → 14 days - Captures weekly patterns (weekday vs weekend) Metric-Specific Thresholds: CPU: - Only report if usage >70% AND >2x baseline - Low CPU variance (5% vs 10%) is not actionable Memory: - Report if >80% OR (>1.5x baseline AND >60%) - Memory is more stable, lower threshold makes sense Disk: - Report if >85% usage OR +15 percentage points growth - Disk problems are critical, use absolute thresholds Other metrics: - Use 2x threshold as default This dramatically reduces 'noise' anomalies while catching actual problems that need operator attention.	2025-12-21 17:31:30 +00:00
rcourtman	3c4999ea48	fix(ai): raise anomaly threshold to 2x, filter 'Ran diagnostic' noise More aggressive noise filtering: 1. Anomaly threshold raised from 1.5x to 2x - 1.5x is too borderline to be actionable - Now requires genuinely significant deviation 2. Filter out 'Ran diagnostic' and 'Executed command' fallback items - These are generic summaries that provide no value - Only show remediations with specific, meaningful descriptions Goal: If something shows in AI Intelligence, it should demand attention.	2025-12-21 17:19:32 +00:00
rcourtman	4b25b84678	fix(ai): filter out noise from anomalies and status changes Critical changes to surface only actionable insights: 1. Anomalies now require at least 50% deviation from baseline - '1.0x baseline' values filtered out (statistically significant but not actionable) - Must be >1.5x above OR <0.5x below baseline to report 2. Status changes filter out startup noise - 'unknown → running' is just system starting, not a real state change - Backups removed from main list (they have dedicated section) 3. Only show genuinely interesting changes: - Config changes, migrations, restarts, deletions - Things that require operator attention This massively reduces noise while keeping high-signal alerts.	2025-12-21 17:15:44 +00:00
rcourtman	b752773924	test(baseline): add tests for trend prediction Add comprehensive tests for CalculateTrend function: - TestCalculateTrend_InsufficientData: <5 samples returns nil - TestCalculateTrend_IncreasingTrend: detects critical/warning trends - TestCalculateTrend_DecreasingTrend: correctly identifies declining usage - TestCalculateTrend_StableTrend: stable patterns return DaysToFull=-1 - TestFormatDays: human-readable time formatting	2025-12-21 11:31:58 +00:00
rcourtman	6bb46eeb34	feat(ai): enhance intelligence status and add trend prediction AIStatusIndicator: - Now shows BOTH patrol findings AND baseline anomalies - Displays even when only anomaly detection is active (no patrol) - Badge count includes both findings + anomalies - Tooltip provides detailed breakdown by severity Trend Prediction (backend): - Add TrendPrediction struct for resource exhaustion forecasting - CalculateTrend() uses linear regression on sample history - Predicts days until resource is full (or if declining/stable) - Severity: critical (<7 days), warning (<30 days), info (>30 days) - Human-readable descriptions like 'full in ~2 weeks (+0.5% per day)' This creates a more cohesive intelligence experience where anomaly detection works independently of the pro/patrol features, making value visible immediately to all users.	2025-12-21 11:29:44 +00:00
rcourtman	d9f1f7accd	feat(ai): add real-time anomaly detection endpoint Add /api/ai/intelligence/anomalies endpoint that compares live metrics against learned baselines to surface deviations - all deterministic (no LLM required). Backend: - Add AnomalyReport struct with severity classification - Add CheckResourceAnomalies method to baseline store - Add HandleGetAnomalies API handler - Add GetStateProvider getter to AI service Frontend: - Add AnomalyReport and AnomaliesResponse types - Add getAnomalies API function - Add AnomalySeverity type This is the first step toward surfacing deterministic intelligence directly in the UI without requiring LLM interaction.	2025-12-21 10:52:54 +00:00
rcourtman	b79d04f734	Add comprehensive AI test coverage - Add integration tests for Ollama provider (17 tests against real API) - Add unit tests for baseline, correlation, patterns, memory, knowledge, cost packages - Add context formatter and builder tests - Add factory tests for provider initialization - Add Makefile targets: test-integration, test-all - Clean up test theatre (removed struct field tests) Integration tests require Ollama at OLLAMA_URL (default: 192.168.0.124:11434) Run with: make test-integration	2025-12-16 12:33:06 +00:00
rcourtman	6f0379f879	feat(api): Add AI intelligence API endpoints Expose learned AI intelligence data via REST API: New endpoints: - GET /api/ai/intelligence/patterns - Detected failure patterns - GET /api/ai/intelligence/predictions - Failure predictions - GET /api/ai/intelligence/correlations - Resource correlations - GET /api/ai/intelligence/changes - Recent infrastructure changes - GET /api/ai/intelligence/baselines - Learned baselines All endpoints support ?resource_id filter for per-resource queries. Changes endpoint supports ?hours filter (default: 24). Backend additions: - ai_intelligence_handlers.go - Handler implementations - baseline.Store.GetAllBaselines() - Flat baseline export - patrol.GetChangeDetector() - Access change detector This enables frontend to display: - 'OOM expected in 3 days based on pattern' - 'When storage-1 is full, database VM restarts' - 'VM memory baseline: 60-75%' All tests passing.	2025-12-12 14:49:46 +00:00
rcourtman	5a77fab633	feat(ai): Add baseline learning and anomaly detection (Phase 2) Phase 2 of Pulse AI differentiation: - Create internal/ai/baseline package for learned baselines - Implement statistical baseline learning with mean, stddev, percentiles - Add z-score based anomaly detection with severity classification (low, medium, high, critical based on standard deviations) - Integrate baseline provider into context builder - Wire baseline store into patrol service with adapters - Add anomaly enrichment to resource contexts Key features: - Learn computes baseline from historical metric data points - IsAnomaly and CheckAnomaly detect deviations from normal - Persists baselines to disk as JSON for durability - Formatted anomaly descriptions for AI consumption Example: 'Memory is high above normal (85.2% vs typical 42.1% ± 8.3%)' The baseline store needs to be initialized and triggered to learn from metrics history. Next step is adding the learning loop. All tests passing.	2025-12-12 11:26:31 +00:00

15 commits