Commit graph

16 commits

rcourtman
9e339957c6 fix: Update runtime config when toggling Docker update actions setting
The DisableDockerUpdateActions setting was being saved to disk but not
updated in h.config, causing the UI toggle to appear to revert on page
refresh since the API returned the stale runtime value.

Related to #1023
2026-01-03 11:14:17 +00:00
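The bug pattern this commit fixes can be sketched as follows. This is a minimal illustration, not the real Pulse handler: the `Config` field name comes from the commit, but `Handler`, `persist`, and `UpdateSettings` are hypothetical stand-ins.

```go
package main

import "fmt"

// Config mirrors the runtime settings held by the handler; only the
// field named in the commit is real, the rest is illustrative.
type Config struct {
	DisableDockerUpdateActions bool
}

type Handler struct {
	config *Config
}

// persist stands in for the real write-to-disk step.
func persist(c Config) error { return nil }

// UpdateSettings shows the fix: after persisting, the in-memory config
// must be updated too, otherwise subsequent API reads return the stale
// runtime value and the UI toggle appears to revert.
func (h *Handler) UpdateSettings(disable bool) error {
	next := *h.config
	next.DisableDockerUpdateActions = disable
	if err := persist(next); err != nil {
		return err
	}
	h.config.DisableDockerUpdateActions = disable // the previously missing step
	return nil
}

func main() {
	h := &Handler{config: &Config{}}
	_ = h.UpdateSettings(true)
	fmt.Println(h.config.DisableDockerUpdateActions) // true: runtime matches disk
}
```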
rcourtman
3fdf753a5b Enhance devcontainer and CI workflows
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 22:29:15 +00:00
rcourtman
78c3434061 fix: include VMID in AI context to prevent incorrect references
The LLM was confusing VMIDs because they weren't included in the
context. Now the formatted context shows:

  ### Container: ollama (VMID 200) on minipc

This prevents the AI from referencing the wrong VMID when generating
findings and recommendations.
2025-12-21 23:13:47 +00:00
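The heading format above can be reproduced with a small helper. A sketch only: `formatGuestHeading` is a hypothetical name, but the output matches the example in the commit.

```go
package main

import "fmt"

// formatGuestHeading embeds the VMID in the context heading so the LLM
// cannot attribute findings to the wrong guest. Name is illustrative.
func formatGuestHeading(kind, name string, vmid int, node string) string {
	return fmt.Sprintf("### %s: %s (VMID %d) on %s", kind, name, vmid, node)
}

func main() {
	fmt.Println(formatGuestHeading("Container", "ollama", 200, "minipc"))
}
```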
rcourtman
9c58bfa127 perf: reduce MetricSamples from 100 to 24 points
100 samples were causing 326k+ input tokens, which is expensive.
24 samples (hourly resolution) still provides good pattern visibility
while significantly reducing token cost.

Estimated reduction: ~75% fewer metric tokens.
2025-12-21 22:56:19 +00:00
rcourtman
c15f260280 feat: increase MetricSamples to 100 points (~15 min resolution)
Modern LLMs have 100k+ token contexts. 100 samples over 24h gives
~15 minute resolution while adding minimal token overhead.

This lets the LLM see fine-grained patterns, short spikes, and
accurately distinguish anomalies from normal behavior.
2025-12-21 22:25:54 +00:00
rcourtman
d23f1c78de fix: increase MetricSamples to 24 points for hourly resolution
12 samples were too coarse (2-hour intervals could miss spikes).
24 samples gives ~hourly resolution while still being compact.
2025-12-21 22:24:02 +00:00
rcourtman
5877ce00c3 fix: use 24h window for MetricSamples (matches in-memory retention)
The in-memory MetricsHistory only retains 24 hours of data, not 7 days.
Changed computeGuestMetricSamples to use trendWindow24h instead of
trendWindow7d, and reduced sample count from 24 to 12 points.

This ensures the LLM actually receives metric samples in the context,
which wasn't happening before because the 7-day query returned empty data.
2025-12-21 22:19:40 +00:00
rcourtman
f6b1414ed6 debug: add logging to verify MetricSamples population for LLM context 2025-12-21 22:14:54 +00:00
rcourtman
2928fad643 feat(ai): pass raw metric samples to LLM for pattern interpretation
Instead of relying on pre-computed trend heuristics (which can be misleading
for edge cases like step changes vs continuous growth), we now pass downsampled
raw data points to the LLM so it can interpret patterns directly.

Changes:
- Add MetricSamples field to ResourceContext
- Add DownsampleMetrics() to reduce data points for LLM consumption
- Add formatMetricSamples() to format data compactly (e.g., 'Disk: 26→26→31%')
- Add computeGuestMetricSamples() to gather 7-day sampled history
- Populate MetricSamples for VMs and containers during context build
- Add History section to formatted context output

The LLM now sees actual patterns like 'stable for 6 days then jumped' rather
than just '45.8%/day growth rate', allowing much more nuanced interpretation.

This approach:
- Leverages LLM's pattern recognition instead of hard-coded heuristics
- Provides 7 days of data (~24 samples) for context on normal behavior
- Uses minimal tokens due to compact formatting with deduplication
- Is more future-proof as LLMs improve

Example output:
  **History (7d sampled, oldest→newest)**: Disk: 26→26→26→26→26→31%

Refs: Frigate disk usage false positive investigation
2025-12-21 21:09:24 +00:00
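The downsampling and compact formatting described above can be sketched roughly as below. These are illustrative stand-ins for the `DownsampleMetrics()` and `formatMetricSamples()` helpers named in the commit; the exact selection and formatting rules are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// downsample keeps at most n evenly spaced points from a series,
// always retaining the first and last (assumed behavior; n >= 2).
func downsample(points []float64, n int) []float64 {
	if len(points) <= n {
		return points
	}
	out := make([]float64, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, points[i*(len(points)-1)/(n-1)])
	}
	return out
}

// formatSamples renders the compact 'Disk: 26→26→31%' form shown in
// the commit's example output.
func formatSamples(label string, points []float64) string {
	parts := make([]string, len(points))
	for i, p := range points {
		parts[i] = strconv.Itoa(int(p))
	}
	return fmt.Sprintf("%s: %s%%", label, strings.Join(parts, "→"))
}

func main() {
	disk := []float64{26, 26, 26, 26, 26, 31}
	fmt.Println(formatSamples("Disk", downsample(disk, 6)))
	// prints: Disk: 26→26→26→26→26→31%
}
```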
rcourtman
db5e79bb37 fix: Allow Host Agent thresholds to be set to 0 to disable alerting. Related to #864 2025-12-20 20:25:20 +00:00
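The semantics of that fix amount to treating 0 as a sentinel. A minimal sketch, with a hypothetical `shouldAlert` helper (the real threshold logic is more involved):

```go
package main

import "fmt"

// shouldAlert treats a threshold of 0 as "alerting disabled" rather
// than "alert on any value" — the behavior this commit allows.
func shouldAlert(value, threshold float64) bool {
	if threshold == 0 {
		return false // 0 explicitly disables alerting for this metric
	}
	return value >= threshold
}

func main() {
	fmt.Println(shouldAlert(95, 90)) // true: over threshold
	fmt.Println(shouldAlert(95, 0))  // false: disabled
}
```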
rcourtman
0182cc8310 feat(thresholds): add collapsible accordion sections and UX improvements
- Add CollapsibleSection component with animated expand/collapse
- Wrap all 6 resource sections (Nodes, VMs, PBS, Storage, Backups, Snapshots) with accordion UI
- Add section icons and resource counts in headers
- Add expand all / collapse all buttons for quick navigation
- Make help banner dismissible with localStorage persistence
- Add Ctrl/Cmd+F keyboard shortcut to focus search
- Add keyboard shortcut hint badge on search input
- Add icons to tab navigation for quick identification
- Improve mobile tab labels with shorter text on small screens
- Create reusable components: ThresholdBadge, ResourceCard, GlobalDefaultsRow
- Create useCollapsedSections hook with localStorage persistence
- Default less-used sections (Storage, Backups, Snapshots, PBS) to collapsed
2025-12-18 15:47:44 +00:00
rcourtman
b79d04f734 Add comprehensive AI test coverage
- Add integration tests for Ollama provider (17 tests against real API)
- Add unit tests for baseline, correlation, patterns, memory, knowledge, cost packages
- Add context formatter and builder tests
- Add factory tests for provider initialization
- Add Makefile targets: test-integration, test-all
- Clean up test theatre (removed struct field tests)

Integration tests require Ollama at OLLAMA_URL (default: 192.168.0.124:11434)
Run with: make test-integration
2025-12-16 12:33:06 +00:00
rcourtman
7acff2215c style: remove emojis from AI context formatting and prompts
Replaced emoji indicators with text equivalents for better cross-platform
compatibility and cleaner LLM prompts.
2025-12-13 21:26:49 +00:00
rcourtman
6aefeca979 feat: Enhance OCI container display and AI context
- Frontend: Add ociImage memo to extract clean image name from osTemplate
- Frontend: Show OCI image name in type badge tooltip
- Frontend: Display OCI image in OS column when no guest agent info available
- Frontend: Include ociImage in AI context data for selected OCI containers
- Backend: Differentiate OCI containers as 'oci_container' type in AI context
- Backend: Add Metadata field to ResourceContext for extensibility
- Backend: Include oci_image in container metadata for AI analysis
- Backend: Update section heading to 'LXC/OCI Containers' in AI context

This follows Docker container patterns to avoid duplicating work.
2025-12-12 18:00:09 +00:00
rcourtman
5a77fab633 feat(ai): Add baseline learning and anomaly detection (Phase 2)
Phase 2 of Pulse AI differentiation:

- Create internal/ai/baseline package for learned baselines
- Implement statistical baseline learning with mean, stddev, percentiles
- Add z-score based anomaly detection with severity classification
  (low, medium, high, critical based on standard deviations)
- Integrate baseline provider into context builder
- Wire baseline store into patrol service with adapters
- Add anomaly enrichment to resource contexts

Key features:
- Learn computes a baseline from historical metric data points
- IsAnomaly and CheckAnomaly detect deviations from normal
- Persists baselines to disk as JSON for durability
- Formatted anomaly descriptions for AI consumption
  Example: 'Memory is high above normal (85.2% vs typical 42.1% ± 8.3%)'

The baseline store needs to be initialized and triggered to learn
from metrics history. Next step is adding the learning loop.

All tests passing.
2025-12-12 11:26:31 +00:00
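The baseline learning and z-score severity bands can be sketched as below. The z-score cutoffs (2/3/4/6 standard deviations for low/medium/high/critical) are assumptions modeled on the commit text, not the package's actual thresholds.

```go
package main

import (
	"fmt"
	"math"
)

// Baseline holds learned statistics for one metric.
type Baseline struct {
	Mean, StdDev float64
}

// Learn computes mean and (population) standard deviation from
// historical data points.
func Learn(points []float64) Baseline {
	var sum float64
	for _, p := range points {
		sum += p
	}
	mean := sum / float64(len(points))
	var ss float64
	for _, p := range points {
		ss += (p - mean) * (p - mean)
	}
	return Baseline{Mean: mean, StdDev: math.Sqrt(ss / float64(len(points)))}
}

// Severity classifies a deviation by standard deviations from normal;
// the band boundaries here are illustrative.
func (b Baseline) Severity(value float64) string {
	if b.StdDev == 0 {
		return "none"
	}
	z := math.Abs(value-b.Mean) / b.StdDev
	switch {
	case z >= 6:
		return "critical"
	case z >= 4:
		return "high"
	case z >= 3:
		return "medium"
	case z >= 2:
		return "low"
	default:
		return "none"
	}
}

func main() {
	b := Learn([]float64{40, 42, 44, 41, 43})
	// Mirrors the formatted anomaly description in the commit.
	fmt.Printf("Memory is %s above normal (85.0%% vs typical %.1f%% ± %.1f%%)\n",
		b.Severity(85), b.Mean, b.StdDev)
}
```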
rcourtman
88d419dd5b feat(ai): Add enriched context with historical trends and predictions
Phase 1 of Pulse AI differentiation:

- Create internal/ai/context package with types, trends, builder, formatter
- Implement linear regression for trend computation (growing/declining/stable/volatile)
- Add storage capacity predictions (predicts days until 90% and 100%)
- Wire MetricsHistory from monitor to patrol service
- Update patrol to use buildEnrichedContext instead of basic summary
- Update patrol prompt to reference trend indicators and predictions

This gives the AI awareness of historical patterns, enabling it to:
- Identify resources with concerning growth rates
- Predict capacity exhaustion before it happens
- Distinguish between stable high usage vs growing problems
- Provide more actionable, time-aware insights

All tests passing. Falls back to basic summary if metrics history unavailable.
2025-12-12 09:45:57 +00:00
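The trend and prediction machinery above reduces to a least-squares slope plus an extrapolation to a capacity limit. A minimal sketch assuming one sample per day; `slope` and `daysUntil` are illustrative names, not the `internal/ai/context` package's real API:

```go
package main

import "fmt"

// slope fits y = a + b*x by ordinary least squares over evenly
// spaced samples and returns b, the per-step growth rate.
func slope(y []float64) float64 {
	n := float64(len(y))
	var sx, sy, sxy, sxx float64
	for i, v := range y {
		x := float64(i)
		sx += x
		sy += v
		sxy += x * v
		sxx += x * x
	}
	return (n*sxy - sx*sy) / (n*sxx - sx*sx)
}

// daysUntil extrapolates the trend to predict when usage crosses a
// limit (e.g. 90% or 100%), given one sample per day. Returns false
// when the trend is stable or declining.
func daysUntil(y []float64, limit float64) (float64, bool) {
	b := slope(y)
	if b <= 0 {
		return 0, false // no exhaustion predicted
	}
	return (limit - y[len(y)-1]) / b, true
}

func main() {
	usage := []float64{70, 72, 74, 76, 78} // % used, growing 2%/day
	if d, ok := daysUntil(usage, 90); ok {
		fmt.Printf("~%.0f days until 90%%\n", d) // ~6 days
	}
}
```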