53 commits

Author SHA1 Message Date
rcourtman
81fef82bdd Restore RC backend proof regressions 2026-04-09 20:15:17 +01:00
rcourtman
a41c956320 fix(ai): fail closed on unknown v6 read commands 2026-04-09 19:21:16 +01:00
rcourtman
21fa343fa1 Enable structured AI auto-recovery paths 2026-03-31 09:24:56 +01:00
rcourtman
046a0e92c0 Normalize pulse_read native log denial paths 2026-03-31 09:07:57 +01:00
rcourtman
d9d9dd9585 Normalize pulse_query agent and storage floor 2026-03-31 00:26:04 +01:00
rcourtman
ac9375a34b Tighten VMware control wording boundaries 2026-03-30 23:47:38 +01:00
rcourtman
dd5f099cda Lock VMware phase-1 exclusion integrity 2026-03-30 23:42:32 +01:00
rcourtman
56c14ca19f feat(ai): add canonical truenas app config reads 2026-03-29 20:36:43 +01:00
rcourtman
298b23626b feat(ai): add canonical truenas app log reads 2026-03-29 20:13:39 +01:00
rcourtman
b0ba88d541 feat(ai): add canonical truenas app control 2026-03-29 19:50:31 +01:00
rcourtman
a6c0386069 feat(ai): expose canonical truenas resources to pulse query 2026-03-29 18:25:39 +01:00
rcourtman
6e9de3188d fix(ai): expose recovery-backed storage chat path 2026-03-27 08:39:59 +00:00
rcourtman
cc4a48b13e fix(recovery): harden ai storage tool recovery fallbacks 2026-03-26 23:12:27 +00:00
rcourtman
266a504f21 test(recovery): prove ai adapter metadata resilience 2026-03-26 23:03:46 +00:00
rcourtman
2afb96ee13 fix(release): align api and hostagent rc contracts 2026-03-26 17:08:48 +00:00
rcourtman
9a0f8f543f Remove stale relationshipVersion residue 2026-03-19 14:56:02 +00:00
rcourtman
33247d65c3 Normalize remaining graph residue 2026-03-19 14:41:52 +00:00
rcourtman
8e1f832364 Remove dead action plan topology field 2026-03-19 14:40:34 +00:00
rcourtman
cc806171dc Trim dead resource graph surface 2026-03-19 14:26:30 +00:00
rcourtman
aabbd85350 Centralize discovery canonicalization helpers 2026-03-19 05:40:04 +00:00
rcourtman
699a81f7a2 Centralize resource policy cloning 2026-03-19 03:11:38 +00:00
rcourtman
3c62e8e5f5 Persist action audits through tool executor 2026-03-18 17:35:45 +00:00
rcourtman
778a2577b6 feat: Pulse v6 release 2026-03-18 16:06:30 +00:00
rcourtman
c575c7e295 fix(patrol): rename wearout JSON field to ssd_life_remaining_pct (#1300)
The AI also receives disk data via tool calls (pulse_metrics type="disks"),
not just the patrol context table. The raw JSON field "wearout" was
ambiguous — rename to "ssd_life_remaining_pct" so the field name itself
communicates that 100 = healthy.
2026-02-27 23:12:27 +00:00
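The rename described above can be illustrated with a small Go sketch. It assumes the field is emitted from a struct via a JSON tag; the `diskMetric` type, its `Device` field, and the `encodeDisk` helper are hypothetical names for illustration, not taken from the Pulse codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// diskMetric is a hypothetical payload shape. Only the JSON key
// "ssd_life_remaining_pct" comes from the commit message.
type diskMetric struct {
	Device string `json:"device"`
	// 100 means the drive is healthy (full life remaining), 0 means worn out.
	SSDLifeRemainingPct int `json:"ssd_life_remaining_pct"`
}

// encodeDisk serializes a metric, returning an empty string on failure.
func encodeDisk(d diskMetric) string {
	b, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encodeDisk(diskMetric{Device: "nvme0n1", SSDLifeRemainingPct: 100}))
	// prints {"device":"nvme0n1","ssd_life_remaining_pct":100}
}
```

With the old key, a model reading `"wearout": 100` had to guess the direction of the scale; the new key states it in the name.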
rcourtman
3006f51b60 fix(patrol): clarify wearout semantics so AI knows 100% = healthy (#1300)
The patrol context table header said "Wearout" and the tool returned a raw
"wearout" JSON field with no indication that 100 = full life remaining.
The AI interpreted "wearout: 100" as fully worn out and raised false
"100% Disk Wearout" findings on healthy NVMe drives.

Rename the patrol table column to "SSD Life Remaining (100%=new)" and
update the data type comment to clarify the semantics.
2026-02-27 23:05:02 +00:00
rcourtman
7efcec3120 fix(agents,ai): host URL field, AI Docker routing, Proxmox registration logging (#1197, #1210, #1267)
#1197: Add Custom URL input to the expanded host row in Settings → Agents.
Loads existing URL via HostMetadataAPI on row expand; saves on button click.
Only shown for host-type agent rows.

#1210: Fix agent_connected always false for Docker hosts on Proxmox VMs.
connectedAgentHostnames now also marks Docker host hostnames reachable when
their matching VM/LXC has a node with a connected Proxmox agent, mirroring
the routing logic already used in the control path.

#1267/#1269: Improve Proxmox auto-registration failure logging. Response body
is now included in the error message, and the warning directs users to delete
the state file to force re-registration rather than claiming the node exists.

(cherry picked from commit 305f6d3c94f0da4fc970450a6304da57d6d7fe80)
2026-02-18 12:57:09 +00:00
rcourtman
69e3286e5e security: fix AI OAuth scope bypass, approval replay attacks, and approval endpoint scope gating
- OAuth endpoints now require settings:write scope (not just admin)
- Approval endpoints now require ai:execute scope
- Added CommandHash to approvals for replay protection
- Approvals are now single-use (consumed on first use)
- consumeApprovalWithValidation validates command matches approval
2026-02-03 19:15:15 +00:00
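A minimal sketch of how a command hash plus single-use consumption can block replay, assuming an in-memory store keyed by approval ID. The names `approvalStore`, `approve`, `consume`, and `hashCommand` are hypothetical; the real `consumeApprovalWithValidation` may differ, and the choice to leave an approval intact after a mismatched command is an assumption of this sketch.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

var (
	errUnknownApproval = errors.New("approval not found or already consumed")
	errCommandMismatch = errors.New("command does not match approval")
)

// hashCommand returns the hex-encoded SHA-256 digest of the command text.
func hashCommand(cmd string) string {
	sum := sha256.Sum256([]byte(cmd))
	return hex.EncodeToString(sum[:])
}

// approvalStore is a hypothetical in-memory store: approval ID -> command hash.
type approvalStore struct {
	mu     sync.Mutex
	hashes map[string]string
}

func newApprovalStore() *approvalStore {
	return &approvalStore{hashes: make(map[string]string)}
}

// approve records that the given command was approved under id.
func (s *approvalStore) approve(id, cmd string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.hashes[id] = hashCommand(cmd)
}

// consume succeeds at most once per approval, and only for the approved command.
func (s *approvalStore) consume(id, cmd string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	want, ok := s.hashes[id]
	if !ok {
		return errUnknownApproval
	}
	if want != hashCommand(cmd) {
		return errCommandMismatch
	}
	delete(s.hashes, id) // single-use: a second attempt finds nothing
	return nil
}

func main() {
	s := newApprovalStore()
	s.approve("a1", "systemctl restart nginx")
	fmt.Println(s.consume("a1", "rm -rf /tmp/cache"))       // command does not match approval
	fmt.Println(s.consume("a1", "systemctl restart nginx")) // <nil>
	fmt.Println(s.consume("a1", "systemctl restart nginx")) // approval not found or already consumed
}
```

Holding the mutex across lookup, comparison, and deletion keeps two concurrent requests from both consuming the same approval.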
rcourtman
36eb381c26 test(ai): add validation tests for file tools 2026-02-02 19:24:11 +00:00
rcourtman
712e5846ec test(ai): add unit tests for discovery adapter
- Add comprehensive tests for DiscoveryMCPAdapter in internal/ai/tools/discovery_adapter_test.go
- Validate strict delegation to DiscoverySource and data transformation
2026-02-02 15:04:45 +00:00
rcourtman
b6bd9fd2d4 feat(ai): add RegisterTool method for runtime tool registration 2026-02-02 11:14:55 +00:00
rcourtman
81ec5c525a feat(ai): parallelize tool execution and refine knowledge extraction
- Implement parallel execution for read-only tools in agentic loop
- Optimize negative marker summaries to be more informative
- Fix memory percentage scaling in query tools
- Add derived memory stats (avg/max) to extraction logic
- Add explicit fresh data intent detection to bypass knowledge gate
- Update associated tests
2026-02-01 00:12:36 +00:00
rcourtman
9b0fb527f5 feat(patrol): implement patrol findings, evaluation, and investigation logic
- Add core Patrol system for automated investigations
- Implement findings management and deduplication logic
- Add evaluation framework (patrol_eval) with quality assertions and scenarios
- Add patrol-specific tools and executor integration
- Add E2E test matrix script
2026-01-31 16:23:08 +00:00
rcourtman
95a0d7a6bd feat(backend): implement AI Patrol, Investigation, and system-wide refactors 2026-01-30 19:02:14 +00:00
rcourtman
e85ec858fd fix(ai): discovery transient error handling, agentic loop detection, and read-only classification
- Discovery: classify transient errors (429, timeout, connection refused, etc.)
  and return IsError:true so models stop retrying rate-limited calls
- Agentic loop: detect identical tool calls repeated >3 times and block with
  LOOP_DETECTED error, forcing the model to try a different approach
- OpenAI provider: skip tool_choice for DeepSeek Reasoner which doesn't support it
- Read-only classifier: fix curl -I case sensitivity (uppercase flags lowered),
  add iostat/vmstat/mpstat/sar/lxc-ls/lxc-info/nc -z to allowlist,
  fix 2>&1 false positive in input redirect detection
2026-01-29 18:29:54 +00:00
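The loop detection described above (identical tool calls repeated more than 3 times) might look roughly like the following sketch. The `loopGuard` type, the string key, and the argument payload are hypothetical; only the threshold and the `LOOP_DETECTED` marker come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrLoopDetected is a hypothetical sentinel carrying the LOOP_DETECTED marker.
var ErrLoopDetected = errors.New("LOOP_DETECTED: identical tool call repeated too many times")

// maxIdenticalCalls is the number of identical calls allowed before blocking.
const maxIdenticalCalls = 3

// loopGuard counts identical (tool, arguments) pairs within one agentic run.
type loopGuard struct {
	counts map[string]int
}

func newLoopGuard() *loopGuard {
	return &loopGuard{counts: make(map[string]int)}
}

// check records one call and returns ErrLoopDetected once the same pair has
// been seen more than maxIdenticalCalls times.
func (g *loopGuard) check(tool, rawArgs string) error {
	key := tool + "\x00" + rawArgs // NUL separator keeps tool and args from colliding
	g.counts[key]++
	if g.counts[key] > maxIdenticalCalls {
		return ErrLoopDetected
	}
	return nil
}

func main() {
	g := newLoopGuard()
	for i := 1; i <= 4; i++ {
		err := g.check("pulse_query", `{"action":"search"}`)
		fmt.Printf("call %d: blocked=%v\n", i, err != nil)
	}
	// call 1..3: blocked=false, call 4: blocked=true
}
```

Returning the error to the model as a tool result, rather than retrying silently, is what pushes it toward a different approach.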
rcourtman
f83356b430 feat(ai): add patrol-specific tools for agentic finding creation
Add three new patrol tools that enable the LLM to create findings via
tool calls instead of relying on output parsing:

- patrol_report_finding: Create a structured finding with validation
- patrol_resolve_finding: Mark a finding as resolved
- patrol_get_findings: Query active findings for a resource

These tools are only functional during a patrol run when PatrolFindingCreator
is set on the executor. This approach is more reliable than parsing
JSON from LLM output.
2026-01-28 23:18:42 +00:00
rcourtman
9c2f8a3284 refactor(ai): remove obsolete tool and chat files
Remove files that were consolidated into other modules:
- chat/patrol.go, patrol_test.go → moved to chat/service.go
- tools_infrastructure.go → merged into tools_storage.go
- tools_intelligence.go → merged into tools_metrics.go
- tools_patrol.go → merged into tools_alerts.go
- tools_profiles.go, tools_profiles_test.go → removed (unused)

Update related test file references.
2026-01-28 21:30:24 +00:00
rcourtman
a75393d1c5 refactor(ai): consolidate tool implementations into domain-specific files
- Merge tools_infrastructure.go, tools_intelligence.go, tools_patrol.go,
  tools_profiles.go into their respective domain tools
- Expand tools_control.go with command execution logic
- Expand tools_discovery.go with resource discovery handlers
- Expand tools_storage.go with storage-related operations
- Expand tools_metrics.go with metrics functionality
- Update tests to match new structure

This consolidation reduces file count and groups related functionality together.
2026-01-28 21:21:28 +00:00
rcourtman
23ff4d1337 chore: remove remaining gitignored files from tracking
- analyze_coverage.py (local coverage analysis script)
- coverage_summary.txt (coverage output)
- mock.env (environment file)
2026-01-28 21:19:52 +00:00
rcourtman
0013d64c7b Consolidate and extend AI tool suite
Major tools refactoring for better organization and capabilities:

New consolidated tools:
- pulse_query: Unified resource search, get, config, topology operations
- pulse_read: Safe read-only command execution with NonInteractiveOnly
- pulse_control: Guest lifecycle control (start/stop/restart)
- pulse_docker: Docker container operations
- pulse_file: Safe file read/write operations
- pulse_kubernetes: K8s resource management
- pulse_metrics: Performance metrics retrieval
- pulse_alerts: Alert management
- pulse_storage: Storage pool operations
- pulse_knowledge: Note-taking and recall
- pulse_pmg: Proxmox Mail Gateway integration

Executor improvements:
- Cleaner tool registration pattern
- Better error handling and recovery
- Protocol layer for result formatting
- Enhanced adapter interfaces

Includes comprehensive tests for:
- File and Docker operations
- Kubernetes control operations
- Command execution safety
2026-01-28 16:50:25 +00:00
rcourtman
b2e0ae3fdb Add ExecutionIntent classification and NonInteractiveOnly enforcement
Implement safety layers for command execution:

ExecutionIntent classifies commands as:
- ObservationOnly: Pure read (status, logs, metrics)
- SideEffects: May change state (restart, write, delete)

NonInteractiveOnly enforces safe command forms:
- Blocks interactive commands (vim, top without -b, etc)
- Blocks unbounded streaming (tail -f without limit)
- Suggests safe alternatives in error messages

Add phantom execution detection:
- Catches when model claims actions without using tools
- Skips check when tools actually succeeded (fixes false positives)

Includes comprehensive tests for:
- Intent classification accuracy
- Interactive command blocking
- Strict resolution validation
2026-01-28 16:49:00 +00:00
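As a rough illustration of the NonInteractiveOnly idea, the sketch below rejects a few interactive or unbounded command forms and suggests an alternative. The function name `checkNonInteractive`, the exact rule set, and the suggested replacements are assumptions; this simplified version also blocks every `tail -f`, whereas the commit describes blocking it only when no limit is given.

```go
package main

import (
	"fmt"
	"strings"
)

// checkNonInteractive returns an empty string when the command is allowed,
// or a reason (with a suggested alternative) when it is blocked.
func checkNonInteractive(command string) string {
	fields := strings.Fields(command)
	if len(fields) == 0 {
		return "empty command"
	}
	// hasFlag reports whether an exact flag token appears after the program name.
	hasFlag := func(flag string) bool {
		for _, f := range fields[1:] {
			if f == flag {
				return true
			}
		}
		return false
	}
	switch fields[0] {
	case "vim":
		return "interactive editor; read the file with cat or head instead"
	case "top":
		if !hasFlag("-b") {
			return "interactive; use batch mode, e.g. top -b -n 1"
		}
	case "tail":
		if hasFlag("-f") {
			return "unbounded stream; use a bounded read, e.g. tail -n 100"
		}
	}
	return ""
}

func main() {
	for _, cmd := range []string{"top", "top -b -n 1", "tail -f /var/log/syslog", "tail -n 50 /var/log/syslog"} {
		fmt.Printf("%q blocked=%v\n", cmd, checkNonInteractive(cmd) != "")
	}
}
```

Putting the alternative into the rejection reason matches the commit's point that error messages should steer the model toward a safe form.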
rcourtman
7f7edfceb4 test: expand backend coverage 2026-01-25 21:08:44 +00:00
rcourtman
27f1a11acb feat: add AI Intelligence system with investigation and forecasting
Major new AI capabilities for infrastructure monitoring:

Investigation System:
- Autonomous finding investigation with configurable autonomy levels
- Investigation orchestrator with rate limiting and guardrails
- Safety checks for read-only mode enforcement
- Chat-based investigation with approval workflows

Forecasting & Remediation:
- Trend forecasting for resource capacity planning
- Remediation engine for generating fix proposals
- Circuit breaker for AI operation protection

Unified Findings:
- Unified store bridging alerts and AI findings
- Correlation and root cause analysis
- Incident coordinator with metrics recording

New Frontend:
- AI Intelligence page with patrol controls
- Investigation drawer for finding details
- Unified findings panel with actions

Supporting Infrastructure:
- Learning store for user preference tracking
- Proxmox event ingestion and correlation
- Enhanced patrol with investigation triggers
2026-01-24 22:41:43 +00:00
rcourtman
c93b54ce9f refactor: clean up AI tools and remove deprecated code
- Remove deprecated tool functions
- Simplify control helpers
- Clean up test files
2026-01-22 22:31:04 +00:00
rcourtman
422efdde61 Restore UI improvements and refine Docker/Hosts display
- Restore 'mini' mode for StackedDiskBar.
- Restore layout fixes (fixed table layout, mobile columns) for Docker and Hosts tables.
- Remove 'Ask AI' and AI context selection features.
- Docker: Use compact 'Cube' icon for Podman pods to prevent name obstruction.
- Docker: Show concise image names (strip registry URL).
- Backend: Include pending fixes for AI providers.
2026-01-22 18:03:35 +00:00
rcourtman
defe298ddd Refactor: AI provider and executor multi-tenancy support
- Updated AI providers and tests for context/tenant awareness
- Refactored tool executor for multi-tenant state handling
- Added new tests for Docker control and update tools
2026-01-22 16:51:45 +00:00
rcourtman
798f6a8deb Refactor: Update AI tools and tests for multi-tenancy
- Refactored tool execution to handle tenant-scoped contexts
- Added new tests for infrastructure, control, and kubernetes tools
- Improved test coverage for agentic chat and approval store
2026-01-22 16:43:08 +00:00
rcourtman
267d5f97e5 Support: Fix OpenAI tool schema error by ensuring properties field is always present
- Removed omitempty from InputSchema.Properties
- Ensures OpenAI accepts tools with no input parameters
2026-01-22 16:41:57 +00:00
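The effect of dropping `omitempty` can be seen with Go's `encoding/json` directly. The two struct types below are hypothetical stand-ins for the before and after shapes of `InputSchema`; only the `Properties` field name comes from the commit message, and the sketch assumes the map is initialized to a non-nil empty value (a nil map without `omitempty` would serialize as `null` rather than `{}`).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaWithOmitEmpty mirrors the pre-fix shape: an empty Properties map is dropped.
type schemaWithOmitEmpty struct {
	Type       string         `json:"type"`
	Properties map[string]any `json:"properties,omitempty"`
}

// schemaAlwaysPresent mirrors the post-fix shape: Properties is always serialized.
type schemaAlwaysPresent struct {
	Type       string         `json:"type"`
	Properties map[string]any `json:"properties"`
}

// mustJSON marshals v to a string and panics on error (acceptable in a sketch).
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	empty := map[string]any{} // a tool with no input parameters
	fmt.Println(mustJSON(schemaWithOmitEmpty{Type: "object", Properties: empty}))
	// prints {"type":"object"}
	fmt.Println(mustJSON(schemaAlwaysPresent{Type: "object", Properties: empty}))
	// prints {"type":"object","properties":{}}
}
```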
rcourtman
6e2cae2363 feat(ui): add history chart components for guest drawer
- HistoryChart: single metric visualization (CPU, memory, disk)
- UnifiedHistoryChart: combined multi-metric view
- Support for time range selection (1h to 90d)
- Responsive charts with proper dark mode support
- Fix corrupted tools_query_test.go from stash merge
2026-01-22 00:46:52 +00:00
rcourtman
f293f41499 refactor: consolidate AI tools tests
- Remove executor_test.go (tests moved to specific tool test files)
- Refactor infrastructure, patrol, profiles, and query tests
- Add query tool enhancements for better resource filtering
2026-01-22 00:43:41 +00:00
rcourtman
36622d2c17 Hide unavailable AI tools 2026-01-20 17:19:47 +00:00