Pulse/docs/architecture/pulse-assistant-deep-dive.md
rcourtman fa1b74792e docs: add comprehensive deep-dive documentation for AI subsystems
Adds detailed architecture documentation for Pulse Patrol and Pulse Assistant. Updates AI.md and PULSE_PRO.md. Also includes additional tests.
2026-02-02 10:29:07 +00:00

897 lines
28 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Pulse Assistant: Technical Deep Dive
This document provides an in-depth look at the engineering behind Pulse Assistant — a safety-gated, tool-driven AI system for infrastructure management that goes far beyond simple chatbots.
---
## Executive Summary
Pulse Assistant isn't a "chat wrapper around an LLM." It's a **protocol-driven, safety-gated agentic system** that:
1. **Treats the LLM as untrusted** — the model proposes, Go code enforces
2. **Proactively gathers context** — understands resources before you ask
3. **Learns within sessions** — extracts and caches facts to avoid redundant queries
4. **Enforces workflow invariants** — FSM prevents dangerous state transitions
5. **Supports parallel tool execution** — efficient batch operations
6. **Detects and prevents hallucinations** — phantom execution detection
7. **Auto-recovers from errors** — structured error envelopes enable self-correction
All while providing streaming responses with real-time tool execution visibility.
---
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ USER REQUEST │
│ (with optional @mentions) │
└────────────────────────────────┬────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ CONTEXT PREFETCHER │
│ • Detects resource mentions (@homepage, "jellyfin") │
│ • Resolves structured mentions from frontend autocomplete │
│ • Gathers discovery info (ports, config paths, bind mounts) │
│ • Builds authoritative context summary │
└────────────────────────────────┬────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ AGENTIC LOOP │
│ ┌─────────────┐ ┌──────────┐ ┌─────────────────┐ ┌───────────────┐ │
│ │ LLM │──▶│ FSM │──▶│ Tool Executor │──▶│ Knowledge │ │
│ │ (Proposer) │ │ (Gating) │ │ (Validation) │ │ Accumulator │ │
│ └─────────────┘ └──────────┘ └─────────────────┘ └───────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌──────────┐ ┌─────────────────┐ ┌───────────────┐ │
│ │ Phantom │ │ Telemetry│ │ ResolvedContext│ │ Context │ │
│ │ Detection │ │ Counters │ │ (Session Truth)│ │ Compaction │ │
│ └─────────────┘ └──────────┘ └─────────────────┘ └───────────────┘ │
└────────────────────────────────┬────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ AGENT EXECUTION LAYER │
│ (CommandPolicy → AgentServer → Connected Agents) │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## 1. Context Prefetcher
**📁 Location:** `internal/ai/chat/context_prefetch.go`
### What It Does
Before the LLM even sees your message, the Context Prefetcher **proactively gathers relevant context** about any resources you mention.
### How It Works
```go
type ResourceMention struct {
Name string
ResourceType string // "vm", "lxc", "docker", "host", "node", "k8s_pod"
ResourceID string
HostID string
MatchedText string
BindMounts []MountInfo // Docker bind mount mappings
DockerHostName string // Full routing chain for nested Docker
DockerHostType string // "lxc", "vm", or "host"
DockerHostVMID int
ProxmoxNode string
TargetHost string // THE correct target for commands
}
```
### Two Resolution Modes
1. **Structured Mentions** (preferred): Frontend autocomplete passes fully-resolved resource identities
2. **Fuzzy Matching** (fallback): Text analysis matches resource names from your message
### Fuzzy Matching Intelligence
```go
// Matches: "homepage" → "homepage-docker"
// Matches: "check jellyfin logs" → finds "jellyfin" container
func matchesResource(messageLower string, messageWords []string, resourceName string) bool {
// Direct containment
if strings.Contains(messageLower, resourceName) { return true }
// Prefix matching (homepage → homepage-docker)
for _, word := range messageWords {
if len(word) >= 4 && strings.HasPrefix(resourceName, word) {
return true
}
}
// Hyphenated part matching
// ...
}
```
### Unresolved Mention Handling
If you @mention something that doesn't exist:
```go
// Returns explicit feedback rather than wasting tool calls
sb.WriteString("'myservice' was NOT found in Pulse monitoring.\n")
sb.WriteString("Do NOT use pulse_discovery to search — they are not in the system.\n")
sb.WriteString("Instead: use pulse_control directly if you know the host.\n")
```
### Full Routing Chain for Docker
Docker containers get special treatment — the prefetcher resolves the complete routing chain:
```
## jellyfin (Docker container)
Location: Docker on "media-server" (LXC 141) on Proxmox node "delly"
>>> target_host: "media-server" <<<
Bind mounts (paths on media-server filesystem, NOT inside container):
/opt/jellyfin/config → /config
/mnt/media → /media
```
This prevents the common mistake of running commands on the Proxmox node instead of the LXC.
---
## 2. Knowledge Accumulator
**📁 Location:** `internal/ai/chat/knowledge_accumulator.go`
### What It Does
The Knowledge Accumulator **extracts and caches facts** from every tool result during a session. This prevents redundant queries and helps the model remember what it's already learned.
### Fact Categories
```go
const (
FactCategoryResource = "resource" // VM/LXC/container status
FactCategoryStorage = "storage" // Pool usage, backup info
FactCategoryAlert = "alert" // Active alerts, findings
FactCategoryDiscovery = "discovery" // Ports, paths, services
FactCategoryExec = "exec" // Command outputs
FactCategoryMetrics = "metrics" // Performance data
FactCategoryFinding = "finding" // Patrol findings
)
```
### Fact Structure
```go
type Fact struct {
Category FactCategory
Key string // Dedup key: "lxc:delly:106:status"
Value string // Compact value: "running, Postfix, hostname=patrol-test"
ObservedAt time.Time
Turn int // Which agentic turn observed this
}
```
### Bounded, Session-Scoped
```go
const (
defaultMaxEntries = 60 // Max facts stored
defaultMaxChars = 2000 // Max total characters
maxValueLen = 200 // Max per-fact value length
)
```
### LRU Eviction with Turn Pinning
Facts from the current or previous turn are "soft-pinned" and won't be evicted:
```go
func (ka *KnowledgeAccumulator) evict() {
// ...
// Soft-pin: don't evict facts from current or previous turn
if fact.Turn >= ka.currentTurn-1 {
continue
}
// Evict oldest facts first
}
```
### Knowledge Gate in Agentic Loop
Before executing a tool, the loop checks if we already have the answer:
```go
// KNOWLEDGE GATE: Return cached facts for redundant tool calls
if keys := PredictFactKeys(tc.Name, tc.Input); len(keys) > 0 {
if cachedValue, found := ka.Lookup(key); found {
return "Already known (from earlier investigation): " + cachedValue
}
}
```
---
## 3. Knowledge Extractor
**📁 Location:** `internal/ai/chat/knowledge_extractor.go`
### What It Does
Deterministically parses tool results and extracts structured facts — **no LLM calls** required.
### Coverage
The extractor handles all major tools:
| Tool | Facts Extracted |
|------|-----------------|
| `pulse_query` | Resource status, health, topology, configs |
| `pulse_storage` | Pool usage, backup tasks, disk health |
| `pulse_discovery` | Ports, config paths, log paths, services |
| `pulse_read` | Command outputs (keyed by command) |
| `pulse_metrics` | CPU, memory, disk averages, baselines |
| `pulse_alerts` | Active alerts, findings with severity |
| `pulse_docker` | Container states, update availability |
| `pulse_kubernetes` | Pod counts, deployment status |
### Example Extraction
For `pulse_query action=get`:
```go
func extractQueryGetFacts(input, resultText string) []FactEntry {
// Parse JSON response
// Extract: status, cpu_avg, mem_avg, disk_avg, hostname
// Key format: "lxc:delly:106:status"
// Value format: "running, cpu:12%, mem:45%, disk:67%"
}
```
### Negative Markers
If a query returns an error or empty result, a "negative marker" is stored:
```go
// Key: "lxc:delly:106:status:queried"
// Value: "not_found" or "error"
```
This prevents the model from re-querying resources that don't exist.
---
## 4. Finite State Machine (FSM)
**📁 Location:** `internal/ai/chat/fsm.go`
### What It Does
The FSM enforces **workflow invariants** that prevent dangerous state transitions. The model cannot bypass these — they're enforced in Go code.
### States
```go
const (
StateResolving = "RESOLVING" // No target yet, must discover first
StateReading = "READING" // Read tools allowed, can explore
StateWriting = "WRITING" // Write in progress (transitional)
StateVerifying = "VERIFYING" // Must read/verify before final answer
)
```
### Tool Classification
```go
const (
ToolKindResolve // Discovery/query (pulse_query, pulse_discovery)
ToolKindRead // Read-only (pulse_read, pulse_metrics, pulse_storage)
ToolKindWrite // Mutating (pulse_control, pulse_file_edit write)
)
```
### State Transitions
```
┌───────────────────────────────────────────────┐
│ │
▼ │
┌──────────┐ resolve/read ┌─────────┐ read │
│RESOLVING │ ───────────────▶│ READING │◀──────────┤
└──────────┘ └────┬────┘ │
│ │
write│ │
▼ │
┌───────────┐ verify │
│ VERIFYING │──────────┘
└───────────┘
write blocked!
```
### Key Invariants Enforced
1. **Discovery Before Action**: Can't write to undiscovered resources
2. **Verification After Write**: Must read/check after any mutation
3. **Read/Write Tool Separation**: `pulse_read` never triggers VERIFYING
### Error Response for FSM Blocks
```json
{
"error": {
"code": "FSM_BLOCKED",
"message": "Must verify the previous write operation",
"blocked": true,
"recoverable": true,
"recovery_hint": "Use a read tool to check the result first"
}
}
```
---
## 5. Resolved Context
**📁 Location:** `internal/ai/chat/types.go`
### What It Does
ResolvedContext is the **session-scoped source of truth** for discovered resources. The model cannot fabricate resource IDs — they must come from successful discovery.
### Resource Structure
```go
type ResolvedResource struct {
// Structured Identity
Kind string // "lxc", "vm", "docker_container", "node"
ProviderUID string // Stable ID from provider
Scope ResourceScope // Host, parent, cluster, namespace
Aliases []string // All names this resource is known by
// Routing
ReachableVia []ExecutorPath // All ways to reach this resource
TargetHost string // Primary command target
AllowedActions []string // What can be done
// Proxmox-specific
VMID int
Node string
}
```
### Canonical Resource ID Format
```
{kind}:{host}:{provider_uid} # Scoped resources
lxc:delly:141 # LXC 141 on node delly
docker_container:media-server:abc123 # Docker on host media-server
{kind}:{provider_uid} # Global resources
node:delly # Proxmox node
```
### TTL and LRU Eviction
```go
const (
DefaultResolvedContextTTL = 45 * time.Minute
DefaultResolvedContextMaxEntries = 500
)
```
### Explicit vs General Access Tracking
**Critical distinction** that prevents false positives:
| Tracking | Set By | Purpose |
|----------|--------|---------|
| `lastAccessed` | Every add/get | LRU eviction, TTL expiry |
| `explicitlyAccessed` | User intent only | Routing validation |
Why this matters:
- Bulk discovery adds many resources → sets `lastAccessed` for all
- If routing validation used `lastAccessed`, bulk discovery would block host operations
- Instead, routing checks `explicitlyAccessed` which is only set for single-resource operations
---
## 6. Parallel Tool Execution
**📁 Location:** `internal/ai/chat/agentic.go`
### What It Does
When the LLM requests multiple tool calls, compatible operations execute **in parallel** for efficiency.
### Three-Phase Pipeline
```
Phase 1: Pre-check (sequential)
├── FSM validation
├── Loop detection
└── Knowledge gate
Phase 2: Execute (parallel, max 4 concurrent)
├── Tool 1 ────────────┐
├── Tool 2 ─────────┐ │
├── Tool 3 ───────┐ │ │
└── Tool 4 ─────┐ │ │ │
▼ ▼ ▼ ▼
Results
Phase 3: Post-process (sequential)
├── Stream events to UI
├── FSM state transitions
├── Knowledge extraction
└── Approval flow (if needed)
```
### Concurrency Control
```go
var wg sync.WaitGroup
sem := make(chan struct{}, 4) // Cap at 4 concurrent
for j, pe := range pendingExec {
wg.Add(1)
go func(idx int, tc providers.ToolCall) {
defer wg.Done()
sem <- struct{}{} // Acquire
defer func() { <-sem }() // Release
r, e := a.executor.ExecuteTool(ctx, tc.Name, tc.Input)
execResults[idx] = parallelToolResult{Result: r, Err: e}
}(j, pe.tc)
}
wg.Wait()
```
---
## 7. Loop Detection
**📁 Location:** `internal/ai/chat/agentic.go`
### What It Does
Prevents the model from getting stuck calling the same tool repeatedly.
### Implementation
```go
const maxIdenticalCalls = 3
recentCallCounts := make(map[string]int)
func toolCallKey(name string, input map[string]interface{}) string {
inputJSON, _ := json.Marshal(input)
return name + ":" + string(inputJSON)
}
// In the loop:
callKey := toolCallKey(tc.Name, tc.Input)
recentCallCounts[callKey]++
if recentCallCounts[callKey] > maxIdenticalCalls {
return "LOOP_DETECTED: You have called " + tc.Name +
" with the same arguments " + count + " times. Blocked."
}
```
---
## 8. Phantom Execution Detection
**📁 Location:** `internal/ai/chat/agentic.go`
### What It Does
Detects when the model **claims to have done something** but never actually called tools. This catches hallucinations.
### Pattern Detection
```go
func hasPhantomExecution(content string) bool {
// Only flag if tools haven't succeeded this episode
// Look for phrases like:
// - "I have restarted..."
// - "Successfully stopped..."
// - "The service has been..."
// - "Done! I've..."
}
```
### Safe Response on Detection
```go
safeResponse := "I apologize, but I wasn't able to access the infrastructure " +
"tools needed to complete that request. This can happen when:\n\n" +
"1. The tools aren't available right now\n" +
"2. There was a connection issue\n" +
"3. The model I'm running on doesn't support function calling\n\n" +
"Please try again, or let me know if you have a question I can " +
"answer without checking live infrastructure."
```
---
## 9. Context Compaction
**📁 Location:** `internal/ai/chat/agentic.go`
### What It Does
Compacts old tool results to prevent context window exhaustion in long conversations.
### Strategy
```go
const compactionKeepTurns = 2 // Keep last 2 turns in full
const compactionMinChars = 300 // Only compact results > 300 chars
func compactOldToolResults(messages, turnStartIdx, keepTurns, minChars int, ka *KnowledgeAccumulator) {
// For results older than keepTurns:
// 1. Replace with fact summary from KnowledgeAccumulator
// 2. Or truncate to first 200 chars + "[truncated]"
}
```
### Wrap-Up Nudges
After many tool calls, the loop hints the model to wrap up:
```go
const wrapUpNudgeAfterCalls = 12
const wrapUpEscalateAfterCalls = 18
// After 12 calls: "Consider summarizing your findings"
// After 18 calls: "Please provide your final answer now"
```
---
## 10. Execution Intent Classification
**📁 Location:** `internal/ai/tools/tools_query.go`
### What It Does
Classifies commands as read-only or potentially mutating — **deterministically, without LLM judgment**.
### Intent Levels
```go
const (
IntentReadOnlyCertain // Non-mutating by construction (cat, grep, docker logs)
IntentReadOnlyConditional // Proven read-only by content (SELECT queries)
IntentWriteOrUnknown // Cannot prove safe (unknown, or has mutation patterns)
)
```
### Classification Phases (Order Matters!)
```
Phase 1: Mutation-capability guards
├── Block: sudo, redirects, pipes to dual-use tools, shell chaining
Phase 2: Known write patterns
├── Block: rm, shutdown, systemctl restart, DROP DATABASE
Phase 3: Read-only by construction
├── Allow: cat, grep, ls, docker logs, ffprobe, kubectl get
Phase 4: Content inspection
├── SQL: SELECT → allow, INSERT/UPDATE/DELETE → block
├── Redis: GET → allow, SET/DEL → block
Phase 5: Conservative fallback
└── Unknown → IntentWriteOrUnknown
```
**Critical**: Phase 1-2 **dominate** — a command matching known write patterns is blocked even if it also matches read-only patterns.
### Content Inspectors
```go
type ContentInspector interface {
Applies(cmdLower string) bool
IsReadOnly(cmdLower string) (bool, string)
}
// Example: SQL inspector
type sqlContentInspector struct{}
func (s *sqlContentInspector) IsReadOnly(cmd string) (bool, string) {
// Parse SQL, check for SELECT only
// Block: INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER
}
```
---
## 11. Strict Resolution
**📁 Location:** `internal/ai/tools/tools_query.go`
### What It Does
Prevents the model from operating on **fabricated or hallucinated resource IDs**.
### Error Response
```go
type ErrStrictResolution struct {
Action string
Name string
ResourceID string
Message string
Suggestions []string // Discovered resources with similar names
}
func (e *ErrStrictResolution) ToToolResponse() ToolResponse {
return ToolResponse{
OK: false,
Error: &ToolError{
Code: "STRICT_RESOLUTION",
Message: e.Message,
Blocked: true,
Details: map[string]interface{}{
"recovery_hint": "Use pulse_query to discover resources first",
"auto_recoverable": true,
},
},
}
}
```
---
## 12. Routing Mismatch Detection
**📁 Location:** `internal/ai/tools/tools_query.go`
### What It Does
Prevents accidentally operating on a parent host when you meant to target a child resource.
### Scenario
1. User asks "edit config on @jellyfin" (Docker container on LXC)
2. Model calls `pulse_file_edit` with `target_host="delly"` (the Proxmox node)
3. File happens to exist on delly too (shared path)
4. **Wrong file gets edited!**
### Detection
```go
type ErrRoutingMismatch struct {
TargetHost string
MoreSpecificResources []string // ["jellyfin"]
MoreSpecificResourceIDs []string // ["docker_container:media-server:abc123"]
TargetResourceID string
Message string
}
```
### Error Response
```json
{
"error": {
"code": "ROUTING_MISMATCH",
"message": "target_host 'delly' is a Proxmox node, but you recently referenced more specific resources: [jellyfin]",
"details": {
"target_resource_id": "docker_container:media-server:abc123",
"recovery_hint": "Retry with target_host='media-server'",
"auto_recoverable": true
}
}
}
```
---
## 13. Approval Flow
**📁 Location:** `internal/ai/chat/agentic.go`, `internal/ai/approval/`
### What It Does
In "Controlled" mode, write operations require explicit user approval before execution.
### Flow
```
1. Tool returns APPROVAL_REQUIRED
├── approval_id
├── command
├── risk_level
└── description
2. Agentic loop emits approval_needed SSE event
3. UI shows approval card to user
4. User approves/denies via API
POST /api/ai/approvals/{id}/approve
POST /api/ai/approvals/{id}/deny
5. On approve: Tool re-executes with _approval_id
On deny: Assistant responds "Command denied: <reason>"
```
### Autonomous Mode
For investigations, approvals can be bypassed:
```go
func (a *AgenticLoop) SetAutonomousMode(enabled bool) {
a.mu.Lock()
a.autonomousMode = enabled
a.mu.Unlock()
}
```
---
## 14. Token Budget Management
**📁 Location:** `internal/ai/chat/agentic.go`
### Token Tracking
```go
// Per-turn tracking
a.totalInputTokens += data.InputTokens
a.totalOutputTokens += data.OutputTokens
// After each turn:
if a.budgetChecker != nil {
if err := a.budgetChecker(); err != nil {
return resultMessages, fmt.Errorf("budget exceeded: %w", err)
}
}
```
### Dynamic Turn Limits
```go
// Force text-only on last turn to get a summary
if turn >= maxTurns-1 {
req.ToolChoice = &providers.ToolChoice{Type: providers.ToolChoiceNone}
}
// After write completion, force summary (prevents stale-data loops)
if writeCompletedLastTurn {
req.ToolChoice = &providers.ToolChoice{Type: providers.ToolChoiceNone}
}
```
---
## 15. DeepSeek Artifact Cleanup
**📁 Location:** `internal/ai/chat/agentic.go`
### What It Does
DeepSeek models sometimes leak internal markup in responses. The Assistant cleans this:
```go
func containsDeepSeekMarker(text string) bool {
return strings.Contains(text, "<DSML") ||
strings.Contains(text, "<end▁of▁thinking>")
}
func cleanDeepSeekArtifacts(content string) string {
// Remove:
// - <DSMLfunction_calls>...</DSML>
// - <end▁of▁thinking>
// - Internal reasoning markers
}
```
---
## 16. Tool Protocol
**📁 Location:** `internal/ai/tools/protocol.go`
### Consistent Error Envelopes
All tools return the same structure, enabling auto-recovery:
```go
type ToolResponse struct {
OK bool `json:"ok"`
Data interface{} `json:"data,omitempty"`
Error *ToolError `json:"error,omitempty"`
Meta map[string]interface{} `json:"meta,omitempty"`
}
type ToolError struct {
Code string `json:"code"`
Message string `json:"message"`
Blocked bool `json:"blocked,omitempty"`
Failed bool `json:"failed,omitempty"`
Retryable bool `json:"retryable,omitempty"`
Details map[string]interface{} `json:"details,omitempty"`
}
```
### Error Codes
| Code | Meaning | Auto-Recoverable |
|------|---------|------------------|
| `STRICT_RESOLUTION` | Resource not discovered | Yes (discover then retry) |
| `FSM_BLOCKED` | FSM state prevents operation | Yes (perform required action) |
| `ROUTING_MISMATCH` | Wrong target host | Yes (use correct target) |
| `APPROVAL_REQUIRED` | User approval needed | Yes (wait for approval) |
| `NOT_FOUND` | Resource doesn't exist | No |
| `POLICY_BLOCKED` | Security policy blocked | No |
| `EXECUTION_FAILED` | Runtime error | Depends |
---
## 17. Telemetry & Metrics
**📁 Location:** `internal/ai/chat/metrics.go`
### Key Metrics Collected
```go
// Agentic loop iterations
metrics.RecordAgenticIteration(provider, model)
// FSM blocks
metrics.RecordFSMToolBlock(state, toolName, toolKind)
metrics.RecordFSMFinalBlock(state)
// Phantom detection
metrics.RecordPhantomDetected(provider, model)
// Auto-recovery
metrics.RecordAutoRecoveryAttempt(errorCode, toolName)
metrics.RecordAutoRecoverySuccess(errorCode, toolName)
// Routing mismatches
pulse_ai_routing_mismatch_block_total{tool, target_kind, child_kind}
```
---
## Files Reference
| File | Purpose |
|------|---------|
| `internal/ai/chat/service.go` | Chat service orchestration |
| `internal/ai/chat/session.go` | Session lifecycle management |
| `internal/ai/chat/agentic.go` | Core agentic loop (2289 lines) |
| `internal/ai/chat/fsm.go` | Finite state machine |
| `internal/ai/chat/types.go` | ResolvedContext, ResolvedResource |
| `internal/ai/chat/context_prefetch.go` | Proactive context gathering |
| `internal/ai/chat/knowledge_accumulator.go` | Fact caching |
| `internal/ai/chat/knowledge_extractor.go` | Deterministic fact extraction |
| `internal/ai/tools/tools_query.go` | Query/discovery tools, strict resolution |
| `internal/ai/tools/tools_read.go` | Read-only execution, intent classification |
| `internal/ai/tools/tools_control.go` | Write operations |
| `internal/ai/tools/protocol.go` | ToolResponse envelope |
| `internal/ai/approval/` | Approval flow management |
---
## Summary
Pulse Assistant is engineered with **safety as a first-class concern**:
1. **LLM as proposer, Go as enforcer** — the model suggests, code validates
2. **Proactive context** — understands resources before you ask
3. **Session-scoped learning** — remembers what it's discovered
4. **Workflow enforcement** — FSM prevents dangerous transitions
5. **Parallel execution** — efficient batch operations
6. **Hallucination detection** — phantom execution caught and handled
7. **Auto-recovery** — structured errors enable self-correction
8. **Routing validation** — prevents wrong-target mistakes
This architecture ensures the Assistant is both **powerful** (can control infrastructure) and **safe** (can't cause unintended damage).