rcourtman
|
b2e0ae3fdb
|
Add ExecutionIntent classification and NonInteractiveOnly enforcement
Implement safety layers for command execution:
ExecutionIntent classifies commands as:
- ObservationOnly: Pure read (status, logs, metrics)
- SideEffects: May change state (restart, write, delete)
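
Illustrative sketch only, not the code from this commit: one way the two-way classification above could work, assuming a conservative default where anything not known to be read-only counts as SideEffects. The names `classify_intent`, `READ_ONLY_PROGRAMS`, and `READ_ONLY_SUBCOMMANDS`, and the contents of those lists, are hypothetical.

```python
import shlex
from enum import Enum


class ExecutionIntent(Enum):
    OBSERVATION_ONLY = "ObservationOnly"
    SIDE_EFFECTS = "SideEffects"


# Hypothetical allow-lists; the real project may use different rules entirely.
READ_ONLY_PROGRAMS = {"cat", "ls", "df", "uptime", "ps", "free"}
READ_ONLY_SUBCOMMANDS = {("systemctl", "status"), ("docker", "ps"), ("docker", "logs")}
# Shell operators that can chain or redirect into a state change.
SHELL_OPERATORS = (">", "|", ";", "&", "`", "$(")


def classify_intent(command: str) -> ExecutionIntent:
    """Classify a command, defaulting to SIDE_EFFECTS when unsure."""
    if any(op in command for op in SHELL_OPERATORS):
        return ExecutionIntent.SIDE_EFFECTS
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: treat as unsafe.
        return ExecutionIntent.SIDE_EFFECTS
    if not tokens:
        return ExecutionIntent.SIDE_EFFECTS
    if tokens[0] in READ_ONLY_PROGRAMS:
        return ExecutionIntent.OBSERVATION_ONLY
    if len(tokens) >= 2 and (tokens[0], tokens[1]) in READ_ONLY_SUBCOMMANDS:
        return ExecutionIntent.OBSERVATION_ONLY
    return ExecutionIntent.SIDE_EFFECTS
```

Defaulting to SideEffects means an unrecognized command is treated as state-changing, so a gap in the allow-list fails safe.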
NonInteractiveOnly enforces safe command forms:
- Blocks interactive commands (vim, top without -b, etc.)
- Blocks unbounded streaming (tail -f without limit)
- Suggests safe alternatives in error messages
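
Illustrative sketch only, not the code from this commit: the three NonInteractiveOnly rules above expressed as a single check that returns an error message with a suggested alternative. `check_non_interactive`, `INTERACTIVE_PROGRAMS`, and the hint wording are hypothetical. As a simplification, this sketch rejects every direct `tail -f`; a bounded form would have to be wrapped in another program.

```python
import shlex
from typing import Optional

# Hypothetical list of programs that need a TTY, with a suggested alternative.
INTERACTIVE_PROGRAMS = {
    "vim": "use cat or sed -n to read the file",
    "nano": "use cat or sed -n to read the file",
    "less": "use cat or head -n",
}


def check_non_interactive(command: str) -> Optional[str]:
    """Return None if the command is allowed, else an error message."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "blocked: could not parse command"
    if not tokens:
        return "blocked: empty command"
    prog, args = tokens[0], tokens[1:]

    if prog in INTERACTIVE_PROGRAMS:
        return f"blocked: {prog} is interactive; {INTERACTIVE_PROGRAMS[prog]}"

    if prog == "top":
        # Accept -b on its own or inside a short-flag cluster such as -bn1.
        has_batch = any(
            a.startswith("-") and not a.startswith("--") and "b" in a[1:]
            for a in args
        )
        if not has_batch:
            return "blocked: top needs batch mode; try: top -b -n 1"

    if prog == "tail" and any(a in ("-f", "-F", "--follow") for a in args):
        return "blocked: tail -f streams without bound; try: tail -n 100 <file>"

    return None
```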
Add phantom execution detection:
- Catches when model claims actions without using tools
- Skips check when tools actually succeeded (fixes false positives)
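
Illustrative sketch only, not the code from this commit: a phantom execution check that flags a reply claiming an action when no tool call succeeded, and skips the check when one did. `ToolCall`, `ACTION_CLAIM`, and `is_phantom_execution` are hypothetical names, and the regex is a deliberately small stand-in for whatever heuristics the project uses.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ToolCall:
    name: str
    succeeded: bool


# Hypothetical pattern for first-person claims of a completed action.
ACTION_CLAIM = re.compile(
    r"\bI(?:'ve| have)? (?:just )?"
    r"(restarted|deleted|stopped|started|updated|installed|removed)\b",
    re.IGNORECASE,
)


def is_phantom_execution(reply: str, tool_calls: List[ToolCall]) -> bool:
    """True when the reply claims an action but no tool call succeeded."""
    # A successful tool call backs the claim, so skip the check entirely.
    if any(call.succeeded for call in tool_calls):
        return False
    return ACTION_CLAIM.search(reply) is not None
```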
Includes comprehensive tests for:
- Intent classification accuracy
- Interactive command blocking
- Strict resolution validation
|
2026-01-28 16:49:00 +00:00 |
|