Add ML-based prompt injection detection (#5623)

This commit is contained in:
dorien-koelemeijer 2026-01-08 11:55:59 +10:00 committed by GitHub
parent 01da90c9b3
commit 9dc548ee2f
GPG key ID: B5690EEEBB952194
10 changed files with 806 additions and 394 deletions


@@ -47,6 +47,8 @@ The following settings can be configured at the root level of your config.yaml file:
| `otel_exporter_otlp_timeout` | Export timeout in milliseconds for [observability](/docs/guides/environment-variables#opentelemetry-protocol-otlp) | Integer (ms) | 10000 | No |
| `SECURITY_PROMPT_ENABLED` | Enable [prompt injection detection](/docs/guides/security/prompt-injection-detection) to identify potentially harmful commands | true/false | false | No |
| `SECURITY_PROMPT_THRESHOLD` | Sensitivity threshold for [prompt injection detection](/docs/guides/security/prompt-injection-detection) (higher = stricter) | Float between 0.01 and 1.0 | 0.7 | No |
<!-- | `SECURITY_PROMPT_CLASSIFIER_ENABLED` | Enable ML-based prompt injection detection for advanced threat identification | true/false | false | No | -->
<!-- | `SECURITY_PROMPT_CLASSIFIER_MODEL` | Specify the BERT ML model to use for prompt injection detection | String | "ProtectAI DeBERTa" | No | -->
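As a minimal sketch of how the two documented settings might be combined at the root level of config.yaml (the `0.8` value is an illustrative choice, not a recommendation):

```yaml
# Enable prompt injection detection (default: false)
SECURITY_PROMPT_ENABLED: true
# Sensitivity threshold, valid range 0.01–1.0; higher = stricter (default: 0.7)
SECURITY_PROMPT_THRESHOLD: 0.8
```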
## Experimental Features