Update eval_playbook.md

PSBigBig × MiniPS 2026-02-26 16:07:02 +08:00 committed by GitHub
parent 0d5e04558c
commit fbe1f71dbf
GPG key ID: B5690EEEBB952194

@@ -16,6 +16,11 @@
> If you need the full triage and all prescriptions, return to the Emergency Room lobby.
</details>
> **Evaluation disclaimer (observability playbook)**
> The signals in this playbook come from specific logging and probing setups.
> They are tools for monitoring behavior, not proofs of safety or correctness on their own.
---
A compact playbook to **stabilize evaluation** and ensure results are reproducible.
Use this when metrics look inconsistent, coverage drifts, or benchmarks feel untrustworthy.