Update eval_cross_agent_consistency.md

PSBigBig × MiniPS 2026-02-26 15:47:44 +08:00 committed by GitHub
parent 75d34ca5fd
commit 0d5e04558c
GPG key ID: B5690EEEBB952194


@@ -16,6 +16,11 @@
> If you need the full triage and all prescriptions, return to the Emergency Room lobby.
</details>
> **Evaluation disclaimer (cross agent consistency)**
> Agreement between agents is measured under chosen prompts and roles; agents can agree with each other and still be wrong in absolute terms.
> Consistency scores are diagnostic tools, not proof that the agreed answer is true or safe.
---
**Goal**
Measure and enforce agreement between two independent validators: a **Scholar** (claims/citations checker) and an **Auditor** (policy/provenance/constraints gate). Produce (1) quantitative agreement (Percent Agreement & Cohen's κ) and (2) a deterministic **conflict-resolution policy** for ship/no-ship.
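
The two agreement metrics named in the goal can be sketched as follows. This is a minimal illustration, not the project's own implementation: the function names, the `"pass"`/`"fail"` label vocabulary, and the `resolve` rule (ship only when both validators pass) are assumptions made for the example. Cohen's κ is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each validator's label frequencies.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of items on which both validators gave the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-corrected agreement between two label sequences."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of marginal label frequencies, summed over labels.
    p_e = sum(count_a[lab] * count_b[lab] for lab in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        # Both validators always emit the same single label; kappa is
        # undefined here, and returning 1.0 is a convention assumed for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def resolve(scholar: str, auditor: str) -> str:
    """Hypothetical deterministic policy: any non-pass verdict blocks shipping."""
    return "ship" if scholar == "pass" and auditor == "pass" else "no-ship"


# Hypothetical verdicts from the two validators on four items.
scholar_labels = ["pass", "pass", "fail", "fail"]
auditor_labels = ["pass", "fail", "fail", "fail"]
agreement = percent_agreement(scholar_labels, auditor_labels)
kappa = cohens_kappa(scholar_labels, auditor_labels)
```

On these sample verdicts the validators agree on 3 of 4 items, while chance agreement is 0.5, so κ is lower than the raw agreement rate. A conservative "both must pass" rule is only one possible deterministic policy; the actual policy may weight the Auditor or Scholar differently.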