diff --git a/README.md b/README.md
index 5d58d04a..6c50f776 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,40 @@
+---
+
+
+ 🆕 GPT-5 vs GPT-5 + WFGY Benchmark (see how to rerun it yourself)
+
+
+
+WFGY_Win
+
+
+**Reproduce in 30 seconds**
+
+```text
+You are connected to a reasoning enhancement layer (WFGY).
+Your goal is to maximize accuracy across reasoning, knowledge recall,
+hallucination resistance, and multi-step logic.
+
+When answering, follow this exact reasoning process:
+1. Extract the question and all possible answers.
+2. Map each option to its semantic meaning, checking for ambiguity or logical traps.
+3. Cross-check against the uploaded PDF for relevant facts or principles.
+4. If no direct match, infer via multi-step reasoning before committing to an answer.
+5. Only output the final choice letter. Do not add explanations unless explicitly asked.
+```
+
+1. **Download WFGY PDF** → [WFGY PDF](https://zenodo.org/records/15630969)
+2. Upload the PDF to your LLM chat.
+3. Paste the prompt above and run any benchmark (GSM8K, TruthfulQA, etc.).
+
+That’s it: no retraining, no jailbreaks.
+
+
+
+
 ---
@@ -1381,3 +1415,4 @@ It treats alignment as a living semantic contract — not just accuracy, but mea
+
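
Review note: the reproduction steps added in the first hunk ask the model to output only the final choice letter, but they do not say how replies are scored. A minimal Python sketch of one way to score a multiple-choice run is shown here. The function names `extract_choice` and `accuracy`, the A to E option range, and the sample replies are all assumptions made for illustration; none of them come from the WFGY repository, and the sample data is not a benchmark result.

```python
from typing import Iterable, Optional

# Hypothetical helpers for scoring a multiple-choice run by hand.
# Assumption: options are labelled A-E and the model was told to
# reply with only the final choice letter.

def extract_choice(reply: str) -> Optional[str]:
    """Return the option letter if the reply is a bare letter, else None.

    Replies such as "b", " (B) " or "B." are accepted. Anything longer,
    e.g. "The answer is B", is rejected because it ignores the
    letter-only instruction in the prompt.
    """
    cleaned = reply.strip().strip("().:").upper()
    if len(cleaned) == 1 and cleaned in "ABCDE":
        return cleaned
    return None

def accuracy(replies: Iterable[str], gold: Iterable[str]) -> float:
    """Fraction of replies whose extracted letter matches the gold label."""
    pairs = list(zip(replies, gold))
    if not pairs:
        raise ValueError("no items to score")
    correct = sum(extract_choice(r) == g.upper() for r, g in pairs)
    return correct / len(pairs)

if __name__ == "__main__":
    # Placeholder replies for illustration only; not real model output.
    gold_labels = ["B", "D", "C", "B"]
    sample_replies = ["B", "The answer is D", "c.", "A"]
    print(f"accuracy: {accuracy(sample_replies, gold_labels):.2f}")
```

Running the same scoring function over a run with the PDF uploaded and a run without it keeps the comparison on equal footing. Treating non-conforming replies as wrong is a deliberately strict choice; a more lenient parser would change the numbers.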