Create reproduce-in-60-seconds.md

This commit is contained in:
PSBigBig + MiniPS 2026-03-30 19:51:48 +08:00 committed by GitHub
<!--
AI_NOTE_START
Document role:
This page is the shortest public rerun path for the current WFGY 4.0 Twin Atlas proof surface.
What this page is for:
1. Give readers the fastest possible way to test the current public surface themselves.
2. Point directly to the two TXT files required for the public rerun.
3. Clarify what the reader should compare between BEFORE and AFTER.
4. Route readers into deeper evidence pages only after the short path is clear.
What this page is not:
1. It is not the full methodology archive.
2. It is not the cleanest reviewer-facing protocol.
3. It is not a universal benchmark procedure.
4. It is not a replacement for results summary or raw runs.
Reading strategy:
1. Read the one-minute summary first.
2. Download or open the two TXT files.
3. Run them in order.
4. Compare BEFORE and AFTER using the guidance on this page.
5. Go to deeper evidence pages only if needed.
Important boundary:
This page is the shortest public rerun path for the current WFGY 4.0 proof surface.
It is designed for speed and clarity, not for exhaustive benchmark rigor.
AI_NOTE_END
-->
# ⚡ Reproduce in 60 Seconds
> The shortest public rerun path for the current WFGY 4.0 Twin Atlas proof surface.

This page exists for readers who do not want a long explanation first.
If you want to know whether the governance shift is visible, this is the fastest public path.
You need exactly two files:
- [Twin Atlas Runtime TXT](./prompts/wfgy-4_0-twin-atlas-runtime.txt)
- [Governance Stress Suite TXT](./prompts/wfgy-4_0-governance-stress-suite.txt)
---
## 🕒 One-minute summary
Do this in order:
1. open your target AI system
2. paste the Twin Atlas runtime TXT
3. paste the governance stress suite TXT
4. let the model complete both passes
5. compare the BEFORE pass and the AFTER pass
That is it.
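
For readers who script their reruns instead of pasting by hand, the five steps above can be sketched as a small helper. This is only an illustration: `ask` is a hypothetical placeholder for whatever chat API you use, and nothing below is part of the WFGY TXT files themselves; the sketch merely encodes the paste order, runtime first, stress suite second.

```python
# A minimal sketch of the rerun order, assuming a caller-supplied `ask`
# function that sends one prompt to your target AI system and returns
# its reply. `ask` is a placeholder, not part of WFGY itself.

def rerun(runtime_txt: str, suite_txt: str, ask) -> str:
    """Send the two public TXT files in order; return the suite output."""
    ask(runtime_txt)       # step 2: paste the Twin Atlas runtime TXT
    return ask(suite_txt)  # step 3: paste the governance stress suite TXT
```

Read the two TXT files from `./prompts/` and pass their contents in; the returned output contains the passes you compare in step 5 by hand.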

---
## 📂 The two files
### 1. Twin Atlas runtime
Use this first:
- [wfgy-4_0-twin-atlas-runtime.txt](./prompts/wfgy-4_0-twin-atlas-runtime.txt)
This loads the public WFGY 4.0 Twin Atlas runtime.
### 2. Governance stress suite
Use this second:
- [wfgy-4_0-governance-stress-suite.txt](./prompts/wfgy-4_0-governance-stress-suite.txt)
This runs the current public governance stress surface.

---
## 🧪 What you are actually checking
The question is not:

- “did the model become more polite?”
- “did the model become more cautious?”
- “did the answer get softer?”

The real question is:

**did the model stop turning plausibility into public conclusion too early?**
That is the core public test.
A strong AFTER pass should make at least one of these shifts visible:
- less illegal commitment
- less evidence-boundary crossing
- less single-cause compression
- less contradiction suppression
- more lawful downgrade
- stronger preservation of still-live ambiguity
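
If you want to record your own comparison, the six shift dimensions above can be kept as a simple checklist. This is only a note-taking aid under an obvious assumption: which shifts count as observed remains a manual judgment, and the `strong_after_pass` helper is hypothetical, not part of the public proof surface.

```python
# The six shift dimensions from the list above, kept as a checklist.
# Marking a shift as observed is a manual judgment, not automated.

SHIFTS = {
    "less illegal commitment",
    "less evidence-boundary crossing",
    "less single-cause compression",
    "less contradiction suppression",
    "more lawful downgrade",
    "stronger preservation of still-live ambiguity",
}

def strong_after_pass(observed: set) -> bool:
    """A strong AFTER pass makes at least one listed shift visible."""
    return bool(observed & SHIFTS)
```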
---
## 📌 What not to overclaim
This page is useful because it is fast.
That does **not** mean it is the cleanest reviewer-facing protocol or a universal benchmark.
A visible before / after shift is meaningful.
A repeated pattern across public runs is meaningful.
A reproducible TXT-based path is meaningful.
But none of those things automatically prove universal superiority in every domain or every future deployment environment.
This page is honest about that.

---
## 🔁 If you want a cleaner public rerun path
Use:
- [Advanced Clean Protocol](../evidence/advanced-clean-protocol.md)
Use that page if you want the reviewer-facing protocol rather than the shortest path.

---
## 🖼️ If you want to compare with the current public runs
Use:
- [AI Eval](./ai-eval.md)
- [Screenshots](./screenshots/)
- [Raw Runs](../evidence/raw-runs/)
- [Results Summary](../evidence/results-summary.md)
---
## 🧭 Where to go next
### If you want screenshot-first proof
- [AI Eval](./ai-eval.md)
### If you want aggregate interpretation
- [Results Summary](../evidence/results-summary.md)
### If you want original model outputs
- [Raw Runs](../evidence/raw-runs/)
### If you want the flagship case layer
- [Flagship Cases](../evidence/flagship-cases.md)