Update README.md

PSBigBig × MiniPS 2026-02-11 23:56:38 +08:00 committed by GitHub
parent 0e50a61f1d
commit 22bd17e79a


@ -165,11 +165,29 @@ Use this when automated verification is unavailable, or when you want to confirm
<br/>
### 2. TU Q124 · Scalable oversight and evaluation (MVP)
### 2. TU Q121 · Single-agent alignment tension (MVP)
One-cell Colab demo for TU Q121 at the effective layer.
The notebook defines two personas on the same base model (literal helper vs aligned helper),
runs them on a small set of SAFE / UNSAFE scenarios, and uses a scalar observable `T_align`
to measure alignment tension from judged risk and helpfulness scores.
- No API key: scroll through the header, charter, scenarios, and screenshots as a worked example.
- With API key: paste a key once, run the single cell, and reproduce the full table and `T_align` plots.
- **Q121-A · Literal helper vs aligned helper**
[Open in Colab](https://colab.research.google.com/github/onestardao/WFGY/blob/main/TensionUniverse/Experiments/Q121_MVP/Q121_A.ipynb)
[Experiment README](https://github.com/onestardao/WFGY/blob/main/TensionUniverse/Experiments/Q121_MVP/README.md)
<br/>
### 3. TU Q124 · Scalable oversight and evaluation (MVP)
One-cell Colab demo for TU Q124 at the effective layer.
The notebook prints the setup, defines a small set of synthetic oversight cases, then (optionally) asks for an API key and runs baseline vs guided evaluators with a single tension observable `T_oversight`.
The notebook prints the setup, defines a small set of synthetic oversight cases, then (optionally)
asks for an API key and runs baseline vs guided evaluators with a single tension observable `T_oversight`.
- No API key: just read the header and code as a worked example.
- With API key: run the full experiment and see tables plus a tension plot.
@ -180,10 +198,33 @@ The notebook prints the setup, defines a small set of synthetic oversight cases,
<br/>
### 3. TU Q130 · Early effective-layer experiments
### 4. TU Q127 · Synthetic worlds and data entropy (MVP)
One-cell Colab demo for TU Q127 at the effective layer.
The notebook builds three tiny synthetic worlds with different class balance / label noise,
trains a small MLP per world, and computes a scalar observable `T_entropy(train → test)`
that behaves like a simple “world detector” over the cross-world grid.
- No API key: the whole run is offline. Just hit “Run all” to generate the summary table
and the `T_entropy` heatmap, then compare your pattern to the README screenshots.
- With API key: not required for the current MVP, but the script can be extended if you
want to add LLM-based diagnostics.
- **Q127-A · Synthetic worlds entropy gauge**
[Open in Colab](https://colab.research.google.com/github/onestardao/WFGY/blob/main/TensionUniverse/Experiments/Q127_MVP/Q127_A.ipynb)
[Experiment README](https://github.com/onestardao/WFGY/blob/main/TensionUniverse/Experiments/Q127_MVP/README.md)
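A minimal sketch of the kind of cross-world grid described above, not the notebook's actual code. Everything in it is an assumption made for illustration: the 1-D Gaussian worlds, the `(p_pos, noise)` parameters, the logistic-regression model (swapped in for the small MLP to keep the example dependency-free), and the definition of the gap as the excess test cross-entropy of the world-`i` model over the world-`j` model on world `j`'s test set. The notebook's own definition of `T_entropy` may differ.

```python
import math
import random


def make_world(n, p_pos, noise, rng):
    """Sample (x, y) pairs for one hypothetical world.

    x ~ N(+1, 1) for class 1 and N(-1, 1) for class 0;
    p_pos sets the class balance, noise is the label-flip probability.
    """
    data = []
    for _ in range(n):
        y = 1 if rng.random() < p_pos else 0
        x = rng.gauss(1.0 if y else -1.0, 1.0)
        if rng.random() < noise:
            y = 1 - y  # label noise: flip the observed label
        data.append((x, y))
    return data


def sigmoid(z):
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(data, steps=200, lr=0.5):
    """Fit 1-D logistic regression with full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def cross_entropy(model, data):
    """Mean binary cross-entropy of `model` on `data`, in nats."""
    w, b = model
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def tension_grid(worlds, n_train=300, n_test=300, seed=0):
    """Hypothetical gap grid: grid[i][j] = CE(model_i, test_j) - CE(model_j, test_j)."""
    rng = random.Random(seed)
    train_sets = [make_world(n_train, p, e, rng) for p, e in worlds]
    test_sets = [make_world(n_test, p, e, rng) for p, e in worlds]
    models = [train(d) for d in train_sets]
    grid = []
    for model in models:
        row = []
        for j, test in enumerate(test_sets):
            row.append(cross_entropy(model, test) - cross_entropy(models[j], test))
        grid.append(row)
    return grid


# Three illustrative worlds as (p_pos, noise): balanced, imbalanced, noisy.
WORLDS = [(0.5, 0.0), (0.9, 0.0), (0.5, 0.3)]
GRID = tension_grid(WORLDS)

for row in GRID:
    print("  ".join(f"{v:+.3f}" for v in row))
```

Under this hypothetical definition the diagonal is zero by construction, and a larger off-diagonal entry suggests that the training world and the test world differ more in class balance or label noise.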
<br/>
### 5. TU Q130 · Early effective-layer experiments
Early TU Q130 experiments.
Each notebook is a single-cell script: install deps, explain the setup, ask for an API key in Colab, then run and print tables / plots. No fine-tuning, only encoding and scoring changes.
Each notebook is a single-cell script: install deps, explain the setup, ask for an API key in Colab,
then run and print tables / plots. No fine-tuning, only encoding and scoring changes.
- **Q130-A · OOD tension gauge (Hollywood vs Physics)**
[Open in Colab](https://colab.research.google.com/github/onestardao/WFGY/blob/main/TensionUniverse/Experiments/Q130_MVP/Q130_A.ipynb)
@ -199,7 +240,6 @@ More TU MVP experiments will be added here as the Tension Universe program grows
</details>
---
## Tension Universe · BlackHole S problem collection