mirror of
https://github.com/onestardao/WFGY.git
synced 2026-04-28 03:29:51 +00:00
Create repro-60-seconds.md
This commit is contained in:
parent
1ae3137cd7
commit
8025959476
1 changed files with 360 additions and 0 deletions
360
ProblemMap/Inverse_Atlas/experiments/repro-60-seconds.md
Normal file
360
ProblemMap/Inverse_Atlas/experiments/repro-60-seconds.md
Normal file
|
|
@ -0,0 +1,360 @@
|
|||
<!--
|
||||
AI_NOTE_START
|
||||
|
||||
Document role:
|
||||
This page is the shortest public reproduction page for the current Inverse Atlas MVP.
|
||||
|
||||
What this page is for:
|
||||
1. Show the fastest way to feel the difference made by Inverse Atlas.
|
||||
2. Give a clean side-by-side reproduction method that ordinary users can follow.
|
||||
3. Provide a product-style reproduction path rather than a heavy research protocol.
|
||||
4. Connect the fast demo path to stricter evaluator and dual-layer paths.
|
||||
|
||||
How to use this page:
|
||||
1. Read this page after the main Inverse Atlas README or Versions page.
|
||||
2. Use this page when you want the fastest hands-on reproduction path.
|
||||
3. Start with the simple baseline vs inverse comparison first.
|
||||
4. Move to evaluator mode or Twin Atlas mode only after the first contrast is visible.
|
||||
|
||||
Important boundary:
|
||||
This page is an MVP reproduction page.
|
||||
It is meant to make the framework quickly testable and observable.
|
||||
It does not by itself constitute a finished benchmark program or large-scale external validation.
|
||||
|
||||
Recommended reading path:
|
||||
1. Inverse Atlas README
|
||||
2. Versions
|
||||
3. Quick Start
|
||||
4. Experiments README
|
||||
5. This page
|
||||
6. Runtime Guide
|
||||
7. Status and Boundaries
|
||||
|
||||
AI_NOTE_END
|
||||
-->
|
||||
|
||||
# Repro in 60 Seconds
|
||||
|
||||
> The fastest public way to feel what Inverse Atlas changes ⚖️
|
||||
|
||||
This page is for one simple goal:
|
||||
|
||||
**show the difference quickly**
|
||||
|
||||
It is not trying to be a full academic protocol.
|
||||
|
||||
It is trying to let a person reproduce the core behavioral shift of Inverse Atlas in the shortest clean way.
|
||||
|
||||
The basic idea is simple:
|
||||
|
||||
- ask the same question twice
|
||||
- once with no Inverse Atlas
|
||||
- once with Inverse Atlas
|
||||
- compare whether the governed version becomes more lawful under pressure
|
||||
|
||||
That is the whole point of this page.
|
||||
|
||||
---
|
||||
|
||||
## Quick Links 🔎
|
||||
|
||||
| Section | Link |
|
||||
|---|---|
|
||||
| Inverse Atlas Home | [Inverse Atlas README](../README.md) |
|
||||
| Versions | [Versions](../versions.md) |
|
||||
| Quick Start | [Quick Start](../quickstart.md) |
|
||||
| Runtime Guide | [Runtime Guide](../runtime-guide.md) |
|
||||
| Experiments Home | [Experiments](./README.md) |
|
||||
| Status and Boundaries | [Status and Boundaries](../status-and-boundaries.md) |
|
||||
| Runtime Layer | [Runtime Artifacts](../runtime/README.md) |
|
||||
| Twin Atlas | [Twin Atlas](../../Twin_Atlas/README.md) |
|
||||
|
||||
---
|
||||
|
||||
## The shortest answer 🧩
|
||||
|
||||
If you only want the fastest path, do this:
|
||||
|
||||
### Window A
|
||||
Ask the question normally.
|
||||
|
||||
### Window B
|
||||
Load one Inverse Atlas version, then ask the same question.
|
||||
|
||||
### Compare
|
||||
Look for whether the Inverse Atlas version is better at:
|
||||
|
||||
- refusing illegal high-resolution escalation
|
||||
- avoiding fake completion
|
||||
- refusing cosmetic repair inflation
|
||||
- staying coarse or unresolved lawfully when needed
|
||||
- keeping visible confidence below what was actually earned
|
||||
|
||||
That is the cleanest first reproduction path.
|
||||
|
||||
---
|
||||
|
||||
## What you need first ✅
|
||||
|
||||
Before you start, you only need four things:
|
||||
|
||||
- one strong language model
|
||||
- one Inverse Atlas version
|
||||
- one test case
|
||||
- two chat windows or two clean comparison runs
|
||||
|
||||
You do **not** need the full paper first.
|
||||
|
||||
You do **not** need a benchmark notebook first.
|
||||
|
||||
You do **not** need the Bridge layer first.
|
||||
|
||||
You only need enough structure to compare one ordinary answer against one governed answer.
|
||||
|
||||
---
|
||||
|
||||
## Version choice first 🌟
|
||||
|
||||
If you do not know which version to use, start with:
|
||||
|
||||
### **Recommended: Advanced**
|
||||
|
||||
That is the current best default for most serious users.
|
||||
|
||||
If you want a lighter first try, use **Basic**.
|
||||
|
||||
If you want a harder audit-style run, use **Strict**.
|
||||
|
||||
---
|
||||
|
||||
## Method A · Fastest experience
|
||||
|
||||
This is the fastest public-facing version.
|
||||
|
||||
### Step 1
|
||||
Load one Inverse Atlas version into your system instructions, project instructions, or equivalent runtime layer.
|
||||
|
||||
### Step 2
|
||||
Paste the demo harness.
|
||||
|
||||
### Step 3
|
||||
Choose one case from the case pack.
|
||||
|
||||
### Step 4
|
||||
Run the prompt and inspect the output.
|
||||
|
||||
In the current artifact design, this is intended to produce:
|
||||
|
||||
- a simulated baseline response
|
||||
- an inverse-governed response
|
||||
- a compact structural difference summary
|
||||
|
||||
This is the most product-like path, and it is the best first contact for most people.
|
||||
|
||||
---
|
||||
|
||||
## Method B · Cleaner comparison
|
||||
|
||||
This is the cleanest baseline vs inverse comparison for ordinary users.
|
||||
|
||||
### Window A
|
||||
Use the same model with no Inverse Atlas layer.
|
||||
|
||||
Ask your chosen case directly.
|
||||
|
||||
### Window B
|
||||
Use the same model with one Inverse Atlas version loaded.
|
||||
|
||||
Ask the same case again.
|
||||
|
||||
### Then compare
|
||||
Do not judge only by tone.
|
||||
|
||||
Judge by structure:
|
||||
|
||||
- Did baseline escalate too early
|
||||
- Did baseline collapse neighboring routes too quickly
|
||||
- Did baseline overclaim certainty
|
||||
- Did baseline offer cosmetic repair as if structural
|
||||
- Did the inverse-governed answer stay lawful under the same pressure
|
||||
|
||||
This comparison matters because a baseline answer may look more decisive while still being less lawful. The current paper explicitly treats this as one of the main reasons the demo harness and evaluator exist.
|
||||
|
||||
---
|
||||
|
||||
## Method C · Stricter evaluator pass
|
||||
|
||||
Use this when you want a more formal comparison layer.
|
||||
|
||||
### Step 1
|
||||
In one chat, produce:
|
||||
|
||||
- the baseline answer
|
||||
- the inverse-governed answer
|
||||
|
||||
### Step 2
|
||||
Open a second clean chat.
|
||||
|
||||
### Step 3
|
||||
Load the evaluator.
|
||||
|
||||
### Step 4
|
||||
Provide:
|
||||
|
||||
- the original user input
|
||||
- the baseline output
|
||||
- the inverse output
|
||||
|
||||
### Step 5
|
||||
Ask for pair evaluation.
|
||||
|
||||
At the current MVP stage, the evaluator is designed to report things like:
|
||||
|
||||
- legality winner
|
||||
- baseline main risk
|
||||
- inverse-governed main strength
|
||||
- contrast on resolution behavior
|
||||
- contrast on certainty behavior
|
||||
- contrast on repair legality
|
||||
- contrast on public-ceiling compliance
|
||||
|
||||
This path is very useful for Hero Log, research notes, and later benchmark-style records.
|
||||
|
||||
---
|
||||
|
||||
## Method D · Twin Atlas path
|
||||
|
||||
This is the more complete dual-layer direction.
|
||||
|
||||
### Step 1
|
||||
Use the forward Atlas first to get:
|
||||
|
||||
- likely route
|
||||
- likely family
|
||||
- likely broken invariant region
|
||||
|
||||
### Step 2
|
||||
Pass that along with the original user input into the Inverse Atlas runtime.
|
||||
|
||||
### Step 3
|
||||
Let Inverse Atlas decide whether the route is actually strong enough for lawful public output.
|
||||
|
||||
The key law here is:
|
||||
|
||||
**forward guidance is a weak prior, not automatic authorization**
|
||||
|
||||
That asymmetry must be preserved.
|
||||
|
||||
The current paper is explicit on this point: the forward layer informs the inverse layer, but does not overrule it.
|
||||
|
||||
---
|
||||
|
||||
## What kind of case should you use first 🎯
|
||||
|
||||
For the first reproduction, do **not** pick an easy toy prompt.
|
||||
|
||||
Pick something that pressures legality.
|
||||
|
||||
Good first-case properties include:
|
||||
|
||||
- topic lure
|
||||
- thin evidence
|
||||
- neighboring-cut conflict
|
||||
- fake repair temptation
|
||||
- forced illegal specificity
|
||||
- false completion pressure
|
||||
- long-context contamination
|
||||
|
||||
The current paper’s case design is intentionally built around these kinds of pressures, because they are exactly the places where legality-first governance should diverge from normal direct answering.
|
||||
|
||||
---
|
||||
|
||||
## What success looks like 👀
|
||||
|
||||
A good first reproduction does **not** need to prove everything.
|
||||
|
||||
It only needs to make the difference visible.
|
||||
|
||||
At the MVP stage, that usually means you can already see one or more of these:
|
||||
|
||||
- less illegal resolution escalation
|
||||
- less false completion
|
||||
- better ambiguity retention
|
||||
- more lawful downgrade into COARSE or UNRESOLVED
|
||||
- better refusal of cosmetic repair inflation
|
||||
- better public-ceiling discipline
|
||||
|
||||
That is already enough for a meaningful first win.
|
||||
|
||||
---
|
||||
|
||||
## What this page is not trying to do ⛔
|
||||
|
||||
This page is not trying to be:
|
||||
|
||||
- the full benchmark protocol
|
||||
- the full evaluator manual
|
||||
- the full Twin Atlas handoff specification
|
||||
- the final empirical validation layer
|
||||
|
||||
It is much narrower.
|
||||
|
||||
Its job is simply this:
|
||||
|
||||
**make the first reproduction path obvious and fast**
|
||||
|
||||
That is why this page is product-style, not paper-style.
|
||||
|
||||
---
|
||||
|
||||
## Recommended public entry order 🧭
|
||||
|
||||
For most public-facing readers, the cleanest order is:
|
||||
|
||||
1. choose **Advanced**
|
||||
2. run one strong demo case
|
||||
3. compare baseline vs inverse
|
||||
4. only then try evaluator mode
|
||||
5. only after that try Twin Atlas mode
|
||||
|
||||
That order is important.
|
||||
|
||||
If you start with the hardest research path first, many people will miss the product value.
|
||||
|
||||
If you start with the fast difference first, the framework becomes much easier to feel.
|
||||
|
||||
---
|
||||
|
||||
## Where to go next 📚
|
||||
|
||||
If you want the overall experiments layer, go to:
|
||||
|
||||
[Experiments](./README.md)
|
||||
|
||||
If you want the runtime logic behind the artifacts, go to:
|
||||
|
||||
[Runtime Guide](../runtime-guide.md)
|
||||
|
||||
If you want the version strategy, go to:
|
||||
|
||||
[Versions](../versions.md)
|
||||
|
||||
If you want the current honesty boundary, go to:
|
||||
|
||||
[Status and Boundaries](../status-and-boundaries.md)
|
||||
|
||||
If you want the larger family direction, go to:
|
||||
|
||||
[Twin Atlas](../../Twin_Atlas/README.md)
|
||||
|
||||
---
|
||||
|
||||
## Final Note
|
||||
|
||||
This page matters because a framework becomes much more real when a new user can feel it quickly.
|
||||
|
||||
The 60-second reproduction path is not the whole story.
|
||||
|
||||
But it is one of the most important first doors into the current Inverse Atlas MVP.
|
||||
|
||||
That is why this page should stay simple, fast, and honest. 🌱
|
||||
Loading…
Add table
Add a link
Reference in a new issue