Create repro-60-seconds.md

This commit is contained in:
PSBigBig + MiniPS 2026-03-24 11:04:49 +08:00 committed by GitHub
parent 1ae3137cd7
commit 8025959476
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -0,0 +1,360 @@
<!--
AI_NOTE_START
Document role:
This page is the shortest public reproduction page for the current Inverse Atlas MVP.
What this page is for:
1. Show the fastest way to feel the difference made by Inverse Atlas.
2. Give a clean side-by-side reproduction method that ordinary users can follow.
3. Provide a product-style reproduction path rather than a heavy research protocol.
4. Connect the fast demo path to stricter evaluator and dual-layer paths.
How to use this page:
1. Read this page after the main Inverse Atlas README or Versions page.
2. Use this page when you want the fastest hands-on reproduction path.
3. Start with the simple baseline vs inverse comparison first.
4. Move to evaluator mode or Twin Atlas mode only after the first contrast is visible.
Important boundary:
This page is an MVP reproduction page.
It is meant to make the framework quickly testable and observable.
It does not by itself constitute a finished benchmark program or large-scale external validation.
Recommended reading path:
1. Inverse Atlas README
2. Versions
3. Quick Start
4. Experiments README
5. This page
6. Runtime Guide
7. Status and Boundaries
AI_NOTE_END
-->
# Repro in 60 Seconds
> The fastest public way to feel what Inverse Atlas changes ⚖️
This page is for one simple goal:
**show the difference quickly**
It is not trying to be a full academic protocol.
It is trying to let a person reproduce the core behavioral shift of Inverse Atlas in the shortest clean way.
The basic idea is simple:
- ask the same question twice
- once with no Inverse Atlas
- once with Inverse Atlas
- compare whether the governed version becomes more lawful under pressure
That is the whole point of this page.
---
## Quick Links 🔎
| Section | Link |
|---|---|
| Inverse Atlas Home | [Inverse Atlas README](../README.md) |
| Versions | [Versions](../versions.md) |
| Quick Start | [Quick Start](../quickstart.md) |
| Runtime Guide | [Runtime Guide](../runtime-guide.md) |
| Experiments Home | [Experiments](./README.md) |
| Status and Boundaries | [Status and Boundaries](../status-and-boundaries.md) |
| Runtime Layer | [Runtime Artifacts](../runtime/README.md) |
| Twin Atlas | [Twin Atlas](../../Twin_Atlas/README.md) |
---
## The shortest answer 🧩
If you only want the fastest path, do this:
### Window A
Ask the question normally.
### Window B
Load one Inverse Atlas version, then ask the same question.
### Compare
Look for whether the Inverse Atlas version is better at:
- refusing illegal high-resolution escalation
- avoiding fake completion
- refusing cosmetic repair inflation
- staying coarse or unresolved lawfully when needed
- keeping visible confidence below what was actually earned
That is the cleanest first reproduction path.
---
## What you need first ✅
Before you start, you only need four things:
- one strong language model
- one Inverse Atlas version
- one test case
- two chat windows or two clean comparison runs
You do **not** need the full paper first.
You do **not** need a benchmark notebook first.
You do **not** need the Bridge layer first.
You only need enough structure to compare one ordinary answer against one governed answer.
---
## Version choice first 🌟
If you do not know which version to use, start with:
### **Recommended: Advanced**
That is the current best default for most serious users.
If you want a lighter first try, use **Basic**.
If you want a harder audit-style run, use **Strict**.
---
## Method A · Fastest experience
This is the fastest public-facing version.
### Step 1
Load one Inverse Atlas version into your system instructions, project instructions, or equivalent runtime layer.
### Step 2
Paste the demo harness.
### Step 3
Choose one case from the case pack.
### Step 4
Run the prompt and inspect the output.
In the current artifact design, this is intended to produce:
- a simulated baseline response
- an inverse-governed response
- a compact structural difference summary
This is the most product-like path, and it is the best first contact for most people.
---
## Method B · Cleaner comparison
This is the cleanest baseline vs inverse comparison for ordinary users.
### Window A
Use the same model with no Inverse Atlas layer.
Ask your chosen case directly.
### Window B
Use the same model with one Inverse Atlas version loaded.
Ask the same case again.
### Then compare
Do not judge only by tone.
Judge by structure:
- Did baseline escalate too early
- Did baseline collapse neighboring routes too quickly
- Did baseline overclaim certainty
- Did baseline offer cosmetic repair as if structural
- Did the inverse-governed answer stay lawful under the same pressure
This comparison matters because a baseline answer may look more decisive while still being less lawful. The current paper explicitly treats this as one of the main reasons the demo harness and evaluator exist.
---
## Method C · Stricter evaluator pass
Use this when you want a more formal comparison layer.
### Step 1
In one chat, produce:
- the baseline answer
- the inverse-governed answer
### Step 2
Open a second clean chat.
### Step 3
Load the evaluator.
### Step 4
Provide:
- the original user input
- the baseline output
- the inverse output
### Step 5
Ask for pair evaluation.
At the current MVP stage, the evaluator is designed to report things like:
- legality winner
- baseline main risk
- inverse-governed main strength
- contrast on resolution behavior
- contrast on certainty behavior
- contrast on repair legality
- contrast on public-ceiling compliance
This path is very useful for Hero Log, research notes, and later benchmark-style records.
---
## Method D · Twin Atlas path
This is the more complete dual-layer direction.
### Step 1
Use the forward Atlas first to get:
- likely route
- likely family
- likely broken invariant region
### Step 2
Pass that along with the original user input into the Inverse Atlas runtime.
### Step 3
Let Inverse Atlas decide whether the route is actually strong enough for lawful public output.
The key law here is:
**forward guidance is a weak prior, not automatic authorization**
That asymmetry must be preserved.
The current paper is explicit on this point: the forward layer informs the inverse layer, but does not overrule it.
---
## What kind of case should you use first 🎯
For the first reproduction, do **not** pick an easy toy prompt.
Pick something that pressures legality.
Good first-case properties include:
- topic lure
- thin evidence
- neighboring-cut conflict
- fake repair temptation
- forced illegal specificity
- false completion pressure
- long-context contamination
The current papers case design is intentionally built around these kinds of pressures, because they are exactly the places where legality-first governance should diverge from normal direct answering.
---
## What success looks like 👀
A good first reproduction does **not** need to prove everything.
It only needs to make the difference visible.
At the MVP stage, that usually means you can already see one or more of these:
- less illegal resolution escalation
- less false completion
- better ambiguity retention
- more lawful downgrade into COARSE or UNRESOLVED
- better refusal of cosmetic repair inflation
- better public-ceiling discipline
That is already enough for a meaningful first win.
---
## What this page is not trying to do ⛔
This page is not trying to be:
- the full benchmark protocol
- the full evaluator manual
- the full Twin Atlas handoff specification
- the final empirical validation layer
It is much narrower.
Its job is simply this:
**make the first reproduction path obvious and fast**
That is why this page is product-style, not paper-style.
---
## Recommended public entry order 🧭
For most public-facing readers, the cleanest order is:
1. choose **Advanced**
2. run one strong demo case
3. compare baseline vs inverse
4. only then try evaluator mode
5. only after that try Twin Atlas mode
That order is important.
If you start with the hardest research path first, many people will miss the product value.
If you start with the fast difference first, the framework becomes much easier to feel.
---
## Where to go next 📚
If you want the overall experiments layer, go to:
[Experiments](./README.md)
If you want the runtime logic behind the artifacts, go to:
[Runtime Guide](../runtime-guide.md)
If you want the version strategy, go to:
[Versions](../versions.md)
If you want the current honesty boundary, go to:
[Status and Boundaries](../status-and-boundaries.md)
If you want the larger family direction, go to:
[Twin Atlas](../../Twin_Atlas/README.md)
---
## Final Note
This page matters because a framework becomes much more real when a new user can feel it quickly.
The 60-second reproduction path is not the whole story.
But it is one of the most important first doors into the current Inverse Atlas MVP.
That is why this page should stay simple, fast, and honest. 🌱