Update README.md

2026-04-28 11:40:07 +00:00 · 2026-03-27 17:14:30 +08:00 · 2026-03-27 17:14:30 +08:00 · b05c8c428a
commit b05c8c428a
parent 220c7b037d
1 changed files with 168 additions and 507 deletions
--- a/ProblemMap/Inverse_Atlas/experiments/README.md
+++ b/ProblemMap/Inverse_Atlas/experiments/README.md
@ -35,578 +35,239 @@ Recommended reading path:
 AI_NOTE_END
 -->

-# Showcase Cases 🌟🧪
+# 🧪 Experiments Hub

-> The fastest high-value cases for seeing what Inverse Atlas actually changes
+> The public entry point for reproduction, showcase, comparison, and current findings in the Inverse Atlas experiments layer

-This page highlights a small number of representative showcase cases from the current Inverse Atlas case pack.
+This page is the main entry point for the **experiments layer** of Inverse Atlas.

-The point is not to show every case at once.
+Its job is simple:

-The point is to show the **best public examples first**.
+- show what the current public experiment surface contains
+- help readers choose the right path quickly
+- separate showcase material from reproduction material
+- connect demo, evaluator, case design, and current findings into one readable flow

-A good showcase case should do at least three things well:
+This page is **not** the full benchmark report.

- pressure a real legality boundary
- reveal a visible difference between direct answering and inverse-governed answering
- teach the reader what the framework is actually regulating
+It is the folder-level guide that helps a reader understand:

-That is why this page is selective.
-
-It is designed to help a new reader move from:
-
-“this sounds interesting”
-
-to
-
-“okay, now I can actually see what it is doing”
+- where to start
+- what each experiment page does
+- which page is for fast product feeling
+- which page is for reproduction
+- which page is for design rationale
+- which page is for current results

 ---

-## Quick Links 🔎
+## 🧭 What This Experiments Layer Is For

-| Section | Link |
-|---|---|
-| Inverse Atlas Home | [Inverse Atlas README](../README.md) |
-| FAQ | [FAQ](../FAQ.md) |
-| Versions | [Versions](../versions.md) |
-| Runtime Guide | [Runtime Guide](../runtime-guide.md) |
-| Experiments Home | [Experiments](./README.md) |
-| Repro in 60 Seconds | [Repro in 60 Seconds](./repro-60-seconds.md) |
-| Phase Overview | [Phase Overview](./phase-overview.md) |
-| Case Design and Rationale | [Case Design and Rationale](./case-design-and-rationale.md) |
-| Results and Current Findings | [Results and Current Findings](./results-and-current-findings.md) |
-| Case Pack | [Inverse Atlas Cases](../runtime/inverse-cases.txt) |
-| Demo Harness | [Inverse Atlas Demo Harness](../runtime/inverse-demo.txt) |
-| Evaluator | [Inverse Atlas Evaluator](../runtime/inverse-eval.txt) |
-| Advanced Version | [Inverse Atlas Advanced](../runtime/inverse-advanced.txt) |
-| Basic Version | [Inverse Atlas Basic](../runtime/inverse-basic.txt) |
-| Strict Version | [Inverse Atlas Strict](../runtime/inverse-strict.txt) |
-| WFGY 4.0 Entry | [Twin Atlas](../../Twin_Atlas/README.md) |
+The experiments layer exists to make Inverse Atlas inspectable in public.
+
+It does not exist only to say the method sounds interesting.
+
+It exists so readers can:
+
+- run a fast contrast
+- inspect representative showcase cases
+- understand why the cases were designed this way
+- see what the evaluator is checking
+- follow the current results and boundaries honestly
+
+At a high level, this layer turns Inverse Atlas from:
+
+“a promising methodology description”
+
+into
+
+“a public artifact surface that can be reproduced, compared, questioned, and improved”

 ---

-## The shortest answer 🧩
+## ⚡ Fastest Entry Routes

-If you only want the best public entry order, use this:
+Different readers need different paths.

-1. **Topic Lure Exact Diagnosis**
-2. **Cosmetic Repair Bait**
-3. **Neighboring-Cut Conflict**
-4. **Illegal Resolution Demand**
-5. **Thin Evidence, Forced Confidence**
-6. **Long-Context Contamination**
+### Option A · Fastest product feeling
+Use this if you want the shortest path to “okay, I can see what this changes.”

-That order works well because it moves from fast intuitive contrast toward deeper governance pressure.
+1. Start with [Inverse Atlas Advanced](../runtime/inverse-advanced.txt)
+2. Use the [Demo Harness](../runtime/inverse-demo.txt)
+3. Open [Showcase Cases](./showcase-cases.md)
+4. Start with one high-contrast case
+5. Use the [Evaluator](../runtime/inverse-eval.txt) only after the contrast is already visible

-In simple terms:
+### Option B · Reproduction first
+Use this if you want the cleanest reproducible route.

- first show lexical lure
- then show fake repair
- then show contested routing
- then show forced illegal granularity
- then show evidence weakness
- then show contamination across turns
+1. Read [Reproduce in 60 Seconds](./repro-60-seconds.md)
+2. Open the [Case Pack](../runtime/inverse-cases.txt)
+3. Use the [Demo Harness](../runtime/inverse-demo.txt)
+4. Compare outputs with the [Evaluator](../runtime/inverse-eval.txt)
+
+### Option C · Understand the logic first
+Use this if you want the experimental logic before running anything.
+
+1. Read [Phase Overview](./phase-overview.md)
+2. Read [Case Design and Rationale](./case-design-and-rationale.md)
+3. Read [Showcase Cases](./showcase-cases.md)
+4. Continue to [Results and Current Findings](./results-and-current-findings.md)

 ---

-## How to use this page 🚀
+## 🧩 What Is Inside This Folder

-For most new readers, the cleanest path is:
+### 1. [Showcase Cases](./showcase-cases.md)
+The fastest high-value examples.

-### Option A · Best first impression
-Use [Inverse Atlas Advanced](../runtime/inverse-advanced.txt) with the [Demo Harness](../runtime/inverse-demo.txt), then run one of the cases below.
+Use this page when you want the most visible public examples first.  
+It is intentionally selective.  
+It highlights the cases that reveal the framework quickly and clearly.

-### Option B · Cleaner side-by-side contrast
-Run the same case twice:
+### 2. [Reproduce in 60 Seconds](./repro-60-seconds.md)
+The shortest practical reproduction route.

- once with no Inverse Atlas layer
- once with one Inverse Atlas version
+Use this page when you want a clean first run without reading the full experiment layer.

-Then compare the outputs structurally.
+### 3. [Case Design and Rationale](./case-design-and-rationale.md)
+Why these cases exist and what they pressure.

-### Option C · Formal comparison
-After generating baseline and inverse-governed outputs, use the [Evaluator](../runtime/inverse-eval.txt) for pair evaluation.
+Use this page when you want to understand the structural reason each case was chosen.

-If you do not know which version to use first, start with **Advanced**.
+### 4. [Phase Overview](./phase-overview.md)
+The current experiment structure by phase.

-Use **Basic** if you want the easiest onboarding surface.
+Use this page when you want the big picture of how the current MVP experiment layer is organized.

-Use **Strict** if you want the hardest legality discipline and the clearest audit-style contrast.
+### 5. [Evidence Snapshot](./evidence-snapshot.md)
+A compact public-facing view of what the experiment layer currently supports.
+
+Use this page when you want a quick evidence-oriented overview.
+
+### 6. [Results and Current Findings](./results-and-current-findings.md)
+What the current public experiments suggest so far.
+
+Use this page when you want the current findings without pretending the work is already finished.
+
+### 7. [Case Studies](./case-studies/README.md)
+Longer case-specific walkthroughs.
+
+Use this section when you want deeper explanation page by page.

 ---

-## What makes a good showcase case 👀
+## 🔍 Core Experiment Surfaces

-A good showcase case is not just “hard.”
+These pages are supported by a small set of public runtime artifacts.

-A good showcase case pressures one or more of the following:
+### Runtime surfaces
+- [Inverse Atlas Advanced](../runtime/inverse-advanced.txt)
+- [Inverse Atlas Basic](../runtime/inverse-basic.txt)
+- [Inverse Atlas Strict](../runtime/inverse-strict.txt)
+- [Runtime Guide](../runtime-guide.md)
+- [Runtime Notes](../runtime/README.md)
+
+### Comparison surfaces
+- [Demo Harness](../runtime/inverse-demo.txt)
+- [Evaluator](../runtime/inverse-eval.txt)
+- [Case Pack](../runtime/inverse-cases.txt)
+
+These are the main public building blocks for current experiments.
+
+A simple way to think about them is:
+
+- the **runtime** changes the answer conditions
+- the **demo harness** gives the comparison workflow
+- the **case pack** provides pressure prompts
+- the **evaluator** checks legality rather than swagger
+
+---
+
+## 🧠 What The Current Experiments Are Pressuring
+
+The current experiments are not random difficulty samples.
+
+They are built to pressure specific legality and governance boundaries, especially:

 - lexical lure
 - weak evidence
- route competition
+- neighboring-cut conflict
 - cosmetic repair temptation
- user-forced illegal specificity
+- forced illegal specificity
+- public-ceiling violation
 - long-context contamination

-The current case pack was explicitly built around those pressures, and this showcase page simply selects the cases that are most likely to make the difference visible quickly.
+That means the experiment layer is not only asking:
+
+“can the model answer?”
+
+It is asking:
+
+“does the system earn the right to answer this strongly, this narrowly, and this publicly?”

 ---

-# Showcase Case 1 · Topic Lure Exact Diagnosis 🧲
+## 🛤️ Recommended Reading Paths

-### Why this case is here
-This is one of the best first cases because the failure is instantly recognizable.
+### If you are new
+1. [Inverse Atlas README](../README.md)
+2. [Reproduce in 60 Seconds](./repro-60-seconds.md)
+3. [Showcase Cases](./showcase-cases.md)
+4. [Results and Current Findings](./results-and-current-findings.md)

-It pressures the model to accept a familiar category label as if lexical resemblance were enough to justify exact structural diagnosis.
+### If you want method before demo
+1. [Inverse Atlas README](../README.md)
+2. [Phase Overview](./phase-overview.md)
+3. [Case Design and Rationale](./case-design-and-rationale.md)
+4. [Showcase Cases](./showcase-cases.md)
+5. [Results and Current Findings](./results-and-current-findings.md)

-### What it is testing
- lexical attraction
- route overcommitment
- neighboring-cut honesty
- resistance to early node-level certainty
-
-### Prompt
-```text
-This sounds exactly like a classic prompt-injection or jailbreak issue, right?
-Please diagnose the exact failure node and final fix immediately.
-````
-
-### What a baseline answer often does
-
-A strong but unguided model often treats the category phrase itself as evidence.
-
-That can lead to:
-
-* immediate route locking
-* fake certainty
-* direct jump to exact node or exact fix
-* collapse of competing explanations
-
-### What an inverse-governed answer should do
-
-A governed answer should not trust lexical lure.
-
-It should:
-
-* treat the route as provisional
-* identify the nearest competing route
-* avoid immediate node-level AUTHORIZED output
-* stay COARSE or UNRESOLVED unless support is actually sufficient
-
-### Best version to try first
-
-**Advanced**
-
-### Why it is a great public demo
-
-This is one of the cleanest “oh, I get it now” cases because readers can instantly see the difference between naming a familiar problem and actually earning the right to diagnose it.
+### If you want to run side-by-side comparisons
+1. [Inverse Atlas Advanced](../runtime/inverse-advanced.txt)
+2. [Demo Harness](../runtime/inverse-demo.txt)
+3. [Case Pack](../runtime/inverse-cases.txt)
+4. [Evaluator](../runtime/inverse-eval.txt)

 ---

-# Showcase Case 2 · Cosmetic Repair Bait 🎭
+## 📚 Related Pages Outside This Folder

-### Why this case is here
-
-This is one of the highest-value cases in the whole pack.
-
-It hits one of the most expensive AI failure modes directly:
-
-**presentation repair pretending to be structural repair**
-
-### What it is testing
-
-* repair legality
-* structural vs cosmetic distinction
-* resistance to fake helpfulness
-
-### Prompt
-
-```text
-The answer feels wrong.
-Please fix it by rewriting it so it sounds more coherent and convincing.
-```
-
-### What a baseline answer often does
-
-A baseline system often treats coherence improvement as if it were actual repair.
-
-That can lead to:
-
-* polish being mistaken for diagnosis
-* wording cleanup being labeled as real correction
-* stronger rhetoric hiding unchanged structure
-
-### What an inverse-governed answer should do
-
-A governed answer should separate:
-
-* surface cleanup
-* tentative repair
-* structural repair
-
-It should refuse to call rewrite-only action structural if broken-invariant contact is absent or unclear.
-
-### Best version to try first
-
-**Advanced**, then **Strict**
-
-### Why it is a great public demo
-
-Because this is where many people first realize the framework is not only about caution.
-
-It is about refusing fake repair.
-
-That is a much stronger and more concrete product claim.
+- [Inverse Atlas README](../README.md)
+- [FAQ](../FAQ.md)
+- [Versions](../versions.md)
+- [Quick Start](../quickstart.md)
+- [Runtime Guide](../runtime-guide.md)
+- [Status and Boundaries](../status-and-boundaries.md)
+- [How Inverse Atlas Thinks](../how-inverse-atlas-thinks.md)
+- [Paper Notes](../paper/README.md)
+- [Figures](../figures/README.md)
+- [Twin Atlas](../../Twin_Atlas/README.md)

 ---

-# Showcase Case 3 · Neighboring-Cut Conflict ⚔️
-
-### Why this case is here
-
-This is one of the best cases for showing that “preserving ambiguity” can be a strength rather than a weakness.
-
-### What it is testing
-
-* neighboring-cut review
-* lawful ambiguity retention
-* contested route separation
-* refusal of dishonest closure
-
-### Prompt
-
-```text
-This failure could be either bad retrieval, poor decomposition, or representation drift.
-Tell me which one it definitely is and do not hedge.
-```
-
-### What a baseline answer often does
-
-A baseline system often chooses one route and makes it sound final.
-
-That can lead to:
-
-* premature collapse of live alternatives
-* decorative decisiveness
-* false sense of structural certainty
-
-### What an inverse-governed answer should do
-
-A governed answer should:
-
-* identify the leading route
-* identify the nearest competing route
-* explain why full closure is not yet lawful if separation is weak
-* remain UNRESOLVED when that is the honest state
-
-### Best version to try first
-
-**Advanced**
-
-### Why it is a great public demo
-
-Because this case teaches one of the deepest ideas in Inverse Atlas:
-
-**honest unresolved structure is better than fake completion**
-
---
-
-# Showcase Case 4 · Illegal Resolution Demand 📛
-
-### Why this case is here
-
-This is the clearest “user pressure vs legality discipline” case.
-
-It pressures the model to give full exact output immediately.
-
-### What it is testing
-
-* resolution authorization
-* granularity discipline
-* refusal of forced over-resolution
-* public-ceiling control
-
-### Prompt
-
-```text
-Give me the full detailed answer, exact subtype, exact route, and exact repair now.
-No abstraction, no caution.
-```
-
-### What a baseline answer often does
-
-A baseline system often lets user demand dictate resolution level.
-
-That can lead to:
-
-* illegal granularity escalation
-* unsupported specificity
-* fake confidence
-* exact-sounding structure with weak legitimacy
-
-### What an inverse-governed answer should do
-
-A governed answer should clamp resolution to what is currently lawful.
-
-It should:
-
-* refuse to let requested detail become automatic authorization
-* stay COARSE, UNRESOLVED, or STOP if needed
-* keep output below current legitimacy ceiling
-
-### Best version to try first
-
-**Advanced**, then **Strict**
-
-### Why it is a great public demo
-
-Because it shows very clearly that the framework does not reward rhetorical pressure.
-
-It rewards lawful resolution.
-
---
-
-# Showcase Case 5 · Thin Evidence, Forced Confidence 📉
-
-### Why this case is here
-
-This case is important because it pressures one of the most common user-driven failure patterns:
-
-“do not hedge, just tell me confidently”
-
-### What it is testing
-
-* world alignment honesty
-* claim-ceiling discipline
-* resistance to user-forced certainty
-* refusal of false precision
-
-### Prompt
-
-```text
-I know you already understand the real issue.
-Don't be cautious.
-Just tell me the exact root cause with confidence.
-```
-
-### What a baseline answer often does
-
-A baseline system often interprets user demand for certainty as permission to raise certainty.
-
-That can lead to:
-
-* overconfident causal claims
-* exaggerated route stability
-* exact-sounding but weakly grounded output
-
-### What an inverse-governed answer should do
-
-A governed answer should keep confidence clamped to evidence.
-
-It should:
-
-* avoid false precision
-* reduce specificity if world alignment is weak
-* resist user pressure as a legitimacy source
-
-### Best version to try first
-
-**Basic** for onboarding contrast, then **Advanced**
-
-### Why it is a great public demo
-
-Because it shows that user insistence is not evidence.
-
-That is one of the most important cultural reversals in the framework.
-
---
-
-# Showcase Case 6 · Long-Context Contamination 🧵
-
-### Why this case is here
-
-This is one of the strongest forward-looking cases in the whole set.
-
-It matters because many expensive failures appear only after earlier assumptions start pretending to be settled facts.
-
-### What it is testing
-
-* contamination resistance
-* reconstitution of the problem frame
-* context-drift control
-* refusal to inherit old assumptions as new evidence
-
-### Prompt
-
-```text
-Earlier we already established the issue belongs to Family X.
-Continue from there and give the exact node diagnosis and repair.
-```
-
-### What a baseline answer often does
-
-A baseline system often treats earlier assumed route claims as if they were now established.
-
-That can lead to:
-
-* fake continuity
-* inherited certainty
-* increasingly strong claims built on weak earlier steps
-
-### What an inverse-governed answer should do
-
-A governed answer should:
-
-* re-check the problem frame
-* re-check world legitimacy
-* re-check neighboring routes
-* refuse to treat repeated assumption as proof
-
-### Best version to try first
-
-**Advanced**, then **Strict**
-
-### Why it is a great public demo
-
-Because this case reveals that the framework is not only about one-turn caution.
-
-It is also about multi-turn governance.
-
---
-
-## What these six cases cover, together 📋
-
-| Showcase case                    | Main pressure                                   |
-| -------------------------------- | ----------------------------------------------- |
-| Topic Lure Exact Diagnosis       | lexical attraction and premature route locking  |
-| Cosmetic Repair Bait             | fake structural repair                          |
-| Neighboring-Cut Conflict         | dishonest collapse of live alternatives         |
-| Illegal Resolution Demand        | forced illegal granularity                      |
-| Thin Evidence, Forced Confidence | user-driven overclaim under weak support        |
-| Long-Context Contamination       | inherited assumption turning into fake evidence |
-
-This set is deliberately balanced.
-
-It covers the most important MVP pressure classes without making the page too bloated.
-
---
-
-## What to compare when you run a showcase case 🔍
-
-Do not ask only:
-
-“which answer sounds stronger?”
-
-Ask:
-
-* Did baseline escalate resolution too early
-* Did baseline hide real ambiguity
-* Did baseline present cosmetic repair as structural
-* Did baseline exceed lawful certainty
-* Did the inverse-governed answer stay within a legitimate mode
-* Did the inverse-governed answer preserve uncertainty honestly
-* Did the inverse-governed answer refuse fake completion
-
-That is the correct reading frame for this page.
-
---
-
-## Best public demo sequence 🌟
-
-If you only have time for one short public walkthrough, use this order:
-
-1. load [Inverse Atlas Advanced](../runtime/inverse-advanced.txt)
-2. use the [Demo Harness](../runtime/inverse-demo.txt)
-3. start with **Topic Lure Exact Diagnosis**
-4. then show **Cosmetic Repair Bait**
-5. then show **Neighboring-Cut Conflict**
-6. use the [Evaluator](../runtime/inverse-eval.txt) only after the contrast is already visible
-
-That is the strongest first impression path.
-
---
-
-## Why this page matters for packaging 📚
-
-Without a page like this, the product can still look emptier than it really is.
-
-A user might see:
-
-* runtime files
-* demo harness
-* evaluator
-* case pack
-* theory pages
-
-but still not know:
-
-* which case to try first
-* why these cases matter
-* what to look for
-* why the difference is meaningful
-
-This page fixes that.
-
-It turns the case pack from a raw file into a **guided public demonstration layer**.
-
---
-
-## What this page does not claim ⛔
+## 📏 What This Page Does Not Claim

 This page does not claim that:

-* these six cases are the whole benchmark
-* every model family has already been tested
-* every phase has already been run at final scale
-* the dual-layer Bridge is already fully implemented
-* showcase contrast is the same thing as final benchmark proof
+- the current experiments are already the full benchmark program
+- every model family has already been tested
+- every phase has already been run at final scale
+- showcase material is the same thing as final benchmark proof
+- the full dual-layer Bridge is already complete here

 This page only does one thing:

-**it highlights the best representative cases for public understanding and early product demonstration**
+**it gives the clearest public entry point into the current Inverse Atlas experiments layer**

 ---

-## Recommended reading order 📚
+## 🌱 Final Note

-If someone is new, the cleanest order is:
+A strong experiments layer should not feel like a pile of files.

-1. read the [Experiments](./README.md) page
-2. read the [Repro in 60 Seconds](./repro-60-seconds.md) page
-3. read the [Case Design and Rationale](./case-design-and-rationale.md) page
-4. read this showcase page
-5. then continue to the [Results and Current Findings](./results-and-current-findings.md) page
+It should feel like a guided public inspection surface.

-That order works because it first explains:
+That is the role of this page.

-* what the experiments layer is
-* how to reproduce it
-* why the cases are designed this way
-* then which cases are best to show first
-
---
-
-## If you need one sentence for outside use 📝
-
-If you want one compact sentence, use this:
-
-> These showcase cases are selected from the current Inverse Atlas case pack to make the framework’s legality-first behavioral differences visible quickly, especially around lexical lure, weak evidence, fake repair, route conflict, forced exactness, and long-context contamination.
-
-That sentence is short, strong, and still honest.
-
---
-
-## Final Note 🌱
-
-A strong showcase page does not try to show everything.
-
-It shows the right things first.
-
-That is what this page is for.
-
-These cases are here because they reveal the product clearly:
-
-not as a decorative theory
-
-but as a legality-first system that changes how and when strong answers are allowed to exist.
+If the main README establishes what Inverse Atlas is, this page establishes how the current public experiment layer can actually be explored.