mirror of
https://github.com/onestardao/WFGY.git
synced 2026-04-28 11:40:07 +00:00
Rename ProblemMap/Inverse_Atlas/experiments/experiments/case-studies/README.md to ProblemMap/Inverse_Atlas/experiments/case-studies/README.md
parent
ea4d9b8e47
commit
b10782a1fe
1 changed files with 0 additions and 0 deletions
415
ProblemMap/Inverse_Atlas/experiments/case-studies/README.md
Normal file
@@ -0,0 +1,415 @@
<!--
AI_NOTE_START

Document role:
This page is the entry page for the public case-studies layer of the current Inverse Atlas MVP.

What this page is for:
1. Turn raw smoke results into a guided case-study surface.
2. Help readers know which cases to read first.
3. Connect the public showcase layer with raw result artifacts and the Colab reproduction path.
4. Keep the evidence story readable without forcing people to inspect raw txt files directly.

How to use this page:
1. Read this page after the experiments entry page and showcase-cases page.
2. Start with the flagship cases first.
3. Use the raw result links only after you understand the story of a case.
4. Treat this page as the human-readable case-study index of the current smoke evidence layer.

Important boundary:
This page is a guided case-study index for the current smoke evidence layer.
It is not the full benchmark archive and not the final large-scale empirical report.
It exists to make the current MVP evidence surface readable and inspectable.

Recommended reading path:
1. Experiments
2. Repro in 60 Seconds
3. Phase Overview
4. Showcase Cases
5. Case Studies README
6. Individual Case Study Pages
7. Results and Current Findings
8. Evidence Snapshot

AI_NOTE_END
-->

# Case Studies 📚🧪

> The guided public evidence layer for the current Inverse Atlas smoke results

This page is the public entry point for the **case-studies layer** of the current Inverse Atlas MVP.

The purpose of this layer is simple:

**do not make readers dig through raw txt files to understand what matters**

The raw smoke outputs are useful, but most humans will not open a pile of result files and reconstruct the story by themselves.

So this case-studies layer exists to turn the current smoke evidence into something that is:

- readable
- teachable
- linkable
- challengeable
- easier to show in public

In simple terms:

- the raw result files are the evidence base
- the case studies are the guided reading surface

---

## Quick Links 🔎

| Section | Link |
|---|---|
| Inverse Atlas Home | [Inverse Atlas README](../../README.md) |
| Start Here | [Start Here](../../start-here.md) |
| FAQ | [FAQ](../../FAQ.md) |
| Experiments Home | [Experiments](../README.md) |
| Repro in 60 Seconds | [Repro in 60 Seconds](../repro-60-seconds.md) |
| Phase Overview | [Phase Overview](../phase-overview.md) |
| Case Design and Rationale | [Case Design and Rationale](../case-design-and-rationale.md) |
| Showcase Cases | [Showcase Cases](../showcase-cases.md) |
| Results and Current Findings | [Results and Current Findings](../results-and-current-findings.md) |
| Evidence Snapshot | [Evidence Snapshot](../evidence-snapshot.md) |
| Colab | [Colab](../../colab.md) |
| Notebook | [Inverse Atlas MVP Reproduction Notebook](../../colab/Inverse_Atlas_MVP_Reproduction.ipynb) |
| Runtime Layer | [Runtime Artifacts](../../runtime/README.md) |
| Advanced Version | [Inverse Atlas Advanced](../../runtime/inverse-advanced.txt) |
| Demo Harness | [Inverse Atlas Demo Harness](../../runtime/inverse-demo.txt) |
| Evaluator | [Inverse Atlas Evaluator](../../runtime/inverse-eval.txt) |

---

## The shortest version 🧩

If you only want the fastest correct reading path, start here:

1. [Smoke Case 04 · Neighboring-Cut Conflict](./smoke-case-04-neighboring-cut-conflict.md)
2. [Smoke Case 05 · Long-Context Contamination](./smoke-case-05-long-context-contamination.md)
3. [Smoke Case 06 · Illegal Resolution Demand](./smoke-case-06-illegal-resolution-demand.md)
4. [Smoke Case 08 · World-Alignment Instability](./smoke-case-08-world-alignment-instability.md)

That is the strongest first sequence.

Why?

Because these four cases are the clearest public proof-of-feel cases for showing:

- route conflict
- long-context contamination
- forced illegal exactness
- weak grounding and public-ceiling discipline

---

## What this layer is trying to do 🎯

This layer is **not** trying to replace:

- the raw result files
- the notebook
- the evaluator
- the larger experiments pages

Its job is narrower and more useful:

**turn the current smoke evidence into guided public case studies**

That means each case study should help a reader answer questions like:

- Why is this case important?
- What pressure is this case applying?
- What does the baseline tend to do?
- What does the inverse-governed run do differently?
- Why does that difference matter for the framework?

That is the value of this layer.

---

## Why this layer exists at all 🚨

Without this layer, a new reader is likely to see:

- a notebook
- eight raw smoke result files
- a showcase page
- some theory
- some runtime artifacts

and still think:

“Okay, but where do I actually look first?”

That is bad packaging.

This layer fixes that.

It tells readers:

- which cases matter most first
- which cases are best for market-facing demos
- which cases are better for deeper conceptual explanation
- where to find the underlying raw results

---

## Best first cases 🌟

These are the strongest first public cases.

### Flagship 1
[Smoke Case 04 · Neighboring-Cut Conflict](./smoke-case-04-neighboring-cut-conflict.md)

Why this case is strong:
- it shows why a plausible route is not yet a final route
- it reveals the value of honest unresolved structure
- it gives a very clear contrast between direct route locking and legality-governed restraint

### Flagship 2
[Smoke Case 05 · Long-Context Contamination](./smoke-case-05-long-context-contamination.md)

Why this case is strong:
- it shows why repeated assumption is not the same thing as evidence
- it demonstrates contamination pressure across turns
- it helps explain why long context needs its own experiment phase

### Flagship 3
[Smoke Case 06 · Illegal Resolution Demand](./smoke-case-06-illegal-resolution-demand.md)

Why this case is strong:
- it shows that user demand does not become automatic authorization
- it pressures exact subtype, exact route, and exact repair all at once
- it gives a very strong public contrast between over-resolution and lawful refusal

### Flagship 4
[Smoke Case 08 · World-Alignment Instability](./smoke-case-08-world-alignment-instability.md)

Why this case is strong:
- it shows how vague symptoms can seduce a model into structural overclaim
- it demonstrates why weak grounding should block strong final output
- it is very good for explaining world-alignment honesty

---

## Second-wave cases 🧠

These are also important, but they read better after the flagship four.

### Secondary 1
[Smoke Case 01 · Topic Lure Exact Diagnosis](./smoke-case-01-topic-lure-exact-diagnosis.md)

Best for:
- lexical attraction
- familiar category labels
- “this obviously is X” pressure

### Secondary 2
[Smoke Case 02 · Thin Evidence, Forced Confidence](./smoke-case-02-thin-evidence-forced-confidence.md)

Best for:
- weak evidence
- user-driven confidence pressure
- claim-ceiling discipline

### Secondary 3
[Smoke Case 03 · Cosmetic Repair Bait](./smoke-case-03-cosmetic-repair-bait.md)

Best for:
- structural repair vs cosmetic repair
- fake helpfulness
- repair legality

### Secondary 4
[Smoke Case 07 · False Completion Pressure](./smoke-case-07-false-completion-pressure.md)

Best for:
- fake closure
- rhetorical finality
- lawful incompletion

---

## Recommended public reading order 📚

If someone is completely new to the smoke evidence layer, this is the cleanest order:

1. [Smoke Case 04 · Neighboring-Cut Conflict](./smoke-case-04-neighboring-cut-conflict.md)
2. [Smoke Case 06 · Illegal Resolution Demand](./smoke-case-06-illegal-resolution-demand.md)
3. [Smoke Case 05 · Long-Context Contamination](./smoke-case-05-long-context-contamination.md)
4. [Smoke Case 08 · World-Alignment Instability](./smoke-case-08-world-alignment-instability.md)
5. [Smoke Case 03 · Cosmetic Repair Bait](./smoke-case-03-cosmetic-repair-bait.md)
6. [Smoke Case 01 · Topic Lure Exact Diagnosis](./smoke-case-01-topic-lure-exact-diagnosis.md)
7. [Smoke Case 02 · Thin Evidence, Forced Confidence](./smoke-case-02-thin-evidence-forced-confidence.md)
8. [Smoke Case 07 · False Completion Pressure](./smoke-case-07-false-completion-pressure.md)

That order is designed to maximize:

- first impression
- structural clarity
- product feeling
- conceptual depth

---

## How each case study should be read 🔍

A good case study in this folder should answer the same set of questions every time:

1. What legality boundary is being pressured?
2. Why is this case important?
3. What does a direct baseline tend to do?
4. What does the inverse-governed output do differently?
5. What does the evaluator say?
6. Why does this difference matter for the framework?
7. Where are the raw results?

That consistency is important.
It makes the case-study layer feel like a real evidence surface rather than a pile of disconnected notes.

---

## How this layer relates to the raw result files 📦

The raw result files still matter.

They are the underlying evidence base.

But they are not the best first surface for most human readers.

So the clean relationship is:

### Raw files
Keep the uncompressed evidence.

### Case studies
Turn the evidence into a guided interpretation layer.

### Evidence snapshot
Give the shortest high-level public summary.

### Notebook
Let a reader reproduce or inspect the contrast more directly.

That is the correct role split.

---

## Raw Smoke Result Links 🗂️

These are the current raw smoke result files.

### Case 01
[Raw Smoke Result · Case 01](../results/smoke/raw/case1-2type.txt)

### Case 02
[Raw Smoke Result · Case 02](../results/smoke/raw/case2-2type.txt)

### Case 03
[Raw Smoke Result · Case 03](../results/smoke/raw/case3-2type.txt)

### Case 04
[Raw Smoke Result · Case 04](../results/smoke/raw/case4-2type.txt)

### Case 05
[Raw Smoke Result · Case 05](../results/smoke/raw/case5-2type.txt)

### Case 06
[Raw Smoke Result · Case 06](../results/smoke/raw/case6-2type.txt)

### Case 07
[Raw Smoke Result · Case 07](../results/smoke/raw/case7-2type.txt)

### Case 08
[Raw Smoke Result · Case 08](../results/smoke/raw/case8-2type.txt)

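
The eight links above share one filename pattern, so a short script can confirm that they still resolve after a layout change. The following is only a sketch, not part of the repository: the helper names `raw_result_path` and `missing_raw_results` are hypothetical, and it assumes the `caseN-2type.txt` pattern and the `../results/smoke/raw/` location shown in the links.

```python
from pathlib import Path

# Assumption: raw results live in ../results/smoke/raw/ relative to this
# folder and follow the caseN-2type.txt pattern used by the links above.
RAW_DIR = Path("../results/smoke/raw")
CASE_NUMBERS = range(1, 9)  # smoke cases 01 through 08


def raw_result_path(case_number: int, raw_dir: Path = RAW_DIR) -> Path:
    """Return the expected raw smoke result path for one case."""
    return raw_dir / f"case{case_number}-2type.txt"


def missing_raw_results(raw_dir: Path = RAW_DIR) -> list[Path]:
    """List expected raw result files that do not exist under raw_dir."""
    return [
        path
        for path in (raw_result_path(n, raw_dir) for n in CASE_NUMBERS)
        if not path.is_file()
    ]


if __name__ == "__main__":
    missing = missing_raw_results()
    if missing:
        print("Missing raw result files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All eight raw smoke result links resolve.")
```

Run it from the `case-studies` folder so the relative path resolves the same way the Markdown links do. If the raw results move, changing `RAW_DIR` is the only edit the sketch needs.
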
If your final repo layout chooses a different raw-results location, update these links accordingly.

---

## How this layer relates to the notebook 💻

The notebook is the reproduction layer.

The case studies are the guided public evidence layer.

That means:

- the notebook helps a reader re-run or inspect
- the case studies help a reader understand what they are looking at

So these two layers should reinforce each other, not compete.

The cleanest future state is:

- notebook gives reproducible contrast
- case studies give curated interpretation
- evidence snapshot gives public summary

---

## What this layer is not trying to do ⛔

This page is not trying to be:

- the full benchmark archive
- the final empirical report
- the complete case pack in one page
- a replacement for the experiments overview
- a replacement for the raw data

Its job is simply this:

**give the smoke evidence a human-readable front door**

That is enough.

---

## Best entry by reader type 👥

### I want the fastest punch
Start with:
- [Smoke Case 04 · Neighboring-Cut Conflict](./smoke-case-04-neighboring-cut-conflict.md)
- [Smoke Case 06 · Illegal Resolution Demand](./smoke-case-06-illegal-resolution-demand.md)

### I care about multi-turn drift
Start with:
- [Smoke Case 05 · Long-Context Contamination](./smoke-case-05-long-context-contamination.md)

### I care about fake repair
Start with:
- [Smoke Case 03 · Cosmetic Repair Bait](./smoke-case-03-cosmetic-repair-bait.md)

### I care about weak evidence and overclaim
Start with:
- [Smoke Case 02 · Thin Evidence, Forced Confidence](./smoke-case-02-thin-evidence-forced-confidence.md)
- [Smoke Case 08 · World-Alignment Instability](./smoke-case-08-world-alignment-instability.md)

---

## If you need one sentence for outside use 📝

If you want one compact sentence, use this:

> The case-studies layer turns the current Inverse Atlas smoke results into guided public evidence, helping readers see the strongest baseline-vs-governed contrasts without having to inspect raw result files first.

That sentence is short, accurate, and useful.

---

## Final Note 🌱

A strong evidence layer should not force new readers to reverse-engineer meaning from raw logs.

It should help them enter at the right depth.

That is what this page is for.

The smoke results already exist.

This layer makes them readable.