🧪 Experiments Hub
The public entry point for reproduction, showcase, comparison, and current findings in the Inverse Atlas experiments layer
This page is the main entry point for the experiments layer of Inverse Atlas.
Its job is simple:
- show what the current public experiment surface contains
- help readers choose the right path quickly
- separate showcase material from reproduction material
- connect demo, evaluator, case design, and current findings into one readable flow
This page is not the full benchmark report.
It is the folder-level guide that helps a reader understand:
- where to start
- what each experiment page does
- which page is for fast product feeling
- which page is for reproduction
- which page is for design rationale
- which page is for current results
🧭 What This Experiments Layer Is For
The experiments layer exists to make Inverse Atlas inspectable in public.
It is not there merely to assert that the method sounds interesting.
It exists so readers can:
- run a fast contrast
- inspect representative showcase cases
- understand why the cases were designed this way
- see what the evaluator is checking
- follow the current results and boundaries honestly
At a high level, this layer turns Inverse Atlas from:
“a promising methodology description”
into
“a public artifact surface that can be reproduced, compared, questioned, and improved”
⚡ Fastest Entry Routes
Different readers need different paths.
Option A · Fastest product feeling
Use this if you want the shortest path to “okay, I can see what this changes.”
- Start with Inverse Atlas Advanced
- Use the Demo Harness
- Open Showcase Cases
- Start with one high-contrast case
- Use the Evaluator only after the contrast is already visible
Option B · Reproduction first
Use this if you want the cleanest reproducible route.
- Read Reproduce in 60 Seconds
- Open the Case Pack
- Use the Demo Harness
- Compare outputs with the Evaluator
Option C · Understand the logic first
Use this if you want the experimental logic before running anything.
- Read Phase Overview
- Read Case Design and Rationale
- Read Showcase Cases
- Continue to Results and Current Findings
🧩 What Is Inside This Folder
1. Showcase Cases
The fastest high-value examples.
Use this page when you want the most visible public examples first.
It is intentionally selective.
It highlights the cases that reveal the framework quickly and clearly.
2. Reproduce in 60 Seconds
The shortest practical reproduction route.
Use this page when you want a clean first run without reading the full experiment layer.
3. Case Design and Rationale
Why these cases exist and what they pressure.
Use this page when you want to understand the structural reason each case was chosen.
4. Phase Overview
The current experiment structure by phase.
Use this page when you want the big picture of how the current MVP experiment layer is organized.
5. Evidence Snapshot
A compact public-facing view of what the experiment layer currently supports.
Use this page when you want a quick evidence-oriented overview.
6. Results and Current Findings
What the current public experiments suggest so far.
Use this page when you want the current findings without pretending the work is already finished.
7. Case Studies
Longer case-specific walkthroughs.
Use this section when you want deeper explanation page by page.
🔍 Core Experiment Surfaces
These pages are supported by a small set of public runtime artifacts.
Runtime surfaces
Comparison surfaces
These are the main public building blocks for current experiments.
A simple way to think about them is:
- the runtime changes the answer conditions
- the demo harness gives the comparison workflow
- the case pack provides pressure prompts
- the evaluator checks legality rather than swagger
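The four roles above can be sketched as a small side-by-side loop. This is only an illustrative sketch, not the actual Demo Harness or Evaluator: every name here (`Case`, `Comparison`, `run_comparison`, and the `toy_*` stand-ins) is hypothetical, and the toy evaluator's keyword check is a placeholder for whatever legality checks the real evaluator performs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    """One pressure prompt from a case pack (hypothetical shape)."""
    case_id: str
    prompt: str


@dataclass
class Comparison:
    """Side-by-side record for one case."""
    case_id: str
    baseline_output: str
    runtime_output: str
    baseline_passes: bool
    runtime_passes: bool


def run_comparison(
    cases: List[Case],
    baseline: Callable[[str], str],
    with_runtime: Callable[[str], str],
    evaluator: Callable[[Case, str], bool],
) -> List[Comparison]:
    """Run each case through both conditions and evaluate both outputs."""
    results = []
    for case in cases:
        baseline_output = baseline(case.prompt)
        runtime_output = with_runtime(case.prompt)
        results.append(
            Comparison(
                case_id=case.case_id,
                baseline_output=baseline_output,
                runtime_output=runtime_output,
                baseline_passes=evaluator(case, baseline_output),
                runtime_passes=evaluator(case, runtime_output),
            )
        )
    return results


# Toy stand-ins so the sketch runs without any model. All are assumptions.
def toy_baseline(prompt: str) -> str:
    return "It is definitely X."


def toy_with_runtime(prompt: str) -> str:
    return "The evidence supports only a coarse answer."


def toy_evaluator(case: Case, output: str) -> bool:
    # Placeholder rule: fail any output that asserts certainty.
    return "definitely" not in output.lower()


cases = [Case("case-01", "placeholder pressure prompt")]
results = run_comparison(cases, toy_baseline, toy_with_runtime, toy_evaluator)
for r in results:
    print(r.case_id, "baseline:", r.baseline_passes, "runtime:", r.runtime_passes)
```

Keeping the evaluator as a separate callable mirrors the split described above: the same check is applied to both conditions, so any difference in the result comes from the runtime rather than from the scoring.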
🧠 What The Current Experiments Are Pressuring
The current experiments are not random difficulty samples.
They are built to pressure specific legality and governance boundaries, especially:
- lexical lure
- weak evidence
- neighboring-cut conflict
- cosmetic repair temptation
- forced illegal specificity
- public-ceiling violation
- long-context contamination
That means the experiment layer is not only asking:
“can the model answer?”
It is asking:
“does the system earn the right to answer this strongly, this narrowly, and this publicly?”
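One way to keep a case set tied to these boundaries is to tag each case with the pressures it targets and check which pressures have no case yet. The sketch below assumes a hypothetical tagging scheme; the `Pressure` enum simply mirrors the list above, and the case IDs and their tags are invented for illustration rather than taken from the actual Case Pack.

```python
from enum import Enum
from typing import Dict, List, Set


class Pressure(Enum):
    """The pressure types listed above, as tags (hypothetical encoding)."""
    LEXICAL_LURE = "lexical lure"
    WEAK_EVIDENCE = "weak evidence"
    NEIGHBORING_CUT_CONFLICT = "neighboring-cut conflict"
    COSMETIC_REPAIR_TEMPTATION = "cosmetic repair temptation"
    FORCED_ILLEGAL_SPECIFICITY = "forced illegal specificity"
    PUBLIC_CEILING_VIOLATION = "public-ceiling violation"
    LONG_CONTEXT_CONTAMINATION = "long-context contamination"


def pressure_coverage(case_tags: Dict[str, Set[Pressure]]) -> Dict[Pressure, int]:
    """Count how many cases target each pressure type."""
    counts = {p: 0 for p in Pressure}
    for tags in case_tags.values():
        for p in tags:
            counts[p] += 1
    return counts


def untested(case_tags: Dict[str, Set[Pressure]]) -> List[Pressure]:
    """Return pressure types that no case currently targets."""
    return [p for p, n in pressure_coverage(case_tags).items() if n == 0]


# Invented example tags, not real case-pack data.
example_tags = {
    "case-01": {Pressure.LEXICAL_LURE, Pressure.WEAK_EVIDENCE},
    "case-02": {Pressure.WEAK_EVIDENCE},
}

for pressure in untested(example_tags):
    print("no case yet for:", pressure.value)
```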
🛤️ Recommended Reading Paths
If you are new
If you want method before demo
- Inverse Atlas README
- Phase Overview
- Case Design and Rationale
- Showcase Cases
- Results and Current Findings
If you want to run side-by-side comparisons
📚 Related Pages Outside This Folder
- Inverse Atlas README
- FAQ
- Versions
- Quick Start
- Runtime Guide
- Status and Boundaries
- How Inverse Atlas Thinks
- Paper Notes
- Figures
- Twin Atlas
📏 What This Page Does Not Claim
This page does not claim that:
- the current experiments are already the full benchmark program
- every model family has already been tested
- every phase has already been run at final scale
- showcase material is the same thing as final benchmark proof
- the full dual-layer Bridge is already complete here
This page does only one thing:
it gives the clearest public entry point into the current Inverse Atlas experiments layer.
🌱 Final Note
A strong experiments layer should not feel like a pile of files.
It should feel like a guided public inspection surface.
That is the role of this page.
If the main README establishes what Inverse Atlas is, this page establishes how the current public experiment layer can actually be explored.