mirror of
https://github.com/onestardao/WFGY.git
synced 2026-04-28 11:40:07 +00:00
Community Benchmark Reruns
Rerun packs, comparisons, and route-aware benchmark evidence
This folder is for community-contributed reruns that test atlas routing, first repair moves, or troubleshooting improvements on repeatable examples.
Typical contributions here include:
- small rerun packs
- before and after comparisons
- benchmark slices tied to one failure family
- structured rerun notes for one troubleshooting setting
- route-aware comparison packs
What belongs here
Good rerun contributions include:
- one small benchmark slice
- one clear rerun protocol
- one route-aware before and after comparison
- one compact result table with a method note
- one reproducible troubleshooting benchmark example
A good rerun contribution should be:
- scoped
- method-aware
- explicit about data source
- explicit about limits
- tied to atlas routing
What does not belong here
Please do not use this folder for:
- unsupported score claims
- screenshots with no method note
- giant benchmark reports with no case framing
- comparisons where more than one variable changes between runs
- claims that a rerun proves the whole atlas by itself
Suggested rerun pattern
A useful rerun contribution usually includes:
- target task or failure family
- rerun setup
- baseline behavior
- routed or repaired behavior
- compact result summary
- limitations
That is enough to make the rerun informative.
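As a rough illustration, the six items above could be captured in a small record so that a contributor can check nothing is missing before submitting. This is only a sketch: the `RerunNote` class, its field names, and the `missing_fields` helper are hypothetical and are not part of the atlas or of any repository tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class RerunNote:
    """Hypothetical container mirroring the suggested rerun pattern."""

    target: str = ""       # target task or failure family
    setup: str = ""        # rerun setup
    baseline: str = ""     # baseline behavior
    routed: str = ""       # routed or repaired behavior
    summary: str = ""      # compact result summary
    limitations: str = ""  # known limits of the rerun


def missing_fields(note: RerunNote) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(note) if not getattr(note, f.name).strip()]


if __name__ == "__main__":
    draft = RerunNote(
        target="F1 grounding",
        setup="same prompts, same model, one run per condition",
        baseline="answer drifts away from the cited source",
        routed="answer stays on the cited source after the first repair move",
        summary="see result table in the note",
    )
    print(missing_fields(draft))
```

In this example the draft leaves `limitations` empty, so the helper reports that one field as still to be written.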
Suggested naming style
Examples:
- f1-grounding-rerun-v1.md
- f5-trace-uplift-rerun-v1.md
- f7-structured-output-rerun-v1.md
If code or notebooks are included, place them in a clearly named subfolder.
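The example names above appear to follow the shape `f<family number>-<topic>-rerun-v<version>.md`. A contributor who wants to check a filename locally could use something like the sketch below. The pattern is inferred from the three examples only, not from a stated rule, and the `parse_rerun_name` helper is hypothetical.

```python
import re

# Assumed shape, inferred from the examples: f<digits>-<topic>-rerun-v<digits>.md
# where <topic> is one or more lowercase words joined by hyphens.
RERUN_NAME = re.compile(
    r"^f(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)-rerun-v(\d+)\.md$"
)


def parse_rerun_name(filename):
    """Split a rerun filename into its parts, or return None if it does not match."""
    match = RERUN_NAME.match(filename)
    if match is None:
        return None
    return {
        "family": int(match.group(1)),
        "topic": match.group(2),
        "version": int(match.group(3)),
    }


if __name__ == "__main__":
    for name in ("f5-trace-uplift-rerun-v1.md", "results.md"):
        print(name, parse_rerun_name(name))
```

A name that returns `None` is not necessarily wrong, since the naming style is only a suggestion; the check simply flags names that differ from the examples.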
Before contributing
Please read:
One-line status
This folder holds community reruns that test atlas-guided troubleshooting in compact, repeatable benchmark-style settings.