Add a self-contained acceptance test artifact that external developers
can run offline to reproduce identical graded outcomes:
- SHA-256-linked witness chain: every puzzle decision (skip_mode,
context_bucket, steps, correct) hashed into a tamper-evident chain.
Changing any single bit invalidates every downstream link and the
chain root hash.
- Deterministic replay: frozen seeds → identical puzzles → identical
solve paths → identical chain_root_hash. Two runs with the same
config produce the same hash; a test asserts this.
- JSON manifest: config, per-mode scorecards (A/B/C), all six ablation
assertions with measured values, full witness chain, chain root hash.
- Verifier: re-runs with the same config, recomputes the chain, and
compares the root hash. A mismatch means the outcomes are not identical.
- CLI binary: `acceptance-rvf generate -o manifest.json` to produce,
`acceptance-rvf verify -i manifest.json` to verify.
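The chain-and-verify idea above can be sketched in a few lines of Python. This is an illustration only, not the code in this commit: the genesis value, the canonical-JSON serialization, the `skip_mode` values, and every function name (`link_hash`, `build_chain`, `replay`, `verify`) are assumptions made for the example. Only the field names and `chain_root_hash` come from the description above.

```python
import hashlib
import json
import random

GENESIS = "0" * 64  # assumed starting value; the real chain may seed differently


def link_hash(prev_hash: str, decision: dict) -> str:
    """Hash one decision together with the previous link's hash."""
    # Canonical JSON (sorted keys, no extra whitespace) keeps hashing stable.
    payload = json.dumps(decision, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(decisions: list) -> list:
    """Return the list of link hashes, one per decision."""
    hashes, prev = [], GENESIS
    for decision in decisions:
        prev = link_hash(prev, decision)
        hashes.append(prev)
    return hashes


def chain_root_hash(decisions: list) -> str:
    """The root hash is the last link (or the genesis value if empty)."""
    chain = build_chain(decisions)
    return chain[-1] if chain else GENESIS


def replay(seed: int, n: int = 5) -> list:
    """Stand-in for a deterministic run: a frozen seed yields fixed decisions."""
    rng = random.Random(seed)
    return [
        {
            "skip_mode": rng.choice(["none", "weak", "hybrid"]),  # hypothetical values
            "context_bucket": rng.randrange(4),
            "steps": rng.randrange(1, 50),
            "correct": rng.random() < 0.8,
        }
        for _ in range(n)
    ]


def verify(manifest: dict) -> bool:
    """Re-run with the manifest's config and compare root hashes."""
    return chain_root_hash(replay(manifest["seed"])) == manifest["chain_root_hash"]


manifest = {"seed": 42, "chain_root_hash": chain_root_hash(replay(42))}

# Tamper with the first decision: every later link changes as well.
original = replay(42)
tampered = [dict(original[0], correct=not original[0]["correct"])] + original[1:]
```

Because each link hashes the previous link's digest, the tampered list differs from the original at every position of `build_chain`, not only at the first one, and a manifest whose stored root hash does not match the replay fails `verify`.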
66 lib tests + 20 integration tests pass.
https://claude.ai/code/session_01RnwD4x5cbpB7FPvoyYQz8G
Implements a recursive intelligence amplification pipeline where each
level feeds the next, measuring IQ at every stage:
L1 Foundation         (IQ ~79)  Adaptive solver + ReasoningBank + retry
L2 Meta-Learning      (IQ ~82)  Learns optimal hyperparams per problem class
L3 Ensemble Arbiter   (IQ ~83)  Multi-strategy voting with learned selection
L4 Recursive Improve  (IQ ~85)  Bootstraps from own outputs + knowledge compiler
L5 Adversarial Grow   (IQ ~89)  Self-generated hard tasks + cascade reasoning
Key mechanisms:
- MetaParams: EMA-learned step budgets + retry benefit estimation
- StrategyEnsemble: N-solver majority vote, confidence-weighted
- KnowledgeCompiler: compiles patterns to direct lookup (54% hit rate)
- AdversarialGenerator: weakness-targeted difficulty escalation
- CascadeReasoner: multi-pass solve-verify-resolve
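Two of the mechanisms above, the EMA-learned step budget and the confidence-weighted vote, can be illustrated with a minimal Python sketch. It is not the implementation in this commit: the function names, the smoothing factor `alpha`, and the (answer, confidence) input shape are all assumptions made for the example.

```python
from collections import defaultdict


def ema_update(current: float, observed: float, alpha: float = 0.2) -> float:
    """Exponential moving average: move `current` a fraction alpha toward `observed`."""
    return (1 - alpha) * current + alpha * observed


def weighted_vote(answers: list) -> str:
    """Pick the answer with the largest summed confidence.

    `answers` is a list of (answer, confidence) pairs, one per solver.
    Ties go to the answer that appeared first.
    """
    totals = defaultdict(float)
    for answer, confidence in answers:
        totals[answer] += confidence
    return max(totals, key=totals.get)


# Step budget drifting toward the observed cost of recent solves.
budget = 20.0
for steps_used in [10, 10]:
    budget = ema_update(budget, steps_used)

# Two low-confidence votes for "A" lose to one high-confidence vote for "B".
winner = weighted_vote([("A", 0.4), ("B", 0.9), ("A", 0.3)])
```

In the vote example a plain head count would choose "A" (two votes to one), while weighting by confidence chooses "B" (0.9 against a combined 0.7), which is the difference the confidence weighting is meant to capture.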
Results: +7.5 to +10.1 IQ gain across 5 levels, reaching IQ 86-89
depending on noise conditions. 100% accuracy at max difficulty in L4/L5.