ruvector/examples/benchmarks
Claude 0cd418062c feat(ablation): Thompson Sampling two-signal model, speculative dual-path, constraint propagation
Replace epsilon-greedy with two-signal Thompson Sampling (safety Beta
posterior + cost EMA) for Mode C learned policy. Score = safety_sample
- lambda * cost_ema provides principled exploration-exploitation.

Add speculative dual-path for Mode C only: when Beta variance > 0.02
and top-2 arms within delta 0.15, run both arms (60/40 budget split)
to resolve uncertainty faster while keeping Mode A/B ablation clean.

Add constraint propagation pre-pass as PolicyKernel-controlled mode
(Off/Light/Full, defaults to Off). Light handles InMonth+DayOfMonth
direct solves; Full adds DayOfWeek pruning for ranges ≤60 days.
PrepassMetrics tracks pruned_candidates, prepass_steps, scan_steps_saved.

Beta sampling via Marsaglia-Tsang Gamma method + Box-Muller normal.

https://claude.ai/code/session_01RnwD4x5cbpB7FPvoyYQz8G
2026-02-15 23:40:05 +00:00
..
src feat(ablation): Thompson Sampling two-signal model, speculative dual-path, constraint propagation 2026-02-15 23:40:05 +00:00
tests feat(generator): posterior-targeting puzzle generation, weekday skipping PolicyKernel 2026-02-15 22:31:12 +00:00
Cargo.toml feat(agi-contract): multi-dimensional IQ with cost, robustness, and AGI contract 2026-02-15 20:43:31 +00:00