mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-28 01:44:41 +00:00
Add self-contained acceptance test artifact that external developers can run offline and reproduce identical graded outcomes:

- SHA-256-linked witness chain: every puzzle decision (skip_mode, context_bucket, steps, correct) hashed into a tamper-evident chain. Changing any single bit invalidates everything downstream.
- Deterministic replay: frozen seeds → identical puzzles → identical solve paths → identical chain_root_hash. Two runs with the same config produce the same hash, proven by test.
- JSON manifest: config, per-mode scorecards (A/B/C), all six ablation assertions with measured values, full witness chain, chain root hash.
- Verifier: re-runs with same config, recomputes chain, compares root hash. Mismatch means non-identical outcomes.
- CLI binary: `acceptance-rvf generate -o manifest.json` to produce, `acceptance-rvf verify -i manifest.json` to verify.

66 lib tests + 20 integration tests pass.

https://claude.ai/code/session_01RnwD4x5cbpB7FPvoyYQz8G
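The witness chain, deterministic replay, and verifier described in the commit message can be illustrated with a minimal sketch. This is not the crate's implementation: the `Decision` struct, `link`, `chain_root`, `replay`, and `verify` are hypothetical names chosen for illustration, the byte encoding is an assumption, and a hand-written FNV-1a hash stands in for SHA-256 so the example needs no external crates. FNV-1a is not cryptographically secure; it only demonstrates the chaining structure.

```rust
// Sketch only: FNV-1a (64-bit) stands in for SHA-256 to avoid dependencies.
// It is NOT tamper-resistant in the cryptographic sense.
const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h = FNV_OFFSET;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}

/// Hypothetical record of one puzzle decision. Field names follow the
/// commit message; the field types are assumptions.
#[derive(Clone, Debug, PartialEq)]
struct Decision {
    skip_mode: u8,
    context_bucket: u8,
    steps: u32,
    correct: bool,
}

/// One chain link: hash of the previous link hash followed by the decision.
fn link(prev: u64, d: &Decision) -> u64 {
    let mut buf = Vec::with_capacity(15);
    buf.extend_from_slice(&prev.to_le_bytes());
    buf.push(d.skip_mode);
    buf.push(d.context_bucket);
    buf.extend_from_slice(&d.steps.to_le_bytes());
    buf.push(d.correct as u8);
    fnv1a(&buf)
}

/// Fold all decisions into a single root hash, starting from a zero link.
fn chain_root(decisions: &[Decision]) -> u64 {
    decisions.iter().fold(0u64, |prev, d| link(prev, d))
}

/// Toy stand-in for "frozen seed -> identical solve path": a xorshift64
/// generator produces the same decisions for the same seed.
fn replay(seed: u64, n: usize) -> Vec<Decision> {
    let mut s = seed.max(1); // xorshift must not start at zero
    (0..n)
        .map(|_| {
            s ^= s << 13;
            s ^= s >> 7;
            s ^= s << 17;
            Decision {
                skip_mode: (s % 3) as u8,
                context_bucket: ((s >> 8) % 4) as u8,
                steps: ((s >> 16) % 50) as u32,
                correct: (s >> 32) & 1 == 1,
            }
        })
        .collect()
}

/// Verifier: re-run with the same config, recompute the chain, compare roots.
fn verify(seed: u64, n: usize, expected_root: u64) -> bool {
    chain_root(&replay(seed, n)) == expected_root
}
```

Because each link hashes the previous link's output, a change to any earlier decision propagates into every later link and into the root. Calling `chain_root(&replay(42, 16))` twice yields the same value, and `verify(42, 16, root)` succeeds for that root. A real chain would replace `fnv1a` with SHA-256, for example via a crate such as `sha2`.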
38 lines
1.1 KiB
Rust
//! RuVector Benchmarks Library
//!
//! Comprehensive benchmarking suite for:
//! - Temporal reasoning (TimePuzzles-style constraint inference)
//! - Vector index operations (IVF, coherence-gated search)
//! - Swarm controller regret tracking
//! - Intelligence metrics and cognitive capability assessment
//! - Adaptive learning with ReasoningBank trajectory tracking
//!
//! Based on research from:
//! - TimePuzzles benchmark (arXiv:2601.07148)
//! - Sublinear regret in multi-agent control
//! - Tool-augmented iterative temporal reasoning
//! - Cognitive capability assessment frameworks
//! - lean-agentic type theory for verified reasoning

pub mod acceptance_test;
pub mod agi_contract;
pub mod intelligence_metrics;
pub mod logging;
pub mod loop_gating;
pub mod publishable_rvf;
pub mod reasoning_bank;
pub mod rvf_artifact;
pub mod rvf_intelligence_bench;
pub mod superintelligence;
pub mod swarm_regret;
pub mod temporal;
pub mod timepuzzles;
pub mod vector_index;

pub use intelligence_metrics::*;
pub use logging::*;
pub use reasoning_bank::*;
pub use swarm_regret::*;
pub use temporal::*;
pub use timepuzzles::*;
pub use vector_index::*;