ruvector/Cargo.toml
rUv 8f97421297
research(nightly): rairs-ivf — RAIRS IVF, ruvector's first Inverted File Index (ADR-193) (#459)
* feat(rairs-ivf): add RAIRS IVF — ruvector's first Inverted File Index (ADR-193)

Implements Yang & Chen, SIGMOD 2026 (arXiv:2601.07183): three variants of
IVF with Redundant Assignment + Amplified Inverse Residual + SEIL layout.

Three measurable variants (N=5K, D=128, 64 clusters, cargo --release):
  IvfFlat      nprobe=1 recall@10  61.3%  mem 2,571 KB  26,984 QPS
  RairsStrict  nprobe=1 recall@10  83.8%  mem 5,110 KB  13,243 QPS
  RairsSeil    nprobe=1 recall@10  93.1%  mem 2,571 KB  13,582 QPS

RairsSeil: +31.8 pp recall at nprobe=1 vs IvfFlat with identical memory.

Files:
  crates/ruvector-rairs/         — new crate (IvfFlat, RairsStrict, RairsSeil)
  docs/adr/ADR-193-rairs-ivf.md  — architecture decision record
  docs/research/nightly/2026-05-12-rairs-ivf/README.md — SOTA survey + results
  Cargo.toml                     — workspace member added

10/10 unit tests pass. cargo build --release -p ruvector-rairs green.

* perf(ruvector-rairs): SIMD-friendly distance kernels + partial-select top-k; fix clippy/fmt; flag unverified citation

Optimizations (recall unchanged; ~2.3–2.9× single-thread QPS across all
variants/nprobe on x86-64):
- index.rs: rewrite l2sq/dot as 8-lane unrolled reductions so LLVM
  auto-vectorises the f32 accumulation (the naïve iter().sum() can't — f32
  add isn't associative). This is the hot path: every centroid scan + every
  list-entry distance.
- index.rs: add finalize_topk() / top_nprobe_centroids() using
  select_nth_unstable (O(n) avg) instead of full O(n log n) sorts of every
  candidate / every centroid; all three search() impls use them. Distance
  ordering switched to f32::total_cmp — no more partial_cmp().unwrap() panics.
- rairs.rs: rair_score is now allocation-free (no per-call Vec for the diff);
  search() dedups ids with a reused bool scratch array instead of allocating
  a HashSet per query.
- seil.rs: block-visited dedup uses a flat bool array indexed via per-list
  prefix sums instead of a per-query HashSet<(usize,usize)>.
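
The three optimizations above can be sketched as follows. This is a minimal illustration of the techniques, not the crate's actual code: the function names `l2sq_unrolled`, `finalize_topk_sketch`, and `dedup_ids_sketch`, their signatures, and the `(distance, id)` candidate layout are all assumptions made for the example.

```rust
/// Squared L2 distance with eight independent accumulators.
/// Splitting the sum into 8 lanes removes the serial dependency of a single
/// f32 accumulator, which is what lets the compiler vectorise the loop.
/// (Hypothetical signature; the crate's own l2sq may differ.)
pub fn l2sq_unrolled(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal dimension");
    let mut acc = [0.0f32; 8];
    let mut ca = a.chunks_exact(8);
    let mut cb = b.chunks_exact(8);
    for (xa, xb) in (&mut ca).zip(&mut cb) {
        for i in 0..8 {
            let d = xa[i] - xb[i];
            acc[i] += d * d;
        }
    }
    // Horizontal reduction of the lanes, then the scalar tail (< 8 elements).
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in ca.remainder().iter().zip(cb.remainder()) {
        let d = x - y;
        sum += d * d;
    }
    sum
}

/// Return the k nearest candidates in ascending distance order.
/// select_nth_unstable_by partitions in O(n) on average, so only the
/// surviving k entries are fully sorted. f32::total_cmp is a total order,
/// so NaN distances cannot cause a panic.
pub fn finalize_topk_sketch(mut cands: Vec<(f32, u32)>, k: usize) -> Vec<(f32, u32)> {
    if k == 0 {
        return Vec::new();
    }
    if cands.len() > k {
        cands.select_nth_unstable_by(k - 1, |x, y| x.0.total_cmp(&y.0));
        cands.truncate(k);
    }
    cands.sort_unstable_by(|x, y| x.0.total_cmp(&y.0));
    cands
}

/// Keep the first occurrence of each id using a reusable flag buffer
/// (one bool per indexed vector) instead of allocating a HashSet per query.
/// Assumes every id is < seen.len(); an out-of-range id panics.
pub fn dedup_ids_sketch(ids: &[u32], seen: &mut [bool]) -> Vec<u32> {
    let mut out = Vec::with_capacity(ids.len());
    for &id in ids {
        let slot = &mut seen[id as usize];
        if !*slot {
            *slot = true;
            out.push(id);
        }
    }
    // Clear only the flags that were set, so the buffer is ready for reuse.
    for &id in &out {
        seen[id as usize] = false;
    }
    out
}
```

For instance, `finalize_topk_sketch(vec![(3.0, 30), (1.0, 10), (2.0, 20), (0.5, 5)], 2)` keeps the two closest candidates, `(0.5, 5)` and `(1.0, 10)`. Resetting only the touched flags in `dedup_ids_sketch` keeps the per-query cost proportional to the number of candidates rather than to the index size.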

Fixes:
- clippy `-D warnings` now passes: documented the 6 RairsError struct fields
  + RairsSeil::lambda; elided the explicit lifetime on resolve_block.
- cargo fmt --check now passes (benches/rairs_bench.rs import ordering, etc.).
- lib.rs + ADR-193 + the research README now carry a Provenance note: the
  "RAIRS/SEIL" names and the SIGMOD-2026 / arXiv:2601.07183 citation are
  unverified; the crate is an original implementation of the redundant-
  assignment idea (cf. IVF spill lists / SOAR / multi-probe LSH) and should
  be judged on src/main.rs's reproducible benchmarks, not the reference.

cargo test -p ruvector-rairs: 10/10 pass; recall@10 at nprobe∈{1,4,16}
unchanged (61.3/97.9/100 IvfFlat, 83.8/99.4/100 RairsStrict,
93.1/99.9/100 RairsSeil); index memory unchanged.
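
For readers unfamiliar with the metric, recall@k is conventionally the fraction of the true k nearest neighbours that appear in the returned top k. The sketch below assumes that definition. The commit does not state how src/main.rs computes it, so the function name `recall_at_k` and its signature are hypothetical.

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k ids (`truth`) that appear in the first k
/// returned ids (`found`). Assumes `found` contains no duplicate ids.
pub fn recall_at_k(truth: &[u32], found: &[u32], k: usize) -> f64 {
    let truth_set: HashSet<u32> = truth.iter().take(k).copied().collect();
    if truth_set.is_empty() {
        return 0.0;
    }
    let hits = found
        .iter()
        .take(k)
        .copied()
        .filter(|id| truth_set.contains(id))
        .count();
    hits as f64 / truth_set.len() as f64
}
```

With `truth = [1, 2, 3, 4]`, `found = [1, 9, 3, 8]`, and `k = 4`, two of the four true neighbours are recovered, so the result is 0.5. A reported figure such as 93.1% would then be this value averaged over all benchmark queries.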

Co-Authored-By: claude-flow <ruv@ruv.net>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: ruvnet <ruvnet@gmail.com>
2026-05-12 09:47:19 -04:00



[workspace]
exclude = ["crates/micro-hnsw-wasm", "crates/ruvector-hyperbolic-hnsw", "crates/ruvector-hyperbolic-hnsw-wasm", "examples/ruvLLM/esp32", "examples/ruvLLM/esp32-flash", "examples/edge-net", "examples/data", "examples/ruvLLM", "examples/delta-behavior", "crates/rvf", "crates/rvf/*", "crates/rvf/*/*", "examples/rvf-desktop", "crates/mcp-brain-server",
# ruvector-postgres is a pgrx-based PostgreSQL extension. Its build script
# requires `$PGRX_HOME` set up via `cargo install cargo-pgrx --version 0.12.9`
# and `cargo pgrx init`, which downloads and builds multiple Postgres
# versions. Keep it out of default workspace builds so `cargo build --workspace`
# works in stock environments. Build it explicitly with `cargo build -p ruvector-postgres`
# after running pgrx init.
"crates/ruvector-postgres",
# Iter 219 (closes ADR-178 Gap E folded into Gap B): the hailo
# crates rejoined the workspace once the iter-218 ruvector-core
# path dep + EmbeddingProvider impls landed. The `hailo` feature
# stays opt-in (only `cargo build --features hailo,cpu-fallback`
# pulls libhailort + candle), so workspace builds on stock x86
# still compile without Pi-specific tooling.
# ruos-thermal: Pi 5 thermal supervisor skeleton (ADR-174). Standalone
# for now; joins workspace once daemon mode + Unix socket protocol
# land in iters 92-97.
"crates/ruos-thermal"]
members = [
"crates/ruvector-acorn",
"crates/ruvector-acorn-wasm",
"crates/ruvector-rabitq",
"crates/ruvector-rabitq-wasm",
"crates/ruvector-rulake",
"crates/ruvector-core",
"crates/ruvector-node",
"crates/ruvector-wasm",
"crates/ruvector-cli",
"crates/ruvector-bench",
"crates/ruvector-metrics",
"crates/ruvector-filter",
"crates/ruvector-router-core",
"crates/ruvector-router-cli",
"crates/ruvector-router-ffi",
"crates/ruvector-router-wasm",
"crates/ruvector-server",
"crates/ruvector-snapshot",
"crates/ruvector-tiny-dancer-core",
"crates/ruvector-tiny-dancer-wasm",
"crates/ruvector-tiny-dancer-node",
"crates/ruvector-collections",
"crates/ruvector-cluster",
"crates/ruvector-raft",
"crates/ruvector-replication",
"crates/ruvector-graph",
"crates/ruvector-graph-node",
"crates/ruvector-graph-wasm",
"crates/ruvector-gnn",
"crates/ruvector-gnn-node",
"crates/ruvector-gnn-wasm",
"crates/ruvector-attention",
"crates/ruvector-attention-wasm",
"crates/ruvector-attention-node",
"crates/ruvector-cnn",
"crates/ruvector-cnn-wasm",
"crates/ruvector-mincut",
"crates/ruvector-mincut-wasm",
"crates/ruvector-mincut-node",
"crates/ruvector-mincut-gated-transformer",
"crates/ruvector-mincut-gated-transformer-wasm",
# NOTE: ruvector-postgres is in workspace `exclude` (pgrx env requirement).
"crates/ruvector-nervous-system",
# Iter 219 — hailo backend rejoined the workspace (closes
# ADR-178 Gap E folded into Gap B). All three build clean on
# x86 with default features; opting into the actual NPU path
# requires `--features hailo` on a Pi 5 + AI HAT+.
"crates/hailort-sys",
"crates/ruvector-hailo",
"crates/ruvector-mmwave",
"crates/ruvector-hailo-cluster",
"examples/refrag-pipeline",
"examples/scipix",
"examples/google-cloud",
"examples/subpolynomial-time",
"crates/sona",
"crates/rvlite",
"crates/ruvector-dag",
"crates/ruvector-dag-wasm",
"crates/ruvector-nervous-system-wasm",
"crates/ruvector-economy-wasm",
"crates/ruvector-learning-wasm",
"crates/ruvector-exotic-wasm",
"crates/ruvector-attention-unified-wasm",
"crates/ruvector-fpga-transformer",
"crates/ruvector-fpga-transformer-wasm",
"crates/ruvector-sparse-inference",
"crates/ruvector-math",
"crates/ruvector-math-wasm",
"examples/benchmarks",
"crates/cognitum-gate-kernel",
"crates/cognitum-gate-tilezero",
"crates/mcp-gate",
"crates/mcp-brain",
"crates/mcp-brain-server",
"crates/ruQu",
"crates/ruvllm",
"crates/ruvllm-cli",
"crates/ruvllm-wasm",
"crates/prime-radiant",
"crates/ruvector-delta-core",
"crates/ruvector-delta-wasm",
"crates/ruvector-delta-index",
"crates/ruvector-delta-graph",
"crates/ruvector-delta-consensus",
"crates/ruvector-crv",
"crates/ruvector-temporal-tensor",
"crates/ruqu-core",
"crates/ruqu-algorithms",
"crates/ruqu-wasm",
"crates/ruqu-exotic",
"crates/ruvector-domain-expansion",
"crates/ruvector-domain-expansion-wasm",
"crates/ruvector-solver",
"crates/ruvector-solver-wasm",
"crates/ruvector-solver-node",
"examples/dna",
"examples/OSpipe",
"crates/ruvector-coherence",
"crates/ruvector-profiler",
"crates/ruvector-attn-mincut",
"crates/ruvector-cognitive-container",
"crates/ruvector-verified",
"crates/ruvector-verified-wasm",
"crates/ruvector-graph-transformer",
"crates/ruvector-graph-transformer-wasm",
"crates/ruvector-graph-transformer-node",
"examples/rvf-kernel-optimized",
"examples/verified-applications",
"crates/thermorust",
"crates/ruvector-dither",
"crates/ruvector-robotics",
"examples/robotics",
"crates/neural-trader-core",
"crates/neural-trader-coherence",
"crates/neural-trader-replay",
"crates/neural-trader-wasm",
# Kalshi integration (ADR-153)
"crates/ruvector-kalshi",
"crates/neural-trader-strategies",
# RuVix Cognition Kernel (organized under crates/ruvix/)
"crates/ruvix/crates/types",
"crates/ruvix/crates/region",
"crates/ruvix/crates/queue",
"crates/ruvix/crates/cap",
"crates/ruvix/crates/proof",
"crates/ruvix/crates/sched",
"crates/ruvix/crates/boot",
"crates/ruvix/crates/vecgraph",
"crates/ruvix/crates/nucleus",
# Phase B: Bare metal AArch64 support
"crates/ruvix/crates/hal",
"crates/ruvix/crates/aarch64",
"crates/ruvix/crates/drivers",
"crates/ruvix/tests",
"crates/ruvix/benches",
"crates/ruvix/examples/cognitive_demo",
# rvAgent — AI Agent Framework (DeepAgents Rust conversion)
"crates/rvAgent/rvagent-core",
"crates/rvAgent/rvagent-backends",
"crates/rvAgent/rvagent-middleware",
"crates/rvAgent/rvagent-tools",
"crates/rvAgent/rvagent-subagents",
"crates/rvAgent/rvagent-cli",
"crates/rvAgent/rvagent-acp",
"crates/rvAgent/rvagent-mcp",
"crates/rvAgent/rvagent-wasm",
"crates/rvAgent/rvagent-a2a",
# ADR-159 a2a-swarm demo
"examples/a2a-swarm",
# ETL pipeline example
"examples/train-discoveries",
# Spectral graph sparsification
"crates/ruvector-sparsifier",
"crates/ruvector-sparsifier-wasm",
# Consciousness metrics (IIT Φ, causal emergence)
"crates/ruvector-consciousness",
"crates/ruvector-consciousness-wasm",
"examples/cmb-consciousness",
"examples/gw-consciousness",
"examples/ecosystem-consciousness",
"examples/quantum-consciousness",
"examples/gene-consciousness",
"examples/climate-consciousness",
# JS bundle decompiler (ADR-135)
"crates/ruvector-decompiler",
"crates/ruvector-decompiler-wasm",
# DiskANN / Vamana (ADR-143)
"crates/ruvector-diskann",
"crates/ruvector-diskann-node",
# Boundary-first scientific discovery PoC
"examples/boundary-discovery",
# CMB Cold Spot boundary-first discovery
"examples/cmb-boundary-discovery",
# FRB population boundary discovery (CHIME-like data)
"examples/frb-boundary-discovery",
# Cosmic void boundary information content
"examples/void-boundary-discovery",
# Multi-regime temporal attractor boundary detection
"examples/temporal-attractor-discovery",
# Music genre boundary discovery via spectral graph bisection
"examples/music-boundary-discovery",
# Weather regime boundary detection (variance/correlation precedes temperature)
"examples/weather-boundary-discovery",
# Market regime boundary discovery via correlation structure
"examples/market-boundary-discovery",
# Health state boundary detection from wearable sensor data
"examples/health-boundary-discovery",
# SETI exotic signals gallery: boundary-first detection of sub-threshold signals
"examples/seti-exotic-signals",
# SETI boundary-first discovery: sub-noise signal detection via coherence graphs
"examples/seti-boundary-discovery",
# Earthquake precursor detection via inter-station correlation boundary shifts
"examples/earthquake-boundary-discovery",
# Pandemic outbreak detection 60 days before case counts via correlation boundaries
"examples/pandemic-boundary-discovery",
# Infrastructure failure prediction via sensor correlation boundaries
"examples/infrastructure-boundary-discovery",
# Pre-seizure detection via brain correlation boundary shifts
"examples/brain-boundary-discovery",
# Clinical-publication-grade pre-seizure detection report with CSV output
"examples/seizure-clinical-report",
# Closed-loop seizure detection + therapeutic response simulation
"examples/seizure-therapeutic-sim",
# Real EEG analysis: CHB-MIT PhysioNet data with boundary-first detection
"examples/real-eeg-analysis",
# Multi-seizure cross-patient analysis: all 7 chb01 seizures
"examples/real-eeg-multi-seizure",
# ruvllm sparse attention kernel for Hailo-10H cluster (ADR-183 ADR-190)
"crates/ruvllm_sparse_attention",
# Generic retrieval LM + masked discrete diffusion built on the kernel
"crates/ruvllm_retrieval_diffusion",
# RAIRS IVF: Redundant Assignment + Amplified Inverse Residual (ADR-193)
"crates/ruvector-rairs",
]
resolver = "2"
[workspace.package]
version = "2.2.2"
edition = "2021"
rust-version = "1.77"
license = "MIT"
authors = ["Ruvector Team"]
repository = "https://github.com/ruvnet/ruvector"
[workspace.dependencies]
# Core functionality
redb = "2.1"
memmap2 = "0.9"
hnsw_rs = "0.3"
simsimd = "5.9"
rayon = "1.10"
crossbeam = "0.8"
# Serialization
rkyv = "0.8"
bincode = { version = "2.0.0-rc.3", features = ["serde"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Node.js bindings
napi = { version = "2.16", default-features = false, features = ["napi9", "async", "tokio_rt"] }
napi-derive = "2.16"
# WASM
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["Worker", "MessagePort", "console"] }
getrandom = { version = "0.3", features = ["wasm_js"] }
# Async runtime
tokio = { version = "1.41", features = ["rt-multi-thread", "sync", "macros"] }
futures = "0.3"
# Error handling and utilities
thiserror = "2.0"
anyhow = "1.0"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
# Math and numerics
nalgebra = { version = "0.33", default-features = false, features = ["std"] }
ndarray = "0.16"
rand = "0.8"
rand_distr = "0.4"
# Time and UUID
chrono = { version = "0.4", features = ["serde"] }
uuid = { version = "1.11", features = ["v4", "serde", "js"] }
# CLI
clap = { version = "4.5", features = ["derive", "cargo"] }
indicatif = "0.17"
console = "0.15"
# Testing and benchmarking
criterion = { version = "0.5", features = ["html_reports"] }
proptest = "1.5"
mockall = "0.13"
# Formal verification
lean-agentic = "=0.1.0"
# Performance
dashmap = "6.1"
parking_lot = "0.12"
once_cell = "1.20"
[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
strip = true
panic = "unwind"
[profile.bench]
inherits = "release"
debug = true
[profile.dev]
opt-level = 0
debug = true
[profile.test]
# Patch hnsw_rs to use rand 0.8 instead of 0.9 for WASM compatibility
# This resolves the getrandom version conflict (0.2 vs 0.3)
[patch.crates-io]
hnsw_rs = { path = "./patches/hnsw_rs" }