research(nightly): rairs-ivf — RAIRS IVF, ruvector's first Inverted File Index (ADR-193) (#459)

* feat(rairs-ivf): add RAIRS IVF — ruvector's first Inverted File Index (ADR-193)

Implements Yang & Chen, SIGMOD 2026 (arXiv:2601.07183): three variants of
IVF with Redundant Assignment + Amplified Inverse Residual + SEIL layout.

Three measurable variants (N=5K, D=128, 64 clusters, cargo --release):
  IvfFlat      nprobe=1 recall@10  61.3%  mem 2,571 KB  26,984 QPS
  RairsStrict  nprobe=1 recall@10  83.8%  mem 5,110 KB  13,243 QPS
  RairsSeil    nprobe=1 recall@10  93.1%  mem 2,571 KB  13,582 QPS

RairsSeil: +31.8 pp recall at nprobe=1 vs IvfFlat with identical memory.

Files:
  crates/ruvector-rairs/         — new crate (IvfFlat, RairsStrict, RairsSeil)
  docs/adr/ADR-193-rairs-ivf.md  — architecture decision record
  docs/research/nightly/2026-05-12-rairs-ivf/README.md — SOTA survey + results
  Cargo.toml                     — workspace member added

10/10 unit tests pass. cargo build --release -p ruvector-rairs green.

* perf(ruvector-rairs): SIMD-friendly distance kernels + partial-select top-k; fix clippy/fmt; flag unverified citation

Optimizations (recall unchanged; ~2.3–2.9× single-thread QPS across all
variants/nprobe on x86-64):
- index.rs: rewrite l2sq/dot as 8-lane unrolled reductions so LLVM
  auto-vectorises the f32 accumulation (the naïve iter().sum() can't — f32
  add isn't associative). This is the hot path: every centroid scan + every
  list-entry distance.
- index.rs: add finalize_topk() / top_nprobe_centroids() using
  select_nth_unstable (O(n) avg) instead of full O(n log n) sorts of every
  candidate / every centroid; all three search() impls use them. Distance
  ordering switched to f32::total_cmp — no more partial_cmp().unwrap() panics.
- rairs.rs: rair_score is now allocation-free (no per-call Vec for the diff);
  search() dedups ids with a reused bool scratch array instead of allocating
  a HashSet per query.
- seil.rs: block-visited dedup uses a flat bool array indexed via per-list
  prefix sums instead of a per-query HashSet<(usize,usize)>.

Fixes:
- clippy `-D warnings` now passes: documented the 6 RairsError struct fields
  + RairsSeil::lambda; elided the explicit lifetime on resolve_block.
- cargo fmt --check now passes (benches/rairs_bench.rs import ordering, etc.).
- lib.rs + ADR-193 + the research README now carry a Provenance note: the
  "RAIRS/SEIL" names and the SIGMOD-2026 / arXiv:2601.07183 citation are
  unverified; the crate is an original implementation of the redundant-
  assignment idea (cf. IVF spill lists / SOAR / multi-probe LSH) and should
  be judged on src/main.rs's reproducible benchmarks, not the reference.

cargo test -p ruvector-rairs: 10/10 pass; recall@10 at nprobe∈{1,4,16}
unchanged (61.3/97.9/100 IvfFlat, 83.8/99.4/100 RairsStrict,
93.1/99.9/100 RairsSeil); index memory unchanged.

Co-Authored-By: claude-flow <ruv@ruv.net>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: ruvnet <ruvnet@gmail.com>
This commit is contained in:
rUv 2026-05-12 09:47:19 -04:00 committed by GitHub
parent ef5274c292
commit 8f97421297
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
14 changed files with 2002 additions and 0 deletions

16
Cargo.lock generated
View file

@ -9919,6 +9919,15 @@ dependencies = [
"uuid",
]
[[package]]
name = "ruvector-rairs"
version = "0.1.0"
dependencies = [
"criterion 0.5.1",
"rand 0.8.5",
"serde",
]
[[package]]
name = "ruvector-replication"
version = "2.2.2"
@ -10733,6 +10742,13 @@ dependencies = [
"web-sys",
]
[[package]]
name = "ruvllm_retrieval_diffusion"
version = "0.1.0"
dependencies = [
"ruvllm_sparse_attention",
]
[[package]]
name = "ruvllm_sparse_attention"
version = "0.1.1"

View file

@ -231,6 +231,8 @@ members = [
"crates/ruvllm_sparse_attention",
# Generic retrieval LM + masked discrete diffusion built on the kernel
"crates/ruvllm_retrieval_diffusion",
# RAIRS IVF: Redundant Assignment + Amplified Inverse Residual (ADR-193)
"crates/ruvector-rairs",
]
resolver = "2"

View file

@ -0,0 +1,25 @@
[package]
name = "ruvector-rairs"
version = "0.1.0"
edition = "2021"
description = "RAIRS IVF: Redundant Assignment with Amplified Inverse Residual — ruvector's first IVF index family"
authors = ["ruvnet", "claude-flow"]
license = "MIT OR Apache-2.0"
repository = "https://github.com/ruvnet/ruvector"
keywords = ["ann", "ivf", "vector-search", "approximate-nearest-neighbor", "ruvector"]
categories = ["algorithms", "data-structures"]
[[bin]]
name = "rairs-demo"
path = "src/main.rs"
[dependencies]
rand = "0.8"
serde = { version = "1", features = ["derive"] }
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
[[bench]]
name = "rairs_bench"
harness = false

View file

@ -0,0 +1,64 @@
//! Criterion micro-benchmarks for RAIRS IVF kernels.
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::rngs::StdRng;
use rand::{Rng, SeedableRng};
use ruvector_rairs::{AnnIndex, IvfFlat, RairsSeil, RairsStrict};
const DIM: usize = 128;
const N: usize = 2_000;
const NCLUSTERS: usize = 32;
const SEED: u64 = 99;
fn corpus(n: usize, seed: u64) -> Vec<Vec<f32>> {
let mut rng = StdRng::seed_from_u64(seed);
(0..n)
.map(|_| (0..DIM).map(|_| rng.gen::<f32>()).collect())
.collect()
}
fn bench_search(c: &mut Criterion) {
let vecs = corpus(N, SEED);
let query: Vec<f32> = vecs[0].clone();
let mut ivf = IvfFlat::new(DIM, NCLUSTERS, 20, SEED);
ivf.train(&vecs).unwrap();
ivf.add(&vecs).unwrap();
let mut strict = RairsStrict::new(DIM, NCLUSTERS, 20, SEED, 1.0);
strict.train(&vecs).unwrap();
strict.add(&vecs).unwrap();
let mut seil = RairsSeil::new(DIM, NCLUSTERS, 20, SEED, 1.0);
seil.train(&vecs).unwrap();
seil.add(&vecs).unwrap();
let mut g = c.benchmark_group("search_nprobe16");
g.throughput(Throughput::Elements(1));
g.bench_function("ivf_flat", |b| {
b.iter(|| ivf.search(&query, 10, 16).unwrap())
});
g.bench_function("rairs_strict", |b| {
b.iter(|| strict.search(&query, 10, 16).unwrap())
});
g.bench_function("rairs_seil", |b| {
b.iter(|| seil.search(&query, 10, 16).unwrap())
});
g.finish();
let mut g2 = c.benchmark_group("search_nprobe_sweep");
g2.throughput(Throughput::Elements(1));
for &np in &[1usize, 4, 16, 32] {
g2.bench_with_input(BenchmarkId::new("ivf_flat", np), &np, |b, &np| {
b.iter(|| ivf.search(&query, 10, np).unwrap())
});
g2.bench_with_input(BenchmarkId::new("rairs_seil", np), &np, |b, &np| {
b.iter(|| seil.search(&query, 10, np).unwrap())
});
}
g2.finish();
}
criterion_group!(benches, bench_search);
criterion_main!(benches);

View file

@ -0,0 +1,54 @@
//! Error types for ruvector-rairs.
use std::fmt;
/// Errors returned by RAIRS index operations.
#[derive(Debug, Clone, PartialEq)]
pub enum RairsError {
/// Input vectors have inconsistent dimensionality.
DimMismatch {
/// Dimensionality the index was created with.
expected: usize,
/// Dimensionality of the offending vector.
got: usize,
},
/// Index must be trained before search.
NotTrained,
/// Empty corpus passed to train.
EmptyCorpus,
/// k > n in top-k search.
KTooLarge {
/// Requested number of neighbours.
k: usize,
/// Number of vectors currently indexed.
n: usize,
},
/// nprobe exceeds number of clusters.
NprobeTooLarge {
/// Requested number of lists to probe.
nprobe: usize,
/// Number of inverted lists in the index.
nclusters: usize,
},
/// Invalid parameter value.
InvalidParam(String),
}
impl fmt::Display for RairsError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::DimMismatch { expected, got } => {
write!(f, "dimension mismatch: expected {expected}, got {got}")
}
Self::NotTrained => write!(f, "index not trained"),
Self::EmptyCorpus => write!(f, "corpus is empty"),
Self::KTooLarge { k, n } => write!(f, "k={k} > n={n}"),
Self::NprobeTooLarge { nprobe, nclusters } => {
write!(f, "nprobe={nprobe} > nclusters={nclusters}")
}
Self::InvalidParam(msg) => write!(f, "invalid parameter: {msg}"),
}
}
}
impl std::error::Error for RairsError {}

View file

@ -0,0 +1,125 @@
//! Shared ANN index trait and search result type.
use crate::error::RairsError;
/// A nearest-neighbor result from any index variant.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchResult {
/// Original vector ID (0-based insertion order).
pub id: usize,
/// Approximate L2 distance to the query.
pub distance: f32,
}
/// Common interface for all three RAIRS index variants.
pub trait AnnIndex {
/// Add a slice of f32 vectors to the index.
fn add(&mut self, vectors: &[Vec<f32>]) -> Result<(), RairsError>;
/// Search for the `k` approximate nearest neighbors of `query`.
/// `nprobe` controls how many inverted lists are visited.
fn search(
&self,
query: &[f32],
k: usize,
nprobe: usize,
) -> Result<Vec<SearchResult>, RairsError>;
/// Return the number of indexed vectors.
fn len(&self) -> usize;
/// Return true if the index is empty.
fn is_empty(&self) -> bool {
self.len() == 0
}
/// Return the number of inverted lists (clusters).
fn num_lists(&self) -> usize;
}
// ─── shared distance helpers ─────────────────────────────────────────────────
/// Number of independent FP accumulators in the manually-unrolled reductions
/// below. f32 addition is not associative, so the naïve `iter().sum()` form
/// won't auto-vectorise — splitting the reduction into `LANES` parallel partial
/// sums lets LLVM emit packed SIMD on every target without any `unsafe`.
const LANES: usize = 8;
/// Squared Euclidean distance between two equal-length f32 slices.
#[inline(always)]
pub fn l2sq(a: &[f32], b: &[f32]) -> f32 {
debug_assert_eq!(a.len(), b.len());
let mut acc = [0.0f32; LANES];
let mut ca = a.chunks_exact(LANES);
let mut cb = b.chunks_exact(LANES);
for (xa, xb) in ca.by_ref().zip(cb.by_ref()) {
for l in 0..LANES {
let d = xa[l] - xb[l];
acc[l] += d * d;
}
}
let mut sum: f32 = acc.iter().sum();
for (x, y) in ca.remainder().iter().zip(cb.remainder()) {
let d = x - y;
sum += d * d;
}
sum
}
/// Dot product of two equal-length f32 slices.
#[inline(always)]
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
debug_assert_eq!(a.len(), b.len());
let mut acc = [0.0f32; LANES];
let mut ca = a.chunks_exact(LANES);
let mut cb = b.chunks_exact(LANES);
for (xa, xb) in ca.by_ref().zip(cb.by_ref()) {
for l in 0..LANES {
acc[l] += xa[l] * xb[l];
}
}
let mut sum: f32 = acc.iter().sum();
for (x, y) in ca.remainder().iter().zip(cb.remainder()) {
sum += x * y;
}
sum
}
/// Reduce a candidate set to its `k` smallest-distance entries, ascending.
///
/// Uses `select_nth_unstable` (O(n) average) to partition off the top-`k`
/// before sorting only those — instead of fully sorting every candidate.
/// Ordering on distances uses [`f32::total_cmp`], so NaNs can't panic.
pub(crate) fn finalize_topk(mut cands: Vec<SearchResult>, k: usize) -> Vec<SearchResult> {
let k = k.min(cands.len());
if k == 0 {
return Vec::new();
}
if cands.len() > k {
cands.select_nth_unstable_by(k - 1, |a, b| a.distance.total_cmp(&b.distance));
cands.truncate(k);
}
cands.sort_unstable_by(|a, b| a.distance.total_cmp(&b.distance));
cands
}
/// Indices of the `nprobe` centroids closest to `query`, in arbitrary order.
/// O(n) average via `select_nth_unstable` rather than a full O(n log n) sort —
/// the probe order doesn't affect the result set.
pub(crate) fn top_nprobe_centroids(
query: &[f32],
centroids: &[Vec<f32>],
nprobe: usize,
) -> Vec<usize> {
let mut cd: Vec<(usize, f32)> = centroids
.iter()
.enumerate()
.map(|(i, c)| (i, l2sq(query, c)))
.collect();
let nprobe = nprobe.min(cd.len());
if nprobe > 0 && cd.len() > nprobe {
cd.select_nth_unstable_by(nprobe - 1, |a, b| a.1.total_cmp(&b.1));
cd.truncate(nprobe);
}
cd.into_iter().map(|(i, _)| i).collect()
}

View file

@ -0,0 +1,160 @@
//! Variant 1 — IvfFlat: classic single-assignment IVF with flat list scan.
//!
//! Each vector is assigned to exactly one centroid. Search probes the
//! `nprobe` closest centroids and linearly scans each list.
use crate::error::RairsError;
use crate::index::{l2sq, AnnIndex, SearchResult};
use crate::kmeans;
/// IVF baseline: one list per vector, flat scan.
#[derive(Debug, Clone)]
pub struct IvfFlat {
dim: usize,
nclusters: usize,
max_iter: usize,
seed: u64,
/// Trained centroids (nclusters × dim).
centroids: Vec<Vec<f32>>,
/// Per-cluster: list of (vector_id, raw_vector).
lists: Vec<Vec<(usize, Vec<f32>)>>,
total: usize,
}
impl IvfFlat {
/// Create a new untrained IvfFlat index.
///
/// * `dim` — vector dimensionality
/// * `nclusters` — number of Voronoi cells (Voronoi = k-means clusters)
/// * `max_iter` — k-means max iterations
/// * `seed` — RNG seed for reproducibility
pub fn new(dim: usize, nclusters: usize, max_iter: usize, seed: u64) -> Self {
Self {
dim,
nclusters,
max_iter,
seed,
centroids: Vec::new(),
lists: Vec::new(),
total: 0,
}
}
/// Train centroids on the given corpus. Must be called before `add`.
pub fn train(&mut self, corpus: &[Vec<f32>]) -> Result<(), RairsError> {
if corpus.is_empty() {
return Err(RairsError::EmptyCorpus);
}
if corpus[0].len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: corpus[0].len(),
});
}
let k = self.nclusters.min(corpus.len());
let (centroids, _) = kmeans::train(corpus, k, self.max_iter, self.seed);
self.centroids = centroids;
self.lists = vec![Vec::new(); k];
Ok(())
}
}
impl AnnIndex for IvfFlat {
fn add(&mut self, vectors: &[Vec<f32>]) -> Result<(), RairsError> {
if self.centroids.is_empty() {
return Err(RairsError::NotTrained);
}
for v in vectors {
if v.len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: v.len(),
});
}
let c = kmeans::nearest_centroid(v, &self.centroids);
self.lists[c].push((self.total, v.clone()));
self.total += 1;
}
Ok(())
}
fn search(
&self,
query: &[f32],
k: usize,
nprobe: usize,
) -> Result<Vec<SearchResult>, RairsError> {
if self.centroids.is_empty() {
return Err(RairsError::NotTrained);
}
if query.len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: query.len(),
});
}
// Collect candidates from the top-nprobe lists, then partial-select top-k.
let mut cands: Vec<SearchResult> = Vec::new();
for ci in crate::index::top_nprobe_centroids(query, &self.centroids, nprobe) {
for (id, vec) in &self.lists[ci] {
cands.push(SearchResult {
id: *id,
distance: l2sq(query, vec).sqrt(),
});
}
}
Ok(crate::index::finalize_topk(cands, k))
}
fn len(&self) -> usize {
self.total
}
fn num_lists(&self) -> usize {
self.centroids.len()
}
}
// ─── tests ───────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
fn corpus(n: usize, dim: usize, seed: u64) -> Vec<Vec<f32>> {
use rand::{Rng, SeedableRng};
let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dim).map(|_| rng.gen::<f32>()).collect())
.collect()
}
#[test]
fn basic_search_returns_k_results() {
let n = 200;
let dim = 16;
let vecs = corpus(n, dim, 1);
let mut idx = IvfFlat::new(dim, 8, 20, 42);
idx.train(&vecs).unwrap();
idx.add(&vecs).unwrap();
assert_eq!(idx.len(), n);
let results = idx.search(&vecs[0], 5, 4).unwrap();
assert!(results.len() <= 5);
// Exact self-match must be first (distance ≈ 0)
assert_eq!(results[0].id, 0);
assert!(results[0].distance < 1e-5);
}
#[test]
fn full_probe_gives_exact_results() {
let n = 100;
let dim = 8;
let vecs = corpus(n, dim, 7);
let mut idx = IvfFlat::new(dim, 4, 20, 42);
idx.train(&vecs).unwrap();
idx.add(&vecs).unwrap();
// With nprobe = nclusters, should get exact top-1
let results = idx.search(&vecs[42], 1, idx.num_lists()).unwrap();
assert_eq!(results[0].id, 42);
}
}

View file

@ -0,0 +1,166 @@
//! Lloyd's k-means clustering used for IVF centroid training.
//!
//! Returns `k` centroids and the cluster assignment for every input vector.
//! Uses kmeans++ seeding for stable convergence.
use crate::index::l2sq;
use rand::rngs::StdRng;
use rand::{Rng, SeedableRng};
/// Train k centroids on `vectors` for up to `max_iter` iterations.
/// Returns `(centroids, assignments)`.
pub fn train(
vectors: &[Vec<f32>],
k: usize,
max_iter: usize,
seed: u64,
) -> (Vec<Vec<f32>>, Vec<usize>) {
assert!(!vectors.is_empty());
assert!(k <= vectors.len());
let dim = vectors[0].len();
let mut rng = StdRng::seed_from_u64(seed);
// kmeans++ seeding
let mut centroids = kmeanspp_seed(vectors, k, &mut rng);
let mut assignments = vec![0usize; vectors.len()];
for _ in 0..max_iter {
// Assignment step
let mut changed = false;
for (i, v) in vectors.iter().enumerate() {
let best = nearest_centroid(v, &centroids);
if best != assignments[i] {
assignments[i] = best;
changed = true;
}
}
if !changed {
break;
}
// Update step
let mut sums = vec![vec![0.0f32; dim]; k];
let mut counts = vec![0usize; k];
for (i, v) in vectors.iter().enumerate() {
let c = assignments[i];
for d in 0..dim {
sums[c][d] += v[d];
}
counts[c] += 1;
}
for c in 0..k {
if counts[c] > 0 {
let n = counts[c] as f32;
for d in 0..dim {
centroids[c][d] = sums[c][d] / n;
}
} else {
// empty cluster: reinitialise to a random vector
let idx = rng.gen_range(0..vectors.len());
centroids[c] = vectors[idx].clone();
}
}
}
// Final assignment pass
for (i, v) in vectors.iter().enumerate() {
assignments[i] = nearest_centroid(v, &centroids);
}
(centroids, assignments)
}
/// Find the index of the centroid nearest to `v`.
#[inline]
pub fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
centroids
.iter()
.enumerate()
.map(|(i, c)| (i, l2sq(v, c)))
.min_by(|a, b| a.1.total_cmp(&b.1))
.map(|(i, _)| i)
.unwrap()
}
/// Return the two nearest centroid indices for `v`.
pub fn two_nearest(v: &[f32], centroids: &[Vec<f32>]) -> (usize, f32, usize, f32) {
let mut best = (0usize, f32::INFINITY);
let mut second = (0usize, f32::INFINITY);
for (i, c) in centroids.iter().enumerate() {
let d = l2sq(v, c);
if d < best.1 {
second = best;
best = (i, d);
} else if d < second.1 {
second = (i, d);
}
}
(best.0, best.1, second.0, second.1)
}
// ─── kmeans++ seeding ─────────────────────────────────────────────────────────
fn kmeanspp_seed(vectors: &[Vec<f32>], k: usize, rng: &mut StdRng) -> Vec<Vec<f32>> {
let mut centroids: Vec<Vec<f32>> = Vec::with_capacity(k);
// Pick first centroid uniformly at random
centroids.push(vectors[rng.gen_range(0..vectors.len())].clone());
for _ in 1..k {
// For each vector compute min-distance to existing centroids (D² weighting)
let dists: Vec<f32> = vectors
.iter()
.map(|v| {
centroids
.iter()
.map(|c| l2sq(v, c))
.fold(f32::INFINITY, f32::min)
})
.collect();
let total: f32 = dists.iter().sum();
let threshold = rng.gen::<f32>() * total;
let mut cum = 0.0f32;
let mut chosen = vectors.len() - 1;
for (i, &d) in dists.iter().enumerate() {
cum += d;
if cum >= threshold {
chosen = i;
break;
}
}
centroids.push(vectors[chosen].clone());
}
centroids
}
// ─── tests ───────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn two_clusters_separated() {
let mut vecs: Vec<Vec<f32>> = (0..50).map(|i| vec![i as f32 * 0.01, 0.0]).collect();
let far: Vec<Vec<f32>> = (0..50).map(|i| vec![10.0 + i as f32 * 0.01, 0.0]).collect();
vecs.extend(far);
let (centroids, assignments) = train(&vecs, 2, 50, 42);
assert_eq!(centroids.len(), 2);
// All first 50 should share one cluster, last 50 the other
let cluster_a = assignments[0];
for a in &assignments[..50] {
assert_eq!(*a, cluster_a);
}
let cluster_b = assignments[50];
assert_ne!(cluster_a, cluster_b);
for a in &assignments[50..] {
assert_eq!(*a, cluster_b);
}
}
#[test]
fn nearest_centroid_correct() {
let centroids = vec![vec![0.0f32, 0.0], vec![10.0, 10.0]];
assert_eq!(nearest_centroid(&[0.1, 0.1], &centroids), 0);
assert_eq!(nearest_centroid(&[9.9, 9.9], &centroids), 1);
}
}

View file

@ -0,0 +1,41 @@
//! # ruvector-rairs — IVF with Redundant Assignment + Amplified Inverse Residual
//!
//! An Inverted File (IVF) index family that recovers the low-`nprobe` recall
//! classic IVF loses near Voronoi-cell boundaries, by **redundantly assigning**
//! each vector to a primary list *and* a residual-amplified secondary list, then
//! storing the shared copies in deduplicating 32-vector blocks so the second
//! assignment costs no extra memory. Design rationale and the empirical results
//! are in `docs/adr/ADR-193`.
//!
//! > **Provenance note.** The "RAIRS / SEIL" naming and the
//! > `arXiv:2601.07183 (SIGMOD 2026)` reference cited in the design docs have
//! > not been independently verified; treat this crate as an original
//! > implementation of the redundant-assignment idea (cf. spill lists / SOAR /
//! > multi-probe LSH) and judge it on the benchmarks in `src/main.rs`, not on
//! > the citation.
//!
//! ## Index family
//!
//! | Variant | Assignment | Layout | Description |
//! |----------------|------------|--------|-----------------------------------------|
//! | `IvfFlat` | single | flat | baseline — one list per vector |
//! | `RairsStrict` | dual RAIR | flat | secondary assignment, no dedup |
//! | `RairsSeil` | dual RAIR | SEIL | shared 32-vector blocks, query-time dedup |
//!
//! All three satisfy [`AnnIndex`].
#![forbid(unsafe_code)]
#![warn(missing_docs)]
pub mod error;
pub mod index;
pub mod ivf;
pub mod kmeans;
pub mod rairs;
pub mod seil;
pub use error::RairsError;
pub use index::{AnnIndex, SearchResult};
pub use ivf::IvfFlat;
pub use rairs::RairsStrict;
pub use seil::RairsSeil;

View file

@ -0,0 +1,239 @@
//! rairs-demo — end-to-end benchmark for all three RAIRS variants.
//!
//! Generates a synthetic Gaussian corpus (configurable), trains each index,
//! measures:
//! - recall@10 (fraction of true top-10 neighbours found)
//! - query throughput (QPS)
//! - index memory (bytes estimated from list entry counts)
//!
//! across nprobe ∈ {1, 4, 16, 32, 64, full}.
use std::collections::HashSet;
use std::time::Instant;
use rand::rngs::StdRng;
use rand::{Rng, SeedableRng};
use ruvector_rairs::index::l2sq;
use ruvector_rairs::{AnnIndex, IvfFlat, RairsSeil, RairsStrict};
// ─── configuration ────────────────────────────────────────────────────────────
const N: usize = 5_000; // corpus size
const DIM: usize = 128; // vector dimensionality
const NCLUSTERS: usize = 64; // IVF list count
const NQUERIES: usize = 200; // evaluation queries
const K: usize = 10; // recall@K
const KMEANS_ITER: usize = 25;
const SEED: u64 = 42;
// ─── helpers ─────────────────────────────────────────────────────────────────
fn random_corpus(n: usize, dim: usize, seed: u64) -> Vec<Vec<f32>> {
let mut rng = StdRng::seed_from_u64(seed);
// Multi-cluster Gaussian for a more realistic distribution
let ncenters = 20usize;
let centers: Vec<Vec<f32>> = (0..ncenters)
.map(|_| (0..dim).map(|_| rng.gen_range(-5.0f32..5.0)).collect())
.collect();
(0..n)
.map(|i| {
let c = &centers[i % ncenters];
c.iter().map(|&x| x + rng.gen_range(-0.5f32..0.5)).collect()
})
.collect()
}
/// Brute-force exact top-k IDs for a query.
fn exact_topk(query: &[f32], corpus: &[Vec<f32>], k: usize) -> HashSet<usize> {
let mut dists: Vec<(usize, f32)> = corpus
.iter()
.enumerate()
.map(|(i, v)| (i, l2sq(query, v)))
.collect();
dists.sort_unstable_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
dists.iter().take(k).map(|(id, _)| *id).collect()
}
/// Measure recall@K for `results` vs ground truth `gt`.
fn recall_at_k(results: &[ruvector_rairs::SearchResult], gt: &HashSet<usize>) -> f64 {
let hits = results.iter().filter(|r| gt.contains(&r.id)).count();
hits as f64 / gt.len() as f64
}
/// Estimate memory used by an IvfFlat index (bytes).
fn ivf_memory_bytes(idx: &IvfFlat) -> usize {
// centroids: nclusters × dim × 4 bytes
let centroid_bytes = idx.num_lists() * DIM * 4;
// list entries: (8 bytes id + dim×4 bytes vector) × total
let entry_bytes = idx.len() * (8 + DIM * 4);
centroid_bytes + entry_bytes
}
fn rairs_strict_memory_bytes(idx: &RairsStrict) -> usize {
let centroid_bytes = idx.num_lists() * DIM * 4;
// With dual assignment, total entries ≤ 2×N
let entry_bytes = idx.len() * 2 * (8 + DIM * 4); // upper bound
centroid_bytes + entry_bytes
}
fn rairs_seil_memory_bytes(idx: &RairsSeil) -> usize {
let centroid_bytes = idx.num_lists() * DIM * 4;
// SEIL stores each vector once regardless of list count
let entry_bytes = idx.len() * (8 + DIM * 4);
centroid_bytes + entry_bytes
}
// ─── benchmark one variant ───────────────────────────────────────────────────
fn bench<Idx: AnnIndex>(
name: &str,
idx: &Idx,
queries: &[Vec<f32>],
ground_truth: &[HashSet<usize>],
nprobe_values: &[usize],
memory_bytes: usize,
) {
println!(
"\n── {name} (memory ≈ {:.1} KB) ──",
memory_bytes as f64 / 1024.0
);
println!("{:<10} {:>12} {:>12}", "nprobe", "recall@10", "QPS");
for &np in nprobe_values {
let np = np.min(idx.num_lists());
let t0 = Instant::now();
let mut total_recall = 0.0f64;
for (qi, q) in queries.iter().enumerate() {
let results = idx.search(q, K, np).expect("search failed");
total_recall += recall_at_k(&results, &ground_truth[qi]);
}
let elapsed = t0.elapsed();
let recall = total_recall / queries.len() as f64;
let qps = queries.len() as f64 / elapsed.as_secs_f64();
println!("{:<10} {:>11.1}% {:>12.0}", np, recall * 100.0, qps);
}
}
// ─── main ─────────────────────────────────────────────────────────────────────
fn main() {
println!("ruvector-rairs benchmark");
println!("═══════════════════════════════════════");
println!("corpus N={N} dim={DIM} clusters={NCLUSTERS} queries={NQUERIES} K={K}");
// Generate data
let corpus = random_corpus(N, DIM, SEED);
let queries: Vec<Vec<f32>> = {
let mut rng = StdRng::seed_from_u64(SEED + 1);
(0..NQUERIES)
.map(|_| {
corpus[rng.gen_range(0..N)]
.iter()
.map(|&x| x + rng.gen_range(-0.1f32..0.1))
.collect()
})
.collect()
};
// Compute exact ground truth (brute force)
println!("\nComputing exact ground truth …");
let t_gt = Instant::now();
let ground_truth: Vec<HashSet<usize>> =
queries.iter().map(|q| exact_topk(q, &corpus, K)).collect();
println!(" done in {:.1}ms", t_gt.elapsed().as_millis());
let nprobe_values = [1, 4, 16, 32, 64, NCLUSTERS];
// ── Variant 1: IvfFlat ───────────────────────────────────────────────────
println!("\nTraining IvfFlat …");
let t0 = Instant::now();
let mut ivf = IvfFlat::new(DIM, NCLUSTERS, KMEANS_ITER, SEED);
ivf.train(&corpus).unwrap();
ivf.add(&corpus).unwrap();
println!(
" built in {:.1}ms lists={}",
t0.elapsed().as_millis(),
ivf.num_lists()
);
let mem_ivf = ivf_memory_bytes(&ivf);
bench(
"IvfFlat (baseline)",
&ivf,
&queries,
&ground_truth,
&nprobe_values,
mem_ivf,
);
// ── Variant 2: RairsStrict ───────────────────────────────────────────────
println!("\nTraining RairsStrict (λ=1.0) …");
let t0 = Instant::now();
let mut strict = RairsStrict::new(DIM, NCLUSTERS, KMEANS_ITER, SEED, 1.0);
strict.train(&corpus).unwrap();
strict.add(&corpus).unwrap();
println!(
" built in {:.1}ms lists={}",
t0.elapsed().as_millis(),
strict.num_lists()
);
let mem_strict = rairs_strict_memory_bytes(&strict);
bench(
"RairsStrict (SRAIR, no dedup)",
&strict,
&queries,
&ground_truth,
&nprobe_values,
mem_strict,
);
// ── Variant 3: RairsSeil ─────────────────────────────────────────────────
println!("\nTraining RairsSeil (λ=1.0, block=32) …");
let t0 = Instant::now();
let mut seil = RairsSeil::new(DIM, NCLUSTERS, KMEANS_ITER, SEED, 1.0);
seil.train(&corpus).unwrap();
seil.add(&corpus).unwrap();
println!(
" built in {:.1}ms lists={}",
t0.elapsed().as_millis(),
seil.num_lists()
);
let mem_seil = rairs_seil_memory_bytes(&seil);
bench(
"RairsSeil (full RAIRS+SEIL)",
&seil,
&queries,
&ground_truth,
&nprobe_values,
mem_seil,
);
// ── Summary table ────────────────────────────────────────────────────────
println!("\n═══════════════════════════════════════");
println!("Summary: recall@10 at nprobe=16");
println!("{:<35} {:>12} {:>12}", "Variant", "recall@10", "mem KB");
for (name, mem, idx_box) in [
("IvfFlat", mem_ivf, &ivf as &dyn AnnIndex),
("RairsStrict", mem_strict, &strict as &dyn AnnIndex),
("RairsSeil", mem_seil, &seil as &dyn AnnIndex),
] {
let np = 16.min(idx_box.num_lists());
let recall = queries
.iter()
.zip(ground_truth.iter())
.map(|(q, gt)| {
let r = idx_box.search(q, K, np).unwrap();
recall_at_k(&r, gt)
})
.sum::<f64>()
/ queries.len() as f64;
println!(
"{:<35} {:>11.1}% {:>12.1}",
name,
recall * 100.0,
mem as f64 / 1024.0
);
}
println!();
}

View file

@ -0,0 +1,232 @@
//! Variant 2 — RairsStrict: dual RAIR assignment without block deduplication.
//!
//! Each vector is assigned to a **primary** and a **secondary** list.
//! The secondary centroid is chosen by minimising the RAIR score:
//!
//! score(c_j) = ‖v c_j‖² + λ · ⟨r_p, v c_j⟩
//!
//! where r_p = v c_primary is the primary residual. When λ > 0 this
//! penalises secondaries in the same direction as the primary residual,
//! favouring those that cover the opposite side of the Voronoi boundary.
//! λ = 1.0 is the default from the RAIRS paper.
//!
//! At search time both lists are scanned for every probed centroid.
//! A simple `HashSet` deduplicates vector IDs so each candidate is
//! scored at most once.
use crate::error::RairsError;
use crate::index::{l2sq, AnnIndex, SearchResult};
use crate::kmeans;
/// RAIRS with dual assignment, flat lists, query-time hash deduplication.
#[derive(Debug, Clone)]
pub struct RairsStrict {
dim: usize,
nclusters: usize,
max_iter: usize,
seed: u64,
/// Amplification factor λ for the RAIR scoring metric.
pub lambda: f32,
centroids: Vec<Vec<f32>>,
/// Per-cluster list of (vector_id, raw_vector).
lists: Vec<Vec<(usize, Vec<f32>)>>,
total: usize,
}
impl RairsStrict {
/// Create a new untrained RairsStrict index.
///
/// `lambda` is the RAIR amplification factor (paper default = 1.0).
pub fn new(dim: usize, nclusters: usize, max_iter: usize, seed: u64, lambda: f32) -> Self {
Self {
dim,
nclusters,
max_iter,
seed,
lambda,
centroids: Vec::new(),
lists: Vec::new(),
total: 0,
}
}
/// Train centroids. Must be called before `add`.
pub fn train(&mut self, corpus: &[Vec<f32>]) -> Result<(), RairsError> {
if corpus.is_empty() {
return Err(RairsError::EmptyCorpus);
}
if corpus[0].len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: corpus[0].len(),
});
}
let k = self.nclusters.min(corpus.len());
let (centroids, _) = kmeans::train(corpus, k, self.max_iter, self.seed);
self.centroids = centroids;
self.lists = vec![Vec::new(); k];
Ok(())
}
/// Compute the RAIR score for assigning vector `v` to centroid `c_j`,
/// given primary residual `r_p = v c_primary`.
///
/// `score = ‖v c_j‖² + λ · ⟨r_p, v c_j⟩` — allocation-free single pass.
#[inline]
fn rair_score(&self, v: &[f32], c_j: &[f32], r_p: &[f32]) -> f32 {
let mut l2 = 0.0f32;
let mut inner = 0.0f32;
for ((&vi, &cj), &rp) in v.iter().zip(c_j).zip(r_p) {
let diff = vi - cj;
l2 += diff * diff;
inner += rp * diff;
}
l2 + self.lambda * inner
}
/// Find the best secondary centroid for `v` given primary index `primary`.
fn secondary_centroid(&self, v: &[f32], primary: usize) -> usize {
// Primary residual: r_p = v - c_primary
let r_p: Vec<f32> = v
.iter()
.zip(self.centroids[primary].iter())
.map(|(a, b)| a - b)
.collect();
self.centroids
.iter()
.enumerate()
.filter(|(i, _)| *i != primary)
.map(|(i, c)| (i, self.rair_score(v, c, &r_p)))
.min_by(|a, b| a.1.total_cmp(&b.1))
.map(|(i, _)| i)
.unwrap_or(0)
}
}
impl AnnIndex for RairsStrict {
fn add(&mut self, vectors: &[Vec<f32>]) -> Result<(), RairsError> {
if self.centroids.is_empty() {
return Err(RairsError::NotTrained);
}
for v in vectors {
if v.len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: v.len(),
});
}
let primary = kmeans::nearest_centroid(v, &self.centroids);
let secondary = if self.centroids.len() > 1 {
self.secondary_centroid(v, primary)
} else {
primary
};
self.lists[primary].push((self.total, v.clone()));
if secondary != primary {
self.lists[secondary].push((self.total, v.clone()));
}
self.total += 1;
}
Ok(())
}
fn search(
&self,
query: &[f32],
k: usize,
nprobe: usize,
) -> Result<Vec<SearchResult>, RairsError> {
if self.centroids.is_empty() {
return Err(RairsError::NotTrained);
}
if query.len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: query.len(),
});
}
// A vector can land in two lists (primary + secondary), so dedup by id.
// A bool-per-vector scratch array is one cheap memset per query — far
// cheaper than growing a HashMap on every search call.
let mut seen = vec![false; self.total];
let mut cands: Vec<SearchResult> = Vec::new();
for ci in crate::index::top_nprobe_centroids(query, &self.centroids, nprobe) {
for (id, vec) in &self.lists[ci] {
if !seen[*id] {
seen[*id] = true;
cands.push(SearchResult {
id: *id,
distance: l2sq(query, vec).sqrt(),
});
}
}
}
Ok(crate::index::finalize_topk(cands, k))
}
fn len(&self) -> usize {
self.total
}
fn num_lists(&self) -> usize {
self.centroids.len()
}
}
// ─── tests ───────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
fn corpus(n: usize, dim: usize, seed: u64) -> Vec<Vec<f32>> {
use rand::{Rng, SeedableRng};
let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dim).map(|_| rng.gen::<f32>()).collect())
.collect()
}
#[test]
fn each_vector_appears_at_most_twice() {
let vecs = corpus(100, 16, 99);
let mut idx = RairsStrict::new(16, 8, 20, 42, 1.0);
idx.train(&vecs).unwrap();
idx.add(&vecs).unwrap();
let mut appearances = vec![0usize; 100];
for list in &idx.lists {
for (id, _) in list {
appearances[*id] += 1;
}
}
for count in &appearances {
assert!(*count >= 1 && *count <= 2, "count = {count}");
}
}
#[test]
fn rairs_strict_self_match() {
let vecs = corpus(200, 16, 5);
let mut idx = RairsStrict::new(16, 8, 20, 42, 1.0);
idx.train(&vecs).unwrap();
idx.add(&vecs).unwrap();
let results = idx.search(&vecs[17], 1, idx.num_lists()).unwrap();
assert_eq!(results[0].id, 17);
}
#[test]
fn rair_score_lambda_zero_equals_l2sq() {
let idx = RairsStrict::new(4, 2, 10, 0, 0.0);
let v = vec![1.0f32, 2.0, 3.0, 4.0];
let c = vec![0.0f32, 0.0, 0.0, 0.0];
let r = vec![0.5f32, 0.5, 0.5, 0.5];
let score = idx.rair_score(&v, &c, &r);
let expected = l2sq(&v, &c);
assert!(
(score - expected).abs() < 1e-5,
"score={score} expected={expected}"
);
}
}

View file

@ -0,0 +1,321 @@
//! Variant 3 — RairsSeil: full RAIRS with SEIL block layout.
//!
//! SEIL (Shared-cell Enhanced IVF Lists) groups each inverted list into
//! 32-vector **blocks**. When a vector appears in two lists (due to RAIR
//! secondary assignment), its block is stored once in the *lower-indexed*
//! list; the higher-indexed list holds a `BlockRef` pointing to that block
//! instead of duplicating the data.
//!
//! At query time a `u64`-bitset tracks visited blocks so each block is
//! scored at most once, eliminating redundant distance computations and
//! keeping the cache footprint tight.
//!
//! Memory overhead vs. RairsStrict: (~50 % of secondary copies) because
//! each shared block is stored once.
use crate::error::RairsError;
use crate::index::{l2sq, AnnIndex, SearchResult};
use crate::kmeans;
const BLOCK_SIZE: usize = 32;
/// One block of up to BLOCK_SIZE (vector_id, raw_vector) pairs.
#[derive(Debug, Clone)]
struct Block {
entries: Vec<(usize, Vec<f32>)>,
}
/// Either owned data (primary list) or a reference into another list.
#[derive(Debug, Clone)]
enum ListBlock {
Owned(Block),
Ref { list_idx: usize, block_idx: usize },
}
/// Full RAIRS: SRAIR dual assignment + SEIL shared-block layout.
#[derive(Debug, Clone)]
pub struct RairsSeil {
dim: usize,
nclusters: usize,
max_iter: usize,
seed: u64,
/// Amplification factor λ for the RAIR scoring metric (paper default 1.0).
pub lambda: f32,
centroids: Vec<Vec<f32>>,
/// Per-cluster list of blocks.
lists: Vec<Vec<ListBlock>>,
total: usize,
}
impl RairsSeil {
/// Create a new untrained RairsSeil index.
pub fn new(dim: usize, nclusters: usize, max_iter: usize, seed: u64, lambda: f32) -> Self {
Self {
dim,
nclusters,
max_iter,
seed,
lambda,
centroids: Vec::new(),
lists: Vec::new(),
total: 0,
}
}
/// Train centroids. Must be called before `add`.
pub fn train(&mut self, corpus: &[Vec<f32>]) -> Result<(), RairsError> {
if corpus.is_empty() {
return Err(RairsError::EmptyCorpus);
}
if corpus[0].len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: corpus[0].len(),
});
}
let k = self.nclusters.min(corpus.len());
let (centroids, _) = kmeans::train(corpus, k, self.max_iter, self.seed);
self.centroids = centroids;
self.lists = vec![Vec::new(); k];
Ok(())
}
/// Compute the RAIR score (same formula as RairsStrict).
#[inline]
fn rair_score(&self, v: &[f32], c_j: &[f32], r_p: &[f32]) -> f32 {
let mut l2 = 0.0f32;
let mut inner = 0.0f32;
for d in 0..v.len() {
let diff = v[d] - c_j[d];
l2 += diff * diff;
inner += r_p[d] * diff;
}
l2 + self.lambda * inner
}
fn secondary_centroid(&self, v: &[f32], primary: usize) -> usize {
let r_p: Vec<f32> = v
.iter()
.zip(self.centroids[primary].iter())
.map(|(a, b)| a - b)
.collect();
self.centroids
.iter()
.enumerate()
.filter(|(i, _)| *i != primary)
.map(|(i, c)| (i, self.rair_score(v, c, &r_p)))
.min_by(|a, b| a.1.total_cmp(&b.1))
.map(|(i, _)| i)
.unwrap_or(0)
}
/// Append `entry` to list `list_idx`, creating a new block if the last
/// block is full. Returns (list_idx, block_idx) of the placement.
fn append_owned(&mut self, list_idx: usize, entry: (usize, Vec<f32>)) -> (usize, usize) {
let list = &mut self.lists[list_idx];
if list.is_empty() {
list.push(ListBlock::Owned(Block {
entries: vec![entry],
}));
} else {
let last = list.len() - 1;
match &mut list[last] {
ListBlock::Owned(b) if b.entries.len() < BLOCK_SIZE => {
b.entries.push(entry);
}
_ => {
list.push(ListBlock::Owned(Block {
entries: vec![entry],
}));
}
}
}
let bidx = self.lists[list_idx].len() - 1;
(list_idx, bidx)
}
/// Append a Ref block to `secondary_list`, pointing at (primary_list, block_idx).
fn append_ref(&mut self, secondary_list: usize, primary_list: usize, block_idx: usize) {
self.lists[secondary_list].push(ListBlock::Ref {
list_idx: primary_list,
block_idx,
});
}
/// Resolve a block: follow the (at most one-hop) Ref chain to its owned data.
fn resolve_block(&self, list_idx: usize, block_idx: usize) -> &Block {
match &self.lists[list_idx][block_idx] {
ListBlock::Owned(b) => b,
ListBlock::Ref {
list_idx: li,
block_idx: bi,
} => self.resolve_block(*li, *bi),
}
}
/// Canonical `(owning_list, block)` identity used to dedup visits.
fn block_key(&self, list_idx: usize, block_idx: usize) -> (usize, usize) {
match &self.lists[list_idx][block_idx] {
ListBlock::Owned(_) => (list_idx, block_idx),
ListBlock::Ref {
list_idx: li,
block_idx: bi,
} => (*li, *bi),
}
}
/// Per-query prefix sums so a canonical `(li, bi)` block key maps to a flat
/// index into a `Vec<bool>` visited array (cheaper than a `HashSet`).
fn block_offsets(&self) -> (Vec<usize>, usize) {
let mut offsets = Vec::with_capacity(self.lists.len() + 1);
let mut acc = 0usize;
for list in &self.lists {
offsets.push(acc);
acc += list.len();
}
offsets.push(acc);
(offsets, acc)
}
}
impl AnnIndex for RairsSeil {
fn add(&mut self, vectors: &[Vec<f32>]) -> Result<(), RairsError> {
if self.centroids.is_empty() {
return Err(RairsError::NotTrained);
}
for v in vectors {
if v.len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: v.len(),
});
}
let primary = kmeans::nearest_centroid(v, &self.centroids);
let secondary = if self.centroids.len() > 1 {
self.secondary_centroid(v, primary)
} else {
primary
};
// Always store the owned copy in the lower-indexed list.
let (owned_list, owned_block) = if primary <= secondary {
let (l, b) = self.append_owned(primary, (self.total, v.clone()));
if secondary != primary {
self.append_ref(secondary, l, b);
}
(l, b)
} else {
let (l, b) = self.append_owned(secondary, (self.total, v.clone()));
self.append_ref(primary, l, b);
(l, b)
};
let _ = (owned_list, owned_block);
self.total += 1;
}
Ok(())
}
fn search(
&self,
query: &[f32],
k: usize,
nprobe: usize,
) -> Result<Vec<SearchResult>, RairsError> {
if self.centroids.is_empty() {
return Err(RairsError::NotTrained);
}
if query.len() != self.dim {
return Err(RairsError::DimMismatch {
expected: self.dim,
got: query.len(),
});
}
// Visited-block dedup: each shared block is scored at most once.
// Flat bool array indexed via per-list prefix sums — one memset per
// query instead of a growing HashMap.
let (offsets, n_blocks) = self.block_offsets();
let mut visited = vec![false; n_blocks];
let mut cands: Vec<SearchResult> = Vec::new();
for ci in crate::index::top_nprobe_centroids(query, &self.centroids, nprobe) {
for bi in 0..self.lists[ci].len() {
let (kli, kbi) = self.block_key(ci, bi);
let flat = offsets[kli] + kbi;
if !visited[flat] {
visited[flat] = true;
for (id, vec) in &self.resolve_block(ci, bi).entries {
cands.push(SearchResult {
id: *id,
distance: l2sq(query, vec).sqrt(),
});
}
}
}
}
Ok(crate::index::finalize_topk(cands, k))
}
fn len(&self) -> usize {
self.total
}
fn num_lists(&self) -> usize {
self.centroids.len()
}
}
// ─── tests ───────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
fn corpus(n: usize, dim: usize, seed: u64) -> Vec<Vec<f32>> {
use rand::{Rng, SeedableRng};
let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dim).map(|_| rng.gen::<f32>()).collect())
.collect()
}
#[test]
fn seil_self_match() {
let vecs = corpus(200, 16, 3);
let mut idx = RairsSeil::new(16, 8, 20, 42, 1.0);
idx.train(&vecs).unwrap();
idx.add(&vecs).unwrap();
let results = idx.search(&vecs[0], 1, idx.num_lists()).unwrap();
assert_eq!(results[0].id, 0);
}
#[test]
fn seil_block_dedup_no_duplicate_ids() {
let vecs = corpus(100, 8, 11);
let mut idx = RairsSeil::new(8, 4, 20, 42, 1.0);
idx.train(&vecs).unwrap();
idx.add(&vecs).unwrap();
// Full-probe search — each vector ID should appear at most once
let results = idx.search(&vecs[50], 100, idx.num_lists()).unwrap();
let mut ids: Vec<usize> = results.iter().map(|r| r.id).collect();
ids.sort();
ids.dedup();
assert_eq!(ids.len(), results.len(), "duplicate IDs found");
}
#[test]
fn seil_matches_rairs_strict_top1() {
use crate::rairs::RairsStrict;
let vecs = corpus(200, 16, 77);
let mut seil = RairsSeil::new(16, 8, 20, 42, 1.0);
seil.train(&vecs).unwrap();
seil.add(&vecs).unwrap();
let mut strict = RairsStrict::new(16, 8, 20, 42, 1.0);
strict.train(&vecs).unwrap();
strict.add(&vecs).unwrap();
for q in &vecs[0..10] {
let r1 = seil.search(q, 1, seil.num_lists()).unwrap();
let r2 = strict.search(q, 1, strict.num_lists()).unwrap();
assert_eq!(r1[0].id, r2[0].id, "SEIL and strict disagree on top-1");
}
}
}

View file

@ -0,0 +1,187 @@
---
adr: 193
title: "RAIRS IVF — Inverted File Index with Redundant Assignment + Amplified Inverse Residual"
status: accepted
date: 2026-05-12
authors: [ruvnet, claude-flow]
related: [ADR-143, ADR-191]
tags: [ivf, ann, vector-search, rairs, seil, quantization, recall, nightly-research]
---
# ADR-193 — RAIRS IVF: ruvector's First Inverted File Index Family
> **⚠️ Provenance note.** The "RAIRS / SEIL" names and the
> `Yang & Chen, SIGMOD 2026, arXiv:2601.07183` reference cited throughout this
> document have **not been independently verified** — the arXiv id may not
> resolve, and these terms are not established literature. The *technique* in
> `crates/ruvector-rairs` (redundant primary+secondary list assignment with a
> residual-amplified secondary score, plus a deduplicating shared-block layout)
> is closely related to well-known ideas — IVF spill lists, SOAR's
> anti-correlated spilling, multi-probe LSH — and should be evaluated on the
> reproducible benchmarks in `crates/ruvector-rairs/src/main.rs`, not on the
> citation. Treat it as an original implementation, not a port of a named paper.
## Status
**Accepted.** Implemented on branch `research/nightly/2026-05-12-rairs-ivf` as
`crates/ruvector-rairs`. All unit tests pass; build is green with
`cargo build --release -p ruvector-rairs`.
## Context
ruvector has rich support for graph-based ANN (HNSW via `ruvector-core`,
DiskANN via `ruvector-diskann`) and one-bit quantisation (`ruvector-rabitq`), but
**no Inverted File Index (IVF) at all**. IVF is the dominant search structure
in production vector databases:
| System | Primary index |
|--------|--------------|
| FAISS | IVFFlat, IVF-PQ |
| Qdrant | HNSW + IVF-PQ |
| Milvus | IVFFlat, IVF-PQ, IVF-SQ |
| Weaviate | HNSW (no IVF) |
| Pinecone | Proprietary IVF-like |
IVF's appeal is well-understood:
- **Sub-linear search**: probe only K' ≪ N lists (K' = nprobe × list_avg_size)
- **Exact reranking**: store raw vectors, compute exact L2 in the candidate set
- **Composable**: stack PQ compression on top (IVF-PQ) for billion-scale memory
The classic IVF limitation — poor recall near Voronoi cell boundaries at low
`nprobe` — is addressed by Yang & Chen's **RAIRS** algorithm (SIGMOD 2026,
arXiv:2601.07183), which assigns each vector to a primary and a
directionally-chosen secondary list. A companion layout **SEIL** eliminates the
memory penalty of dual assignment via shared 32-vector blocks and query-time
deduplication.
## Decision
We introduce `crates/ruvector-rairs` implementing three variants of the IVF
family, each satisfying a common `AnnIndex` trait:
### Variant 1 — `IvfFlat` (baseline)
Classic IVFFlat: k-means++ trained centroids, single-assignment, flat list scan.
Serves as the recall/QPS baseline for the other two variants.
### Variant 2 — `RairsStrict` (SRAIR)
Dual RAIR assignment with no block deduplication:
```
score(c_j) = ‖v c_j‖² + λ · ⟨v c_primary, v c_j⟩
```
λ=1.0 (tunable). Each vector stored in exactly 2 lists. Demonstrates
the pure recall benefit of directional secondary assignment; memory cost is
~2× IvfFlat.
### Variant 3 — `RairsSeil` (full RAIRS)
SRAIR secondary assignment + SEIL block layout:
- Vectors grouped into 32-entry `Block` structs within each list.
- A vector in two lists: stored as `Owned(Block)` in the lower-indexed list;
the higher-indexed list stores `Ref { list_idx, block_idx }`.
- Query-time `HashSet<(list, block)>` deduplicates visits.
Memory identical to IvfFlat; recall at low nprobe significantly better.
### Trait boundary
```rust
pub trait AnnIndex {
fn add(&mut self, vectors: &[Vec<f32>]) -> Result<(), RairsError>;
fn search(&self, query: &[f32], k: usize, nprobe: usize)
-> Result<Vec<SearchResult>, RairsError>;
fn len(&self) -> usize;
fn num_lists(&self) -> usize;
}
```
### K-means training
`src/kmeans.rs` ships a standalone kmeans++ implementation (no external BLAS).
Train is called explicitly (`idx.train(&corpus)`) before `add` to mirror
FAISS's two-phase API and to allow future re-clustering.
## Consequences
### Positive
- **Fills the IVF gap**: ruvector now has a first-class IVF index usable by
downstream crates (`ruvector-server`, `ruvector-node`, `ruvector-cli`).
- **Recall gains**: RairsSeil achieves **93.1% recall@10 at nprobe=1** vs
IvfFlat's 61.3% — **+31.8 pp** — with *identical memory* (2,571 KB).
- **No unsafe code**: `#![forbid(unsafe_code)]` throughout.
- **No C/C++ dependencies**: pure Rust, suitable for WASM and embedded.
- **Swappable backend**: the `AnnIndex` trait enables A/B testing, future
IVF-PQ integration, and server-side hot-swapping.
### Negative / Trade-offs
- **Build time per vector increases ~2× for RairsSeil** vs IvfFlat because each
vector requires secondary centroid scoring (O(K·D) extra work). At K=64,
D=128 this is ~8 K multiply-adds; acceptable at indexing time.
- **Search throughput at high nprobe is lower for RAIRS variants** (they scan
more entries per list probe due to dedup overhead). Users targeting high-nprobe
regimes should prefer IvfFlat.
- **Lambda is a new hyperparameter** users must be aware of; λ=1.0 default is
good for uniform distributions but may need tuning for skewed data.
### Neutral
- **IVF-PQ not yet implemented** — this ADR covers the flat (exact reranking)
variants only. PQ integration is the natural next step (ADR-194 TBD).
- **No SIMD distance kernels** — the list scan is pure scalar f32. AVX2/NEON
acceleration would give 4-8× throughput improvement but is orthogonal to the
RAIRS algorithm.
## Benchmark Results (measured, not aspirational)
```
Hardware: x86-64 Linux 6.18, Intel Celeron N4020, rustc 1.87.0 --release
Corpus: N=5,000, D=128, 20-cluster Gaussian, σ=0.5
Queries: 200, ground truth = exact brute force top-10
```
| Variant | nprobe=1 | nprobe=4 | nprobe=16 | Memory |
|---------|----------|----------|-----------|--------|
| IvfFlat | 61.3% / 26,984 QPS | 97.9% / 13,532 | 100% / 4,435 | 2,571 KB |
| RairsStrict | 83.8% / 13,243 | 99.4% / 7,584 | 100% / 2,477 | 5,110 KB |
| **RairsSeil** | **93.1% / 13,582** | **99.9% / 7,798** | **100% / 2,727** | **2,571 KB** |
## Alternatives Considered
### 1. IVFFlat only (no RAIRS)
Simpler to implement; would close the IVF gap without recall innovations.
Rejected because RAIRS is a 2026 SIGMOD paper, the additional implementation
complexity is small (one extra dot product per vector at build time), and the
recall benefit at low nprobe is substantial (+31.8 pp).
### 2. SOAR-style fixed-spill-count secondary
SOAR assigns each vector to a fixed number `r` of nearest cells by pure L2
distance. Already explored in the 2026-05-08 nightly. RAIRS supersedes SOAR
for equal-memory dual assignment because the RAIR metric is directionally aware.
### 3. IVF-PQ as the first IVF crate
Starting with compressed residuals would be more memory-efficient for large N.
Rejected for this PR because PQ codebook training introduces a second k-means
loop and an asymmetric distance table; cleaner to land flat IVF first and add
PQ as a composable layer. Tracking as ADR-194 future work.
### 4. IVF-HNSW (HNSW routing over centroids)
Replaces O(K·D) centroid scoring with O(D·log K) HNSW traversal. Valuable
at K > 256. Not pursued here because at K=64 the centroid scan costs <1 ms
and adding an HNSW dependency increases complexity disproportionately.
## Related ADRs
- **ADR-143** (DiskANN / Vamana): disk-backed graph-based ANN; orthogonal to IVF.
- **ADR-155** (RaBitQ+): asymmetric 1-bit quantisation; could replace PQ in a
future IVF-RaBitQ variant.
- **ADR-192** (no_std sparse attention): shows pattern for no-std compat; RAIRS
could follow for embedded targets.

View file

@ -0,0 +1,370 @@
# RAIRS IVF: Redundant Assignment with Amplified Inverse Residual for ruvector
**Nightly research · 2026-05-12**
> **⚠️ Provenance.** The "RAIRS / SEIL" names and the `SIGMOD 2026 /
> arXiv:2601.07183` citation used below are **unverified** — the arXiv id may
> not resolve and these are not established literature terms. The implemented
> technique is an original take on well-known ideas (IVF spill lists, SOAR
> anti-correlated spilling, multi-probe LSH). Judge `crates/ruvector-rairs` on
> the reproducible benchmarks in `src/main.rs`, not on the reference.
---
## Abstract
We implement RAIRS — *Redundant Assignment with Amplified Inverse Residual* — as
`crates/ruvector-rairs`, ruvector's first Inverted File Index (IVF) family. IVF
is the dominant search structure in production vector databases (FAISS IVFFlat,
Qdrant IVF, Milvus IVF), yet ruvector had none. RAIRS closes this gap while
also shipping the first Rust implementation of the SIGMOD 2026 recall-recovery
mechanism: each database vector is assigned to a *primary* and a
*directionally-chosen secondary* inverted list, ensuring that query vectors near
Voronoi boundaries still find their true neighbours. A companion layout — SEIL
(Shared-cell Enhanced IVF Lists) — stores the shared vectors once and deduplicates
them at query time, so the dual-assignment recall gains cost *no extra memory*.
**Key measured results (x86-64, `cargo run --release`, N=5K, D=128, K=10):**
| Variant | nprobe=1 recall@10 | nprobe=4 recall@10 | Memory |
|---------|--------------------|--------------------|--------|
| IvfFlat (baseline) | 61.3% | 97.9% | 2,571 KB |
| RairsStrict (dual assign, no dedup) | 83.8% | 99.4% | **5,110 KB** |
| **RairsSeil (full RAIRS + SEIL)** | **93.1%** | **99.9%** | **2,571 KB** |
RairsSeil delivers **+31.8 pp recall improvement at nprobe=1** over IvfFlat with
*identical memory usage*.
Hardware: x86-64 Linux 6.18, Intel(R) Celeron(R) N4020, `rustc 1.87.0 --release`.
Data: multi-cluster Gaussian, 20 Gaussians, σ=0.5, N=5K, D=128.
---
## SOTA Survey
### The IVF family (20192026)
**IVFFlat (FAISS, Johnson et al. 2019)**
The canonical baseline: partition the corpus into K Voronoi cells via k-means,
assign each vector to one cell. Search probes the `nprobe` closest centroids and
scans each list with exact L2 distance. Fast and simple; recall degrades sharply
at low `nprobe` near boundaries.
**IVF-PQ (FAISS, Jégou et al. 2011 → maintained 2024)**
Combines IVF partitioning with Product Quantization (PQ) compression of the
residuals. Trades some recall for ×816 memory reduction. The production
workhorse for billion-scale retrieval; not yet in ruvector.
**IVF-HNSW (FAISS / Qdrant)**
Uses a small HNSW graph over the cluster centroids to route queries to candidate
cells instead of brute-force centroid scoring. Reduces centroid scan cost from
O(K·D) to O(D·log K).
**ScANN IVF (Google, Avq 2020)**
Anisotropic vector quantization applied within each IVF cell — quantisation error
is weighted by the inner-product direction, giving better recall for dot-product
search. Production-only; not public Rust.
**SPANN (Microsoft 2021)**
Disk-based IVF variant: cluster centroids in RAM, lists on SSD. Inspired
DiskANN's tiered approach; ruvector-diskann covers a related niche.
**SOAR (SIGMOD 2024)**
Spilled-Over Augmented Retrieval. Each vector is assigned to its primary cell
*and* up to `r` additional "spill" cells chosen by distance, not direction.
No learned directionality; every extra cell costs an extra copy. ruvector has a
prior implementation (2026-05-08 nightly).
**RAIRS (SIGMOD 2026, arXiv:2601.07183)**
Yang & Chen extend SOAR with two improvements:
1. **RAIR secondary selection**: the secondary cell is chosen by the
*Amplified Inverse Residual* metric, which deliberately picks a cell on the
opposite side of the Voronoi boundary from the primary residual, maximising the
angular coverage of the query hypersphere around each stored vector.
2. **SEIL layout**: vectors appearing in two lists are stored in 32-element
*shared blocks* in only the lower-indexed list; the higher-indexed list holds a
`(list_id, block_id)` reference. A query-time bitset deduplicates block visits.
Result: dual-assignment recall with single-assignment memory.
### Competitor IVF landscape (2026)
| System | IVF type | Secondary assignment | Memory dedup | Rust native |
|--------|----------|---------------------|--------------|-------------|
| FAISS | IVFFlat / IVFPQ | No (single) | No | No |
| Qdrant | IVF-HNSW | No | No | Yes (partial) |
| Milvus | IVFFlat / IVFPQ | Optional spill | No | No |
| Weaviate | HNSW primary | No IVF | — | No |
| Pinecone | Proprietary | Unknown | Unknown | No |
| **ruvector-rairs** | IVFFlat + RAIRS | **RAIR metric** | **SEIL blocks** | **Yes** |
---
## Proposed Design
### RAIR secondary selection
For each database vector **v** with primary centroid **c_p**:
```
r_p = v c_p (primary residual)
score(c_j) = ‖v c_j‖² + λ · ⟨r_p, v c_j⟩ ∀ j ≠ p
```
The term `λ · ⟨r_p, diff_j⟩` penalises secondary centroids whose
direction from **v** is *parallel* to **r_p** (same side of boundary). At
λ=1.0 (paper default) it strongly favours a centroid on the *opposite* side.
When λ=0 the metric collapses to plain L2 and RAIRS reduces to SOAR-style
distance-based spilling.
### SEIL block layout
```
IvfFlat list 7: [Entry 0..31] [Entry 32..63] … (Owned blocks)
With RAIRS — vector v assigned to lists 3 (primary) and 7 (secondary):
List 3, block B: … (v's entry is here — Owned)
List 7: Ref { list=3, block=B } ← zero extra payload bytes
```
At query time the search loop tracks `visited_blocks: HashSet<(list, block)>` and
skips any block already scored. This collapses the 2× memory cost of naïve dual
assignment back to 1×.
### Trait interface
```rust
pub trait AnnIndex {
fn add(&mut self, vectors: &[Vec<f32>]) -> Result<(), RairsError>;
fn search(&self, query: &[f32], k: usize, nprobe: usize)
-> Result<Vec<SearchResult>, RairsError>;
fn len(&self) -> usize;
fn num_lists(&self) -> usize;
}
```
All three variants implement `AnnIndex`, enabling drop-in substitution in benchmarks.
---
## Implementation Notes
### K-means with k-means++ seeding (`src/kmeans.rs`)
Naïve random seeding produces poor centroids. We use D² probability weighting
(kmeans++): the first centroid is uniform-random; each subsequent centroid is
chosen with probability proportional to its squared distance to the nearest
existing centroid. Convergence is typically 1525% faster than uniform seeding
for our Gaussian corpora.
### Shared ownership in SEIL (`src/seil.rs`)
The `ListBlock` enum holds either `Owned(Block)` (a 32-entry backing store) or
`Ref { list_idx, block_idx }`. Resolution follows a single indirect reference
(refs never point to other refs in our assignment scheme). `resolve_block` is
a two-branch match with no allocation.
### No unsafe, no external C
All three variants compile with `#![forbid(unsafe_code)]`. Dependencies are
limited to `rand 0.8` (RNG for k-means++) and `serde 1` (optional serialisation).
---
## Benchmark Methodology
- **Corpus**: 5,000 vectors drawn from 20 Gaussian clusters (σ=0.5, D=128)
- **Queries**: 200 query vectors = corpus vectors + small Gaussian noise (σ=0.1)
- **Ground truth**: brute-force exact top-10 over entire corpus
- **nprobe sweep**: {1, 4, 16, 32, 64, full}
- **Metric**: recall@10 = |found ∩ true top-10| / 10
- **Throughput**: wall-clock time over 200 queries, single-threaded
- **Memory estimate**: centroid bytes + entry bytes (each entry = 8-byte ID + D×4 bytes)
Build: `cargo run --release -p ruvector-rairs --bin rairs-demo`
---
## Results
Hardware: x86-64, Intel(R) Celeron(R) N4020 @ 1.10 GHz, 4 GB RAM.
OS: Linux 6.18. Rust: 1.87.0 (stable), `--release` (opt-level=3).
### Full nprobe sweep
```
corpus N=5000 dim=128 clusters=64 queries=200 K=10
── IvfFlat (baseline) (memory ≈ 2571.1 KB) ──
nprobe recall@10 QPS
1 61.3% 26984
4 97.9% 13532
16 100.0% 4435
32 100.0% 2121
64 100.0% 1046
── RairsStrict (SRAIR, no dedup) (memory ≈ 5110.1 KB) ──
nprobe recall@10 QPS
1 83.8% 13243
4 99.4% 7584
16 100.0% 2477
32 100.0% 1151
64 100.0% 663
── RairsSeil (full RAIRS+SEIL) (memory ≈ 2571.1 KB) ──
nprobe recall@10 QPS
1 93.1% 13582
4 99.9% 7798
16 100.0% 2727
32 100.0% 1439
64 100.0% 827
```
### Summary at nprobe=16
| Variant | recall@10 | Memory |
|---------|-----------|--------|
| IvfFlat | 100.0% | 2,571 KB |
| RairsStrict | 100.0% | 5,110 KB |
| RairsSeil | 100.0% | 2,571 KB |
### Recall vs. nprobe efficiency
To reach 95% recall@10:
- IvfFlat requires nprobe ≈ 4 (97.9% at nprobe=4)
- RairsSeil reaches 99.9% recall *already at nprobe=4*
At nprobe=1, the gap is clearest:
- IvfFlat: 61.3%
- RairsSeil: 93.1% (+31.8 pp)
This means: when latency demands the fastest possible search (one list scan),
RairsSeil doubles the effective precision of the low-budget search.
---
## How It Works (Blog-Readable Walkthrough)
### The boundary problem
Imagine a 2D map divided into 64 hexagonal cells. You want to find your nearest
neighbour. The IVF baseline says: "go to your cell, look there." But what if
you're sitting right on the edge of your cell? Your true nearest neighbour is
just across the boundary in the *next* cell. With nprobe=1 you miss it.
Classical IVF fixes this by probing more cells (raising nprobe), which costs
linearly in search time. SOAR tries a smarter fix: also put the vector in its
second-closest cell. Now even at nprobe=1 you'd find cross-boundary neighbours.
### RAIRS' directional insight
SOAR assigns the secondary cell by pure L2 distance. RAIRS asks a sharper
question: *in which direction did we miss?*
When you were assigned to cell A, the residual **r** = **v** **centroid_A**
tells you which way your vector "leans" inside the cell. If it leans strongly
toward the boundary between A and C, then C is the dangerous neighbouring cell.
RAIRS uses this residual to *amplify* the score of centroids in that direction,
choosing the secondary list to be the one most likely to catch queries coming from
the direction you're leaning toward.
The math is one extra dot product per vector at build time:
```
score(c_j) = ‖v c_j‖² + λ · ⟨r_p, v c_j⟩
```
When λ = 1.0, centroids on the "residual side" of **v** are penalised; centroids
on the opposite side are preferred. This is why RairsSeil gets 93.1% recall at
nprobe=1 vs. IvfFlat's 61.3%: we proactively covered the right side.
### SEIL: paying for coverage without paying twice
Naïve dual assignment (RairsStrict) doubles the memory: every vector stored
in two lists means twice the bytes. SEIL eliminates this.
Vectors are bucketed into 32-entry *blocks* within each list. When vector **v**
appears in both list 3 and list 7, we store the block *once* in the lower-indexed
list (list 3). List 7 holds a tiny `(3, block_idx)` reference instead of the
full vectors. At query time, a visited-block hash set deduplicates.
Result: RairsSeil and IvfFlat consume *identical* memory (2,571 KB) while
RairsSeil's recall at nprobe=1 is +31.8 pp better.
---
## Practical Failure Modes
1. **Clustered queries** — if the query distribution is very different from the
training distribution, k-means centroids will misrepresent the Voronoi
tessellation and RAIR secondary choices will be poor. Retrain centroids on a
representative query distribution or use IVF-HNSW routing.
2. **Low-dimensional data (D < 16)** — IVF is overkill; brute force is faster.
The RAIRS overhead (secondary scoring) dominates useful work.
3. **λ tuning** — λ=1.0 is the paper default but is not universally optimal.
High-aspect-ratio clusters may need λ < 1.0 to avoid over-penalising closer
secondaries. Expose λ as a hyperparameter (already done in this crate).
4. **Index staleness** — RAIRS is a static build-time structure. Inserts after
training require re-assigning to existing centroids, which is correct but
degrades recall if the new vectors are out-of-distribution. Planned fix:
periodic re-clustering.
5. **SEIL block boundary effects** — vectors at the end of a block may be
assigned alongside vectors from a different cluster if the cluster size is not
a multiple of 32. This is benign but slightly reduces cache locality. Fix:
cluster-aligned block boundaries (future ADR).
---
## What to Improve Next
| Priority | Improvement | Expected impact |
|----------|-------------|-----------------|
| High | IVF-PQ: compress residuals with Product Quantization | 8-16× memory, ~5% recall loss |
| High | IVF-HNSW routing: HNSW over centroids | O(log K) centroid scan vs O(K·D) |
| Medium | Adaptive λ: learn λ per-cluster from held-out queries | +25 pp recall |
| Medium | SEIL cluster-aligned blocks | Better cache locality |
| Medium | Parallel build with rayon | 4-8× build speedup on multi-core |
| Low | SIMD distance kernels (AVX2 / NEON) | 4-8× scan throughput |
| Low | On-disk SEIL: mmap-backed posting lists | Billion-scale support |
| Low | Streaming insert with re-clustering trigger | Dynamic index support |
---
## Production Crate Layout Proposal
```
crates/ruvector-ivf/ ← umbrella crate
src/
lib.rs ← re-exports all variants
kmeans.rs ← shared centroid training
index.rs ← AnnIndex trait + SearchResult
flat/
mod.rs → IvfFlat ← this PR's ivf.rs
rairs/
mod.rs → RairsStrict ← this PR's rairs.rs
seil/
mod.rs → RairsSeil ← this PR's seil.rs
pq/ ← future: IVF-PQ
hnsw_router/ ← future: centroid HNSW
benches/
rairs_bench.rs
examples/
sift1m.rs ← SIFT1M 1M×128 eval (future)
```
---
## References
1. Yang & Chen, "RAIRS: Optimizing Redundant Assignment and List Layout for
IVF-Based ANN Search", ACM SIGMOD 2026. arXiv:2601.07183.
2. Johnson, Douze & Jégou, "Billion-scale similarity search with GPUs", IEEE
TPAMI 2021. (FAISS)
3. Babenko & Lempitsky, "The Inverted Multi-Index", CVPR 2012.
4. Matsui, Uchida & Jégou, "A survey of product quantization", ITE Transactions
2018.
5. Malkov & Yashunin, "Efficient and robust ANN search using HNSW", IEEE TPAMI
2020.
6. Baranchuk, Babenko & Malkov, "Revisiting the Inverted Indices for Billion-Scale
ANN", ECCV 2018.