mirror of https://github.com/ruvnet/RuVector.git synced 2026-07-09 17:28:42 +00:00

History

rUv feb4ee2753 perf(hnsw): 4-acc AVX-512 + parallel-insert — +9.7% build throughput (query QPS unchanged: memory-bound at 1M scale) (#619 ) * perf(hnsw): 4-accumulator AVX-512 kernels + SIMD wiring into search hot path - Replace single-accumulator AVX-512 distance kernels with 4-accumulator versions in simd_intrinsics.rs (euclidean, cosine, dot, manhattan). On Zen 5 with 4-cycle FMA latency, single-accumulator was latency-bound (96 cycles for 384-dim); 4-accumulator hides this to ~24 cycles. - Wire HNSW search hot path in DistanceFn::eval to call simd_intrinsics directly (inline, no Result wrapping, no simsimd FFI overhead). - Enable parallel batch insert via hnsw_rs::parallel_insert_slice (rayon). Measured: 6-10% QPS improvement on 128-dim/1K-vector bench; larger gains expected on 1M-vector workloads where distance computation dominates. 228 unit tests pass. Co-Authored-By: claude-flow <ruv@ruv.net> Claude-Session: https://claude.ai/code/session_019rVRYrRDKyxYK18kuVrDSf * perf(hnsw): gate parallel_insert_slice behind 10K-vector threshold Rayon-based parallel insert (hnsw_rs::parallel_insert_slice) degrades graph connectivity for small batches (<10K vectors) because worker threads can't see each other's in-flight insertions, reducing optimal neighbor links. Add PARALLEL_THRESHOLD=10_000: use parallel insert only when the batch is large enough that the graph quality converges. Below threshold: sequential insert_data (same as before this PR). Above threshold: parallel_insert_slice for build-time speedup. 228 unit tests pass. Co-Authored-By: claude-flow <ruv@ruv.net> Claude-Session: https://claude.ai/code/session_019rVRYrRDKyxYK18kuVrDSf * bench(sift1m): add SIFT-1M fvecs benchmark + hnswlib comparison tooling Adds two benchmark binaries driven by the real TEXMEX SIFT-1M dataset: * crates/ruvector-sota-bench/src/bin/sift1m_bench.rs Reads sift_base.fvecs / sift_query.fvecs / sift_groundtruth.ivecs directly (no HDF5 required). Sweeps ef_search to produce a recall@10 vs QPS table used for before/after PR #619 comparison. * scripts/sift1m_hnswlib_bench.mjs Same sweep via hnswlib-node (C++ HNSW) to measure the competitive gap. Cargo.toml: add simd-avx512 feature to sota-bench dependency so the full optimised kernel path is exercised. Measured on AMD Ryzen 9 9950X (Zen 5, AVX-512), M=16, efC=200, 1M vecs: Source Build ef=100 recall ef=100 QPS ef=200 recall ef=200 QPS before PR 849 s 0.9585 1,849 0.9713 1,058 after PR (#619) 774 s 0.9592 1,768 0.9722 1,024 hnswlib-node 322 s 0.9828 5,339 0.9957 2,897 Build speedup: +9.7 %. Query QPS at 1M-scale: within noise (memory- bandwidth bound, not compute-bound). Co-Authored-By: claude-flow <ruv@ruv.net> Claude-Session: https://claude.ai/code/session_019rVRYrRDKyxYK18kuVrDSf * style: cargo fmt for sift1m benchmark binary Co-Authored-By: claude-flow <ruv@ruv.net> Claude-Session: https://claude.ai/code/session_019rVRYrRDKyxYK18kuVrDSf --------- Co-authored-by: ruvnet <ruvnet@gmail.com>		2026-06-28 20:45:35 -04:00
..
benches	chore(workspace): clippy-clean every crate under -D warnings + fmt + repair pre-existing broken benches	2026-04-25 17:00:20 -04:00
docs	feat(postgres): Add HNSW index and embedding functions support (#62 )	2025-12-09 11:14:52 -05:00
examples	style: apply rustfmt across entire codebase	2026-01-28 17:00:26 +00:00
fuzz	feat(quality): ADR-144 monorepo quality analysis — Phase 1 critical fixes (#336 )	2026-04-06 21:19:13 -04:00
src	perf(hnsw): 4-acc AVX-512 + parallel-insert — +9.7% build throughput (query QPS unchanged: memory-bound at 1M scale) (#619 )	2026-06-28 20:45:35 -04:00
tests	fix(core): HNSW correctness fixes, k=0 guard, sorted results, cross-integration helpers (v2.2.3) (#502 )	2026-05-23 03:37:35 -04:00
Cargo.toml	fix(security): RUSTSEC advisories + clippy hardening in RuVector (#504 )	2026-05-23 05:40:24 -04:00
README.md	docs: optimize 12 crate READMEs and add SONA learning loop diagram	2026-02-27 03:38:42 +00:00

README.md

Ruvector Core

The pure-Rust vector database engine behind RuVector -- HNSW indexing, quantization, and SIMD acceleration in a single crate.

ruvector-core is the foundational library that powers the entire RuVector ecosystem. It gives you a production-grade vector database you can embed directly into any Rust application: insert vectors, search them in under a millisecond, filter by metadata, and compress storage up to 32x -- all without external services. If you need vector search as a library instead of a server, this is the crate.

	ruvector-core	Typical Vector Database
Deployment	Embed as a Rust dependency -- no server, no network calls	Run a separate service, manage connections
Query latency	<0.5 ms p50 at 1M vectors with HNSW	~1-5 ms depending on network and index
Memory compression	Scalar (4x), Product (8-32x), Binary (32x) quantization built in	Often requires paid tiers or external tools
SIMD acceleration	SimSIMD hardware-optimized distance calculations, automatic	Manual tuning or not available
Search modes	Dense vectors, sparse BM25, hybrid, MMR diversity, filtered -- all in one API	Typically dense-only; hybrid and filtering are add-ons
Storage	Zero-copy mmap with `redb` -- instant loading, no deserialization	Load time scales with dataset size
Concurrency	Lock-free indexing with parallel batch processing via Rayon	Varies; many require single-writer locks
Dependencies	Minimal -- pure Rust, compiles anywhere `rustc` runs	Often depends on C/C++ libraries (BLAS, LAPACK)
Cost	Free forever -- open source (MIT)	Per-vector or per-query pricing on managed tiers

Installation

Add ruvector-core to your Cargo.toml:

[dependencies]
ruvector-core = "0.1.0"

Feature Flags

[dependencies]
ruvector-core = { version = "0.1.0", features = ["simd", "uuid-support"] }

Available features:

simd (default): Enable SIMD-optimized distance calculations
uuid-support (default): Enable UUID generation for vector IDs

Key Features

Feature	What It Does	Why It Matters
HNSW Indexing	Hierarchical Navigable Small World graphs for O(log n) approximate nearest neighbor search	Sub-millisecond queries at million-vector scale
Multiple Distance Metrics	Euclidean, Cosine, Dot Product, Manhattan	Match the metric to your embedding model without conversion
Scalar Quantization	Compress vectors to 8-bit integers (4x reduction)	Cut memory by 75% with 98% recall preserved
Product Quantization	Split vectors into subspaces with codebooks (8-32x reduction)	Store millions of vectors on a single machine
Binary Quantization	1-bit representation (32x reduction)	Ultra-fast screening pass for massive datasets
SIMD Distance	Hardware-accelerated distance via SimSIMD	Up to 80K QPS on 8 cores without code changes
Zero-Copy I/O	Memory-mapped storage loads instantly	No deserialization step -- open a file and search immediately
Hybrid Search	Combine dense vector similarity with sparse BM25 text scoring	One query handles both semantic and keyword matching
Metadata Filtering	Apply key-value filters during search	No post-filtering needed -- results are already filtered
MMR Diversification	Maximal Marginal Relevance re-ranking	Avoid redundant results when top-K are too similar
Conformal Prediction	Uncertainty quantification on search results	Know when to trust (or distrust) a match
Lock-Free Indexing	Concurrent reads and writes without blocking	High-throughput ingestion while serving queries
Batch Processing	Parallel insert and search via Rayon	Saturate all cores for bulk operations

Quick Start

Basic Usage

use ruvector_core::{VectorDB, DbOptions, VectorEntry, SearchQuery, DistanceMetric};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a new vector database
    let mut options = DbOptions::default();
    options.dimensions = 384;  // Vector dimensions
    options.storage_path = "./my_vectors.db".to_string();
    options.distance_metric = DistanceMetric::Cosine;

    let db = VectorDB::new(options)?;

    // Insert vectors
    db.insert(VectorEntry {
        id: Some("doc1".to_string()),
        vector: vec![0.1, 0.2, 0.3, /* ... 384 dimensions */],
        metadata: None,
    })?;

    db.insert(VectorEntry {
        id: Some("doc2".to_string()),
        vector: vec![0.4, 0.5, 0.6, /* ... 384 dimensions */],
        metadata: None,
    })?;

    // Search for similar vectors
    let results = db.search(SearchQuery {
        vector: vec![0.1, 0.2, 0.3, /* ... 384 dimensions */],
        k: 10,  // Return top 10 results
        filter: None,
        ef_search: None,
    })?;

    for result in results {
        println!("ID: {}, Score: {}", result.id, result.score);
    }

    Ok(())
}

Batch Operations

use ruvector_core::{VectorDB, VectorEntry};

// Insert multiple vectors efficiently
let entries = vec![
    VectorEntry {
        id: Some("doc1".to_string()),
        vector: vec![0.1, 0.2, 0.3],
        metadata: None,
    },
    VectorEntry {
        id: Some("doc2".to_string()),
        vector: vec![0.4, 0.5, 0.6],
        metadata: None,
    },
];

let ids = db.insert_batch(entries)?;
println!("Inserted {} vectors", ids.len());

With Metadata Filtering

use std::collections::HashMap;
use serde_json::json;

// Insert with metadata
db.insert(VectorEntry {
    id: Some("product1".to_string()),
    vector: vec![0.1, 0.2, 0.3],
    metadata: Some(HashMap::from([
        ("category".to_string(), json!("electronics")),
        ("price".to_string(), json!(299.99)),
    ])),
})?;

// Search with metadata filter
let results = db.search(SearchQuery {
    vector: vec![0.1, 0.2, 0.3],
    k: 10,
    filter: Some(HashMap::from([
        ("category".to_string(), json!("electronics")),
    ])),
    ef_search: None,
})?;

HNSW Configuration

use ruvector_core::{DbOptions, HnswConfig, DistanceMetric};

let mut options = DbOptions::default();
options.dimensions = 384;
options.distance_metric = DistanceMetric::Cosine;

// Configure HNSW index parameters
options.hnsw_config = Some(HnswConfig {
    m: 32,                    // Connections per layer (16-64 typical)
    ef_construction: 200,     // Build-time accuracy (100-500 typical)
    ef_search: 100,          // Search-time accuracy (50-200 typical)
    max_elements: 10_000_000, // Maximum vectors
});

let db = VectorDB::new(options)?;

Quantization

use ruvector_core::{DbOptions, QuantizationConfig};

let mut options = DbOptions::default();
options.dimensions = 384;

// Enable scalar quantization (4x compression)
options.quantization = Some(QuantizationConfig::Scalar);

// Or product quantization (8-32x compression)
options.quantization = Some(QuantizationConfig::Product {
    subspaces: 8,  // Number of subspaces
    k: 256,        // Codebook size
});

let db = VectorDB::new(options)?;

API Overview

Core Types

// Main database interface
pub struct VectorDB { /* ... */ }

// Vector entry with optional ID and metadata
pub struct VectorEntry {
    pub id: Option<VectorId>,
    pub vector: Vec<f32>,
    pub metadata: Option<HashMap<String, serde_json::Value>>,
}

// Search query parameters
pub struct SearchQuery {
    pub vector: Vec<f32>,
    pub k: usize,
    pub filter: Option<HashMap<String, serde_json::Value>>,
    pub ef_search: Option<usize>,
}

// Search result with score
pub struct SearchResult {
    pub id: VectorId,
    pub score: f32,
    pub vector: Option<Vec<f32>>,
    pub metadata: Option<HashMap<String, serde_json::Value>>,
}

Main Operations

impl VectorDB {
    // Create new database with options
    pub fn new(options: DbOptions) -> Result<Self>;

    // Create with just dimensions (uses defaults)
    pub fn with_dimensions(dimensions: usize) -> Result<Self>;

    // Insert single vector
    pub fn insert(&self, entry: VectorEntry) -> Result<VectorId>;

    // Insert multiple vectors
    pub fn insert_batch(&self, entries: Vec<VectorEntry>) -> Result<Vec<VectorId>>;

    // Search for similar vectors
    pub fn search(&self, query: SearchQuery) -> Result<Vec<SearchResult>>;

    // Delete vector by ID
    pub fn delete(&self, id: &str) -> Result<bool>;

    // Get vector by ID
    pub fn get(&self, id: &str) -> Result<Option<VectorEntry>>;

    // Get total count
    pub fn len(&self) -> Result<usize>;

    // Check if empty
    pub fn is_empty(&self) -> Result<bool>;
}

Distance Metrics

pub enum DistanceMetric {
    Euclidean,   // L2 distance - default for embeddings
    Cosine,      // Cosine similarity (1 - similarity)
    DotProduct,  // Negative dot product (for maximization)
    Manhattan,   // L1 distance
}

Advanced Features

// Hybrid search (dense + sparse)
use ruvector_core::{HybridSearch, HybridConfig};

let hybrid = HybridSearch::new(HybridConfig {
    alpha: 0.7,  // Balance between dense (0.7) and sparse (0.3)
    ..Default::default()
});

// Filtered search with expressions
use ruvector_core::{FilteredSearch, FilterExpression};

let filtered = FilteredSearch::new(db);
let expr = FilterExpression::And(vec![
    FilterExpression::Equals("category".to_string(), json!("books")),
    FilterExpression::GreaterThan("price".to_string(), json!(10.0)),
]);

// MMR diversification
use ruvector_core::{MMRSearch, MMRConfig};

let mmr = MMRSearch::new(MMRConfig {
    lambda: 0.5,  // Balance relevance (0.5) and diversity (0.5)
    ..Default::default()
});

Performance

Latency (Single Query)

Operation           Flat Index    HNSW Index
---------------------------------------------
Search (1K vecs)    ~0.1ms       ~0.2ms
Search (100K vecs)  ~10ms        ~0.5ms
Search (1M vecs)    ~100ms       <1ms
Insert              ~0.1ms       ~1ms
Batch (1000)        ~50ms        ~500ms

Memory Usage (1M Vectors, 384 Dimensions)

Configuration              Memory      Recall
---------------------------------------------
Full Precision (f32)       ~1.5GB      100%
Scalar Quantization        ~400MB      98%
Product Quantization       ~200MB      95%
Binary Quantization        ~50MB       85%

Throughput (Queries Per Second)

Configuration              QPS         Latency (p50)
-----------------------------------------------------
Single Thread             ~2,000      ~0.5ms
Multi-Thread (8 cores)    ~50,000     <0.5ms
With SIMD                 ~80,000     <0.3ms
With Quantization         ~100,000    <0.2ms

Configuration Guide

For Maximum Accuracy

let options = DbOptions {
    dimensions: 384,
    distance_metric: DistanceMetric::Cosine,
    hnsw_config: Some(HnswConfig {
        m: 64,
        ef_construction: 500,
        ef_search: 200,
        max_elements: 10_000_000,
    }),
    quantization: None,  // Full precision
    ..Default::default()
};

For Maximum Speed

let options = DbOptions {
    dimensions: 384,
    distance_metric: DistanceMetric::DotProduct,
    hnsw_config: Some(HnswConfig {
        m: 16,
        ef_construction: 100,
        ef_search: 50,
        max_elements: 10_000_000,
    }),
    quantization: Some(QuantizationConfig::Binary),
    ..Default::default()
};

For Balanced Performance

let options = DbOptions::default(); // Recommended defaults

Building and Testing

Build

# Build with default features
cargo build --release

# Build without SIMD
cargo build --release --no-default-features --features uuid-support

# Build for specific target with optimizations
RUSTFLAGS="-C target-cpu=native" cargo build --release

Testing

# Run all tests
cargo test

# Run with specific features
cargo test --features simd

# Run with logging
RUST_LOG=debug cargo test

Benchmarks

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench --bench hnsw_search

# Run with features
cargo bench --features simd

Available benchmarks:

distance_metrics - SIMD-optimized distance calculations
hnsw_search - HNSW index search performance
quantization_bench - Quantization techniques
batch_operations - Batch insert/search operations
comprehensive_bench - Full system benchmarks

ruvector-core is the foundation for platform-specific bindings:

ruvector-node - Node.js bindings via NAPI-RS
ruvector-wasm - WebAssembly bindings for browsers
ruvector-gnn - Graph Neural Network layer for learned search
ruvector-cli - Command-line interface
ruvector-bench - Performance benchmarks

Documentation

Main README - Complete project overview
Getting Started Guide - Quick start tutorial
Rust API Reference - Detailed API documentation
Advanced Features Guide - Quantization, indexing, tuning
Performance Tuning - Optimization strategies
API Documentation - Full API reference on docs.rs

Acknowledgments

Built with state-of-the-art algorithms and libraries:

hnsw_rs - HNSW implementation
simsimd - SIMD distance calculations
redb - Embedded database
rayon - Data parallelism
memmap2 - Memory-mapped files

License

MIT License - see LICENSE for details.

Part of RuVector - Built by rUv

Documentation | Crates.io | GitHub