mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-24 05:43:58 +00:00
* feat(ruvector-rabitq-wasm): WASM bindings for RaBitQ via wasm-bindgen
Closes the WASM gap from `docs/research/rabitq-integration/` Tier 2
("WASM / edge: 32× compression makes on-device RAG feasible") and
ADR-157 ("VectorKernel WASM kernel as a Phase 2 goal"). Adds a
`ruvector-rabitq-wasm` sibling crate that exposes `RabitqIndex` to
JavaScript/TypeScript callers (browsers, Cloudflare Workers, Deno,
Bun) via wasm-bindgen.
```js
import init, { RabitqIndex } from "ruvector-rabitq";
await init();
const dim = 768;
const n = 10_000;
const vectors = new Float32Array(n * dim); // populate
const idx = RabitqIndex.build(vectors, dim, 42, 20);
const query = new Float32Array(dim);
const results = idx.search(query, 10); // [{id, distance}, ...]
```
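The 32× figure quoted above follows from storing one bit per dimension in place of a 32-bit float. A minimal sketch of that arithmetic, using the `n` and `dim` from the example; the function names are hypothetical, and any per-vector side data a real index keeps alongside the codes is ignored here:

```rust
/// Bytes needed to store `n` vectors of `dim` f32 components.
fn raw_bytes(n: usize, dim: usize) -> usize {
    n * dim * std::mem::size_of::<f32>()
}

/// Bytes needed for 1-bit-per-dimension codes, rounded up to whole bytes
/// per vector. Assumption: no per-vector metadata is counted.
fn one_bit_code_bytes(n: usize, dim: usize) -> usize {
    n * dim.div_ceil(8)
}
```

For `n = 10_000` and `dim = 768` this gives 30,720,000 bytes of raw floats against 960,000 bytes of codes.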
## Surface
- `RabitqIndex.build(vectors: Float32Array, dim, seed, rerank_factor)`
- `idx.search(query: Float32Array, k) → SearchResult[]`
- `idx.len`, `idx.isEmpty`
- `version()` — crate version baked at build time
- `SearchResult { id: u32, distance: f32 }` — mirrors the Python SDK
(PR #381) shape so callers porting code between languages get
identical structures.
## Native compatibility tweak
`ruvector-rabitq` had one rayon call site in
`from_vectors_parallel_with_rotation`. WASM is single-threaded, so that
path is now gated on `cfg(not(target_arch = "wasm32"))` with a sequential
`.into_iter()` fallback for wasm. Output is bit-identical because the
rotation matrix is deterministic (ADR-154); parallel ordering doesn't
affect bytes.
`rayon` is now `[target.'cfg(not(target_arch = "wasm32"))'.dependencies]`
so the wasm build doesn't pull it in. Native build behavior unchanged
(39 / 39 lib tests still pass).
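The gating pattern described above can be sketched as two `cfg`-selected definitions of the same function. This is an illustration only, not the crate's code: scoped std threads stand in for rayon so the block has no external dependencies, and `row_norm_sq`, `norms`, and the fixed worker count are hypothetical.

```rust
/// Squared norm of one row; stands in for the per-vector work of an index build.
fn row_norm_sq(row: &[f32]) -> f32 {
    row.iter().map(|x| x * x).sum()
}

/// Native targets: split the rows across scoped std threads.
#[cfg(not(target_arch = "wasm32"))]
fn norms(rows: &[Vec<f32>]) -> Vec<f32> {
    let workers = 4; // hypothetical fixed worker count
    let chunk = rows.len().div_ceil(workers).max(1);
    std::thread::scope(|s| {
        let handles: Vec<_> = rows
            .chunks(chunk)
            .map(|part| {
                s.spawn(move || part.iter().map(|r| row_norm_sq(r)).collect::<Vec<f32>>())
            })
            .collect();
        // Joining in spawn order keeps the output in input order, so the
        // result does not depend on how the threads were scheduled.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect::<Vec<f32>>()
    })
}

/// wasm32: the same computation, run sequentially.
#[cfg(target_arch = "wasm32")]
fn norms(rows: &[Vec<f32>]) -> Vec<f32> {
    rows.iter().map(|r| row_norm_sq(r)).collect()
}
```

Because each row is processed independently and results are gathered in input order, both variants return the same values in the same order.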
## Crate layout
  crates/ruvector-rabitq-wasm/
    Cargo.toml   cdylib + rlib, wasm-bindgen 0.2, abi-3-friendly
    src/lib.rs   ~150 LoC of bindings; tests gated to wasm32 via
                 wasm_bindgen_test (native test would panic in
                 wasm-bindgen 0.2.117's runtime stub).
## Testing strategy
Native tests of WASM bindings panic by design — `JsValue::from_str`
calls into a wasm-bindgen runtime stub that's `unimplemented!()` on
non-wasm32 targets (since 0.2.117). The right path is
`wasm-pack test --node` or `wasm-pack test --headless --chrome`,
which we'll wire into CI as a follow-up.
The numerical correctness is already covered by `ruvector-rabitq`'s
own test suite. This crate only adds the JS-facing surface.
## Verification (native)
  cargo build --workspace                  → 0 errors
  cargo build -p ruvector-rabitq-wasm      → clean
  cargo clippy -p ruvector-rabitq-wasm --all-targets --no-deps -- -D warnings → exit 0
  cargo test -p ruvector-rabitq            → 39 / 39 (unchanged)
  cargo fmt --all --check                  → clean
WASM target build (`wasm32-unknown-unknown`) requires `rustup target
add wasm32-unknown-unknown` — not exercised in this PR; will be
covered by a follow-up CI job.
Refs: docs/research/rabitq-integration/ Tier 2, ADR-157
("Optional Accelerator Plane"), PR #381 (Python SDK shape mirror).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(acorn): add ruvector-acorn crate — ACORN predicate-agnostic filtered HNSW
Implements the ACORN algorithm (Patel et al., SIGMOD 2024, arXiv:2403.04871)
as a standalone Rust crate. ACORN solves filtered vector search recall collapse
at low predicate selectivity by expanding ALL graph neighbors regardless of
predicate outcome, combined with a γ-augmented graph (γ·M neighbors/node).
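To make the "expand all neighbors regardless of predicate outcome" idea concrete, here is a small sketch, not the crate's implementation. It assumes a plain adjacency-list graph, omits the beam bound and the γ-augmented edges, and so visits every reachable node; `predicate_agnostic_search` and its parameters are hypothetical names.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Best-first traversal that expands every neighbor whether or not it
/// passes `pred`; only passing nodes are collected as results.
fn predicate_agnostic_search(
    data: &[Vec<f32>],
    neighbors: &[Vec<u32>],
    entry: u32,
    query: &[f32],
    k: usize,
    pred: impl Fn(u32) -> bool,
) -> Vec<u32> {
    let mut visited = vec![false; data.len()];
    // Min-heap keyed on distance bits: for non-negative f32 values the raw
    // bit pattern sorts in the same order as the number itself.
    let mut frontier = BinaryHeap::new();
    let mut hits: Vec<(u32, u32)> = Vec::new(); // (distance bits, id)

    visited[entry as usize] = true;
    frontier.push(Reverse((l2_sq(&data[entry as usize], query).to_bits(), entry)));

    while let Some(Reverse((d, id))) = frontier.pop() {
        if pred(id) {
            hits.push((d, id));
        }
        // A failing node is still expanded, so passing nodes that sit
        // behind it in the graph stay reachable.
        for &nb in &neighbors[id as usize] {
            if !visited[nb as usize] {
                visited[nb as usize] = true;
                frontier.push(Reverse((l2_sq(&data[nb as usize], query).to_bits(), nb)));
            }
        }
    }
    hits.sort_unstable();
    hits.truncate(k);
    hits.into_iter().map(|(_, id)| id).collect()
}
```

On a chain graph 0-1-2-3-4 where only nodes 0 and 4 pass the predicate, a traversal that refused to expand failing nodes would stop at node 1; this one still reaches node 4.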
Three index variants:
- FlatFilteredIndex: post-filter brute-force baseline
- AcornIndex1: ACORN with M=16 standard edges
- AcornIndexGamma: ACORN with 2M=32 edges (γ=2)
Measured (n=5K, D=128, release): ACORN-γ achieves 98.9% recall@10 at 1%
selectivity. cargo build --release and cargo test (12/12) both pass.
https://claude.ai/code/session_0173QrGBttNDWcVXXh4P17if
* perf(acorn): bounded beam, parallel build, flat data, unrolled L2²
Five linked optimizations to ruvector-acorn (≈50% smaller search
working set, ≈6× faster build on 8 cores, comparable or better
recall at every selectivity):
1. **Fix broken bounded-beam eviction in `acorn_search`.**
The previous implementation admitted that its `else` branch was
"wrong" (the comment literally said "this is wrong") and pushed
every neighbor into `candidates` unconditionally, growing the
frontier to O(n). Replace with a correct max-heap eviction:
when `|candidates| >= ef`, only admit a neighbor if it improves
on the farthest pending candidate, evicting that one. This gives
the documented O(ef) memory bound and stops wasted neighbor
expansions at the prune cutoff.
2. **Parallelize the O(n²·D) graph build with rayon.**
The forward pass (each node finds its M nearest predecessors) is
embarrassingly parallel — `into_par_iter` over rows. Back-edge
merge stays serial behind a `Mutex<Vec<u32>>` per node so the
merge is deterministic. ~6× faster on an 8-core box for 5K×128.
3. **Flat row-major vector storage.**
`data: Vec<Vec<f32>>` → `data: Vec<f32>` (length n·dim) with a
`row(i)` accessor. Eliminates the per-vector heap indirection,
keeps the L2² inner loop on contiguous memory the compiler can
vectorize, and trims index size by ~one allocation per row.
4. **`Vec<bool>` for `visited` instead of `HashSet<u32>`.**
O(1) lookup with no hashing or allocator pressure on the hot path.
5. **Hand-unroll L2² by 4.**
Four independent accumulators give LLVM enough room to issue
AVX2/SSE/NEON FMA chains on contemporary x86_64 / aarch64.
3-5× faster for D ≥ 64 in microbenchmarks.
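The eviction rule from item 1 can be sketched with a max-heap capped at `ef` entries. This is a minimal illustration under assumptions, not the code in `acorn_search`: `BoundedBeam` and its methods are hypothetical names, and distances are assumed to be non-negative.

```rust
use std::collections::BinaryHeap;

/// Candidate pool capped at `ef` entries, with the farthest candidate on top.
struct BoundedBeam {
    ef: usize,
    // (distance bits, id). For non-negative f32 values the raw bit pattern
    // sorts in the same order as the number, so this is a max-heap on distance.
    heap: BinaryHeap<(u32, u32)>,
}

impl BoundedBeam {
    fn new(ef: usize) -> Self {
        Self { ef, heap: BinaryHeap::with_capacity(ef) }
    }

    fn len(&self) -> usize {
        self.heap.len()
    }

    /// Offers a candidate; returns true if it was admitted.
    fn offer(&mut self, dist: f32, id: u32) -> bool {
        debug_assert!(dist >= 0.0);
        let key = (dist.to_bits(), id);
        if self.heap.len() < self.ef {
            self.heap.push(key);
            return true;
        }
        // Pool is full: admit only if the newcomer beats the current farthest,
        // and evict that farthest entry to keep the size at `ef`.
        let improves = self.heap.peek().map_or(false, |&worst| key < worst);
        if improves {
            self.heap.pop();
            self.heap.push(key);
        }
        improves
    }

    /// Ids ordered from nearest to farthest.
    fn into_sorted_ids(self) -> Vec<u32> {
        self.heap.into_sorted_vec().into_iter().map(|(_, id)| id).collect()
    }
}
```

With `ef = 2`, offering distances 5.0, 1.0, 9.0, 3.0 in that order admits the first two, rejects 9.0, and admits 3.0 by evicting 5.0, so the pool never holds more than two entries.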
Other:
- `exact_filtered_knn` parallelizes across data via rayon (recall
measurement only — needs `+ Sync` on the predicate).
- `benches/acorn_bench.rs` switches `SmallRng` → `StdRng` (the
workspace doesn't enable rand's `small_rng` feature so the bench
failed to compile).
- `cargo fmt` applied across the crate; CI's Rustfmt check was the
blocking failure on the original PR.
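Items 3 and 5 of the optimization list (flat row-major storage and the 4-way unrolled L2²) can be sketched together as follows. The type and function names are hypothetical, and whether the compiler actually emits SIMD for the inner loop depends on the target and build flags.

```rust
/// Row-major storage: vector `i` occupies `data[i * dim .. (i + 1) * dim]`.
struct FlatVectors {
    dim: usize,
    data: Vec<f32>,
}

impl FlatVectors {
    fn row(&self, i: usize) -> &[f32] {
        &self.data[i * self.dim..(i + 1) * self.dim]
    }
}

/// Squared L2 distance using four independent accumulators.
fn l2_sq_unrolled(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let mut acc = [0.0f32; 4];
    let mut ca = a.chunks_exact(4);
    let mut cb = b.chunks_exact(4);
    for (x, y) in ca.by_ref().zip(cb.by_ref()) {
        for lane in 0..4 {
            let d = x[lane] - y[lane];
            acc[lane] += d * d;
        }
    }
    // Tail: up to three elements left over when the length is not a multiple of 4.
    let tail: f32 = ca
        .remainder()
        .iter()
        .zip(cb.remainder())
        .map(|(p, q)| (p - q) * (p - q))
        .sum();
    (acc[0] + acc[1]) + (acc[2] + acc[3]) + tail
}
```

Note that summing in four lanes changes the order of floating-point additions, so results can differ in the last bits from a single-accumulator loop.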
Demo run on x86_64, n=5000, D=128, k=10:
  Build:   ACORN-γ ≈ 23 ms (was 1.8 s)
  Recall:  96.0% @ 1% selectivity (paper: ~98%)
           92.0% @ 5% selectivity
           79.7% @ 10% selectivity
           34.5% @ 50% selectivity (predicate dilutes top-k truth)
  QPS:     18 K @ 1% sel, 65 K @ 50% sel
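For reference, recall figures like those above are commonly computed as the share of the exact filtered top-k ids that the approximate search also returns. A minimal sketch under that assumption; `recall_fraction` is a hypothetical helper and is not the crate's `recall_at_k`, whose signature differs:

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k ids that also appear in the approximate result.
fn recall_fraction(approx: &[u32], exact: &[u32], k: usize) -> f64 {
    let truth: HashSet<u32> = exact.iter().take(k).copied().collect();
    if truth.is_empty() {
        // Convention assumed here: nothing to find counts as full recall.
        return 1.0;
    }
    let found = approx
        .iter()
        .take(k)
        .filter(|&&id| truth.contains(&id))
        .count();
    found as f64 / truth.len() as f64
}
```

When fewer than `k` vectors pass the predicate, the denominator is the number of exact matches that exist, not `k`.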
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(acorn): clippy clean-up — sort_by_key, is_empty, redundant closures
CI's `Clippy (deny warnings)` flagged three lints introduced by the
previous optimization commit:
- `unnecessary_sort_by` (graph.rs:158, 176) → use `sort_by_key`
- `len_without_is_empty` (graph.rs) → add `AcornGraph::is_empty`
and `if graph.is_empty()` in search.rs
- `redundant_closure` (main.rs:65, 159, 160) → pass the predicate
directly to `recall_at_k` instead of `|id| pred(id)`
No semantic change.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(wasm): publish @ruvector/rabitq-wasm and @ruvector/acorn-wasm to npm
Two new WASM packages (both v0.1.0, MIT OR Apache-2.0, scoped under
@ruvector). Mirrors the existing @ruvector/graph-wasm packaging
pattern so release tooling treats all three uniformly.
- ADR-161: @ruvector/rabitq-wasm — RaBitQ 1-bit quantized vector
index. 32× embedding compression with deterministic rotation.
Wraps the existing crates/ruvector-rabitq-wasm crate.
- ADR-162: @ruvector/acorn-wasm — ACORN predicate-agnostic filtered
HNSW. 96% recall@10 at 1% selectivity with arbitrary JS predicates.
Adds crates/ruvector-acorn-wasm (new), wrapping the ruvector-acorn
crate from PR #391.
Each crate ships with:
- `build.sh` that runs `wasm-pack build` for web / nodejs / bundler
targets, emitting into npm/packages/{rabitq,acorn}-wasm/{,node/,bundler/}.
- A canonical scoped package.json (kept under git as
package.scoped.json because wasm-pack regenerates package.json from
Cargo metadata on every build).
- A README.md with install + usage for browser, Node.js, and bundler
contexts.
- A `.gitignore` that excludes the wasm-pack-generated artifacts
(.wasm + .js + .d.ts) so only canonical source lives in the repo.
Build sanity:
- `cargo check -p ruvector-acorn-wasm -p ruvector-rabitq-wasm` clean
- `cargo clippy -- -D warnings` clean for both
- `wasm-pack build` succeeds for all three targets on both crates
Published:
- @ruvector/rabitq-wasm@0.1.0 — 40 KB tarball, 71 KB wasm
- @ruvector/acorn-wasm@0.1.0 — 49 KB tarball, ~85 KB wasm
Root README updated with both packages in the npm packages table.
Note: this branch also carries cherry-picks of PR #391's `ruvector-acorn`
crate (commits b90af9caa, 0b4eab11f, eb88176bd, f5913b783) and PR
#391's predecessor commit a674d6eba for `ruvector-rabitq-wasm` itself,
because both base crates are required to build the new WASM wrappers.
Co-Authored-By: claude-flow <ruv@ruv.net>
---------
Co-authored-by: ruvnet <ruvnet@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
307 lines
10 KiB
TOML
[workspace]
exclude = ["crates/micro-hnsw-wasm", "crates/ruvector-hyperbolic-hnsw", "crates/ruvector-hyperbolic-hnsw-wasm", "examples/ruvLLM/esp32", "examples/ruvLLM/esp32-flash", "examples/edge-net", "examples/data", "examples/ruvLLM", "examples/delta-behavior", "crates/rvf", "crates/rvf/*", "crates/rvf/*/*", "examples/rvf-desktop", "crates/mcp-brain-server",
    # ruvector-postgres is a pgrx-based PostgreSQL extension. Its build script
    # requires `$PGRX_HOME` set up via `cargo install cargo-pgrx --version 0.12.9`
    # and `cargo pgrx init`, which downloads and builds multiple Postgres
    # versions. Keep it out of default workspace builds so `cargo build --workspace`
    # works in stock environments. Build it explicitly with `cargo build -p ruvector-postgres`
    # after running pgrx init.
    "crates/ruvector-postgres"]
members = [
    "crates/ruvector-acorn",
    "crates/ruvector-acorn-wasm",
    "crates/ruvector-rabitq",
    "crates/ruvector-rabitq-wasm",
    "crates/ruvector-rulake",
    "crates/ruvector-core",
    "crates/ruvector-node",
    "crates/ruvector-wasm",
    "crates/ruvector-cli",
    "crates/ruvector-bench",
    "crates/ruvector-metrics",
    "crates/ruvector-filter",
    "crates/ruvector-router-core",
    "crates/ruvector-router-cli",
    "crates/ruvector-router-ffi",
    "crates/ruvector-router-wasm",
    "crates/ruvector-server",
    "crates/ruvector-snapshot",
    "crates/ruvector-tiny-dancer-core",
    "crates/ruvector-tiny-dancer-wasm",
    "crates/ruvector-tiny-dancer-node",
    "crates/ruvector-collections",
    "crates/ruvector-cluster",
    "crates/ruvector-raft",
    "crates/ruvector-replication",
    "crates/ruvector-graph",
    "crates/ruvector-graph-node",
    "crates/ruvector-graph-wasm",
    "crates/ruvector-gnn",
    "crates/ruvector-gnn-node",
    "crates/ruvector-gnn-wasm",
    "crates/ruvector-attention",
    "crates/ruvector-attention-wasm",
    "crates/ruvector-attention-node",
    "crates/ruvector-cnn",
    "crates/ruvector-cnn-wasm",
    "crates/ruvector-mincut",
    "crates/ruvector-mincut-wasm",
    "crates/ruvector-mincut-node",
    "crates/ruvector-mincut-gated-transformer",
    "crates/ruvector-mincut-gated-transformer-wasm",
    # NOTE: ruvector-postgres is in workspace `exclude` (pgrx env requirement).
    "crates/ruvector-nervous-system",
    "examples/refrag-pipeline",
    "examples/scipix",
    "examples/google-cloud",
    "examples/subpolynomial-time",
    "crates/sona",
    "crates/rvlite",
    "crates/ruvector-nervous-system",
    "crates/ruvector-dag",
    "crates/ruvector-dag-wasm",
    "crates/ruvector-nervous-system-wasm",
    "crates/ruvector-economy-wasm",
    "crates/ruvector-learning-wasm",
    "crates/ruvector-exotic-wasm",
    "crates/ruvector-attention-unified-wasm",
    "crates/ruvector-fpga-transformer",
    "crates/ruvector-fpga-transformer-wasm",
    "crates/ruvector-sparse-inference",
    "crates/ruvector-math",
    "crates/ruvector-math-wasm",
    "examples/benchmarks",
    "crates/cognitum-gate-kernel",
    "crates/cognitum-gate-tilezero",
    "crates/mcp-gate",
    "crates/mcp-brain",
    "crates/mcp-brain-server",
    "crates/ruQu",
    "crates/ruvllm",
    "crates/ruvllm-cli",
    "crates/ruvllm-wasm",
    "crates/prime-radiant",
    "crates/ruvector-delta-core",
    "crates/ruvector-delta-wasm",
    "crates/ruvector-delta-index",
    "crates/ruvector-delta-graph",
    "crates/ruvector-delta-consensus",
    "crates/ruvector-crv",
    "crates/ruvector-temporal-tensor",
    "crates/ruqu-core",
    "crates/ruqu-algorithms",
    "crates/ruqu-wasm",
    "crates/ruqu-exotic",
    "crates/ruvector-domain-expansion",
    "crates/ruvector-domain-expansion-wasm",
    "crates/ruvector-solver",
    "crates/ruvector-solver-wasm",
    "crates/ruvector-solver-node",
    "examples/dna",
    "examples/OSpipe",
    "crates/ruvector-coherence",
    "crates/ruvector-profiler",
    "crates/ruvector-attn-mincut",
    "crates/ruvector-cognitive-container",
    "crates/ruvector-verified",
    "crates/ruvector-verified-wasm",
    "crates/ruvector-graph-transformer",
    "crates/ruvector-graph-transformer-wasm",
    "crates/ruvector-graph-transformer-node",
    "examples/rvf-kernel-optimized",
    "examples/verified-applications",
    "crates/thermorust",
    "crates/ruvector-dither",
    "crates/ruvector-robotics",
    "examples/robotics",
    "crates/neural-trader-core",
    "crates/neural-trader-coherence",
    "crates/neural-trader-replay",
    "crates/neural-trader-wasm",
    # Kalshi integration (ADR-153)
    "crates/ruvector-kalshi",
    "crates/neural-trader-strategies",
    # RuVix Cognition Kernel (organized under crates/ruvix/)
    "crates/ruvix/crates/types",
    "crates/ruvix/crates/region",
    "crates/ruvix/crates/queue",
    "crates/ruvix/crates/cap",
    "crates/ruvix/crates/proof",
    "crates/ruvix/crates/sched",
    "crates/ruvix/crates/boot",
    "crates/ruvix/crates/vecgraph",
    "crates/ruvix/crates/nucleus",
    # Phase B: Bare metal AArch64 support
    "crates/ruvix/crates/hal",
    "crates/ruvix/crates/aarch64",
    "crates/ruvix/crates/drivers",
    "crates/ruvix/tests",
    "crates/ruvix/benches",
    "crates/ruvix/examples/cognitive_demo",
    # rvAgent — AI Agent Framework (DeepAgents Rust conversion)
    "crates/rvAgent/rvagent-core",
    "crates/rvAgent/rvagent-backends",
    "crates/rvAgent/rvagent-middleware",
    "crates/rvAgent/rvagent-tools",
    "crates/rvAgent/rvagent-subagents",
    "crates/rvAgent/rvagent-cli",
    "crates/rvAgent/rvagent-acp",
    "crates/rvAgent/rvagent-mcp",
    "crates/rvAgent/rvagent-wasm",
    "crates/rvAgent/rvagent-a2a",
    # ADR-159 a2a-swarm demo
    "examples/a2a-swarm",
    # ETL pipeline example
    "examples/train-discoveries",
    # Spectral graph sparsification
    "crates/ruvector-sparsifier",
    "crates/ruvector-sparsifier-wasm",
    # Consciousness metrics (IIT Φ, causal emergence)
    "crates/ruvector-consciousness",
    "crates/ruvector-consciousness-wasm",
    "examples/cmb-consciousness",
    "examples/gw-consciousness",
    "examples/ecosystem-consciousness",
    "examples/quantum-consciousness",
    "examples/gene-consciousness",
    "examples/climate-consciousness",
    # JS bundle decompiler (ADR-135)
    "crates/ruvector-decompiler",
    "crates/ruvector-decompiler-wasm",
    # DiskANN / Vamana (ADR-143)
    "crates/ruvector-diskann",
    "crates/ruvector-diskann-node",
    # Boundary-first scientific discovery PoC
    "examples/boundary-discovery",
    # CMB Cold Spot boundary-first discovery
    "examples/cmb-boundary-discovery",
    # FRB population boundary discovery (CHIME-like data)
    "examples/frb-boundary-discovery",
    # Cosmic void boundary information content
    "examples/void-boundary-discovery",
    # Multi-regime temporal attractor boundary detection
    "examples/temporal-attractor-discovery",
    # Music genre boundary discovery via spectral graph bisection
    "examples/music-boundary-discovery",
    # Weather regime boundary detection (variance/correlation precedes temperature)
    "examples/weather-boundary-discovery",
    # Market regime boundary discovery via correlation structure
    "examples/market-boundary-discovery",
    # Health state boundary detection from wearable sensor data
    "examples/health-boundary-discovery",
    # SETI exotic signals gallery: boundary-first detection of sub-threshold signals
    "examples/seti-exotic-signals",
    # SETI boundary-first discovery: sub-noise signal detection via coherence graphs
    "examples/seti-boundary-discovery",
    # Earthquake precursor detection via inter-station correlation boundary shifts
    "examples/earthquake-boundary-discovery",
    # Pandemic outbreak detection 60 days before case counts via correlation boundaries
    "examples/pandemic-boundary-discovery",
    # Infrastructure failure prediction via sensor correlation boundaries
    "examples/infrastructure-boundary-discovery",
    # Pre-seizure detection via brain correlation boundary shifts
    "examples/brain-boundary-discovery",
    # Clinical-publication-grade pre-seizure detection report with CSV output
    "examples/seizure-clinical-report",
    # Closed-loop seizure detection + therapeutic response simulation
    "examples/seizure-therapeutic-sim",
    # Real EEG analysis: CHB-MIT PhysioNet data with boundary-first detection
    "examples/real-eeg-analysis",
    # Multi-seizure cross-patient analysis: all 7 chb01 seizures
    "examples/real-eeg-multi-seizure",
]
resolver = "2"

[workspace.package]
version = "2.2.0"
edition = "2021"
rust-version = "1.77"
license = "MIT"
authors = ["Ruvector Team"]
repository = "https://github.com/ruvnet/ruvector"

[workspace.dependencies]
# Core functionality
redb = "2.1"
memmap2 = "0.9"
hnsw_rs = "0.3"
simsimd = "5.9"
rayon = "1.10"
crossbeam = "0.8"

# Serialization
rkyv = "0.8"
bincode = { version = "2.0.0-rc.3", features = ["serde"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

# Node.js bindings
napi = { version = "2.16", default-features = false, features = ["napi9", "async", "tokio_rt"] }
napi-derive = "2.16"

# WASM
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["Worker", "MessagePort", "console"] }
getrandom = { version = "0.3", features = ["wasm_js"] }

# Async runtime
tokio = { version = "1.41", features = ["rt-multi-thread", "sync", "macros"] }
futures = "0.3"

# Error handling and utilities
thiserror = "2.0"
anyhow = "1.0"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

# Math and numerics
nalgebra = { version = "0.33", default-features = false, features = ["std"] }
ndarray = "0.16"
rand = "0.8"
rand_distr = "0.4"

# Time and UUID
chrono = { version = "0.4", features = ["serde"] }
uuid = { version = "1.11", features = ["v4", "serde", "js"] }

# CLI
clap = { version = "4.5", features = ["derive", "cargo"] }
indicatif = "0.17"
console = "0.15"

# Testing and benchmarking
criterion = { version = "0.5", features = ["html_reports"] }
proptest = "1.5"
mockall = "0.13"

# Formal verification
lean-agentic = "=0.1.0"

# Performance
dashmap = "6.1"
parking_lot = "0.12"
once_cell = "1.20"

[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
strip = true
panic = "unwind"

[profile.bench]
inherits = "release"
debug = true

[profile.dev]
opt-level = 0
debug = true

[profile.test]

# Patch hnsw_rs to use rand 0.8 instead of 0.9 for WASM compatibility
# This resolves the getrandom version conflict (0.2 vs 0.3)
[patch.crates-io]
hnsw_rs = { path = "./patches/hnsw_rs" }