ruvector/docs/sdk/01-survey.md
ruvnet f6c684aba0 docs(sdk): add deep planning review for ruvector Python SDK
Seven-file design review at docs/sdk/ covering the binding strategy,
API surface, M1-M4 milestones, risks, and a one-page decision record
for shipping a Python SDK.

Recommended path: **PyO3 + maturin, single in-tree
`crates/ruvector-py/` cdylib, abi3-py39 wheel via cibuildwheel,
`pyo3-asyncio` over a singleton tokio runtime.**

Why:
- The existing `*-node` NAPI templates (e.g.
  `crates/ruvector-diskann-node/src/lib.rs`) already prove out the
  opaque-handle + `Arc<RwLock<…>>` shape PyO3 mirrors line-for-line —
  ~70% port, ~30% lifetime gymnastics.
- abi3 collapses the wheel matrix from ~25 (5 CPython versions × 5 platforms)
  to 5 (one wheel per platform, all py3.9+).
- Singleton tokio runtime avoids the "one runtime per call" overhead
  while remaining compatible with asyncio + uvloop.
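
The wheel-matrix arithmetic can be checked with a short sketch. The five CPython minor versions (3.9 to 3.13) and the five platform tags below are assumptions chosen for illustration, not a list taken from the planning docs; only the tag layout (`cpXY-cpXY-<platform>` versus `cp39-abi3-<platform>`) follows the standard wheel naming scheme.

```python
from itertools import product

# Assumed for illustration: five CPython minors and five platform tags.
PYTHON_MINORS = [9, 10, 11, 12, 13]
PLATFORMS = [
    "manylinux_2_17_x86_64",
    "manylinux_2_17_aarch64",
    "macosx_10_12_x86_64",
    "macosx_11_0_arm64",
    "win_amd64",
]


def per_version_tags():
    """One wheel per (interpreter, platform) pair: cpXY-cpXY-<platform>."""
    return [f"cp3{m}-cp3{m}-{plat}" for m, plat in product(PYTHON_MINORS, PLATFORMS)]


def abi3_tags(floor_minor=9):
    """One wheel per platform, tagged with the lowest supported interpreter."""
    return [f"cp3{floor_minor}-abi3-{plat}" for plat in PLATFORMS]


print(len(per_version_tags()), "wheels without abi3")
print(len(abi3_tags()), "wheels with abi3")
```

Each new CPython release adds a full row to the per-version matrix, while the abi3 list stays at one wheel per platform.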

Milestone shape (each with explicit scope + acceptance tests):

  M1 — RaBitQ-only Python wheel. Just the published
       `ruvector-rabitq` crate exposed via PyO3. Smallest possible
       useful surface. ~600 LoC, 3 weeks.
  M2 — ruLake. Async via pyo3-asyncio. Witness verify exposed.
       ~900 LoC, 4 weeks.
  M3 — Embeddings + ML helpers. Wrap consumer-facing parts of
       `ruvector-cnn` / `ruvllm`. ~700 LoC, 3 weeks.
  M4 — A2A agent client. Wrap `rvagent-a2a` so Python apps can
       dispatch tasks to A2A peers, including signed AgentCard
       discovery. ~800 LoC, 4 weeks.

Three acceptance gates for the whole effort:
  1. A Python user can do RAG over 1 M vectors in <5 lines.
  2. An asyncio user can stream A2A task updates without thread
     fights.
  3. `pip install ruvector` takes <10 s on a stock machine.

Top 3 risks identified:
  R1 — tokio runtime + PyO3 + asyncio/uvloop interop. Mitigation:
       single lazy runtime, `pyo3-asyncio` shim.
  R3 — wheel size. M4 budget is 22 MB; A2A deps (axum + reqwest +
       rustls) could blow it. Mitigation: feature-gate axum/reqwest
       behind `agent` extra; default install is rabitq + rulake only.
  R7 — PyPI name squat on `ruvector`. Mitigation: register placeholder
       before M1 ships.

Nuance discovered: `ruvector-rabitq` has **no** sibling `*-node` or
`*-wasm` crate — unlike most consumer crates. M1 is therefore clean
greenfield: no parity-pressure to match a flaky NAPI signature, and
it confirms rabitq alone is the right starter target rather than the
umbrella `ruvector` crate the npm package wraps.

Planning doc only; no implementation.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-25 20:28:54 -04:00


01 — Survey: What ruvector Ships Today

Snapshot taken 2026-04-25 against main at commit 2e68f0c9f.

Workspace shape

  • crates/ contains ~110 directories. The workspace Cargo.toml lists 96 active entries under members; the rest are excluded for environment-specific build reasons (ruvector-postgres needs pgrx, mcp-brain-server is private, and the hyperbolic-hnsw pair is intentionally kept out of the default workspace).
  • Workspace version pin is 2.2.0 for first-party ruvector-* crates; rvAgent/* crates are independently versioned at 0.1.0.
  • The two crates that have actual [package].description text indicating a consumer-facing v1 are:
    • ruvector-rabitq: "RaBitQ: rotation-based 1-bit quantization for ultra-fast approximate nearest-neighbor search with theoretical error bounds." No NAPI/wasm sibling crate. Pure Rust, 9 source files, ~3,700 LoC; the trait surface is AnnIndex over four index variants (FlatF32Index, RabitqIndex, RabitqPlusIndex, RabitqAsymIndex). Already published on crates.io at 2.2.0 per the workspace version.
    • ruvector-rulake: "ruLake — vector-native federation intermediary over heterogeneous backends (ADR-155)." Depends on ruvector-rabitq. 7 source files, ~3,100 LoC. Public surface is RuLake, BackendAdapter, LocalBackend, FsBackend, VectorCache, RuLakeBundle. Methods on RuLake include search_one, search_federated, search_batch, publish_bundle, refresh_from_bundle_dir, save_cache_to_dir, warm_from_dir. All sync (no async).

These are the obvious starter targets — they're recent, they're small, they're the ones the ADR pair (ADR-154 + ADR-155) is shipping behind, and they're the only crates whose names appear in the workspace member list ahead of ruvector-core.
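
To make "1-bit quantization" concrete for readers coming from Python, here is a toy sketch of the general idea: keep one sign bit per dimension and rank candidates by Hamming distance. It is not the ruvector-rabitq algorithm. It omits the random rotation and the error-bound correction that the crate description mentions, and every name in it (`sign_code`, `SignBitIndex`) is hypothetical.

```python
import random


def sign_code(vec):
    """Pack the sign of each component into an int, one bit per dimension."""
    code = 0
    for i, x in enumerate(vec):
        if x >= 0.0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


class SignBitIndex:
    """Hypothetical toy index: stores 1-bit codes, ranks by Hamming distance."""

    def __init__(self, dim):
        self.dim = dim
        self.codes = []

    def add(self, vec):
        if len(vec) != self.dim:
            raise ValueError("dimension mismatch")
        self.codes.append(sign_code(vec))
        return len(self.codes) - 1  # row id of the inserted vector

    def search(self, query, k):
        q = sign_code(query)
        scored = sorted((hamming(q, c), i) for i, c in enumerate(self.codes))
        return [(i, d) for d, i in scored[:k]]  # (row id, Hamming distance)


rng = random.Random(0)
index = SignBitIndex(dim=64)
vectors = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(1000)]
for v in vectors:
    index.add(v)

hits = index.search(vectors[42], k=3)
print(hits)
```

A 64-dimensional f32 vector takes 256 bytes, while its sign code takes 8, which is the kind of compression that makes a small, self-contained wheel a plausible first milestone.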

Existing FFI surfaces (the templates we copy)

NAPI-RS bindings (Node.js)

The workspace has 14 *-node crates wired through napi-derive 2.16. The cleanest minimal template is crates/ruvector-diskann-node/src/lib.rs — one file, ~250 LoC, wraps ruvector-diskann with:

  • #[napi(object)] config struct (DiskAnnOptions).
  • #[napi] result struct (DiskAnnSearchResult).
  • #[napi] opaque handle holding Arc<RwLock<CoreIndex>>.
  • Sync methods (insert, insert_batch, search).
  • Async methods via tokio::task::spawn_blocking + .await on the JoinHandle (build_async).

This shape — opaque handle, Arc<RwLock<inner>>, sync + spawn_blocking async pair — is the existing house style. PyO3 bindings should mirror it module-for-module so reviewers can diff them against each other and so behaviour is identical across language clients.
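
Seen from the Python side, that house style might look roughly like the sketch below: an opaque handle that owns a locked inner object, a sync method, and an async twin that pushes the same call onto a worker thread. This is a pure-Python analogue for illustration only. `IndexHandle` and `_InnerIndex` are hypothetical names, a plain mutex stands in for `Arc<RwLock<…>>`, and `asyncio.to_thread` (Python 3.9+) stands in for `spawn_blocking`.

```python
import asyncio
import threading


class _InnerIndex:
    """Stand-in for the Rust core index; brute-force squared-L2 search."""

    def __init__(self):
        self.rows = []

    def insert(self, vec):
        self.rows.append(list(vec))
        return len(self.rows) - 1

    def search(self, query, k):
        dists = [
            (sum((a - b) ** 2 for a, b in zip(row, query)), i)
            for i, row in enumerate(self.rows)
        ]
        return [(i, d) for d, i in sorted(dists)[:k]]


class IndexHandle:
    """Hypothetical opaque handle: callers never touch _inner directly."""

    def __init__(self):
        self._inner = _InnerIndex()
        # Python's stdlib has no reader-writer lock; a mutex stands in here.
        self._lock = threading.Lock()

    def insert(self, vec):
        with self._lock:
            return self._inner.insert(vec)

    def search(self, query, k=10):
        with self._lock:
            return self._inner.search(query, k)

    async def search_async(self, query, k=10):
        # Analogue of spawn_blocking + await: run the sync method off-loop.
        return await asyncio.to_thread(self.search, query, k)


async def main():
    handle = IndexHandle()
    for v in ([0.0, 0.0], [1.0, 1.0], [5.0, 5.0]):
        handle.insert(v)
    return await handle.search_async([0.9, 1.2], k=2)


result = asyncio.run(main())
print(result)
```

Keeping the async method a thin wrapper over the sync one means both paths share one implementation, which is what makes a module-for-module diff against the NAPI crate practical.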

wasm-bindgen modules (browser / Node)

There are ~30 *-wasm crates. They use wasm-bindgen 0.2 + js-sys 0.3 + a getrandom shim (features = ["wasm_js"]) that's the workspace default. The pattern is identical: opaque handle, sync methods only (WASM has no real threads in stable browsers without SharedArrayBuffer gymnastics).

WASM is relevant to the SDK strategy as an alternative-not-taken (see 02-strategy), not as a code-share opportunity.

Raw cbindgen / FFI

crates/ruvector-router-ffi is the only -ffi crate. C ABI. We do not use it. Mentioning here because someone will ask.

What's published

  • ruvector-rabitq and ruvector-rulake — both at workspace version 2.2.0. These are the v1 consumer-facing crates.
  • npm packages: npm/packages/ has 57 directories. The flagship ruvector npm package is at 0.2.23 and pulls in @ruvector/core (0.1.25), @ruvector/attention (0.1.3), @ruvector/gnn (0.1.22), @ruvector/sona (0.1.4) — i.e. the JS/TS story is fragmented: one umbrella package over four core sub-packages, each backed by a *-node crate. The umbrella also bundles a CLI (bin/cli.js), WASM artifacts (wasm/), and an MCP server (@modelcontextprotocol/sdk is a runtime dep).

What the JS/TS SDK actually covers (anchor for parity)

Reading npm/packages/ruvector/package.json keywords + dependencies:

  • HNSW search, hybrid search, RaBitQ ("turboquant" appears), Graph RAG, FlashAttention-3, ColBERT, Mamba, hyperbolic geometry, ONNX MiniLM (semantic embeddings), SONA / LoRA / EWC adaptive learning, MCP server, Pi-Brain identity ("pi-key").

The Python SDK does not need to chase parity. The JS package is the everything-bagel; the Python package should be narrow and deliberate (see 02-strategy and 04-milestones).

Examples that map to Python notebooks

examples/ has 60+ directories. The ones that translate naturally:

  • examples/refrag-pipeline/ — RAG pipeline using compress.rs / expand.rs / sense.rs. Becomes the M1 hello-world notebook (01_rag_in_5_lines.ipynb).
  • examples/onnx-embeddings/ — MiniLM ONNX embedder. Backs the M3 embedding tutorial.
  • examples/a2a-swarm/ — multi-peer A2A demo. Backs the M4 agent tutorial. Lives at the workspace top level, was added with ADR-159.
  • crates/ruvector-rulake/examples/sidecar_daemon.rs and warm_restart.rs — the "production deployment" patterns. Become the M2 ops notebook.

The notebooks are tracked under 04-milestones.md per milestone, not checked in here.

What we are deliberately ignoring

These crates exist, are interesting, and will not be in the Python SDK roadmap:

  • The 30+ *-wasm browser crates. Not Python's market.
  • ruvix/ (cognition kernel, bare-metal AArch64). Out of scope for any host-language SDK.
  • mcp-* crates. MCP is a coordination protocol; if a Python user wants MCP they use the official MCP SDK.
  • examples/*-consciousness, examples/*-boundary-discovery, examples/seti-*, examples/seizure-*, etc. — research demos, not API surfaces.
  • crates/ruQu*, crates/ruvix/*, crates/cognitum-*, crates/prime-radiant, crates/thermorust. Internal R&D.

Net assessment

There is no existing Python work — confirmed by exhaustive search. This is a clean room. The four crates that matter for v1 of a Python SDK are, in order: ruvector-rabitq, ruvector-rulake, the embedder (ruvector-cnn + ONNX glue), and rvagent-a2a. The NAPI template at crates/ruvector-diskann-node/src/lib.rs is the structural exemplar to follow for every PyO3 module we write.