Mirror of https://github.com/ruvnet/RuVector.git (synced 2026-05-23 04:27:11 +00:00)
Seven-file design review at docs/sdk/ covering the binding strategy,
API surface, M1-M4 milestones, risks, and a one-page decision record
for shipping a Python SDK.
Recommended path: **PyO3 + maturin, single in-tree
`crates/ruvector-py/` cdylib, abi3-py39 wheel via cibuildwheel,
`pyo3-asyncio` over a singleton tokio runtime.**
Why:
- The existing `*-node` NAPI templates (e.g.
`crates/ruvector-diskann-node/src/lib.rs`) already prove out the
opaque-handle + `Arc<RwLock<…>>` shape PyO3 mirrors line-for-line —
~70% port, ~30% lifetime gymnastics.
- abi3 collapses the wheel matrix from ~25 (5 CPython versions × 5 platforms)
to 5 (one wheel per platform, all py3.9+).
- Singleton tokio runtime avoids the "one runtime per call" overhead
while remaining compatible with asyncio + uvloop.
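The wheel-matrix arithmetic in the abi3 bullet can be checked with a short sketch. The version and platform lists below are assumptions chosen to match the stated counts (five CPython minors from 3.9 up, five platforms); the source gives only the totals, and the tag strings are illustrative rather than real wheel tags.

```python
from itertools import product

# Assumed inputs: the source states only "~25" and "5", not these lists.
CPYTHON_MINORS = ["3.9", "3.10", "3.11", "3.12", "3.13"]
PLATFORMS = [
    "linux-x86_64",
    "linux-aarch64",
    "macos-x86_64",
    "macos-arm64",
    "windows-amd64",
]


def per_version_wheels(minors, platforms):
    """One wheel per (interpreter version, platform) pair."""
    return [f"cp{m.replace('.', '')}-{p}" for m, p in product(minors, platforms)]


def abi3_wheels(platforms, floor="3.9"):
    """One stable-ABI wheel per platform, usable on `floor` and newer."""
    return [f"cp{floor.replace('.', '')}-abi3-{p}" for p in platforms]


if __name__ == "__main__":
    print(len(per_version_wheels(CPYTHON_MINORS, PLATFORMS)))  # 25
    print(len(abi3_wheels(PLATFORMS)))  # 5
```

Adding a sixth CPython minor grows the per-version matrix to 30 wheels, while the abi3 count stays at 5.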
Milestone shape (each with explicit scope + acceptance tests):
M1 — RaBitQ-only Python wheel. Just the published
`ruvector-rabitq` crate exposed via PyO3. Smallest possible
useful surface. ~600 LoC, 3 weeks.
M2 — ruLake. Async via pyo3-asyncio. Witness verify exposed.
~900 LoC, 4 weeks.
M3 — Embeddings + ML helpers. Wrap consumer-facing parts of
`ruvector-cnn` / `ruvllm`. ~700 LoC, 3 weeks.
M4 — A2A agent client. Wrap `rvagent-a2a` so Python apps can
dispatch tasks to A2A peers, including signed AgentCard
discovery. ~800 LoC, 4 weeks.
Three acceptance gates apply to the whole effort:
1. A Python user can do RAG over 1 M vectors in <5 lines.
2. An asyncio user can stream A2A task updates without thread
fights.
3. `pip install ruvector` takes <10 s on a stock machine.
Top 3 risks identified:
R1 — tokio runtime + PyO3 + asyncio/uvloop interop. Mitigation:
single lazy runtime, `pyo3-asyncio` shim.
R3 — wheel size. M4 budget is 22 MB; A2A deps (axum + reqwest +
rustls) could blow it. Mitigation: feature-gate axum/reqwest
behind `agent` extra; default install is rabitq + rulake only.
R7 — PyPI name squat on `ruvector`. Mitigation: register placeholder
before M1 ships.
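The "single lazy runtime" mitigation for R1 follows a familiar shape: create one shared executor on first use and route every call through it. The sketch below is only a Python analog built on asyncio, not the planned tokio or `pyo3-asyncio` code. The names `get_runtime`, `call_blocking`, and `_work` are hypothetical.

```python
import asyncio
import threading

_loop = None
_lock = threading.Lock()


def get_runtime():
    """Return the one shared background loop, creating it on first use."""
    global _loop
    with _lock:
        if _loop is None:
            loop = asyncio.new_event_loop()
            # Daemon thread so the loop never blocks interpreter shutdown.
            threading.Thread(target=loop.run_forever, daemon=True).start()
            _loop = loop
        return _loop


async def _work(x):
    """Placeholder for an async operation owned by the shared runtime."""
    await asyncio.sleep(0)
    return x * 2


def call_blocking(x):
    """Synchronous entry point: submit to the shared loop and wait."""
    future = asyncio.run_coroutine_threadsafe(_work(x), get_runtime())
    return future.result(timeout=5)


if __name__ == "__main__":
    assert get_runtime() is get_runtime()  # same loop on every call
    print(call_blocking(21))
```

Every call reuses the same loop and thread, which is the property the mitigation relies on to avoid per-call runtime startup.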
Nuance discovered: `ruvector-rabitq` has **no** sibling `*-node` or
`*-wasm` crate — unlike most consumer crates. M1 is therefore clean
greenfield: no parity-pressure to match a flaky NAPI signature, and
it confirms rabitq alone is the right starter target rather than the
umbrella `ruvector` crate the npm package wraps.
Planning doc only; no implementation.
Co-Authored-By: claude-flow <ruv@ruv.net>
ruvector Python SDK — Planning Index
This directory contains the design review for a first-party Python SDK over the
ruvector workspace. It is a planning artifact, not source code. No pyproject.toml,
*-py crate, or PyO3 dependency exists in the workspace today (verified
2026-04-25 by searching for pyo3/maturin in every Cargo.toml and for
pyproject.toml/*.pyi outside target/ and node_modules/). Everything
below is greenfield.
Documents
- 01-survey.md — What ruvector ships today: which crates
are realistic SDK targets vs internal-only, what FFI surfaces already exist
(NAPI-RS templates, wasm-bindgen modules, raw cbindgen consumers), the
shape of the JS/TS distribution, and which examples/ are good Python
notebook material.
- 02-strategy.md — The binding-approach decision. Reviews
PyO3 + maturin, CFFI, ctypes-over-cbindgen, wasmtime-py over the WASM
crates, and gRPC-server-with-Python-client. Picks PyO3 + maturin and
defends the choice. Covers the asyncio story, the GIL story, the wheel
matrix, and the type-stub plan.
- 03-api-surface.md — A concrete sketch of the
Python API the user types: ruvector.RabitqIndex.build(...),
ruvector.RuLake.builder()...build(), ruvector.A2aClient(...). Locks in
the error hierarchy, sync-vs-async signatures per call, NumPy interop,
and the Pythonic conveniences (len(idx), idx[i], context managers).
- 04-milestones.md — Four buildable milestones
with explicit scope, file lists, LoC budgets, and acceptance tests in
the same shape as ADR-159's milestone plan. M1 is RaBitQ-only. M2 adds
ruLake. M3 adds embeddings. M4 wraps rvagent-a2a.
- 05-risks-and-tradeoffs.md — The honest
reservations: tokio runtime in a PyO3 extension, GIL for batched ops,
wheel size, NEON/AVX-512 build-time-vs-runtime detection, abi3 vs
version-specific wheels, the ruvector PyPI squat question, and where
this code lives in the repo (a new crates/ruvector-py/ member, not a
separate repo).
- 06-decision-record.md — One-page summary with the chosen strategy,
the 4-milestone roadmap, three acceptance gates that gate the whole
effort, and the open questions for stakeholders to answer before M1
starts.
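As a rough illustration of the Pythonic conveniences that 03-api-surface.md lists (len(idx), idx[i], context managers, an error hierarchy), here is a pure-Python stand-in. It is a sketch under assumptions: only the name RabitqIndex.build comes from the index above. The error class names, the close() method, and all behavior are hypothetical, and the planned class would wrap a Rust handle rather than a Python list.

```python
class RuvectorError(Exception):
    """Hypothetical root of the SDK error hierarchy."""


class IndexClosedError(RuvectorError):
    """Hypothetical error raised when a closed index is used."""


class RabitqIndex:
    """Pure-Python stand-in for the planned PyO3-backed index."""

    def __init__(self, vectors):
        self._vectors = [tuple(float(x) for x in v) for v in vectors]
        self._closed = False

    @classmethod
    def build(cls, vectors):
        """Construct an index from an iterable of vectors."""
        return cls(vectors)

    def _check_open(self):
        if self._closed:
            raise IndexClosedError("index is closed")

    def __len__(self):
        self._check_open()
        return len(self._vectors)

    def __getitem__(self, i):
        self._check_open()
        return self._vectors[i]

    def close(self):
        """Release the stored vectors; later access raises IndexClosedError."""
        self._closed = True
        self._vectors = []

    def __enter__(self):
        self._check_open()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions from the with-body


if __name__ == "__main__":
    with RabitqIndex.build([[1, 2], [3, 4]]) as idx:
        print(len(idx), idx[1])
```

Rooting every SDK error in one base class lets callers catch RuvectorError broadly or a specific subclass narrowly, which is the usual motive for fixing the hierarchy early.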
How to read this
Read 06 first if you want the call-to-action. Read 02 first if you want
to argue with the binding strategy. Read 01 first if you've never opened
this codebase before.