mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-25 23:24:03 +00:00
Seven-file design review at docs/sdk/ covering the binding strategy,
API surface, M1-M4 milestones, risks, and a one-page decision record
for shipping a Python SDK.
Recommended path: **PyO3 + maturin, single in-tree
`crates/ruvector-py/` cdylib, abi3-py39 wheel via cibuildwheel,
`pyo3-asyncio` over a singleton tokio runtime.**
Why:
- The existing `*-node` NAPI templates (e.g.
`crates/ruvector-diskann-node/src/lib.rs`) already prove out the
opaque-handle + `Arc<RwLock<…>>` shape PyO3 mirrors line-for-line —
~70% port, ~30% lifetime gymnastics.
- abi3 collapses the wheel matrix from ~25 (5 CPython versions × 5 platforms)
to 5 (one wheel per platform, all py3.9+).
- Singleton tokio runtime avoids the "one runtime per call" overhead
while remaining compatible with asyncio + uvloop.
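
The wheel-matrix arithmetic above can be checked with a short sketch. The version and platform lists here are assumptions picked only so the counts line up with the ~25 and 5 figures; the planning docs do not enumerate them, and the tag strings are simplified rather than exact wheel tags.

```python
from itertools import product

# Hypothetical targets: five interpreter versions and five platforms.
CPYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12", "3.13"]
PLATFORMS = [
    "manylinux-x86_64",
    "manylinux-aarch64",
    "macos-x86_64",
    "macos-arm64",
    "windows-amd64",
]


def version_specific_wheels(versions, platforms):
    """One wheel per (interpreter version, platform) pair."""
    return [f"cp{v.replace('.', '')}-{p}" for v, p in product(versions, platforms)]


def abi3_wheels(min_version, platforms):
    """One stable-ABI wheel per platform, usable by min_version and later."""
    tag = "cp" + min_version.replace(".", "")
    return [f"{tag}-abi3-{p}" for p in platforms]


per_version = version_specific_wheels(CPYTHON_VERSIONS, PLATFORMS)
stable_abi = abi3_wheels("3.9", PLATFORMS)
print(len(per_version), len(stable_abi))  # prints: 25 5
```

The per-version matrix grows by one row of platforms with every new CPython release, while the abi3 matrix stays at one wheel per platform.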
Milestone shape (each with explicit scope + acceptance tests):
M1 — RaBitQ-only Python wheel. Just the published
`ruvector-rabitq` crate exposed via PyO3. Smallest possible
useful surface. ~600 LoC, 3 weeks.
M2 — ruLake. Async via pyo3-asyncio. Witness verify exposed.
~900 LoC, 4 weeks.
M3 — Embeddings + ML helpers. Wrap consumer-facing parts of
`ruvector-cnn` / `ruvllm`. ~700 LoC, 3 weeks.
M4 — A2A agent client. Wrap `rvagent-a2a` so Python apps can
dispatch tasks to A2A peers, including signed AgentCard
discovery. ~800 LoC, 4 weeks.
Three acceptance gates for the whole effort:
1. A Python user can do RAG over 1 M vectors in <5 lines.
2. An asyncio user can stream A2A task updates without thread
fights.
3. `pip install ruvector` takes <10 s on a stock machine.
Top 3 risks identified:
R1 — tokio runtime + PyO3 + asyncio/uvloop interop. Mitigation:
single lazy runtime, `pyo3-asyncio` shim.
R3 — wheel size. M4 budget is 22 MB; A2A deps (axum + reqwest +
rustls) could blow it. Mitigation: feature-gate axum/reqwest
behind `agent` extra; default install is rabitq + rulake only.
R7 — PyPI name squat on `ruvector`. Mitigation: register placeholder
before M1 ships.
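
The "single lazy runtime" mitigation for R1 is a lazy-singleton pattern. The sketch below is only a Python analogy using a background `asyncio` loop; it is not the tokio/PyO3 implementation, and `get_runtime`, `call_blocking`, and `_work` are hypothetical names introduced for illustration.

```python
import asyncio
import threading

_loop = None
_lock = threading.Lock()


def get_runtime():
    """Return the process-wide background event loop, creating it on first use."""
    global _loop
    with _lock:
        if _loop is None:
            loop = asyncio.new_event_loop()
            # Daemon thread so the shared loop never blocks interpreter exit.
            thread = threading.Thread(
                target=loop.run_forever, name="bg-runtime", daemon=True
            )
            thread.start()
            _loop = loop
        return _loop


async def _work(x):
    """Stand-in for an async operation that would run on the shared runtime."""
    await asyncio.sleep(0)
    return x * 2


def call_blocking(x):
    """Submit work to the shared loop from any thread and wait for the result."""
    future = asyncio.run_coroutine_threadsafe(_work(x), get_runtime())
    return future.result(timeout=5)


first = get_runtime()
results = [call_blocking(i) for i in range(3)]
same_runtime = get_runtime() is first
print(results, same_runtime)  # prints: [0, 2, 4] True
```

Every call reuses the same loop object, which is the property the mitigation relies on: the runtime is started once on first use, not once per call.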
Nuance discovered: `ruvector-rabitq` has **no** sibling `*-node` or
`*-wasm` crate — unlike most consumer crates. M1 is therefore clean
greenfield: no parity-pressure to match a flaky NAPI signature, and
it confirms rabitq alone is the right starter target rather than the
umbrella `ruvector` crate the npm package wraps.
Planning doc only; no implementation.
Co-Authored-By: claude-flow <ruv@ruv.net>
46 lines
2.5 KiB
Markdown
# ruvector Python SDK — Planning Index

This directory contains the design review for a first-party Python SDK over the
ruvector workspace. It is a planning artifact, not source code. No `pyproject.toml`,
`*-py` crate, or PyO3 dependency exists in the workspace today (verified
2026-04-25 by searching for `pyo3`/`maturin` in every `Cargo.toml` and for
`pyproject.toml`/`*.pyi` outside `target/` and `node_modules/`). Everything
below is greenfield.

## Documents

- **[01-survey.md](./01-survey.md)** — What ruvector ships today: which crates
  are realistic SDK targets vs internal-only, what FFI surfaces already exist
  (NAPI-RS templates, wasm-bindgen modules, raw cbindgen consumers), the
  shape of the JS/TS distribution, and which `examples/` are good Python
  notebook material.
- **[02-strategy.md](./02-strategy.md)** — The binding-approach decision.
  Reviews PyO3 + maturin, CFFI, ctypes-over-cbindgen, wasmtime-py over the
  WASM crates, and gRPC-server-with-Python-client. Picks PyO3 + maturin and
  defends the choice. Covers the asyncio story, the GIL story, the wheel
  matrix, and the type-stub plan.
- **[03-api-surface.md](./03-api-surface.md)** — A concrete sketch of the
  Python API the user types: `ruvector.RabitqIndex.build(...)`,
  `ruvector.RuLake.builder()...build()`, `ruvector.A2aClient(...)`. Locks
  in the error hierarchy, sync-vs-async signatures per call, NumPy interop,
  and the Pythonic conveniences (`len(idx)`, `idx[i]`, context managers).
- **[04-milestones.md](./04-milestones.md)** — Four buildable milestones
  with explicit scope, file lists, LoC budgets, and acceptance tests in
  the same shape as ADR-159's milestone plan. M1 is RaBitQ-only. M2 adds
  ruLake. M3 adds embeddings. M4 wraps `rvagent-a2a`.
- **[05-risks-and-tradeoffs.md](./05-risks-and-tradeoffs.md)** — The honest
  reservations: tokio runtime in a PyO3 extension, GIL for batched ops,
  wheel size, NEON/AVX-512 build-time-vs-runtime detection, abi3 vs
  version-specific wheels, the `ruvector` PyPI squat question, and where
  this code lives in the repo (a new `crates/ruvector-py/` member, not a
  separate repo).
- **[06-decision-record.md](./06-decision-record.md)** — One-page summary
  with the chosen strategy, the 4-milestone roadmap, three acceptance
  gates that gate the whole effort, and the open questions for stakeholders
  to answer before M1 starts.

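To make the Pythonic conveniences listed under `03-api-surface.md` concrete, here is a minimal pure-Python stand-in. It is not the planned binding: `RabitqIndexSketch`, its `search` method, and the brute-force distance logic are hypothetical, and only `build(...)`, `len(idx)`, `idx[i]`, and context-manager use come from the index above.

```python
class RabitqIndexSketch:
    """Pure-Python stand-in, NOT the real binding: brute force, no quantization."""

    def __init__(self, vectors):
        self._vectors = [tuple(float(x) for x in v) for v in vectors]
        self._closed = False

    @classmethod
    def build(cls, vectors):
        """Mirror the `RabitqIndex.build(...)` entry point named in the index."""
        return cls(vectors)

    def __len__(self):
        return len(self._vectors)

    def __getitem__(self, i):
        return self._vectors[i]

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions

    def close(self):
        """Release the stored vectors; later searches raise."""
        self._vectors = []
        self._closed = True

    def search(self, query, k=1):
        """Hypothetical helper: indices of the k nearest vectors (squared L2)."""
        if self._closed:
            raise ValueError("index is closed")

        def dist(v):
            return sum((a - b) ** 2 for a, b in zip(v, query))

        order = sorted(range(len(self._vectors)), key=lambda i: dist(self._vectors[i]))
        return order[:k]


with RabitqIndexSketch.build([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]) as idx:
    size = len(idx)
    second = idx[1]
    nearest = idx.search([0.9, 1.2], k=2)

print(size, second, nearest)  # prints: 3 (1.0, 1.0) [1, 0]
```

The context manager gives the handle a deterministic release point, which would matter for a real extension that holds native memory behind an opaque handle.
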
## How to read this

Read `06` first if you want the call-to-action. Read `02` first if you want
to argue with the binding strategy. Read `01` first if you've never opened
this codebase before.