ruvector/docs/sdk/INDEX.md
ruvnet f6c684aba0 docs(sdk): add deep planning review for ruvector Python SDK
Seven-file design review at docs/sdk/ covering the binding strategy,
API surface, M1-M4 milestones, risks, and a one-page decision record
for shipping a Python SDK.

Recommended path: **PyO3 + maturin, single in-tree
`crates/ruvector-py/` cdylib, abi3-py39 wheel via cibuildwheel,
`pyo3-asyncio` over a singleton tokio runtime.**

Why:
- The existing `*-node` NAPI templates (e.g.
  `crates/ruvector-diskann-node/src/lib.rs`) already prove out the
  opaque-handle + `Arc<RwLock<…>>` shape PyO3 mirrors line-for-line —
  ~70% port, ~30% lifetime gymnastics.
- abi3 collapses the wheel matrix from ~25 (5 CPython versions × 5 platforms)
  to 5 (one wheel per platform, all py3.9+).
- Singleton tokio runtime avoids the "one runtime per call" overhead
  while remaining compatible with asyncio + uvloop.
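
The wheel-matrix arithmetic can be checked with a short sketch. It assumes five supported CPython minor versions (3.9 through 3.13) and five platform targets; the platform names are illustrative and not taken from the plan, which only states the counts.

```python
from itertools import product

# Assumed for illustration: the plan gives only the counts (~25 vs 5).
CPYTHON_VERSIONS = ["cp39", "cp310", "cp311", "cp312", "cp313"]
PLATFORMS = [
    "manylinux_x86_64",
    "manylinux_aarch64",
    "macosx_x86_64",
    "macosx_arm64",
    "win_amd64",
]

# Version-specific wheels: one per (interpreter, platform) pair.
per_version = [
    f"{py}-{py}-{plat}" for py, plat in product(CPYTHON_VERSIONS, PLATFORMS)
]

# abi3 wheels: one per platform, tagged with the oldest supported version.
abi3 = [f"cp39-abi3-{plat}" for plat in PLATFORMS]

print(len(per_version), len(abi3))
```

Each new CPython release would add another row to the version-specific matrix, while the abi3 list stays the same length.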

Milestone shape (each with explicit scope + acceptance tests):

  M1 — RaBitQ-only Python wheel. Just the published
       `ruvector-rabitq` crate exposed via PyO3. Smallest possible
       useful surface. ~600 LoC, 3 weeks.
  M2 — ruLake. Async via pyo3-asyncio. Witness verify exposed.
       ~900 LoC, 4 weeks.
  M3 — Embeddings + ML helpers. Wrap consumer-facing parts of
       `ruvector-cnn` / `ruvllm`. ~700 LoC, 3 weeks.
  M4 — A2A agent client. Wrap `rvagent-a2a` so Python apps can
       dispatch tasks to A2A peers, including signed AgentCard
       discovery. ~800 LoC, 4 weeks.
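
Summing the stated budgets gives a rough sense of the overall effort. This sketch assumes the milestones run back to back with no overlap, which the plan does not state explicitly.

```python
# (milestone, approx. LoC, weeks) as listed in the milestone plan above.
MILESTONES = [
    ("M1 RaBitQ-only wheel", 600, 3),
    ("M2 ruLake", 900, 4),
    ("M3 Embeddings + ML helpers", 700, 3),
    ("M4 A2A agent client", 800, 4),
]

total_loc = sum(loc for _, loc, _ in MILESTONES)
total_weeks = sum(weeks for _, _, weeks in MILESTONES)

print(f"~{total_loc} LoC over ~{total_weeks} weeks")
```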

Three acceptance gates for the whole effort:
  1. A Python user can do RAG over 1 M vectors in <5 lines.
  2. An asyncio user can stream A2A task updates without thread
     fights.
  3. `pip install ruvector` takes <10 s on a stock machine.
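
To make gate 2 concrete, here is a minimal sketch of the consumer-side shape it implies: task updates arrive through an async iterator and are consumed with `async for` on the caller's own event loop. `FakeA2aClient` and `stream_task` are hypothetical stand-ins; the real client API is not defined in this plan.

```python
import asyncio
from typing import AsyncIterator, List


class FakeA2aClient:
    """Hypothetical stand-in for the planned A2A client; not a real API."""

    async def stream_task(self, prompt: str) -> AsyncIterator[str]:
        # Simulate a peer emitting status updates over time.
        for state in ("submitted", "working", "completed"):
            await asyncio.sleep(0)  # hand control back to the event loop
            yield f"{prompt}: {state}"


async def main() -> List[str]:
    client = FakeA2aClient()
    received = []
    # Updates are consumed on the caller's loop; no extra threads are started.
    async for update in client.stream_task("summarize"):
        received.append(update)
    return received


updates = asyncio.run(main())
print(updates)
```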

Top 3 risks identified:
  R1 — tokio runtime + PyO3 + asyncio/uvloop interop. Mitigation:
       single lazy runtime, `pyo3-asyncio` shim.
  R3 — wheel size. M4 budget is 22 MB; A2A deps (axum + reqwest +
       rustls) could blow it. Mitigation: feature-gate axum/reqwest
       behind `agent` extra; default install is rabitq + rulake only.
  R7 — PyPI name squat on `ruvector`. Mitigation: register placeholder
       before M1 ships.
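
The "single lazy runtime" mitigation for R1 follows a familiar pattern. The sketch below is a pure-Python analogy, not the PyO3/tokio code: one background event loop is created on first use and then shared by every call, instead of starting a fresh loop per call. All names here are illustrative.

```python
import asyncio
import threading
from typing import Optional

_loop: Optional[asyncio.AbstractEventLoop] = None
_lock = threading.Lock()


def get_loop() -> asyncio.AbstractEventLoop:
    """Return the process-wide background loop, starting it on first use."""
    global _loop
    with _lock:
        if _loop is None:
            _loop = asyncio.new_event_loop()
            threading.Thread(target=_loop.run_forever, daemon=True).start()
        return _loop


async def _double(x: int) -> int:
    await asyncio.sleep(0)  # placeholder for real async work
    return x * 2


def double_blocking(x: int) -> int:
    """Sync entry point: schedule on the shared loop and wait for the result."""
    future = asyncio.run_coroutine_threadsafe(_double(x), get_loop())
    return future.result(timeout=5)


results = [double_blocking(i) for i in range(3)]
same_loop = get_loop() is get_loop()
print(results, same_loop)
```

The lock keeps concurrent first calls from racing to create two loops, which is the same hazard a lazily initialised native runtime would need to guard against.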

Nuance discovered: `ruvector-rabitq` has **no** sibling `*-node` or
`*-wasm` crate — unlike most consumer crates. M1 is therefore clean
greenfield: no parity-pressure to match a flaky NAPI signature, and
it confirms rabitq alone is the right starter target rather than the
umbrella `ruvector` crate the npm package wraps.

Planning doc only; no implementation.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-25 20:28:54 -04:00

# ruvector Python SDK — Planning Index

This directory contains the design review for a first-party Python SDK over the
ruvector workspace. It is a planning artifact, not source code. No `pyproject.toml`,
`*-py` crate, or PyO3 dependency exists in the workspace today (verified
2026-04-25 by searching for `pyo3`/`maturin` in every `Cargo.toml` and for
`pyproject.toml`/`*.pyi` outside `target/` and `node_modules/`). Everything
below is greenfield.

## Documents

- **[01-survey.md](./01-survey.md)** — What ruvector ships today: which crates
are realistic SDK targets vs internal-only, what FFI surfaces already exist
(NAPI-RS templates, wasm-bindgen modules, raw cbindgen consumers), the
shape of the JS/TS distribution, and which `examples/` are good Python
notebook material.
- **[02-strategy.md](./02-strategy.md)** — The binding-approach decision.
Reviews PyO3 + maturin, CFFI, ctypes-over-cbindgen, wasmtime-py over the
WASM crates, and gRPC-server-with-Python-client. Picks PyO3 + maturin and
defends the choice. Covers the asyncio story, the GIL story, the wheel
matrix, and the type-stub plan.
- **[03-api-surface.md](./03-api-surface.md)** — A concrete sketch of the
Python API the user types: `ruvector.RabitqIndex.build(...)`,
`ruvector.RuLake.builder()...build()`, `ruvector.A2aClient(...)`. Locks
in the error hierarchy, sync-vs-async signatures per call, NumPy interop,
and the Pythonic conveniences (`len(idx)`, `idx[i]`, context managers).
- **[04-milestones.md](./04-milestones.md)** — Four buildable milestones
with explicit scope, file lists, LoC budgets, and acceptance tests in
the same shape as ADR-159's milestone plan. M1 is RaBitQ-only. M2 adds
ruLake. M3 adds embeddings. M4 wraps `rvagent-a2a`.
- **[05-risks-and-tradeoffs.md](./05-risks-and-tradeoffs.md)** — The honest
reservations: tokio runtime in a PyO3 extension, GIL for batched ops,
wheel size, NEON/AVX-512 build-time-vs-runtime detection, abi3 vs
version-specific wheels, the `ruvector` PyPI squat question, and where
this code lives in the repo (a new `crates/ruvector-py/` member, not a
separate repo).
- **[06-decision-record.md](./06-decision-record.md)** — One-page summary
with the chosen strategy, the 4-milestone roadmap, the three acceptance
gates for the whole effort, and the open questions stakeholders need
to answer before M1 starts.
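
As a rough illustration of the Pythonic conveniences that `03-api-surface.md` describes (`len(idx)`, `idx[i]`, context managers), here is a pure-Python stand-in. `ToyIndex` is hypothetical and uses an exact brute-force scan; it is not the planned `ruvector.RabitqIndex`, whose signatures are defined only in that document.

```python
import math
from typing import List, Sequence, Tuple


class ToyIndex:
    """Pure-Python stand-in for the planned index type; not the real API."""

    def __init__(self, vectors: List[Tuple[float, ...]]):
        self._vectors = vectors
        self.closed = False

    @classmethod
    def build(cls, vectors: Sequence[Sequence[float]]) -> "ToyIndex":
        return cls([tuple(v) for v in vectors])

    def __len__(self) -> int:
        return len(self._vectors)

    def __getitem__(self, i: int) -> Tuple[float, ...]:
        return self._vectors[i]

    def __enter__(self) -> "ToyIndex":
        return self

    def __exit__(self, *exc) -> None:
        # A real binding would release the native handle here.
        self.closed = True

    def search(self, query: Sequence[float], k: int = 1) -> List[int]:
        # Exact brute-force scan; a real index would use quantized search.
        q = tuple(query)
        order = sorted(range(len(self)), key=lambda i: math.dist(self[i], q))
        return order[:k]


with ToyIndex.build([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]) as idx:
    n, first, hits = len(idx), idx[0], idx.search([0.9, 1.2], k=2)

print(n, first, hits)
```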

## How to read this

Read `06` first if you want the call-to-action. Read `02` first if you want
to argue with the binding strategy. Read `01` first if you've never opened
this codebase before.