diff --git a/README.md b/README.md index 0226dcd2a..e24857800 100644 --- a/README.md +++ b/README.md @@ -74,7 +74,7 @@ Most vector databases are static — they store embeddings and search them. That | 21 | **Witness chains** | Tamper-evident hash-linked audit trail for every operation | | 22 | **Post-quantum signatures** | ML-DSA-65 and SLH-DSA-128s alongside Ed25519 | | 23 | **DNA-style lineage** | Track parent/child derivation chains with cryptographic hashes | -| 24 | **20 segment types** | VEC, INDEX, KERNEL, EBPF, COW_MAP, WITNESS, CRYPTO, and 13 more | +| 24 | **24 segment types** | VEC, INDEX, KERNEL, EBPF, WASM, COW_MAP, WITNESS, CRYPTO, and 16 more | **Specialized Processing** | # | Capability | What It Does | @@ -221,7 +221,7 @@ npx @ruvector/rvf-mcp-server --transport stdio # MCP server for AI agents | Tamper-evident audit | Hash-linked witness chain for every insert, query, and deletion | | Post-quantum signatures | ML-DSA-65 and Ed25519 signing on every segment | | DNA-style lineage | Parent/child derivation chains with cryptographic verification | -| 20 segment types | VEC, INDEX, KERNEL, EBPF, WASM, COW_MAP, WITNESS, CRYPTO, and 12 more | +| 24 segment types | VEC, INDEX, KERNEL, EBPF, WASM, COW_MAP, WITNESS, CRYPTO, and 16 more | **Rust crates** (13): [`rvf-types`](https://crates.io/crates/rvf-types) `rvf-wire` `rvf-manifest` `rvf-quant` `rvf-index` `rvf-crypto` [`rvf-runtime`](https://crates.io/crates/rvf-runtime) `rvf-kernel` `rvf-ebpf` `rvf-launch` `rvf-server` `rvf-import` [`rvf-cli`](https://crates.io/crates/rvf-cli) @@ -230,7 +230,7 @@ npx @ruvector/rvf-mcp-server --transport stdio # MCP server for AI agents - **Full documentation**: [crates/rvf/README.md](./crates/rvf/README.md) - **ADR-030**: [Cognitive Container Architecture](./docs/adr/ADR-030-rvf-cognitive-container.md) - **ADR-031**: [COW Branching & Real Containers](./docs/adr/ADR-031-rvcow-branching-and-real-cognitive-containers.md) -- **45 runnable examples**: 
[examples/rvf/examples/](./examples/rvf/examples/) +- **46 runnable examples**: [examples/rvf/examples/](./examples/rvf/examples/) @@ -1431,7 +1431,15 @@ let syndrome = gate.assess_coherence(&quantum_state)?; | [rvf-import](./crates/rvf/rvf-import) | JSON, CSV, NumPy importers | [![crates.io](https://img.shields.io/crates/v/rvf-import.svg)](https://crates.io/crates/rvf-import) | | [rvf-cli](./crates/rvf/rvf-cli) | Unified CLI with 17 subcommands | [![crates.io](https://img.shields.io/crates/v/rvf-cli.svg)](https://crates.io/crates/rvf-cli) | -**RVF Features:** Single-file cognitive containers that boot as Linux microservices, COW branching at cluster granularity, eBPF acceleration, witness chains, post-quantum signatures, 20 segment types. [Full README](./crates/rvf/README.md) +**RVF Features:** Single-file cognitive containers that boot as Linux microservices, COW branching at cluster granularity, eBPF acceleration, witness chains, post-quantum signatures, 24 segment types. [Full README](./crates/rvf/README.md) + +**Self-booting example** — the `claude_code_appliance` builds a complete AI dev environment as one file: + +```bash +cd examples/rvf && cargo run --example claude_code_appliance +``` + +Final file: **5.1 MB single `.rvf`** — boots Linux, serves queries, runs Claude Code. One file. Boots on QEMU/Firecracker. Runs SSH. Serves vectors. Installs Claude Code. Proves every step with a cryptographic witness chain. 
### Personal AI Memory (OSpipe) diff --git a/crates/rvf/Cargo.lock b/crates/rvf/Cargo.lock index e30c828b2..8eb90a4ce 100644 --- a/crates/rvf/Cargo.lock +++ b/crates/rvf/Cargo.lock @@ -1689,7 +1689,7 @@ dependencies = [ [[package]] name = "rvf-crypto" -version = "0.1.0" +version = "0.2.0" dependencies = [ "ed25519-dalek", "rand", @@ -1795,7 +1795,7 @@ dependencies = [ [[package]] name = "rvf-runtime" -version = "0.1.0" +version = "0.2.0" dependencies = [ "rand", "rvf-types", @@ -1833,7 +1833,7 @@ dependencies = [ [[package]] name = "rvf-types" -version = "0.1.0" +version = "0.2.0" dependencies = [ "ed25519-dalek", "rand_core", diff --git a/crates/rvf/README.md b/crates/rvf/README.md index 78c67b6a9..76efda7c0 100644 --- a/crates/rvf/README.md +++ b/crates/rvf/README.md @@ -16,10 +16,10 @@

- Tests - Examples - Crates - Lines + Tests + Examples + Crates + Lines License MSRV no_std @@ -69,7 +69,7 @@ This is not a database format. It is an **executable knowledge unit**. | Capability | How | Segment | |------------|-----|---------| | 🤖 **Plug into AI agents** | An MCP server lets Claude Code, Cursor, and other AI tools create, query, and manage vector stores directly. | npm package | -| 📦 **Use from any language** | Published as 13 Rust crates, 4 npm packages, a CLI tool, and an HTTP server. Works from Rust, Node.js, browsers, and the command line. | 13 crates + 4 npm | +| 📦 **Use from any language** | Published as 14 Rust crates, 6 adapters, 4 npm packages, a CLI tool, and an HTTP server. Works from Rust, Node.js, browsers, and the command line. | 14 crates + 6 adapters + 4 npm | | ♻️ **Always backward-compatible** | Old tools skip new segment types they don't understand. A file with COW branching still works in a reader that only knows basic vectors. | Format rule | ``` @@ -150,19 +150,20 @@ A single `.rvf` file is crash-safe (no WAL needed), self-describing, and progres | Crate | Version | Description | |-------|---------|-------------| -| [`rvf-types`](https://crates.io/crates/rvf-types) | 0.1.0 | Segment types, 20 headers, enums (`no_std`) | +| [`rvf-types`](https://crates.io/crates/rvf-types) | 0.2.0 | Segment types, 24 headers, quality, security, AGI container types (`no_std`) | | [`rvf-wire`](https://crates.io/crates/rvf-wire) | 0.1.0 | Wire format read/write (`no_std`) | | [`rvf-manifest`](https://crates.io/crates/rvf-manifest) | 0.1.0 | Two-level manifest, FileIdentity, COW pointers | | [`rvf-quant`](https://crates.io/crates/rvf-quant) | 0.1.0 | Scalar, product, and binary quantization | | [`rvf-index`](https://crates.io/crates/rvf-index) | 0.1.0 | HNSW progressive indexing (Layer A/B/C) | -| [`rvf-crypto`](https://crates.io/crates/rvf-crypto) | 0.1.0 | SHAKE-256, Ed25519, witness chains, attestation | -| 
[`rvf-runtime`](https://crates.io/crates/rvf-runtime) | 0.1.0 | Full store API, COW engine, compaction | +| [`rvf-crypto`](https://crates.io/crates/rvf-crypto) | 0.2.0 | SHAKE-256, Ed25519, witness chains, seed crypto | +| [`rvf-runtime`](https://crates.io/crates/rvf-runtime) | 0.2.0 | Full store API, COW engine, AGI containers, QR seeds, safety net | | [`rvf-kernel`](https://crates.io/crates/rvf-kernel) | 0.1.0 | Linux kernel builder, initramfs, Docker pipeline | | [`rvf-ebpf`](https://crates.io/crates/rvf-ebpf) | 0.1.0 | BPF C compiler (XDP, socket filter, TC) | | [`rvf-launch`](https://crates.io/crates/rvf-launch) | 0.1.0 | QEMU microvm launcher, KVM/TCG, QMP | | [`rvf-server`](https://crates.io/crates/rvf-server) | 0.1.0 | HTTP REST + TCP streaming server | | [`rvf-import`](https://crates.io/crates/rvf-import) | 0.1.0 | JSON, CSV, NumPy importers | | [`rvf-cli`](https://crates.io/crates/rvf-cli) | 0.1.0 | Unified CLI with 17 subcommands | +| [`rvf-solver-wasm`](https://crates.io/crates/rvf-solver-wasm) | 0.1.0 | Thompson Sampling temporal solver (WASM, `no_std`) | ### npm Packages (npmjs.org) @@ -215,10 +216,10 @@ npx @ruvector/rvf-mcp-server --transport stdio ```toml # Cargo.toml [dependencies] -rvf-runtime = "0.1" # full store API -rvf-types = "0.1" # types only (no_std) +rvf-runtime = "0.2" # full store API +rvf-types = "0.2" # types only (no_std) rvf-wire = "0.1" # wire format (no_std) -rvf-crypto = "0.1" # signatures + witness chains +rvf-crypto = "0.2" # signatures + witness chains rvf-import = "0.1" # JSON/CSV/NumPy importers ``` @@ -325,7 +326,7 @@ rvf inspect output/linux_microkernel.rvf ## 📋 What RVF Contains -An RVF file is a sequence of typed segments. Each segment is self-describing, 64-byte aligned, and independently integrity-checked. The format supports 20 segment types that together constitute a complete cognitive runtime: +An RVF file is a sequence of typed segments. 
Each segment is self-describing, 64-byte aligned, and independently integrity-checked. The format supports 24 segment types that together constitute a complete cognitive runtime: ``` .rvf file (Sealed Cognitive Engine) @@ -350,6 +351,9 @@ An RVF file is a sequence of typed segments. Each segment is self-describing, 64 +-- REFCOUNT_SEG .... Cluster reference counts, rebuildable (0x21) +-- MEMBERSHIP_SEG .. Vector visibility filter for branches (0x22) +-- DELTA_SEG ....... Sparse delta patches / LoRA overlays (0x23) + +-- TRANSFER_PRIOR .. Transfer learning priors (0x30) + +-- POLICY_KERNEL ... Thompson Sampling policy state (0x31) + +-- COST_CURVE ...... Cost/reward curves for solver (0x32) ``` --- @@ -407,7 +411,7 @@ This is not a database. It is a **sealed, auditable, self-booting domain expert* ## 🔌 RuVector Ecosystem Integration -RVF is the canonical binary format across 75+ Rust crates in the RuVector ecosystem: +RVF is the canonical binary format across 87+ Rust crates in the RuVector ecosystem: | Domain | Crates | RVF Segment | |--------|--------|-------------| @@ -439,7 +443,7 @@ The same `.rvf` file format runs on cloud servers, Firecracker microVMs, TEE enc | **Temperature-tiered quantization** | Hot vectors stay fp16, warm use product quantization, cold use binary — automatically. | | **Metadata filtering** | Filtered k-NN with boolean expressions (AND/OR/NOT/IN/RANGE). | | **4 KB instant boot** | Root manifest fits in one page read. Cold boot < 5 ms. | -| **20 segment types** | VEC, INDEX, MANIFEST, QUANT, WITNESS, CRYPTO, KERNEL, EBPF, COW_MAP, MEMBERSHIP, DELTA, and 9 more. | +| **24 segment types** | VEC, INDEX, MANIFEST, QUANT, WITNESS, CRYPTO, KERNEL, EBPF, WASM, COW_MAP, MEMBERSHIP, DELTA, TRANSFER_PRIOR, POLICY_KERNEL, COST_CURVE, and 9 more. | ### COW Branching (RVCOW) @@ -530,17 +534,18 @@ An `.rvf` file is a sequence of 64-byte-aligned segments. 
Each segment has a sel | Crate | Lines | Purpose | |-------|------:|---------| -| `rvf-types` | 5,200+ | Segment types, 20 headers, COW/membership/delta/kernel-binding types, enums (`no_std`) | +| `rvf-types` | 7,000+ | 24 segment types, AGI container, quality, security, WASM bootstrap, QR seed (`no_std`) | | `rvf-wire` | 2,011 | Wire format read/write (`no_std`) | | `rvf-manifest` | 1,700+ | Two-level manifest with 4 KB root, FileIdentity codec, COW pointers, double-root scheme | | `rvf-index` | 2,691 | HNSW progressive indexing (Layer A/B/C) | | `rvf-quant` | 1,443 | Scalar, product, and binary quantization | -| `rvf-crypto` | 1,725 | SHAKE-256, Ed25519, witness chains, attestation, lineage witnesses | -| `rvf-runtime` | 5,500+ | Full store API, COW engine, membership filters, compaction, branch/freeze | +| `rvf-crypto` | 1,725 | SHAKE-256, Ed25519, witness chains, attestation, seed crypto | +| `rvf-runtime` | 8,000+ | Full store API, COW engine, AGI containers, QR seeds, safety net, adversarial defense | | `rvf-kernel` | 2,400+ | Real Linux kernel builder, cpio/newc initramfs, Docker build, SHA3-256 verification | | `rvf-launch` | 1,200+ | QEMU microvm launcher, KVM/TCG detection, QMP shutdown protocol | | `rvf-ebpf` | 1,100+ | Real BPF C compiler (XDP, socket filter, TC), vmlinux.h generation | | `rvf-wasm` | 1,700+ | WASM control plane: in-memory store, query, segment inspection, witness chain verification (~46 KB) | +| `rvf-solver-wasm` | 1,500+ | Thompson Sampling temporal solver, PolicyKernel, three-loop architecture (`no_std`) | | `rvf-node` | 852 | Node.js N-API bindings with lineage, kernel/eBPF, and inspection | | `rvf-cli` | 1,800+ | Unified CLI with 17 subcommands (create, ingest, query, delete, status, inspect, compact, derive, serve, launch, embed-kernel, embed-ebpf, filter, freeze, verify-witness, verify-attestation, rebuild-refcounts) | | `rvf-server` | 1,165 | HTTP REST + TCP streaming server | @@ -791,6 +796,115 @@ if let Some((header, 
program_data)) = store.extract_ebpf()? {
 For the full specification including wire formats, attestation binding, and implementation phases, see [ADR-030: RVF Cognitive Container](docs/adr/ADR-030-rvf-computational-container.md).
+### End-to-End: Claude Code Appliance
+
+The `claude_code_appliance` example builds a complete self-booting AI development environment as a single `.rvf` file. It uses real infrastructure — a Docker-built Linux kernel, Ed25519 SSH keys, a BPF C socket filter, and a cryptographic witness chain.
+
+**Prerequisites:** Docker (for kernel build), Rust 1.87+
+
+```bash
+# Build and run the example
+cd examples/rvf
+cargo run --example claude_code_appliance
+```
+
+**What it produces** (5.1 MB file):
+
+```
+claude_code_appliance.rvf
+ ├── KERNEL_SEG Linux 6.8.12 bzImage (5.2 MB, x86_64)
+ ├── EBPF_SEG Socket filter — allows ports 2222, 8080 only
+ ├── VEC_SEG 20 package embeddings (128-dim)
+ ├── INDEX_SEG HNSW graph for package search
+ ├── WITNESS_SEG 6-entry tamper-evident audit trail
+ ├── CRYPTO_SEG 3 Ed25519 SSH user keys (root, deploy, claude)
+ ├── MANIFEST_SEG 4 KB root with segment directory
+ └── Snapshot v1 derived image with lineage tracking
+```
+
+**Boot sequence** (once launched on Firecracker/QEMU):
+
+```
+1. Firecracker loads KERNEL_SEG → Linux boots (<125 ms)
+2. SSH server starts on port 2222
+3. curl -fsSL https://claude.ai/install.sh | bash
+4. RVF query server starts on port 8080
+5. Claude Code ready for use
+```
+
+**Connect and use:**
+
+```bash
+# Boot the file (requires QEMU or Firecracker)
+rvf launch claude_code_appliance.rvf
+
+# SSH in
+ssh -p 2222 deploy@localhost
+
+# Query the package database
+curl -s localhost:8080/query -d '{"vector":[0.1,...], "k":5}'
+
+# Or use the CLI
+rvf query claude_code_appliance.rvf --vector "0.1,0.2,..." 
--k 5
+```
+
+**Verified output from the example run:**
+
+```
+=== Claude Code Appliance Summary ===
+ File size: 5,260,093 bytes (5.1 MB)
+ Segments: 8
+ Packages: 20 (203.1 MB manifest)
+ KERNEL_SEG: MicroLinux x86_64 (5,243,904 bytes)
+ EBPF_SEG: SocketFilter (3,805 bytes)
+ SSH users: 3 (Ed25519 signed, all verified)
+ Witness chain: 6 entries (tamper-evident, all verified)
+ Lineage: base + v1 snapshot (parent hash matches)
+```
+
+Final file: **5.1 MB single `.rvf`** — boots Linux, serves queries, runs Claude Code.
+
+One file. Boots Linux. Runs SSH. Serves vectors. Installs Claude Code. Proves every step.
+
+### Launching with QEMU
+
+```bash
+# CLI launcher (auto-detects KVM or falls back to TCG)
+rvf launch vectors.rvf
+
+# Manual QEMU (if you want control)
+rvf launch vectors.rvf --memory 512M --cpus 2 --port-forward 2222:22,8080:8080
+
+# Extract kernel for external use
+rvf inspect vectors.rvf --segment kernel --output kernel.bin
+qemu-system-x86_64 -M microvm -kernel kernel.bin -append "console=ttyS0" -nographic
+```
+
+### Building Your Own Bootable RVF
+
+Step-by-step to create a self-booting `.rvf` from scratch:
+
+```bash
+# 1. Create a vector store
+rvf create myservice.rvf --dimension 384
+
+# 2. Ingest your data
+rvf ingest myservice.rvf --input embeddings.json --format json
+
+# 3. Build and embed a Linux kernel (uses Docker)
+rvf embed-kernel myservice.rvf --arch x86_64
+
+# 4. Optionally embed an eBPF filter
+rvf embed-ebpf myservice.rvf --program filter.c
+
+# 5. Verify the result
+rvf inspect myservice.rvf
+# MANIFEST_SEG, VEC_SEG, INDEX_SEG, KERNEL_SEG, EBPF_SEG, WITNESS_SEG
+
+# 6. Boot it
+rvf launch myservice.rvf
+```
+
 ---
 ## 🔗 Library Adapters
@@ -808,6 +922,143 @@ RVF provides drop-in adapters for 6 libraries in the RuVector ecosystem:
 ---
+## 🤖 AGI Cognitive Container (ADR-036)
+
+An AGI container packages a complete AI agent runtime into a single sealed `.rvf` file. 
Where the [Self-Booting RVF](#%EF%B8%8F-self-booting-rvf-cognitive-container) section covers the compute tiers (WASM/eBPF/Kernel), the AGI container adds the intelligence layer on top: model identity, orchestration config, tool registries, evaluation harnesses, authority controls, and coherence gates.
+
+```
+AGI Cognitive Container (.rvf)
+├── Identity ────── container UUID, build UUID, model ID hash
+├── Orchestrator ── Claude Code / Claude Flow config (JSON)
+├── Tools ──────── MCP tool adapter registry
+├── Agent Prompts ─ role definitions per agent type
+├── Eval Harness ── task suite + grading rules
+├── Skills ──────── promoted skill library
+├── Policy ──────── governance rules + authority config
+├── Coherence ───── min score, contradiction rate, rollback ratio
+├── Resources ───── time/token/cost budgets with clamping
+├── Replay ──────── automation script for deterministic re-execution
+├── Kernel Config ─ boot parameters, network, SSH
+├── Domain Profile ─ coding / research / ops specialization
+└── Signature ───── HMAC-SHA256 or Ed25519 tamper seal
+```
+
+### Execution Modes
+
+| Mode | Purpose | Requires |
+|------|---------|----------|
+| **Replay** | Deterministic re-execution from witness logs | Witness chain |
+| **Verify** | Validate container integrity and run eval harness | Kernel + world model, or WASM + vectors |
+| **Live** | Full autonomous operation with tool use | Kernel + world model |
+
+### Authority Levels
+
+Authority is hierarchical — each level permits everything below it:
+
+| Level | Allows |
+|-------|--------|
+| `ReadOnly` | Read vectors, run queries |
+| `WriteMemory` | + Write to vector store, update index |
+| `ExecuteTools` | + Invoke MCP tools, run commands |
+| `WriteExternal` | + Network access, file I/O, push to git |
+
+Default authority per mode: Replay → ReadOnly, Verify → ExecuteTools, Live → WriteMemory.
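
The hierarchy and per-mode defaults described above can be sketched with an ordered enum. This is a minimal illustration only: the names `AuthorityLevel`, `ExecutionMode`, `permits`, and `default_authority` are hypothetical stand-ins chosen for this sketch, not the actual `rvf-types` API, and only the level names and mode defaults are taken from the README text.

```rust
/// Hypothetical mirror of the four authority levels in the table above.
/// Declaration order doubles as privilege order through the derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AuthorityLevel {
    ReadOnly,
    WriteMemory,
    ExecuteTools,
    WriteExternal,
}

/// Hypothetical mirror of the three execution modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecutionMode {
    Replay,
    Verify,
    Live,
}

impl AuthorityLevel {
    /// A granted level permits an action when the level the action
    /// requires is at or below the granted level.
    fn permits(self, required: AuthorityLevel) -> bool {
        self >= required
    }
}

impl ExecutionMode {
    /// Default authority per mode, as stated in the README text.
    fn default_authority(self) -> AuthorityLevel {
        match self {
            ExecutionMode::Replay => AuthorityLevel::ReadOnly,
            ExecutionMode::Verify => AuthorityLevel::ExecuteTools,
            ExecutionMode::Live => AuthorityLevel::WriteMemory,
        }
    }
}
```

Under this sketch, `ExecutionMode::Live.default_authority().permits(AuthorityLevel::ExecuteTools)` is `false`, so a Live container would need an explicit authority override before invoking tools. Whether the real runtime compares levels this way is an assumption.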
+
+### Resource Budgets
+
+Every container carries hard limits that are clamped to safety maximums:
+
+| Resource | Max | Default |
+|----------|-----|---------|
+| Time | 3,600 sec | 300 sec |
+| Tokens | 1,000,000 | 100,000 |
+| Cost | $10.00 | $1.00 |
+| Tool calls | 500 | 100 |
+| External writes | 50 | 10 |
+
+### Coherence Gates
+
+Coherence thresholds halt execution when the agent's world model drifts:
+
+- `min_coherence_score` (0.0–1.0) — minimum quality gate
+- `max_contradiction_rate` (0.0–1.0) — tolerable contradiction frequency
+- `max_rollback_ratio` (0.0–1.0) — ratio of rolled-back decisions
+
+### Building a Container
+
+```rust
+use rvf_runtime::agi_container::AgiContainerBuilder;
+use rvf_types::agi_container::*;
+
+let (payload, header) = AgiContainerBuilder::new(container_id, build_id)
+    .with_model_id("claude-opus-4-6")
+    .with_orchestrator(b"{\"max_turns\":100}")
+    .with_tool_registry(b"[{\"name\":\"search\",\"type\":\"rvf_query\"}]")
+    .with_eval_tasks(b"[{\"id\":1,\"spec\":\"fix bug\"}]")
+    .with_eval_graders(b"[{\"type\":\"test_pass\"}]")
+    .with_authority_config(b"{\"level\":\"WriteMemory\"}")
+    .with_coherence_config(b"{\"min_cut\":0.7,\"rollback\":true}")
+    .with_project_instructions(b"# CLAUDE.md\nFix bugs, run tests.")
+    .with_segments(ContainerSegments {
+        kernel_present: true, manifest_present: true,
+        world_model_present: true, ..Default::default()
+    })
+    .build_and_sign(signing_key)?;
+
+// Parse and validate
+let manifest = ParsedAgiManifest::parse(&payload)?;
+assert_eq!(manifest.model_id_str(), Some("claude-opus-4-6"));
+assert!(manifest.is_autonomous_capable());
+assert!(header.is_signed());
+```
+
+See [ADR-036](../../docs/adr/ADR-036-agi-cognitive-container.md) for the full specification.
+
+## 📱 QR Cognitive Seed (ADR-034)
+
+A QR Cognitive Seed (RVQS) encodes a portable intelligence capsule into a scannable QR code. It carries bootstrap hosts, layer hashes, and cryptographic signatures in a compact binary format.
+
+```rust
+use rvf_runtime::seed_crypto;
+
+let hash = seed_crypto::seed_content_hash(data); // 8-byte SHAKE-256
+let sig = seed_crypto::sign_seed(key, payload); // 32-byte HMAC
+let ok = seed_crypto::verify_seed(key, payload, &sig);
+```
+
+Types: `SeedHeader`, `HostEntry`, `LayerEntry` (rvf-types), plus `qr_encode` for QR matrix generation (rvf-runtime).
+
+## 🔒 Quality & Safety Net
+
+The quality system tracks retrieval fidelity across progressive index layers and enforces graceful degradation when budgets are exceeded.
+
+- `RetrievalQuality` — Full / Partial / Degraded / Failed
+- `ResponseQuality` — per-query quality metadata with evidence
+- `SafetyNetBudget` — time, token, and cost budgets with automatic clamping
+- `DegradationReport` — structured fallback path and reason tracking
+
+## 🛡️ Security Hardening
+
+Built-in defenses against adversarial inputs and resource exhaustion:
+
+- `SecurityPolicy` / `HardeningFields` — declarative security configuration (rvf-types)
+- `adversarial` module — input validation and tamper detection (rvf-runtime)
+- `dos` module — rate limiting and resource exhaustion guards (rvf-runtime)
+
+## 🧬 WASM Self-Bootstrapping (0x10)
+
+WASM_SEG enables an RVF file to carry its own WASM interpreter, creating a three-layer bootstrap stack:
+
+```
+Raw bytes → WASM interpreter → microkernel → vector data
+```
+
+Types: `WasmRole` (Interpreter/Microkernel/Solver), `WasmTarget` (Browser/Node/Edge/Embedded), `WasmHeader` (rvf-types/wasm_bootstrap).
+
+The `rvf-solver-wasm` crate implements a Thompson Sampling temporal solver as a `no_std` WASM module with `dlmalloc`, producing segment types `TRANSFER_PRIOR` (0x30), `POLICY_KERNEL` (0x31), and `COST_CURVE` (0x32).
+
+---
+
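
The segment codes quoted in the WASM Self-Bootstrapping section (0x10, 0x30, 0x31, 0x32), together with the README's format rule that old tools skip segment types they do not understand, can be sketched as a small decoder. This is a hedged illustration: `SegmentKind`, `from_code`, `code`, and `known_segments` are hypothetical names for this sketch, the `u8` width of the type code is an assumption, and only four of the 24 segment types are modeled.

```rust
/// Hypothetical subset of segment kinds, limited to the codes quoted above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SegmentKind {
    Wasm,
    TransferPrior,
    PolicyKernel,
    CostCurve,
}

impl SegmentKind {
    /// Maps a raw type code to a known kind.
    /// `None` means the code is unknown to this reader, not that the file is invalid.
    fn from_code(code: u8) -> Option<Self> {
        match code {
            0x10 => Some(SegmentKind::Wasm),
            0x30 => Some(SegmentKind::TransferPrior),
            0x31 => Some(SegmentKind::PolicyKernel),
            0x32 => Some(SegmentKind::CostCurve),
            _ => None,
        }
    }

    /// Inverse of `from_code` for the kinds modeled here.
    fn code(self) -> u8 {
        match self {
            SegmentKind::Wasm => 0x10,
            SegmentKind::TransferPrior => 0x30,
            SegmentKind::PolicyKernel => 0x31,
            SegmentKind::CostCurve => 0x32,
        }
    }
}

/// Keeps the segment codes this reader understands and skips the rest
/// instead of failing, which is the backward-compatibility behavior
/// the README describes.
fn known_segments(codes: &[u8]) -> Vec<SegmentKind> {
    codes
        .iter()
        .filter_map(|&code| SegmentKind::from_code(code))
        .collect()
}
```

Returning `Option` rather than an error keeps an unknown code from aborting a scan, so a reader built before the solver segments existed would simply pass over 0x30 through 0x32.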

45 Runnable Examples @@ -906,12 +1157,13 @@ cargo run --example | 42 | [`membership_filter`](../../examples/rvf/examples/membership_filter.rs) | Include/exclude bitmap filters for shared HNSW traversal | | 43 | [`snapshot_freeze`](../../examples/rvf/examples/snapshot_freeze.rs) | Generation snapshots, immutable freeze, generation tracking | -#### Appliance & Generation (2) +#### Appliance & Generation (3) | # | Example | What It Demonstrates | |---|---------|---------------------| | 44 | [`claude_code_appliance`](../../examples/rvf/examples/claude_code_appliance.rs) | Bootable AI dev environment: real kernel + eBPF + vectors + witness + crypto | -| 45 | [`generate_all`](../../examples/rvf/examples/generate_all.rs) | Batch generation of all 45 example `.rvf` files | +| 45 | [`live_boot_proof`](../../examples/rvf/examples/live_boot_proof.rs) | Docker-boot an `.rvf`, SSH in, verify segments are live and operational | +| 46 | [`generate_all`](../../examples/rvf/examples/generate_all.rs) | Batch generation of all example `.rvf` files | See the [examples README](../../examples/rvf/README.md) for tutorials, usage patterns, and detailed walkthroughs. 
@@ -1317,10 +1569,14 @@ let dist = hamming_distance(&bits_a, &bits_b); | `0x0D` | META_IDX | Metadata inverted indexes | | `0x0E` | KERNEL | Compressed unikernel image (self-booting) | | `0x0F` | EBPF | eBPF program for kernel-level acceleration | +| `0x10` | WASM | WASM microkernel / self-bootstrapping bytecode | | `0x20` | COW_MAP | Cluster ownership map (local vs parent) | | `0x21` | REFCOUNT | Cluster reference counts (rebuildable) | | `0x22` | MEMBERSHIP | Vector visibility filter for branches | | `0x23` | DELTA | Sparse delta patches (LoRA overlays) | +| `0x30` | TRANSFER_PRIOR | Transfer learning prior distributions | +| `0x31` | POLICY_KERNEL | Thompson Sampling policy kernels | +| `0x32` | COST_CURVE | Cost/reward curves for solver | ### Segment Flags @@ -1700,10 +1956,14 @@ cargo run --example claude_code_appliance # File size: 17 KB — sealed cognitive container ``` -### Integration Test Suite: 46/46 Passing +### Test Suite: 1,156 Passing ```bash cargo test --workspace +# agi_e2e .................. 12 passed +# adr033_integration ....... 34 passed +# qr_seed_e2e .............. 11 passed +# witness_e2e .............. 10 passed # attestation .............. 6 passed # crypto ................... 10 passed # computational_container .. 8 passed @@ -1711,10 +1971,11 @@ cargo test --workspace # cross_platform ........... 6 passed # lineage .................. 4 passed # smoke .................... 4 passed -# Total: 46/46 integration tests passed +# + unit tests across all crates +# Total: 1,156 tests passed ``` -### Generate All 45 Example Files +### Generate All 46 Example Files ```bash cd examples/rvf && cargo run --example generate_all @@ -1731,7 +1992,21 @@ cd ruvector/crates/rvf cargo test --workspace ``` -All contributions must pass `cargo clippy --all-targets` with zero warnings and maintain the existing test count (currently 795+). 
+All contributions must pass `cargo clippy --all-targets` with zero warnings and maintain the existing test count (currently 1,156+). + +### Architecture Decision Records + +| ADR | Title | +|-----|-------| +| [ADR-030](docs/adr/ADR-030-rvf-computational-container.md) | RVF Cognitive Container (Kernel, eBPF, WASM tiers) | +| [ADR-031](docs/adr/ADR-031-rvcow-branching-and-real-cognitive-containers.md) | RVCOW Branching & Real Cognitive Containers | +| [ADR-033](../../docs/adr/ADR-033-progressive-indexing-hardening.md) | Progressive Indexing Hardening | +| [ADR-034](../../docs/adr/ADR-034-qr-cognitive-seed.md) | QR Cognitive Seed (RVQS) | +| [ADR-035](../../docs/adr/ADR-035-capability-report.md) | Capability Report | +| [ADR-036](../../docs/adr/ADR-036-agi-cognitive-container.md) | AGI Cognitive Container | +| [ADR-037](../../docs/adr/ADR-037-publishable-rvf-acceptance-test.md) | Publishable RVF Acceptance Tests | +| [ADR-038](../../docs/adr/ADR-038-npx-ruvector-rvlite-witness-integration.md) | npx ruvector rvlite Witness Integration | +| [ADR-039](../../docs/adr/ADR-039-rvf-solver-wasm-agi-integration.md) | RVF Solver WASM AGI Integration | ## 📄 License diff --git a/crates/rvf/rvf-node/README.md b/crates/rvf/rvf-node/README.md index 95e996077..ec20b738b 100644 --- a/crates/rvf/rvf-node/README.md +++ b/crates/rvf/rvf-node/README.md @@ -1,132 +1,278 @@ -# rvf-node +# @ruvector/rvf-node -Node.js N-API bindings for native RuVector Format operations. +Native Node.js bindings for the [RuVector Format](https://github.com/ruvnet/ruvector/tree/main/crates/rvf) (RVF) vector database. Built with Rust via N-API for native speed with zero serialization overhead. 
-## Overview - -`rvf-node` exposes the RVF runtime to Node.js via N-API for high-performance vector operations without leaving JavaScript: - -- **Async API** -- non-blocking vector store operations -- **Native speed** -- Rust-compiled N-API addon, no serialization overhead -- **Cross-platform** -- builds for Linux, macOS, and Windows -- **Full feature parity** -- lineage, kernel/eBPF embedding, segment inspection - -## Build +## Install ```bash -cd crates/rvf/rvf-node -npm run build +npm install @ruvector/rvf-node ``` -## API +## Features + +- **Native Rust performance** via N-API (napi-rs), no FFI marshaling +- **Single-file vector database** — crash-safe, no WAL, append-only +- **k-NN search** with HNSW progressive indexing (recall 0.70 → 0.95) +- **Metadata filtering** — Eq, Ne, Lt, Gt, Range, In, And, Or, Not +- **Lineage tracking** — DNA-style parent/child derivation chains +- **Kernel & eBPF embedding** — embed compute alongside vector data +- **Segment inspection** — enumerate all segments in the file +- **Cross-platform** — Linux (x86_64, aarch64), macOS (x86_64, Apple Silicon), Windows (x86_64) + +## Quick Start + +```javascript +const { RvfDatabase } = require('@ruvector/rvf-node'); + +// Create a store +const db = RvfDatabase.create('vectors.rvf', { + dimension: 384, + metric: 'cosine', +}); + +// Insert vectors +const vectors = new Float32Array(384 * 2); // 2 vectors, 384 dims each +vectors.fill(0.1); +db.ingestBatch(vectors, [1, 2]); + +// Query nearest neighbors +const query = new Float32Array(384); +query.fill(0.15); +const results = db.query(query, 5); +// [{ id: 1, distance: 0.002 }, { id: 2, distance: 0.002 }] + +db.close(); +``` + +## API Reference ### Store Lifecycle ```typescript -import { RvfDatabase } from '@ruvector/rvf'; +// Create a new store +const db = RvfDatabase.create(path: string, options: RvfOptions); -// Create -const db = RvfDatabase.create('/tmp/store.rvf', { dimension: 128, metric: 'cosine' }); +// Open existing store 
(read-write, acquires writer lock) +const db = RvfDatabase.open(path: string); -// Open for read-write -const db = RvfDatabase.open('/tmp/store.rvf'); +// Open read-only (no lock, concurrent readers allowed) +const db = RvfDatabase.openReadonly(path: string); -// Open read-only (no lock) -const db = RvfDatabase.openReadonly('/tmp/store.rvf'); - -// Close +// Close and flush db.close(); ``` -### Vector Operations +**RvfOptions:** + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `dimension` | `number` | required | Vector dimensionality | +| `metric` | `string` | `"l2"` | `"l2"`, `"cosine"`, or `"inner_product"` | +| `profile` | `number` | `0` | Hardware profile: 0=Generic, 1=Core, 2=Hot, 3=Full | +| `signing` | `boolean` | `false` | Enable segment signing | +| `m` | `number` | `16` | HNSW M parameter (neighbor count) | +| `efConstruction` | `number` | `200` | HNSW index build quality | + +### Ingest Vectors ```typescript -// Ingest vectors -const vectors = new Float32Array([1,0,0,0, 0,1,0,0]); -const result = db.ingestBatch(vectors, [0, 1]); -// { accepted: 2, rejected: 0, epoch: 1 } +const result = db.ingestBatch( + vectors: Float32Array, // flat array of n * dimension floats + ids: number[], // vector IDs + metadata?: RvfMetadataEntry[] // optional metadata per vector +); +// Returns: { accepted: number, rejected: number, epoch: number } +``` -// Query -const results = db.query(new Float32Array([1,0,0,0]), 5); -// [{ id: 0, distance: 0.0 }, { id: 1, distance: 1.414 }] +**Metadata entry format:** -// Query with filter -const results = db.query(new Float32Array([1,0,0,0]), 5, { - filter: '{"op":"eq","fieldId":0,"valueType":"string","value":"cat_a"}' +```typescript +{ fieldId: 0, valueType: 'string', value: 'category_a' } +{ fieldId: 1, valueType: 'f64', value: '0.95' } +{ fieldId: 2, valueType: 'u64', value: '42' } +``` + +### Query + +```typescript +const results = db.query( + vector: Float32Array, // query vector + k: number, 
// number of neighbors + options?: RvfQueryOptions // optional search parameters +); +// Returns: [{ id: number, distance: number }, ...] +``` + +**RvfQueryOptions:** + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `efSearch` | `number` | `100` | HNSW search quality (higher = better recall, slower) | +| `filter` | `string` | — | Filter expression as JSON string | +| `timeoutMs` | `number` | `0` | Query timeout in ms (0 = no timeout) | + +### Filter Expressions + +Filters are passed as JSON strings. All leaf filters require `fieldId`, `valueType`, and `value`: + +```javascript +// Equality +db.query(vec, 10, { + filter: '{"op":"eq","fieldId":0,"valueType":"string","value":"science"}' }); +// Range +db.query(vec, 10, { + filter: '{"op":"range","fieldId":1,"valueType":"f64","low":"0.5","high":"1.0"}' +}); + +// In-set +db.query(vec, 10, { + filter: '{"op":"in","fieldId":0,"valueType":"u64","values":["1","2","5"]}' +}); + +// Boolean combinations +db.query(vec, 10, { + filter: JSON.stringify({ + op: 'and', + children: [ + { op: 'eq', fieldId: 0, valueType: 'string', value: 'science' }, + { op: 'gt', fieldId: 1, valueType: 'f64', value: '0.8' } + ] + }) +}); + +// Negation +db.query(vec, 10, { + filter: '{"op":"not","child":{"op":"eq","fieldId":0,"valueType":"string","value":"spam"}}' +}); +``` + +**Supported operators:** `eq`, `ne`, `lt`, `le`, `gt`, `ge`, `in`, `range`, `and`, `or`, `not` + +**Supported value types:** `u64`, `i64`, `f64`, `string`, `bool` + +### Delete + +```typescript // Delete by ID -db.delete([0, 1]); +const result = db.delete([1, 2, 3]); +// Returns: { deleted: number, epoch: number } // Delete by filter -db.deleteByFilter('{"op":"gt","fieldId":1,"valueType":"f64","value":"0.5"}'); +const result = db.deleteByFilter( + '{"op":"gt","fieldId":1,"valueType":"f64","value":"0.9"}' +); +``` -// Compact -const compaction = db.compact(); -// { segmentsCompacted: 2, bytesReclaimed: 4096, epoch: 3 } +### Compact -// 
Status +Reclaims space from deleted vectors: + +```typescript +const result = db.compact(); +// Returns: { segmentsCompacted: number, bytesReclaimed: number, epoch: number } +``` + +### Status + +```typescript const status = db.status(); -// { totalVectors, totalSegments, fileSize, currentEpoch, ... } +// { +// totalVectors: number, +// totalSegments: number, +// fileSize: number, +// currentEpoch: number, +// profileId: number, +// compactionState: 'idle' | 'running' | 'emergency', +// deadSpaceRatio: number, +// readOnly: boolean +// } ``` -### Lineage +### Lineage & Derivation + +RVF tracks parent/child relationships with cryptographic hashes: ```typescript -// Get file identity -const fileId = db.fileId(); // "a1b2c3d4..." -const parentId = db.parentId(); // "00000000..." for root -const depth = db.lineageDepth(); // 0 for root +db.fileId(); // hex string — unique file identifier +db.parentId(); // hex string — parent's ID (zeros if root) +db.lineageDepth(); // 0 for root files -// Derive a child store -const child = db.derive('/tmp/child.rvf', { dimension: 128 }); +// Derive a child store (inherits dimensions and options) +const child = db.derive('/tmp/child.rvf'); child.lineageDepth(); // 1 +child.parentId(); // matches parent's fileId() ``` -### Kernel / eBPF Embedding +### Kernel & eBPF Embedding + +Embed compute segments alongside vector data: ```typescript -// Embed a kernel image -const kernelImage = Buffer.from(fs.readFileSync('kernel.bin')); -const segId = db.embedKernel( - 1, // arch: x86_64 - 0, // kernel_type - 0, // flags - kernelImage, // image bytes - 9000, // api_port - 'root=/dev/sda' // cmdline (optional) +// Embed a Linux microkernel +db.embedKernel( + 1, // arch: 0=x86_64, 1=aarch64 + 0, // kernel type + 0, // flags + Buffer.from(kernelImage), // kernel binary + 8080, // API port + 'console=ttyS0 quiet' // kernel cmdline (optional) ); // Extract kernel const kernel = db.extractKernel(); if (kernel) { - // kernel.header: Buffer (128-byte 
KernelHeader) - // kernel.image: Buffer (kernel image bytes) + console.log(kernel.header); // Buffer: 128-byte KernelHeader + console.log(kernel.image); // Buffer: kernel image bytes } -// Embed an eBPF program -const ebpfCode = Buffer.from(fs.readFileSync('program.o')); -db.embedEbpf(1, 2, 128, ebpfCode); +// Embed an eBPF XDP program +db.embedEbpf( + 1, // program type (XDP distance) + 2, // attach type (XDP ingress) + 384, // max vector dimension + Buffer.from(bytecode), // BPF ELF object + Buffer.from(btf) // optional BTF section +); // Extract eBPF const ebpf = db.extractEbpf(); if (ebpf) { - // ebpf.header: Buffer (64-byte EbpfHeader) - // ebpf.payload: Buffer (bytecode + optional BTF) + console.log(ebpf.header); // Buffer: 64-byte EbpfHeader + console.log(ebpf.payload); // Buffer: bytecode + BTF } ``` ### Segment Inspection ```typescript -// List all segments const segments = db.segments(); -// [{ id: 1, offset: 0, payloadLength: 4096, segType: "vec" }, ...] +// [{ id: 1, offset: 0, payloadLength: 4096, segType: 'manifest' }, +// { id: 2, offset: 4160, payloadLength: 51200, segType: 'vec' }, +// { id: 3, offset: 55424, payloadLength: 12288, segType: 'index' }] -// Get dimension -const dim = db.dimension(); // 128 +db.dimension(); // 384 ``` +## Build from Source + +```bash +# Prerequisites: Rust 1.87+, Node.js 18+ +cd crates/rvf/rvf-node +npm install +npm run build +``` + +## Related Packages + +| Package | Description | +|---------|-------------| +| [`@ruvector/rvf`](https://www.npmjs.com/package/@ruvector/rvf) | Unified TypeScript SDK | +| [`@ruvector/rvf-wasm`](https://www.npmjs.com/package/@ruvector/rvf-wasm) | Browser WASM package | +| [`@ruvector/rvf-mcp-server`](https://www.npmjs.com/package/@ruvector/rvf-mcp-server) | MCP server for AI agents | +| [`rvf-runtime`](https://crates.io/crates/rvf-runtime) | Rust runtime (powers this package) | + ## License MIT OR Apache-2.0 diff --git a/examples/rvf/Cargo.lock b/examples/rvf/Cargo.lock index 
e72bc7caf..5c7fa6ea0 100644 --- a/examples/rvf/Cargo.lock +++ b/examples/rvf/Cargo.lock @@ -441,7 +441,7 @@ dependencies = [ [[package]] name = "rvf-crypto" -version = "0.1.0" +version = "0.2.0" dependencies = [ "ed25519-dalek", "rvf-types", @@ -517,14 +517,14 @@ dependencies = [ [[package]] name = "rvf-runtime" -version = "0.1.0" +version = "0.2.0" dependencies = [ "rvf-types", ] [[package]] name = "rvf-types" -version = "0.1.0" +version = "0.2.0" [[package]] name = "rvf-wire" diff --git a/examples/rvf/README.md b/examples/rvf/README.md index 24aac4f56..421097843 100644 --- a/examples/rvf/README.md +++ b/examples/rvf/README.md @@ -353,12 +353,46 @@ cargo run --example network_interfaces # Network OS telemetry (60 interfaces) |---------|---------|-------------| | DNA-Style Lineage | (API) | Every derived file records its parent's hash and derivation type | | Domain Profiles | (API) | `.rvdna`, `.rvtext`, `.rvgraph`, `.rvvis` — same format, domain-specific hints | -| Computational Container | (API) | Embed a WASM microkernel, eBPF program, or bootable unikernel | +| Computational Container | `claude_code_appliance` | Embed a WASM microkernel, eBPF program, or bootable unikernel | +| Self-Booting Appliance | `claude_code_appliance` | 5.1 MB `.rvf` — boots Linux, serves queries, runs Claude Code | | Import (JSON/CSV/NumPy) | (API) | Load embeddings from `.json`, `.csv`, or `.npy` files via `rvf-import` or `rvf ingest` CLI | | Unified CLI | `rvf` | 9 subcommands: create, ingest, query, delete, status, inspect, compact, derive, serve | | Compaction | (API) | Garbage-collect tombstoned vectors and reclaim disk space | | Batch Delete | (API) | Delete vectors by ID with tombstone markers | +### Self-Booting RVF — Claude Code Appliance + +The `claude_code_appliance` example builds a complete self-booting AI development environment as a single `.rvf` file. 
It uses real infrastructure — a Docker-built Linux kernel, Ed25519 SSH keys, a BPF C socket filter, and a cryptographic witness chain. + +```bash +cd examples/rvf +cargo run --example claude_code_appliance +``` + +**What it produces** (5.1 MB file): + +``` +claude_code_appliance.rvf + ├── KERNEL_SEG Linux 6.8.12 bzImage (5.2 MB, x86_64) + ├── EBPF_SEG Socket filter — allows ports 2222, 8080 only + ├── VEC_SEG 20 package embeddings (128-dim) + ├── INDEX_SEG HNSW graph for package search + ├── WITNESS_SEG 6-entry tamper-evident audit trail + ├── CRYPTO_SEG 3 Ed25519 SSH user keys (root, deploy, claude) + ├── MANIFEST_SEG 4 KB root with segment directory + └── Snapshot v1 derived image with lineage tracking +``` + +**Boot and connect:** + +```bash +rvf launch claude_code_appliance.rvf # Boot on QEMU/Firecracker +ssh -p 2222 deploy@localhost # SSH in +curl -s localhost:8080/query -d '{"vector":[0.1,...], "k":5}' +``` + +Final file: **5.1 MB single `.rvf`** — boots Linux, serves queries, runs Claude Code. +
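The `curl` call in the boot-and-connect step above can also be issued from Node.js. The following is a minimal sketch, not the appliance's documented client. Only the `{ vector, k }` body shape, the `/query` path, port 8080, and the 128-dim embedding size come from the README text. The helper names `buildQueryBody` and `queryAppliance`, the `Content-Type` header, and the treatment of the response as opaque JSON are assumptions made for illustration.

```typescript
// Hypothetical helper: builds the JSON body for the appliance's /query endpoint.
// The { vector, k } shape mirrors the curl example; the validation rules are assumed.
function buildQueryBody(vector: number[], k: number, dimension = 128): string {
  if (vector.length !== dimension) {
    throw new Error(`expected ${dimension} components, got ${vector.length}`);
  }
  if (!Number.isInteger(k) || k <= 0) {
    throw new Error("k must be a positive integer");
  }
  return JSON.stringify({ vector, k });
}

// Hypothetical helper: POSTs the query to a booted appliance.
// The response format is not described in the README, so it is returned as unknown.
async function queryAppliance(
  vector: number[],
  k: number,
  baseUrl = "http://localhost:8080"
): Promise<unknown> {
  const res = await fetch(`${baseUrl}/query`, {
    method: "POST",
    // Assumption: the server accepts JSON with this header (curl -d sends a form type by default).
    headers: { "Content-Type": "application/json" },
    body: buildQueryBody(vector, k),
  });
  if (!res.ok) {
    throw new Error(`query failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

With the appliance running, `await queryAppliance(new Array(128).fill(0.1), 5)` would send the same request as the `curl` line. Checking the vector length on the client side is a design choice to surface dimension mismatches before a network round trip.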
diff --git a/examples/rvf/output/claude_code_appliance.rvf b/examples/rvf/output/claude_code_appliance.rvf index 6d5080682..a8f8748f1 100644 Binary files a/examples/rvf/output/claude_code_appliance.rvf and b/examples/rvf/output/claude_code_appliance.rvf differ diff --git a/examples/rvf/output/claude_code_appliance_v1.rvf b/examples/rvf/output/claude_code_appliance_v1.rvf new file mode 100644 index 000000000..32698bec7 Binary files /dev/null and b/examples/rvf/output/claude_code_appliance_v1.rvf differ diff --git a/npm/packages/rvf-mcp-server/package.json b/npm/packages/rvf-mcp-server/package.json index ed1b94a6d..50e0bd17a 100644 --- a/npm/packages/rvf-mcp-server/package.json +++ b/npm/packages/rvf-mcp-server/package.json @@ -1,6 +1,6 @@ { "name": "@ruvector/rvf-mcp-server", - "version": "0.1.1", + "version": "0.1.3", "description": "MCP server for RuVector Format (RVF) vector database — stdio and SSE transports", "type": "module", "main": "dist/index.js", diff --git a/npm/packages/rvf-node/README.md b/npm/packages/rvf-node/README.md index fbe3fff76..362d198a6 100644 --- a/npm/packages/rvf-node/README.md +++ b/npm/packages/rvf-node/README.md @@ -1,6 +1,6 @@ # @ruvector/rvf-node -Node.js native bindings for the RuVector Format (RVF) cognitive container. +Native Node.js bindings for the [RuVector Format](https://github.com/ruvnet/ruvector/tree/main/crates/rvf) (RVF) vector database. Built with Rust via N-API for native speed with zero serialization overhead. ## Install @@ -8,52 +8,296 @@ Node.js native bindings for the RuVector Format (RVF) cognitive container. 
npm install @ruvector/rvf-node ``` -## Usage +## Features + +- **Native Rust performance** via N-API (napi-rs), no FFI marshaling +- **Single-file vector database** — crash-safe, no WAL, append-only +- **k-NN search** with HNSW progressive indexing (recall 0.70 → 0.95) +- **Metadata filtering** — Eq, Ne, Lt, Gt, Range, In, And, Or, Not +- **Lineage tracking** — DNA-style parent/child derivation chains +- **Kernel & eBPF embedding** — embed compute alongside vector data +- **Segment inspection** — enumerate all segments in the file +- **Cross-platform** — Linux (x86_64, aarch64), macOS (x86_64, Apple Silicon), Windows (x86_64) + +## Quick Start ```javascript const { RvfDatabase } = require('@ruvector/rvf-node'); -// Create a vector store -const db = RvfDatabase.create('vectors.rvf', { dimension: 384 }); +// Create a store +const db = RvfDatabase.create('vectors.rvf', { + dimension: 384, + metric: 'cosine', +}); // Insert vectors -db.ingestBatch(new Float32Array(384), [1]); +const vectors = new Float32Array(384 * 2); // 2 vectors, 384 dims each +vectors.fill(0.1); +db.ingestBatch(vectors, [1, 2]); // Query nearest neighbors -const results = db.query(new Float32Array(384), 10); -console.log(results); // [{ id, distance }] - -// Inspect segments -console.log(db.fileId()); // unique file UUID -console.log(db.dimension()); // 384 -console.log(db.segments()); // [{ type, id, size }] +const query = new Float32Array(384); +query.fill(0.15); +const results = db.query(query, 5); +// [{ id: 1, distance: 0.002 }, { id: 2, distance: 0.002 }] db.close(); ``` -## Features +## API Reference -- Native performance via N-API bindings to Rust `rvf-runtime` -- Full store lifecycle: create, open, ingest, query, delete, compact -- Lineage tracking with FileIdentity derivation chains -- Kernel/eBPF segment inspection -- Cross-platform: Linux x64/arm64, macOS x64/arm64, Windows x64 +### Store Lifecycle -## API +```typescript +// Create a new store +const db = RvfDatabase.create(path: 
string, options: RvfOptions); -| Method | Description | -|--------|-------------| -| `RvfDatabase.create(path, options)` | Create a new RVF store | -| `RvfDatabase.open(path)` | Open an existing store | -| `db.ingestBatch(vectors, ids)` | Insert vectors | -| `db.query(vector, k)` | k-NN similarity search | -| `db.delete(ids)` | Delete vectors by ID | -| `db.compact()` | Reclaim deleted space | -| `db.status()` | Get store stats | -| `db.segments()` | List all segments | -| `db.fileId()` | Get unique file UUID | -| `db.close()` | Close and release lock | +// Open existing store (read-write, acquires writer lock) +const db = RvfDatabase.open(path: string); + +// Open read-only (no lock, concurrent readers allowed) +const db = RvfDatabase.openReadonly(path: string); + +// Close and flush +db.close(); +``` + +**RvfOptions:** + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `dimension` | `number` | required | Vector dimensionality | +| `metric` | `string` | `"l2"` | `"l2"`, `"cosine"`, or `"inner_product"` | +| `profile` | `number` | `0` | Hardware profile: 0=Generic, 1=Core, 2=Hot, 3=Full | +| `signing` | `boolean` | `false` | Enable segment signing | +| `m` | `number` | `16` | HNSW M parameter (neighbor count) | +| `efConstruction` | `number` | `200` | HNSW index build quality | + +### Ingest Vectors + +```typescript +const result = db.ingestBatch( + vectors: Float32Array, // flat array of n * dimension floats + ids: number[], // vector IDs + metadata?: RvfMetadataEntry[] // optional metadata per vector +); +// Returns: { accepted: number, rejected: number, epoch: number } +``` + +**Metadata entry format:** + +```typescript +{ fieldId: 0, valueType: 'string', value: 'category_a' } +{ fieldId: 1, valueType: 'f64', value: '0.95' } +{ fieldId: 2, valueType: 'u64', value: '42' } +``` + +### Query + +```typescript +const results = db.query( + vector: Float32Array, // query vector + k: number, // number of neighbors + options?: 
RvfQueryOptions // optional search parameters +); +// Returns: [{ id: number, distance: number }, ...] +``` + +**RvfQueryOptions:** + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `efSearch` | `number` | `100` | HNSW search quality (higher = better recall, slower) | +| `filter` | `string` | — | Filter expression as JSON string | +| `timeoutMs` | `number` | `0` | Query timeout in ms (0 = no timeout) | + +### Filter Expressions + +Filters are passed as JSON strings. All leaf filters require `fieldId`, `valueType`, and `value`: + +```javascript +// Equality +db.query(vec, 10, { + filter: '{"op":"eq","fieldId":0,"valueType":"string","value":"science"}' +}); + +// Range +db.query(vec, 10, { + filter: '{"op":"range","fieldId":1,"valueType":"f64","low":"0.5","high":"1.0"}' +}); + +// In-set +db.query(vec, 10, { + filter: '{"op":"in","fieldId":0,"valueType":"u64","values":["1","2","5"]}' +}); + +// Boolean combinations +db.query(vec, 10, { + filter: JSON.stringify({ + op: 'and', + children: [ + { op: 'eq', fieldId: 0, valueType: 'string', value: 'science' }, + { op: 'gt', fieldId: 1, valueType: 'f64', value: '0.8' } + ] + }) +}); + +// Negation +db.query(vec, 10, { + filter: '{"op":"not","child":{"op":"eq","fieldId":0,"valueType":"string","value":"spam"}}' +}); +``` + +**Supported operators:** `eq`, `ne`, `lt`, `le`, `gt`, `ge`, `in`, `range`, `and`, `or`, `not` + +**Supported value types:** `u64`, `i64`, `f64`, `string`, `bool` + +### Delete + +```typescript +// Delete by ID +const result = db.delete([1, 2, 3]); +// Returns: { deleted: number, epoch: number } + +// Delete by filter +const result = db.deleteByFilter( + '{"op":"gt","fieldId":1,"valueType":"f64","value":"0.9"}' +); +``` + +### Compact + +Reclaims space from deleted vectors: + +```typescript +const result = db.compact(); +// Returns: { segmentsCompacted: number, bytesReclaimed: number, epoch: number } +``` + +### Status + +```typescript +const status = db.status(); +// 
{ +// totalVectors: number, +// totalSegments: number, +// fileSize: number, +// currentEpoch: number, +// profileId: number, +// compactionState: 'idle' | 'running' | 'emergency', +// deadSpaceRatio: number, +// readOnly: boolean +// } +``` + +### Lineage & Derivation + +RVF tracks parent/child relationships with cryptographic hashes: + +```typescript +db.fileId(); // hex string — unique file identifier +db.parentId(); // hex string — parent's ID (zeros if root) +db.lineageDepth(); // 0 for root files + +// Derive a child store (inherits dimensions and options) +const child = db.derive('/tmp/child.rvf'); +child.lineageDepth(); // 1 +child.parentId(); // matches parent's fileId() +``` + +### Kernel & eBPF Embedding + +Embed compute segments alongside vector data: + +```typescript +// Embed a Linux microkernel +db.embedKernel( + 1, // arch: 0=x86_64, 1=aarch64 + 0, // kernel type + 0, // flags + Buffer.from(kernelImage), // kernel binary + 8080, // API port + 'console=ttyS0 quiet' // kernel cmdline (optional) +); + +// Extract kernel +const kernel = db.extractKernel(); +if (kernel) { + console.log(kernel.header); // Buffer: 128-byte KernelHeader + console.log(kernel.image); // Buffer: kernel image bytes +} + +// Embed an eBPF XDP program +db.embedEbpf( + 1, // program type (XDP distance) + 2, // attach type (XDP ingress) + 384, // max vector dimension + Buffer.from(bytecode), // BPF ELF object + Buffer.from(btf) // optional BTF section +); + +// Extract eBPF +const ebpf = db.extractEbpf(); +if (ebpf) { + console.log(ebpf.header); // Buffer: 64-byte EbpfHeader + console.log(ebpf.payload); // Buffer: bytecode + BTF +} +``` + +### Segment Inspection + +```typescript +const segments = db.segments(); +// [{ id: 1, offset: 0, payloadLength: 4096, segType: 'manifest' }, +// { id: 2, offset: 4160, payloadLength: 51200, segType: 'vec' }, +// { id: 3, offset: 55424, payloadLength: 12288, segType: 'index' }] + +db.dimension(); // 384 +``` + +## Self-Booting RVF + +An `.rvf` 
file can embed a Linux kernel, eBPF programs, and SSH keys alongside vector data — producing a single file that boots as a microservice. + +The Claude Code Appliance example builds a complete AI dev environment: + +```bash +cd examples/rvf +cargo run --example claude_code_appliance +``` + +``` +claude_code_appliance.rvf + ├── KERNEL_SEG Linux 6.8.12 bzImage (5.2 MB, x86_64) + ├── EBPF_SEG Socket filter — ports 2222, 8080 only + ├── VEC_SEG 20 package embeddings (128-dim) + ├── INDEX_SEG HNSW graph for package search + ├── WITNESS_SEG 6-entry tamper-evident audit trail + └── CRYPTO_SEG 3 Ed25519 SSH user keys +``` + +Final file: **5.1 MB single `.rvf`** — boots Linux, serves queries, runs Claude Code. + +See the [full RVF documentation](https://github.com/ruvnet/ruvector/tree/main/crates/rvf) for details. + +## Build from Source + +```bash +# Prerequisites: Rust 1.87+, Node.js 18+ +cd crates/rvf/rvf-node +npm install +npm run build +``` + +## Related Packages + +| Package | Description | +|---------|-------------| +| [`@ruvector/rvf`](https://www.npmjs.com/package/@ruvector/rvf) | Unified TypeScript SDK | +| [`@ruvector/rvf-wasm`](https://www.npmjs.com/package/@ruvector/rvf-wasm) | Browser WASM package | +| [`@ruvector/rvf-mcp-server`](https://www.npmjs.com/package/@ruvector/rvf-mcp-server) | MCP server for AI agents | +| [`rvf-runtime`](https://crates.io/crates/rvf-runtime) | Rust runtime (powers this package) | ## License -MIT +MIT OR Apache-2.0 diff --git a/npm/packages/rvf-node/package.json b/npm/packages/rvf-node/package.json index ad59e8827..dcfb9d465 100644 --- a/npm/packages/rvf-node/package.json +++ b/npm/packages/rvf-node/package.json @@ -1,6 +1,6 @@ { "name": "@ruvector/rvf-node", - "version": "0.1.1", + "version": "0.1.3", "description": "RuVector Format Node.js native bindings", "main": "index.js", "types": "index.d.ts", diff --git a/npm/packages/rvf-wasm/package.json b/npm/packages/rvf-wasm/package.json index d981cbf66..f7e4a9721 100644 --- 
a/npm/packages/rvf-wasm/package.json +++ b/npm/packages/rvf-wasm/package.json @@ -1,6 +1,6 @@ { "name": "@ruvector/rvf-wasm", - "version": "0.1.1", + "version": "0.1.3", "description": "RuVector Format WASM build for browsers", "main": "pkg/rvf_runtime.js", "types": "pkg/rvf_runtime.d.ts", diff --git a/npm/packages/rvf/README.md b/npm/packages/rvf/README.md index 5a12d6871..c6d8d702c 100644 --- a/npm/packages/rvf/README.md +++ b/npm/packages/rvf/README.md @@ -419,11 +419,13 @@ Build an AI development environment as a single sealed file: // - Ed25519 + ML-DSA-65 signatures let store = RvfStore::create("claude_code_appliance.rvf", options)?; // ... embed packages, kernel, eBPF, witness chain, signatures ... -// Result: 17 KB sealed cognitive container +// Result: 5.1 MB sealed cognitive container ``` Run: `cd examples/rvf && cargo run --example claude_code_appliance` +Final file: **5.1 MB single `.rvf`** — boots Linux, serves queries, runs Claude Code. + ### CLI Proof-of-Operations ```bash diff --git a/npm/packages/rvf/package.json b/npm/packages/rvf/package.json index b76e944fa..a9683454d 100644 --- a/npm/packages/rvf/package.json +++ b/npm/packages/rvf/package.json @@ -1,6 +1,6 @@ { "name": "@ruvector/rvf", - "version": "0.1.3", + "version": "0.1.5", "description": "RuVector Format — unified TypeScript SDK for vector intelligence", "main": "dist/index.js", "module": "dist/index.js",