mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-26 07:44:05 +00:00
RVF Acceptance Tests and Performance Targets
1. Primary Acceptance Test
Cold start on a 10 million vector file: load and answer the first query with a useful result (recall@10 >= 0.70) without reading more than the last 4 MB, then converge to full quality (recall@10 >= 0.95) as it progressively maps more segments.
Test Parameters
Dataset: 10 million vectors
Dimensions: 384 (sentence embedding size)
Base dtype: fp16 (768 bytes per vector)
Raw file size: ~7.2 GB (vectors only)
With index: ~10-12 GB total
Query set: 1000 queries from held-out test set
Ground truth: Brute-force exact k-NN (k=10)
Metric: L2 distance
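The size figures above follow directly from the dimension count and the dtype width. The sketch below reproduces that arithmetic; the constant names are illustrative and not part of any RVF API. Note that decimal gigabytes and binary gibibytes differ by about 7%, and the ~7.2 GB figure appears to correspond to the binary reading.

```python
# Back-of-envelope size check for the test parameters.
# All names here are illustrative, not taken from the RVF SDK.
VECTOR_COUNT = 10_000_000
DIMENSIONS = 384
BYTES_PER_FP16 = 2

bytes_per_vector = DIMENSIONS * BYTES_PER_FP16
raw_bytes = VECTOR_COUNT * bytes_per_vector

print(bytes_per_vector)              # bytes per fp16 vector
print(round(raw_bytes / 1e9, 2))     # size in decimal gigabytes
print(round(raw_bytes / 2**30, 2))   # size in binary gibibytes
```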
Success Criteria
| Phase | Time Budget | Data Read | Min Recall@10 | Description |
|---|---|---|---|---|
| Boot | < 5 ms | 4 KB (Level 0) | N/A | Parse root manifest |
| First query | < 50 ms | <= 4 MB | >= 0.70 | Layer A + hot cache |
| Working quality | < 500 ms | <= 200 MB | >= 0.85 | Layer A + B |
| Full quality | < 5 s | <= 4 GB | >= 0.95 | Layers A + B + C |
| Optimized | < 30 s | Full file | >= 0.98 | All layers + hot tier |
Measurement Methodology
1. Create RVF file from 10M vector dataset
- Build full HNSW index (M=16, ef_construction=200)
- Compute temperature tiers (default: all warm initially)
- Write with all segment types
2. Cold start measurement
- Flush dirty pages and drop the filesystem cache (as root): sync; echo 3 > /proc/sys/vm/drop_caches
- Open file, start timer
- Read Level 0 (4 KB), record time T_boot
- Read hotset data, record time T_hotset
- Execute first query, record time T_first_query and recall@10
- Continue progressive loading
- At each milestone: record time, data read, recall@10
3. Throughput measurement (warm)
- After full load, execute 1000 queries
- Measure queries per second (QPS)
- Measure p50, p95, p99 latency
- Measure recall@10 average
4. Streaming ingest measurement
- Start with empty file
- Ingest 10M vectors in streaming mode
- Measure ingest rate (vectors/second)
- Measure file size over time
- Verify crash safety (kill -9 at random points, verify recovery)
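Recall@10 in the steps above is measured against brute-force exact k-NN under L2 distance. A minimal pure-Python sketch of that computation follows; the function names are hypothetical, and a real harness would likely use a vectorized library for the 10M-vector ground truth.

```python
import math

def brute_force_knn(query, vectors, k):
    """Exact k-NN by L2 distance; returns the indices of the k closest vectors."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

def mean_recall_at_k(approx_results, exact_results, k):
    """Average recall@k over a query set (one result list per query)."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores)

# Tiny usage example with 2-D vectors and k=2.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
exact = brute_force_knn((0.0, 0.0), vectors, k=2)
print(exact, recall_at_k([0, 2], exact, k=2))
```

Averaging per-query recall, rather than pooling all neighbors, keeps each of the 1000 queries equally weighted.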
2. Performance Targets
Query Latency (10M vectors, 384 dim, fp16)
| Hardware | QPS (single thread) | p50 Latency | p95 Latency | p99 Latency |
|---|---|---|---|---|
| Desktop (AVX-512) | 5,000-15,000 | 0.1 ms | 0.3 ms | 1.0 ms |
| Desktop (AVX2) | 3,000-8,000 | 0.2 ms | 0.5 ms | 2.0 ms |
| Laptop (NEON) | 2,000-5,000 | 0.3 ms | 1.0 ms | 3.0 ms |
| WASM (browser) | 500-2,000 | 1.0 ms | 3.0 ms | 10.0 ms |
| Cognitum tile | 100-500 | 2.0 ms | 5.0 ms | 15.0 ms |
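The latency columns above are percentiles over per-query timings, and single-thread QPS follows from the total elapsed time. The sketch below shows one way to derive them using only the Python standard library; `run_query` is a hypothetical callable standing in for whatever query API is under test.

```python
import statistics
import time

def measure_latency(run_query, queries):
    """Time each query on a single thread; returns per-query latencies in ms."""
    latencies_ms = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms

def summarize(latencies_ms):
    """Report p50/p95/p99 and the single-thread QPS implied by total time."""
    # quantiles(n=100) returns 99 cut points; index i-1 is the i-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    total_s = sum(latencies_ms) / 1000.0
    return {
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
        "qps": len(latencies_ms) / total_s,
    }

# Usage with synthetic latencies of 1 ms, 2 ms, ..., 100 ms.
print(summarize([float(i) for i in range(1, 101)]))
```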
Streaming Ingest Rate
| Hardware | Vectors/Second | Bytes/Second | Notes |
|---|---|---|---|
| NVMe SSD | 200K-500K | 150-380 MB/s | fsync every 1000 vectors |
| SATA SSD | 50K-100K | 38-76 MB/s | fsync every 1000 vectors |
| HDD | 10K-30K | 7-23 MB/s | Sequential append |
| Network (1 Gbps) | 50K-100K | 38-76 MB/s | Streaming over network |
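The Bytes/Second column is the vector rate multiplied by the 768-byte fp16 payload. A short check of that conversion, assuming decimal megabytes and ignoring segment and manifest overhead (the helper name is illustrative):

```python
BYTES_PER_VECTOR = 384 * 2  # 384 dimensions, fp16

def ingest_bandwidth_mb_s(vectors_per_second):
    """Payload bandwidth in decimal MB/s for a given ingest rate."""
    return vectors_per_second * BYTES_PER_VECTOR / 1e6

for label, low, high in [
    ("NVMe SSD", 200_000, 500_000),
    ("SATA SSD", 50_000, 100_000),
    ("HDD", 10_000, 30_000),
]:
    print(label, ingest_bandwidth_mb_s(low), ingest_bandwidth_mb_s(high))
```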
Progressive Load Times
| Phase | NVMe SSD | SATA SSD | HDD | Network |
|---|---|---|---|---|
| Boot (4 KB) | < 0.1 ms | < 0.5 ms | < 10 ms | < 50 ms |
| First query (4 MB) | < 2 ms | < 10 ms | < 100 ms | < 500 ms |
| Working quality (200 MB) | < 100 ms | < 500 ms | < 5 s | < 20 s |
| Full quality (4 GB) | < 2 s | < 10 s | < 120 s | < 400 s |
Space Efficiency
| Configuration | Bytes/Vector | File Size (10M) | Ratio vs Raw |
|---|---|---|---|
| Raw fp32 | 1,536 | 14.3 GB | 1.0x |
| RVF uniform fp16 | 768 + overhead | 8.0 GB | 0.56x |
| RVF adaptive (equilibrium) | ~300 avg | 3.2 GB | 0.22x |
| RVF aggressive (binary cold) | ~100 avg | 1.1 GB | 0.08x |
3. Crash Safety Tests
Test 1: Kill During Vector Ingest
1. Start ingesting 1M vectors
2. After 500K vectors: kill -9 the writer
3. Verify: file is readable
4. Verify: latest valid manifest is found
5. Verify: all vectors referenced by latest manifest are intact
6. Verify: no data corruption (all segment hashes valid)
Pass criteria: Zero data loss for committed segments. At most the last incomplete segment is lost (bounded by fsync interval).
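The recovery property in this test can be illustrated with a toy append-only log. This is a sketch only: the length-plus-CRC32 framing is a hypothetical stand-in and not the RVF wire format, and a harness would interrupt the writer with kill -9 rather than call these functions directly.

```python
import os
import struct
import zlib

# Hypothetical framing: 4-byte payload length, 4-byte CRC32 of the payload.
HEADER = struct.Struct("<II")

def append_record(f, payload):
    """Append one framed record and force it to stable storage."""
    f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    f.write(payload)
    f.flush()
    os.fsync(f.fileno())

def recover(path):
    """Return every intact record from the start of the file, stopping at the first torn one."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # incomplete or corrupt tail: ignore it
        records.append(payload)
        offset = start + length
    return records
```

Truncating the file in the middle of the last record and calling `recover` models a writer killed mid-append: every fully written record is returned and the torn tail is dropped.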
Test 2: Kill During Manifest Write
1. Create file with 1M vectors
2. Trigger manifest rewrite (add metadata, trigger compaction)
3. Send kill -9 to the writer during the manifest write
4. Verify: file falls back to previous valid manifest
5. Verify: all queries work correctly with previous manifest
Pass criteria: Automatic fallback to previous manifest. No manual recovery needed.
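The fallback behavior can be sketched as "walk the manifest chain newest-first and take the first one whose checksum verifies". The structure below is an assumption made for illustration (JSON bodies with a CRC32), not the actual RVF manifest layout.

```python
import json
import zlib

def seal(manifest):
    """Serialize a manifest dict and attach a CRC32 of the serialized body."""
    body = json.dumps(manifest, sort_keys=True).encode()
    return {"body": body, "crc": zlib.crc32(body)}

def latest_valid_manifest(sealed_manifests):
    """Return the newest manifest whose checksum verifies, or None if none do."""
    for entry in reversed(sealed_manifests):
        if zlib.crc32(entry["body"]) == entry["crc"]:
            return json.loads(entry["body"])
    return None

# Usage: the newest manifest is damaged, so the reader falls back to epoch 1.
chain = [seal({"epoch": 1, "segments": [1, 2]}), seal({"epoch": 2, "segments": [1, 2, 3]})]
damaged = bytearray(chain[1]["body"])
damaged[0] ^= 0x01  # simulate a partially written manifest
chain[1]["body"] = bytes(damaged)
print(latest_valid_manifest(chain))
```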
Test 3: Kill During Compaction
1. Create file with 1M vectors across 100 small VEC_SEGs
2. Trigger compaction
3. Send kill -9 to the writer during compaction
4. Verify: file is readable (old segments still valid)
5. Verify: partial compaction output is safely ignored
Pass criteria: Old segments remain valid. Incomplete compaction output has no manifest reference and is safely orphaned.
Test 4: Bit Flip Detection
1. Create valid RVF file
2. Flip random bits in various locations
3. Verify: corruption detected by hash/CRC checks
4. Verify: specific corrupted segment identified
5. Verify: other segments still readable
Pass criteria: 100% detection of single-bit flips. Corruption isolated to affected segment.
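A per-segment checksum is enough to both detect a flipped bit and localize it to one segment. The sketch below uses CRC32 from `zlib` as a stand-in for whichever hash or CRC the file actually carries; the helper names are hypothetical.

```python
import zlib

def checksum_segments(segments):
    """Compute one CRC32 per segment payload."""
    return [zlib.crc32(s) for s in segments]

def flip_bit(data, bit_index):
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def find_corrupted(segments, expected):
    """Indices of segments whose current CRC32 no longer matches the recorded one."""
    return [i for i, (s, c) in enumerate(zip(segments, expected)) if zlib.crc32(s) != c]

# Usage: corrupt one bit in segment 1 and confirm only that segment is flagged.
segments = [b"vectors-a" * 8, b"vectors-b" * 8, b"index" * 8]
expected = checksum_segments(segments)
segments[1] = flip_bit(segments[1], 37)
print(find_corrupted(segments, expected))
```

CRC32 detects every single-bit error by construction, which is what makes the 100% pass criterion achievable for this class of fault.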
4. Scalability Tests
Test: 1 Billion Vectors
Dataset: 1B vectors, 384 dimensions, fp16
File size: ~700 GB (raw) -> ~200 GB (adaptive RVF)
Hardware: Server with 256 GB RAM, NVMe array
Verify:
- Boot time < 10 ms
- First query < 100 ms
- Full quality convergence < 60 s
- Recall@10 >= 0.95 at full quality
- Streaming ingest sustained at 100K+ vectors/second
Test: High Dimensionality
Dataset: 1M vectors, 4096 dimensions (LLM embeddings)
File size: ~8 GB (fp16)
Verify:
- PQ compression to 5-bit achieves >= 10x compression
- Recall@10 >= 0.90 with PQ
- Query latency < 5 ms (p95) with PQ + HNSW
Test: Multi-File Sharding
Dataset: 100M vectors across 10 shard files
Verify:
- Transparent query across all shards
- Shard addition without full rebuild
- Individual shard compaction
- Shard removal with manifest update only
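Transparent querying across shards reduces to merging each shard's local top-k into a global top-k. A minimal sketch, assuming each shard returns `(distance, shard_id, vector_id)` tuples (a hypothetical result shape):

```python
import heapq
from itertools import chain

def merge_shard_results(per_shard_results, k):
    """Merge per-shard candidate lists into the k globally nearest results."""
    return heapq.nsmallest(k, chain.from_iterable(per_shard_results))

# Usage: two shards, each contributing its local top-2.
shard_a = [(0.10, 0, 7), (0.40, 0, 3)]
shard_b = [(0.05, 1, 2), (0.30, 1, 9)]
print(merge_shard_results([shard_a, shard_b], k=3))
```

Because every shard contributes its own top-k, the merged result is exact with respect to the per-shard results; adding or removing a shard only changes the list of inputs.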
5. WASM Performance Tests
Browser Environment
Runtime: Chrome V8 / Firefox SpiderMonkey
SIMD: WASM v128
Memory: Limited to 4 GB WASM heap
Test: Load 1M vector RVF file via fetch()
- Boot time < 50 ms
- First query < 200 ms (after boot)
- QPS >= 500 (single thread)
- Memory usage < 500 MB
Cognitum Tile Simulation
Runtime: wasmtime with memory limits
Code limit: 8 KB
Data limit: 8 KB
Scratch: 64 KB
Test: Process 1000 blocks via hub protocol
- Distance computation matches reference implementation
- Top-K results match brute-force within quantization tolerance
- No memory access out of bounds
- Tile recovers from simulated faults
6. Interoperability Tests
Round-Trip Test
1. Create RVF file from numpy arrays
2. Read back with independent implementation
3. Verify: all vectors bit-identical
4. Verify: all metadata preserved
5. Verify: index produces same results
Profile Compatibility Test
1. Create RVDNA file with genomic data
2. Create RVText file with text embeddings
3. Read both with generic RVF reader
4. Verify: generic reader can access vectors and metadata
5. Verify: profile-specific features degrade gracefully
Version Forward Compatibility Test
1. Create RVF file with version 1
2. Add segments with hypothetical version 2 features (unknown tags)
3. Read with version 1 reader
4. Verify: version 1 reader skips unknown segments/tags
5. Verify: version 1 data is fully accessible
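Skipping unknown segments works when every segment declares its own length, so a reader can step over payloads it does not understand. The sketch below uses a hypothetical 5-byte header (type tag plus length) and made-up type tags; the real RVF segment header is larger and is defined elsewhere in the spec.

```python
import struct

SEG_HEADER = struct.Struct("<BI")  # hypothetical: 1-byte type tag, 4-byte payload length
KNOWN_TYPES = {0x01, 0x02}         # hypothetical tags a "version 1" reader understands

def encode_segment(seg_type, payload):
    """Frame a payload with its type tag and length."""
    return SEG_HEADER.pack(seg_type, len(payload)) + payload

def read_known_segments(blob):
    """Return (known segments, skipped type tags), stepping over unknown payloads."""
    known, skipped = [], []
    offset = 0
    while offset < len(blob):
        seg_type, length = SEG_HEADER.unpack_from(blob, offset)
        offset += SEG_HEADER.size
        if seg_type in KNOWN_TYPES:
            known.append((seg_type, blob[offset:offset + length]))
        else:
            skipped.append(seg_type)
        offset += length  # the declared length lets the reader skip what it cannot parse
    return known, skipped

# Usage: a file containing one segment type the reader has never seen.
blob = encode_segment(0x01, b"vec") + encode_segment(0x7F, b"future") + encode_segment(0x02, b"meta")
print(read_known_segments(blob))
```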
7. Security Tests
Signature Verification
1. Create signed RVF file (ML-DSA-65)
2. Verify all segment signatures
3. Modify one byte in a signed segment
4. Verify: modification detected
5. Verify: other segments still valid
Encryption Round-Trip
1. Create encrypted RVF file (ML-KEM-768 + AES-256-GCM)
2. Decrypt with correct key
3. Verify: plaintext matches original
4. Attempt decrypt with wrong key
5. Verify: decryption fails (GCM auth tag mismatch)
Key Rotation
1. Create file signed with key A
2. Rotate to key B (write CRYPTO_SEG rotation record)
3. Write new segments signed with key B
4. Verify: old segments valid with key A
5. Verify: new segments valid with key B
6. Verify: cross-signature in rotation record is valid
8. Benchmark Harness
Recommended Tools
| Purpose | Tool | Notes |
|---|---|---|
| Latency measurement | criterion (Rust) / benchmark.js | Statistical rigor |
| Recall measurement | Custom recall@K computation | Against brute-force ground truth |
| Memory profiling | valgrind massif / Chrome DevTools | Peak and sustained |
| I/O profiling | blktrace / iostat | Verify read patterns |
| SIMD verification | Intel SDE / ARM emulator | Correct SIMD codegen |
| Crash testing | Custom harness with kill -9 | Random timing |
Report Format
Each benchmark run produces a report:
```json
{
  "test_name": "cold_start_10m",
  "dataset": {
    "vector_count": 10000000,
    "dimensions": 384,
    "dtype": "fp16",
    "file_size_bytes": 10737418240
  },
  "hardware": {
    "cpu": "Intel Xeon w5-3435X",
    "simd": "AVX-512",
    "ram_gb": 256,
    "storage": "NVMe Samsung 990 Pro"
  },
  "results": {
    "boot_ms": 0.08,
    "first_query_ms": 12.3,
    "first_query_recall_at_10": 0.73,
    "working_quality_ms": 340,
    "working_quality_recall_at_10": 0.87,
    "full_quality_ms": 3200,
    "full_quality_recall_at_10": 0.96,
    "steady_state_qps": 8500,
    "steady_state_p50_ms": 0.12,
    "steady_state_p95_ms": 0.28,
    "steady_state_p99_ms": 0.85,
    "data_read_first_query_mb": 3.2,
    "data_read_working_quality_mb": 180
  }
}
```
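A report in this shape can be checked mechanically against the Success Criteria table from section 1. The sketch below is one possible checker, not part of the RVF tooling; the thresholds are copied from that table and the key names follow the sample report.

```python
# Thresholds copied from the Success Criteria table; key names follow the sample report.
MAX_TIME_MS = {
    "boot_ms": 5,
    "first_query_ms": 50,
    "working_quality_ms": 500,
    "full_quality_ms": 5000,
}
MIN_RECALL = {
    "first_query_recall_at_10": 0.70,
    "working_quality_recall_at_10": 0.85,
    "full_quality_recall_at_10": 0.95,
}
MAX_DATA_READ_MB = {
    "data_read_first_query_mb": 4,
    "data_read_working_quality_mb": 200,
}

def check_report(results):
    """Return a list of human-readable failures; an empty list means the run passed."""
    failures = []
    for key, limit in MAX_TIME_MS.items():
        if results[key] >= limit:
            failures.append(f"{key}={results[key]} is not < {limit}")
    for key, floor in MIN_RECALL.items():
        if results[key] < floor:
            failures.append(f"{key}={results[key]} is not >= {floor}")
    for key, limit in MAX_DATA_READ_MB.items():
        if results[key] > limit:
            failures.append(f"{key}={results[key]} is not <= {limit}")
    return failures

# Usage with the "results" object from the sample report.
sample = {
    "boot_ms": 0.08, "first_query_ms": 12.3, "first_query_recall_at_10": 0.73,
    "working_quality_ms": 340, "working_quality_recall_at_10": 0.87,
    "full_quality_ms": 3200, "full_quality_recall_at_10": 0.96,
    "data_read_first_query_mb": 3.2, "data_read_working_quality_mb": 180,
}
print(check_report(sample))
```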