ruvector/docs/research/rvf/microkernel/wasm-runtime.md
rUv f8870b3c71 feat(rvf): RuVector Format — Universal Cognitive Container SDK (#166)
2026-02-14 13:14:49 -05:00


# RVF WASM Microkernel and Cognitum Hardware Mapping
## 1. Design Philosophy
RVF must run on hardware ranging from a 64 KB WASM tile to a petabyte
cluster. The WASM microkernel is the minimal runtime that makes a tile
a first-class RVF citizen — capable of answering queries, ingesting
streams, and participating in distributed search.
The microkernel is not a shrunken version of the full runtime. It is a
**purpose-built execution core** that exposes the exact set of operations
a tile needs, and nothing more.
## 2. Cognitum Tile Architecture
### Hardware Constraints
```
+-----------------------------------+
|  Cognitum Tile                    |
|                                   |
|  Code Memory:   8 KB              |
|  Data Memory:   8 KB              |
|  SIMD Scratch:  64 KB             |
|  Registers:     v128 (WASM SIMD)  |
|  Clock:         ~1 GHz            |
|  Interconnect:  Mesh to hub       |
|                                   |
|  No filesystem. No mmap.          |
|  No allocator beyond scratch.     |
|  All I/O through hub messages.    |
+-----------------------------------+
```
### Memory Map
```
Code (8 KB):
  0x0000 - 0x0FFF   Microkernel WASM bytecode (4 KB)
  0x1000 - 0x17FF   Distance function hot path (2 KB)
  0x1800 - 0x1FFF   Decode / quantization stubs (2 KB)
Data (8 KB):
  0x0000 - 0x003F   Tile configuration (64 B)
  0x0040 - 0x00FF   Query scratch (192 B: query vector fp16)
  0x0100 - 0x01FF   Result buffer (256 B: top-K candidates)
  0x0200 - 0x03FF   Routing table (512 B: entry points + centroids)
  0x0400 - 0x07FF   Decode workspace (1 KB)
  0x0800 - 0x0FFF   Message I/O buffer (2 KB)
  0x1000 - 0x1FFF   Neighbor list cache (4 KB)
SIMD Scratch (64 KB):
  0x0000 - 0x7FFF   Vector block (32 KB: up to 42 vectors @ 384-dim fp16, 85 @ i8)
  0x8000 - 0xBFFF   Distance accumulator / PQ tables (16 KB)
  0xC000 - 0xEFFF   Hot cache subset (12 KB)
  0xF000 - 0xFFFF   Temporary / spill (4 KB)
```
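A layout like this is easy to check mechanically. The sketch below transcribes the data memory regions into a table and verifies that they are contiguous and add up to 8 KB. The `Region` type, the region names, and `contiguous_size` are illustrative names, not part of RVF.

```rust
/// One region of a tile memory bank (inclusive address range).
/// The struct and the region names are illustrative, not part of RVF.
pub struct Region {
    pub name: &'static str,
    pub start: u32,
    pub end: u32, // inclusive
}

/// Data memory layout, transcribed from the memory map.
pub const DATA_MAP: [Region; 7] = [
    Region { name: "tile_config", start: 0x0000, end: 0x003F },
    Region { name: "query_scratch", start: 0x0040, end: 0x00FF },
    Region { name: "result_buffer", start: 0x0100, end: 0x01FF },
    Region { name: "routing_table", start: 0x0200, end: 0x03FF },
    Region { name: "decode_workspace", start: 0x0400, end: 0x07FF },
    Region { name: "message_io", start: 0x0800, end: 0x0FFF },
    Region { name: "neighbor_cache", start: 0x1000, end: 0x1FFF },
];

/// Returns the total size in bytes if the regions start at address 0 and
/// follow each other without gaps or overlaps, otherwise None.
pub fn contiguous_size(regions: &[Region]) -> Option<u32> {
    let mut next = 0u32;
    for r in regions {
        if r.start != next || r.end < r.start {
            return None;
        }
        next = r.end + 1;
    }
    Some(next)
}
```

Calling `contiguous_size(&DATA_MAP)` yields `Some(8192)`, confirming that the seven data regions fill the 8 KB bank exactly.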
### Tile Budget
For 384-dim fp16 vectors:
- One vector: 384 * 2 B = 768 bytes
- SIMD scratch holds: 64 KB / 768 B = ~85 vectors in total; the 32 KB vector block region alone holds 42
- Top-K result buffer: 16 candidates * 16 B = 256 B
- Query vector: 768 B (larger than the 192 B query scratch region in the memory map)
A tile can process one block of up to ~85 vectors per block load, computing distances
and maintaining a top-K heap entirely within scratch memory.
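The budget arithmetic can be written as a pair of small functions. This is only a sketch; the function and constant names are illustrative.

```rust
/// Bytes per element for the two tile-side data types.
pub const FP16_BYTES: usize = 2;
pub const I8_BYTES: usize = 1;

/// Size of one vector in bytes.
pub const fn vector_bytes(dim: usize, elem_bytes: usize) -> usize {
    dim * elem_bytes
}

/// Number of whole vectors that fit in a region of `region_bytes`.
pub const fn vectors_per_region(region_bytes: usize, dim: usize, elem_bytes: usize) -> usize {
    region_bytes / vector_bytes(dim, elem_bytes)
}
```

For 384-dim fp16 this gives 85 vectors for the full 64 KB scratch and 42 for a 32 KB region. At i8, a 32 KB region holds 85 vectors.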
## 3. Microkernel Exports
The WASM microkernel exports exactly these functions:
```wat
;; === Core Query Path ===
;; Initialize tile with configuration
;; config_ptr: pointer to 64B tile config in data memory
(export "rvf_init" (func $rvf_init (param $config_ptr i32) (result i32)))
;; Load query vector into query scratch
;; query_ptr: pointer to fp16 vector in data memory
;; dim: vector dimensionality
(export "rvf_load_query" (func $rvf_load_query
(param $query_ptr i32) (param $dim i32) (result i32)))
;; Load a block of vectors into SIMD scratch
;; block_ptr: pointer to vector block in SIMD scratch
;; count: number of vectors
;; dtype: data type enum
(export "rvf_load_block" (func $rvf_load_block
(param $block_ptr i32) (param $count i32)
(param $dtype i32) (result i32)))
;; Compute distances between query and loaded block
;; metric: 0=L2, 1=IP, 2=cosine, 3=hamming
;; result_ptr: pointer to write distances
(export "rvf_distances" (func $rvf_distances
(param $metric i32) (param $result_ptr i32) (result i32)))
;; Merge distances into top-K heap
;; dist_ptr: pointer to distance array
;; id_ptr: pointer to vector ID array
;; count: number of candidates
;; k: top-K to maintain
(export "rvf_topk_merge" (func $rvf_topk_merge
(param $dist_ptr i32) (param $id_ptr i32)
(param $count i32) (param $k i32) (result i32)))
;; Read current top-K results
;; out_ptr: pointer to write results (id, distance pairs)
(export "rvf_topk_read" (func $rvf_topk_read
(param $out_ptr i32) (result i32)))
;; === Quantization ===
;; Load scalar quantization parameters (min/max per dim)
(export "rvf_load_sq_params" (func $rvf_load_sq_params
(param $params_ptr i32) (param $dim i32) (result i32)))
;; Dequantize int8 block to fp16 in SIMD scratch
(export "rvf_dequant_i8" (func $rvf_dequant_i8
(param $src_ptr i32) (param $dst_ptr i32)
(param $count i32) (result i32)))
;; Load PQ codebook subset
(export "rvf_load_pq_codebook" (func $rvf_load_pq_codebook
(param $codebook_ptr i32) (param $M i32)
(param $K i32) (result i32)))
;; Compute PQ asymmetric distances
(export "rvf_pq_distances" (func $rvf_pq_distances
(param $codes_ptr i32) (param $count i32)
(param $result_ptr i32) (result i32)))
;; === HNSW Navigation ===
;; Load neighbor list for a node
(export "rvf_load_neighbors" (func $rvf_load_neighbors
(param $node_id i64) (param $layer i32)
(param $out_ptr i32) (result i32)))
;; Greedy search step: given current node, find nearest neighbor
(export "rvf_greedy_step" (func $rvf_greedy_step
(param $current_id i64) (param $layer i32) (result i64)))
;; === Segment Verification ===
;; Verify segment header hash
(export "rvf_verify_header" (func $rvf_verify_header
(param $header_ptr i32) (result i32)))
;; Compute CRC32C of a data region
(export "rvf_crc32c" (func $rvf_crc32c
(param $data_ptr i32) (param $len i32) (result i32)))
```
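For reference, the checksum behind `rvf_crc32c` can be sketched in a few lines of Rust. This is a bit-at-a-time version of standard CRC32C (Castagnoli polynomial, reflected form `0x82F63B78`). It assumes the microkernel uses the standard initial value and final inversion, which this document does not spell out. A real tile build would more likely use a table-driven loop.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Initial value 0xFFFFFFFF and final inversion follow the standard
/// definition; whether RVF uses exactly these is an assumption.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}
```

The standard check value for the ASCII string `123456789` is `0xE3069283`, which makes a convenient self-test for any port of this routine.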
### Export Count
14 exports. Each maps to a tight inner loop that fits in the 8 KB code budget.
The host (hub) is responsible for all I/O, segment parsing, and orchestration.
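To illustrate what `rvf_topk_merge` and `rvf_topk_read` have to do, here is a host-side sketch using a bounded max-heap. It uses `std` collections, which a tile does not have, and the `Candidate` type is a hypothetical stand-in for the (id, distance) pairs. The real microkernel would operate on the fixed 256 B result buffer instead.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// A (distance, id) candidate. Names and layout are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub dist: f32,
    pub id: u64,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // Order by distance, then by id to break ties deterministically.
        self.dist.total_cmp(&other.dist).then(self.id.cmp(&other.id))
    }
}
impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Candidate {}

/// Merge one block of distances into a max-heap holding the k nearest so far.
pub fn topk_merge(heap: &mut BinaryHeap<Candidate>, dists: &[f32], ids: &[u64], k: usize) {
    for (&dist, &id) in dists.iter().zip(ids) {
        let c = Candidate { dist, id };
        if heap.len() < k {
            heap.push(c);
        } else if heap.peek().map_or(false, |worst| c < *worst) {
            // The new candidate beats the current worst: replace it.
            heap.pop();
            heap.push(c);
        }
    }
}

/// Read the current results in ascending distance order.
pub fn topk_read(heap: &BinaryHeap<Candidate>) -> Vec<Candidate> {
    let mut out: Vec<Candidate> = heap.iter().copied().collect();
    out.sort();
    out
}
```

Keeping the worst candidate at the top of the heap means each incoming distance needs only one comparison in the common case where it does not make the cut.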
## 4. Host-Tile Protocol
Communication between the hub and tile uses messages with a fixed 8-byte
header and a variable-length payload, passed through the 2 KB I/O buffer:
### Message Format
```
Offset  Size  Field       Description
------  ----  -----       -----------
0x00    2     msg_type    Message type enum
0x02    2     msg_length  Payload length
0x04    4     msg_id      Correlation ID
0x08    var   payload     Type-specific payload
```
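A header codec for this layout might look like the following sketch. The byte order is not stated in this document, so little-endian is an assumption here, as is the rule that header plus payload must fit in the 2 KB I/O buffer. `MsgHeader` and its methods are illustrative names.

```rust
/// Size of the fixed message header in bytes.
pub const HEADER_LEN: usize = 8;
/// Size of the tile's message I/O buffer in bytes.
pub const IO_BUFFER_LEN: usize = 2048;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MsgHeader {
    pub msg_type: u16,
    pub msg_length: u16,
    pub msg_id: u32,
}

impl MsgHeader {
    /// Serialize into 8 bytes. Little-endian is an assumption.
    pub fn encode(&self) -> [u8; HEADER_LEN] {
        let mut out = [0u8; HEADER_LEN];
        out[0..2].copy_from_slice(&self.msg_type.to_le_bytes());
        out[2..4].copy_from_slice(&self.msg_length.to_le_bytes());
        out[4..8].copy_from_slice(&self.msg_id.to_le_bytes());
        out
    }

    /// Parse a header, rejecting short buffers and payload lengths that
    /// would not fit in the I/O buffer alongside the header.
    pub fn decode(buf: &[u8]) -> Option<MsgHeader> {
        if buf.len() < HEADER_LEN {
            return None;
        }
        let header = MsgHeader {
            msg_type: u16::from_le_bytes([buf[0], buf[1]]),
            msg_length: u16::from_le_bytes([buf[2], buf[3]]),
            msg_id: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
        };
        if header.msg_length as usize > IO_BUFFER_LEN - HEADER_LEN {
            return None;
        }
        Some(header)
    }
}
```

Under these assumptions the largest payload is 2,040 bytes, so a 768 B query vector fits in a single message.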
### Message Types
```
Hub -> Tile:
  0x01  LOAD_QUERY      Send query vector (768 B for 384-dim fp16)
  0x02  LOAD_BLOCK      Send vector block (up to ~1.5 KB compressed)
  0x03  LOAD_NEIGHBORS  Send neighbor list for a node
  0x04  LOAD_PARAMS     Send quantization parameters
  0x05  COMPUTE         Trigger distance computation
  0x06  READ_TOPK       Request current top-K results
  0x07  RESET           Clear tile state for new query
Tile -> Hub:
  0x81  TOPK_RESULT     Top-K results (id, distance pairs)
  0x82  NEED_BLOCK      Request a specific vector block
  0x83  NEED_NEIGHBORS  Request neighbor list for a node
  0x84  DONE            Computation complete
  0x85  ERROR           Error with code
```
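The message type codes map naturally onto a Rust enum. In the sketch below, the enum and method names are illustrative. `is_tile_to_hub` relies on the observation that every tile-to-hub code listed has bit 7 set. That is a pattern in the listed values rather than a stated rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum MsgType {
    // Hub -> Tile
    LoadQuery = 0x01,
    LoadBlock = 0x02,
    LoadNeighbors = 0x03,
    LoadParams = 0x04,
    Compute = 0x05,
    ReadTopk = 0x06,
    Reset = 0x07,
    // Tile -> Hub
    TopkResult = 0x81,
    NeedBlock = 0x82,
    NeedNeighbors = 0x83,
    Done = 0x84,
    Error = 0x85,
}

impl MsgType {
    /// Decode a raw `msg_type` field; unknown codes yield None.
    pub fn from_u16(v: u16) -> Option<MsgType> {
        Some(match v {
            0x01 => MsgType::LoadQuery,
            0x02 => MsgType::LoadBlock,
            0x03 => MsgType::LoadNeighbors,
            0x04 => MsgType::LoadParams,
            0x05 => MsgType::Compute,
            0x06 => MsgType::ReadTopk,
            0x07 => MsgType::Reset,
            0x81 => MsgType::TopkResult,
            0x82 => MsgType::NeedBlock,
            0x83 => MsgType::NeedNeighbors,
            0x84 => MsgType::Done,
            0x85 => MsgType::Error,
            _ => return None,
        })
    }

    /// True for tile -> hub messages (bit 7 set in the listed codes).
    pub fn is_tile_to_hub(self) -> bool {
        (self as u16) & 0x80 != 0
    }
}
```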
### Execution Flow
```
Hub                                  Tile
 |                                    |
 |--- LOAD_QUERY (768B) ------------->|
 |                                    |  rvf_load_query()
 |--- LOAD_PARAMS (SQ params) ------->|
 |                                    |  rvf_load_sq_params()
 |--- LOAD_BLOCK (block 0) ---------->|
 |                                    |  rvf_load_block()
 |                                    |  rvf_distances()
 |                                    |  rvf_topk_merge()
 |--- LOAD_BLOCK (block 1) ---------->|
 |                                    |  rvf_load_block()
 |                                    |  rvf_distances()
 |                                    |  rvf_topk_merge()
 |   ...                              |
 |--- READ_TOPK --------------------->|
 |                                    |  rvf_topk_read()
 |<--- TOPK_RESULT -------------------|
 |                                    |
```
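The push-mode flow can be simulated on the host to make the data movement concrete. In this sketch `MockTile` is a hypothetical stand-in for a tile. It uses f32 vectors and a sorted `Vec` rather than fp16 and a fixed result buffer, and it ignores message framing entirely.

```rust
/// Squared L2 distance between two equal-length vectors.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical stand-in for a tile: holds the query and a running top-K.
pub struct MockTile {
    query: Vec<f32>,
    k: usize,
    topk: Vec<(f32, u64)>, // kept sorted by ascending distance
}

impl MockTile {
    /// LOAD_QUERY: store the query and reset the result list.
    pub fn load_query(query: &[f32], k: usize) -> Self {
        MockTile { query: query.to_vec(), k, topk: Vec::new() }
    }

    /// LOAD_BLOCK: compute distances for one block and merge into the top-K.
    pub fn load_block(&mut self, block: &[(u64, Vec<f32>)]) {
        for (id, v) in block {
            self.topk.push((l2_sq(&self.query, v), *id));
        }
        self.topk.sort_by(|a, b| a.0.total_cmp(&b.0));
        self.topk.truncate(self.k);
    }

    /// READ_TOPK: return the current (distance, id) pairs.
    pub fn read_topk(&self) -> &[(f32, u64)] {
        &self.topk
    }
}

/// Hub side: stream the dataset in fixed-size blocks, then read the result.
pub fn push_mode_search(
    query: &[f32],
    data: &[(u64, Vec<f32>)],
    block_len: usize,
    k: usize,
) -> Vec<u64> {
    let mut tile = MockTile::load_query(query, k);
    for block in data.chunks(block_len.max(1)) {
        tile.load_block(block);
    }
    tile.read_topk().iter().map(|&(_, id)| id).collect()
}
```

The tile's state between blocks is only the query and the top-K list, so the hub can stream any number of blocks.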
### Pull Mode
For HNSW search, the tile drives the traversal:
```
Hub                                  Tile
 |                                    |
 |--- LOAD_QUERY -------------------->|
 |--- LOAD_NEIGHBORS (entry point) -->|
 |                                    |  rvf_greedy_step()
 |<--- NEED_NEIGHBORS (next node) ----|
 |--- LOAD_NEIGHBORS (next node) ---->|
 |                                    |  rvf_greedy_step()
 |<--- NEED_BLOCK (for candidate) ----|
 |--- LOAD_BLOCK -------------------->|
 |                                    |  rvf_distances()
 |                                    |  rvf_topk_merge()
 |<--- DONE --------------------------|
 |--- READ_TOPK --------------------->|
 |<--- TOPK_RESULT -------------------|
```
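The traversal that the tile drives in pull mode is a greedy descent. The sketch below shows one layer of it on the host, with `HashMap` lookups standing in for the NEED_NEIGHBORS and NEED_BLOCK round trips. The function name and data structures are illustrative, and real HNSW search keeps a candidate list rather than a single current node.

```rust
use std::collections::HashMap;

/// Squared L2 distance between two equal-length vectors.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Greedy descent on one layer: move to the neighbor closest to the query
/// until no neighbor improves on the current node. Returns None if the
/// entry point has no vector.
pub fn greedy_search(
    query: &[f32],
    entry: u64,
    vectors: &HashMap<u64, Vec<f32>>,
    neighbors: &HashMap<u64, Vec<u64>>,
) -> Option<u64> {
    let mut current = entry;
    let mut best = l2_sq(query, vectors.get(&entry)?);
    loop {
        let mut next = current;
        // Stands in for the NEED_NEIGHBORS / LOAD_NEIGHBORS round trip.
        if let Some(list) = neighbors.get(&current) {
            for &n in list {
                // Stands in for the NEED_BLOCK / LOAD_BLOCK round trip.
                if let Some(v) = vectors.get(&n) {
                    let d = l2_sq(query, v);
                    if d < best {
                        best = d;
                        next = n;
                    }
                }
            }
        }
        if next == current {
            return Some(current); // local minimum: the tile would send DONE
        }
        current = next;
    }
}
```

Each loop iteration corresponds to one `rvf_greedy_step()` call in the diagram. The strict `d < best` test guarantees termination because the distance decreases on every move.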
## 5. Three Hardware Profiles
### RVF Core Profile (Tile)
```
Target: Cognitum tile (8KB + 8KB + 64KB)
Features: Distance compute, top-K, SQ dequant, CRC32C verify
Max vectors: ~85 per block load
Max dimensions: 384 (fp16) or 768 (i8)
Index: None (hub routes, tile computes)
Streaming: Receive blocks from hub
Quantization: i8 scalar only (no PQ on tile)
Compression: None (hub decompresses before sending)
```
### RVF Hot Profile (Chip)
```
Target: Cognitum chip (multiple tiles + shared memory)
Features: Core + PQ distance, HNSW navigation, parallel tiles
Max vectors: Limited by shared memory (~10K in shared cache)
Max dimensions: 1024
Index: Layer A in shared memory
Streaming: Block streaming across tiles
Quantization: i8 scalar + PQ (6-bit)
Compression: LZ4 decompress in shared memory
```
### RVF Full Profile (Hub/Desktop)
```
Target: Desktop CPU, server, hub controller
Features: All features, all segment types, all quantization
Max vectors: Billions (limited by storage)
Max dimensions: Unlimited
Index: Full HNSW (Layers A + B + C)
Streaming: Full append-only segment model
Quantization: All tiers (fp16, i8, PQ, binary)
Compression: All (LZ4, ZSTD, custom)
Crypto: Full (ML-DSA-65 signatures, SHAKE-256)
Temperature: Full adaptive tiering
Overlay: Full epoch model with compaction
```
### Profile Detection
The root manifest's `profile_id` field declares the minimum profile needed:
```
0x00 generic Requires Full Profile features
0x01 core Fully usable with Core Profile
0x02 hot Requires Hot Profile minimum
0x03 full Requires Full Profile
```
A Full Profile reader can read Core, Hot, and Full files. A Hot Profile
reader can read Core and Hot files and rejects Full files. A Core Profile
reader can read only Core files and rejects Hot and Full files. Files marked
`generic` (0x00) require a Full Profile reader.
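The compatibility rule reduces to an ordering check. A minimal sketch, with illustrative type and function names, assuming that unknown `profile_id` values are rejected (this document does not say how they are handled):

```rust
/// Reader capability levels, ordered from least to most capable.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Profile {
    Core = 1,
    Hot = 2,
    Full = 3,
}

/// Minimum reader profile for a given `profile_id`, or None if unknown.
pub fn required_profile(profile_id: u8) -> Option<Profile> {
    match profile_id {
        0x00 | 0x03 => Some(Profile::Full), // generic and full both need Full
        0x01 => Some(Profile::Core),
        0x02 => Some(Profile::Hot),
        _ => None,
    }
}

/// True if a reader with the given profile may open the file.
/// Rejecting unknown ids is an assumption of this sketch.
pub fn can_read(reader: Profile, profile_id: u8) -> bool {
    match required_profile(profile_id) {
        Some(needed) => reader >= needed,
        None => false,
    }
}
```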
## 6. SIMD Strategy by Platform
### WASM v128 (Tile/Browser)
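```wat
;; L2 distance: fp16 vectors, 384 dimensions
;; Process 8 fp16 values per v128 load
(func $l2_fp16_384 (param $a_ptr i32) (param $b_ptr i32) (result f32)
  (local $acc v128)
  (local $i i32)
  (local.set $acc (v128.const f32x4 0 0 0 0))
  (local.set $i (i32.const 0))
  (block $done
    (loop $loop
      ;; Exit once all 384 dimensions have been consumed
      (br_if $done (i32.ge_u (local.get $i) (i32.const 384)))
      ;; Load 8 fp16 values, widen to f32x4 pairs
      ;; Subtract, square, accumulate
      ;; ... (8 values per iteration, 48 iterations for 384 dims)
      ;; Advance by the 8 values handled in this iteration
      (local.set $i (i32.add (local.get $i) (i32.const 8)))
      (br $loop)
    )
  )
  ;; Horizontal sum of the four accumulator lanes (squared L2 distance)
  (f32.add
    (f32.add
      (f32x4.extract_lane 0 (local.get $acc))
      (f32x4.extract_lane 1 (local.get $acc)))
    (f32.add
      (f32x4.extract_lane 2 (local.get $acc))
      (f32x4.extract_lane 3 (local.get $acc))))
)
```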
### AVX-512 (Desktop/Server)
```
; Process 32 fp16 values per iteration with VCVTPH2PS + VFMADD231PS
; (VCVTPH2PS widens 16 fp16 values to f32 at a time, so two per iteration)
; 384 dims = 12 iterations of 32 values
; ~12 cycles per distance computation as a lower bound
```
### ARM NEON (Mobile/Edge)
```
; Process 8 fp16 values per iteration with FMLA
; 384 dims = 48 iterations of 8 values
; ~48 cycles per distance computation as a lower bound
```
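A scalar reference implementation is useful for validating any of the SIMD paths. The sketch below decodes IEEE 754 half-precision values from their raw bit patterns and computes the squared L2 distance. The function names are illustrative, and it returns the squared distance, which is an assumption about what the distance kernels report.

```rust
/// Decode an IEEE 754 half-precision value from its 16-bit pattern.
pub fn f16_to_f32(h: u16) -> f32 {
    let sign = if h & 0x8000 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((h >> 10) & 0x1F) as i32;
    let frac = (h & 0x03FF) as f32;
    match exp {
        // Zero and subnormals: frac * 2^-24
        0 => sign * frac * 2.0f32.powi(-24),
        // Infinity and NaN
        0x1F => {
            if frac == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: (1 + frac / 2^10) * 2^(exp - 15)
        _ => sign * (1.0 + frac / 1024.0) * 2.0f32.powi(exp - 15),
    }
}

/// Squared L2 distance between two fp16 vectors given as raw bit patterns.
pub fn l2_sq_fp16(a: &[u16], b: &[u16]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = f16_to_f32(x) - f16_to_f32(y);
            d * d
        })
        .sum()
}
```

Accumulating in f32 mirrors the widening step described for the SIMD kernels, so results should agree with them up to summation order.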
## 7. Microkernel Size Budget
```
Function                    Estimated Size
--------                    --------------
rvf_init                    128 B
rvf_load_query              64 B
rvf_load_block              256 B
rvf_distances (L2 fp16)     512 B
rvf_distances (L2 i8)       384 B
rvf_distances (IP fp16)     512 B
rvf_distances (hamming)     256 B
rvf_topk_merge              384 B
rvf_topk_read               64 B
rvf_load_sq_params          64 B
rvf_dequant_i8              256 B
rvf_load_pq_codebook        128 B
rvf_pq_distances            512 B
rvf_load_neighbors          128 B
rvf_greedy_step             512 B
rvf_verify_header           128 B
rvf_crc32c                  256 B
Message dispatch loop       384 B
Utility functions           256 B
WASM overhead               512 B
                            ----------
Total                       5,696 B (< 8 KB code budget)
```
The remaining ~2.4 KB of code space (8,192 B - 5,696 B = 2,496 B) is available for domain-specific extensions
(e.g., codon distance for RVDNA profile, token overlap for RVText profile).
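The budget is worth keeping as data so the total can be rechecked whenever an estimate changes. A minimal sketch, with the per-function estimates transcribed from the table (the constant and function names are illustrative):

```rust
/// Estimated code size per function in bytes, transcribed from the table.
pub const BUDGET: [(&str, u32); 20] = [
    ("rvf_init", 128),
    ("rvf_load_query", 64),
    ("rvf_load_block", 256),
    ("rvf_distances (L2 fp16)", 512),
    ("rvf_distances (L2 i8)", 384),
    ("rvf_distances (IP fp16)", 512),
    ("rvf_distances (hamming)", 256),
    ("rvf_topk_merge", 384),
    ("rvf_topk_read", 64),
    ("rvf_load_sq_params", 64),
    ("rvf_dequant_i8", 256),
    ("rvf_load_pq_codebook", 128),
    ("rvf_pq_distances", 512),
    ("rvf_load_neighbors", 128),
    ("rvf_greedy_step", 512),
    ("rvf_verify_header", 128),
    ("rvf_crc32c", 256),
    ("Message dispatch loop", 384),
    ("Utility functions", 256),
    ("WASM overhead", 512),
];

/// Code memory available on a tile.
pub const CODE_BUDGET: u32 = 8 * 1024;

/// Sum of all estimates in bytes.
pub fn total_bytes() -> u32 {
    BUDGET.iter().map(|&(_, bytes)| bytes).sum()
}
```

The listed estimates sum to 5,696 B, leaving 2,496 B of the 8 KB code budget.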
## 8. Fault Isolation
Each tile runs in a WASM sandbox. A tile cannot:
- Access hub memory directly
- Communicate with other tiles except through the hub
- Allocate memory beyond its 8 KB data + 64 KB scratch
- Execute code beyond its 8 KB code space
- Trap without the hub catching and recovering
If a tile traps (out-of-bounds, unreachable, stack overflow):
1. Hub catches the trap
2. Hub marks tile as faulted
3. Hub reassigns the tile's work to another tile (or processes locally)
4. Hub optionally restarts the faulted tile with fresh state
This makes the system resilient to individual tile failures — important for
large tile arrays where hardware faults are inevitable.
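The recovery steps can be sketched as a small hub-side dispatcher. Everything here is hypothetical: `TileState`, `dispatch`, and the closure that stands in for running work on a tile and reporting a trap as `Err`. Restarting a faulted tile (step 4) is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TileState {
    Ready,
    Faulted,
}

/// Run each work item on the first ready tile. If the tile traps, mark it
/// faulted and retry on the next ready tile. Returns, per work item, the
/// tile that completed it, or None if the hub must process it locally.
pub fn dispatch<F>(
    states: &mut [TileState],
    work: &[u32],
    mut run_on_tile: F,
) -> Vec<(u32, Option<usize>)>
where
    F: FnMut(usize, u32) -> Result<(), &'static str>,
{
    let mut placement = Vec::with_capacity(work.len());
    for &item in work {
        let mut done_by = None;
        for tile in 0..states.len() {
            if states[tile] != TileState::Ready {
                continue;
            }
            match run_on_tile(tile, item) {
                Ok(()) => {
                    done_by = Some(tile);
                    break;
                }
                // Steps 1-2: the hub catches the trap and marks the tile.
                Err(_trap) => states[tile] = TileState::Faulted,
            }
        }
        // Step 3: the work moved to another tile, or None means hub-local.
        placement.push((item, done_by));
    }
    placement
}
```

Because a tile holds no state the hub cannot resend, reassigning a work item means replaying its messages to another tile.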