ruvector/docs
rUv aee77babaf
docs(research): add ultra-low-bit quantization & edge deployment research (#255)
* docs(research): add ultra-low-bit quantization & edge deployment research

Comprehensive research collection on 2-bit/3-bit quantization for ruvLLM:

- 01: Ultra-low-bit quantization survey (ICLR'26, QuIP, BitNet, I-quants)
- 02: Quantization-aware training (QAT) with reasoning preservation
- 03: QuIP 2-bit framework analysis (incoherence processing, E8 lattice)
- 04: MoE memory-aware routing for edge SRAM budgets
- 05: ruvLLM quantization architecture deep review and gap analysis
- 06: Rust implementation plan for 2-bit QAT pipeline (14-week roadmap)
- 07: Novel 3-int pi-constant quantization using irrational scaling

Key findings: ruvLLM has strong foundations (BitNet, K-quants, GGUF, KV cache)
but needs QAT training loop and differentiable quantization primitives.
Pi-constant scaling provides ~0.5 bit effective precision gain at 3-bit.

https://claude.ai/code/session_01E4pmfETYzknb1xq2dzCCaj

* docs(adr): add ADR-090 ultra-low-bit QAT & pi-quantization DDD architecture

Comprehensive architecture decision record for implementing 2-bit/3-bit
quantization-aware training in ruvLLM using Domain-Driven Design:

- 5 bounded contexts: Quantization Core, Training, MoE Routing, WASM Runtime, Observability
- Pi-constant quantization with irrational scaling (pi/k step sizes)
- QAT training loop with STE variants and LoRA-QAT lightweight path
- QuIP incoherence via fast Walsh-Hadamard (O(n log n))
- Memory-aware MoE routing with expert precision allocation
- WASM SIMD128 kernels reusing existing tl1_wasm.rs LUT pattern
- Security: weight integrity, GGUF validation, WASM sandbox
- Benchmarking: criterion suite with throughput/quality targets
- 14-week timeline, maps to 18 existing files for extension

Placed in docs/adr/ddd/ per DDD architectural pattern organization.

https://claude.ai/code/session_01E4pmfETYzknb1xq2dzCCaj

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-12 10:21:30 -04:00
..
adr docs(research): add ultra-low-bit quantization & edge deployment research (#255) 2026-03-12 10:21:30 -04:00
analysis docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
api docs: Add Cypher reference, include Tiny Dancer, fix WASM build 2025-11-26 12:54:04 +00:00
architecture Merge origin/main into claude/quantum-engine-adrs-6OsEO - resolve Cargo.toml conflict 2026-02-08 17:04:57 +00:00
benchmarks docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
cloud-architecture Implement global streaming optimization for 500M concurrent streams 2025-11-20 18:51:26 +00:00
cnn feat(demo): add Self-Learning tab with 6 interactive training demos 2026-03-11 19:31:23 -04:00
code-reviews docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
dag docs(dag): add comprehensive Neural DAG Learning implementation plan 2025-12-29 22:15:55 +00:00
development feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
examples feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
gnn feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
guides docs: add missing capabilities to advanced features guide 2026-02-26 16:09:06 +00:00
hnsw docs: Reorganize documentation and add postgres README 2025-12-02 16:45:44 +00:00
hooks feat(cli): Implement full hooks system in Rust CLI 2025-12-27 01:08:36 +00:00
implementation docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
integration docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
nervous-system docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
optimization docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
plans/subpolynomial-time-mincut chore(docs): Clean up and reorganize documentation structure 2025-12-25 19:39:44 +00:00
postgres Feat/ruvector postgres v2 (#82) 2025-12-25 17:02:55 -05:00
project-phases Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
publishing feat(training): RuvLTRA v2.4 Ecosystem Edition - 100% routing accuracy (#123) 2026-01-20 20:08:30 -05:00
research docs(research): add ultra-low-bit quantization & edge deployment research (#255) 2026-03-12 10:21:30 -04:00
ruvllm docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
security docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
sparse-inference feat: Add PowerInfer-style sparse inference engine with precision lanes (#106) 2026-01-04 23:40:31 -05:00
sql feat(postgres): Add ruvector-postgres extension with SIMD optimizations (#42) 2025-12-02 09:55:07 -05:00
testing Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
training docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
.gitkeep Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
.nojekyll fix: add .nojekyll to disable Jekyll processing 2026-03-11 17:53:19 -04:00
index.html refactor: move CNN demo to docs/cnn/ for shorter URL 2026-03-11 17:52:13 -04:00
INDEX.md docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
README.md docs: update guides to match current API surface and versions 2026-02-26 16:05:29 +00:00
REPO_STRUCTURE.md docs: reorganize into subfolders 2026-01-21 23:43:50 -05:00
research-openfang.md Add OpenFang project research document 2026-02-26 14:14:58 +00:00

RuVector Documentation

Complete documentation for RuVector, the high-performance Rust vector database with global scale capabilities.

📚 Documentation Structure

docs/
├── adr/                    # Architecture Decision Records
├── analysis/               # Research & analysis docs
├── api/                    # API references (Rust, Node.js, Cypher)
├── architecture/           # System design docs
├── benchmarks/             # Performance benchmarks & results
├── cloud-architecture/     # Cloud deployment guides
├── code-reviews/           # Code review documentation
├── dag/                    # DAG implementation
├── development/            # Developer guides
├── examples/               # SQL examples
├── gnn/                    # GNN/Graph implementation
├── guides/                 # User guides & tutorials
├── hnsw/                   # HNSW index documentation
├── hooks/                  # Hooks system documentation
├── implementation/         # Implementation details & summaries
├── integration/            # Integration guides
├── nervous-system/         # Nervous system architecture
├── optimization/           # Performance optimization guides
├── plans/                  # Implementation plans
├── postgres/               # PostgreSQL extension docs
├── project-phases/         # Development phases
├── publishing/             # NPM publishing guides
├── research/               # Research documentation
├── ruvllm/                 # RuVLLM documentation
├── security/               # Security audits & reports
├── sparse-inference/       # Sparse inference docs
├── sql/                    # SQL examples
├── testing/                # Testing documentation
└── training/               # Training & LoRA docs

Getting Started

Architecture & Design

API Reference

Performance & Benchmarks

Security

Implementation

Specialized Topics

Development

Research

  • research/ - Research documentation
    • cognitive-frontier/ - Cognitive frontier research
    • gnn-v2/ - GNN v2 research
    • latent-space/ - HNSW & attention research
    • mincut/ - MinCut algorithm research

For New Users

  1. Start with Getting Started Guide
  2. Try the Basic Tutorial
  3. Review API Documentation

For Cloud Deployment

  1. Read Architecture Overview
  2. Follow Deployment Guide
  3. Apply Performance Optimizations

For Contributors

  1. Read Contributing Guidelines
  2. Review Architecture Decisions
  3. Check Migration Guide

For Performance Tuning

  1. Review Optimization Guide
  2. Run Benchmarks
  3. Check Analysis

📊 Documentation Status

Category Directory Status
Getting Started guides/ Complete
Architecture architecture/, adr/ Complete
API Reference api/ Complete
Performance benchmarks/, optimization/, analysis/ Complete
Security security/ Complete
Implementation implementation/, integration/ Complete
Development development/, testing/ Complete
Research research/ 📚 Ongoing

Total Documentation: 170+ comprehensive documents across 25+ directories


🔗 External Resources


Last Updated: 2026-02-26 | Version: 2.0.4 (core) / 0.1.100 (npm) | Status: Production Ready