ruvector/docs
Reuven 3cb3954eb3 perf: Major M4 Pro optimization pass - 6-12x speedups
## GEMM/GEMV Optimizations (matmul.rs)
- 12x4 micro-kernel with better register utilization
- Cache blocking: 96x64x256 tiles for M4 Pro L1d (192KB)
- GEMV: 35.9 GFLOPS (was 5-6 GFLOPS) - 6x improvement
- GEMM: 19.2 GFLOPS (was 6 GFLOPS) - 3.2x improvement
- FP16 compute path using half crate
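
The cache-blocking idea above can be sketched as a plain tiled loop nest. This is a minimal illustration, not the code in matmul.rs: the mapping of the "96x64x256" figure onto the constants `MC`, `NC`, and `KC` is an assumption, the function name `gemm_blocked` is hypothetical, and the 12x4 micro-kernel is replaced by a simple scalar inner loop.

```rust
// Hypothetical tile sizes read off the "96x64x256" figure; which dimension
// each number belongs to is an assumption, not confirmed by matmul.rs.
const MC: usize = 96; // rows of A (and C) per tile
const NC: usize = 64; // columns of B (and C) per tile
const KC: usize = 256; // shared (inner) dimension per tile

/// Computes C += A * B for row-major f32 matrices.
/// A is m x k, B is k x n, C is m x n.
pub fn gemm_blocked(m: usize, n: usize, k: usize, a: &[f32], b: &[f32], c: &mut [f32]) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);

    for i0 in (0..m).step_by(MC) {
        let i_end = (i0 + MC).min(m);
        for p0 in (0..k).step_by(KC) {
            let p_end = (p0 + KC).min(k);
            for j0 in (0..n).step_by(NC) {
                let j_end = (j0 + NC).min(n);
                // The loops below touch only one tile of A, one tile of B,
                // and one tile of C, so the working set stays bounded.
                for i in i0..i_end {
                    for p in p0..p_end {
                        let a_ip = a[i * k + p];
                        let b_row = &b[p * n + j0..p * n + j_end];
                        let c_row = &mut c[i * n + j0..i * n + j_end];
                        for (c_ij, b_pj) in c_row.iter_mut().zip(b_row) {
                            *c_ij += a_ip * *b_pj;
                        }
                    }
                }
            }
        }
    }
}
```

A real kernel would typically also pack the A and B tiles into contiguous buffers and replace the innermost loops with a register-blocked SIMD micro-kernel; the tiling structure stays the same.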

## Flash Attention 2 (attention.rs)
- Proper online softmax with rescaling
- Auto block sizing (32/64/128) for cache hierarchy
- 8x-unrolled SIMD helpers (dot product, rescale, accumulate)
- Parallel MQA/GQA/MHA with rayon
- +10% throughput improvement
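
As a rough illustration of "online softmax with rescaling", the sketch below computes attention for a single query while visiting keys and values one block at a time. It keeps a running maximum, a running normalizer, and an unnormalized output, and rescales the latter two whenever the maximum grows. The function name `attention_online` and its flat row-major layout are assumptions for this example; it is scalar code and does not reproduce the SIMD helpers or block-size heuristics in attention.rs.

```rust
/// Single-query attention computed block by block.
/// `q` has length d; `keys` and `values` hold n rows of length d, row-major.
pub fn attention_online(q: &[f32], keys: &[f32], values: &[f32], d: usize, block: usize) -> Vec<f32> {
    assert!(d > 0 && block > 0);
    assert_eq!(q.len(), d);
    assert_eq!(keys.len(), values.len());
    assert_eq!(keys.len() % d, 0);
    let n = keys.len() / d;
    assert!(n > 0, "need at least one key/value row");
    let scale = 1.0 / (d as f32).sqrt();

    let mut running_max = f32::NEG_INFINITY; // max score seen so far
    let mut running_sum = 0.0f32; // sum of exp(score - running_max)
    let mut acc = vec![0.0f32; d]; // unnormalized weighted sum of values

    for start in (0..n).step_by(block) {
        let end = (start + block).min(n);

        // Scaled dot-product scores for this block only.
        let scores: Vec<f32> = (start..end)
            .map(|j| {
                let key = &keys[j * d..(j + 1) * d];
                q.iter().zip(key).map(|(x, y)| x * y).sum::<f32>() * scale
            })
            .collect();

        let block_max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let new_max = running_max.max(block_max);

        // Rescale earlier partial results to the new maximum.
        // On the first block this factor is exp(-inf) = 0, which is harmless
        // because the accumulators are still zero.
        let correction = (running_max - new_max).exp();
        running_sum *= correction;
        for a in acc.iter_mut() {
            *a *= correction;
        }

        for (offset, s) in scores.iter().enumerate() {
            let w = (s - new_max).exp();
            running_sum += w;
            let row = start + offset;
            let value = &values[row * d..(row + 1) * d];
            for (a, v) in acc.iter_mut().zip(value) {
                *a += w * v;
            }
        }
        running_max = new_max;
    }

    // Normalize once at the end.
    for a in acc.iter_mut() {
        *a /= running_sum;
    }
    acc
}
```

The result should match a conventional two-pass softmax up to floating-point rounding, regardless of the block size chosen.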

## Quantized Kernels (NEW: quantized.rs)
- INT8 GEMV with NEON vmull_s8/vpadalq_s16 (~2.5x speedup)
- INT4 GEMV with block-wise quantization (~4x speedup)
- Q4_K format compatible with llama.cpp
- Quantization/dequantization helpers
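
To make "block-wise quantization" concrete, here is a deliberately simplified 4-bit scheme: each block of 32 values shares one f32 scale, and two 4-bit codes are packed per byte. This is not the Q4_K layout and not the code in quantized.rs; the block size, the `Q4Block` struct, and the function names are all hypothetical choices for this sketch.

```rust
/// Hypothetical block size for this sketch.
pub const BLOCK: usize = 32;

/// One quantized block: a shared scale plus 32 packed 4-bit codes.
pub struct Q4Block {
    pub scale: f32,
    pub packed: [u8; BLOCK / 2],
}

// Map a value to a signed code in [-7, 7], then bias by 8 so it fits
// in an unsigned nibble (1..=15).
fn encode(v: f32, scale: f32) -> u8 {
    let q = (v / scale).round().clamp(-7.0, 7.0) as i8;
    (q + 8) as u8
}

// Inverse of `encode`: remove the bias and multiply by the block scale.
fn decode(nibble: u8, scale: f32) -> f32 {
    (nibble as i8 - 8) as f32 * scale
}

/// Quantizes `x` block by block; `x.len()` must be a multiple of BLOCK.
pub fn quantize_q4(x: &[f32]) -> Vec<Q4Block> {
    assert_eq!(x.len() % BLOCK, 0);
    x.chunks_exact(BLOCK)
        .map(|chunk| {
            // Symmetric scale: the largest magnitude maps to code +/-7.
            let max_abs = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let scale = if max_abs > 0.0 { max_abs / 7.0 } else { 1.0 };
            let mut packed = [0u8; BLOCK / 2];
            for (i, pair) in chunk.chunks_exact(2).enumerate() {
                let lo = encode(pair[0], scale);
                let hi = encode(pair[1], scale);
                packed[i] = lo | (hi << 4);
            }
            Q4Block { scale, packed }
        })
        .collect()
}

/// Expands the packed blocks back to f32 values.
pub fn dequantize_q4(blocks: &[Q4Block]) -> Vec<f32> {
    let mut out = Vec::with_capacity(blocks.len() * BLOCK);
    for b in blocks {
        for &byte in b.packed.iter() {
            out.push(decode(byte & 0x0F, b.scale));
            out.push(decode(byte >> 4, b.scale));
        }
    }
    out
}
```

With this scheme the round-trip error per element is bounded by half the block's scale, which is why smaller blocks (or per-block minimums, as richer formats use) generally improve accuracy.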

## Metal GPU Shaders
- attention.metal: Flash Attention v2, simd_sum/simd_max
- gemm.metal: simdgroup_matrix 8x8 tiles, double-buffered
- norm.metal: SIMD reduction, fused residual+norm
- rope.metal: Constant memory tables, fused Q+K

## Memory Pool (NEW: memory_pool.rs)
- InferenceArena: O(1) bump allocation, 64-byte aligned
- BufferPool: 5 size classes (1KB-256KB), hit tracking
- ScratchSpaceManager: Per-thread scratch buffers
- PooledKvCache integration
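
The "O(1) bump allocation, 64-byte aligned" behaviour can be sketched as follows. This is an assumption-laden illustration rather than the InferenceArena in memory_pool.rs: the type `BumpArena`, its methods, and the choice to hand out raw `NonNull<u8>` pointers are all hypothetical.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

const ALIGN: usize = 64;

/// Hypothetical bump arena: one backing allocation, one moving offset.
pub struct BumpArena {
    base: NonNull<u8>,
    capacity: usize,
    offset: usize,
}

impl BumpArena {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        let layout = Layout::from_size_align(capacity, ALIGN).expect("invalid layout");
        // SAFETY: the layout has non-zero size.
        let ptr = unsafe { alloc(layout) };
        let base = NonNull::new(ptr).expect("allocation failed");
        Self { base, capacity, offset: 0 }
    }

    /// O(1): round the offset up to the next 64-byte boundary, then bump it.
    /// Returns None when the arena is exhausted. The memory is uninitialized,
    /// and the pointer is only valid until `reset` or drop.
    pub fn alloc_bytes(&mut self, len: usize) -> Option<NonNull<u8>> {
        let start = (self.offset + ALIGN - 1) & !(ALIGN - 1);
        let end = start.checked_add(len)?;
        if end > self.capacity {
            return None;
        }
        self.offset = end;
        // SAFETY: start <= end <= capacity, so the pointer stays inside
        // (or one past the end of) the backing allocation.
        Some(unsafe { NonNull::new_unchecked(self.base.as_ptr().add(start)) })
    }

    /// Frees everything at once by rewinding the offset.
    pub fn reset(&mut self) {
        self.offset = 0;
    }

    pub fn used(&self) -> usize {
        self.offset
    }
}

impl Drop for BumpArena {
    fn drop(&mut self) {
        let layout = Layout::from_size_align(self.capacity, ALIGN).expect("invalid layout");
        // SAFETY: `base` was allocated in `new` with this exact layout.
        unsafe { dealloc(self.base.as_ptr(), layout) };
    }
}
```

The trade-off is that individual allocations cannot be freed; the arena suits per-inference scratch data that is discarded together with a single `reset`.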

## Rayon Parallelization
- gemm_parallel/gemv_parallel/batched_gemm_parallel
- 12.7x speedup on M4 Pro 10-core
- Work-stealing scheduler, row-level parallelism
- Feature flag: parallel = ["dep:rayon"]
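
Row-level parallelism for GEMV can be sketched with the standard library alone. Note the substitution: this example uses `std::thread::scope` with a fixed split of rows instead of rayon's work-stealing scheduler, so it only illustrates how output rows are partitioned, not how gemv_parallel is implemented. The function name and the `workers` parameter are hypothetical.

```rust
use std::thread;

/// Computes y = A * x, where A is m x n row-major, m = y.len(), n = x.len().
/// Output rows are split into contiguous chunks, one chunk per worker thread.
pub fn gemv_rows_parallel(a: &[f32], x: &[f32], y: &mut [f32], workers: usize) {
    let n = x.len();
    let m = y.len();
    assert_eq!(a.len(), m * n);
    assert!(workers > 0);
    if m == 0 {
        return;
    }
    // Ceiling division so every row lands in exactly one chunk.
    let rows_per_worker = (m + workers - 1) / workers;

    thread::scope(|s| {
        for (chunk_idx, y_chunk) in y.chunks_mut(rows_per_worker).enumerate() {
            let row0 = chunk_idx * rows_per_worker;
            // Each thread owns a disjoint slice of y, so no locking is needed.
            s.spawn(move || {
                for (r, out) in y_chunk.iter_mut().enumerate() {
                    let row = &a[(row0 + r) * n..(row0 + r + 1) * n];
                    *out = row.iter().zip(x).map(|(p, q)| p * q).sum();
                }
            });
        }
    });
}
```

With rayon enabled, the same partitioning would more likely be expressed through a parallel iterator over mutable chunks of `y`, which lets idle threads steal work when rows take uneven time.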

All 331 tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 09:12:34 -05:00
adr docs: Add comprehensive ADRs for ruvector and ruvllm architecture 2026-01-18 16:31:14 -05:00
api docs: Add Cypher reference, include Tiny Dancer, fix WASM build 2025-11-26 12:54:04 +00:00
architecture feat: Complete LLM system with Candle, MicroLoRA, NEON kernels 2026-01-18 21:04:21 -05:00
benchmarks feat(postgres): Add HNSW index and embedding functions support (#62) 2025-12-09 11:14:52 -05:00
cloud-architecture Implement global streaming optimization for 500M concurrent streams 2025-11-20 18:51:26 +00:00
dag docs(dag): add comprehensive Neural DAG Learning implementation plan 2025-12-29 22:15:55 +00:00
development feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
examples feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
gnn feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
guides feat(postgres): Add HNSW index and embedding functions support (#62) 2025-12-09 11:14:52 -05:00
hnsw docs: Reorganize documentation and add postgres README 2025-12-02 16:45:44 +00:00
hooks feat(cli): Implement full hooks system in Rust CLI 2025-12-27 01:08:36 +00:00
implementation feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
integration feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
nervous-system feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
optimization chore(docs): Clean up and reorganize documentation structure 2025-12-25 19:39:44 +00:00
plans/subpolynomial-time-mincut chore(docs): Clean up and reorganize documentation structure 2025-12-25 19:39:44 +00:00
postgres Feat/ruvector postgres v2 (#82) 2025-12-25 17:02:55 -05:00
project-phases Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
publishing feat: Implement all 6 ADRs for ruvector and ruvllm optimization 2026-01-18 16:52:15 -05:00
research chore(docs): Clean up and reorganize documentation structure 2025-12-25 19:39:44 +00:00
ruvllm feat: Complete production LLM system with Metal GPU, streaming, speculative decoding 2026-01-18 22:06:22 -05:00
sparse-inference feat: Add PowerInfer-style sparse inference engine with precision lanes (#106) 2026-01-04 23:40:31 -05:00
sql feat(postgres): Add ruvector-postgres extension with SIMD optimizations (#42) 2025-12-02 09:55:07 -05:00
testing Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
.DS_Store perf: Major M4 Pro optimization pass - 6-12x speedups 2026-01-19 09:12:34 -05:00
.gitkeep Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
algorithmic-optimization-analysis.md docs: Add performance optimization analysis reports 2025-12-26 17:41:13 +00:00
BENCHMARK_RESULTS.md docs: Add comprehensive benchmark results and CI script 2026-01-18 17:01:06 -05:00
BTSP_IMPLEMENTATION.md feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
code-review-mincut-gated-transformer.md fix(security): Critical security and performance improvements 2025-12-26 16:25:02 +00:00
dendrite-implementation-summary.md feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
exotic-neural-trader-code-review.md docs: add neural-trader code review and performance analysis reports 2025-12-31 02:56:08 +00:00
INDEX.md chore(docs): Clean up and reorganize documentation structure 2025-12-25 19:39:44 +00:00
LLM_BENCHMARK_RESULTS.md feat: Complete production LLM system with Metal GPU, streaming, speculative decoding 2026-01-18 22:06:22 -05:00
mincut-transformer-memory-optimization-analysis.md docs: Add performance optimization analysis reports 2025-12-26 17:41:13 +00:00
nervous-system-eventbus-summary.md feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
neural-trader-performance-analysis.md docs: add neural-trader code review and performance analysis reports 2025-12-31 02:56:08 +00:00
plaid-bottleneck-summary.md fix(security): Address critical security and performance issues 2026-01-01 18:36:58 +00:00
plaid-optimization-guide.md fix(security): Address critical security and performance issues 2026-01-01 18:36:58 +00:00
plaid-performance-analysis.md fix(security): Address critical security and performance issues 2026-01-01 18:36:58 +00:00
qudag-token-implementation.md feat(dag): implement Neural Self-Learning DAG with QuDAG integration 2025-12-29 22:58:43 +00:00
README.md Clean up repository structure and organize documentation 2025-11-20 19:50:03 +00:00
REPO_STRUCTURE.md feat(micro-hnsw-wasm): Add Neuromorphic HNSW v2.3 with SNN Integration (#40) 2025-12-01 22:30:15 -05:00
security-audit-fpga-transformer.md feat: Add FPGA Transformer backend crates (#105) 2026-01-04 18:59:02 -05:00
SECURITY_AUDIT.md feat: Implement all 6 ADRs for ruvector and ruvllm optimization 2026-01-18 16:52:15 -05:00
simd-optimization-analysis.md docs: Add performance optimization analysis reports 2025-12-26 17:41:13 +00:00
SPECULATIVE_DECODING.md docs(mincut-transformer): Add examples and documentation for SOTA features 2025-12-26 19:55:06 +00:00
workspace-implementation-summary.md feat(nervous-system): Complete bio-inspired neural architecture implementation 2025-12-28 04:05:08 +00:00
zk_security_audit_report.md fix(security): Address critical security and performance issues 2026-01-01 18:36:58 +00:00

RuVector Documentation

Complete documentation for RuVector, a high-performance Rust vector database with global-scale capabilities.

📚 Documentation Structure

Getting Started

Quick start guides and tutorials for new users:

Architecture & Design

System architecture and design documentation:

API Reference

API documentation for different platforms:

User Guides

Comprehensive user guides:

Performance & Optimization

Performance tuning and benchmarking:

Development

Contributing and development guides:

Testing

Testing documentation and reports:

Project History

Historical project phase documentation:

Implementation Summary


For New Users

  1. Start with Getting Started Guide
  2. Try the Basic Tutorial
  3. Review API Documentation

For Cloud Deployment

  1. Read Architecture Overview
  2. Follow Deployment Guide
  3. Apply Performance Optimizations

For Contributors

  1. Read Contributing Guidelines
  2. Review Technical Plan
  3. Check Migration Guide

For Performance Tuning

  1. Review Optimization Guide
  2. Run Benchmarks
  3. Apply Query Optimizations

📊 Documentation Status

| Category | Files | Status |
|---|---|---|
| Getting Started | 7 | Complete |
| Architecture | 11 | Complete |
| API Reference | 2 | Complete |
| User Guides | 4 | Complete |
| Optimization | 4 | Complete |
| Development | 3 | Complete |
| Testing | 2 | Complete |
| Project Phases | 8 | 📚 Historical |

Total Documentation: 40+ comprehensive documents


🔗 External Resources


Last Updated: 2025-11-20 | Version: 0.1.0 | Status: Production Ready