## GEMM/GEMV Optimizations (matmul.rs)
- 12x4 micro-kernel with better register utilization
- Cache blocking: 96x64x256 tiles for M4 Pro L1d (192KB)
- GEMV: 35.9 GFLOPS (was 5-6 GFLOPS) - 6x improvement
- GEMM: 19.2 GFLOPS (was 6 GFLOPS) - 3.2x improvement
- FP16 compute path using half crate

## Flash Attention 2 (attention.rs)
- Proper online softmax with rescaling
- Auto block sizing (32/64/128) for cache hierarchy
- 8x-unrolled SIMD helpers (dot product, rescale, accumulate)
- Parallel MQA/GQA/MHA with rayon
- +10% throughput improvement

## Quantized Kernels (NEW: quantized.rs)
- INT8 GEMV with NEON vmull_s8/vpadalq_s16 (~2.5x speedup)
- INT4 GEMV with block-wise quantization (~4x speedup)
- Q4_K format compatible with llama.cpp
- Quantization/dequantization helpers

## Metal GPU Shaders
- attention.metal: Flash Attention v2, simd_sum/simd_max
- gemm.metal: simdgroup_matrix 8x8 tiles, double-buffered
- norm.metal: SIMD reduction, fused residual+norm
- rope.metal: Constant memory tables, fused Q+K

## Memory Pool (NEW: memory_pool.rs)
- InferenceArena: O(1) bump allocation, 64-byte aligned
- BufferPool: 5 size classes (1KB-256KB), hit tracking
- ScratchSpaceManager: Per-thread scratch buffers
- PooledKvCache integration

## Rayon Parallelization
- gemm_parallel/gemv_parallel/batched_gemm_parallel
- 12.7x speedup on M4 Pro 10-core
- Work-stealing scheduler, row-level parallelism
- Feature flag: parallel = ["dep:rayon"]

All 331 tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
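The "online softmax with rescaling" item listed under Flash Attention 2 can be illustrated with a minimal sketch. This is not the code in attention.rs: the function name `online_softmax_weighted_sum` is hypothetical, values are scalars rather than vectors to keep the example short, and scores are assumed to be finite and non-empty. It shows only the general technique: process the scores block by block, track a running maximum, and rescale the partial sums whenever that maximum grows.

```rust
/// Softmax-weighted sum of `values`, computed in a single streaming pass
/// over `scores` in chunks of `block` elements.
/// Hypothetical helper for illustration; assumes finite, non-empty input.
fn online_softmax_weighted_sum(scores: &[f32], values: &[f32], block: usize) -> f32 {
    assert_eq!(scores.len(), values.len());
    assert!(block > 0);

    let mut running_max = f32::NEG_INFINITY;
    let mut denom = 0.0f32; // sum of exp(score - running_max)
    let mut acc = 0.0f32; // sum of exp(score - running_max) * value

    for (s_blk, v_blk) in scores.chunks(block).zip(values.chunks(block)) {
        let blk_max = s_blk.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let new_max = running_max.max(blk_max);

        // Rescale the partial sums so they are expressed relative to new_max.
        // On the first block this factor is exp(-inf) = 0, which is harmless
        // because both sums are still zero.
        let scale = (running_max - new_max).exp();
        denom *= scale;
        acc *= scale;

        for (&s, &v) in s_blk.iter().zip(v_blk) {
            let w = (s - new_max).exp();
            denom += w;
            acc += w * v;
        }
        running_max = new_max;
    }
    acc / denom
}
```

Up to floating-point rounding, the result does not depend on the block size, which is what allows a block size to be picked to suit the cache hierarchy without changing the output.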
| Name |
|---|
| adr |
| api |
| architecture |
| benchmarks |
| cloud-architecture |
| dag |
| development |
| examples |
| gnn |
| guides |
| hnsw |
| hooks |
| implementation |
| integration |
| nervous-system |
| optimization |
| plans/subpolynomial-time-mincut |
| postgres |
| project-phases |
| publishing |
| research |
| ruvllm |
| sparse-inference |
| sql |
| testing |
| .DS_Store |
| .gitkeep |
| algorithmic-optimization-analysis.md |
| BENCHMARK_RESULTS.md |
| BTSP_IMPLEMENTATION.md |
| code-review-mincut-gated-transformer.md |
| dendrite-implementation-summary.md |
| exotic-neural-trader-code-review.md |
| INDEX.md |
| LLM_BENCHMARK_RESULTS.md |
| mincut-transformer-memory-optimization-analysis.md |
| nervous-system-eventbus-summary.md |
| neural-trader-performance-analysis.md |
| plaid-bottleneck-summary.md |
| plaid-optimization-guide.md |
| plaid-performance-analysis.md |
| qudag-token-implementation.md |
| README.md |
| REPO_STRUCTURE.md |
| security-audit-fpga-transformer.md |
| SECURITY_AUDIT.md |
| simd-optimization-analysis.md |
| SPECULATIVE_DECODING.md |
| workspace-implementation-summary.md |
| zk_security_audit_report.md |
RuVector Documentation
Complete documentation for RuVector, a high-performance vector database written in Rust with global-scale capabilities.
📚 Documentation Structure
Getting Started
Quick start guides and tutorials for new users:
- AGENTICDB_QUICKSTART.md - Quick start for AgenticDB compatibility
- OPTIMIZATION_QUICK_START.md - Performance optimization quick guide
- AGENTICDB_API.md - AgenticDB API reference
- wasm-api.md - WebAssembly API documentation
- wasm-build-guide.md - Building WASM bindings
- advanced-features.md - Advanced features guide
- quick-fix-guide.md - Common issues and fixes
Architecture & Design
System architecture and design documentation:
- TECHNICAL_PLAN.md - Complete technical plan and architecture
- INDEX.md - Documentation index
- architecture/ - System architecture details
- cloud-architecture/ - Global cloud deployment architecture
  - architecture-overview.md - 15-region topology
  - scaling-strategy.md - Auto-scaling & burst handling
  - infrastructure-design.md - GCP infrastructure specs
  - DEPLOYMENT_GUIDE.md - Step-by-step deployment
  - PERFORMANCE_OPTIMIZATION_GUIDE.md - Advanced tuning
API Reference
API documentation for different platforms:
- api/ - Core API documentation
  - RUST_API.md - Rust API reference
  - NODEJS_API.md - Node.js API reference
User Guides
Comprehensive user guides:
- guide/ - User guides
  - GETTING_STARTED.md - Getting started guide
  - BASIC_TUTORIAL.md - Basic tutorial
  - ADVANCED_FEATURES.md - Advanced features
  - INSTALLATION.md - Installation instructions
Performance & Optimization
Performance tuning and benchmarking:
- optimization/ - Performance optimization guides
  - BUILD_OPTIMIZATION.md - Build optimizations
  - IMPLEMENTATION_SUMMARY.md - Implementation details
  - OPTIMIZATION_RESULTS.md - Optimization results
  - PERFORMANCE_TUNING_GUIDE.md - Performance tuning
- benchmarks/ - Benchmarking documentation
  - BENCHMARKING_GUIDE.md - How to run benchmarks
Development
Contributing and development guides:
- development/ - Development documentation
  - CONTRIBUTING.md - Contribution guidelines
  - MIGRATION.md - Migration guide
  - FIXING_COMPILATION_ERRORS.md - Troubleshooting compilation
Testing
Testing documentation and reports:
- testing/ - Testing documentation
  - TDD_TEST_SUITE_SUMMARY.md - TDD test suite summary
  - integration-testing-report.md - Integration test report
Project History
Historical project phase documentation:
- project-phases/ - Project phase documentation
  - phase2_hnsw_implementation.md - Phase 2: HNSW
  - PHASE3_SUMMARY.md - Phase 3 summary
  - phase4-implementation-summary.md - Phase 4 summary
  - PHASE5_COMPLETE.md - Phase 5 complete
  - phase5-implementation-summary.md - Phase 5 summary
  - PHASE6_ADVANCED.md - Phase 6 advanced features
  - PHASE6_COMPLETION_REPORT.md - Phase 6 report
  - PHASE6_SUMMARY.md - Phase 6 summary
Implementation Summary
- IMPLEMENTATION_SUMMARY.md - Complete implementation overview for global streaming
🚀 Quick Links
For New Users
- Start with Getting Started Guide
- Try the Basic Tutorial
- Review API Documentation
For Cloud Deployment
- Read Architecture Overview
- Follow Deployment Guide
- Apply Performance Optimizations
For Contributors
- Read Contributing Guidelines
- Review Technical Plan
- Check Migration Guide
For Performance Tuning
- Review Optimization Guide
- Run Benchmarks
- Apply Query Optimizations
📊 Documentation Status
| Category | Files | Status |
|---|---|---|
| Getting Started | 7 | ✅ Complete |
| Architecture | 11 | ✅ Complete |
| API Reference | 2 | ✅ Complete |
| User Guides | 4 | ✅ Complete |
| Optimization | 4 | ✅ Complete |
| Development | 3 | ✅ Complete |
| Testing | 2 | ✅ Complete |
| Project Phases | 8 | 📚 Historical |
Total Documentation: 40+ comprehensive documents
🔗 External Resources
- GitHub Repository: https://github.com/ruvnet/ruvector
- Main README: ../README.md
- Changelog: ../CHANGELOG.md
- License: ../LICENSE
Last Updated: 2025-11-20 | Version: 0.1.0 | Status: Production Ready