mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-25 15:03:46 +00:00
RuVector Examples
Comprehensive examples demonstrating RuVector's capabilities across multiple platforms and use cases.
Directory Structure
examples/
├── rust/ # Rust SDK examples
├── nodejs/ # Node.js SDK examples
├── graph/ # Graph database features
├── wasm-react/ # React + WebAssembly integration
├── wasm-vanilla/ # Vanilla JS + WebAssembly
├── agentic-jujutsu/ # AI agent version control
├── exo-ai-2025/ # Advanced cognitive substrate
├── refrag-pipeline/ # Document processing pipeline
└── docs/ # Additional documentation
Quick Start by Platform
Rust
cd rust
cargo run --example basic_usage
cargo run --example advanced_features
cargo run --example agenticdb_demo
Node.js
cd nodejs
npm install
node basic_usage.js
node semantic_search.js
WebAssembly (React)
cd wasm-react
npm install
npm run dev
WebAssembly (Vanilla)
cd wasm-vanilla
# Open index.html in a browser
Example Categories
| Category | Path | Description |
|---|---|---|
| Core API | rust/basic_usage.rs | Vector DB fundamentals |
| Batch Ops | rust/batch_operations.rs | High-throughput ingestion |
| RAG Pipeline | rust/rag_pipeline.rs | Retrieval-Augmented Generation |
| Advanced | rust/advanced_features.rs | Hypergraphs, neural hashing |
| AgenticDB | rust/agenticdb_demo.rs | AI agent memory system |
| GNN | rust/gnn_example.rs | Graph Neural Networks |
| Graph | graph/ | Cypher queries, clustering |
| Node.js | nodejs/ | JavaScript integration |
| WASM React | wasm-react/ | Modern React apps |
| WASM Vanilla | wasm-vanilla/ | Browser without framework |
| Agentic Jujutsu | agentic-jujutsu/ | Multi-agent version control |
| EXO-AI 2025 | exo-ai-2025/ | Cognitive substrate research |
| Refrag | refrag-pipeline/ | Document fragmentation |
Feature Highlights
Vector Database Core
- High-performance similarity search
- Multiple distance metrics (Cosine, Euclidean, Dot Product)
- Metadata filtering
- Batch operations
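To make the three distance metrics listed above concrete, here is a minimal, dependency-free sketch of how each one is computed on plain `f32` slices. The function names (`dot`, `euclidean_distance`, `cosine_similarity`) are illustrative assumptions and are not taken from the RuVector API; the examples in `rust/` show the actual calls.

```rust
/// Dot product of two equal-length vectors (higher means more similar).
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance (lower means more similar).
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine similarity in [-1, 1]; defined here as 0.0 if either vector is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}

fn main() {
    // Hypothetical query and candidate vectors, for illustration only.
    let query: [f32; 2] = [1.0, 0.0];
    let candidates: [[f32; 2]; 3] = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]];

    for c in &candidates {
        println!(
            "{:?}: cosine={:.3} euclidean={:.3} dot={:.3}",
            c,
            cosine_similarity(&query, c),
            euclidean_distance(&query, c),
            dot(&query, c)
        );
    }
}
```

Cosine similarity ignores vector length, while Euclidean distance and dot product are sensitive to it, so the metrics can rank the same candidates differently unless the vectors are normalized.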
Advanced Features
- Hypergraph Index: Multi-entity relationships
- Temporal Hypergraph: Time-aware relationships
- Causal Memory: Cause-effect chains
- Learned Index: ML-optimized indexing
- Neural Hash: Locality-sensitive hashing
- Topological Analysis: Persistent homology
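As a rough illustration of the locality-sensitive hashing idea behind the Neural Hash entry, the sketch below uses classic random-hyperplane (sign) hashing. This is a swapped-in textbook technique, not RuVector's implementation: `HyperplaneHasher`, `XorShift64`, and `hamming` are hypothetical names, and the small xorshift generator only stands in for a real random number source so that the example needs no external crates.

```rust
/// Tiny xorshift64 generator (illustrative stand-in for a real RNG crate).
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform value in [-1.0, 1.0), built from the top 24 bits.
    fn next_f32(&mut self) -> f32 {
        let unit = (self.next_u64() >> 40) as f32 / (1u64 << 24) as f32;
        unit * 2.0 - 1.0
    }
}

/// Hypothetical hasher: one random hyperplane per output bit.
struct HyperplaneHasher {
    dim: usize,
    planes: Vec<Vec<f32>>,
}

impl HyperplaneHasher {
    fn new(dim: usize, bits: usize, seed: u64) -> Self {
        assert!(bits <= 64, "the code is packed into a u64");
        let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
        let mut planes = Vec::with_capacity(bits);
        for _ in 0..bits {
            let mut plane = Vec::with_capacity(dim);
            for _ in 0..dim {
                plane.push(rng.next_f32());
            }
            planes.push(plane);
        }
        Self { dim, planes }
    }

    /// Bit i is set when the vector lies on the non-negative side of plane i.
    fn hash(&self, v: &[f32]) -> u64 {
        assert_eq!(v.len(), self.dim, "vector has the wrong dimension");
        let mut code = 0u64;
        for (i, plane) in self.planes.iter().enumerate() {
            let projection: f32 = plane.iter().zip(v).map(|(p, x)| p * x).sum();
            if projection >= 0.0 {
                code |= 1u64 << i;
            }
        }
        code
    }
}

/// Number of differing bits between two hash codes.
fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

fn main() {
    let hasher = HyperplaneHasher::new(4, 16, 42);
    let a = [1.0, 0.9, 0.1, 0.0];
    let near = [0.9, 1.0, 0.0, 0.1];
    let far = [-1.0, -0.8, 0.2, 0.0];

    let (ha, hn, hf) = (hasher.hash(&a), hasher.hash(&near), hasher.hash(&far));
    println!("a vs near: {} differing bits", hamming(ha, hn));
    println!("a vs far:  {} differing bits", hamming(ha, hf));
}
```

With this scheme, the expected number of differing bits grows with the angle between two vectors, so Hamming distance on the short codes can serve as a cheap pre-filter before an exact distance computation.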
AgenticDB
- Reflexion episodes (self-critique)
- Skill library (consolidated patterns)
- Causal memory (hypergraph relationships)
- Learning sessions (RL training data)
- Vector embeddings (core storage)
EXO-AI Cognitive Substrate
- exo-core: IIT consciousness, thermodynamics
- exo-temporal: Causal memory coordination
- exo-hypergraph: Topological structures
- exo-manifold: Continuous deformation
- exo-exotic: 10 cutting-edge experiments
- exo-wasm: Browser deployment
- exo-federation: Distributed consensus
- exo-node: Native bindings
- exo-backend-classical: Classical compute
Running Benchmarks
# Rust benchmarks
cargo bench --example advanced_features
# Refrag pipeline benchmarks
cd refrag-pipeline
cargo bench
# EXO-AI benchmarks
cd exo-ai-2025
cargo bench
Related Documentation
License
MIT OR Apache-2.0