mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-22 19:56:25 +00:00
- Move router-* folders into crates/ directory - Move profiling folder into crates/ - Update Cargo.toml workspace to include new crate locations - Add node_modules/ and package-lock.json to .gitignore - Remove node_modules directory from repository - Create new README.md with project overview and badges - Move old technical documentation to docs/TECHNICAL_PLAN.md This reorganization improves the project structure by: - Consolidating all Rust crates in the crates/ directory - Following standard Rust workspace conventions - Cleaning up root directory clutter - Providing a clear, professional README for new users |
||
|---|---|---|
| .. | ||
| scripts | ||
| README.md | ||
Ruvector Performance Profiling
This directory contains profiling scripts, reports, and analysis for Ruvector performance optimization.
Directory Structure
profiling/
├── scripts/ # Profiling and benchmarking scripts
├── reports/ # Generated profiling reports
├── flamegraphs/ # CPU flamegraphs
├── memory/ # Memory profiling data
└── benchmarks/ # Benchmark results
Profiling Tools
CPU Profiling
- perf: Linux performance counters
- flamegraph: Visualization of CPU hotspots
- cargo-flamegraph: Integrated Rust profiling
Memory Profiling
- valgrind: Memory leak detection and profiling
- heaptrack: Heap memory profiling
- massif: Heap profiler
Cache Profiling
- perf-cache: Cache miss analysis
- cachegrind: Cache simulation
Quick Start
# Install profiling tools
./scripts/install_tools.sh
# Run CPU profiling
./scripts/cpu_profile.sh
# Run memory profiling
./scripts/memory_profile.sh
# Generate flamegraph
./scripts/generate_flamegraph.sh
# Run comprehensive benchmark suite
./scripts/benchmark_all.sh
Performance Targets
- Throughput: 50,000+ queries per second (QPS)
- Latency: Sub-millisecond p50 latency (<1ms)
- Recall: 95% recall at high QPS
- Memory: Efficient memory usage with minimal allocations in hot paths
- Scalability: Linear scaling from 1-128 threads
Optimization Areas
-
CPU Optimization
- SIMD intrinsics (AVX2/AVX-512)
- Target-specific compilation
- Hot path optimization
-
Memory Optimization
- Arena allocation
- Object pooling
- Zero-copy operations
-
Cache Optimization
- Structure-of-Arrays layout
- Cache-line alignment
- Prefetching
-
Concurrency Optimization
- Lock-free data structures
- RwLock optimization
- Rayon tuning
-
Compile-Time Optimization
- Profile-Guided Optimization (PGO)
- Link-Time Optimization (LTO)
- Target CPU features