Implements a full LLM inference and fine-tuning system optimized for Mac M4 Pro: ## New Crates - ruvllm-cli: CLI tool with download, serve, chat, benchmark commands ## Backends (crates/ruvllm/src/backends/) - LlmBackend trait for pluggable inference backends - CandleBackend with Metal acceleration, GGUF quantization, HF Hub ## MicroLoRA (crates/ruvllm/src/lora/) - Rank 1-2 adapters for <1ms per-request adaptation - EWC++ regularization to prevent catastrophic forgetting - Hot-swap adapter registry with composition strategies - Training pipeline with LR schedules (Constant, Cosine, OneCycle) ## NEON Kernels (crates/ruvllm/src/kernels/) - Flash Attention 2 with online softmax - Paged Attention for KV cache efficiency - Multi-Query (MQA) and Grouped-Query (GQA) attention - RoPE with precomputed tables and NTK-aware scaling - RMSNorm and LayerNorm with batched variants - GEMV, GEMM, batched GEMM with 4x unrolling ## Real-time Optimization (crates/ruvllm/src/optimization/) - SONA-LLM with 3 learning loops (instant <1ms, background ~100ms, deep) - RealtimeOptimizer with dynamic batch sizing - KV cache pressure policies (Evict, Quantize, Reject, Spill) - Metrics collection with moving averages and histograms ## Benchmarks - 6 Criterion benchmark suites for M4 Pro profiling - Runner script with baseline comparison ## Tests - 297 total tests (171 unit + 126 integration) - Full coverage of backends, LoRA, kernels, SONA, e2e ## Recommended Models for 48GB M4 Pro - Primary: Qwen2.5-14B-Instruct (Q8, 15-25 t/s) - Fast: Mistral-7B-Instruct-v0.3 (Q8, 30-45 t/s) - Tiny: Phi-4-mini (Q4, 40-60 t/s) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| build | ||
| patches/hnsw_rs | ||
| check-and-publish-router-wasm.sh | ||
| ci-sync-lockfile.sh | ||
| deploy.sh | ||
| DEPLOYMENT-QUICKSTART.md | ||
| DEPLOYMENT.md | ||
| install-hooks.sh | ||
| publish-all.sh | ||
| publish-cli.sh | ||
| publish-crates.sh | ||
| publish-router-wasm.sh | ||
| README.md | ||
| run_benchmarks.sh | ||
| run_llm_benchmarks.sh | ||
| sync-lockfile.sh | ||
| test-all-graph-commands.sh | ||
| test-deploy.sh | ||
| test-docker-package.sh | ||
| test-graph-cli.sh | ||
| test-wasm.mjs | ||
| validate-packages-simple.sh | ||
| validate-packages.sh | ||
| verify-paper-impl.sh | ||
| verify_hnsw_build.sh | ||
RuVector Automation Scripts
This directory contains automation scripts to streamline development, deployment, and prevent common issues.
📜 Available Scripts
🚀 deploy.sh
Comprehensive deployment script for publishing to crates.io and npm.
Handles:
- Version management and synchronization
- Pre-deployment checks (tests, linting, formatting)
- WASM package builds
- Crate publishing to crates.io
- NPM package publishing
- GitHub Actions trigger for cross-platform builds
Usage:
# Full deployment
./scripts/deploy.sh
# Dry run (test without publishing)
./scripts/deploy.sh --dry-run
# See all options
./scripts/deploy.sh --help
See: DEPLOYMENT.md for complete documentation
🧪 test-deploy.sh
Tests the deployment script without publishing.
Usage: ./scripts/test-deploy.sh
🔄 sync-lockfile.sh
Automatically syncs package-lock.json with package.json changes.
Usage: ./scripts/sync-lockfile.sh
🪝 install-hooks.sh
Installs git hooks for automatic lock file management.
Usage: ./scripts/install-hooks.sh
🤖 ci-sync-lockfile.sh
CI/CD script for automatic lock file fixing.
Usage: ./scripts/ci-sync-lockfile.sh
📦 publish-crates.sh
Legacy script for publishing individual crates. Use deploy.sh instead.
🧭 validate-packages.sh
Validates package configurations and dependencies.
🚀 Quick Start
For Development
-
Install git hooks (recommended):
./scripts/install-hooks.sh -
Test the hook:
cd npm/packages/ruvector npm install chalk git add package.json git commit -m "test: Add chalk dependency" # Hook automatically updates lock file
For Deployment
-
Test deployment script:
./scripts/test-deploy.sh -
Set credentials (required):
export CRATES_API_KEY="your-crates-io-token" export NPM_TOKEN="your-npm-token" -
Run dry run (recommended first):
./scripts/deploy.sh --dry-run -
Deploy:
./scripts/deploy.sh
📖 Documentation
- DEPLOYMENT.md - Comprehensive deployment guide
- ../docs/CONTRIBUTING.md - Development guide
🔐 Security
Never commit credentials! Always use environment variables or secure credential storage.
See DEPLOYMENT.md#security-best-practices for details.