ruvector/examples/data/framework/src
rUv cbacb0b9d6 feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107)
## New Features
- HNSW Integration: O(log n) similarity search replaces O(n²) brute force (10-50x speedup)
- Similarity Cache: 2-3x speedup for repeated similarity queries
- Batch ONNX Embeddings: Chunked processing with progress callbacks
- Shared Utils Module: cosine_similarity, euclidean_distance, normalize_vector
- Auto-connect by Embeddings: CoherenceEngine creates edges from vector similarity

## Performance Improvements
- 8.8x faster batch vector insertion (parallel processing)
- 10-50x faster similarity search (HNSW vs brute force)
- 2.9x faster similarity computation (SIMD acceleration)
- 2-3x faster repeated queries (similarity cache)

## Files Changed
- coherence.rs: HNSW integration, new CoherenceConfig fields
- optimized.rs: Similarity cache implementation
- utils.rs: New shared utility functions
- api_clients.rs: Batch embedding methods (embed_batch_chunked, embed_batch_with_progress)
- README.md: Documented all new features and configuration options

Published as ruvector-data-framework v0.3.0 on crates.io

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 16:16:38 -05:00
..
bin feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
academic_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
api_clients.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
arxiv_client.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
biorxiv_client.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
coherence.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
crossref_client.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
cut_aware_hnsw.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
discovery.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
dynamic_mincut.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
economic_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
export.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
finance_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
forecasting.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
genomics_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
geospatial_clients.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
government_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
hnsw.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
ingester.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
lib.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
mcp_server.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
medical_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
ml_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
news_clients.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
optimized.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
patent_clients.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
persistence.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
physics_clients.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
realtime.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
ruvector_native.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
semantic_scholar.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
space_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
streaming.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
transportation_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
utils.rs feat(data-framework): v0.3.0 with HNSW, similarity cache, and batch embeddings (#107) 2026-01-05 16:16:38 -05:00
visualization.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00
wiki_clients.rs feat: Add comprehensive dataset discovery framework for RuVector (#104) 2026-01-04 14:36:41 -05:00