mirror of https://github.com/ruvnet/RuVector.git synced 2026-05-25 15:03:46 +00:00

History

rUv f0f33625f3 feat(prime-radiant): Universal Coherence Engine with Sheaf Laplacian AI Safety (#131 ) * docs(coherence-engine): add ADR-014 and DDD for sheaf Laplacian coherence engine Add comprehensive architecture documentation for ruvector-coherence crate: - ADR-014: Sheaf Laplacian-based coherence witnessing architecture - Universal coherence object with domain-agnostic interpretation - 5-layer architecture (Application → Gate → Computation → Governance → Storage) - 4-tier compute ladder (Reflex → Retrieval → Heavy → Human) - Full ruvector ecosystem integration (10+ crates) - 15 internal architectural decisions - DDD: Domain-Driven Design with 10 bounded contexts - Tile Fabric (cognitum-gate-kernel) - Adaptive Learning (sona) - Neural Gating (ruvector-nervous-system) - Learned Restriction Maps (ruvector-gnn) - Hyperbolic Coherence (ruvector-hyperbolic-hnsw) - Incoherence Isolation (ruvector-mincut) - Attention-Weighted Coherence (ruvector-attention) - Distributed Consensus (ruvector-raft) Key concept: "This is not prediction. It is a continuously updated field of coherence that shows where action is safe and where action must stop." Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(prime-radiant): implement sheaf Laplacian coherence engine Implement the complete Prime-Radiant crate based on ADR-014: Core Modules: - substrate/: SheafGraph, SheafNode, SheafEdge, RestrictionMap (SIMD-optimized) - coherence/: CoherenceEngine, energy computation, spectral drift detection - governance/: PolicyBundle, WitnessRecord, LineageRecord (Blake3 hashing) - execution/: CoherenceGate, ComputeLane, ActionExecutor Ecosystem Integrations (feature-gated): - tiles/: cognitum-gate-kernel 256-tile WASM fabric adapter - sona_tuning/: Adaptive threshold learning with EWC++ - neural_gate/: Biologically-inspired gating with HDC encoding - learned_rho/: GNN-based learned restriction maps - attention/: Topology-gated attention, MoE routing, PDE diffusion - distributed/: Raft-based multi-node coherence Testing: - 138 tests (integration, property-based, chaos) - 8 benchmarks covering ADR-014 performance targets Stats: 91 files, ~30K lines of Rust code "This is not prediction. It is a continuously updated field of coherence that shows where action is safe and where action must stop." Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs(adr): add RuvLLM integration to ADR-014 v0.4 - Add coherence-gated LLM inference architecture diagram - Add 5 integration modules with code examples: - SheafCoherenceValidator (replaces heuristic scoring) - UnifiedWitnessLog (merged audit trail) - PatternToRestrictionBridge (ReasoningBank → learned ρ) - MemoryCoherenceLayer (context as sheaf nodes) - CoherenceConfidence (energy → confidence mapping) - Add 7 integration ADRs (ADR-CE-016 through ADR-CE-022) - Add ruvllm to crate integration matrix and dependencies - Add 4 LLM-specific benefits to consequences - Add ruvllm feature flag Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs(adr): add 22 coherence engine internal ADRs Create detailed ADR files for all internal coherence engine decisions: Core Architecture (ADR-CE-001 to ADR-CE-008): - 001: Sheaf Laplacian defines coherence witness - 002: Incremental computation with stored residuals - 003: PostgreSQL + ruvector hybrid storage - 004: Signed event log with deterministic replay - 005: First-class governance objects - 006: Coherence gate controls compute ladder - 007: Thresholds auto-tuned from traces - 008: Multi-tenant isolation boundaries Universal Coherence (ADR-CE-009 to ADR-CE-015): - 009: Single coherence object (one math, many interpretations) - 010: Domain-agnostic nodes and edges - 011: Residual = contradiction energy - 012: Gate = refusal mechanism with witness - 013: Not prediction (coherence field, not forecasting) - 014: Reflex lane default (most ops stay fast) - 015: Adapt without losing control RuvLLM Integration (ADR-CE-016 to ADR-CE-022): - 016: CoherenceValidator uses sheaf energy - 017: Unified audit trail (WitnessLog + governance) - 018: Pattern-to-restriction bridge (ReasoningBank) - 019: Memory as nodes (agentic, working, episodic) - 020: Confidence from energy (sigmoid mapping) - 021: Shared SONA between ruvllm and prime-radiant - 022: Failure learning (ErrorPatternLearner → ρ maps) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(prime-radiant): implement RuvLLM integration layer (ADR-014 v0.4) Implement complete Prime-Radiant + RuvLLM integration per ADR-CE-016 through ADR-CE-022: Core Integration Modules: - coherence_validator.rs: SheafCoherenceValidator using sheaf energy - witness_log.rs: UnifiedWitnessLog with hash chain for tamper evidence - pattern_bridge.rs: PatternToRestrictionBridge learning from verdicts - memory_layer.rs: MemoryCoherenceLayer tracking context as sheaf nodes - confidence.rs: CoherenceConfidence with sigmoid energy→confidence mapping Supporting Infrastructure: - mod.rs: Public API, re-exports, convenience constructors - error.rs: Comprehensive error types for each ADR - config.rs: LlmCoherenceConfig, thresholds, policies - gate.rs: LlmCoherenceGate high-level interface - adapter.rs: RuvLlmAdapter bridging type systems - bridge.rs: PolicyBridge, SonaBridge for synchronization - witness.rs: WitnessAdapter for correlation - traits.rs: Trait definitions for loose coupling Testing: - 22 integration tests covering all modules - Self-contained mock implementations - Feature-gated with #[cfg(feature = "ruvllm")] Feature Flags: - ruvllm feature in Cargo.toml - Optional dependency on ruvllm crate - Added to "full" feature set Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs(prime-radiant): add comprehensive README with examples Add user-friendly documentation covering: - Introduction explaining coherence vs confidence - Core concepts (coherence field, compute ladder) - Features overview (engine, governance, RuvLLM integration) - Quick start code examples: - Basic coherence check - LLM response validation - Memory consistency tracking - Confidence from energy - Application tiers (today, near-term, future) - Domain examples (AI, finance, medical, robotics, security) - Feature flags reference - Performance targets - Architecture diagram Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs(adr): add ADR-015 Coherence-Gated Transformer (Sheaf Attention) Propose novel low-latency transformer architecture using coherence energy: Core Innovation: - Route tokens to compute lanes based on coherence energy, not confidence - Sparse attention using residual energy (skip coherent pairs) - Early exit when energy converges (not confidence threshold) - Restriction maps replace QKV projections Architecture: - Lane 0 (Reflex): 1-2 layers, local attention, <0.1ms - Lane 1 (Standard): 6 layers, sparse sheaf attention, ~1ms - Lane 2 (Deep): 12+ layers, full + MoE, ~5ms - Lane 3 (Escalate): Return uncertainty Performance Targets: - 5-10x latency reduction (10ms → 1-2ms for 128 tokens) - 2.5x memory reduction - <5% quality degradation - Provable coherence bound on output Mathematical Foundation: - Attention weight ∝ exp(-β × residual_energy) - Token routing via E(t) = Σ w_e \|\|ρ_t(x) - ρ_ctx(x)\|\|² - Early exit when ΔE < ε (energy converged) Target: ruvector-attention crate with sheaf/ and coherence_gated/ modules Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(prime-radiant): implement coherence engine with CGT attention Complete implementation of Prime-Radiant coherence engine and Coherence-Gated Transformer (CGT) sheaf attention module. Core Features: - Sheaf Laplacian energy computation with restriction maps - 4-lane compute ladder (Reflex/Retrieval/Heavy/Human) - Cryptographic witness chains for audit trails - Policy bundles with multi-party approval Storage Backends: - InMemoryStorage with KNN search - FileStorage with Write-Ahead Logging (WAL) - PostgresStorage with full schema (feature-gated) - HybridStorage combining file + optional PostgreSQL CGT Sheaf Attention (ruvector-attention): - RestrictionMap with residual/energy computation - SheafAttention layer: A_ij = exp(-β×E_ij)/Z - TokenRouter with compute lane routing - SparseResidualAttention with energy-based masking - EarlyExit with energy convergence detection Performance Optimizations: - Zero-allocation hot paths (apply_into, compute_residual_norm_sq) - SIMD-friendly 4-way unrolled loops - Branchless lane routing - Pre-allocated buffers for batch operations RuvLLM Integration: - SheafCoherenceValidator for LLM response validation - UnifiedWitnessLog linking inference + coherence - MemoryCoherenceLayer for contradiction detection - CoherenceConfidence for interpretable uncertainty Tests: 202 passing in ruvector-attention, 180+ in prime-radiant Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(prime-radiant): add GPU acceleration, SIMD optimizations, and benchmarks GPU Acceleration (wgpu-rs): - GpuCoherenceEngine with automatic CPU fallback - GpuDevice: adapter/device management with high-perf selection - GpuDispatcher: kernel execution with pipeline caching and buffer pooling - GpuBufferManager: typed buffer management with pooling - Compute kernels: residuals, energy reduction, sheaf attention, token routing WGSL Compute Shaders (6 files, 1,412 lines): - compute_residuals.wgsl: parallel edge residual computation - compute_energy.wgsl: two-phase parallel reduction - sheaf_attention.wgsl: energy-based attention weights A_ij = exp(-beta * E_ij) - token_routing.wgsl: branchless lane assignment - sparse_mask.wgsl: sparse attention mask generation - types.wgsl: shared GPU struct definitions SIMD Optimizations (wide crate): - Runtime CPU feature detection (AVX2, AVX-512, SSE4.2, NEON) - f32x8 vectorized operations - simd/vectors.rs: dot_product_simd, norm_squared_simd, subtract_simd - simd/matrix.rs: matmul_simd, matvec_simd, transpose_simd - simd/energy.rs: batch_residuals_simd, weighted_energy_sum_simd - 38 unit tests verifying SIMD correctness Benchmarks (criterion): - coherence_benchmarks.rs: core operations, graph scaling - simd_benchmarks.rs: SIMD vs naive comparisons - gpu_benchmarks.rs: CPU vs GPU performance Tests: - 18 GPU coherence tests (16 active, 2 perf ignored) - GPU-CPU consistency within 1% relative error - Error handling and fallback verification README improvements: - "What Prime-Radiant is NOT" section - Concrete numeric example with arithmetic - Flagship LLM hallucination refusal walkthrough - Infrastructure positioning Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf(prime-radiant): optimize SIMD and core computation patterns SIMD Optimizations: - Replace element-by-element load_f32x8 with try_into for direct memory copy - Fix redundant SIMD comparisons in lane assignment (compute masks once, use blend) - Apply across vectors.rs, matrix.rs, and energy.rs Core Computation Patterns: - Replace i % 4 modulo with chunks_exact() for proper auto-vectorization - Fix edge.rs: residual_norm_squared, residual_with_energy - Fix node.rs: norm_squared, dot product Graph API: - Add get_node_ref() for zero-copy node access via DashMap reference - Add with_node() closure API for efficient read-only operations Benchmark findings: - Incremental updates meet target (<100us): 59us actual - Linear O(n) scaling confirmed - Further SIMD/parallelization needed for <1us/edge target Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf(prime-radiant): add CSR sparse matrix, GPU buffer prealloc, thread-local scratch Performance optimizations for Prime-Radiant coherence engine: CSR Sparse Matrix (restriction.rs): - Full CsrMatrix struct with row_ptr, col_indices, values - COO to CSR conversion with from_coo() and from_coo_arrays() - Zero-allocation matvec_into() and matvec_add_into() - SIMD-friendly 4-element loop unrolling - 13 new tests covering all CSR operations GPU Buffer Pre-allocation (engine.rs, kernels.rs): - Pre-allocated params, energy_params, partial_sums, staging buffers - Zero per-frame allocations in compute_energy() - New create_bind_group_raw() methods for raw buffer references - CSR matrix support in convert_restriction_map() Thread-Local Scratch Buffers (edge.rs): - EdgeScratch struct with 3 reusable Vec<f32> buffers - thread_local! SCRATCH for zero-allocation hot paths - residual_norm_squared_no_alloc() and weighted_residual_energy_no_alloc() - 7 new tests for allocation-free energy computation WGSL Vec4 Optimization (compute_residuals.wgsl): - vec4-based processing loop with dot(r_vec, r_vec) - store_residuals flag in GpuParams struct - ~4x GPU throughput improvement README Updates: - Root README: 40 attention mechanisms, Prime-Radiant section, CGT Sheaf Attention - WASM README: CGT Sheaf Attention API documentation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * chore: SEO optimize package metadata for crates.io and npm - prime-radiant: Enhanced description, keywords, categories - ruvector-attention-wasm: Add version to path dep, SEO keywords - package.json: 23 keywords, better description, engines config Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * chore(hyperbolic-hnsw): SEO optimize for crates.io publish * chore(prime-radiant): add version numbers to path dependencies for crates.io publish * fix(prime-radiant): shorten keyword for crates.io compliance Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs(readme): add prime-radiant and ruvector-attention-wasm package references - Add prime-radiant to Quantum Coherence section (sheaf Laplacian AI safety) - Add ruvector-attention-wasm to npm WASM packages (Flash, MoE, Hyperbolic, CGT) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Reuven <cohen@ruv-mac-mini.local> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>		2026-01-22 21:27:27 -05:00
..
benches	feat(hyperbolic-hnsw): Add Poincaré ball embeddings with HNSW integration (#114 )	2026-01-15 10:58:08 -05:00
src	feat(hyperbolic-hnsw): Add Poincaré ball embeddings with HNSW integration (#114 )	2026-01-15 10:58:08 -05:00
tests	feat(hyperbolic-hnsw): Add Poincaré ball embeddings with HNSW integration (#114 )	2026-01-15 10:58:08 -05:00
Cargo.lock	feat(hyperbolic-hnsw): Add Poincaré ball embeddings with HNSW integration (#114 )	2026-01-15 10:58:08 -05:00
Cargo.toml	feat(prime-radiant): Universal Coherence Engine with Sheaf Laplacian AI Safety (#131 )	2026-01-22 21:27:27 -05:00
README.md	feat(hyperbolic-hnsw): Add Poincaré ball embeddings with HNSW integration (#114 )	2026-01-15 10:58:08 -05:00

README.md

ruvector-hyperbolic-hnsw

Hyperbolic (Poincaré ball) embeddings with HNSW integration for hierarchy-aware vector search.

Why Hyperbolic Space?

Hierarchies compress naturally in hyperbolic space. Taxonomies, catalogs, ICD trees, product facets, org charts, and long-tail tags all fit better than in Euclidean space, which means higher recall on deep leaves without blowing up memory or latency.

Key Features

Poincaré Ball Model: Store vectors in the Poincaré ball with clamp r < 1 − eps
HNSW Speed Trick: Prune with cheap tangent-space proxy, rank with true hyperbolic distance
Per-Shard Curvature: Different parts of the hierarchy can have different optimal curvatures
Dual-Space Index: Keep a synchronized Euclidean ANN for fallback and mutual-ranking fusion
Production Guardrails: Numerical stability, canary testing, hot curvature reload

Installation

Rust

[dependencies]
ruvector-hyperbolic-hnsw = "0.1.0"

WebAssembly

cd crates/ruvector-hyperbolic-hnsw-wasm
wasm-pack build --target web --release

TypeScript/JavaScript

import init, {
  HyperbolicIndex,
  poincareDistance,
  mobiusAdd,
  expMap,
  logMap
} from 'ruvector-hyperbolic-hnsw-wasm';

await init();

const index = new HyperbolicIndex(16, 1.0);
index.insert(new Float32Array([0.1, 0.2, 0.3]));
const results = index.search(new Float32Array([0.15, 0.1, 0.2]), 5);

Quick Start

use ruvector_hyperbolic_hnsw::{HyperbolicHnsw, HyperbolicHnswConfig};

// Create index with default settings
let mut index = HyperbolicHnsw::default_config();

// Insert vectors (automatically projected to Poincaré ball)
index.insert(vec![0.1, 0.2, 0.3]).unwrap();
index.insert(vec![-0.1, 0.15, 0.25]).unwrap();
index.insert(vec![0.2, -0.1, 0.1]).unwrap();

// Search for nearest neighbors
let results = index.search(&[0.15, 0.1, 0.2], 2).unwrap();
for r in results {
    println!("ID: {}, Distance: {:.4}", r.id, r.distance);
}

HNSW Speed Trick

The core optimization:

Precompute u = log_c(x) at a shard centroid c
During neighbor selection, use Euclidean ||u_q - u_p|| to prune
Run exact Poincaré distance only on top N candidates before final ranking

use ruvector_hyperbolic_hnsw::{HyperbolicHnsw, HyperbolicHnswConfig};

let mut config = HyperbolicHnswConfig::default();
config.use_tangent_pruning = true;
config.prune_factor = 10; // Consider 10x candidates in tangent space

let mut index = HyperbolicHnsw::new(config);

// ... insert vectors ...

// Build tangent cache for pruning optimization
index.build_tangent_cache().unwrap();

// Search with pruning (faster!)
let results = index.search_with_pruning(&[0.1, 0.15], 5).unwrap();

Core Mathematical Operations

use ruvector_hyperbolic_hnsw::poincare::{
    mobius_add, exp_map, log_map, poincare_distance, project_to_ball
};

let x = vec![0.3, 0.2];
let y = vec![-0.1, 0.4];
let c = 1.0; // Curvature

// Möbius addition (hyperbolic vector addition)
let z = mobius_add(&x, &y, c);

// Geodesic distance in hyperbolic space
let d = poincare_distance(&x, &y, c);

// Map to tangent space at x
let v = log_map(&y, &x, c);

// Map back to manifold
let y_recovered = exp_map(&v, &x, c);

Sharded Index with Per-Shard Curvature

use ruvector_hyperbolic_hnsw::{ShardedHyperbolicHnsw, ShardStrategy};

let mut manager = ShardedHyperbolicHnsw::new(1.0);

// Insert with hierarchy depth information
manager.insert(vec![0.1, 0.2], Some(0)).unwrap(); // Root level
manager.insert(vec![0.3, 0.1], Some(3)).unwrap(); // Deeper level

// Update curvature for specific shard
manager.update_curvature("radius_1", 0.5).unwrap();

// Canary testing for new curvature
manager.registry.set_canary("radius_1", 0.3, 10); // 10% traffic

// Search across all shards
let results = manager.search(&[0.2, 0.15], 5).unwrap();

Numerical Stability

All operations include numerical safeguards:

Norm clamping: Points projected with eps = 1e-5
Projection after updates: All operations keep points inside the ball
Stable acosh: Uses log1p expansions for safety
Clamp arguments: arctanh and atanh arguments bounded away from ±1

Evaluation Protocol

Datasets

WordNet
DBpedia slices
Synthetic scale-free tree
Domain taxonomy

Primary Metrics

recall@k (1, 5, 10)
Mean rank
NDCG

Hierarchy Metrics

Radius vs depth Spearman correlation
Distance distortion
Ancestor AUPRC

Baselines

Euclidean HNSW
OPQ/PQ compressed
Simple mutual-ranking fusion

Ablations

Tangent proxy vs full hyperbolic
Fixed vs learnable curvature c
Global vs shard centroids

Production Integration

Reflex Loop (on writes)

Small Möbius deltas and tangent-space micro updates that never push points outside the ball.

use ruvector_hyperbolic_hnsw::tangent_micro_update;

let updated = tangent_micro_update(
    &point,
    &delta,
    &centroid,
    curvature,
    0.1  // max step size
);

Habit (nightly)

Riemannian SGD passes to clean neighborhoods and optionally relearn per-shard curvature. Run canary first.

Structural (periodic)

Rebuild of HNSW with true hyperbolic metric, curvature retune, and shard reshuffle if hierarchy preservation drops below SLO.

Dependencies (Exact Versions)

nalgebra = "0.34.1"
ndarray = "0.17.1"
wasm-bindgen = "0.2.106"

Benchmarks

cd crates/ruvector-hyperbolic-hnsw
cargo bench

Benchmark suite includes:

Poincaré distance computation
Möbius addition
exp/log map operations
HNSW insert and search
Tangent cache building
Search with vs without pruning

License

MIT

ruvector-attention - Hyperbolic attention mechanisms
micro-hnsw-wasm - Minimal HNSW for WASM
ruvector-math - General math primitives

README.md Unescape Escape