mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-06-01 23:00:37 +00:00
Graph Operations & Cypher Module
This module provides graph database capabilities for the ruvector-postgres extension, including graph storage, traversal algorithms, and Cypher query support.
Features
- Concurrent Graph Storage: Thread-safe graph storage using DashMap
- Node & Edge Management: Full-featured node and edge storage with properties
- Label Indexing: Fast node lookups by label
- Adjacency Lists: Efficient edge traversal with O(1) access to a node's adjacency list (O(d) to iterate its d neighbors)
- Graph Traversal: BFS, DFS, and Dijkstra's shortest path algorithms
- Cypher Support: Simplified Cypher query language for graph operations
- PostgreSQL Integration: Native pgrx-based PostgreSQL functions
Architecture
Storage Layer (storage.rs)
// Node with labels and properties
pub struct Node {
pub id: u64,
pub labels: Vec<String>,
pub properties: HashMap<String, JsonValue>,
}
// Edge with type and properties
pub struct Edge {
pub id: u64,
pub source: u64,
pub target: u64,
pub edge_type: String,
pub properties: HashMap<String, JsonValue>,
}
// Concurrent storage with indexing
pub struct GraphStore {
pub nodes: NodeStore, // DashMap-based
pub edges: EdgeStore, // DashMap-based
}
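As a rough illustration of how an adjacency index over the `Edge` struct can make neighbor lookups cheap, the sketch below uses only the standard library. `AdjacencyIndex`, `add_edge`, and `neighbors` are hypothetical names, edge properties are omitted, and a plain `HashMap` stands in for the DashMap-based stores; the module's actual `EdgeStore` may be organized differently.

```rust
use std::collections::HashMap;

/// Simplified edge mirroring the fields shown above (properties omitted).
#[derive(Debug, Clone)]
pub struct Edge {
    pub id: u64,
    pub source: u64,
    pub target: u64,
    pub edge_type: String,
}

/// Hypothetical single-threaded adjacency index: node id -> outgoing edge ids.
#[derive(Default)]
pub struct AdjacencyIndex {
    edges: HashMap<u64, Edge>,
    outgoing: HashMap<u64, Vec<u64>>,
    next_id: u64,
}

impl AdjacencyIndex {
    /// Stores the edge and records its id under the source node.
    pub fn add_edge(&mut self, source: u64, target: u64, edge_type: &str) -> u64 {
        self.next_id += 1;
        let id = self.next_id;
        let edge = Edge { id, source, target, edge_type: edge_type.to_string() };
        self.edges.insert(id, edge);
        self.outgoing.entry(source).or_default().push(id);
        id
    }

    /// Targets of all outgoing edges: one hash lookup, then O(d) iteration.
    pub fn neighbors(&self, node: u64) -> Vec<u64> {
        self.outgoing
            .get(&node)
            .map(|ids| ids.iter().map(|id| self.edges[id].target).collect())
            .unwrap_or_default()
    }
}
```

Keeping edge ids (rather than full edges) in the adjacency list means each edge is stored once, at the cost of one extra lookup per neighbor.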
Traversal Layer (traversal.rs)
Implements common graph algorithms:
- BFS: Breadth-first search for shortest path by hop count
- DFS: Depth-first search with visitor pattern
- Dijkstra: Weighted shortest path with custom edge weights
- All Paths: Find multiple paths between nodes
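The hop-count BFS listed above can be sketched as follows. This is a minimal standalone version over a plain adjacency map, not the code in traversal.rs; the function name `bfs_shortest_path` and the `max_hops` cutoff semantics (no expansion beyond that many edges) are assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first search over an adjacency map (node id -> neighbor ids).
/// Returns the node sequence of a fewest-hops path, or None if `end` is
/// not reachable within `max_hops` edges.
pub fn bfs_shortest_path(
    adjacency: &HashMap<u64, Vec<u64>>,
    start: u64,
    end: u64,
    max_hops: usize,
) -> Option<Vec<u64>> {
    let mut parent: HashMap<u64, u64> = HashMap::new();
    let mut visited: HashSet<u64> = HashSet::from([start]);
    let mut queue: VecDeque<(u64, usize)> = VecDeque::from([(start, 0)]);

    while let Some((node, hops)) = queue.pop_front() {
        if node == end {
            // Walk parent links back to the start, then reverse.
            let mut path = vec![end];
            let mut current = end;
            while let Some(&prev) = parent.get(&current) {
                path.push(prev);
                current = prev;
            }
            path.reverse();
            return Some(path);
        }
        if hops == max_hops {
            continue; // do not expand beyond the hop limit
        }
        for &next in adjacency.get(&node).into_iter().flatten() {
            // `insert` returns true only the first time a node is seen.
            if visited.insert(next) {
                parent.insert(next, node);
                queue.push_back((next, hops + 1));
            }
        }
    }
    None
}
```

Because BFS visits nodes in order of hop distance, the first time `end` is dequeued the recorded parent chain is a fewest-hops path.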
Cypher Layer (cypher/)
Simplified Cypher query language support:
- AST (ast.rs): Complete abstract syntax tree for Cypher
- Parser (parser.rs): Basic parser for common Cypher patterns
- Executor (executor.rs): Query execution engine
Supported Cypher clauses:
- CREATE: Create nodes and relationships
- MATCH: Pattern matching
- WHERE: Filtering
- RETURN: Result projection
- SET, DELETE, WITH: Basic support
PostgreSQL Functions
Graph Management
-- Create a new graph
SELECT ruvector_create_graph('my_graph');
-- List all graphs
SELECT ruvector_list_graphs();
-- Delete a graph
SELECT ruvector_delete_graph('my_graph');
-- Get graph statistics
SELECT ruvector_graph_stats('my_graph');
-- Returns: {"name": "my_graph", "node_count": 100, "edge_count": 250, ...}
Node Operations
-- Add a node
SELECT ruvector_add_node(
'my_graph',
ARRAY['Person', 'Employee'], -- Labels
'{"name": "Alice", "age": 30, "department": "Engineering"}'::jsonb
);
-- Returns: node_id (bigint)
-- Get a node by ID
SELECT ruvector_get_node('my_graph', 1);
-- Returns: {"id": 1, "labels": ["Person"], "properties": {...}}
-- Find nodes by label
SELECT ruvector_find_nodes_by_label('my_graph', 'Person');
-- Returns: array of nodes
Edge Operations
-- Add an edge
SELECT ruvector_add_edge(
'my_graph',
1, -- source_id
2, -- target_id
'KNOWS', -- edge_type
'{"since": 2020, "weight": 0.8}'::jsonb
);
-- Returns: edge_id (bigint)
-- Get an edge by ID
SELECT ruvector_get_edge('my_graph', 1);
-- Get neighbors of a node
SELECT ruvector_get_neighbors('my_graph', 1);
-- Returns: array of node IDs
Graph Traversal
-- Find shortest path (unweighted)
SELECT ruvector_shortest_path(
'my_graph',
1, -- start_id
10, -- end_id
5 -- max_hops
);
-- Returns: {"nodes": [1, 3, 7, 10], "edges": [12, 45, 89], "length": 4, "cost": 0}
-- Find weighted shortest path
SELECT ruvector_shortest_path_weighted(
'my_graph',
1, -- start_id
10, -- end_id
'weight' -- property name for edge weights
);
-- Returns: {"nodes": [...], "edges": [...], "length": 4, "cost": 2.5}
Cypher Queries
-- Create nodes
SELECT ruvector_cypher(
'my_graph',
'CREATE (n:Person {name: ''Alice'', age: 30}) RETURN n',
NULL
);
-- Match and filter
SELECT ruvector_cypher(
'my_graph',
'MATCH (n:Person) WHERE n.age > 25 RETURN n.name, n.age',
NULL
);
-- Parameterized queries
SELECT ruvector_cypher(
'my_graph',
'MATCH (n:Person) WHERE n.name = $name RETURN n',
'{"name": "Alice"}'::jsonb
);
-- Create relationships
SELECT ruvector_cypher(
'my_graph',
'CREATE (a:Person {name: ''Alice''})-[:KNOWS {since: 2020}]->(b:Person {name: ''Bob''}) RETURN a, b',
NULL
);
Usage Examples
Social Network
-- Create graph
SELECT ruvector_create_graph('social_network');
-- Add users
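-- (a plain SELECT is used: PostgreSQL does not evaluate a SELECT-only CTE
-- that the main query never references)
-- The edges below assume the nodes receive IDs 1-4 in this order
SELECT ruvector_add_node('social_network', ARRAY['Person'],
jsonb_build_object('name', name, 'age', age))
FROM (VALUES
('Alice', 30),
('Bob', 25),
('Charlie', 35),
('Diana', 28)
) AS t(name, age);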
-- Create friendships
SELECT ruvector_add_edge('social_network', 1, 2, 'FRIENDS',
'{"since": "2020-01-15"}'::jsonb);
SELECT ruvector_add_edge('social_network', 2, 3, 'FRIENDS',
'{"since": "2019-06-20"}'::jsonb);
SELECT ruvector_add_edge('social_network', 1, 4, 'FRIENDS',
'{"since": "2021-03-10"}'::jsonb);
-- Find connection between Alice and Charlie
SELECT ruvector_shortest_path('social_network', 1, 3, 10);
-- Cypher: Find all friends of friends
SELECT ruvector_cypher(
'social_network',
'MATCH (a:Person)-[:FRIENDS]->(b:Person)-[:FRIENDS]->(c:Person)
WHERE a.name = ''Alice'' RETURN c.name',
NULL
);
Knowledge Graph
-- Create knowledge graph
SELECT ruvector_create_graph('knowledge');
-- Add concepts
SELECT ruvector_add_node('knowledge', ARRAY['Concept'],
'{"name": "Machine Learning", "category": "AI"}'::jsonb);
SELECT ruvector_add_node('knowledge', ARRAY['Concept'],
'{"name": "Neural Networks", "category": "AI"}'::jsonb);
SELECT ruvector_add_node('knowledge', ARRAY['Concept'],
'{"name": "Deep Learning", "category": "AI"}'::jsonb);
-- Create relationships
SELECT ruvector_add_edge('knowledge', 1, 2, 'INCLUDES',
'{"strength": 0.9}'::jsonb);
SELECT ruvector_add_edge('knowledge', 2, 3, 'SPECIALIZES_IN',
'{"strength": 0.95}'::jsonb);
-- Find weighted path
SELECT ruvector_shortest_path_weighted('knowledge', 1, 3, 'strength');
Recommendation System
-- Create graph
SELECT ruvector_create_graph('recommendations');
-- Add users and items
SELECT ruvector_cypher('recommendations',
'CREATE (u:User {name: ''Alice''})
CREATE (m1:Movie {title: ''Inception''})
CREATE (m2:Movie {title: ''Interstellar''})
CREATE (u)-[:WATCHED {rating: 5}]->(m1)
CREATE (u)-[:WATCHED {rating: 4}]->(m2)
RETURN u, m1, m2',
NULL
);
-- Find users who watched the same movies as Alice
-- (COUNT and ORDER BY are avoided because the current parser does not support them; see Limitations)
SELECT ruvector_cypher('recommendations',
'MATCH (u1:User)-[:WATCHED]->(m:Movie)<-[:WATCHED]-(u2:User)
WHERE u1.name = ''Alice''
RETURN u2.name, m.title',
NULL
);
Performance Characteristics
Storage
- Node Lookup: O(1) by ID, O(k) by label (k = nodes with label)
- Edge Lookup: O(1) by ID, O(d) for neighbors (d = degree)
- Concurrent Access: Sharded read-write locks (via DashMap) allow concurrent reads and limit write contention to a single shard
Traversal
- BFS: O(V + E) time, O(V) space
- DFS: O(V + E) time, O(h) stack space (h = max depth), plus O(V) for the visited set when one is kept
- Dijkstra: O((V + E) log V) time with binary heap
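The binary-heap bound for Dijkstra comes from pushing at most one heap entry per edge relaxation and popping entries in cost order. A minimal sketch, assuming non-negative `f64` edge weights and a plain adjacency map, is shown here. `State` and `dijkstra_cost` are hypothetical names, and the sketch returns only the total cost, not the path that the SQL function reports.

```rust
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap};

/// Heap entry ordered so that the smallest cost is popped first.
#[derive(Debug, PartialEq)]
struct State {
    cost: f64,
    node: u64,
}

impl Eq for State {}

impl Ord for State {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reverse the comparison because BinaryHeap is a max-heap.
        other.cost.total_cmp(&self.cost).then_with(|| other.node.cmp(&self.node))
    }
}

impl PartialOrd for State {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Minimum total cost from `start` to `end`, or None if unreachable.
/// `adjacency` maps a node id to (neighbor id, non-negative weight) pairs.
pub fn dijkstra_cost(
    adjacency: &HashMap<u64, Vec<(u64, f64)>>,
    start: u64,
    end: u64,
) -> Option<f64> {
    let mut best: HashMap<u64, f64> = HashMap::from([(start, 0.0)]);
    let mut heap = BinaryHeap::from([State { cost: 0.0, node: start }]);

    while let Some(State { cost, node }) = heap.pop() {
        if node == end {
            return Some(cost);
        }
        // Skip stale entries superseded by a cheaper path found later.
        if cost > *best.get(&node).unwrap_or(&f64::INFINITY) {
            continue;
        }
        for &(next, weight) in adjacency.get(&node).into_iter().flatten() {
            let candidate = cost + weight;
            if candidate < *best.get(&next).unwrap_or(&f64::INFINITY) {
                best.insert(next, candidate);
                heap.push(State { cost: candidate, node: next });
            }
        }
    }
    None
}
```

`f64` does not implement `Ord`, so the sketch orders heap entries with `total_cmp`; an integer weight type would avoid the wrapper entirely.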
Scalability
- Thread-safe concurrent operations
- Memory-efficient adjacency lists
- Label and type indexing for fast filtering
Implementation Details
Concurrent Storage
Uses DashMap, a sharded concurrent hash map, for thread-safe access without a global lock:
pub struct NodeStore {
nodes: DashMap<u64, Node>,
label_index: DashMap<String, HashSet<u64>>,
next_id: AtomicU64,
}
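To show how the three fields cooperate on insert and label lookup, here is a standard-library-only sketch. It is not the module's implementation: `NodeStoreSketch` and its methods are hypothetical, `RwLock<HashMap<..>>` stands in for `DashMap`, node properties are omitted, and starting ids at 1 is an assumption taken from the SQL examples.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

/// Std-only stand-in for the DashMap-backed store.
pub struct NodeStoreSketch {
    nodes: RwLock<HashMap<u64, Vec<String>>>, // node id -> labels
    label_index: RwLock<HashMap<String, HashSet<u64>>>,
    next_id: AtomicU64,
}

impl NodeStoreSketch {
    pub fn new() -> Self {
        Self {
            nodes: RwLock::new(HashMap::new()),
            label_index: RwLock::new(HashMap::new()),
            next_id: AtomicU64::new(1), // assumption: ids start at 1
        }
    }

    /// Allocates an id atomically, updates the label index, stores the node.
    pub fn insert(&self, labels: &[&str]) -> u64 {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let owned: Vec<String> = labels.iter().map(|l| l.to_string()).collect();
        {
            let mut index = self.label_index.write().unwrap();
            for label in &owned {
                index.entry(label.clone()).or_default().insert(id);
            }
        }
        self.nodes.write().unwrap().insert(id, owned);
        id
    }

    /// Ids of nodes carrying `label`, sorted for stable output.
    pub fn find_by_label(&self, label: &str) -> Vec<u64> {
        let index = self.label_index.read().unwrap();
        let mut ids: Vec<u64> = index.get(label).into_iter().flatten().copied().collect();
        ids.sort_unstable();
        ids
    }

    /// Number of stored nodes.
    pub fn len(&self) -> usize {
        self.nodes.read().unwrap().len()
    }
}
```

The atomic counter lets concurrent inserts obtain distinct ids without taking a lock; only the map updates themselves are serialized.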
Graph Registry
Global registry for named graphs:
static GRAPH_REGISTRY: Lazy<DashMap<String, Arc<GraphStore>>> = ...
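The initializer is elided above, so the following is only a sketch of how such a lazily initialized registry could behave, using `std::sync::OnceLock` and `RwLock<HashMap<..>>` in place of `Lazy` and `DashMap`. The `GraphStore` placeholder and the helper names `get_or_create_graph` and `delete_graph` are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, OnceLock, RwLock};

/// Placeholder for the real graph store; only the name is kept here.
#[derive(Debug)]
pub struct GraphStore {
    pub name: String,
}

type Registry = RwLock<HashMap<String, Arc<GraphStore>>>;

/// Process-wide registry, created on first access.
fn registry() -> &'static Registry {
    static REGISTRY: OnceLock<Registry> = OnceLock::new();
    REGISTRY.get_or_init(|| RwLock::new(HashMap::new()))
}

/// Returns the graph registered under `name`, creating it on first use.
pub fn get_or_create_graph(name: &str) -> Arc<GraphStore> {
    let mut graphs = registry().write().unwrap();
    graphs
        .entry(name.to_string())
        .or_insert_with(|| Arc::new(GraphStore { name: name.to_string() }))
        .clone()
}

/// Removes a graph from the registry; returns true if it existed.
pub fn delete_graph(name: &str) -> bool {
    registry().write().unwrap().remove(name).is_some()
}
```

Handing out `Arc<GraphStore>` clones means a caller that already holds a graph can keep using it even after the name is removed from the registry.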
Cypher Parser
Basic recursive descent parser:
- Handles common patterns: (n:Label {prop: value})
- Relationship patterns: -[:TYPE]->, <-[:TYPE]-
- WHERE conditions, RETURN projections
- Property extraction and type inference
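To make the recursive descent idea concrete, the sketch below parses a single node pattern such as `(n:Label {prop: value})`, with one function per grammar rule. It is an independent illustration, not the code in parser.rs: the `Value`, `NodePattern`, and `Parser` types are hypothetical, only integer and single-quoted string values are handled, and trailing input after the closing parenthesis is ignored.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

#[derive(Debug, PartialEq)]
pub struct NodePattern {
    pub variable: String,
    pub label: Option<String>,
    pub properties: HashMap<String, Value>,
}

struct Parser<'a> {
    rest: &'a str, // unconsumed input
}

impl<'a> Parser<'a> {
    /// Consumes `token` (after optional whitespace) or reports an error.
    fn expect(&mut self, token: char) -> Result<(), String> {
        self.rest = self.rest.trim_start();
        match self.rest.strip_prefix(token) {
            Some(rest) => {
                self.rest = rest;
                Ok(())
            }
            None => Err(format!("expected '{token}' at \"{}\"", self.rest)),
        }
    }

    /// Consumes `token` if present and reports whether it did.
    fn eat(&mut self, token: char) -> bool {
        self.expect(token).is_ok()
    }

    fn identifier(&mut self) -> Result<String, String> {
        self.rest = self.rest.trim_start();
        let end = self
            .rest
            .find(|c: char| !(c.is_alphanumeric() || c == '_'))
            .unwrap_or(self.rest.len());
        if end == 0 {
            return Err(format!("expected identifier at \"{}\"", self.rest));
        }
        let (name, rest) = self.rest.split_at(end);
        self.rest = rest;
        Ok(name.to_string())
    }

    /// value := single-quoted string | integer (simple type inference)
    fn value(&mut self) -> Result<Value, String> {
        self.rest = self.rest.trim_start();
        if let Some(body) = self.rest.strip_prefix('\'') {
            let end = body.find('\'').ok_or("unterminated string")?;
            self.rest = &body[end + 1..];
            return Ok(Value::Str(body[..end].to_string()));
        }
        let end = self
            .rest
            .find(|c: char| !(c.is_ascii_digit() || c == '-'))
            .unwrap_or(self.rest.len());
        let (digits, rest) = self.rest.split_at(end);
        let n = digits.parse::<i64>().map_err(|e| e.to_string())?;
        self.rest = rest;
        Ok(Value::Int(n))
    }

    /// node := '(' ident (':' ident)? ('{' ident ':' value (',' ident ':' value)* '}')? ')'
    fn node(&mut self) -> Result<NodePattern, String> {
        self.expect('(')?;
        let variable = self.identifier()?;
        let label = if self.eat(':') { Some(self.identifier()?) } else { None };
        let mut properties = HashMap::new();
        if self.eat('{') {
            loop {
                let key = self.identifier()?;
                self.expect(':')?;
                properties.insert(key, self.value()?);
                if !self.eat(',') {
                    break;
                }
            }
            self.expect('}')?;
        }
        self.expect(')')?;
        Ok(NodePattern { variable, label, properties })
    }
}

pub fn parse_node_pattern(input: &str) -> Result<NodePattern, String> {
    Parser { rest: input }.node()
}
```

Relationship patterns and clauses would follow the same shape: one method per rule, each consuming input from `rest` and returning either an AST node or an error message.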
Limitations
Current Parser Limitations
The Cypher parser is simplified for demonstration:
- No support for complex WHERE conditions (AND/OR)
- Limited expression support (basic comparisons only)
- No aggregation functions (COUNT, SUM, etc.)
- No ORDER BY or GROUP BY clauses
- Basic pattern matching only
Production Recommendations
For production use, consider:
- Using a proper parser library (nom, pest, lalrpop)
- Adding comprehensive error messages
- Implementing full Cypher specification
- Query optimization and planning
- Transaction support
- Persistence layer
Testing
Comprehensive test suite included:
# Run all tests
cargo pgrx test
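# Run a specific test (cargo pgrx test takes the PostgreSQL version before the test name; pg16 shown as an example)
cargo pgrx test pg16 test_create_graph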
Test coverage:
- Node and edge CRUD operations
- Graph traversal algorithms
- Cypher query execution
- PostgreSQL function integration
- Concurrent access patterns
Future Enhancements
- Graph analytics (PageRank, community detection)
- Temporal graphs (time-aware edges)
- Property graph constraints
- Full-text search on properties
- Persistent storage backend
- Query optimization
- Distributed graph support
- GraphQL interface