# RuVector-Postgres [![Crates.io](https://img.shields.io/crates/v/ruvector-postgres.svg)](https://crates.io/crates/ruvector-postgres) [![Documentation](https://docs.rs/ruvector-postgres/badge.svg)](https://docs.rs/ruvector-postgres) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT) [![PostgreSQL](https://img.shields.io/badge/PostgreSQL-14--17-blue.svg)](https://www.postgresql.org/) [![Docker](https://img.shields.io/badge/Docker-available-blue.svg)](https://hub.docker.com/r/ruvector/postgres) **The most advanced PostgreSQL vector database extension.** A drop-in pgvector replacement with 53+ SQL functions, SIMD acceleration, 39 attention mechanisms, GNN layers, hyperbolic embeddings, and self-learning capabilities. ## Why RuVector? | Feature | pgvector | RuVector-Postgres | |---------|----------|-------------------| | Vector Search | HNSW, IVFFlat | HNSW, IVFFlat (optimized) | | Distance Metrics | 3 | 8+ (including hyperbolic) | | **Attention Mechanisms** | - | **39 types** | | **Graph Neural Networks** | - | **GCN, GraphSAGE, GAT** | | **Hyperbolic Embeddings** | - | **Poincare, Lorentz** | | **Sparse Vectors / BM25** | Partial | **Full support** | | **Self-Learning** | - | **ReasoningBank** | | **Agent Routing** | - | **Tiny Dancer** | | **Graph/Cypher** | - | **Full support** | | AVX-512/NEON SIMD | Partial | **Full** | | Quantization | No | **Scalar, Product, Binary** | ## Installation ### Docker (Recommended) ```bash docker run -d --name ruvector-pg \ -e POSTGRES_PASSWORD=secret \ -p 5432:5432 \ ruvector/postgres:latest ``` ### From Source ```bash # Install pgrx cargo install cargo-pgrx --version "0.12.9" --locked cargo pgrx init --pg16 $(which pg_config) # Build and install cd crates/ruvector-postgres cargo pgrx install --release ``` ### CLI Tool ```bash npm install -g @ruvector/postgres-cli ruvector-pg -c "postgresql://localhost:5432/mydb" install ``` ## Quick Start ```sql -- Create the extension CREATE EXTENSION ruvector; -- Create a table with vector column CREATE TABLE documents ( id SERIAL PRIMARY KEY, content TEXT, embedding ruvector(1536) ); -- Create an HNSW index CREATE INDEX ON documents USING ruhnsw (embedding ruvector_l2_ops); -- Find similar documents SELECT content, embedding <-> '[0.15, 0.25, ...]'::ruvector AS distance FROM documents ORDER BY distance LIMIT 10; ``` ## 53+ SQL Functions RuVector exposes all advanced AI capabilities as native PostgreSQL functions. ### Core Vector Operations ```sql -- Distance metrics SELECT ruvector_cosine_distance(a, b); SELECT ruvector_l2_distance(a, b); SELECT ruvector_inner_product(a, b); SELECT ruvector_manhattan_distance(a, b); -- Vector operations SELECT ruvector_normalize(embedding); SELECT ruvector_add(a, b); SELECT ruvector_scalar_mul(embedding, 2.0); ``` ### Hyperbolic Geometry (8 functions) Perfect for hierarchical data like taxonomies, knowledge graphs, and org charts. ```sql -- Poincare ball model SELECT ruvector_poincare_distance(a, b, -1.0); -- curvature -1 -- Lorentz hyperboloid model SELECT ruvector_lorentz_distance(a, b, -1.0); -- Hyperbolic operations SELECT ruvector_mobius_add(a, b, -1.0); -- Hyperbolic translation SELECT ruvector_exp_map(base, tangent, -1.0); -- Tangent to manifold SELECT ruvector_log_map(base, target, -1.0); -- Manifold to tangent -- Model conversion SELECT ruvector_poincare_to_lorentz(poincare_vec, -1.0); SELECT ruvector_lorentz_to_poincare(lorentz_vec, -1.0); -- Minkowski inner product SELECT ruvector_minkowski_dot(a, b); ``` ### Sparse Vectors & BM25 (14 functions) Full sparse vector support with text scoring. ```sql -- Create sparse vectors SELECT ruvector_sparse_create(ARRAY[0, 5, 10], ARRAY[0.5, 0.3, 0.2], 100); SELECT ruvector_sparse_from_dense(dense_vector, 0.01); -- threshold -- Sparse operations SELECT ruvector_sparse_dot(a, b); SELECT ruvector_sparse_cosine(a, b); SELECT ruvector_sparse_l2_distance(a, b); SELECT ruvector_sparse_add(a, b); SELECT ruvector_sparse_scale(vec, 2.0); SELECT ruvector_sparse_normalize(vec); SELECT ruvector_sparse_topk(vec, 10); -- Top-k elements -- Text scoring SELECT ruvector_bm25_score(query_terms, doc_freqs, doc_len, avg_doc_len, total_docs); SELECT ruvector_tf_idf(term_freq, doc_freq, total_docs); ``` ### 39 Attention Mechanisms Full transformer-style attention in PostgreSQL. ```sql -- Scaled dot-product attention SELECT ruvector_attention_scaled_dot(query, keys, values); -- Multi-head attention SELECT ruvector_attention_multi_head(query, keys, values, num_heads); -- Flash attention (memory efficient) SELECT ruvector_attention_flash(query, keys, values, block_size); -- Sparse attention patterns SELECT ruvector_attention_sparse(query, keys, values, sparsity_pattern); -- Linear attention (O(n) complexity) SELECT ruvector_attention_linear(query, keys, values); -- Causal/masked attention SELECT ruvector_attention_causal(query, keys, values); -- Cross attention SELECT ruvector_attention_cross(query, context_keys, context_values); -- Self attention SELECT ruvector_attention_self(input, num_heads); ``` ### Graph Neural Networks (5 functions) GNN layers for graph-structured data. ```sql -- GCN (Graph Convolutional Network) SELECT ruvector_gnn_gcn_layer(features, adjacency, weights); -- GraphSAGE (inductive learning) SELECT ruvector_gnn_graphsage_layer(features, neighbor_features, weights); -- GAT (Graph Attention Network) SELECT ruvector_gnn_gat_layer(features, adjacency, attention_weights); -- Message passing SELECT ruvector_gnn_message_pass(node_features, edge_index, edge_weights); -- Aggregation SELECT ruvector_gnn_aggregate(messages, aggregation_type); -- mean, max, sum ``` ### Agent Routing - Tiny Dancer (11 functions) Intelligent query routing to specialized AI agents. ```sql -- Route query to best agent SELECT ruvector_route_query(query_embedding, agent_registry); -- Route with context SELECT ruvector_route_with_context(query, context, agents); -- Multi-agent routing SELECT ruvector_multi_agent_route(query, agents, top_k); -- Agent management SELECT ruvector_register_agent(name, capabilities, embedding); SELECT ruvector_update_agent_performance(agent_id, metrics); SELECT ruvector_get_routing_stats(); -- Affinity calculation SELECT ruvector_calculate_agent_affinity(query, agent); SELECT ruvector_select_best_agent(query, agent_list); -- Adaptive routing SELECT ruvector_adaptive_route(query, context, learning_rate); -- FastGRNN acceleration SELECT ruvector_fastgrnn_forward(input, hidden, weights); ``` ### Self-Learning / ReasoningBank (7 functions) Adaptive search parameter optimization. ```sql -- Record learning trajectory SELECT ruvector_record_trajectory(input, output, success, context); -- Get verdict on approach SELECT ruvector_get_verdict(trajectory_id); -- Memory distillation SELECT ruvector_distill_memory(trajectories, compression_ratio); -- Adaptive search SELECT ruvector_adaptive_search(query, context, ef_search); -- Learning feedback SELECT ruvector_learning_feedback(search_id, relevance_scores); -- Get learned patterns SELECT ruvector_get_learning_patterns(context); -- Optimize search parameters SELECT ruvector_optimize_search_params(query_type, historical_data); ``` ### Graph Storage & Cypher (8 functions) Graph operations with Cypher query support. ```sql -- Create graph elements SELECT ruvector_graph_create_node(labels, properties, embedding); SELECT ruvector_graph_create_edge(from_node, to_node, edge_type, properties); -- Graph queries SELECT ruvector_graph_get_neighbors(node_id, edge_type, depth); SELECT ruvector_graph_shortest_path(start_node, end_node); SELECT ruvector_graph_pagerank(edge_table, damping, iterations); -- Cypher queries SELECT ruvector_cypher_query('MATCH (n:Person)-[:KNOWS]->(m) RETURN n, m'); -- Traversal SELECT ruvector_graph_traverse(start_node, direction, max_depth); -- Similarity search on graph SELECT ruvector_graph_similarity_search(query_embedding, node_type, top_k); ``` ## Vector Types ### `ruvector(n)` - Dense Vector ```sql CREATE TABLE items (embedding ruvector(1536)); -- Storage: 8 + (4 x dimensions) bytes ``` ### `halfvec(n)` - Half-Precision Vector ```sql CREATE TABLE items (embedding halfvec(1536)); -- Storage: 8 + (2 x dimensions) bytes (50% savings) ``` ### `sparsevec(n)` - Sparse Vector ```sql CREATE TABLE items (embedding sparsevec(50000)); INSERT INTO items VALUES ('{1:0.5, 100:0.8, 5000:0.3}/50000'); -- Storage: 12 + (8 x non_zero_elements) bytes ``` ## Distance Operators | Operator | Distance | Use Case | |----------|----------|----------| | `<->` | L2 (Euclidean) | General similarity | | `<=>` | Cosine | Text embeddings | | `<#>` | Inner Product | Normalized vectors | | `<+>` | Manhattan (L1) | Sparse features | ## Index Types ### HNSW (Hierarchical Navigable Small World) ```sql CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops) WITH (m = 16, ef_construction = 64); SET ruvector.ef_search = 100; -- Tune search quality ``` ### IVFFlat (Inverted File Flat) ```sql CREATE INDEX ON items USING ruivfflat (embedding ruvector_l2_ops) WITH (lists = 100); SET ruvector.ivfflat_probes = 10; -- Tune search quality ``` ## Performance Benchmarks *AMD EPYC 7763 (64 cores), 256GB RAM:* | Operation | 10K vectors | 100K vectors | 1M vectors | |-----------|-------------|--------------|------------| | HNSW Build | 0.8s | 8.2s | 95s | | HNSW Search (top-10) | 0.3ms | 0.5ms | 1.2ms | | Cosine Distance | 0.01ms | 0.01ms | 0.01ms | | Poincare Distance | 0.02ms | 0.02ms | 0.02ms | | GCN Forward | 2.1ms | 18ms | 180ms | | BM25 Score | 0.05ms | 0.08ms | 0.15ms | *Single distance calculation (1536 dimensions):* | Metric | AVX2 Time | Speedup vs Scalar | |--------|-----------|-------------------| | L2 (Euclidean) | 38 ns | 3.7x | | Cosine | 51 ns | 3.7x | | Inner Product | 36 ns | 3.7x | ## Use Cases ### Semantic Search with RAG ```sql SELECT content, embedding <=> $query_embedding AS similarity FROM documents WHERE category = 'technical' ORDER BY similarity LIMIT 5; ``` ### Knowledge Graph with Hierarchical Embeddings ```sql -- Use hyperbolic embeddings for taxonomy SELECT name, ruvector_poincare_distance(embedding, $query, -1.0) AS distance FROM taxonomy_nodes ORDER BY distance LIMIT 10; ``` ### Hybrid Search (Vector + BM25) ```sql SELECT content, 0.7 * (1.0 / (1.0 + embedding <-> $query_vector)) + 0.3 * ruvector_bm25_score(terms, doc_freqs, length, avg_len, total) AS score FROM documents ORDER BY score DESC LIMIT 10; ``` ### Multi-Agent Query Routing ```sql SELECT ruvector_route_query( $user_query_embedding, (SELECT array_agg(row(name, capabilities)) FROM agents) ) AS best_agent; ``` ### Graph Neural Network Inference ```sql SELECT ruvector_gnn_gcn_layer( node_features, adjacency_matrix, trained_weights ) AS updated_features FROM graph_nodes; ``` ## CLI Tool Install the CLI for easy management: ```bash npm install -g @ruvector/postgres-cli # Commands ruvector-pg install # Install extension ruvector-pg vector create table --dim 384 --index hnsw ruvector-pg hyperbolic poincare-distance --a "[0.1,0.2]" --b "[0.3,0.4]" ruvector-pg gnn gcn --features "[[...]]" --adj "[[...]]" ruvector-pg graph query "MATCH (n) RETURN n" ruvector-pg routing route --query "[...]" --agents agents.json ruvector-pg learning adaptive-search --context "[...]" ruvector-pg bench run --type all --size 10000 ``` ## Related Packages - [`@ruvector/postgres-cli`](https://www.npmjs.com/package/@ruvector/postgres-cli) - CLI for RuVector PostgreSQL - [`ruvector-core`](https://crates.io/crates/ruvector-core) - Core vector operations library - [`ruvector-tiny-dancer`](https://crates.io/crates/ruvector-tiny-dancer) - Agent routing library ## Documentation | Document | Description | |----------|-------------| | [docs/API.md](docs/API.md) | Complete SQL API reference | | [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) | System architecture | | [docs/SIMD_OPTIMIZATION.md](docs/SIMD_OPTIMIZATION.md) | SIMD details | | [docs/guides/ATTENTION_QUICK_REFERENCE.md](docs/guides/ATTENTION_QUICK_REFERENCE.md) | Attention mechanisms | | [docs/GNN_QUICK_REFERENCE.md](docs/GNN_QUICK_REFERENCE.md) | GNN layers | | [docs/ROUTING_QUICK_REFERENCE.md](docs/ROUTING_QUICK_REFERENCE.md) | Tiny Dancer routing | | [docs/LEARNING_MODULE_README.md](docs/LEARNING_MODULE_README.md) | ReasoningBank | ## Requirements - PostgreSQL 14, 15, 16, or 17 - x86_64 (AVX2/AVX-512) or ARM64 (NEON) - Linux, macOS, or Windows (WSL) ## License MIT License - See [LICENSE](../../LICENSE) ## Contributing Contributions welcome! See [CONTRIBUTING.md](../../CONTRIBUTING.md)