ruvector/crates/ruvector-postgres/src/attention
rUv bc20fc99ef fix(postgres): clean up cfg attributes and unused imports
- Fix dual cfg attributes causing linker errors in test builds
- Remove unused EarlyExitDecision import from gated_transformer
- Update intelligence layer data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:32:24 +00:00
flash.rs fix(postgres): clean up cfg attributes and unused imports 2025-12-26 23:32:24 +00:00
mod.rs fix(postgres): clean up cfg attributes and unused imports 2025-12-26 23:32:24 +00:00
multi_head.rs fix(postgres): clean up cfg attributes and unused imports 2025-12-26 23:32:24 +00:00
operators.rs fix(ci): Fix test type mismatches and remove cargo test --lib 2025-12-26 22:11:59 +00:00
README.md feat(postgres): Add 53 SQL function definitions for all advanced modules (#46) 2025-12-02 22:49:29 -05:00
scaled_dot.rs fix(postgres): clean up cfg attributes and unused imports 2025-12-26 23:32:24 +00:00

Attention Mechanisms Module

High-performance attention implementations for PostgreSQL vector operations with SIMD acceleration.

Overview

This module provides production-ready attention mechanisms optimized for PostgreSQL:

  • Scaled Dot-Product Attention: Standard transformer attention with SIMD acceleration
  • Multi-Head Attention: Parallel head computation using Rayon
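  • Flash Attention v2: Memory-efficient tiled computation that avoids materializing the full N×N attention matrix, so extra memory grows linearly with sequence length N rather than quadratically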
  • PostgreSQL Integration: 6 SQL-callable functions for direct database usage
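
For orientation, the computation behind scaled dot-product attention can be sketched in plain Rust. This is a minimal illustration using only the standard library, not the module's actual implementation: the function names `softmax`, `attention_weights`, and `attend` are hypothetical, and the sketch assumes non-empty inputs of equal dimension.

```rust
/// Numerically stable softmax over a slice of scores.
fn softmax(scores: &[f32]) -> Vec<f32> {
    let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Computes softmax(q . k_i / sqrt(d)) for one query against several keys.
/// (Hypothetical helper; assumes every key has the same length as the query.)
fn attention_weights(query: &[f32], keys: &[&[f32]]) -> Vec<f32> {
    let scale = 1.0 / (query.len() as f32).sqrt();
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| query.iter().zip(k.iter()).map(|(a, b)| a * b).sum::<f32>() * scale)
        .collect();
    softmax(&scores)
}

/// Weighted sum of the value vectors, using the attention weights.
/// (Hypothetical helper; assumes `values` is non-empty and matches `keys` in length.)
fn attend(query: &[f32], keys: &[&[f32]], values: &[&[f32]]) -> Vec<f32> {
    let weights = attention_weights(query, keys);
    let mut out = vec![0.0; values[0].len()];
    for (w, v) in weights.iter().zip(values) {
        for (o, x) in out.iter_mut().zip(v.iter()) {
            *o += w * x;
        }
    }
    out
}
```

The weights always sum to 1, and a key that aligns more closely with the query receives a larger share of the output.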

Files

  • mod.rs: Module exports, AttentionType enum, Attention trait, softmax implementations
  • scaled_dot.rs: ScaledDotAttention with SIMD-accelerated dot products
  • multi_head.rs: MultiHeadAttention with parallel head processing
  • flash.rs: FlashAttention with memory-efficient tiled computation
  • operators.rs: PostgreSQL SQL functions

Quick Example

Rust

use ruvector_postgres::attention::{Attention, ScaledDotAttention};

let attention = ScaledDotAttention::new(64);
let query = vec![1.0; 64];
// Bind the key vectors first so the borrowed slices outlive this statement.
let key_a = vec![1.0; 64];
let key_b = vec![0.5; 64];
let keys = vec![&key_a[..], &key_b[..]];
let scores = attention.attention_scores(&query, &keys);

SQL

SELECT ruvector_attention_score(
    ARRAY[1.0, 0.0, 0.0]::float4[],
    ARRAY[1.0, 0.0, 0.0]::float4[],
    'scaled_dot'
);

Features

SIMD Acceleration

  • Leverages simsimd for vectorized operations
  • AVX-512/AVX2/NEON support
  • Automatic fallback to scalar
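
As a rough illustration of what vectorized dot products look like, the sketch below shows a scalar dot product arranged into fixed-width lanes, a shape compilers can often auto-vectorize. It does not call simsimd and is not the module's code: `dot_chunked` and the lane width of 8 are assumptions chosen for the example.

```rust
/// Dot product using 8 independent accumulators ("lanes").
/// Hypothetical helper: shows the data layout SIMD code relies on,
/// written in portable scalar Rust.
fn dot_chunked(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    const LANES: usize = 8;
    let mut acc = [0.0f32; LANES];

    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);
    // Elements that do not fill a whole lane group are handled separately.
    let a_tail = a_chunks.remainder();
    let b_tail = b_chunks.remainder();

    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }

    // Scalar pass over the leftover elements.
    let tail: f32 = a_tail.iter().zip(b_tail).map(|(x, y)| x * y).sum();
    acc.iter().sum::<f32>() + tail
}
```

Keeping separate accumulators removes the dependency between consecutive additions, which is what lets wide registers process several elements per instruction. The tail loop plays the role of the scalar fallback for lengths that are not a multiple of the lane width.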

Parallel Processing

  • Multi-head computation uses Rayon
  • Efficient work distribution
  • Scales with CPU cores
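
The idea of computing heads in parallel can be sketched with the standard library alone. The example below uses `std::thread::scope` as a stand-in for Rayon, and it omits the learned projection matrices that a full multi-head layer applies; `head_attention` and `multi_head` are hypothetical names, not the module's API.

```rust
use std::thread;

/// Attention for one head: operates on the `lo..hi` slice of each vector.
fn head_attention(
    q: &[f32],
    keys: &[Vec<f32>],
    values: &[Vec<f32>],
    lo: usize,
    hi: usize,
) -> Vec<f32> {
    let scale = 1.0 / ((hi - lo) as f32).sqrt();
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| q[lo..hi].iter().zip(&k[lo..hi]).map(|(a, b)| a * b).sum::<f32>() * scale)
        .collect();
    // Stable softmax: subtract the maximum before exponentiating.
    let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    let mut out = vec![0.0f32; hi - lo];
    for (e, v) in exps.iter().zip(values) {
        for (o, x) in out.iter_mut().zip(&v[lo..hi]) {
            *o += (e / sum) * x;
        }
    }
    out
}

/// Splits the embedding into `num_heads` equal slices, runs each head on its
/// own scoped thread, and concatenates the per-head outputs in head order.
fn multi_head(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>], num_heads: usize) -> Vec<f32> {
    assert!(
        num_heads > 0 && q.len() % num_heads == 0,
        "dimension must divide evenly into heads"
    );
    let head_dim = q.len() / num_heads;
    thread::scope(|s| {
        let handles: Vec<_> = (0..num_heads)
            .map(|h| {
                let (lo, hi) = (h * head_dim, (h + 1) * head_dim);
                s.spawn(move || head_attention(q, keys, values, lo, hi))
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("head thread panicked"))
            .collect::<Vec<f32>>()
    })
}
```

Heads share no mutable state, so each can run independently. A work-stealing pool such as Rayon's avoids the per-call thread spawn cost this sketch pays.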

Memory Efficiency

  • Flash Attention reduces bandwidth
  • In-place softmax operations
  • Tiled/blocked computation
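
Tiled computation can be illustrated for a single query as follows. This is a simplified sketch of the general Flash Attention idea, not the code in flash.rs: keys and values are visited one tile at a time, and a running maximum and denominator are rescaled whenever a new tile raises the maximum. The name `tiled_attention` and the `tile` parameter are assumptions for the example.

```rust
/// Attention output for one query, visiting keys/values in tiles of `tile` rows.
/// Only one tile of scores is held in memory at a time.
fn tiled_attention(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>], tile: usize) -> Vec<f32> {
    assert!(tile > 0 && !keys.is_empty() && keys.len() == values.len());
    let scale = 1.0 / (q.len() as f32).sqrt();
    let mut running_max = f32::NEG_INFINITY;
    let mut denom = 0.0f32;
    let mut acc = vec![0.0f32; values[0].len()];

    for (k_tile, v_tile) in keys.chunks(tile).zip(values.chunks(tile)) {
        // Scores for this tile only.
        let scores: Vec<f32> = k_tile
            .iter()
            .map(|k| q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();
        let tile_max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let new_max = running_max.max(tile_max);

        // Rescale everything accumulated so far to the new maximum.
        // On the first tile this factor is exp(-inf) = 0, and acc/denom are 0 anyway.
        let correction = (running_max - new_max).exp();
        denom *= correction;
        for o in acc.iter_mut() {
            *o *= correction;
        }

        for (s, v) in scores.iter().zip(v_tile) {
            let w = (s - new_max).exp();
            denom += w;
            for (o, x) in acc.iter_mut().zip(v) {
                *o += w * x;
            }
        }
        running_max = new_max;
    }
    acc.iter().map(|o| o / denom).collect()
}
```

The result matches ordinary softmax attention up to floating-point rounding, regardless of tile size. Only the scratch space changes: one tile of scores instead of a full row.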

Numerical Stability

  • Max subtraction in softmax
  • Overflow/underflow protection
  • Online softmax updates
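
To show why max subtraction matters, here is a minimal in-place softmax sketch. `softmax_in_place` is a hypothetical name and is not taken from mod.rs. Without the subtraction, `exp(1000.0)` overflows to infinity in `f32` and the normalized result becomes NaN.

```rust
/// In-place softmax with max subtraction.
/// After subtracting the maximum, every exponent is <= 0, so exp() stays in (0, 1].
fn softmax_in_place(xs: &mut [f32]) {
    if xs.is_empty() {
        return;
    }
    let max = xs.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let mut sum = 0.0f32;
    for x in xs.iter_mut() {
        *x = (*x - max).exp();
        sum += *x;
    }
    // The largest element contributes exp(0) = 1, so `sum` is at least 1.
    for x in xs.iter_mut() {
        *x /= sum;
    }
}
```

Because the result overwrites the input buffer, no second allocation is needed. Subtracting a constant from every score leaves the softmax output mathematically unchanged.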

SQL Functions

Function                          Purpose
ruvector_attention_score()        Single query-key attention score
ruvector_softmax()                Softmax activation
ruvector_multi_head_attention()   Multi-head attention forward pass
ruvector_flash_attention()        Flash Attention v2
ruvector_attention_scores()       Multiple attention scores
ruvector_attention_types()        List available types

Testing

# Unit tests
cargo test --lib attention

# PostgreSQL tests (requires pgrx setup)
cargo pgrx test pg16

# Integration tests
cargo test --test attention_integration_test

Performance

Operation    Seq Len (heads)   Time (μs)   Memory
scaled_dot   512               45          2 MB
multi_head   512 (8 heads)     38          2.5 MB
flash_v2     512 (8 heads)     38          0.5 MB
flash_v2     2048 (8 heads)    150         1 MB

Dependencies

  • pgrx: PostgreSQL extension framework
  • simsimd: SIMD acceleration
  • rayon: Parallel processing
  • serde: Serialization

Status

Production Ready

  • 1,716 lines of implementation code
  • 39 comprehensive tests
  • Full PostgreSQL integration
  • SIMD and parallel optimized