From 9e6f87641baca0dc7eb6ea762a0fd8cd52163922 Mon Sep 17 00:00:00 2001
From: rUv
Date: Tue, 2 Dec 2025 17:43:33 +0000
Subject: [PATCH] docs: Add brief introductions to attention mechanism sections
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Added one-line descriptions before each table:
- Core: Standard attention for sequence modeling
- Graph: Attention for graph-structured data and GNNs
- Specialized: Task-specific variants for efficiency
- Hyperbolic: Curved space for hierarchies
- Async: High-throughput inference utilities

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 README.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c21bb938..fd9db2ce 100644
--- a/README.md
+++ b/README.md
@@ -91,6 +91,8 @@ High-performance attention mechanisms for transformers, graph neural networks, a
 
 #### Core Attention Mechanisms
 
+Standard attention layers for sequence modeling and transformers.
+
 | Mechanism | Complexity | Memory | Best For |
 |-----------|------------|--------|----------|
 | **DotProductAttention** | O(n²) | O(n²) | Basic attention for small-medium sequences |
@@ -102,6 +104,8 @@ High-performance attention mechanisms for transformers, graph neural networks, a
 
 #### Graph Attention Mechanisms
 
+Attention layers designed for graph-structured data and GNNs.
+
 | Mechanism | Complexity | Best For |
 |-----------|------------|----------|
 | **GraphRoPeAttention** | O(n²) | Position-aware graph transformers |
@@ -111,6 +115,8 @@ High-performance attention mechanisms for transformers, graph neural networks, a
 
 #### Specialized Mechanisms
 
+Task-specific attention variants for efficiency and multi-modal learning.
+
 | Mechanism | Type | Best For |
 |-----------|------|----------|
 | **SparseAttention** | Efficiency | Long docs, low-memory inference |
@@ -120,7 +126,7 @@ High-performance attention mechanisms for transformers, graph neural networks, a
 
 #### Hyperbolic Math Functions
 
-Operations for Poincaré ball embeddings (curved space for hierarchies):
+Operations for Poincaré ball embeddings—curved space that naturally represents hierarchies.
 
 | Function | Description | Use Case |
 |----------|-------------|----------|
@@ -132,6 +138,8 @@ Operations for Poincaré ball embeddings (curved space for hierarchies):
 
 #### Async & Batch Operations
 
+Utilities for high-throughput inference and training optimization.
+
 | Operation | Description | Performance |
 |-----------|-------------|-------------|
 | `asyncBatchCompute()` | Process batches in parallel | 3-5x faster |