docs: Add brief introductions to attention mechanism sections

Added one-line descriptions before each table:
- Core: Standard attention for sequence modeling
- Graph: Attention for graph-structured data and GNNs
- Specialized: Task-specific variants for efficiency
- Hyperbolic: Curved space for hierarchies
- Async: High-throughput inference utilities

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
rUv 2025-12-02 17:43:33 +00:00
parent 4808901486
commit 9e6f87641b

@ -91,6 +91,8 @@ High-performance attention mechanisms for transformers, graph neural networks, a
#### Core Attention Mechanisms
Standard attention layers for sequence modeling and transformers.
| Mechanism | Complexity | Memory | Best For |
|-----------|------------|--------|----------|
| **DotProductAttention** | O(n²) | O(n²) | Basic attention for small-medium sequences |
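
As a rough illustration of where the O(n²) time and memory figures for basic dot-product attention come from, here is a minimal sketch in TypeScript. It is not the library's `DotProductAttention` implementation; the function names, the plain `number[][]` matrix type, and the absence of masking or multiple heads are all assumptions made for this example.

```typescript
// Sketch only: hypothetical helper names, not the library's API.
type Matrix = number[][];

// Softmax over one row, subtracting the max for numerical stability.
function softmax(row: number[]): number[] {
  const max = Math.max(...row);
  const exps = row.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((acc, x, i) => acc + x * b[i], 0);
}

// queries and keys: n x d, values: n x dv (assumed shapes).
function scaledDotProductAttention(queries: Matrix, keys: Matrix, values: Matrix): Matrix {
  const scale = 1 / Math.sqrt(keys[0].length);
  // The full n x n score matrix is materialized here: O(n²) time and memory.
  const scores = queries.map((q) => keys.map((k) => dot(q, k) * scale));
  const weights = scores.map((row) => softmax(row));
  // Each output row is the attention-weighted sum of the value rows.
  return weights.map((w) =>
    values[0].map((_, c) => w.reduce((acc, wj, j) => acc + wj * values[j][c], 0))
  );
}
```

With identical keys every query attends uniformly, so `scaledDotProductAttention([[1, 0], [0, 1]], [[1, 1], [1, 1]], [[2, 0], [0, 4]])` returns the mean of the value rows, `[1, 2]`, for both queries.
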
@ -102,6 +104,8 @@ High-performance attention mechanisms for transformers, graph neural networks, a
#### Graph Attention Mechanisms
Attention layers designed for graph-structured data and GNNs.
| Mechanism | Complexity | Best For |
|-----------|------------|----------|
| **GraphRoPeAttention** | O(n²) | Position-aware graph transformers |
@ -111,6 +115,8 @@ High-performance attention mechanisms for transformers, graph neural networks, a
#### Specialized Mechanisms
Task-specific attention variants for efficiency and multi-modal learning.
| Mechanism | Type | Best For |
|-----------|------|----------|
| **SparseAttention** | Efficiency | Long docs, low-memory inference |
@ -120,7 +126,7 @@ High-performance attention mechanisms for transformers, graph neural networks, a
#### Hyperbolic Math Functions
-Operations for Poincaré ball embeddings (curved space for hierarchies):
+Operations for Poincaré ball embeddings—curved space that naturally represents hierarchies.
| Function | Description | Use Case |
|----------|-------------|----------|
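
The rows of this table are not visible in the diff, so the following is only a sketch of the kind of operation it covers: the standard geodesic distance on the unit Poincaré ball with curvature -1. The name `poincareDistanceSketch` is hypothetical, and the library's own functions may use a different signature or a configurable curvature.

```typescript
// Sketch only: hypothetical name, unit ball with curvature -1 assumed.
function squaredNorm(v: number[]): number {
  return v.reduce((acc, x) => acc + x * x, 0);
}

// d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
function poincareDistanceSketch(u: number[], v: number[]): number {
  if (u.length !== v.length) {
    throw new RangeError("points must have the same dimension");
  }
  const nu = squaredNorm(u);
  const nv = squaredNorm(v);
  if (nu >= 1 || nv >= 1) {
    throw new RangeError("points must lie strictly inside the unit ball");
  }
  const diff = squaredNorm(u.map((x, i) => x - v[i]));
  const x = 1 + (2 * diff) / ((1 - nu) * (1 - nv));
  // arcosh(x) written out as ln(x + sqrt(x^2 - 1)); equivalent to Math.acosh(x).
  return Math.log(x + Math.sqrt(x * x - 1));
}
```

Distances grow without bound as a point approaches the boundary of the ball, which is why tree-like data can be embedded there with low distortion: a point at Euclidean radius 0.5 is ln 3 ≈ 1.10 from the origin, while one at radius 0.99 is about 5.29 away.
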
@ -132,6 +138,8 @@ Operations for Poincaré ball embeddings (curved space for hierarchies):
#### Async & Batch Operations
Utilities for high-throughput inference and training optimization.
| Operation | Description | Performance |
|-----------|-------------|-------------|
| `asyncBatchCompute()` | Process batches in parallel | 3-5x faster |
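
The diff does not show the signature of `asyncBatchCompute()`, so the sketch below does not call it. It only illustrates the general pattern of splitting work into batches and running them concurrently; `processBatchesConcurrently` and its `worker` parameter are hypothetical names. In plain JavaScript this gives concurrency rather than CPU parallelism unless the worker hands the computation to native threads, and the speedup quoted in the table is not something this sketch demonstrates.

```typescript
// Sketch only: hypothetical helper, not the library's asyncBatchCompute().
async function processBatchesConcurrently<T, R>(
  items: T[],
  batchSize: number,
  worker: (batch: T[]) => Promise<R[]>
): Promise<R[]> {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new RangeError("batchSize must be a positive integer");
  }
  // Split the input into consecutive batches of at most batchSize items.
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  // Start every batch before awaiting any of them, so they overlap in time.
  const results = await Promise.all(batches.map((batch) => worker(batch)));
  // Promise.all preserves input order, so flattening keeps the item order.
  const flat: R[] = [];
  for (const r of results) {
    flat.push(...r);
  }
  return flat;
}
```

For example, `processBatchesConcurrently([1, 2, 3, 4, 5], 2, async (b) => b.map((x) => x * 2))` resolves to `[2, 4, 6, 8, 10]`, processed as three batches of sizes 2, 2, and 1.
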