Ruview/ui/pose-fusion/js/fusion-engine.js
rUv 7c1351fd5d
feat(demo): wire all 6 RuVector WASM attention mechanisms into pose fusion
* feat: dual-modal WASM browser pose estimation demo (ADR-058)

Live webcam video + WiFi CSI fusion for real-time pose estimation.
Two parallel CNN pipelines (ruvector-cnn-wasm) with attention-weighted
fusion and dynamic confidence gating. Three modes: Dual, Video-only,
CSI-only. Includes pre-built WASM package (~52KB) for browser deployment.

- ADR-058: Dual-modal architecture design
- ui/pose-fusion.html: Main demo page with dark theme UI
- 7 JS modules: video-capture, csi-simulator, cnn-embedder, fusion-engine,
  pose-decoder, canvas-renderer, main orchestrator
- Pre-built ruvector-cnn-wasm WASM package for browser
- CSI heatmap, embedding space visualization, latency metrics
- WebSocket support for live ESP32 CSI data
- Navigation link added to main dashboard

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: motion-responsive skeleton + through-wall CSI tracking

- Pose decoder now uses per-cell motion grid to track actual arm/head
  positions — raising arms moves the skeleton's arms, head follows
  lateral movement
- Motion grid (10x8 cells) tracks intensity per body zone: head,
  left/right arm upper/mid, legs
- Through-wall mode: when person exits frame, CSI maintains presence
  with slow decay (~10s) and skeleton drifts in exit direction
- CSI simulator persists sensing after video loss, ghost pose renders
  with decreasing confidence
- Reduced temporal smoothing (0.45) for faster response to movement
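The ~10s through-wall decay can be sketched as a frame-rate-independent exponential falloff (a minimal sketch with an assumed half-life constant; the demo's actual decay curve may differ):

```javascript
// Hypothetical presence-decay helper: with a half-life of 10/3 s,
// confidence drops to ~12.5% of its starting value after 10 s.
function decayPresence(confidence, dtSeconds, halfLifeSeconds = 10 / 3) {
  return confidence * Math.pow(0.5, dtSeconds / halfLifeSeconds);
}
```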

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: video fills available space + correct WASM path resolution

- Remove fixed aspect-ratio and max-height from video panel so it
  fills the available viewport space without scrolling
- Grid uses 1fr row for content area, overflow:hidden on main grid
- Fix WASM path: resolve relative to JS module file using import.meta.url
  instead of hardcoded ./pkg/ which resolved incorrectly on gh-pages
- Responsive: mobile still gets aspect-ratio constraint
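The path fix amounts to resolving the WASM file against the JS module's own URL rather than the page URL. A minimal sketch (the helper name and WASM filename are illustrative; inside the real ES module the base would be `import.meta.url`):

```javascript
// Resolve the WASM binary relative to the JS module itself, not the
// hosting page, so gh-pages subdirectory deployments still find ./pkg/
function resolveWasmPath(moduleUrl, wasmFile = 'ruvector_cnn_wasm_bg.wasm') {
  return new URL(`./pkg/${wasmFile}`, moduleUrl).href;
}

// Inside the module: resolveWasmPath(import.meta.url)
```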

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: live ESP32 CSI pipeline + auto-connect WebSocket

- Add auto-connect to local sensing server WebSocket (ws://localhost:8765)
- Demo shows "Live ESP32" when connected to real CSI data
- Add build_firmware.ps1 for native Windows ESP-IDF builds (no Docker)
- Add read_serial.ps1 for ESP32 serial monitor

Pipeline: ESP32 → UDP:5005 → sensing-server → WS:8765 → browser demo

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: add ADR-059 live ESP32 CSI pipeline + update README with demo links

- ADR-059: Documents end-to-end ESP32 → sensing server → browser pipeline
- README: Add dual-modal pose fusion demo link, update ADR count to 49
- References issue #245

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: RSSI visualization, RuVector attention WASM, cache-bust fixes

- Add animated RSSI Signal Strength panel with sparkline history
- Fix RuVector WasmMultiHeadAttention retptr calling convention
- Wire up RuVector Multi-Head + Flash Attention in CNN embedder
- Add ambient temporal drift to CSI simulator for visible heatmap animation
- Fix embedding space projection (sparse projection replaces cancelling sum)
- Add auto-scaling to embedding space renderer
- Add cache busters (?v=4) to all ES module imports to prevent stale caches
- Add diagnostic logging for module version verification
- Add RSSI tracking with quality labels and color-coded dBm display
- Includes ruvector-attention-wasm v2.0.5 browser ESM wrapper
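The dBm-to-quality-label mapping can be sketched as a simple threshold ladder (thresholds here are illustrative assumptions; the demo's cutoffs may differ):

```javascript
// Hypothetical RSSI quality classifier: stronger (less negative) dBm
// values map to better labels.
function rssiQuality(dbm) {
  if (dbm >= -50) return 'Excellent';
  if (dbm >= -60) return 'Good';
  if (dbm >= -70) return 'Fair';
  return 'Weak';
}
```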

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: 26-keypoint dexterous pose + full RuVector attention pipeline

Pose Decoder (17 → 26 keypoints):
- Add finger approximations: thumb, index, pinky per hand (6 new)
- Add toe tips: left/right foot index (2 new)
- Add neck keypoint (1 new)
- Hand openness driven by arm motion intensity
- Finger positions computed from wrist-elbow axis angles

CNN Embedder (full RuVector WASM pipeline):
- Stage 1: Multi-Head Attention (global spatial reasoning)
- Stage 2: Hyperbolic Attention (hierarchical body-part tree)
- Stage 3: MoE Attention (3 experts: upper/lower/extremities, top-2)
- Blended 40/30/30 weighting → final embedding projection
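The 40/30/30 blend above can be sketched as a fixed-weight combination followed by L2 normalization (a minimal sketch with assumed helper names; the real pipeline feeds RuVector WASM attention outputs):

```javascript
// Blend three attention-stage outputs with 40/30/30 weights
// (Multi-Head / Hyperbolic / MoE), then L2-normalize the result.
function blendStages(mha, hyper, moe) {
  const dim = mha.length;
  const out = new Float32Array(dim);
  for (let i = 0; i < dim; i++) {
    out[i] = 0.4 * mha[i] + 0.3 * hyper[i] + 0.3 * moe[i];
  }
  let norm = 0;
  for (let i = 0; i < dim; i++) norm += out[i] * out[i];
  norm = Math.sqrt(norm);
  if (norm > 1e-8) for (let i = 0; i < dim; i++) out[i] /= norm;
  return out;
}
```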

Canvas Renderer:
- Magenta finger joints with distinct glow
- Cyan toe tips
- White neck keypoint
- Thinner limb lines for hand/foot connections
- Joint count shown in overlay label

CSI Simulator:
- Skip synthetic person state when live ESP32 connected
- Only simulate CSI data in demo mode (was already correct)

Embedding Space:
- Fixed projection: sparse 8-dim projection replaces cancelling sum
- Auto-scaling normalizes point spread to fill canvas

Cache busters bumped to v=5 on all imports.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: centroid-based pose tracking for responsive limb movement

Rewrites pose decoder from intensity-based to position-based tracking:
- Arms now track toward motion centroid in each body zone
- Elbow/wrist positions computed along shoulder→centroid vector
- Legs track toward lower-body zone centroids
- Smoothing reduced from 0.45 to 0.25 for responsiveness
- Zone centroids blend 30% old / 70% new each frame
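The 30/70 centroid blend is a per-frame exponential moving average; a minimal sketch (hypothetical helper, applied once per body zone in the actual decoder):

```javascript
// Blend the previous zone centroid with the newly measured one:
// 30% old / 70% new each frame, per the decoder's smoothing.
function blendCentroid(prev, next) {
  return {
    x: 0.3 * prev.x + 0.7 * next.x,
    y: 0.3 * prev.y + 0.7 * next.y,
  };
}
```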

6 body zones with overlapping coverage:
- Head (top 20%, center cols)
- Left/Right Arm (rows 10-60%, outer cols)
- Torso (rows 15-55%, center cols)
- Left/Right Leg (rows 50-100%, half cols each)

Hand openness now driven by arm spread distance + raise amount.
Cache busters v=6.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: remove duplicate lAnkleX/rAnkleX declarations in pose-decoder

Stale code block from old intensity-based tracking was left behind,
re-declaring variables already defined by centroid-based tracking.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(demo): wire all 6 RuVector WASM attention mechanisms into pose fusion

- Add WasmLinearAttention and WasmLocalGlobalAttention to browser ESM wrapper
- Add 6 WASM utility functions (batch_normalize, pairwise_distances, etc.)
- Extend CnnEmbedder to 6-stage pipeline: Flash → MHA → Hyperbolic → Linear → MoE → L+G
- Use log-energy softmax blending across all 6 stages
- Wire WASM cosine_similarity and normalize into FusionEngine
- Add RuVector pipeline stats panel to UI (energy, refinement, pose impact)
- Compute embedding-to-joint mapping stats without modifying joint positions
- Center camera prompt with flexbox layout
- Add cache busters v=12
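The log-energy softmax blending across the 6 stages can be sketched as follows (an assumed interpretation: each stage's output is weighted by a softmax over the log of its L2 energy; the real CnnEmbedder may use different constants):

```javascript
// Weight each attention stage's output by softmax(log(||v||)),
// so higher-energy stages contribute more to the final embedding.
function logEnergySoftmaxBlend(stageOutputs) {
  const energies = stageOutputs.map((v) => {
    let e = 0;
    for (let i = 0; i < v.length; i++) e += v[i] * v[i];
    return Math.log(1e-8 + Math.sqrt(e)); // log-energy, epsilon-guarded
  });
  const maxE = Math.max(...energies);
  const exps = energies.map((e) => Math.exp(e - maxE)); // stable softmax
  const sum = exps.reduce((a, b) => a + b, 0);
  const weights = exps.map((e) => e / sum);
  const dim = stageOutputs[0].length;
  const out = new Float32Array(dim);
  for (let s = 0; s < stageOutputs.length; s++) {
    for (let i = 0; i < dim; i++) out[i] += weights[s] * stageOutputs[s][i];
  }
  return out;
}
```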

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-12 20:59:57 -04:00


/**
 * FusionEngine — Attention-weighted dual-modal embedding fusion.
 *
 * Combines visual (camera) and CSI (WiFi) embeddings with dynamic
 * confidence gating based on signal quality.
 */
export class FusionEngine {
  /**
   * @param {number} embeddingDim
   * @param {object} opts
   * @param {object} opts.wasmModule - RuVector WASM module for cosine_similarity etc.
   */
  constructor(embeddingDim = 128, opts = {}) {
    this.embeddingDim = embeddingDim;
    this.wasmModule = opts.wasmModule || null;
    // Learnable attention weights (initialized to balanced 0.5)
    this.attentionWeights = new Float32Array(embeddingDim).fill(0.5);
    // Dynamic modality confidence [0, 1]
    this.videoConfidence = 1.0;
    this.csiConfidence = 0.0;
    this.fusedConfidence = 0.5;
    // Smoothing for confidence transitions
    this._smoothAlpha = 0.85;
    // Embedding history for visualization
    this.recentVideoEmbeddings = [];
    this.recentCsiEmbeddings = [];
    this.recentFusedEmbeddings = [];
    this.maxHistory = 50;
  }
  /** Set the WASM module reference (called after WASM loads) */
  setWasmModule(mod) { this.wasmModule = mod; }

  /**
   * Update quality-based confidence scores
   * @param {number} videoBrightness - [0,1] video brightness quality
   * @param {number} videoMotion - [0,1] motion detected
   * @param {number} csiSnr - CSI signal-to-noise ratio in dB
   * @param {boolean} csiActive - Whether CSI source is connected
   */
  updateConfidence(videoBrightness, videoMotion, csiSnr, csiActive) {
    // Video confidence: drops with low brightness, boosted by motion
    let vc = 0;
    if (videoBrightness > 0.05) {
      vc = Math.min(1, videoBrightness * 1.5) * 0.7 + Math.min(1, videoMotion * 3) * 0.3;
    }
    // CSI confidence: based on SNR and connection status
    let cc = 0;
    if (csiActive) {
      cc = Math.min(1, csiSnr / 25); // 25dB = full confidence
    }
    // Smooth transitions
    this.videoConfidence = this._smoothAlpha * this.videoConfidence + (1 - this._smoothAlpha) * vc;
    this.csiConfidence = this._smoothAlpha * this.csiConfidence + (1 - this._smoothAlpha) * cc;
    // Fused confidence: Euclidean norm of the two modality confidences,
    // capped at 1. Always at least as high as either alone (fusion can only help).
    this.fusedConfidence = Math.min(1, Math.sqrt(
      this.videoConfidence * this.videoConfidence + this.csiConfidence * this.csiConfidence
    ));
  }
  /**
   * Fuse video and CSI embeddings
   * @param {Float32Array|null} videoEmb - Visual embedding (or null if video-off)
   * @param {Float32Array|null} csiEmb - CSI embedding (or null if CSI-off)
   * @param {string} mode - 'dual' | 'video' | 'csi'
   * @returns {Float32Array} Fused embedding
   */
  fuse(videoEmb, csiEmb, mode = 'dual') {
    const dim = this.embeddingDim;
    const fused = new Float32Array(dim);
    if (mode === 'video' || !csiEmb) {
      if (videoEmb) fused.set(videoEmb);
      this._recordEmbedding(videoEmb, null, fused);
      return fused;
    }
    if (mode === 'csi' || !videoEmb) {
      if (csiEmb) fused.set(csiEmb);
      this._recordEmbedding(null, csiEmb, fused);
      return fused;
    }
    // Dual mode: attention-weighted fusion with confidence gating
    const totalConf = this.videoConfidence + this.csiConfidence;
    const videoWeight = totalConf > 0 ? this.videoConfidence / totalConf : 0.5;
    for (let i = 0; i < dim; i++) {
      const alpha = this.attentionWeights[i] * videoWeight +
        (1 - this.attentionWeights[i]) * (1 - videoWeight);
      fused[i] = alpha * videoEmb[i] + (1 - alpha) * csiEmb[i];
    }
    // Re-normalize using WASM when available
    if (this.wasmModule) {
      try { this.wasmModule.normalize(fused); } catch (_) { this._jsNormalize(fused); }
    } else {
      this._jsNormalize(fused);
    }
    this._recordEmbedding(videoEmb, csiEmb, fused);
    return fused;
  }
  /**
   * Get embedding points for 2D visualization (sparse fixed projection)
   * @returns {{ video: Array, csi: Array, fused: Array }}
   */
  getEmbeddingPoints() {
    // Sparse projection: pick a few dimensions with fixed coefficients
    // to get visible 2D spread (avoids cancellation from summing all 128 dims)
    const project = (emb) => {
      // Guard: the base terms below index up to emb[15]
      if (!emb || emb.length < 16) return null;
      // Use up to 8 sparse dimensions per axis with predetermined signs (seeded, not random)
      const dim = emb.length;
      const x = emb[0] * 3.2 - emb[3] * 2.8 + emb[7] * 2.1 - emb[12] * 1.9
        + (dim > 30 ? emb[29] * 1.5 - emb[31] * 1.3 : 0)
        + (dim > 60 ? emb[55] * 1.1 - emb[60] * 0.9 : 0);
      const y = emb[1] * 3.0 - emb[5] * 2.5 + emb[9] * 2.3 - emb[15] * 1.7
        + (dim > 40 ? emb[37] * 1.4 - emb[42] * 1.2 : 0)
        + (dim > 80 ? emb[73] * 1.0 - emb[80] * 0.8 : 0);
      return [x, y];
    };
    return {
      video: this.recentVideoEmbeddings.map(project).filter(Boolean),
      csi: this.recentCsiEmbeddings.map(project).filter(Boolean),
      fused: this.recentFusedEmbeddings.map(project).filter(Boolean)
    };
  }
  /**
   * Cross-modal similarity score
   * @returns {number} Cosine similarity between latest video and CSI embeddings
   */
  getCrossModalSimilarity() {
    const v = this.recentVideoEmbeddings[this.recentVideoEmbeddings.length - 1];
    const c = this.recentCsiEmbeddings[this.recentCsiEmbeddings.length - 1];
    if (!v || !c) return 0;
    // Use WASM cosine_similarity when available
    if (this.wasmModule) {
      try { return this.wasmModule.cosine_similarity(v, c); } catch (_) { /* fall through to JS */ }
    }
    // Pure-JS fallback
    let dot = 0, na = 0, nb = 0;
    for (let i = 0; i < v.length; i++) {
      dot += v[i] * c[i];
      na += v[i] * v[i];
      nb += c[i] * c[i];
    }
    na = Math.sqrt(na); nb = Math.sqrt(nb);
    return (na > 1e-8 && nb > 1e-8) ? dot / (na * nb) : 0;
  }
  /** L2-normalize a vector in place (pure-JS fallback) */
  _jsNormalize(vec) {
    let norm = 0;
    for (let i = 0; i < vec.length; i++) norm += vec[i] * vec[i];
    norm = Math.sqrt(norm);
    if (norm > 1e-8) for (let i = 0; i < vec.length; i++) vec[i] /= norm;
  }

  /** Push copies of the latest embeddings into bounded history buffers */
  _recordEmbedding(video, csi, fused) {
    if (video) {
      this.recentVideoEmbeddings.push(new Float32Array(video));
      if (this.recentVideoEmbeddings.length > this.maxHistory) this.recentVideoEmbeddings.shift();
    }
    if (csi) {
      this.recentCsiEmbeddings.push(new Float32Array(csi));
      if (this.recentCsiEmbeddings.length > this.maxHistory) this.recentCsiEmbeddings.shift();
    }
    this.recentFusedEmbeddings.push(new Float32Array(fused));
    if (this.recentFusedEmbeddings.length > this.maxHistory) this.recentFusedEmbeddings.shift();
  }
}