# ADR-048: Adaptive CSI Activity Classifier | Field | Value | |-------|-------| | Status | Accepted | | Date | 2026-03-05 | | Deciders | ruv | | Depends on | ADR-024 (AETHER Embeddings), ADR-039 (Edge Processing), ADR-045 (AMOLED Display) | ## Context WiFi-based activity classification using ESP32 Channel State Information (CSI) relies on hand-tuned thresholds to distinguish between activity states (absent, present_still, present_moving, active). These static thresholds are brittle — they don't account for: - **Environment-specific signal patterns**: Room geometry, furniture, wall materials, and ESP32 placement all affect how CSI signals respond to human activity. - **Temporal noise characteristics**: Real ESP32 CSI data at ~10 FPS has significant frame-to-frame jitter that causes classification to jump between states. - **Vital signs estimation noise**: Heart rate and breathing rate estimates from Goertzel filter banks produce large swings (50+ BPM frame-to-frame) at low confidence levels. The existing threshold-based approach produces noisy, unstable classifications that degrade the user experience in the Observatory visualization and the main dashboard. ## Decision ### 1. Three-Stage Signal Smoothing Pipeline All CSI-derived metrics pass through a three-stage pipeline before reaching the UI: #### Stage 1: Adaptive Baseline Subtraction - EMA with α=0.003 (~30s time constant) tracks the "quiet room" noise floor - Only updates during low-motion periods to avoid inflating baseline during activity - 50-frame warm-up period for initial baseline learning - Subtracts 70% of baseline from raw motion score to remove environmental drift #### Stage 2: EMA + Median Filtering - **Motion score**: Blended from 4 signals (temporal diff 40%, variance 20%, motion band power 25%, change points 15%), then EMA-smoothed with α=0.15 - **Vital signs**: 21-frame sliding window → trimmed mean (drop top/bottom 25%) → EMA with α=0.02 (~5s time constant) - **Dead-band**: HR won't update unless trimmed mean differs by >2 BPM; BR needs >0.5 BPM - **Outlier rejection**: HR jumps >8 BPM/frame and BR jumps >2 BPM/frame are discarded #### Stage 3: Hysteresis Debounce - Activity state transitions require 4 consecutive frames (~0.4s) of agreement before committing - Prevents rapid flickering between states - Independent candidate tracking resets on new direction changes ### 2. Adaptive Classifier Module (`adaptive_classifier.rs`) A Rust-native environment-tuned classifier that learns from labeled JSONL recordings: #### Feature Extraction (15 features) | # | Feature | Source | Discriminative Power | |---|---------|--------|---------------------| | 0 | variance | Server | Medium — temporal CSI spread | | 1 | motion_band_power | Server | Medium — high-frequency subcarrier energy | | 2 | breathing_band_power | Server | Low — respiratory band energy | | 3 | spectral_power | Server | Low — mean squared amplitude | | 4 | dominant_freq_hz | Server | Low — peak subcarrier index | | 5 | change_points | Server | Medium — threshold crossing count | | 6 | mean_rssi | Server | Low — received signal strength | | 7 | amp_mean | Subcarrier | Medium — mean amplitude across 56 subcarriers | | 8 | amp_std | Subcarrier | **High** — amplitude spread (motion increases spread) | | 9 | amp_skew | Subcarrier | Medium — asymmetry of amplitude distribution | | 10 | amp_kurt | Subcarrier | **High** — peakedness (presence creates peaks) | | 11 | amp_iqr | Subcarrier | Medium — inter-quartile range | | 12 | amp_entropy | Subcarrier | **High** — spectral entropy (motion increases disorder) | | 13 | amp_max | Subcarrier | Medium — peak amplitude value | | 14 | amp_range | Subcarrier | Medium — amplitude dynamic range | #### Training Algorithm - **Multiclass logistic regression** with softmax output - **Mini-batch SGD** (batch size 32, 200 epochs, linear learning rate decay) - **Z-score normalisation** using global mean/stddev computed from all training data - Per-class statistics (mean, stddev) stored for Mahalanobis distance fallback - Deterministic shuffling (LCG PRNG, seed 42) for reproducible results #### Training Data Pipeline 1. Record labeled CSI sessions via `POST /api/v1/recording/start {"id":"train_