mirror of
https://github.com/ruvnet/RuView.git
synced 2026-04-28 05:59:32 +00:00
merge: bring feat/adr-080-qe-remediation up to date with main
Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
commit
ccb27b280c
9 changed files with 3832 additions and 5 deletions
53
README.md
53
README.md
|
|
@ -9,7 +9,7 @@
|
||||||
> **Beta Software** — Under active development. APIs and firmware may change. Known limitations:
|
> **Beta Software** — Under active development. APIs and firmware may change. Known limitations:
|
||||||
> - ESP32-C3 and original ESP32 are not supported (single-core, insufficient for CSI DSP)
|
> - ESP32-C3 and original ESP32 are not supported (single-core, insufficient for CSI DSP)
|
||||||
> - Single ESP32 deployments have limited spatial resolution — use 2+ nodes or add a [Cognitum Seed](https://cognitum.one) for best results
|
> - Single ESP32 deployments have limited spatial resolution — use 2+ nodes or add a [Cognitum Seed](https://cognitum.one) for best results
|
||||||
> - Camera-free pose accuracy is limited (2.5% PCK@20) — camera-labeled data significantly improves accuracy
|
> - Camera-free pose accuracy is limited — use [camera ground-truth training](docs/adr/ADR-079-camera-ground-truth-training.md) for 92.9% PCK@20
|
||||||
>
|
>
|
||||||
> Contributions and bug reports welcome at [Issues](https://github.com/ruvnet/RuView/issues).
|
> Contributions and bug reports welcome at [Issues](https://github.com/ruvnet/RuView/issues).
|
||||||
|
|
||||||
|
|
@ -56,6 +56,7 @@ RuView also supports pose estimation (17 COCO keypoints via the WiFlow architect
|
||||||
> | **Through-wall** | Fresnel zone geometry + multipath modeling | Up to 5m depth |
|
> | **Through-wall** | Fresnel zone geometry + multipath modeling | Up to 5m depth |
|
||||||
> | **Edge intelligence** | 8-dim feature vectors + RVF store on Cognitum Seed | $140 total BOM |
|
> | **Edge intelligence** | 8-dim feature vectors + RVF store on Cognitum Seed | $140 total BOM |
|
||||||
> | **Camera-free training** | 10 sensor signals, no labels needed | 84s on M4 Pro |
|
> | **Camera-free training** | 10 sensor signals, no labels needed | 84s on M4 Pro |
|
||||||
|
> | **Camera-supervised training** | MediaPipe + ESP32 CSI → 92.9% PCK@20 | 19 min on laptop |
|
||||||
> | **Multi-frequency mesh** | Channel hopping across 6 bands, neighbor APs as illuminators | 3x sensing bandwidth |
|
> | **Multi-frequency mesh** | Channel hopping across 6 bands, neighbor APs as illuminators | 3x sensing bandwidth |
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
@ -95,9 +96,52 @@ node scripts/mincut-person-counter.js --port 5006 # Correct person counting
|
||||||
>
|
>
|
||||||
---
|
---
|
||||||
|
|
||||||
|
### What's New in v0.7.0
|
||||||
|
|
||||||
|
<details>
|
||||||
|
<summary><strong>Camera Ground-Truth Training — 92.9% PCK@20</strong></summary>
|
||||||
|
|
||||||
|
**v0.7.0 adds camera-supervised pose training** using MediaPipe + real ESP32 CSI data:
|
||||||
|
|
||||||
|
| Capability | What it does | ADR |
|
||||||
|
|-----------|-------------|-----|
|
||||||
|
| **Camera ground-truth collection** | MediaPipe PoseLandmarker captures 17 COCO keypoints at 30fps, synced with ESP32 CSI | [ADR-079](docs/adr/ADR-079-camera-ground-truth-training.md) |
|
||||||
|
| **ruvector subcarrier selection** | Variance-based top-K reduces input by 50% (70→35 subcarriers) | ADR-079 O6 |
|
||||||
|
| **Stoer-Wagner min-cut** | Person-specific subcarrier cluster separation for multi-person training | ADR-079 O8 |
|
||||||
|
| **Scalable WiFlow model** | 4 presets: lite (189K) → small (474K) → medium (800K) → full (7.7M params) | ADR-079 |
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Collect ground truth (camera + ESP32 simultaneously)
|
||||||
|
python scripts/collect-ground-truth.py --duration 300 --preview
|
||||||
|
python scripts/record-csi-udp.py --duration 300
|
||||||
|
|
||||||
|
# Align CSI windows with camera keypoints
|
||||||
|
node scripts/align-ground-truth.js --gt data/ground-truth/*.jsonl --csi data/recordings/*.csi.jsonl
|
||||||
|
|
||||||
|
# Train WiFlow model (start lite, scale up as data grows)
|
||||||
|
node scripts/train-wiflow-supervised.js --data data/paired/*.jsonl --scale lite
|
||||||
|
|
||||||
|
# Evaluate
|
||||||
|
node scripts/eval-wiflow.js --model models/wiflow-real/wiflow-v1.json --data data/paired/*.jsonl
|
||||||
|
```
|
||||||
|
|
||||||
|
**Result: 92.9% PCK@20** from a 5-minute data collection session with one ESP32-S3 and one webcam.
|
||||||
|
|
||||||
|
| Metric | Before (proxy) | After (camera-supervised) |
|
||||||
|
|--------|----------------|--------------------------|
|
||||||
|
| PCK@20 | 0% | **92.9%** |
|
||||||
|
| Eval loss | 0.700 | **0.082** |
|
||||||
|
| Bone constraint | N/A | **0.008** |
|
||||||
|
| Training time | N/A | **19 minutes** |
|
||||||
|
| Model size | N/A | **974 KB** |
|
||||||
|
|
||||||
|
Pre-trained model: [HuggingFace ruv/ruview/wiflow-v1](https://huggingface.co/ruv/ruview)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
### Pre-Trained Models (v0.6.0) — No Training Required
|
### Pre-Trained Models (v0.6.0) — No Training Required
|
||||||
|
|
||||||
<details open>
|
<details>
|
||||||
<summary><strong>Download from HuggingFace and start sensing immediately</strong></summary>
|
<summary><strong>Download from HuggingFace and start sensing immediately</strong></summary>
|
||||||
|
|
||||||
Pre-trained models are available on HuggingFace:
|
Pre-trained models are available on HuggingFace:
|
||||||
|
|
@ -294,7 +338,7 @@ See [ADR-069](docs/adr/ADR-069-cognitum-seed-csi-pipeline.md), [ADR-071](docs/ad
|
||||||
|----------|-------------|
|
|----------|-------------|
|
||||||
| [User Guide](docs/user-guide.md) | Step-by-step guide: installation, first run, API usage, hardware setup, training |
|
| [User Guide](docs/user-guide.md) | Step-by-step guide: installation, first run, API usage, hardware setup, training |
|
||||||
| [Build Guide](docs/build-guide.md) | Building from source (Rust and Python) |
|
| [Build Guide](docs/build-guide.md) | Building from source (Rust and Python) |
|
||||||
| [Architecture Decisions](docs/adr/README.md) | 62 ADRs — why each technical choice was made, organized by domain (hardware, signal processing, ML, platform, infrastructure) |
|
| [Architecture Decisions](docs/adr/README.md) | 79 ADRs — why each technical choice was made, organized by domain (hardware, signal processing, ML, platform, infrastructure) |
|
||||||
| [Domain Models](docs/ddd/README.md) | 7 DDD models (RuvSense, Signal Processing, Training Pipeline, Hardware Platform, Sensing Server, WiFi-Mat, CHCI) — bounded contexts, aggregates, domain events, and ubiquitous language |
|
| [Domain Models](docs/ddd/README.md) | 7 DDD models (RuvSense, Signal Processing, Training Pipeline, Hardware Platform, Sensing Server, WiFi-Mat, CHCI) — bounded contexts, aggregates, domain events, and ubiquitous language |
|
||||||
| [Desktop App](rust-port/wifi-densepose-rs/crates/wifi-densepose-desktop/README.md) | **WIP** — Tauri v2 desktop app for node management, OTA updates, WASM deployment, and mesh visualization |
|
| [Desktop App](rust-port/wifi-densepose-rs/crates/wifi-densepose-desktop/README.md) | **WIP** — Tauri v2 desktop app for node management, OTA updates, WASM deployment, and mesh visualization |
|
||||||
| [Medical Examples](examples/medical/README.md) | Contactless blood pressure, heart rate, breathing rate via 60 GHz mmWave radar — $15 hardware, no wearable |
|
| [Medical Examples](examples/medical/README.md) | Contactless blood pressure, heart rate, breathing rate via 60 GHz mmWave radar — $15 hardware, no wearable |
|
||||||
|
|
@ -1267,7 +1311,8 @@ Download a pre-built binary — no build toolchain needed:
|
||||||
|
|
||||||
| Release | What's included | Tag |
|
| Release | What's included | Tag |
|
||||||
|---------|-----------------|-----|
|
|---------|-----------------|-----|
|
||||||
| [v0.6.0](https://github.com/ruvnet/RuView/releases/tag/v0.6.0-esp32) | **Latest** — [Pre-trained models on HuggingFace](https://huggingface.co/ruv/ruview), 17 sensing apps, 51.6% contrastive improvement, 0.008ms inference | `v0.6.0-esp32` |
|
| [v0.7.0](https://github.com/ruvnet/RuView/releases/tag/v0.7.0) | **Latest** — Camera-supervised WiFlow model (92.9% PCK@20), ground-truth training pipeline, ruvector optimizations | `v0.7.0` |
|
||||||
|
| [v0.6.0](https://github.com/ruvnet/RuView/releases/tag/v0.6.0-esp32) | [Pre-trained models on HuggingFace](https://huggingface.co/ruv/ruview), 17 sensing apps, 51.6% contrastive improvement, 0.008ms inference | `v0.6.0-esp32` |
|
||||||
| [v0.5.5](https://github.com/ruvnet/RuView/releases/tag/v0.5.5-esp32) | SNN + MinCut (#348 fix) + CNN spectrogram + WiFlow + multi-freq mesh + graph transformer | `v0.5.5-esp32` |
|
| [v0.5.5](https://github.com/ruvnet/RuView/releases/tag/v0.5.5-esp32) | SNN + MinCut (#348 fix) + CNN spectrogram + WiFlow + multi-freq mesh + graph transformer | `v0.5.5-esp32` |
|
||||||
| [v0.5.4](https://github.com/ruvnet/RuView/releases/tag/v0.5.4-esp32) | Cognitum Seed integration ([ADR-069](docs/adr/ADR-069-cognitum-seed-csi-pipeline.md)), 8-dim feature vectors, RVF store, witness chain, security hardening | `v0.5.4-esp32` |
|
| [v0.5.4](https://github.com/ruvnet/RuView/releases/tag/v0.5.4-esp32) | Cognitum Seed integration ([ADR-069](docs/adr/ADR-069-cognitum-seed-csi-pipeline.md)), 8-dim feature vectors, RVF store, witness chain, security hardening | `v0.5.4-esp32` |
|
||||||
| [v0.5.0](https://github.com/ruvnet/RuView/releases/tag/v0.5.0-esp32) | mmWave sensor fusion ([ADR-063](docs/adr/ADR-063-mmwave-sensor-fusion.md)), auto-detect MR60BHA2/LD2410, 48-byte fused vitals, all v0.4.3.1 fixes | `v0.5.0-esp32` |
|
| [v0.5.0](https://github.com/ruvnet/RuView/releases/tag/v0.5.0-esp32) | mmWave sensor fusion ([ADR-063](docs/adr/ADR-063-mmwave-sensor-fusion.md)), auto-detect MR60BHA2/LD2410, 48-byte fused vitals, all v0.4.3.1 fixes | `v0.5.0-esp32` |
|
||||||
|
|
|
||||||
512
docs/adr/ADR-079-camera-ground-truth-training.md
Normal file
512
docs/adr/ADR-079-camera-ground-truth-training.md
Normal file
|
|
@ -0,0 +1,512 @@
|
||||||
|
# ADR-079: Camera Ground-Truth Training Pipeline
|
||||||
|
|
||||||
|
- **Status**: Accepted
|
||||||
|
- **Date**: 2026-04-06
|
||||||
|
- **Deciders**: ruv
|
||||||
|
- **Relates to**: ADR-072 (WiFlow Architecture), ADR-070 (Self-Supervised Pretraining), ADR-071 (ruvllm Training Pipeline), ADR-024 (AETHER Contrastive), ADR-064 (Multimodal Ambient Intelligence), ADR-075 (MinCut Person Separation)
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
WiFlow (ADR-072) currently trains without ground-truth pose labels, using proxy poses
|
||||||
|
generated from presence/motion heuristics. This produces a PCK@20 of only 2.5% — far
|
||||||
|
below the 30-50% achievable with supervised training. The fundamental bottleneck is the
|
||||||
|
absence of spatial keypoint labels.
|
||||||
|
|
||||||
|
Academic WiFi pose estimation systems (Wi-Pose, Person-in-WiFi 3D, MetaFi++) all train
|
||||||
|
with synchronized camera ground truth and achieve PCK@20 of 40-85%. They discard the
|
||||||
|
camera at deployment — the camera is a training-time teacher, not a runtime dependency.
|
||||||
|
|
||||||
|
ADR-064 already identified this: *"Record CSI + mmWave while performing signs with a
|
||||||
|
camera as ground truth, then deploy camera-free."* This ADR specifies the implementation.
|
||||||
|
|
||||||
|
### Current Training Pipeline Gap
|
||||||
|
|
||||||
|
```
|
||||||
|
Current: CSI amplitude → WiFlow → 17 keypoints (proxy-supervised, PCK@20 = 2.5%)
|
||||||
|
↑
|
||||||
|
Heuristic proxies:
|
||||||
|
- Standing skeleton when presence > 0.3
|
||||||
|
- Limb perturbation from motion energy
|
||||||
|
- No spatial accuracy
|
||||||
|
```
|
||||||
|
|
||||||
|
### Target Pipeline
|
||||||
|
|
||||||
|
```
|
||||||
|
Training: CSI amplitude ──→ WiFlow ──→ 17 keypoints (camera-supervised, PCK@20 target: 35%+)
|
||||||
|
↑
|
||||||
|
Laptop camera ──→ MediaPipe ──→ 17 COCO keypoints (ground truth)
|
||||||
|
(time-synchronized, 30 fps)
|
||||||
|
|
||||||
|
Deploy: CSI amplitude ──→ WiFlow ──→ 17 keypoints (camera-free, trained model only)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
Build a camera ground-truth collection and training pipeline using the laptop webcam
|
||||||
|
as a teacher signal. The camera is used **only during training data collection** and is
|
||||||
|
not required at deployment.
|
||||||
|
|
||||||
|
### Architecture Overview
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────────┐
|
||||||
|
│ Data Collection Phase │
|
||||||
|
│ │
|
||||||
|
│ ESP32-S3 nodes ──UDP──→ Sensing Server ──→ CSI frames (.jsonl) │
|
||||||
|
│ ↑ time sync │
|
||||||
|
│ Laptop Camera ──→ MediaPipe Pose ──→ Keypoints (.jsonl) │
|
||||||
|
│ ↑ │
|
||||||
|
│ collect-ground-truth.py │
|
||||||
|
│ (single orchestrator) │
|
||||||
|
└─────────────────────────────────────────────────────────────────┘
|
||||||
|
|
||||||
|
┌─────────────────────────────────────────────────────────────────┐
|
||||||
|
│ Training Phase │
|
||||||
|
│ │
|
||||||
|
│ Paired dataset: { csi_window[128,20], keypoints[17,2], conf } │
|
||||||
|
│ ↓ │
|
||||||
|
│ train-wiflow-supervised.js │
|
||||||
|
│ Phase 1: Contrastive pretrain (ADR-072, reuse) │
|
||||||
|
│ Phase 2: Supervised keypoint regression (NEW) │
|
||||||
|
│ Phase 3: Fine-tune with bone constraints + confidence │
|
||||||
|
│ ↓ │
|
||||||
|
│ WiFlow model (1.8M params) → SafeTensors export │
|
||||||
|
└─────────────────────────────────────────────────────────────────┘
|
||||||
|
|
||||||
|
┌─────────────────────────────────────────────────────────────────┐
|
||||||
|
│ Deployment (camera-free) │
|
||||||
|
│ │
|
||||||
|
│ ESP32-S3 CSI → Sensing Server → WiFlow inference → 17 keypoints│
|
||||||
|
│ (No camera. Trained model runs on CSI input only.) │
|
||||||
|
└─────────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Component 1: `scripts/collect-ground-truth.py`
|
||||||
|
|
||||||
|
Single Python script that orchestrates synchronized capture from the laptop camera
|
||||||
|
and the ESP32 CSI stream.
|
||||||
|
|
||||||
|
**Dependencies:** `mediapipe`, `opencv-python`, `requests` (all pip-installable, no GPU)
|
||||||
|
|
||||||
|
**Capture flow:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Pseudocode
|
||||||
|
camera = cv2.VideoCapture(0) # Laptop webcam
|
||||||
|
sensing_api = "http://localhost:3000" # Sensing server
|
||||||
|
|
||||||
|
# Start CSI recording via existing API
|
||||||
|
requests.post(f"{sensing_api}/api/v1/recording/start")
|
||||||
|
|
||||||
|
while recording:
|
||||||
|
frame = camera.read()
|
||||||
|
t = time.time_ns() # Nanosecond timestamp
|
||||||
|
|
||||||
|
# MediaPipe Pose: 33 landmarks → map to 17 COCO keypoints
|
||||||
|
result = mp_pose.process(frame)
|
||||||
|
keypoints_17 = map_mediapipe_to_coco(result.pose_landmarks)
|
||||||
|
confidence = mean(landmark.visibility for relevant landmarks)
|
||||||
|
|
||||||
|
# Write to ground-truth JSONL (one line per frame)
|
||||||
|
write_jsonl({
|
||||||
|
"ts_ns": t,
|
||||||
|
"keypoints": keypoints_17, # [[x,y], ...] normalized [0,1]
|
||||||
|
"confidence": confidence, # 0-1, used for loss weighting
|
||||||
|
"n_visible": count(visibility > 0.5),
|
||||||
|
})
|
||||||
|
|
||||||
|
# Optional: show live preview with skeleton overlay
|
||||||
|
if preview:
|
||||||
|
draw_skeleton(frame, keypoints_17)
|
||||||
|
cv2.imshow("Ground Truth", frame)
|
||||||
|
|
||||||
|
# Stop CSI recording
|
||||||
|
requests.post(f"{sensing_api}/api/v1/recording/stop")
|
||||||
|
```
|
||||||
|
|
||||||
|
**MediaPipe → COCO keypoint mapping:**
|
||||||
|
|
||||||
|
| COCO Index | Joint | MediaPipe Index |
|
||||||
|
|------------|-------|-----------------|
|
||||||
|
| 0 | Nose | 0 |
|
||||||
|
| 1 | Left Eye | 2 |
|
||||||
|
| 2 | Right Eye | 5 |
|
||||||
|
| 3 | Left Ear | 7 |
|
||||||
|
| 4 | Right Ear | 8 |
|
||||||
|
| 5 | Left Shoulder | 11 |
|
||||||
|
| 6 | Right Shoulder | 12 |
|
||||||
|
| 7 | Left Elbow | 13 |
|
||||||
|
| 8 | Right Elbow | 14 |
|
||||||
|
| 9 | Left Wrist | 15 |
|
||||||
|
| 10 | Right Wrist | 16 |
|
||||||
|
| 11 | Left Hip | 23 |
|
||||||
|
| 12 | Right Hip | 24 |
|
||||||
|
| 13 | Left Knee | 25 |
|
||||||
|
| 14 | Right Knee | 26 |
|
||||||
|
| 15 | Left Ankle | 27 |
|
||||||
|
| 16 | Right Ankle | 28 |
|
||||||
|
|
||||||
|
### Component 2: Time Alignment (`scripts/align-ground-truth.js`)
|
||||||
|
|
||||||
|
CSI frames arrive at ~100 Hz with server-side timestamps. Camera keypoints arrive at
|
||||||
|
~30 fps with client-side timestamps. Alignment is needed because:
|
||||||
|
|
||||||
|
1. Camera and sensing server clocks differ (typically < 50ms on LAN)
|
||||||
|
2. CSI is aggregated into 20-frame windows for WiFlow input
|
||||||
|
3. Ground-truth keypoints must be averaged over the same window
|
||||||
|
|
||||||
|
**Alignment algorithm:**
|
||||||
|
|
||||||
|
```
|
||||||
|
For each CSI window W_i (20 frames, ~200ms at 100Hz):
|
||||||
|
t_start = W_i.first_frame.timestamp
|
||||||
|
t_end = W_i.last_frame.timestamp
|
||||||
|
|
||||||
|
# Find all camera keypoints within this time window
|
||||||
|
matching_keypoints = [k for k in camera_data if t_start <= k.ts <= t_end]
|
||||||
|
|
||||||
|
if len(matching_keypoints) >= 3: # At least 3 camera frames per window
|
||||||
|
# Average keypoints, weighted by confidence
|
||||||
|
avg_keypoints = weighted_mean(matching_keypoints, weights=confidences)
|
||||||
|
avg_confidence = mean(confidences)
|
||||||
|
|
||||||
|
paired_dataset.append({
|
||||||
|
csi_window: W_i.amplitudes, # [128, 20] float32
|
||||||
|
keypoints: avg_keypoints, # [17, 2] float32
|
||||||
|
confidence: avg_confidence, # scalar
|
||||||
|
n_camera_frames: len(matching_keypoints),
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
**Clock sync strategy:**
|
||||||
|
|
||||||
|
- NTP is sufficient (< 20ms error on LAN)
|
||||||
|
- The 200ms CSI window is 10x larger than typical clock drift
|
||||||
|
- For tighter sync: use a handclap/jump as a sync marker — visible spike in both
|
||||||
|
CSI motion energy and camera skeleton velocity. Auto-detect and align.
|
||||||
|
|
||||||
|
**Output:** `data/recordings/paired-{timestamp}.jsonl` — one line per paired sample:
|
||||||
|
```json
|
||||||
|
{"csi": [128x20 flat], "kp": [[0.45,0.12], ...], "conf": 0.92, "ts": 1775300000000}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Component 3: Supervised Training (`scripts/train-wiflow-supervised.js`)
|
||||||
|
|
||||||
|
Extends the existing `train-ruvllm.js` pipeline with a supervised phase.
|
||||||
|
|
||||||
|
**Phase 1: Contrastive Pretrain (reuse ADR-072)**
|
||||||
|
- Same as existing: temporal + cross-node triplets
|
||||||
|
- Learns CSI representation without labels
|
||||||
|
- 50 epochs, ~5 min on laptop
|
||||||
|
|
||||||
|
**Phase 2: Supervised Keypoint Regression (NEW)**
|
||||||
|
- Load paired dataset from Component 2
|
||||||
|
- Loss: confidence-weighted SmoothL1 on keypoints
|
||||||
|
|
||||||
|
```
|
||||||
|
L_supervised = (1/N) * sum_i [ conf_i * SmoothL1(pred_i, gt_i, beta=0.05) ]
|
||||||
|
```
|
||||||
|
|
||||||
|
- Only train on samples where `conf > 0.5` (discard frames where MediaPipe lost tracking)
|
||||||
|
- Learning rate: 1e-4 with cosine decay
|
||||||
|
- 200 epochs, ~15 min on laptop CPU (1.8M params, no GPU needed)
|
||||||
|
|
||||||
|
**Phase 3: Refinement with Bone Constraints**
|
||||||
|
- Fine-tune with combined loss:
|
||||||
|
|
||||||
|
```
|
||||||
|
L = L_supervised + 0.3 * L_bone + 0.1 * L_temporal
|
||||||
|
|
||||||
|
L_bone = (1/14) * sum_b (bone_len_b - prior_b)^2 # ADR-072 bone priors
|
||||||
|
L_temporal = SmoothL1(kp_t, kp_{t-1}) # Temporal smoothness
|
||||||
|
```
|
||||||
|
|
||||||
|
- 50 epochs at lower LR (1e-5)
|
||||||
|
- Tighten bone constraint weight from 0.3 → 0.5 over epochs
|
||||||
|
|
||||||
|
**Phase 4: Quantization + Export**
|
||||||
|
- Reuse ruvllm TurboQuant: float32 → int8 (4x smaller, ~881 KB)
|
||||||
|
- Export via SafeTensors for cross-platform deployment
|
||||||
|
- Validate quantized model PCK@20 within 2% of full-precision
|
||||||
|
|
||||||
|
### Component 4: Evaluation Script (`scripts/eval-wiflow.js`)
|
||||||
|
|
||||||
|
Measure actual PCK@20 using held-out paired data (20% split).
|
||||||
|
|
||||||
|
```
|
||||||
|
PCK@k = (1/N) * sum_i [ (||pred_i - gt_i|| < k * torso_length) ? 1 : 0 ]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Metrics reported:**
|
||||||
|
|
||||||
|
| Metric | Description | Target |
|
||||||
|
|--------|-------------|--------|
|
||||||
|
| PCK@20 | % of keypoints within 20% torso length | > 35% |
|
||||||
|
| PCK@50 | % within 50% torso length | > 60% |
|
||||||
|
| MPJPE | Mean per-joint position error (pixels) | < 40px |
|
||||||
|
| Per-joint PCK | Breakdown by joint (wrists are hardest) | Report all 17 |
|
||||||
|
| Inference latency | Single window prediction time | < 50ms |
|
||||||
|
|
||||||
|
### Optimization Strategy
|
||||||
|
|
||||||
|
#### O1: Curriculum Learning
|
||||||
|
|
||||||
|
Train easy poses first, hard poses later:
|
||||||
|
|
||||||
|
| Stage | Epochs | Data Filter | Rationale |
|
||||||
|
|-------|--------|-------------|-----------|
|
||||||
|
| 1 | 50 | `conf > 0.9`, standing only | Establish stable skeleton baseline |
|
||||||
|
| 2 | 50 | `conf > 0.7`, low motion | Add sitting, subtle movements |
|
||||||
|
| 3 | 50 | `conf > 0.5`, all poses | Full dataset including occlusions |
|
||||||
|
| 4 | 50 | All data, with augmentation | Robustness via noise injection |
|
||||||
|
|
||||||
|
#### O2: Data Augmentation (CSI domain)
|
||||||
|
|
||||||
|
Augment CSI windows to increase effective dataset size without collecting more data:
|
||||||
|
|
||||||
|
| Augmentation | Implementation | Expected Gain |
|
||||||
|
|-------------|----------------|---------------|
|
||||||
|
| Time shift | Roll CSI window by ±2 frames | +30% data |
|
||||||
|
| Amplitude noise | Gaussian noise, sigma=0.02 | Robustness |
|
||||||
|
| Subcarrier dropout | Zero 10% of subcarriers randomly | Robustness |
|
||||||
|
| Temporal flip | Reverse window + reverse keypoint velocity | +100% data |
|
||||||
|
| Multi-node mix | Swap node CSI, keep same-time keypoints | Cross-node generalization |
|
||||||
|
|
||||||
|
#### O3: Knowledge Distillation from MediaPipe
|
||||||
|
|
||||||
|
Instead of raw keypoint regression, distill MediaPipe's confidence and heatmap
|
||||||
|
information:
|
||||||
|
|
||||||
|
```
|
||||||
|
L_distill = KL_div(softmax(wifi_heatmap / T), softmax(camera_heatmap / T))
|
||||||
|
```
|
||||||
|
|
||||||
|
- Temperature T=4 for soft targets (transfers inter-joint relationships)
|
||||||
|
- WiFlow predicts a 17-channel heatmap [17, H, W] instead of direct [17, 2]
|
||||||
|
- Argmax for final keypoint extraction
|
||||||
|
- **Trade-off:** Adds ~200K params for heatmap decoder, but improves spatial precision
|
||||||
|
|
||||||
|
#### O4: Active Learning Loop
|
||||||
|
|
||||||
|
Identify which poses the model is worst at and collect more data for those:
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Train initial model on first collection session
|
||||||
|
2. Run inference on new CSI data, compute prediction entropy
|
||||||
|
3. Flag high-entropy windows (model is uncertain)
|
||||||
|
4. During next collection, the preview overlay highlights these moments:
|
||||||
|
"Hold this pose — model needs more examples"
|
||||||
|
5. Re-train with augmented dataset
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: 2-3 active learning iterations reach saturation.
|
||||||
|
|
||||||
|
#### O6: Subcarrier Selection (ruvector-solver)
|
||||||
|
|
||||||
|
Variance-based top-K subcarrier selection, equivalent to ruvector-solver's sparse
|
||||||
|
interpolation (114→56). Removes noise/static subcarriers before training:
|
||||||
|
|
||||||
|
```
|
||||||
|
For each subcarrier d in [0, dim):
|
||||||
|
variance[d] = mean over samples of temporal_variance(csi[d, :])
|
||||||
|
Select top-K by variance (K = dim * 0.5)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Validated:** 128 → 56 subcarriers (56% input reduction), proportional model size reduction.
|
||||||
|
|
||||||
|
#### O7: Attention-Weighted Subcarriers (ruvector-attention)
|
||||||
|
|
||||||
|
Compute per-subcarrier attention weights based on temporal energy correlation with
|
||||||
|
ground-truth keypoint motion. High-energy subcarriers that covary with skeleton
|
||||||
|
movement get amplified:
|
||||||
|
|
||||||
|
```
|
||||||
|
For each subcarrier d:
|
||||||
|
energy[d] = sum of squared first-differences over time
|
||||||
|
weight[d] = softmax(energy, temperature=0.1)
|
||||||
|
Apply: csi[d, :] *= weight[d] * dim (mean weight = 1)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Validated:** Top-5 attention subcarriers identified automatically per dataset.
|
||||||
|
|
||||||
|
#### O8: Stoer-Wagner MinCut Person Separation (ruvector-mincut / ADR-075)
|
||||||
|
|
||||||
|
JS implementation of the Stoer-Wagner algorithm for person separation in CSI, equivalent
|
||||||
|
to `DynamicPersonMatcher` in `wifi-densepose-train/src/metrics.rs`. Builds a subcarrier
|
||||||
|
correlation graph and finds the minimum cut to identify person-specific subcarrier clusters:
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Build dim×dim Pearson correlation matrix across subcarriers
|
||||||
|
2. Run Stoer-Wagner min-cut on correlation graph
|
||||||
|
3. Partition subcarriers into person-specific groups
|
||||||
|
4. Train per-partition models for multi-person scenarios
|
||||||
|
```
|
||||||
|
|
||||||
|
**Validated:** Stoer-Wagner executes on 56-dim graph, identifies partition boundaries.
|
||||||
|
|
||||||
|
#### O9: Multi-SPSA Gradient Estimation
|
||||||
|
|
||||||
|
Average over K=3 random perturbation directions per gradient step. Reduces variance
|
||||||
|
by sqrt(K) = 1.73x compared to single SPSA, at 3x forward pass cost (net win for
|
||||||
|
convergence quality):
|
||||||
|
|
||||||
|
```
|
||||||
|
For k in 1..K:
|
||||||
|
delta_k = random ±1 per parameter
|
||||||
|
grad_k = (loss(w + eps*delta_k) - loss(w - eps*delta_k)) / (2*eps*delta_k)
|
||||||
|
grad = mean(grad_1, ..., grad_K)
|
||||||
|
```
|
||||||
|
|
||||||
|
#### O10: Mac M4 Pro Training via Tailscale
|
||||||
|
|
||||||
|
Training runs on Mac Mini M4 Pro (16-core GPU, ARM NEON SIMD) via Tailscale SSH,
|
||||||
|
using ruvllm's native Node.js SIMD ops:
|
||||||
|
|
||||||
|
| | Windows (CPU) | Mac M4 Pro |
|
||||||
|
|---|---|---|
|
||||||
|
| Node.js | v24.12.0 (x86) | v25.9.0 (ARM) |
|
||||||
|
| SIMD | SSE4/AVX2 | NEON |
|
||||||
|
| Cores | Consumer laptop | 12P + 4E cores |
|
||||||
|
| Training | Slow (minutes/epoch) | Fast (seconds/epoch) |
|
||||||
|
|
||||||
|
#### O5: Cross-Environment Transfer
|
||||||
|
|
||||||
|
Train on one room, deploy in another:
|
||||||
|
|
||||||
|
| Strategy | Implementation |
|
||||||
|
|----------|---------------|
|
||||||
|
| Room-invariant features | Normalize CSI by running mean/variance |
|
||||||
|
| LoRA adapters | Train a 4-rank LoRA per room (ADR-071) — 7.3 KB each |
|
||||||
|
| Few-shot calibration | 2 min of camera data in new room → fine-tune LoRA only |
|
||||||
|
| AETHER embeddings | Use contrastive room-independent features (ADR-024) as input |
|
||||||
|
|
||||||
|
The LoRA approach is most practical: ship a base model + collect 2 min of calibration
|
||||||
|
data per new room using the laptop camera.
|
||||||
|
|
||||||
|
### Data Collection Protocol
|
||||||
|
|
||||||
|
Recommended collection sessions per room:
|
||||||
|
|
||||||
|
| Session | Duration | Activity | People | Total CSI Frames |
|
||||||
|
|---------|----------|----------|--------|-----------------|
|
||||||
|
| 1. Baseline | 5 min | Empty + 1 person entry/exit | 0-1 | 30,000 |
|
||||||
|
| 2. Standing poses | 5 min | Stand, arms up/down/sides, turn | 1 | 30,000 |
|
||||||
|
| 3. Sitting | 5 min | Sit, type, lean, stand up/sit down | 1 | 30,000 |
|
||||||
|
| 4. Walking | 5 min | Walk paths across room | 1 | 30,000 |
|
||||||
|
| 5. Mixed | 5 min | Varied activities, transitions | 1 | 30,000 |
|
||||||
|
| 6. Multi-person | 5 min | 2 people, varied activities | 2 | 30,000 |
|
||||||
|
| **Total** | **30 min** | | | **180,000** |
|
||||||
|
|
||||||
|
At 20-frame windows: **9,000 paired training samples** per 30-min session.
|
||||||
|
With augmentation (O2): **~27,000 effective samples**.
|
||||||
|
|
||||||
|
Camera placement: position laptop so the camera has a clear view of the sensing area.
|
||||||
|
The camera FOV should cover the same space the ESP32 nodes cover.
|
||||||
|
|
||||||
|
### File Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
scripts/
|
||||||
|
collect-ground-truth.py # Camera capture + MediaPipe + CSI sync
|
||||||
|
align-ground-truth.js # Time-align CSI windows with camera keypoints
|
||||||
|
train-wiflow-supervised.js # Supervised training pipeline
|
||||||
|
eval-wiflow.js # PCK evaluation on held-out data
|
||||||
|
|
||||||
|
data/
|
||||||
|
ground-truth/ # Raw camera keypoint captures
|
||||||
|
gt-{timestamp}.jsonl
|
||||||
|
paired/ # Aligned CSI + keypoint pairs
|
||||||
|
paired-{timestamp}.jsonl
|
||||||
|
|
||||||
|
models/
|
||||||
|
wiflow-supervised/ # Trained model outputs
|
||||||
|
wiflow-v1.safetensors
|
||||||
|
wiflow-v1-int8.safetensors
|
||||||
|
training-log.json
|
||||||
|
eval-report.json
|
||||||
|
```
|
||||||
|
|
||||||
|
### Privacy Considerations
|
||||||
|
|
||||||
|
- Camera frames are processed **locally** by MediaPipe — no cloud upload
|
||||||
|
- Raw video is **never saved** — only extracted keypoint coordinates are stored
|
||||||
|
- The `.jsonl` ground-truth files contain only `[x,y]` joint coordinates, not images
|
||||||
|
- The trained model runs on CSI only — no camera data leaves the laptop
|
||||||
|
- Users can delete `data/ground-truth/` after training; the model is self-contained
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
### Positive
|
||||||
|
|
||||||
|
- **10-20x accuracy improvement**: PCK@20 from 2.5% → 35%+ with real supervision
|
||||||
|
- **Reuses existing infrastructure**: sensing server recording API, ruvllm training, SafeTensors
|
||||||
|
- **No new hardware**: laptop webcam + existing ESP32 nodes
|
||||||
|
- **Privacy preserved at deployment**: camera only needed during 30-min training session
|
||||||
|
- **Incremental**: can improve with more collection sessions + active learning
|
||||||
|
- **Distributable**: trained model weights can be shared on HuggingFace (ADR-070)
|
||||||
|
|
||||||
|
### Negative
|
||||||
|
|
||||||
|
- **Camera placement matters**: must see the same area ESP32 nodes sense
|
||||||
|
- **Single-room models**: need LoRA calibration per room (2 min + camera)
|
||||||
|
- **MediaPipe limitations**: occlusion, side views, multiple people reduce keypoint quality
|
||||||
|
- **Time sync**: NTP drift can misalign frames (mitigated by 200ms windows)
|
||||||
|
|
||||||
|
### Risks
|
||||||
|
|
||||||
|
| Risk | Probability | Impact | Mitigation |
|
||||||
|
|------|-------------|--------|------------|
|
||||||
|
| MediaPipe keypoints too noisy | Low | Medium | Filter by confidence; MediaPipe is robust indoors |
|
||||||
|
| Clock drift > 100ms | Low | High | Add handclap sync marker detection |
|
||||||
|
| Single camera can't see all poses | Medium | Medium | Position camera centrally; collect from 2 angles |
|
||||||
|
| Model overfits to one room | High | Medium | LoRA adapters + AETHER normalization (O5) |
|
||||||
|
| Insufficient data (< 5K pairs) | Low | High | Augmentation (O2) + active learning (O4) |
|
||||||
|
|
||||||
|
## Implementation Plan
|
||||||
|
|
||||||
|
| Phase | Task | Effort | Status |
|
||||||
|
|-------|------|--------|--------|
|
||||||
|
| P1 | `collect-ground-truth.py` — camera + MediaPipe capture | 2 hrs | **Done** |
|
||||||
|
| P2 | `align-ground-truth.js` — time alignment + pairing | 1 hr | **Done** |
|
||||||
|
| P3 | `train-wiflow-supervised.js` — supervised training | 3 hrs | **Done** |
|
||||||
|
| P4 | `eval-wiflow.js` — PCK evaluation | 1 hr | **Done** |
|
||||||
|
| P5 | ruvector optimizations (O6-O9) | 2 hrs | **Done** |
|
||||||
|
| P6 | Mac M4 Pro training via Tailscale (O10) | 1 hr | **Done** |
|
||||||
|
| P7 | Data collection session (30 min recording) | 1 hr | Pending |
|
||||||
|
| P8 | Training + evaluation on real paired data | 30 min | Pending |
|
||||||
|
| P9 | LoRA cross-room calibration (O5) | 2 hrs | Pending |
|
||||||
|
|
||||||
|
## Validated Hardware
|
||||||
|
|
||||||
|
| Component | Spec | Validated |
|
||||||
|
|-----------|------|-----------|
|
||||||
|
| Mac Mini camera | 1920x1080, 30fps | Yes — 14/17 keypoints, conf 0.94-1.0 |
|
||||||
|
| MediaPipe PoseLandmarker | v0.10.33 Tasks API, lite model | Yes — via Tailscale SSH |
|
||||||
|
| Mac M4 Pro GPU | 16-core, Metal 4, NEON SIMD | Yes — Node.js v25.9.0 |
|
||||||
|
| Tailscale SSH | LAN-accessible Mac, passwordless | Yes |
|
||||||
|
| ESP32-S3 CSI | 128 subcarriers, 100Hz | Yes — existing recordings |
|
||||||
|
| Sensing server recording API | `/api/v1/recording/start\|stop` | Yes — existing |
|
||||||
|
|
||||||
|
## Baseline Benchmark
|
||||||
|
|
||||||
|
Proxy-pose baseline (no camera supervision, standing skeleton heuristic):
|
||||||
|
|
||||||
|
```
|
||||||
|
PCK@10: 11.8%
|
||||||
|
PCK@20: 35.3%
|
||||||
|
PCK@50: 94.1%
|
||||||
|
MPJPE: 0.067
|
||||||
|
Latency: 0.03ms/sample
|
||||||
|
```
|
||||||
|
|
||||||
|
Per-joint PCK@20: upper body (nose, shoulders, wrists) at 0% — proxy has no spatial
|
||||||
|
accuracy for these. Camera supervision targets these joints specifically.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- WiFlow: arXiv:2602.08661 — WiFi-based pose estimation with TCN + axial attention
|
||||||
|
- Wi-Pose (CVPR 2021) — 3D CNN WiFi pose with camera supervision
|
||||||
|
- Person-in-WiFi 3D (CVPR 2024) — Deformable attention with camera labels
|
||||||
|
- MediaPipe Pose — Google's real-time 33-landmark body pose estimator
|
||||||
|
- MetaFi++ (NeurIPS 2023) — Meta-learning cross-modal WiFi sensing
|
||||||
|
|
@ -1055,6 +1055,65 @@ See [ADR-071](adr/ADR-071-ruvllm-training-pipeline.md) and the [pretraining tuto
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Camera-Supervised Pose Training (v0.7.0)
|
||||||
|
|
||||||
|
For significantly higher accuracy, use a webcam as a **temporary teacher** during training. The camera captures real 17-keypoint poses via MediaPipe, paired with simultaneous ESP32 CSI data. After training, the camera is no longer needed — the model runs on CSI only.
|
||||||
|
|
||||||
|
**Result: 92.9% PCK@20** from a 5-minute collection session.
|
||||||
|
|
||||||
|
### Requirements
|
||||||
|
|
||||||
|
- Python 3.9+ with `mediapipe` and `opencv-python` (`pip install mediapipe opencv-python`)
|
||||||
|
- ESP32-S3 node streaming CSI over UDP (port 5005)
|
||||||
|
- A webcam (laptop, USB, or Mac camera via Tailscale)
|
||||||
|
|
||||||
|
### Step 1: Capture Camera + CSI Simultaneously
|
||||||
|
|
||||||
|
Run both scripts at the same time (in separate terminals):
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Terminal 1: Record ESP32 CSI
|
||||||
|
python scripts/record-csi-udp.py --duration 300
|
||||||
|
|
||||||
|
# Terminal 2: Capture camera keypoints
|
||||||
|
python scripts/collect-ground-truth.py --duration 300 --preview
|
||||||
|
```
|
||||||
|
|
||||||
|
Move around naturally in front of the camera for 5 minutes. The `--preview` flag shows a live skeleton overlay.
|
||||||
|
|
||||||
|
### Step 2: Align and Train
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Align camera keypoints with CSI windows
|
||||||
|
node scripts/align-ground-truth.js \
|
||||||
|
--gt data/ground-truth/*.jsonl \
|
||||||
|
--csi data/recordings/csi-*.csi.jsonl
|
||||||
|
|
||||||
|
# Train (start with lite, scale up as you collect more data)
|
||||||
|
node scripts/train-wiflow-supervised.js \
|
||||||
|
--data data/paired/*.jsonl \
|
||||||
|
--scale lite \
|
||||||
|
--epochs 50
|
||||||
|
|
||||||
|
# Evaluate
|
||||||
|
node scripts/eval-wiflow.js \
|
||||||
|
--model models/wiflow-supervised/wiflow-v1.json \
|
||||||
|
--data data/paired/*.jsonl
|
||||||
|
```
|
||||||
|
|
||||||
|
### Scale Presets
|
||||||
|
|
||||||
|
| Preset | Params | Training Time | Best For |
|
||||||
|
|--------|--------|---------------|----------|
|
||||||
|
| `--scale lite` | 189K | ~19 min | < 1,000 samples (5 min capture) |
|
||||||
|
| `--scale small` | 474K | ~1 hr | 1K-10K samples |
|
||||||
|
| `--scale medium` | 800K | ~2 hrs | 10K-50K samples |
|
||||||
|
| `--scale full` | 7.7M | ~8 hrs | 50K+ samples (GPU recommended) |
|
||||||
|
|
||||||
|
See [ADR-079](adr/ADR-079-camera-ground-truth-training.md) for the full design and optimization details.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Pre-Trained Models (No Training Required)
|
## Pre-Trained Models (No Training Required)
|
||||||
|
|
||||||
Pre-trained models are available on HuggingFace: **https://huggingface.co/ruvnet/wifi-densepose-pretrained**
|
Pre-trained models are available on HuggingFace: **https://huggingface.co/ruvnet/wifi-densepose-pretrained**
|
||||||
|
|
|
||||||
477
scripts/align-ground-truth.js
Normal file
477
scripts/align-ground-truth.js
Normal file
|
|
@ -0,0 +1,477 @@
|
||||||
|
#!/usr/bin/env node
|
||||||
|
/**
|
||||||
|
* Ground-Truth Alignment — Camera Keypoints <-> CSI Recording
|
||||||
|
*
|
||||||
|
* Time-aligns camera keypoint data with CSI recording data to produce
|
||||||
|
* paired training samples for WiFlow supervised training (ADR-079).
|
||||||
|
*
|
||||||
|
* Camera keypoints: data/ground-truth/gt-{timestamp}.jsonl
|
||||||
|
* CSI recordings: data/recordings/*.csi.jsonl
|
||||||
|
* Paired output: data/paired/*.paired.jsonl
|
||||||
|
*
|
||||||
|
* Usage:
|
||||||
|
* node scripts/align-ground-truth.js \
|
||||||
|
* --gt data/ground-truth/gt-1775300000.jsonl \
|
||||||
|
* --csi data/recordings/overnight-1775217646.csi.jsonl \
|
||||||
|
* --output data/paired/aligned.paired.jsonl
|
||||||
|
*
|
||||||
|
* # With clock offset correction (camera ahead by 50ms)
|
||||||
|
* node scripts/align-ground-truth.js \
|
||||||
|
* --gt data/ground-truth/gt-1775300000.jsonl \
|
||||||
|
* --csi data/recordings/overnight-1775217646.csi.jsonl \
|
||||||
|
* --clock-offset-ms -50
|
||||||
|
*
|
||||||
|
* ADR: docs/adr/ADR-079
|
||||||
|
*/
|
||||||
|
|
||||||
|
'use strict';
|
||||||
|
|
||||||
|
const fs = require('fs');
|
||||||
|
const path = require('path');
|
||||||
|
const { parseArgs } = require('util');
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// CLI argument parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
const { values: args } = parseArgs({
|
||||||
|
options: {
|
||||||
|
gt: { type: 'string' },
|
||||||
|
csi: { type: 'string' },
|
||||||
|
output: { type: 'string', short: 'o' },
|
||||||
|
'window-ms': { type: 'string', default: '200' },
|
||||||
|
'window-frames': { type: 'string', default: '20' },
|
||||||
|
'min-camera-frames': { type: 'string', default: '3' },
|
||||||
|
'min-confidence': { type: 'string', default: '0.5' },
|
||||||
|
'clock-offset-ms': { type: 'string', default: '0' },
|
||||||
|
help: { type: 'boolean', short: 'h', default: false },
|
||||||
|
},
|
||||||
|
strict: true,
|
||||||
|
});
|
||||||
|
|
||||||
|
if (args.help || !args.gt || !args.csi) {
|
||||||
|
console.log(`
|
||||||
|
Usage: node scripts/align-ground-truth.js --gt <gt.jsonl> --csi <csi.jsonl> [options]
|
||||||
|
|
||||||
|
Required:
|
||||||
|
--gt <path> Camera ground-truth JSONL file
|
||||||
|
--csi <path> CSI recording JSONL file
|
||||||
|
|
||||||
|
Options:
|
||||||
|
--output, -o <path> Output paired JSONL (default: data/paired/<basename>.paired.jsonl)
|
||||||
|
--window-ms <ms> CSI window size in ms (default: 200)
|
||||||
|
--window-frames <n> Frames per CSI window (default: 20)
|
||||||
|
--min-camera-frames <n> Minimum camera frames per window (default: 3)
|
||||||
|
--min-confidence <f> Minimum average confidence threshold (default: 0.5)
|
||||||
|
--clock-offset-ms <ms> Manual clock offset: added to camera timestamps (default: 0)
|
||||||
|
--help, -h Show this help
|
||||||
|
`);
|
||||||
|
process.exit(args.help ? 0 : 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
const WINDOW_FRAMES = parseInt(args['window-frames'], 10);
|
||||||
|
const WINDOW_MS = parseInt(args['window-ms'], 10);
|
||||||
|
const MIN_CAMERA_FRAMES = parseInt(args['min-camera-frames'], 10);
|
||||||
|
const MIN_CONFIDENCE = parseFloat(args['min-confidence']);
|
||||||
|
const CLOCK_OFFSET_MS = parseFloat(args['clock-offset-ms']);
|
||||||
|
const NUM_KEYPOINTS = 17; // COCO 17-keypoint format
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Timestamp conversion
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Convert camera nanosecond timestamp to milliseconds.
|
||||||
|
* Applies clock offset correction.
|
||||||
|
*/
|
||||||
|
function cameraTsToMs(tsNs) {
|
||||||
|
return tsNs / 1e6 + CLOCK_OFFSET_MS;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Convert ISO 8601 timestamp string to milliseconds since epoch.
|
||||||
|
*/
|
||||||
|
function isoToMs(isoStr) {
|
||||||
|
return new Date(isoStr).getTime();
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// IQ hex parsing (matches train-wiflow.js conventions)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Parse IQ hex string into signed byte pairs [I0, Q0, I1, Q1, ...].
|
||||||
|
*/
|
||||||
|
function parseIqHex(hexStr) {
|
||||||
|
const bytes = [];
|
||||||
|
for (let i = 0; i < hexStr.length; i += 2) {
|
||||||
|
let val = parseInt(hexStr.substr(i, 2), 16);
|
||||||
|
if (val > 127) val -= 256; // signed byte
|
||||||
|
bytes.push(val);
|
||||||
|
}
|
||||||
|
return bytes;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Extract amplitude from IQ data for a given number of subcarriers.
|
||||||
|
* Returns Float32Array of amplitudes [nSubcarriers].
|
||||||
|
* Skips first I/Q pair (DC offset) per WiFlow paper recommendation.
|
||||||
|
*/
|
||||||
|
function extractAmplitude(iqBytes, nSubcarriers) {
|
||||||
|
const amp = new Float32Array(nSubcarriers);
|
||||||
|
const start = 2; // skip first IQ pair (DC offset)
|
||||||
|
for (let sc = 0; sc < nSubcarriers; sc++) {
|
||||||
|
const idx = start + sc * 2;
|
||||||
|
if (idx + 1 < iqBytes.length) {
|
||||||
|
const I = iqBytes[idx];
|
||||||
|
const Q = iqBytes[idx + 1];
|
||||||
|
amp[sc] = Math.sqrt(I * I + Q * Q);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return amp;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// File loading
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load and parse a JSONL file, skipping blank/malformed lines.
|
||||||
|
*/
|
||||||
|
function loadJsonl(filePath) {
|
||||||
|
const lines = fs.readFileSync(filePath, 'utf8').split('\n');
|
||||||
|
const records = [];
|
||||||
|
for (const line of lines) {
|
||||||
|
const trimmed = line.trim();
|
||||||
|
if (!trimmed) continue;
|
||||||
|
try {
|
||||||
|
records.push(JSON.parse(trimmed));
|
||||||
|
} catch {
|
||||||
|
// skip malformed lines
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return records;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load camera ground-truth file.
|
||||||
|
* Returns array of { tsMs, keypoints, confidence, nVisible, nPersons }.
|
||||||
|
*/
|
||||||
|
function loadGroundTruth(filePath) {
|
||||||
|
const raw = loadJsonl(filePath);
|
||||||
|
const frames = [];
|
||||||
|
for (const r of raw) {
|
||||||
|
if (r.ts_ns == null || !r.keypoints) continue;
|
||||||
|
frames.push({
|
||||||
|
tsMs: cameraTsToMs(r.ts_ns),
|
||||||
|
keypoints: r.keypoints,
|
||||||
|
confidence: r.confidence ?? 0,
|
||||||
|
nVisible: r.n_visible ?? 0,
|
||||||
|
nPersons: r.n_persons ?? 1,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
// Sort by timestamp
|
||||||
|
frames.sort((a, b) => a.tsMs - b.tsMs);
|
||||||
|
return frames;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load CSI recording file.
|
||||||
|
* Separates raw_csi frames and feature frames.
|
||||||
|
*/
|
||||||
|
function loadCsi(filePath) {
|
||||||
|
const raw = loadJsonl(filePath);
|
||||||
|
const rawCsi = [];
|
||||||
|
const features = [];
|
||||||
|
|
||||||
|
for (const r of raw) {
|
||||||
|
if (!r.timestamp) continue;
|
||||||
|
const tsMs = isoToMs(r.timestamp);
|
||||||
|
if (isNaN(tsMs)) continue;
|
||||||
|
|
||||||
|
if (r.type === 'raw_csi') {
|
||||||
|
rawCsi.push({
|
||||||
|
tsMs,
|
||||||
|
nodeId: r.node_id,
|
||||||
|
subcarriers: r.subcarriers ?? 128,
|
||||||
|
iqHex: r.iq_hex,
|
||||||
|
rssi: r.rssi,
|
||||||
|
seq: r.seq,
|
||||||
|
});
|
||||||
|
} else if (r.type === 'feature') {
|
||||||
|
features.push({
|
||||||
|
tsMs,
|
||||||
|
nodeId: r.node_id,
|
||||||
|
features: r.features,
|
||||||
|
rssi: r.rssi,
|
||||||
|
seq: r.seq,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Sort by timestamp
|
||||||
|
rawCsi.sort((a, b) => a.tsMs - b.tsMs);
|
||||||
|
features.sort((a, b) => a.tsMs - b.tsMs);
|
||||||
|
return { rawCsi, features };
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Windowing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Group frames into non-overlapping windows of `windowSize` consecutive frames.
|
||||||
|
*/
|
||||||
|
function groupIntoWindows(frames, windowSize) {
|
||||||
|
const windows = [];
|
||||||
|
for (let i = 0; i + windowSize <= frames.length; i += windowSize) {
|
||||||
|
windows.push(frames.slice(i, i + windowSize));
|
||||||
|
}
|
||||||
|
return windows;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Camera frame matching (binary search)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Find all camera frames within [tStart, tEnd] using binary search.
|
||||||
|
*/
|
||||||
|
function findCameraFramesInRange(cameraFrames, tStartMs, tEndMs) {
|
||||||
|
// Binary search for first frame >= tStartMs
|
||||||
|
let lo = 0;
|
||||||
|
let hi = cameraFrames.length;
|
||||||
|
while (lo < hi) {
|
||||||
|
const mid = (lo + hi) >>> 1;
|
||||||
|
if (cameraFrames[mid].tsMs < tStartMs) lo = mid + 1;
|
||||||
|
else hi = mid;
|
||||||
|
}
|
||||||
|
|
||||||
|
const matched = [];
|
||||||
|
for (let i = lo; i < cameraFrames.length; i++) {
|
||||||
|
if (cameraFrames[i].tsMs > tEndMs) break;
|
||||||
|
matched.push(cameraFrames[i]);
|
||||||
|
}
|
||||||
|
return matched;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Keypoint averaging (confidence-weighted)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Average keypoints weighted by per-frame confidence.
|
||||||
|
* Returns { keypoints: [[x,y],...], avgConfidence }.
|
||||||
|
*/
|
||||||
|
function averageKeypoints(cameraFrames) {
|
||||||
|
let totalWeight = 0;
|
||||||
|
const sumKp = new Array(NUM_KEYPOINTS).fill(null).map(() => [0, 0]);
|
||||||
|
|
||||||
|
for (const f of cameraFrames) {
|
||||||
|
const w = f.confidence || 1e-6;
|
||||||
|
totalWeight += w;
|
||||||
|
for (let k = 0; k < NUM_KEYPOINTS && k < f.keypoints.length; k++) {
|
||||||
|
sumKp[k][0] += f.keypoints[k][0] * w;
|
||||||
|
sumKp[k][1] += f.keypoints[k][1] * w;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (totalWeight === 0) totalWeight = 1;
|
||||||
|
const keypoints = sumKp.map(([x, y]) => [x / totalWeight, y / totalWeight]);
|
||||||
|
const avgConfidence = cameraFrames.reduce((s, f) => s + (f.confidence || 0), 0) / cameraFrames.length;
|
||||||
|
|
||||||
|
return { keypoints, avgConfidence };
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// CSI matrix extraction
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Extract CSI amplitude matrix from raw_csi window.
|
||||||
|
* Returns { data: flat Float32Array, shape: [subcarriers, windowFrames] }.
|
||||||
|
*/
|
||||||
|
function extractCsiMatrix(window) {
|
||||||
|
const nFrames = window.length;
|
||||||
|
const nSc = window[0].subcarriers || 128;
|
||||||
|
const matrix = new Float32Array(nSc * nFrames);
|
||||||
|
|
||||||
|
for (let f = 0; f < nFrames; f++) {
|
||||||
|
const frame = window[f];
|
||||||
|
if (frame.iqHex) {
|
||||||
|
const iq = parseIqHex(frame.iqHex);
|
||||||
|
const amp = extractAmplitude(iq, nSc);
|
||||||
|
matrix.set(amp, f * nSc);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return { data: Array.from(matrix), shape: [nSc, nFrames] };
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Extract feature matrix from feature-type window.
|
||||||
|
* Returns { data: flat array, shape: [featureDim, windowFrames] }.
|
||||||
|
*/
|
||||||
|
function extractFeatureMatrix(window) {
|
||||||
|
const nFrames = window.length;
|
||||||
|
const dim = window[0].features ? window[0].features.length : 8;
|
||||||
|
const matrix = new Float32Array(dim * nFrames);
|
||||||
|
|
||||||
|
for (let f = 0; f < nFrames; f++) {
|
||||||
|
const feats = window[f].features || new Array(dim).fill(0);
|
||||||
|
for (let d = 0; d < dim; d++) {
|
||||||
|
matrix[f * dim + d] = feats[d] || 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return { data: Array.from(matrix), shape: [dim, nFrames] };
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Main alignment
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
function align() {
|
||||||
|
const gtPath = path.resolve(args.gt);
|
||||||
|
const csiPath = path.resolve(args.csi);
|
||||||
|
|
||||||
|
// Determine output path
|
||||||
|
let outputPath;
|
||||||
|
if (args.output) {
|
||||||
|
outputPath = path.resolve(args.output);
|
||||||
|
} else {
|
||||||
|
const baseName = path.basename(csiPath, '.csi.jsonl');
|
||||||
|
outputPath = path.resolve('data', 'paired', `${baseName}.paired.jsonl`);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Ensure output directory exists
|
||||||
|
const outputDir = path.dirname(outputPath);
|
||||||
|
if (!fs.existsSync(outputDir)) {
|
||||||
|
fs.mkdirSync(outputDir, { recursive: true });
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log('=== Ground-Truth Alignment (ADR-079) ===');
|
||||||
|
console.log(` GT file: ${gtPath}`);
|
||||||
|
console.log(` CSI file: ${csiPath}`);
|
||||||
|
console.log(` Output: ${outputPath}`);
|
||||||
|
console.log(` Window: ${WINDOW_FRAMES} frames / ${WINDOW_MS} ms`);
|
||||||
|
console.log(` Min camera frames: ${MIN_CAMERA_FRAMES}`);
|
||||||
|
console.log(` Min confidence: ${MIN_CONFIDENCE}`);
|
||||||
|
console.log(` Clock offset: ${CLOCK_OFFSET_MS} ms`);
|
||||||
|
console.log();
|
||||||
|
|
||||||
|
// Load data
|
||||||
|
console.log('Loading ground-truth...');
|
||||||
|
const cameraFrames = loadGroundTruth(gtPath);
|
||||||
|
console.log(` ${cameraFrames.length} camera frames loaded`);
|
||||||
|
if (cameraFrames.length > 0) {
|
||||||
|
console.log(` Time range: ${new Date(cameraFrames[0].tsMs).toISOString()} -> ${new Date(cameraFrames[cameraFrames.length - 1].tsMs).toISOString()}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log('Loading CSI data...');
|
||||||
|
const { rawCsi, features } = loadCsi(csiPath);
|
||||||
|
console.log(` ${rawCsi.length} raw_csi frames, ${features.length} feature frames`);
|
||||||
|
|
||||||
|
// Decide which CSI source to use
|
||||||
|
const useRawCsi = rawCsi.length >= WINDOW_FRAMES;
|
||||||
|
const csiSource = useRawCsi ? rawCsi : features;
|
||||||
|
const sourceLabel = useRawCsi ? 'raw_csi' : 'feature';
|
||||||
|
|
||||||
|
if (csiSource.length < WINDOW_FRAMES) {
|
||||||
|
console.error(`ERROR: Not enough CSI frames (${csiSource.length}) for even one window of ${WINDOW_FRAMES} frames.`);
|
||||||
|
process.exit(1);
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log(` Using ${sourceLabel} frames (${csiSource.length} total)`);
|
||||||
|
if (csiSource.length > 0) {
|
||||||
|
console.log(` CSI time range: ${new Date(csiSource[0].tsMs).toISOString()} -> ${new Date(csiSource[csiSource.length - 1].tsMs).toISOString()}`);
|
||||||
|
}
|
||||||
|
console.log();
|
||||||
|
|
||||||
|
// Group CSI into windows
|
||||||
|
const windows = groupIntoWindows(csiSource, WINDOW_FRAMES);
|
||||||
|
console.log(`Grouped into ${windows.length} CSI windows`);
|
||||||
|
|
||||||
|
// Align
|
||||||
|
const paired = [];
|
||||||
|
let totalConfidence = 0;
|
||||||
|
|
||||||
|
for (const window of windows) {
|
||||||
|
const tStartMs = window[0].tsMs;
|
||||||
|
const tEndMs = window[window.length - 1].tsMs;
|
||||||
|
|
||||||
|
// Expand window if actual time span is smaller than window-ms
|
||||||
|
const halfWindow = WINDOW_MS / 2;
|
||||||
|
const midpoint = (tStartMs + tEndMs) / 2;
|
||||||
|
const searchStart = Math.min(tStartMs, midpoint - halfWindow);
|
||||||
|
const searchEnd = Math.max(tEndMs, midpoint + halfWindow);
|
||||||
|
|
||||||
|
// Find matching camera frames
|
||||||
|
const matched = findCameraFramesInRange(cameraFrames, searchStart, searchEnd);
|
||||||
|
|
||||||
|
if (matched.length < MIN_CAMERA_FRAMES) continue;
|
||||||
|
|
||||||
|
// Check average confidence
|
||||||
|
const avgConf = matched.reduce((s, f) => s + (f.confidence || 0), 0) / matched.length;
|
||||||
|
if (avgConf < MIN_CONFIDENCE) continue;
|
||||||
|
|
||||||
|
// Average keypoints weighted by confidence
|
||||||
|
const { keypoints, avgConfidence } = averageKeypoints(matched);
|
||||||
|
|
||||||
|
// Extract CSI matrix
|
||||||
|
const csiMatrix = useRawCsi
|
||||||
|
? extractCsiMatrix(window)
|
||||||
|
: extractFeatureMatrix(window);
|
||||||
|
|
||||||
|
paired.push({
|
||||||
|
csi: csiMatrix.data,
|
||||||
|
csi_shape: csiMatrix.shape,
|
||||||
|
kp: keypoints,
|
||||||
|
conf: Math.round(avgConfidence * 1000) / 1000,
|
||||||
|
n_camera_frames: matched.length,
|
||||||
|
ts_start: new Date(tStartMs).toISOString(),
|
||||||
|
ts_end: new Date(tEndMs).toISOString(),
|
||||||
|
});
|
||||||
|
|
||||||
|
totalConfidence += avgConfidence;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Write output
|
||||||
|
const outputLines = paired.map(s => JSON.stringify(s));
|
||||||
|
fs.writeFileSync(outputPath, outputLines.join('\n') + (outputLines.length > 0 ? '\n' : ''));
|
||||||
|
|
||||||
|
// Print summary
|
||||||
|
const alignmentRate = windows.length > 0 ? (paired.length / windows.length * 100) : 0;
|
||||||
|
const avgPairedConf = paired.length > 0 ? (totalConfidence / paired.length) : 0;
|
||||||
|
|
||||||
|
console.log();
|
||||||
|
console.log('=== Alignment Summary ===');
|
||||||
|
console.log(` Total CSI windows: ${windows.length}`);
|
||||||
|
console.log(` Paired samples: ${paired.length}`);
|
||||||
|
console.log(` Alignment rate: ${alignmentRate.toFixed(1)}%`);
|
||||||
|
console.log(` Avg confidence (paired): ${avgPairedConf.toFixed(3)}`);
|
||||||
|
console.log(` CSI source: ${sourceLabel} (${csiMatrix_shapeLabel(paired, useRawCsi)})`);
|
||||||
|
if (paired.length > 0) {
|
||||||
|
console.log(` Time range covered: ${paired[0].ts_start} -> ${paired[paired.length - 1].ts_end}`);
|
||||||
|
}
|
||||||
|
console.log(` Output written: ${outputPath}`);
|
||||||
|
console.log();
|
||||||
|
|
||||||
|
if (paired.length === 0) {
|
||||||
|
console.log('WARNING: No paired samples produced. Check that camera and CSI time ranges overlap.');
|
||||||
|
console.log(' Hint: Use --clock-offset-ms to correct misaligned clocks.');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Format CSI matrix shape label for summary.
|
||||||
|
*/
|
||||||
|
function csiMatrix_shapeLabel(paired, useRawCsi) {
|
||||||
|
if (paired.length === 0) return useRawCsi ? `[128, ${WINDOW_FRAMES}]` : `[8, ${WINDOW_FRAMES}]`;
|
||||||
|
const shape = paired[0].csi_shape;
|
||||||
|
return `[${shape[0]}, ${shape[1]}]`;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Entry point
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
align();
|
||||||
341
scripts/collect-ground-truth.py
Normal file
341
scripts/collect-ground-truth.py
Normal file
|
|
@ -0,0 +1,341 @@
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
"""Camera ground-truth collection for WiFi pose estimation training (ADR-079).
|
||||||
|
|
||||||
|
Captures webcam keypoints via MediaPipe PoseLandmarker (Tasks API) and
|
||||||
|
synchronizes with ESP32 CSI recording from the sensing server.
|
||||||
|
|
||||||
|
Output: JSONL file in data/ground-truth/ with per-frame 17-keypoint COCO poses.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python scripts/collect-ground-truth.py --preview --duration 60
|
||||||
|
python scripts/collect-ground-truth.py --server http://192.168.1.10:3000
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import signal
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
import urllib.request
|
||||||
|
import urllib.error
|
||||||
|
from pathlib import Path
|
||||||
|
from datetime import datetime
|
||||||
|
|
||||||
|
import cv2
|
||||||
|
import numpy as np
|
||||||
|
|
||||||
|
import mediapipe as mp
|
||||||
|
from mediapipe.tasks.python import BaseOptions
|
||||||
|
from mediapipe.tasks.python.vision import (
|
||||||
|
PoseLandmarker,
|
||||||
|
PoseLandmarkerOptions,
|
||||||
|
RunningMode,
|
||||||
|
)
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# MediaPipe 33 landmarks -> 17 COCO keypoints
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# COCO idx : MP idx : joint name
|
||||||
|
# 0 : 0 : nose
|
||||||
|
# 1 : 2 : left_eye
|
||||||
|
# 2 : 5 : right_eye
|
||||||
|
# 3 : 7 : left_ear
|
||||||
|
# 4 : 8 : right_ear
|
||||||
|
# 5 : 11 : left_shoulder
|
||||||
|
# 6 : 12 : right_shoulder
|
||||||
|
# 7 : 13 : left_elbow
|
||||||
|
# 8 : 14 : right_elbow
|
||||||
|
# 9 : 15 : left_wrist
|
||||||
|
# 10 : 16 : right_wrist
|
||||||
|
# 11 : 23 : left_hip
|
||||||
|
# 12 : 24 : right_hip
|
||||||
|
# 13 : 25 : left_knee
|
||||||
|
# 14 : 26 : right_knee
|
||||||
|
# 15 : 27 : left_ankle
|
||||||
|
# 16 : 28 : right_ankle
|
||||||
|
|
||||||
|
MP_TO_COCO = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]
|
||||||
|
|
||||||
|
COCO_BONES = [
|
||||||
|
(5, 7), (7, 9), (6, 8), (8, 10), # arms
|
||||||
|
(5, 6), # shoulders
|
||||||
|
(11, 13), (13, 15), (12, 14), (14, 16), # legs
|
||||||
|
(11, 12), # hips
|
||||||
|
(5, 11), (6, 12), # torso
|
||||||
|
(0, 1), (0, 2), (1, 3), (2, 4), # face
|
||||||
|
]
|
||||||
|
|
||||||
|
MODEL_URL = (
|
||||||
|
"https://storage.googleapis.com/mediapipe-models/"
|
||||||
|
"pose_landmarker/pose_landmarker_lite/float16/latest/"
|
||||||
|
"pose_landmarker_lite.task"
|
||||||
|
)
|
||||||
|
MODEL_FILENAME = "pose_landmarker_lite.task"
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def ensure_model(cache_dir: Path) -> Path:
|
||||||
|
"""Download the PoseLandmarker model if not already cached."""
|
||||||
|
model_path = cache_dir / MODEL_FILENAME
|
||||||
|
if model_path.exists():
|
||||||
|
return model_path
|
||||||
|
|
||||||
|
cache_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
print(f"Downloading {MODEL_FILENAME} ...")
|
||||||
|
try:
|
||||||
|
urllib.request.urlretrieve(MODEL_URL, str(model_path))
|
||||||
|
print(f" saved to {model_path}")
|
||||||
|
except Exception as exc:
|
||||||
|
print(f"ERROR: Failed to download model: {exc}", file=sys.stderr)
|
||||||
|
print(
|
||||||
|
"Download manually from:\n"
|
||||||
|
f" {MODEL_URL}\n"
|
||||||
|
f"and place at {model_path}",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
sys.exit(1)
|
||||||
|
return model_path
|
||||||
|
|
||||||
|
|
||||||
|
def post_json(url: str, payload: dict | None = None, timeout: float = 5.0) -> bool:
|
||||||
|
"""POST JSON to a URL. Returns True on success, False on failure."""
|
||||||
|
data = json.dumps(payload or {}).encode("utf-8")
|
||||||
|
req = urllib.request.Request(
|
||||||
|
url,
|
||||||
|
data=data,
|
||||||
|
headers={"Content-Type": "application/json"},
|
||||||
|
method="POST",
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||||
|
return 200 <= resp.status < 300
|
||||||
|
except Exception as exc:
|
||||||
|
print(f"WARNING: POST {url} failed: {exc}", file=sys.stderr)
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def draw_skeleton(frame: np.ndarray, keypoints: list[list[float]], w: int, h: int):
|
||||||
|
"""Draw COCO skeleton overlay on a BGR frame."""
|
||||||
|
pts = []
|
||||||
|
for x, y in keypoints:
|
||||||
|
px, py = int(x * w), int(y * h)
|
||||||
|
pts.append((px, py))
|
||||||
|
cv2.circle(frame, (px, py), 4, (0, 255, 0), -1)
|
||||||
|
|
||||||
|
for i, j in COCO_BONES:
|
||||||
|
if i < len(pts) and j < len(pts):
|
||||||
|
cv2.line(frame, pts[i], pts[j], (0, 200, 255), 2)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Main collection loop
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
description="Collect camera ground-truth keypoints for WiFi pose training (ADR-079)."
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--server",
|
||||||
|
default="http://localhost:3000",
|
||||||
|
help="Sensing server URL (default: http://localhost:3000)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--preview",
|
||||||
|
action="store_true",
|
||||||
|
help="Show live skeleton overlay window",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--duration",
|
||||||
|
type=int,
|
||||||
|
default=300,
|
||||||
|
help="Recording duration in seconds (default: 300)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--camera",
|
||||||
|
type=int,
|
||||||
|
default=0,
|
||||||
|
help="Camera device index (default: 0)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--output",
|
||||||
|
default="data/ground-truth",
|
||||||
|
help="Output directory (default: data/ground-truth)",
|
||||||
|
)
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
# --- Resolve paths relative to repo root ---
|
||||||
|
repo_root = Path(__file__).resolve().parent.parent
|
||||||
|
output_dir = repo_root / args.output
|
||||||
|
output_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
cache_dir = repo_root / "data" / ".cache"
|
||||||
|
|
||||||
|
# --- Download / locate model ---
|
||||||
|
model_path = ensure_model(cache_dir)
|
||||||
|
|
||||||
|
# --- Open camera ---
|
||||||
|
cap = cv2.VideoCapture(args.camera)
|
||||||
|
if not cap.isOpened():
|
||||||
|
print(
|
||||||
|
f"ERROR: Cannot open camera index {args.camera}. "
|
||||||
|
"Check that a webcam is connected and not in use by another app.",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
frame_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
|
||||||
|
frame_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
|
||||||
|
print(f"Camera opened: {frame_w}x{frame_h}")
|
||||||
|
|
||||||
|
# --- Create PoseLandmarker ---
|
||||||
|
options = PoseLandmarkerOptions(
|
||||||
|
base_options=BaseOptions(model_asset_path=str(model_path)),
|
||||||
|
running_mode=RunningMode.IMAGE,
|
||||||
|
num_poses=1,
|
||||||
|
min_pose_detection_confidence=0.5,
|
||||||
|
min_pose_presence_confidence=0.5,
|
||||||
|
min_tracking_confidence=0.5,
|
||||||
|
)
|
||||||
|
landmarker = PoseLandmarker.create_from_options(options)
|
||||||
|
|
||||||
|
# --- Output file ---
|
||||||
|
timestamp_str = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||||
|
out_path = output_dir / f"keypoints_{timestamp_str}.jsonl"
|
||||||
|
out_file = open(out_path, "w", encoding="utf-8")
|
||||||
|
print(f"Output: {out_path}")
|
||||||
|
|
||||||
|
# --- Start CSI recording ---
|
||||||
|
recording_url_start = f"{args.server}/api/v1/recording/start"
|
||||||
|
recording_url_stop = f"{args.server}/api/v1/recording/stop"
|
||||||
|
csi_started = post_json(recording_url_start)
|
||||||
|
if csi_started:
|
||||||
|
print("CSI recording started on sensing server.")
|
||||||
|
else:
|
||||||
|
print(
|
||||||
|
"WARNING: Could not start CSI recording. "
|
||||||
|
"Camera keypoints will still be captured.",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
|
||||||
|
# --- Graceful shutdown ---
|
||||||
|
shutdown_requested = False
|
||||||
|
|
||||||
|
def _handle_signal(signum, frame):
|
||||||
|
nonlocal shutdown_requested
|
||||||
|
shutdown_requested = True
|
||||||
|
|
||||||
|
signal.signal(signal.SIGINT, _handle_signal)
|
||||||
|
signal.signal(signal.SIGTERM, _handle_signal)
|
||||||
|
|
||||||
|
# --- Collection loop ---
|
||||||
|
start_time = time.monotonic()
|
||||||
|
frame_count = 0
|
||||||
|
total_confidence = 0.0
|
||||||
|
total_visible = 0
|
||||||
|
|
||||||
|
print(f"Collecting for {args.duration}s ... (press 'q' in preview to stop)")
|
||||||
|
|
||||||
|
try:
|
||||||
|
while not shutdown_requested:
|
||||||
|
elapsed = time.monotonic() - start_time
|
||||||
|
if elapsed >= args.duration:
|
||||||
|
break
|
||||||
|
|
||||||
|
ret, frame = cap.read()
|
||||||
|
if not ret:
|
||||||
|
print("WARNING: Failed to read frame, retrying ...", file=sys.stderr)
|
||||||
|
time.sleep(0.01)
|
||||||
|
continue
|
||||||
|
|
||||||
|
ts_ns = time.time_ns()
|
||||||
|
|
||||||
|
# Convert BGR -> RGB for MediaPipe
|
||||||
|
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
|
||||||
|
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
|
||||||
|
|
||||||
|
result = landmarker.detect(mp_image)
|
||||||
|
|
||||||
|
n_persons = len(result.pose_landmarks)
|
||||||
|
|
||||||
|
if n_persons > 0:
|
||||||
|
landmarks = result.pose_landmarks[0]
|
||||||
|
keypoints = []
|
||||||
|
visibilities = []
|
||||||
|
for coco_idx in range(17):
|
||||||
|
mp_idx = MP_TO_COCO[coco_idx]
|
||||||
|
lm = landmarks[mp_idx]
|
||||||
|
keypoints.append([round(lm.x, 5), round(lm.y, 5)])
|
||||||
|
visibilities.append(lm.visibility if lm.visibility else 0.0)
|
||||||
|
|
||||||
|
confidence = float(np.mean(visibilities))
|
||||||
|
n_visible = int(sum(1 for v in visibilities if v > 0.5))
|
||||||
|
else:
|
||||||
|
keypoints = []
|
||||||
|
confidence = 0.0
|
||||||
|
n_visible = 0
|
||||||
|
|
||||||
|
record = {
|
||||||
|
"ts_ns": ts_ns,
|
||||||
|
"keypoints": keypoints,
|
||||||
|
"confidence": round(confidence, 4),
|
||||||
|
"n_visible": n_visible,
|
||||||
|
"n_persons": n_persons,
|
||||||
|
}
|
||||||
|
out_file.write(json.dumps(record) + "\n")
|
||||||
|
frame_count += 1
|
||||||
|
total_confidence += confidence
|
||||||
|
total_visible += n_visible
|
||||||
|
|
||||||
|
# Preview overlay
|
||||||
|
if args.preview and keypoints:
|
||||||
|
draw_skeleton(frame, keypoints, frame_w, frame_h)
|
||||||
|
|
||||||
|
if args.preview:
|
||||||
|
remaining = max(0, int(args.duration - elapsed))
|
||||||
|
cv2.putText(
|
||||||
|
frame,
|
||||||
|
f"Frames: {frame_count} Visible: {n_visible}/17 Time: {remaining}s",
|
||||||
|
(10, 30),
|
||||||
|
cv2.FONT_HERSHEY_SIMPLEX,
|
||||||
|
0.7,
|
||||||
|
(255, 255, 255),
|
||||||
|
2,
|
||||||
|
)
|
||||||
|
cv2.imshow("Ground Truth Collection (ADR-079)", frame)
|
||||||
|
if cv2.waitKey(1) & 0xFF == ord("q"):
|
||||||
|
break
|
||||||
|
|
||||||
|
finally:
|
||||||
|
# --- Cleanup ---
|
||||||
|
out_file.close()
|
||||||
|
cap.release()
|
||||||
|
if args.preview:
|
||||||
|
cv2.destroyAllWindows()
|
||||||
|
landmarker.close()
|
||||||
|
|
||||||
|
# Stop CSI recording
|
||||||
|
if csi_started:
|
||||||
|
if post_json(recording_url_stop):
|
||||||
|
print("CSI recording stopped.")
|
||||||
|
else:
|
||||||
|
print("WARNING: Failed to stop CSI recording.", file=sys.stderr)
|
||||||
|
|
||||||
|
# --- Summary ---
|
||||||
|
avg_conf = total_confidence / frame_count if frame_count > 0 else 0.0
|
||||||
|
avg_vis = total_visible / frame_count if frame_count > 0 else 0.0
|
||||||
|
print()
|
||||||
|
print("=== Collection Summary ===")
|
||||||
|
print(f" Total frames: {frame_count}")
|
||||||
|
print(f" Avg confidence: {avg_conf:.3f}")
|
||||||
|
print(f" Avg visible joints: {avg_vis:.1f} / 17")
|
||||||
|
print(f" Output: {out_path}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
625
scripts/eval-wiflow.js
Normal file
625
scripts/eval-wiflow.js
Normal file
|
|
@ -0,0 +1,625 @@
|
||||||
|
#!/usr/bin/env node
|
||||||
|
/**
|
||||||
|
* WiFlow PCK Evaluation Script (ADR-079)
|
||||||
|
*
|
||||||
|
* Measures accuracy of WiFi-based pose estimation against ground-truth
|
||||||
|
* camera keypoints using PCK (Percentage of Correct Keypoints) and MPJPE
|
||||||
|
* (Mean Per-Joint Position Error) metrics.
|
||||||
|
*
|
||||||
|
* Usage:
|
||||||
|
* node scripts/eval-wiflow.js --model models/wiflow-supervised/wiflow-v1.json --data data/paired/aligned.paired.jsonl
|
||||||
|
* node scripts/eval-wiflow.js --baseline --data data/paired/aligned.paired.jsonl
|
||||||
|
* node scripts/eval-wiflow.js --model models/wiflow-supervised/wiflow-v1.json --data data/paired/aligned.paired.jsonl --verbose
|
||||||
|
*
|
||||||
|
* ADR: docs/adr/ADR-079
|
||||||
|
*/
|
||||||
|
|
||||||
|
'use strict';
|
||||||
|
|
||||||
|
const fs = require('fs');
|
||||||
|
const path = require('path');
|
||||||
|
const { parseArgs } = require('util');
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Resolve WiFlow model dependencies
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
const {
|
||||||
|
WiFlowModel,
|
||||||
|
COCO_KEYPOINTS,
|
||||||
|
createRng,
|
||||||
|
} = require(path.join(__dirname, 'wiflow-model.js'));
|
||||||
|
|
||||||
|
const RUVLLM_PATH = path.resolve(__dirname, '..', 'vendor', 'ruvector', 'npm', 'packages', 'ruvllm', 'src');
|
||||||
|
const { SafeTensorsReader } = require(path.join(RUVLLM_PATH, 'export.js'));
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Constants
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
const NUM_KEYPOINTS = 17;
|
||||||
|
const DEFAULT_TORSO_LENGTH = 0.3; // normalized coords fallback
|
||||||
|
|
||||||
|
// Joint name aliases for display (short form)
|
||||||
|
const JOINT_NAMES = [
|
||||||
|
'nose', 'l_eye', 'r_eye', 'l_ear', 'r_ear',
|
||||||
|
'l_shoulder', 'r_shoulder', 'l_elbow', 'r_elbow',
|
||||||
|
'l_wrist', 'r_wrist', 'l_hip', 'r_hip',
|
||||||
|
'l_knee', 'r_knee', 'l_ankle', 'r_ankle',
|
||||||
|
];
|
||||||
|
|
||||||
|
// Shoulder indices: l_shoulder=5, r_shoulder=6
|
||||||
|
// Hip indices: l_hip=11, r_hip=12
|
||||||
|
const L_SHOULDER = 5;
|
||||||
|
const R_SHOULDER = 6;
|
||||||
|
const L_HIP = 11;
|
||||||
|
const R_HIP = 12;
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// CLI argument parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
const { values: args } = parseArgs({
|
||||||
|
options: {
|
||||||
|
model: { type: 'string', short: 'm' },
|
||||||
|
data: { type: 'string', short: 'd' },
|
||||||
|
baseline: { type: 'boolean', default: false },
|
||||||
|
output: { type: 'string', short: 'o' },
|
||||||
|
verbose: { type: 'boolean', short: 'v', default: false },
|
||||||
|
},
|
||||||
|
strict: true,
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!args.data) {
|
||||||
|
console.error('Usage: node scripts/eval-wiflow.js --data <paired-jsonl> [--model <path>] [--baseline] [--output <path>]');
|
||||||
|
console.error('');
|
||||||
|
console.error('Required:');
|
||||||
|
console.error(' --data, -d <path> Paired CSI + keypoint JSONL (from align-ground-truth.js)');
|
||||||
|
console.error('');
|
||||||
|
console.error('Options:');
|
||||||
|
console.error(' --model, -m <path> Path to trained model directory or JSON');
|
||||||
|
console.error(' --baseline Evaluate proxy-based baseline (no model)');
|
||||||
|
console.error(' --output, -o <path> Output eval report JSON');
|
||||||
|
console.error(' --verbose, -v Verbose output');
|
||||||
|
process.exit(1);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!args.model && !args.baseline) {
|
||||||
|
console.error('Error: Must specify either --model <path> or --baseline');
|
||||||
|
process.exit(1);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Data loading
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load paired JSONL samples.
|
||||||
|
* Each line: { csi: [...], csi_shape: [S, T], kp: [[x,y],...], conf: 0.xx, ... }
|
||||||
|
*/
|
||||||
|
function loadPairedData(filePath) {
|
||||||
|
const content = fs.readFileSync(filePath, 'utf-8');
|
||||||
|
const samples = [];
|
||||||
|
for (const line of content.split('\n')) {
|
||||||
|
if (!line.trim()) continue;
|
||||||
|
try {
|
||||||
|
const s = JSON.parse(line);
|
||||||
|
if (!s.kp || !Array.isArray(s.kp)) continue;
|
||||||
|
if (!s.csi && !s.csi_shape) continue;
|
||||||
|
samples.push(s);
|
||||||
|
} catch (e) {
|
||||||
|
// skip malformed lines
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return samples;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Model loading
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load WiFlow model from a directory or JSON file.
|
||||||
|
* Tries: model.safetensors, then config.json for architecture config.
|
||||||
|
* Returns { model, name }.
|
||||||
|
*/
|
||||||
|
function loadModel(modelPath) {
|
||||||
|
const stat = fs.statSync(modelPath);
|
||||||
|
let modelDir;
|
||||||
|
|
||||||
|
if (stat.isDirectory()) {
|
||||||
|
modelDir = modelPath;
|
||||||
|
} else {
|
||||||
|
// Assume JSON file in a model directory
|
||||||
|
modelDir = path.dirname(modelPath);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Load architecture config if available
|
||||||
|
let config = {};
|
||||||
|
const configPath = path.join(modelDir, 'config.json');
|
||||||
|
if (fs.existsSync(configPath)) {
|
||||||
|
try {
|
||||||
|
const raw = JSON.parse(fs.readFileSync(configPath, 'utf-8'));
|
||||||
|
if (raw.custom) {
|
||||||
|
config.inputChannels = raw.custom.inputChannels || 128;
|
||||||
|
config.timeSteps = raw.custom.timeSteps || 20;
|
||||||
|
config.numKeypoints = raw.custom.numKeypoints || 17;
|
||||||
|
config.numHeads = raw.custom.numHeads || 8;
|
||||||
|
config.seed = raw.custom.seed || 42;
|
||||||
|
}
|
||||||
|
} catch (e) {
|
||||||
|
// use defaults
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Load training-metrics.json for additional config
|
||||||
|
const metricsPath = path.join(modelDir, 'training-metrics.json');
|
||||||
|
if (fs.existsSync(metricsPath)) {
|
||||||
|
try {
|
||||||
|
const metrics = JSON.parse(fs.readFileSync(metricsPath, 'utf-8'));
|
||||||
|
if (metrics.model && metrics.model.architecture === 'wiflow') {
|
||||||
|
// metrics available for report
|
||||||
|
}
|
||||||
|
} catch (e) {
|
||||||
|
// ignore
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create model with config
|
||||||
|
const model = new WiFlowModel(config);
|
||||||
|
model.setTraining(false); // eval mode
|
||||||
|
|
||||||
|
// Load weights from SafeTensors
|
||||||
|
const safetensorsPath = path.join(modelDir, 'model.safetensors');
|
||||||
|
if (fs.existsSync(safetensorsPath)) {
|
||||||
|
const buffer = new Uint8Array(fs.readFileSync(safetensorsPath));
|
||||||
|
const reader = new SafeTensorsReader(buffer);
|
||||||
|
const tensorNames = reader.getTensorNames();
|
||||||
|
|
||||||
|
// Build tensor map for fromTensorMap
|
||||||
|
const tensorMap = new Map();
|
||||||
|
for (const name of tensorNames) {
|
||||||
|
const tensor = reader.getTensor(name);
|
||||||
|
if (tensor) {
|
||||||
|
tensorMap.set(name, tensor.data);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
model.fromTensorMap(tensorMap);
|
||||||
|
if (args.verbose) {
|
||||||
|
console.log(`Loaded ${tensorNames.length} tensors from ${safetensorsPath}`);
|
||||||
|
console.log(`Model params: ${model.numParams().toLocaleString()}`);
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
console.warn(`WARN: No model.safetensors found in ${modelDir}, using random weights`);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Derive model name
|
||||||
|
const name = path.basename(modelDir);
|
||||||
|
return { model, name };
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Baseline proxy pose generation (ADR-072 Phase 2 heuristic)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Generate a proxy standing skeleton from CSI features.
|
||||||
|
* If presence detected (amplitude energy > threshold), place a standing
|
||||||
|
* person at center with standard COCO proportions, perturbed by motion energy.
|
||||||
|
*/
|
||||||
|
function generateBaselinePose(sample) {
|
||||||
|
const rng = createRng(42);
|
||||||
|
|
||||||
|
// Estimate presence from CSI amplitude energy
|
||||||
|
const csi = sample.csi;
|
||||||
|
let energy = 0;
|
||||||
|
if (Array.isArray(csi)) {
|
||||||
|
for (let i = 0; i < csi.length; i++) {
|
||||||
|
energy += csi[i] * csi[i];
|
||||||
|
}
|
||||||
|
energy = Math.sqrt(energy / csi.length);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Estimate motion energy (variance across subcarriers)
|
||||||
|
let motionEnergy = 0;
|
||||||
|
if (Array.isArray(csi) && sample.csi_shape) {
|
||||||
|
const [S, T] = sample.csi_shape;
|
||||||
|
if (T > 1) {
|
||||||
|
for (let s = 0; s < S; s++) {
|
||||||
|
let sum = 0;
|
||||||
|
let sumSq = 0;
|
||||||
|
for (let t = 0; t < T; t++) {
|
||||||
|
const v = csi[s * T + t] || 0;
|
||||||
|
sum += v;
|
||||||
|
sumSq += v * v;
|
||||||
|
}
|
||||||
|
const mean = sum / T;
|
||||||
|
motionEnergy += (sumSq / T) - (mean * mean);
|
||||||
|
}
|
||||||
|
motionEnergy = Math.sqrt(Math.max(0, motionEnergy / S));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Normalized presence heuristic
|
||||||
|
const presence = Math.min(1, energy / 10);
|
||||||
|
|
||||||
|
if (presence < 0.3) {
|
||||||
|
// No person detected: return zero pose
|
||||||
|
return new Float32Array(NUM_KEYPOINTS * 2);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Standing skeleton at center (0.5, 0.5) with standard proportions
|
||||||
|
// Coordinates are [x, y] in normalized [0, 1] space
|
||||||
|
// y=0 is top, y=1 is bottom (image convention)
|
||||||
|
const cx = 0.5;
|
||||||
|
const headY = 0.2;
|
||||||
|
const shoulderY = 0.32;
|
||||||
|
const elbowY = 0.45;
|
||||||
|
const wristY = 0.55;
|
||||||
|
const hipY = 0.55;
|
||||||
|
const kneeY = 0.72;
|
||||||
|
const ankleY = 0.88;
|
||||||
|
const shoulderW = 0.08;
|
||||||
|
const hipW = 0.06;
|
||||||
|
const armSpread = 0.12;
|
||||||
|
|
||||||
|
// Standard standing pose keypoints [x, y]
|
||||||
|
const skeleton = [
|
||||||
|
[cx, headY], // 0: nose
|
||||||
|
[cx - 0.02, headY - 0.02], // 1: l_eye
|
||||||
|
[cx + 0.02, headY - 0.02], // 2: r_eye
|
||||||
|
[cx - 0.04, headY], // 3: l_ear
|
||||||
|
[cx + 0.04, headY], // 4: r_ear
|
||||||
|
[cx - shoulderW, shoulderY], // 5: l_shoulder
|
||||||
|
[cx + shoulderW, shoulderY], // 6: r_shoulder
|
||||||
|
[cx - armSpread, elbowY], // 7: l_elbow
|
||||||
|
[cx + armSpread, elbowY], // 8: r_elbow
|
||||||
|
[cx - armSpread - 0.02, wristY], // 9: l_wrist
|
||||||
|
[cx + armSpread + 0.02, wristY], // 10: r_wrist
|
||||||
|
[cx - hipW, hipY], // 11: l_hip
|
||||||
|
[cx + hipW, hipY], // 12: r_hip
|
||||||
|
[cx - hipW, kneeY], // 13: l_knee
|
||||||
|
[cx + hipW, kneeY], // 14: r_knee
|
||||||
|
[cx - hipW, ankleY], // 15: l_ankle
|
||||||
|
[cx + hipW, ankleY], // 16: r_ankle
|
||||||
|
];
|
||||||
|
|
||||||
|
// Perturb limbs by motion energy
|
||||||
|
const perturbScale = Math.min(motionEnergy * 0.1, 0.05);
|
||||||
|
const result = new Float32Array(NUM_KEYPOINTS * 2);
|
||||||
|
for (let k = 0; k < NUM_KEYPOINTS; k++) {
|
||||||
|
const px = (rng() - 0.5) * 2 * perturbScale;
|
||||||
|
const py = (rng() - 0.5) * 2 * perturbScale;
|
||||||
|
result[k * 2] = Math.max(0, Math.min(1, skeleton[k][0] + px));
|
||||||
|
result[k * 2 + 1] = Math.max(0, Math.min(1, skeleton[k][1] + py));
|
||||||
|
}
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Metric computation
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/** Euclidean distance between two 2D points */
|
||||||
|
function dist2d(x1, y1, x2, y2) {
|
||||||
|
const dx = x1 - x2;
|
||||||
|
const dy = y1 - y2;
|
||||||
|
return Math.sqrt(dx * dx + dy * dy);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Compute torso length from ground-truth keypoints.
|
||||||
|
* Torso = distance(mid_shoulder, mid_hip).
|
||||||
|
* Returns DEFAULT_TORSO_LENGTH if shoulders or hips not visible.
|
||||||
|
*/
|
||||||
|
function computeTorsoLength(kp) {
|
||||||
|
if (!kp || kp.length < 13) return DEFAULT_TORSO_LENGTH;
|
||||||
|
|
||||||
|
const lsX = kp[L_SHOULDER][0];
|
||||||
|
const lsY = kp[L_SHOULDER][1];
|
||||||
|
const rsX = kp[R_SHOULDER][0];
|
||||||
|
const rsY = kp[R_SHOULDER][1];
|
||||||
|
const lhX = kp[L_HIP][0];
|
||||||
|
const lhY = kp[L_HIP][1];
|
||||||
|
const rhX = kp[R_HIP][0];
|
||||||
|
const rhY = kp[R_HIP][1];
|
||||||
|
|
||||||
|
// Check if joints are at origin (not visible)
|
||||||
|
const shoulderVisible = (lsX !== 0 || lsY !== 0) && (rsX !== 0 || rsY !== 0);
|
||||||
|
const hipVisible = (lhX !== 0 || lhY !== 0) && (rhX !== 0 || rhY !== 0);
|
||||||
|
|
||||||
|
if (!shoulderVisible || !hipVisible) return DEFAULT_TORSO_LENGTH;
|
||||||
|
|
||||||
|
const midShoulderX = (lsX + rsX) / 2;
|
||||||
|
const midShoulderY = (lsY + rsY) / 2;
|
||||||
|
const midHipX = (lhX + rhX) / 2;
|
||||||
|
const midHipY = (lhY + rhY) / 2;
|
||||||
|
|
||||||
|
const torso = dist2d(midShoulderX, midShoulderY, midHipX, midHipY);
|
||||||
|
return torso > 0.01 ? torso : DEFAULT_TORSO_LENGTH;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Evaluate predictions against ground truth.
|
||||||
|
*
|
||||||
|
* @param {Array<{pred: Float32Array, gt: number[][], conf: number}>} results
|
||||||
|
* @returns {object} Evaluation report
|
||||||
|
*/
|
||||||
|
function computeMetrics(results) {
|
||||||
|
const n = results.length;
|
||||||
|
if (n === 0) {
|
||||||
|
return {
|
||||||
|
n_samples: 0,
|
||||||
|
pck_10: 0, pck_20: 0, pck_50: 0,
|
||||||
|
mpjpe: 0,
|
||||||
|
per_joint_pck20: {},
|
||||||
|
per_joint_mpjpe: {},
|
||||||
|
conf_weighted_pck20: 0,
|
||||||
|
conf_weighted_mpjpe: 0,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// Accumulators
|
||||||
|
const pckCounts = { 10: 0, 20: 0, 50: 0 };
|
||||||
|
let totalJoints = 0;
|
||||||
|
let totalMPJPE = 0;
|
||||||
|
|
||||||
|
const perJointPck20 = new Float64Array(NUM_KEYPOINTS);
|
||||||
|
const perJointMPJPE = new Float64Array(NUM_KEYPOINTS);
|
||||||
|
const perJointCount = new Float64Array(NUM_KEYPOINTS);
|
||||||
|
|
||||||
|
// Confidence-weighted accumulators
|
||||||
|
let confWeightedPck20Num = 0;
|
||||||
|
let confWeightedPck20Den = 0;
|
||||||
|
let confWeightedMpjpeNum = 0;
|
||||||
|
let confWeightedMpjpeDen = 0;
|
||||||
|
|
||||||
|
for (const { pred, gt, conf } of results) {
|
||||||
|
const torso = computeTorsoLength(gt);
|
||||||
|
const w = Math.max(conf, 1e-6);
|
||||||
|
|
||||||
|
for (let k = 0; k < NUM_KEYPOINTS; k++) {
|
||||||
|
if (k >= gt.length) continue;
|
||||||
|
|
||||||
|
const gtX = gt[k][0];
|
||||||
|
const gtY = gt[k][1];
|
||||||
|
const predX = pred[k * 2];
|
||||||
|
const predY = pred[k * 2 + 1];
|
||||||
|
|
||||||
|
const d = dist2d(predX, predY, gtX, gtY);
|
||||||
|
|
||||||
|
totalJoints++;
|
||||||
|
totalMPJPE += d;
|
||||||
|
|
||||||
|
perJointMPJPE[k] += d;
|
||||||
|
perJointCount[k] += 1;
|
||||||
|
|
||||||
|
// PCK at different thresholds
|
||||||
|
if (d < 0.10 * torso) pckCounts[10]++;
|
||||||
|
if (d < 0.20 * torso) {
|
||||||
|
pckCounts[20]++;
|
||||||
|
perJointPck20[k]++;
|
||||||
|
confWeightedPck20Num += w;
|
||||||
|
}
|
||||||
|
if (d < 0.50 * torso) pckCounts[50]++;
|
||||||
|
|
||||||
|
confWeightedPck20Den += w;
|
||||||
|
confWeightedMpjpeNum += d * w;
|
||||||
|
confWeightedMpjpeDen += w;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Aggregate metrics
|
||||||
|
const pck10 = totalJoints > 0 ? pckCounts[10] / totalJoints : 0;
|
||||||
|
const pck20 = totalJoints > 0 ? pckCounts[20] / totalJoints : 0;
|
||||||
|
const pck50 = totalJoints > 0 ? pckCounts[50] / totalJoints : 0;
|
||||||
|
const mpjpe = totalJoints > 0 ? totalMPJPE / totalJoints : 0;
|
||||||
|
|
||||||
|
// Per-joint breakdown
|
||||||
|
const perJointPck20Map = {};
|
||||||
|
const perJointMpjpeMap = {};
|
||||||
|
for (let k = 0; k < NUM_KEYPOINTS; k++) {
|
||||||
|
const name = JOINT_NAMES[k];
|
||||||
|
perJointPck20Map[name] = perJointCount[k] > 0 ? perJointPck20[k] / perJointCount[k] : 0;
|
||||||
|
perJointMpjpeMap[name] = perJointCount[k] > 0 ? perJointMPJPE[k] / perJointCount[k] : 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Confidence-weighted
|
||||||
|
const confPck20 = confWeightedPck20Den > 0 ? confWeightedPck20Num / confWeightedPck20Den : 0;
|
||||||
|
const confMpjpe = confWeightedMpjpeDen > 0 ? confWeightedMpjpeNum / confWeightedMpjpeDen : 0;
|
||||||
|
|
||||||
|
return {
|
||||||
|
n_samples: n,
|
||||||
|
pck_10: pck10,
|
||||||
|
pck_20: pck20,
|
||||||
|
pck_50: pck50,
|
||||||
|
mpjpe,
|
||||||
|
per_joint_pck20: perJointPck20Map,
|
||||||
|
per_joint_mpjpe: perJointMpjpeMap,
|
||||||
|
conf_weighted_pck20: confPck20,
|
||||||
|
conf_weighted_mpjpe: confMpjpe,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Inference
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Run model inference on a single paired sample.
|
||||||
|
* @param {WiFlowModel} model
|
||||||
|
* @param {object} sample - { csi, csi_shape, kp, conf }
|
||||||
|
* @returns {Float32Array} - [17*2] predicted keypoints
|
||||||
|
*/
|
||||||
|
function runModelInference(model, sample) {
|
||||||
|
const csi = sample.csi;
|
||||||
|
const shape = sample.csi_shape;
|
||||||
|
const S = shape ? shape[0] : 128;
|
||||||
|
const T = shape ? shape[1] : 20;
|
||||||
|
|
||||||
|
// Prepare input as Float32Array [S, T]
|
||||||
|
let input;
|
||||||
|
if (csi instanceof Float32Array) {
|
||||||
|
input = csi;
|
||||||
|
} else if (Array.isArray(csi)) {
|
||||||
|
input = new Float32Array(csi);
|
||||||
|
} else {
|
||||||
|
input = new Float32Array(S * T);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Ensure correct size (pad or truncate)
|
||||||
|
const expectedLen = model.inputChannels * model.timeSteps;
|
||||||
|
if (input.length !== expectedLen) {
|
||||||
|
const resized = new Float32Array(expectedLen);
|
||||||
|
const copyLen = Math.min(input.length, expectedLen);
|
||||||
|
resized.set(input.subarray(0, copyLen));
|
||||||
|
input = resized;
|
||||||
|
}
|
||||||
|
|
||||||
|
return model.forward(input);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Formatted output
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
function formatPercent(v) {
|
||||||
|
return (v * 100).toFixed(1) + '%';
|
||||||
|
}
|
||||||
|
|
||||||
|
function formatFloat(v, decimals) {
|
||||||
|
decimals = decimals || 4;
|
||||||
|
return v.toFixed(decimals);
|
||||||
|
}
|
||||||
|
|
||||||
|
function printReport(report) {
|
||||||
|
console.log('');
|
||||||
|
console.log('WiFlow Evaluation Report (ADR-079)');
|
||||||
|
console.log('===================================');
|
||||||
|
console.log(`Model: ${report.model}`);
|
||||||
|
console.log(`Samples: ${report.n_samples.toLocaleString()}`);
|
||||||
|
console.log(`PCK@10: ${formatPercent(report.pck_10)}`);
|
||||||
|
console.log(`PCK@20: ${formatPercent(report.pck_20)}`);
|
||||||
|
console.log(`PCK@50: ${formatPercent(report.pck_50)}`);
|
||||||
|
console.log(`MPJPE: ${formatFloat(report.mpjpe)}`);
|
||||||
|
console.log('');
|
||||||
|
console.log('Per-Joint PCK@20:');
|
||||||
|
|
||||||
|
const maxNameLen = Math.max(...JOINT_NAMES.map(n => n.length));
|
||||||
|
for (const name of JOINT_NAMES) {
|
||||||
|
const pck = report.per_joint_pck20[name] || 0;
|
||||||
|
const pad = ' '.repeat(maxNameLen - name.length + 2);
|
||||||
|
console.log(` ${name}${pad}${formatPercent(pck)}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log('');
|
||||||
|
console.log('Per-Joint MPJPE:');
|
||||||
|
for (const name of JOINT_NAMES) {
|
||||||
|
const mpjpe = report.per_joint_mpjpe[name] || 0;
|
||||||
|
const pad = ' '.repeat(maxNameLen - name.length + 2);
|
||||||
|
console.log(` ${name}${pad}${formatFloat(mpjpe)}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log('');
|
||||||
|
console.log('Confidence-Weighted:');
|
||||||
|
console.log(` PCK@20: ${formatPercent(report.conf_weighted_pck20)}`);
|
||||||
|
console.log(` MPJPE: ${formatFloat(report.conf_weighted_mpjpe)}`);
|
||||||
|
console.log('');
|
||||||
|
console.log(`Inference: ${report.inference_latency_ms.toFixed(2)}ms/sample`);
|
||||||
|
console.log('');
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Main
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
function main() {
|
||||||
|
// Load paired data
|
||||||
|
if (args.verbose) console.log(`Loading paired data from ${args.data}...`);
|
||||||
|
const samples = loadPairedData(args.data);
|
||||||
|
if (samples.length === 0) {
|
||||||
|
console.error('Error: No valid paired samples found in', args.data);
|
||||||
|
process.exit(1);
|
||||||
|
}
|
||||||
|
if (args.verbose) console.log(`Loaded ${samples.length} paired samples`);
|
||||||
|
|
||||||
|
let modelName;
|
||||||
|
let model = null;
|
||||||
|
|
||||||
|
if (args.baseline) {
|
||||||
|
modelName = 'baseline-proxy';
|
||||||
|
if (args.verbose) console.log('Running baseline proxy evaluation (ADR-072 Phase 2 heuristic)');
|
||||||
|
} else {
|
||||||
|
const loaded = loadModel(args.model);
|
||||||
|
model = loaded.model;
|
||||||
|
modelName = loaded.name;
|
||||||
|
if (args.verbose) console.log(`Running model evaluation: ${modelName}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run inference and collect results
|
||||||
|
const results = [];
|
||||||
|
const startTime = process.hrtime.bigint();
|
||||||
|
|
||||||
|
for (const sample of samples) {
|
||||||
|
let pred;
|
||||||
|
if (args.baseline) {
|
||||||
|
pred = generateBaselinePose(sample);
|
||||||
|
} else {
|
||||||
|
pred = runModelInference(model, sample);
|
||||||
|
}
|
||||||
|
|
||||||
|
results.push({
|
||||||
|
pred,
|
||||||
|
gt: sample.kp,
|
||||||
|
conf: sample.conf || 0,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
const endTime = process.hrtime.bigint();
|
||||||
|
const totalMs = Number(endTime - startTime) / 1e6;
|
||||||
|
const latencyMs = totalMs / samples.length;
|
||||||
|
|
||||||
|
// Compute metrics
|
||||||
|
const metrics = computeMetrics(results);
|
||||||
|
|
||||||
|
// Build report
|
||||||
|
const report = {
|
||||||
|
model: modelName,
|
||||||
|
n_samples: metrics.n_samples,
|
||||||
|
pck_10: Math.round(metrics.pck_10 * 10000) / 10000,
|
||||||
|
pck_20: Math.round(metrics.pck_20 * 10000) / 10000,
|
||||||
|
pck_50: Math.round(metrics.pck_50 * 10000) / 10000,
|
||||||
|
mpjpe: Math.round(metrics.mpjpe * 100000) / 100000,
|
||||||
|
per_joint_pck20: {},
|
||||||
|
per_joint_mpjpe: {},
|
||||||
|
conf_weighted_pck20: Math.round(metrics.conf_weighted_pck20 * 10000) / 10000,
|
||||||
|
conf_weighted_mpjpe: Math.round(metrics.conf_weighted_mpjpe * 100000) / 100000,
|
||||||
|
inference_latency_ms: Math.round(latencyMs * 100) / 100,
|
||||||
|
timestamp: new Date().toISOString(),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Round per-joint metrics
|
||||||
|
for (const name of JOINT_NAMES) {
|
||||||
|
report.per_joint_pck20[name] = Math.round((metrics.per_joint_pck20[name] || 0) * 10000) / 10000;
|
||||||
|
report.per_joint_mpjpe[name] = Math.round((metrics.per_joint_mpjpe[name] || 0) * 100000) / 100000;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Print formatted report
|
||||||
|
printReport(report);
|
||||||
|
|
||||||
|
// Write output JSON
|
||||||
|
const outputPath = args.output ||
|
||||||
|
(args.model
|
||||||
|
? path.join(path.dirname(
|
||||||
|
fs.statSync(args.model).isDirectory() ? path.join(args.model, '.') : args.model
|
||||||
|
), 'eval-report.json')
|
||||||
|
: 'models/wiflow-supervised/eval-report.json');
|
||||||
|
|
||||||
|
const outputDir = path.dirname(outputPath);
|
||||||
|
if (!fs.existsSync(outputDir)) {
|
||||||
|
fs.mkdirSync(outputDir, { recursive: true });
|
||||||
|
}
|
||||||
|
|
||||||
|
fs.writeFileSync(outputPath, JSON.stringify(report, null, 2) + '\n');
|
||||||
|
console.log(`Report saved to ${outputPath}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
main();
|
||||||
|
|
@ -6,7 +6,7 @@ echo "Host: $(hostname) | $(sysctl -n hw.ncpu 2>/dev/null || nproc) cores | $(sy
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
REPO_DIR="${HOME}/Projects/wifi-densepose"
|
REPO_DIR="${HOME}/Projects/wifi-densepose"
|
||||||
WINDOWS_HOST="100.102.238.73" # Tailscale IP of Windows machine
|
WINDOWS_HOST="${WINDOWS_HOST:-}" # Set via env: export WINDOWS_HOST=<tailscale-ip>
|
||||||
|
|
||||||
# Step 1: Clone or update repo
|
# Step 1: Clone or update repo
|
||||||
echo "[1/7] Setting up repository..."
|
echo "[1/7] Setting up repository..."
|
||||||
|
|
|
||||||
111
scripts/record-csi-udp.py
Normal file
111
scripts/record-csi-udp.py
Normal file
|
|
@ -0,0 +1,111 @@
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Lightweight ESP32 CSI UDP recorder (ADR-079).
|
||||||
|
|
||||||
|
Captures raw CSI packets from ESP32 nodes over UDP and writes to JSONL.
|
||||||
|
Runs alongside collect-ground-truth.py for synchronized capture.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python scripts/record-csi-udp.py --duration 300 --output data/recordings
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import socket
|
||||||
|
import struct
|
||||||
|
import time
|
||||||
|
|
||||||
|
|
||||||
|
def parse_csi_packet(data):
|
||||||
|
"""Parse ADR-018 binary CSI packet into dict."""
|
||||||
|
if len(data) < 8:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# ADR-018 header: [magic(2), len(2), node_id(1), seq(1), rssi(1), channel(1), iq_data...]
|
||||||
|
# Simplified: extract what we can from the raw packet
|
||||||
|
node_id = data[4] if len(data) > 4 else 0
|
||||||
|
rssi = struct.unpack('b', bytes([data[6]]))[0] if len(data) > 6 else 0
|
||||||
|
channel = data[7] if len(data) > 7 else 0
|
||||||
|
|
||||||
|
# IQ data starts at offset 8
|
||||||
|
iq_data = data[8:] if len(data) > 8 else b''
|
||||||
|
n_subcarriers = len(iq_data) // 2 # I,Q pairs
|
||||||
|
|
||||||
|
# Compute amplitudes
|
||||||
|
amplitudes = []
|
||||||
|
for i in range(0, len(iq_data) - 1, 2):
|
||||||
|
I = struct.unpack('b', bytes([iq_data[i]]))[0]
|
||||||
|
Q = struct.unpack('b', bytes([iq_data[i + 1]]))[0]
|
||||||
|
amplitudes.append(round((I * I + Q * Q) ** 0.5, 2))
|
||||||
|
|
||||||
|
return {
|
||||||
|
"type": "raw_csi",
|
||||||
|
"timestamp": time.strftime("%Y-%m-%dT%H:%M:%S.") + f"{int(time.time() * 1000) % 1000:03d}Z",
|
||||||
|
"ts_ns": time.time_ns(),
|
||||||
|
"node_id": node_id,
|
||||||
|
"rssi": rssi,
|
||||||
|
"channel": channel,
|
||||||
|
"subcarriers": n_subcarriers,
|
||||||
|
"amplitudes": amplitudes,
|
||||||
|
"iq_hex": iq_data.hex(),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(description="Record ESP32 CSI over UDP")
|
||||||
|
parser.add_argument("--port", type=int, default=5005, help="UDP port (default: 5005)")
|
||||||
|
parser.add_argument("--duration", type=int, default=300, help="Duration in seconds (default: 300)")
|
||||||
|
parser.add_argument("--output", default="data/recordings", help="Output directory")
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
os.makedirs(args.output, exist_ok=True)
|
||||||
|
filename = f"csi-{int(time.time())}.csi.jsonl"
|
||||||
|
filepath = os.path.join(args.output, filename)
|
||||||
|
|
||||||
|
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
|
||||||
|
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
|
||||||
|
sock.bind(("0.0.0.0", args.port))
|
||||||
|
sock.settimeout(1)
|
||||||
|
|
||||||
|
print(f"Recording CSI on UDP :{args.port} for {args.duration}s")
|
||||||
|
print(f"Output: {filepath}")
|
||||||
|
|
||||||
|
count = 0
|
||||||
|
start = time.time()
|
||||||
|
nodes_seen = set()
|
||||||
|
|
||||||
|
with open(filepath, "w") as f:
|
||||||
|
try:
|
||||||
|
while time.time() - start < args.duration:
|
||||||
|
try:
|
||||||
|
data, addr = sock.recvfrom(4096)
|
||||||
|
frame = parse_csi_packet(data)
|
||||||
|
if frame:
|
||||||
|
f.write(json.dumps(frame) + "\n")
|
||||||
|
count += 1
|
||||||
|
nodes_seen.add(frame["node_id"])
|
||||||
|
|
||||||
|
if count % 500 == 0:
|
||||||
|
elapsed = time.time() - start
|
||||||
|
rate = count / elapsed
|
||||||
|
print(f" {count} frames | {rate:.0f} fps | "
|
||||||
|
f"nodes: {sorted(nodes_seen)} | "
|
||||||
|
f"{elapsed:.0f}s / {args.duration}s")
|
||||||
|
except socket.timeout:
|
||||||
|
continue
|
||||||
|
except KeyboardInterrupt:
|
||||||
|
print("\nStopped by user")
|
||||||
|
|
||||||
|
sock.close()
|
||||||
|
elapsed = time.time() - start
|
||||||
|
print(f"\n=== CSI Recording Complete ===")
|
||||||
|
print(f" Frames: {count}")
|
||||||
|
print(f" Duration: {elapsed:.0f}s")
|
||||||
|
print(f" Rate: {count / max(elapsed, 1):.0f} fps")
|
||||||
|
print(f" Nodes: {sorted(nodes_seen)}")
|
||||||
|
print(f" Output: {filepath}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
1657
scripts/train-wiflow-supervised.js
Normal file
1657
scripts/train-wiflow-supervised.js
Normal file
File diff suppressed because it is too large
Load diff
Loading…
Add table
Add a link
Reference in a new issue