The Rust port lived two directories deep (`rust-port/wifi-densepose-rs/`) without any sibling under `rust-port/` that warranted the extra level. Move the whole workspace up to `v2/` to match `v1/` (Python) at the same depth and shorten every `cd`/build command across the repo. `git mv` preserves history for all tracked files. 60 files updated for path references (CI workflows, ADRs, docs, scripts, READMEs, internal `.claude-flow` state). Two manual fixes for relative-`cd` paths in CLAUDE.md and ADR-043 that became wrong after the depth change (`cd ../..` → `cd ..`).

Validated:

- `cargo check --workspace --no-default-features` → clean (after a `target/` nuke; the gitignored `target/` was carried along by the OS rename and had hard-coded old paths in build scripts)
- `cargo test --workspace --no-default-features` → 1,539 passed, 0 failed, 8 ignored (same totals as pre-rename)
- ESP32-S3 on COM7 → still streaming live CSI (cb #40300, RSSI -64 dBm)

After-merge follow-up: contributors should `rm -rf v2/target` once and let cargo regenerate from the new path.
ADR-043: Sensing Server UI API Completion
Status: Accepted Date: 2026-03-03 Deciders: @ruvnet Supersedes: None Related: ADR-034, ADR-036, ADR-039, ADR-040, ADR-041
Context
The WiFi-DensePose sensing server (wifi-densepose-sensing-server) is a single-binary Axum server that receives ESP32 CSI frames via UDP, processes them through the RuVector signal pipeline, and serves both a web UI at /ui/ and a REST/WebSocket API. The UI provides tabs for live sensing visualization, model management, CSI recording, and training -- all designed to operate without external dependencies.
However, the UI's JavaScript expected several backend endpoints that were not yet implemented in the Rust server. Opening the browser console revealed persistent 404 errors for model, recording, and training API routes. Three categories of functionality were broken:
1. Model Management (7 endpoints missing)
The Models tab calls GET /api/v1/models to list available .rvf model files, GET /api/v1/models/active to show the currently loaded model, POST /api/v1/models/load and POST /api/v1/models/unload to control the model lifecycle, and DELETE /api/v1/models/:id to remove models from disk. LoRA fine-tuning profiles are managed via GET /api/v1/models/lora/profiles and POST /api/v1/models/lora/activate. All of these returned 404.
2. CSI Recording (5 endpoints missing)
The Recording tab calls POST /api/v1/recording/start and POST /api/v1/recording/stop to capture CSI frames to .csi.jsonl files for later training. GET /api/v1/recording/list enumerates stored sessions. DELETE /api/v1/recording/:id removes recordings. None of these were wired into the server's router.
3. Training Pipeline (5 endpoints missing)
The Training tab calls POST /api/v1/train/start to launch a background training run against recorded CSI data, POST /api/v1/train/stop to abort, and GET /api/v1/train/status to poll progress. Contrastive pretraining (POST /api/v1/train/pretrain) and LoRA fine-tuning (POST /api/v1/train/lora) endpoints were also unavailable. A WebSocket endpoint at /ws/train/progress streams epoch-level progress updates to the UI.
4. Sensing Service Not Started on App Init
The web UI's sensingService singleton (which manages the WebSocket connection to /ws/sensing) was only started lazily when the user navigated to the Sensing tab (SensingTab.js:182). However, the Dashboard and Live Demo tabs both read sensingService.dataSource at load time -- and since the service was never started, the status permanently showed "RECONNECTING" with no WebSocket connection attempt and no console errors. This silent failure affected the first-load experience for every user.
5. Mobile App Defects
The Expo React Native mobile companion (ADR-034) had two integration defects:
- WebSocket URL builder: `ws.service.ts` hardcoded port `3001` for the WebSocket connection instead of using the same-origin port derived from the REST API URL. When the sensing server runs on a different port (e.g., `8080` or `3000`), the mobile app could not connect.
- Test configuration: `jest.config.js` contained a `testPathIgnorePatterns` entry that effectively excluded the entire test directory, causing all 25 tests to be skipped silently.
- Placeholder tests: All 25 mobile test files contained `it.todo()` stubs with no assertions, providing false confidence in test coverage.
Decision
Implement the complete model management, CSI recording, and training API directly in the sensing server's main.rs as inline handler functions sharing AppStateInner via Arc<RwLock<…>>. Wire all 14 routes into the server's main router so the UI loads without any 404 console errors. Start the sensing WebSocket service on application init (not lazily on tab visit) so Dashboard and Live Demo tabs connect immediately. Fix the mobile app WebSocket URL builder, test configuration, and replace placeholder tests with real implementations.
Architecture
All 14 new handler functions are implemented directly in main.rs as async functions taking State<AppState> extractors, sharing the existing AppStateInner via Arc<RwLock<…>>. This avoids introducing new module files and keeps all API routes in one place alongside the existing sensing and pose handlers.
┌───────────────────────────────────────────────────────────────────────┐
│ Sensing Server (main.rs) │
│ │
│ Router::new() │
│ ├── /api/v1/sensing/* (existing — CSI streaming) │
│ ├── /api/v1/pose/* (existing — pose estimation) │
│ ├── /api/v1/models GET list_models (NEW) │
│ ├── /api/v1/models/active GET get_active_model (NEW) │
│ ├── /api/v1/models/load POST load_model (NEW) │
│ ├── /api/v1/models/unload POST unload_model (NEW) │
│ ├── /api/v1/models/:id DELETE delete_model (NEW) │
│ ├── /api/v1/models/lora/profiles GET list_lora (NEW) │
│ ├── /api/v1/models/lora/activate POST activate_lora (NEW) │
│ ├── /api/v1/recording/list GET list_recordings (NEW) │
│ ├── /api/v1/recording/start POST start_recording (NEW) │
│ ├── /api/v1/recording/stop POST stop_recording (NEW) │
│ ├── /api/v1/recording/:id DELETE delete_recording (NEW) │
│ ├── /api/v1/train/status GET train_status (NEW) │
│ ├── /api/v1/train/start POST train_start (NEW) │
│ ├── /api/v1/train/stop POST train_stop (NEW) │
│ ├── /ws/sensing (existing — sensing WebSocket) │
│ └── /ui/* (existing — static file serving) │
│ │
│ AppStateInner (new fields) │
│ ├── discovered_models: Vec<Value> │
│ ├── active_model_id: Option<String> │
│ ├── recordings: Vec<Value> │
│ ├── recording_active / recording_start_time / recording_current_id │
│ ├── recording_stop_tx: Option<watch::Sender<bool>> │
│ ├── training_status: Value │
│ └── training_config: Option<Value> │
│ │
│ data/ │
│ ├── models/ *.rvf files scanned at startup │
│ └── recordings/ *.jsonl files written by background task │
└───────────────────────────────────────────────────────────────────────┘
Routes are registered individually in the http_app Router before the static UI fallback handler.
New Endpoints (17 total)
Model Management (model_manager.rs)
| Method | Path | Request Body | Response | Description |
|---|---|---|---|---|
| GET | `/api/v1/models` | -- | `{ models: ModelInfo[], count: usize }` | Scan `data/models/` for `.rvf` files and return manifest metadata |
| GET | `/api/v1/models/{id}` | -- | `ModelInfo` | Detailed info for a single model (version, PCK score, LoRA profiles, segment count) |
| GET | `/api/v1/models/active` | -- | `ActiveModelInfo` or `{ status: "no_model" }` | Active model with runtime stats (avg inference ms, frames processed) |
| POST | `/api/v1/models/load` | `{ model_id: string }` | `{ status: "loaded", model_id, weight_count }` | Load model weights into memory via `RvfReader`, set `model_loaded = true` |
| POST | `/api/v1/models/unload` | -- | `{ status: "unloaded", model_id }` | Drop loaded weights, set `model_loaded = false` |
| POST | `/api/v1/models/lora/activate` | `{ model_id, profile_name }` | `{ status: "activated", profile_name }` | Activate a LoRA adapter profile on the loaded model |
| GET | `/api/v1/models/lora/profiles` | -- | `{ model_id, profiles: string[], active }` | List LoRA profiles available in the loaded model |
CSI Recording (recording.rs)
| Method | Path | Request Body | Response | Description |
|---|---|---|---|---|
| POST | `/api/v1/recording/start` | `{ session_name, label?, duration_secs? }` | `{ status: "recording", session_id, file_path }` | Create a new `.csi.jsonl` file and begin appending frames |
| POST | `/api/v1/recording/stop` | -- | `{ status: "stopped", session_id, frame_count }` | Stop the active recording, write companion `.meta.json` |
| GET | `/api/v1/recording/list` | -- | `{ recordings: RecordingSession[], count }` | List all recordings by scanning `.meta.json` files |
| GET | `/api/v1/recording/download/{id}` | -- | `application/x-ndjson` file | Download the raw JSONL recording file |
| DELETE | `/api/v1/recording/{id}` | -- | `{ status: "deleted", deleted_files }` | Remove `.csi.jsonl` and `.meta.json` files |
Training Pipeline (training_api.rs)
| Method | Path | Request Body | Response | Description |
|---|---|---|---|---|
| POST | `/api/v1/train/start` | `TrainingConfig { epochs, batch_size, learning_rate, ... }` | `{ status: "started", run_id }` | Launch background training task against recorded CSI data |
| POST | `/api/v1/train/stop` | -- | `{ status: "stopped", run_id }` | Cancel the active training run via a stop signal |
| GET | `/api/v1/train/status` | -- | `TrainingStatus { phase, epoch, loss, ... }` | Current training state (idle, training, complete, failed) |
| POST | `/api/v1/train/pretrain` | `{ epochs?, learning_rate? }` | `{ status: "started", mode: "pretrain" }` | Start self-supervised contrastive pretraining (ADR-024) |
| POST | `/api/v1/train/lora` | `{ profile_name, epochs?, rank? }` | `{ status: "started", mode: "lora" }` | Start LoRA fine-tuning on a loaded base model |
| WS | `/ws/train/progress` | -- | Streaming `TrainingProgress` JSON | Epoch-level progress with loss, metrics, and ETA |
State Management
All three modules share the server's AppStateInner via Arc<RwLock<AppStateInner>>. New fields added to AppStateInner:
```rust
/// Runtime state for a loaded RVF model (None if no model loaded).
pub loaded_model: Option<LoadedModelState>,
/// Runtime state for the active CSI recording session.
pub recording_state: RecordingState,
/// Runtime state for the active training run.
pub training_state: TrainingState,
/// Broadcast channel for training progress updates (consumed by WebSocket).
pub train_progress_tx: broadcast::Sender<TrainingProgress>,
```
Key design constraints:
- Single writer: Only one recording session can be active at a time. Starting a new recording while one is active returns an error.
- Single model: Only one model can be loaded at a time. Loading a new model implicitly unloads the previous one.
- Background training: Training runs in a spawned `tokio::task`. Progress is broadcast via a `tokio::sync::broadcast` channel. The WebSocket handler subscribes to this channel.
- Auto-stop: Recordings with a `duration_secs` parameter automatically stop after the specified elapsed time.
Training Pipeline (No External Dependencies)
The training pipeline is implemented entirely in Rust without PyTorch or tch dependencies. The pipeline:
- Loads data: Reads `.csi.jsonl` recording files from `data/recordings/`
- Extracts features: Subcarrier variance (sliding window), temporal gradients, Goertzel frequency-domain power across 9 bands, and 3 global scalar features (mean amplitude, std, motion score)
- Trains model: Regularised linear model via batch gradient descent targeting 17 COCO keypoints x 3 dimensions = 51 output targets
- Exports model: Best checkpoint exported as an `.rvf` container using `RvfBuilder`, stored in `data/models/`
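The training step reduces to ridge-style batch gradient descent over a linear map from CSI features to keypoint targets. A dimension-reduced sketch (the helper name `gd_epoch` is hypothetical; the real pipeline maps the feature vector to the 51 output targets above):

```rust
// Sketch: one epoch of batch gradient descent for a regularised (ridge)
// linear model y = W x. Dimensions are reduced for clarity.
fn gd_epoch(
    weights: &mut Vec<Vec<f32>>, // [outputs][features]
    xs: &[Vec<f32>],             // batch of feature vectors
    ys: &[Vec<f32>],             // matching targets
    lr: f32,                     // learning rate
    l2: f32,                     // ridge regularisation strength
) -> f32 {
    let n = xs.len() as f32;
    let mut total_loss = 0.0;
    let mut grads = vec![vec![0.0f32; weights[0].len()]; weights.len()];
    // Accumulate squared-error loss and its gradient over the batch.
    for (x, y) in xs.iter().zip(ys) {
        for (o, row) in weights.iter().enumerate() {
            let pred: f32 = row.iter().zip(x).map(|(w, xi)| w * xi).sum();
            let err = pred - y[o];
            total_loss += err * err;
            for (g, xi) in grads[o].iter_mut().zip(x) {
                *g += 2.0 * err * xi;
            }
        }
    }
    // Apply the averaged gradient plus the L2 penalty term.
    for (row, grow) in weights.iter_mut().zip(&grads) {
        for (w, g) in row.iter_mut().zip(grow) {
            *w -= lr * (g / n + 2.0 * l2 * *w);
        }
    }
    total_loss / n // mean squared error for this epoch
}
```

Each epoch returns the mean squared error, which is what the training task broadcasts as its `loss` metric; the checkpoint with the lowest loss is the one exported.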
This design means the sensing server is fully self-contained: a field operator can record CSI data, train a model, and load it for inference without any external tooling.
File Layout
```
data/
├── models/                                   # RVF model files
│   ├── wifi-densepose-v1.rvf                 # Trained model container
│   └── ...                                   # (additional models)
└── recordings/                               # CSI recording sessions
    ├── walking-20260303_140000.csi.jsonl     # Raw CSI frames (JSONL)
    ├── walking-20260303_140000.csi.meta.json # Session metadata
    ├── standing-20260303_141500.csi.jsonl
    └── standing-20260303_141500.csi.meta.json
```
Mobile App Fixes
Three defects were corrected in the Expo React Native mobile companion (ui/mobile/):
1. WebSocket URL builder (`src/services/ws.service.ts`): The URL construction logic previously hardcoded port `3001` for WebSocket connections. This was changed to derive the WebSocket port from the same-origin HTTP URL, using `window.location.port` on web and the configured server URL on native platforms. This ensures the mobile app connects to whatever port the sensing server is actually running on.
2. Jest configuration (`jest.config.js`): The `testPathIgnorePatterns` array previously contained an entry that matched the test directory itself, causing Jest to silently skip all test files. The pattern was corrected to only ignore `node_modules/`.
3. Placeholder tests replaced: All 25 mobile test files contained only `it.todo()` stubs. These were replaced with real test implementations covering:

| Category | Test Files | Coverage |
|---|---|---|
| Utils | `format.test.ts`, `validation.test.ts` | Number formatting, URL validation, input sanitization |
| Services | `ws.service.test.ts`, `api.service.test.ts` | WebSocket connection lifecycle, REST API calls, error handling |
| Stores | `poseStore.test.ts`, `settingsStore.test.ts`, `matStore.test.ts` | Zustand state transitions, persistence, selector memoization |
| Components | `BreathingGauge.test.tsx`, `HeartRateGauge.test.tsx`, `MetricCard.test.tsx`, `ConnectionBanner.test.tsx` | Rendering, prop validation, theme compliance |
| Hooks | `useConnection.test.ts`, `useSensing.test.ts` | Hook lifecycle, cleanup, error states |
| Screens | `LiveScreen.test.tsx`, `VitalsScreen.test.tsx`, `SettingsScreen.test.tsx` | Screen rendering, navigation, data binding |
Rationale
Why implement model/training/recording in the sensing server?
The alternative would be to run a separate Python training service and proxy requests. This was rejected for three reasons:
- Single-binary deployment: WiFi-DensePose targets edge deployments (disaster response, building security, healthcare monitoring per ADR-034) where installing Python, pip, and PyTorch is impractical. A single Rust binary that handles sensing, recording, training, and inference is the correct architecture for field use.
- Zero-configuration UI: The web UI is served by the same binary that exposes the API. When a user opens `http://server:8080/`, everything works -- no additional services to start, no ports to configure, no CORS to manage.
- Data locality: CSI frames arrive via UDP, are processed for real-time display, and can simultaneously be written to disk for training. The recording module hooks directly into the CSI processing loop via `maybe_record_frame()`, avoiding any serialization overhead or inter-process communication.
Why fix mobile in the same change?
The mobile app's WebSocket failure was caused by the same root problem -- assumptions about server port layout that did not match reality. Fixing the server API without fixing the mobile client would leave a broken user experience. The test fixes were included because the placeholder tests masked the WebSocket URL bug during development.
Consequences
Positive
- UI loads with zero console errors: All model, recording, and training tabs render correctly and receive real data from the server
- End-to-end workflow: Users can record CSI data, train a model, load it, and see pose estimation results -- all from the web UI without any external tools
- LoRA fine-tuning support: Users can adapt a base model to new environments via LoRA profiles, activated through the UI
- Mobile app connects reliably: The WebSocket URL builder uses same-origin port derivation, working correctly regardless of which port the server runs on
- 25 real mobile tests: Provide actual regression protection for utils, services, stores, components, hooks, and screens
- Self-contained sensing server: No Python, PyTorch, or external training infrastructure required
Negative
- Sensing server binary grows: The three new modules add approximately 2,000 lines of Rust to the sensing server crate, increasing compile time marginally
- Training is lightweight: The built-in training pipeline uses regularised linear regression, not deep learning. For production-grade pose estimation models, the full Python training pipeline (`wifi-densepose-train`) with PyTorch is still needed. The in-server training is designed for quick field calibration, not SOTA accuracy.
- File-based storage: Models and recordings are stored as files on the local filesystem (`data/models/`, `data/recordings/`). There is no database, no replication, and no access control. This is acceptable for single-node edge deployments but not for multi-user production environments.
Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Disk fills up during long recording sessions | Medium | Medium | duration_secs auto-stop parameter; UI shows file size; manual DELETE endpoint |
| Concurrent model load/unload during inference causes race | Low | High | RwLock on AppStateInner serializes all state mutations; inference path acquires read lock |
| Training on insufficient data produces poor model | Medium | Low | Training API validates minimum frame count before starting; UI shows dataset statistics |
| JSONL recording format is inefficient for large datasets | Low | Low | Acceptable for field calibration (minutes of data); production datasets use the Python pipeline with HDF5 |
Implementation
Server-Side Changes
All 14 new handler functions were added directly to main.rs (~400 lines of new code). Key additions:
| Handler | Method | Path | Description |
|---|---|---|---|
| `list_models` | GET | `/api/v1/models` | Scans `data/models/` for `.rvf` files at startup, returns cached list |
| `get_active_model` | GET | `/api/v1/models/active` | Returns currently loaded model or `null` |
| `load_model` | POST | `/api/v1/models/load` | Sets `active_model_id` in state |
| `unload_model` | POST | `/api/v1/models/unload` | Clears `active_model_id` |
| `delete_model` | DELETE | `/api/v1/models/:id` | Removes model from disk and state |
| `list_lora_profiles` | GET | `/api/v1/models/lora/profiles` | Scans `data/models/lora/` directory |
| `activate_lora_profile` | POST | `/api/v1/models/lora/activate` | Activates a LoRA adapter |
| `list_recordings` | GET | `/api/v1/recording/list` | Scans `data/recordings/` for `.jsonl` files with frame counts |
| `start_recording` | POST | `/api/v1/recording/start` | Spawns tokio background task writing CSI frames to `.jsonl` |
| `stop_recording` | POST | `/api/v1/recording/stop` | Sends stop signal via `tokio::sync::watch`, returns duration |
| `delete_recording` | DELETE | `/api/v1/recording/:id` | Removes recording file from disk |
| `train_status` | GET | `/api/v1/train/status` | Returns training phase (idle/running/complete/failed) |
| `train_start` | POST | `/api/v1/train/start` | Sets training status to running with config |
| `train_stop` | POST | `/api/v1/train/stop` | Sets training status to idle |
Helper functions: scan_model_files(), scan_lora_profiles(), scan_recording_files(), chrono_timestamp().
Startup creates data/models/ and data/recordings/ directories and populates initial state with scanned files.
Web UI Fix
| File | Change | Description |
|---|---|---|
| `ui/app.js` | Modified | Import `sensingService` and call `sensingService.start()` in `initializeServices()` after the backend health check, so Dashboard and Live Demo tabs connect to `/ws/sensing` immediately on load instead of waiting for a Sensing tab visit |
| `ui/services/sensing.service.js` | Comment | Updated comment documenting that `/ws/sensing` is on the same HTTP port |
Mobile App Files
| File | Change | Description |
|---|---|---|
| `ui/mobile/src/services/ws.service.ts` | Modified | `buildWsUrl()` uses `parsed.host` directly with the `/ws/sensing` path instead of hardcoded port `3001` |
| `ui/mobile/jest.config.js` | Modified | `testPathIgnorePatterns` corrected to only ignore `node_modules/` |
| `ui/mobile/src/__tests__/*.test.ts{x}` | Replaced | 25 placeholder `it.todo()` tests replaced with real implementations |
Verification
```sh
# 1. Start sensing server with auto source (simulated fallback)
cd v2
cargo run -p wifi-densepose-sensing-server -- --http-port 3000 --source auto

# 2. Verify model endpoints return 200
curl -s http://localhost:3000/api/v1/models | jq '.count'
curl -s http://localhost:3000/api/v1/models/active | jq '.status'

# 3. Verify recording endpoints return 200
curl -s http://localhost:3000/api/v1/recording/list | jq '.count'
curl -s -X POST http://localhost:3000/api/v1/recording/start \
  -H 'Content-Type: application/json' \
  -d '{"session_name":"test","duration_secs":5}' | jq '.status'

# 4. Verify training endpoint returns 200
curl -s http://localhost:3000/api/v1/train/status | jq '.phase'

# 5. Verify LoRA endpoints return 200
curl -s http://localhost:3000/api/v1/models/lora/profiles | jq '.'

# 6. Open UI -- check browser console for zero 404 errors
#    Navigate to http://localhost:3000/ui/

# 7. Run mobile tests
cd ../ui/mobile
npx jest --no-coverage

# 8. Run Rust workspace tests (must pass, 1031+ tests)
cd ../../v2
cargo test --workspace --no-default-features
```
References
- ADR-034: Expo React Native Mobile Application (mobile companion architecture)
- ADR-036: RVF Training Pipeline UI (training pipeline design)
- ADR-039: ESP32-S3 Edge Intelligence Pipeline (CSI frame format and processing tiers)
- ADR-040: WASM Programmable Sensing (Tier 3 edge compute)
- ADR-041: WASM Module Collection (module catalog)
- `crates/wifi-densepose-sensing-server/src/main.rs` -- all 14 new handler functions (model, recording, training)
- `ui/app.js` -- sensing service early initialization fix
- `ui/mobile/src/services/ws.service.ts` -- mobile WebSocket URL fix