SFDIPOT Product Factors Assessment: wifi-densepose
Assessment Date: 2026-04-05
Assessor: QE Product Factors Assessor (HTSM v6.3)
Framework: James Bach's Heuristic Test Strategy Model -- Product Factors (SFDIPOT)
Scope: Full wifi-densepose system -- Rust workspace (18 crates, 153k LoC), Python v1 (105 files, 39k LoC), ESP32 firmware (48 files, 1.6k LoC), CI/CD pipelines (8 workflows)
Test Count: 2,618 Rust #[test] functions + 33 Python test files
Executive Summary
The wifi-densepose project is an ambitious WiFi-based human pose estimation system spanning five deployment targets (server, desktop, WASM/browser, ESP32 embedded, mobile). This SFDIPOT assessment identifies 47 risk areas across all seven product factors. The highest concentration of risk lies in Time (real-time processing constraints with no latency testing), Platform (6 target architectures with limited cross-platform validation), and Interfaces (multiple protocol boundaries with incomplete contract testing).
Overall Risk Rating: HIGH -- The system's safety-critical use case (Mass Casualty Assessment Tool) combined with multi-platform deployment and real-time signal processing demands rigorous testing that is currently only partially in place.
Risk Heat Map
| Factor | Risk | Confidence | Test Coverage | Key Concern |
|---|---|---|---|---|
| Structure | MEDIUM | High | Good | 18 crates well-organized; MAT lib.rs at 626 lines pushes limit |
| Function | HIGH | High | Moderate | Vital signs extraction, pose estimation accuracy unvalidated in production conditions |
| Data | MEDIUM | High | Moderate | Proof-of-reality system strong; CSI data integrity across protocols untested |
| Interfaces | HIGH | Medium | Low | REST API stub in Rust; Python/Rust boundary undefined; ESP32 serial protocol loosely coupled |
| Platform | HIGH | Medium | Low | 6 deployment targets; ESP32 original/C3 excluded but not enforced at build level |
| Operations | MEDIUM | Medium | Low | No Dockerfile; firmware OTA path defined but unvalidated end-to-end |
| Time | CRITICAL | High | Very Low | 20 Hz target; no latency benchmarks; concurrent multi-node processing untested |
S -- Structure
What the product IS
S1: Code Integrity
Finding: The Rust workspace is well-structured with 18 crates following Domain-Driven Design bounded contexts. The wifi-densepose-core crate uses #![forbid(unsafe_code)] and provides clean trait abstractions (SignalProcessor, NeuralInference, DataStore). The crate dependency graph has a clear publish order documented in CLAUDE.md.
Risk: MEDIUM
- The wifi-densepose-mat lib.rs is 626 lines, exceeding the project's own 500-line limit specified in CLAUDE.md. The DisasterResponse struct owns 8 fields including an Arc<dyn EventStore>, making it a coordination bottleneck.
- The wifi-densepose-wasm-edge crate is excluded from the workspace (exclude = ["crates/wifi-densepose-wasm-edge"]), meaning cargo test --workspace does not exercise it. This creates a coverage gap for edge deployment code (662 lines).
- The wifi-densepose-api Rust crate is a 1-line stub (//! WiFi-DensePose REST API (stub)), while the Python v1 has a full FastAPI implementation. This implies the Rust port's API surface is incomplete.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| S-01 | P1 | Build wifi-densepose-wasm-edge separately (cargo build -p wifi-densepose-wasm-edge --target wasm32-unknown-unknown) and run any embedded tests to confirm they pass outside the workspace test run | Integration |
| S-02 | P2 | Measure cyclomatic complexity of DisasterResponse::scan_cycle which spans 80+ lines with nested borrows and conditional event emission -- flag if complexity exceeds 15 | Unit |
| S-03 | P2 | Run cargo check --workspace --all-features to surface feature-flag interaction issues across all 18 crates that are hidden by --no-default-features in CI | Integration |
| S-04 | P3 | Count lines per file across all crates; flag any .rs file exceeding the 500-line project policy | Lint/CI |
S2: Dependencies
Finding: The workspace has 30+ external crate dependencies including heavy ones: tch (PyTorch FFI), ort (ONNX Runtime), ndarray-linalg with openblas-static, and 7 ruvector-* crates from crates.io. The ruvector dependency comment notes "Vendored at v2.1.0 in vendor/ruvector; using crates.io versions until published" -- suggesting a version mismatch risk between vendored and published code.
Risk: MEDIUM
- ort = "2.0.0-rc.11" is a release candidate. RC dependencies in production code carry API stability risk.
- ndarray-linalg with openblas-static forces a specific BLAS implementation that may conflict on certain platforms (ARM, WASM).
- The tch-backend feature flag gates the entire training pipeline. If a developer enables it without libtorch installed, the build fails without a clear error path.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| S-05 | P1 | Run cargo audit to detect known vulnerabilities in the 30+ dependencies, particularly ort RC and tch FFI bindings | CI/Unit |
| S-06 | P2 | Build the workspace on ARM64 (aarch64-unknown-linux-gnu) to confirm openblas-static compiles; the current CI only runs x86_64 | Integration |
| S-07 | P2 | Toggle tch-backend feature on wifi-densepose-train without libtorch installed; confirm error message is actionable, not a cryptic linker failure | Human Exploration |
S3: Non-Executable Files
Finding: 43+ ADR documents, proof data files (sample_csi_data.json, expected_features.sha256), NVS configuration files for ESP32. The proof-of-reality system uses a published SHA-256 hash of pipeline output as a trust anchor.
Risk: LOW
- The expected_features.sha256 file is the single point of truth for pipeline integrity. If it is regenerated incorrectly (e.g., with a different numpy version), the proof becomes meaningless.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| S-08 | P0 | Run python archive/v1/data/proof/verify.py in CI on every PR that touches archive/v1/src/core/ or archive/v1/src/hardware/ to catch proof-breaking changes | CI |
| S-09 | P2 | Pin numpy/scipy versions in requirements.txt and confirm verify.py --generate-hash produces the same hash across Python 3.10, 3.11, and 3.12 | Integration |
F -- Function
What the product DOES
F1: Application -- Core Capabilities
Finding: The system advertises five core capabilities:
- CSI extraction from ESP32 hardware
- Signal processing (noise removal, phase sanitization, feature extraction, Doppler)
- Human presence detection and pose estimation (17-keypoint COCO format)
- Vital signs extraction (breathing rate, heart rate)
- Mass casualty assessment (survivor detection through debris)
The Python v1 CSI processor (csi_processor.py) implements a complete pipeline from raw CSI frames through feature extraction to human detection. The Rust port replicates and extends this with 14 RuvSense modules for multistatic sensing.
Risk: HIGH
- The human detection confidence calculation in _calculate_detection_confidence uses hardcoded binary thresholds (> 0.1, > 0.05, > 0.3) with fixed weights (0.4, 0.3, 0.3). These are not calibrated against ground truth data.
- The temporal smoothing factor (smoothing_factor = 0.9) means the system takes ~10 frames to respond to a presence change. For a 20 Hz system, that is 500ms of latency injected by design -- acceptable for presence but too slow for pose tracking.
- The EnsembleClassifier in the MAT crate combines breathing, heartbeat, and movement classifiers but there are no integration tests validating that the ensemble confidence actually correlates with real survivor detection.
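The ~10 frame figure can be checked with a small simulation. This is a minimal sketch that assumes the smoothing is a first-order exponential moving average, smoothed = factor * previous + (1 - factor) * raw. The source names only smoothing_factor, so the update rule and the helper frames_to_cross are illustrative assumptions, not project code.

```python
def frames_to_cross(smoothing_factor, threshold, max_frames=1000):
    """Count frames until a smoothed 0 -> 1 step input reaches `threshold`.

    Hypothetical helper: assumes an exponential moving average update.
    """
    smoothed = 0.0
    for frame in range(1, max_frames + 1):
        # The raw confidence jumps to 1.0 at frame 1 and stays there.
        smoothed = smoothing_factor * smoothed + (1.0 - smoothing_factor) * 1.0
        if smoothed >= threshold:
            return frame
    return None  # threshold never reached within max_frames


FRAME_RATE_HZ = 20
# 0.632 is roughly one time constant (1 - 1/e) of the step response.
frames = frames_to_cross(0.9, threshold=0.632)
latency_ms = frames * 1000 / FRAME_RATE_HZ
```

Under this assumption the response reaches about 63% of a step after 10 frames, i.e. 500 ms at 20 Hz. A lower decision threshold such as 0.5 is crossed sooner, so the effective latency depends on where the detection threshold sits.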
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| F-01 | P0 | Feed 100 known-good CSI frames (from sample_csi_data.json) through the full Python pipeline and assert detection confidence is within expected range (0.7-0.95 for human-present frames) | Unit |
| F-02 | P0 | Feed 100 CSI frames of background noise (no human present) and confirm detection confidence stays below threshold (< 0.3); false positive rate must be < 5% | Unit |
| F-03 | P1 | Measure temporal smoothing convergence: inject a step change from no-human to human-present and count frames until confidence exceeds threshold; assert < 15 frames at 20 Hz | Unit |
| F-04 | P1 | Run the MAT EnsembleClassifier with synthetic vital signs at confidence boundary (0.49, 0.50, 0.51) and confirm correct accept/reject behavior at the confidence_threshold boundary | Unit |
| F-05 | P2 | Inject CSI data with amplitudes.len() != phases.len() into DisasterResponse::push_csi_data and confirm the error path returns MatError::Detection with descriptive message | Unit |
F2: Calculation Accuracy
Finding: The signal processing pipeline involves FFT (via rustfft and scipy.fft), correlation matrices, bandpass filtering, zero-crossing analysis, autocorrelation, and SVD decomposition. These are numerically sensitive operations.
Risk: HIGH
- The Doppler extraction in Python uses scipy.fft.fft with n=64 bins on a sliding window of cached phase values. The normalization divides by max_val which can amplify noise when the max is near zero.
- The vital signs extractor (BreathingExtractor, HeartRateExtractor) uses bandpass filtering in specific Hz ranges (0.1-0.5 Hz for breathing, 0.8-2.0 Hz for heart rate). These filter boundaries are physiologically reasonable but have no tolerance handling for edge cases (e.g., athlete with 40 bpm resting heart rate = 0.67 Hz, below the 0.8 Hz lower bound).
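Both risks are easy to reproduce in isolation. The sketch below uses plain Python lists rather than the project's numpy arrays; normalize_spectrum, bpm_to_hz, in_band, and the 1e-12 epsilon are hypothetical names and values chosen for illustration, not the project's implementation.

```python
import math


def normalize_spectrum(magnitudes, eps=1e-12):
    """Scale magnitudes to a peak of 1.0, guarding against a near-zero peak."""
    max_val = max(magnitudes, default=0.0)
    if not math.isfinite(max_val) or max_val < eps:
        # Dividing by a tiny peak would only amplify noise, so report silence.
        return [0.0] * len(magnitudes)
    return [m / max_val for m in magnitudes]


def bpm_to_hz(bpm):
    """Convert beats (or breaths) per minute to Hz."""
    return bpm / 60.0


def in_band(freq_hz, low_hz, high_hz):
    """True when the frequency lies inside the inclusive passband."""
    return low_hz <= freq_hz <= high_hz


athlete_hz = bpm_to_hz(40)                       # about 0.67 Hz
in_heart_band = in_band(athlete_hz, 0.8, 2.0)    # heart-rate passband from the text
in_breath_band = in_band(athlete_hz, 0.1, 0.5)   # breathing passband from the text
```

A 40 bpm heart rate falls in the gap between the two passbands quoted above, so neither extractor would see it. That is the scenario test idea F-07 targets.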
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| F-06 | P0 | Generate a synthetic CSI signal with known Doppler shift (e.g., 2 Hz sinusoidal phase modulation) and confirm the Doppler extraction peak is within +/- 0.5 Hz of the injected frequency | Unit |
| F-07 | P1 | Feed the HeartRateExtractor a signal at 0.67 Hz (40 bpm, athletic resting rate) and confirm it is either detected correctly or reported as VitalEstimate::unavailable -- not misclassified as breathing | Unit |
| F-08 | P1 | Test Doppler normalization edge case: when max_val approaches zero (< 1e-12), confirm division does not produce NaN or Inf values | Unit |
| F-09 | P2 | Compare Python scipy.fft.fft output against Rust rustfft output for the same 64-element input vector; assert difference < 1e-6 per bin | Integration |
F3: Error Handling
Finding: The Rust crates use thiserror with per-crate error enums (MatError, SignalError, RuvSenseError) that chain properly. The Python code uses custom exception classes (CSIProcessingError, DatabaseConnectionError). Both handle errors with descriptive messages.
Risk: MEDIUM
- The Python CSIProcessor.process_csi_data catches all exceptions with a blanket except Exception as e and wraps them in CSIProcessingError. This loses the original exception type and stack trace from the caller's perspective.
- The Rust scan_cycle method silently discards event store errors with let _ = self.event_store.append(...). In a disaster response context, losing domain events could mean missing survivor detections.
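For the Python side, explicit exception chaining is one way to keep the original error reachable after wrapping. In this minimal sketch, CSIProcessingError is a local stand-in for the project's class of the same name, and compute_correlation is a hypothetical failing step, not the real pipeline.

```python
class CSIProcessingError(Exception):
    """Stand-in for the project's exception class of the same name."""


def compute_correlation(frame):
    # Hypothetical processing step, used only to trigger an error.
    return 1.0 / frame["denominator"]


def process_csi_data(frame):
    try:
        return compute_correlation(frame)
    except Exception as exc:
        # `from exc` stores the original exception on __cause__, so callers
        # can still inspect its type and traceback.
        raise CSIProcessingError(f"CSI processing failed: {exc}") from exc


def original_error_type(frame):
    """Return the type of the wrapped exception, or None on success."""
    try:
        process_csi_data(frame)
    except CSIProcessingError as err:
        return type(err.__cause__)
    return None
```

A test in the spirit of F-11 could then assert on the __cause__ attribute rather than on the message text.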
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| F-10 | P1 | Make the InMemoryEventStore return an error on append() and confirm scan_cycle either propagates the error or logs it at WARN+ level -- not silently discard it | Unit |
| F-11 | P2 | Inject a numpy.linalg.LinAlgError in the correlation matrix computation and confirm the error chain preserves the original exception type through CSIProcessingError | Unit |
F4: Security
Finding: The Python API implements authentication middleware (AuthMiddleware), rate limiting (RateLimitMiddleware), CORS configuration, and trusted host middleware for production. Settings require a secret_key field. The dev config endpoint redacts sensitive fields containing "secret", "password", "token", "key", "credential", "auth".
Risk: MEDIUM
- The secret_key field uses Field(...) (required) but there is no validation on minimum key length or entropy.
- CORS defaults to ["*"] which is permissive. While overridable, the default is risky if deployed without configuration.
- The readiness check at /health/ready hardcodes ready = True with a comment "Basic readiness - API is responding" and checks["hardware_ready"] = True regardless of actual hardware state. This defeats the purpose of a readiness probe.
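Two of these gaps can be closed with very little code. The sketch below is framework-agnostic and does not reproduce the project's FastAPI or settings code. The function names, the 32-character minimum, and the shape of the checks dictionary are all assumptions made for illustration.

```python
def validate_secret_key(key, min_length=32):
    """Reject short keys. The 32-character minimum is an assumed policy."""
    if len(key) < min_length:
        raise ValueError(
            f"secret_key must be at least {min_length} characters, got {len(key)}"
        )
    return key


def readiness(checks):
    """Derive readiness from component checks instead of hardcoding True.

    `checks` maps a component name to a boolean health flag.
    """
    failed = sorted(name for name, ok in checks.items() if not ok)
    return {"ready": not failed, "failed_checks": failed}


report = readiness({"pose_service": False, "hardware_ready": True})
```

Length is only a proxy for entropy, so a check like this catches the 3-character key in F-12 but not a long, low-entropy one.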
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| F-12 | P0 | Set secret_key to a 3-character string and confirm the application either rejects it at startup or logs a security warning | Unit |
| F-13 | P1 | Submit a request to /health/ready when pose_service is None and confirm ready is reported as False, not hardcoded True | Integration |
| F-14 | P1 | Set environment=production and confirm /docs, /redoc, and /openapi.json endpoints return 404, not the Swagger UI | E2E |
| F-15 | P2 | Send 101 requests within the rate limit window and confirm the 101st is rejected with HTTP 429 | Integration |
F5: State Transitions
Finding: The system has multiple state machines:
- DeviceStatus: ACTIVE -> INACTIVE -> MAINTENANCE -> ERROR
- SessionStatus: ACTIVE -> COMPLETED / FAILED / CANCELLED
- ProcessingStatus: PENDING -> PROCESSING -> COMPLETED / FAILED
- ESP32 firmware: WiFi connecting -> connected -> CSI streaming
- RuvSense TrackLifecycleState: lifecycle for pose tracks
- MAT ZoneStatus: Active scan zones
Risk: MEDIUM
- The database models define valid states via CheckConstraint but do not enforce transition rules (e.g., can a device go from ERROR directly to ACTIVE without going through MAINTENANCE?).
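Transition rules can live in application code as an explicit table. The sketch below reuses the DeviceStatus state names from the list above. The allowed-transition policy itself (for example, that ERROR may only move to MAINTENANCE) is a hypothetical choice, since the source leaves that question open.

```python
from enum import Enum


class DeviceStatus(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    MAINTENANCE = "maintenance"
    ERROR = "error"


# Hypothetical policy: a device in ERROR must pass through MAINTENANCE.
ALLOWED_TRANSITIONS = {
    DeviceStatus.ACTIVE: {DeviceStatus.INACTIVE, DeviceStatus.MAINTENANCE, DeviceStatus.ERROR},
    DeviceStatus.INACTIVE: {DeviceStatus.ACTIVE, DeviceStatus.MAINTENANCE, DeviceStatus.ERROR},
    DeviceStatus.MAINTENANCE: {DeviceStatus.ACTIVE, DeviceStatus.INACTIVE, DeviceStatus.ERROR},
    DeviceStatus.ERROR: {DeviceStatus.MAINTENANCE},
}


def transition(current, new):
    """Return the new status, or raise if the move is not in the table."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new
```

Keeping the table as data makes it straightforward to enumerate every (from, to) pair in a parametrized test such as F-16.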
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| F-16 | P1 | Attempt to transition DeviceStatus from ERROR to ACTIVE directly and confirm the system either prevents it or logs the anomaly | Unit |
| F-17 | P2 | Simulate a Session that is in COMPLETED status and attempt to add new CSI data to it; confirm it is rejected | Unit |
D -- Data
What the product PROCESSES
D1: Input Data
Finding: The system ingests CSI frames from multiple sources:
- ESP32 ADR-018 binary protocol (UDP)
- Serial port data via serialport crate
- Sample JSON data (sample_csi_data.json with 1,000 synthetic frames)
- CsiData Python dataclass: amplitude (ndarray), phase (ndarray), frequency, bandwidth, num_subcarriers, num_antennas, snr, metadata
The Rust Esp32CsiParser::parse_frame takes raw bytes and returns structured CsiFrame with amplitude/phase arrays.
Risk: MEDIUM
- The Python CSIData dataclass accepts arbitrary-shaped numpy arrays for amplitude and phase. There is no validation that amplitude.shape == (num_antennas, num_subcarriers).
- The ESP32 parser returns ParseError::InsufficientData { needed, got } but there is no handling for malformed data that has the right length but corrupt content (e.g., all-zero subcarrier data).
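A shape and content check of the kind described might look like the following. To stay dependency-free it works on nested lists instead of numpy arrays. validate_csi_frame is a hypothetical helper, not part of the project's CSIData class.

```python
def validate_csi_frame(amplitude, phase, num_antennas, num_subcarriers):
    """Return a list of problems; an empty list means the frame looks consistent."""
    problems = []
    for name, matrix in (("amplitude", amplitude), ("phase", phase)):
        row_lengths = {len(row) for row in matrix}
        # Every antenna row must exist and hold exactly num_subcarriers values.
        if len(matrix) != num_antennas or row_lengths != {num_subcarriers}:
            problems.append(
                f"{name} shape does not match ({num_antennas}, {num_subcarriers})"
            )
    # Right-sized but all-zero data is treated as corrupt content.
    if not problems and all(v == 0 for row in amplitude for v in row):
        problems.append("amplitude is all zeros")
    return problems


good = [[1.0] * 64 for _ in range(2)]
three_antennas = [[1.0] * 64 for _ in range(3)]
zeros = [[0.0] * 64 for _ in range(2)]
```

With numpy arrays the first check reduces to comparing array.shape against (num_antennas, num_subcarriers), which is the condition test idea D-01 exercises.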
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| D-01 | P1 | Create a CSIData with amplitude.shape = (3, 64) but num_antennas = 2 and confirm the processor rejects or reshapes it | Unit |
| D-02 | P1 | Feed the ESP32 parser a correctly-sized but all-zero byte buffer and confirm it either rejects the frame (quality check) or marks quality_score as degraded | Unit |
| D-03 | P2 | Feed the ESP32 parser a buffer with valid header but truncated subcarrier data; confirm ParseError::InsufficientData | Unit |
| D-04 | P2 | Test boundary: exactly 256 subcarriers (MAX_SUBCARRIERS constant) and 257 subcarriers -- confirm correct handling | Unit |
D2: Data Persistence
Finding: The Python v1 uses SQLAlchemy with PostgreSQL (primary) and SQLite (failsafe fallback). The database schema includes 6 tables: devices, sessions, csi_data, pose_detections, system_metrics, audit_logs. The csi_data table stores amplitude and phase as FloatArray columns with a unique constraint on (device_id, sequence_number, timestamp_ns).
Risk: MEDIUM
- Storing raw CSI amplitude/phase arrays as database columns (FloatArray) is expensive. At 20 Hz with 56 subcarriers, that is 2,240 floats/second per device stored to PostgreSQL. No data retention policy or archival strategy is documented.
- The SQLite fallback uses NullPool which means no connection reuse. Under load, this could exhaust file handles.
- The audit_logs table tracks changes but there is no mention of log rotation or size limits.
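The storage figures in this section follow from simple arithmetic, sketched below. The 2,240 floats/second figure matches one amplitude and one phase value per subcarrier. The 4-byte float size used for the daily estimate is an assumption, and the estimate ignores row and index overhead.

```python
def floats_per_second(frame_rate_hz, num_subcarriers, arrays_per_frame=2):
    """Floats written per second per device (amplitude + phase by default)."""
    return frame_rate_hz * num_subcarriers * arrays_per_frame


def minutes_of_data(num_frames, frame_rate_hz):
    """Wall-clock minutes represented by `num_frames` at a given frame rate."""
    return num_frames / frame_rate_hz / 60


rate = floats_per_second(20, 56)       # 20 Hz, 56 subcarriers
BYTES_PER_FLOAT = 4                    # assumed float32 storage
bytes_per_day = rate * BYTES_PER_FLOAT * 86_400
test_span_min = minutes_of_data(100_000, 20)
```

Under the float32 assumption a single device produces roughly 774 MB of raw array payload per day, which is why the missing retention policy matters.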
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| D-05 | P1 | Insert 100,000 CSI frames (simulating ~83 minutes of data at 20 Hz) into the database and measure query performance for time-range retrievals | Integration |
| D-06 | P1 | Trigger PostgreSQL failover to SQLite and confirm: (a) no data loss during transition, (b) API continues responding, (c) health endpoint reports "degraded" not "healthy" | Integration |
| D-07 | P2 | Insert CSI data with duplicate (device_id, sequence_number, timestamp_ns) and confirm the unique constraint fires with an appropriate error message | Unit |
| D-08 | P3 | Run 1,000 concurrent SQLite connections via the NullPool fallback and monitor for "database is locked" errors | Integration |
D3: Proof Data Integrity
Finding: The proof-of-reality system (archive/v1/data/proof/verify.py) is a deterministic pipeline verification tool. It feeds 1,000 synthetic CSI frames through the production CSI processor, hashes the output with SHA-256, and compares against a published hash. This is a strong engineering practice.
Risk: LOW
- The proof only exercises the Python v1 pipeline. The Rust port has no equivalent proof-of-reality check.
- The proof uses seed=42 for synthetic data generation. If numpy.random changes its RNG implementation across versions, the proof breaks without any pipeline code change.
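The idea behind a hash-anchored proof can be shown in a few lines. This is only an illustration of the technique, not a reproduction of verify.py. It uses the standard library random module rather than numpy, and synthetic_frames and pipeline_digest are hypothetical names.

```python
import hashlib
import random
import struct


def synthetic_frames(seed, num_frames=10, num_subcarriers=4):
    """Generate reproducible pseudo-random frames from a fixed seed."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(num_subcarriers)] for _ in range(num_frames)]


def pipeline_digest(frames):
    """Hash frames in a canonical byte layout so the digest is stable."""
    hasher = hashlib.sha256()
    for frame in frames:
        # Fixed-width little-endian doubles avoid platform-dependent repr() output.
        hasher.update(struct.pack(f"<{len(frame)}d", *frame))
    return hasher.hexdigest()


digest_a = pipeline_digest(synthetic_frames(42))
digest_b = pipeline_digest(synthetic_frames(42))
digest_other = pipeline_digest(synthetic_frames(43))
```

The digest is only as stable as the generator behind it, which is the risk flagged above. Committing the generated frames alongside the hash, as the sample_csi_data.json file suggests, removes the dependency on the RNG implementation.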
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| D-09 | P0 | Run verify.py with --audit flag to scan for mock/random patterns in the codebase that could compromise pipeline integrity | CI |
| D-10 | P1 | Create an equivalent proof-of-reality test for the Rust wifi-densepose-signal crate: feed the same 1,000 frames through CsiProcessor::new(config) and assert deterministic output | Unit |
I -- Interfaces
How the product CONNECTS
I1: REST API
Finding: The Python v1 exposes a FastAPI application with three router groups:
- /health/* -- Health, readiness, liveness, metrics, version (5 endpoints)
- /api/v1/pose/* -- Pose estimation endpoints
- /api/v1/stream/* -- Streaming endpoints
The Rust wifi-densepose-api crate is a 1-line stub. The wifi-densepose-mat crate has its own api module with an Axum router (create_router, AppState).
Risk: HIGH
- Two separate API implementations (Python FastAPI for v1, Rust Axum for MAT) with no shared contract or OpenAPI schema. A consumer cannot rely on interface consistency.
- The Python API's general exception handler returns a generic "Internal server error" for all unhandled exceptions in production, but logs the full traceback. If logs are not monitored, 500 errors go unnoticed.
- No API versioning enforcement: the prefix is configurable via settings.api_prefix but defaults to /api/v1. There is no v2 migration path documented.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| I-01 | P0 | Export OpenAPI spec from the Python FastAPI app and validate it against the actual endpoint behavior using Schemathesis or Dredd | E2E |
| I-02 | P1 | Send malformed JSON to every POST endpoint and confirm each returns HTTP 422 with validation error details, not 500 | Integration |
| I-03 | P1 | Hit the MAT Axum API and the Python FastAPI health endpoints in parallel and confirm they use compatible response schemas | Integration |
| I-04 | P2 | Send a request with Content-Type: text/xml to a JSON endpoint and confirm HTTP 415 Unsupported Media Type, not a 500 crash | Integration |
I2: WebSocket Protocol
Finding: The Python v1 has a WebSocket subsystem (connection_manager.py, pose_stream.py) for real-time pose data streaming. The connection manager tracks active connections and provides stats.
Risk: MEDIUM
- No WebSocket protocol specification (message format, heartbeat interval, reconnection policy).
- The connection_manager.shutdown() is called during cleanup but there is no graceful disconnect message sent to connected clients.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| I-05 | P1 | Connect 100 WebSocket clients simultaneously and confirm: (a) all receive pose data, (b) connection stats are accurate, (c) no memory leak over 60 seconds | Integration |
| I-06 | P1 | Disconnect a WebSocket client abruptly (TCP reset) and confirm the server cleans up the connection without leaking resources | Integration |
| I-07 | P2 | Send a malformed message over WebSocket and confirm the server rejects it without disconnecting the client | Integration |
I3: ESP32 Serial/UDP Protocol
Finding: The ESP32 firmware uses ADR-018 binary format for CSI frames sent over UDP. The firmware includes WiFi reconnection logic with exponential retry (up to MAX_RETRY=10), NVS configuration persistence, OTA update capability, and WASM runtime support.
The Rust Esp32CsiParser parses the binary frames from UDP bytes.
Risk: HIGH
- The ADR-018 binary protocol has no version field visible in the main.c header. If the protocol format changes, there is no way for the receiver to detect version mismatch.
- The UDP transport is fire-and-forget. There is no acknowledgment, no sequence gap detection documented in the receiver, and no backpressure mechanism.
- The stream_sender.c sends to a hardcoded or NVS-configured target IP. If the aggregator moves, the sensor is stranded until re-provisioned.
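Receiver-side gap detection over a lossy transport can be sketched as below. It assumes each frame carries a monotonically increasing sequence number that wraps at a fixed modulus. The 32-bit width and the SequenceTracker class are assumptions for illustration, not the ADR-018 wire format.

```python
class SequenceTracker:
    """Count frames lost between consecutive sequence numbers from one node."""

    def __init__(self, modulus=2**32):
        self.modulus = modulus  # assumed counter width; wraps back to 0
        self.last = None
        self.missing = 0

    def observe(self, seq):
        """Record a sequence number and return how many frames were skipped."""
        if self.last is None:
            self.last = seq
            return 0
        delta = (seq - self.last) % self.modulus
        if delta == 0 or delta > self.modulus // 2:
            # Duplicate or late (reordered) frame: ignore it.
            return 0
        gap = delta - 1
        self.last = seq
        self.missing += gap
        return gap


tracker = SequenceTracker()
gaps = [tracker.observe(seq) for seq in (1, 2, 5, 5, 4, 6)]
```

Keeping one tracker per sensor node would give the aggregator a per-node loss counter, which is the measurement test idea I-09 calls for.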
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| I-08 | P0 | Inject a CSI frame with a future/unknown protocol version byte and confirm the parser returns ParseError with a version mismatch message, not a crash | Unit |
| I-09 | P1 | Send 1,000 UDP CSI frames at 20 Hz from a simulated ESP32 and measure packet loss rate at the aggregator; assert < 1% loss on loopback | Integration |
| I-10 | P1 | Simulate network partition: stop sending UDP frames for 5 seconds, then resume. Confirm the aggregator recovers without manual intervention | Integration |
| I-11 | P2 | Send a UDP frame from a spoofed MAC address and confirm the aggregator either rejects or flags it (ADR-032 security hardening) | Integration |
I4: Inter-Crate Boundaries (Rust)
Finding: The Rust workspace has clear crate boundaries with pub use re-exports. The core traits (SignalProcessor, NeuralInference, DataStore) define contracts. However, some inter-crate communication uses concrete types rather than trait objects.
Risk: MEDIUM
- wifi-densepose-mat depends on wifi-densepose-signal::SignalError directly via #[from]. This couples the MAT error hierarchy to Signal internals.
- The wifi-densepose-train crate conditionally compiles 5 modules (losses, metrics, model, proof, trainer) behind the tch-backend feature. This means the training crate's public API surface changes dramatically based on feature flags.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| I-12 | P1 | Build wifi-densepose-mat with wifi-densepose-signal at a different version (e.g., mock a breaking change in SignalError) and confirm the type error is caught at compile time | Unit |
| I-13 | P2 | Compile wifi-densepose-train with and without tch-backend and diff the public API symbols; document the feature-gated surface area | Integration |
I5: CLI Interface
Finding: The Rust CLI (wifi-densepose-cli) provides subcommands for MAT operations: mat scan, mat status, mat survivors, mat alerts. Built with clap derive macros.
Risk: LOW
- CLI is narrowly scoped to MAT operations. No CLI for CSI data capture, signal processing, or model training.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| I-14 | P2 | Run wifi-densepose --help, wifi-densepose mat --help, and confirm all documented subcommands are present and help text is accurate | E2E |
| I-15 | P3 | Run wifi-densepose mat scan --zone "" (empty zone name) and confirm a user-friendly error, not a panic | Unit |
P -- Platform
What the product DEPENDS ON
P1: Multi-Platform Build Targets
Finding: The project targets 6 platforms:
- Linux x86_64 -- Primary development/server platform (CI runs here)
- Windows -- ESP32 firmware build requires special MSYSTEM env var stripping
- macOS -- CoreWLAN WiFi sensing (ADR-025), mac_wifi.swift in sensing module
- ESP32-S3 -- Xtensa dual-core, 8MB/4MB flash variants
- WASM (wasm32-unknown-unknown) -- Browser deployment via wasm-pack
- Desktop -- wifi-densepose-desktop crate (52 lines in lib.rs, minimal)
Explicitly unsupported: ESP32 (original) and ESP32-C3 (single-core, cannot run DSP pipeline).
Risk: HIGH
- The CI workflow (ci.yml) only runs on ubuntu-latest. No Windows, macOS, or ARM64 CI jobs for the Rust crates.
- The macOS CoreWLAN integration (mac_wifi.swift) exists in the Python sensing module but there are no tests or build validation for it.
- The openblas-static dependency in ndarray-linalg does not compile on wasm32-unknown-unknown, yet wifi-densepose-signal depends on it. This means any crate depending on signal cannot target WASM without feature gating.
- The firmware CI (firmware-ci.yml, firmware-qemu.yml) exists but the verify-pipeline.yml suggests a separate verification path.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| P-01 | P0 | Add macOS and Windows CI runners for cargo test --workspace --no-default-features to catch platform-specific compilation failures | CI |
| P-02 | P1 | Build wifi-densepose-wasm with wasm-pack build --target web in CI and confirm it produces a valid .wasm binary under 5 MB | CI |
| P-03 | P1 | Flash the 4MB firmware variant to an ESP32-S3 and confirm it boots, connects to WiFi, and streams CSI frames within 30 seconds | Hardware/Human |
| P-04 | P2 | Attempt to build the firmware for ESP32 (original, non-S3) and confirm the build fails with a clear error message stating that the target is unsupported | Integration |
P2: External Software Dependencies
Finding: The system depends on:
- PostgreSQL (primary database)
- Redis (caching, rate limiting -- optional)
- libtorch (PyTorch C++ backend -- optional via tch-backend feature)
- ONNX Runtime (ort crate)
- OpenBLAS (via ndarray-linalg)
- ESP-IDF v5.4 (firmware toolchain)
- wasm-pack (WASM build tool)
Risk: MEDIUM
- The PostgreSQL-to-SQLite failsafe is a good design but the SQLite fallback does not support all PostgreSQL features (e.g., UUID columns, array types via StringArray/FloatArray). The model_types.py file likely provides compatibility shims but this is an untested assumption.
- Redis is marked optional but the RateLimitMiddleware likely depends on it for distributed rate limiting. If Redis is down and rate limiting is enabled, what happens?
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| P-05 | P1 | Start the API with redis_enabled=True but Redis unavailable, and redis_required=False. Confirm the API starts, rate limiting degrades gracefully, and health reports "degraded" | Integration |
| P-06 | P1 | Insert a Device record via SQLite fallback with a UUID primary key and StringArray capabilities column; confirm round-trip read matches the write | Integration |
| P-07 | P2 | Run the full Python test suite on Python 3.12 (the CI uses 3.11) to catch forward-compatibility issues | CI |
P3: Hardware Compatibility
Finding: Supported hardware:
- ESP32-S3 (8MB flash) at ~$9
- ESP32-S3 SuperMini (4MB flash) at ~$6
- ESP32-C6 + Seeed MR60BHA2 (60 GHz FMCW mmWave) at ~$15
- HLK-LD2410 (24 GHz FMCW presence sensor) at ~$3
The ESP32-S3 is the primary sensing node. The mmWave sensors are auxiliary.
Risk: MEDIUM
- The 4MB flash variant (sdkconfig.defaults.4mb) may not have room for OTA + WASM runtime + display driver. Partition table conflicts are plausible but not tested in CI.
- The mmWave sensor integration (mmwave_sensor.c) exists in firmware but there are no tests validating the serial protocol parsing for the MR60BHA2 radar.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| P-08 | P1 | Build 4MB firmware with OTA + WASM + display all enabled and confirm the binary fits within the 4MB flash partition | CI |
| P-09 | P2 | Send synthetic MR60BHA2 serial output to the mmwave_sensor.c parser and confirm correct heart rate / breathing rate extraction | Unit |
O -- Operations
How the product is USED
O1: Deployment Model
Finding: No Dockerfile exists (only .dockerignore). CI includes cd.yml (continuous deployment) but deployment target is unknown. The firmware has a documented flash process using idf.py and a provisioning script (provision.py).
Risk: HIGH
- Without a Dockerfile, the Python v1 API has no standardized deployment. Server setup is manual and environment-specific.
- The firmware OTA update mechanism (ota_update.c) exists but the end-to-end update path (build -> sign -> distribute -> apply -> verify) is undocumented.
- No Kubernetes manifests, systemd service files, or other deployment automation.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| O-01 | P1 | Create a Docker image for the Python v1 API and confirm it starts, responds to /health/live, and connects to a PostgreSQL container | Integration |
| O-02 | P1 | Test the firmware OTA path: build a new firmware image, host it on HTTP, trigger OTA from the device, and confirm the device reboots with the new version | Hardware/Human |
| O-03 | P2 | Run wifi-densepose mat scan on a freshly provisioned ESP32-S3 and confirm end-to-end data flow from sensor to CLI output | E2E/Human |
O2: Monitoring and Observability
Finding: The Python API provides comprehensive health checks (/health/health, /health/ready, /health/live), system metrics (CPU, memory, disk, network via psutil), and per-component health status. The Rust crates use tracing for structured logging.
Risk: MEDIUM
- The health check calls psutil.cpu_percent(interval=1) which blocks for 1 second. This makes the health endpoint slow and potentially a bottleneck under load.
- The system metrics endpoint is available to unauthenticated users at /health/metrics. Only "detailed metrics" require authentication.
- There is no distributed tracing (e.g., OpenTelemetry) for correlating requests across the Python API, ESP32 firmware, and potential Rust services.
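One way to keep a blocking sampler from serializing health requests is to push it onto a worker thread. In this sketch, sample_cpu_blocking is a stand-in that sleeps for 0.2 s instead of calling psutil, and the handler is a plain coroutine rather than the project's FastAPI route. Both are assumptions for illustration.

```python
import asyncio
import time


def sample_cpu_blocking(interval=0.2):
    """Stand-in for a blocking sampler such as psutil.cpu_percent(interval=1)."""
    time.sleep(interval)
    return 0.0  # placeholder reading


async def health():
    # Run the blocking call in a worker thread so the event loop stays free.
    cpu = await asyncio.to_thread(sample_cpu_blocking)
    return {"status": "healthy", "cpu_percent": cpu}


async def concurrent_health(n):
    return await asyncio.gather(*(health() for _ in range(n)))


def timed_run(n):
    """Issue n concurrent health calls and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(concurrent_health(n))
    return results, time.perf_counter() - start
```

Five concurrent calls should finish in roughly one sampling interval rather than five. psutil also documents a non-blocking mode (interval=None) that compares against the previous call, which avoids the wait entirely at the cost of needing a warm-up call.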
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| O-04 | P1 | Call /health/health 10 times concurrently and confirm total wall-clock time stays well below 10 seconds, i.e., the 1-second cpu_percent blocks overlap rather than run back to back | Integration |
| O-05 | P2 | Confirm /health/metrics does not expose PII, database credentials, or internal IP addresses in the response body | Security/E2E |
O3: User Workflows
Finding: Primary user workflows:
- Researcher: Configure sensors -> Collect CSI data -> Train model -> Evaluate
- Disaster responder: Deploy sensors -> Start MAT scan -> Monitor survivors -> Triage
- Developer: Clone repo -> Build -> Run tests -> Submit PR
Risk: MEDIUM
- The disaster responder workflow is safety-critical. A false negative (missing a survivor) has life-or-death consequences. The system should have explicit false negative rate metrics but none are defined.
- The developer workflow requires installing OpenBLAS, potentially libtorch, and ESP-IDF v5.4. No `devcontainer.json` or `nix-shell` to standardize the development environment.
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| O-06 | P0 | Run the complete developer setup workflow from a clean Ubuntu 22.04 VM: clone, install deps, `cargo test --workspace --no-default-features`, `python archive/v1/data/proof/verify.py` -- measure total setup time and document any manual steps | Human Exploration |
| O-07 | P1 | Simulate a MAT scan with 5 survivors at varying signal strengths (strong, weak, borderline) and confirm the triage classification matches expected START protocol categories | Integration |
O4: Extreme Use
Finding: No load testing, stress testing, or chaos engineering infrastructure exists.
Risk: HIGH
- The system targets disaster response scenarios where multiple ESP32 nodes stream simultaneously. The aggregator's behavior under 10+ concurrent node streams is unknown.
- The database writes CSI data at 20 Hz per device. With 10 devices, that is 200 inserts/second of array data into PostgreSQL.
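The write-rate figure above is simple arithmetic, sketched here so it can be re-run for other fleet sizes. The sketch assumes one database row per CSI frame, which is an assumption about the schema rather than something confirmed by the source; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def insert_rate(devices: int, frame_rate_hz: int) -> int:
    """Inserts per second, assuming one row per CSI frame."""
    return devices * frame_rate_hz


def rows_per_day(devices: int, frame_rate_hz: int) -> int:
    """Rows accumulated over 24 hours of continuous streaming."""
    return insert_rate(devices, frame_rate_hz) * SECONDS_PER_DAY


if __name__ == "__main__":
    for devices in (1, 10, 20):
        rate = insert_rate(devices, 20)
        print(f"{devices:>2} devices: {rate} inserts/s, {rows_per_day(devices, 20):,} rows/day")
```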
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| O-08 | P1 | Simulate 10 ESP32 nodes streaming at 20 Hz to the aggregator and measure: packet loss, processing latency per frame, memory growth over 5 minutes | Performance |
| O-09 | P2 | Fill the CSI history deque to `max_history_size=500`, append one more frame, and confirm the oldest entry is evicted rather than the deque growing toward an OOM | Unit |
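The eviction behavior targeted by O-09 can be sketched as follows, assuming the history is a `collections.deque` created with `maxlen`. That is an assumption about the implementation; a deque without `maxlen` grows without bound. `fill_history` is a hypothetical helper.

```python
from collections import deque

MAX_HISTORY_SIZE = 500  # value named in O-09


def fill_history(frame_count: int, max_history_size: int = MAX_HISTORY_SIZE) -> deque:
    """Append `frame_count` sequence numbers to a bounded history."""
    history = deque(maxlen=max_history_size)
    for sequence_number in range(frame_count):
        history.append(sequence_number)  # drops the oldest entry once full
    return history


if __name__ == "__main__":
    history = fill_history(501)
    print(len(history), history[0], history[-1])
```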
T -- Time
WHEN things happen
T1: Real-Time Processing
Finding: The RuvSense pipeline targets 20 Hz output (50ms per TDMA cycle). The vital signs extraction uses sample rates of 100 Hz with 30-second windows. The CSI processor uses configurable sampling_rate, window_size, and overlap.
Risk: CRITICAL
- No latency benchmarks exist anywhere in the codebase. The 20 Hz target implies each frame must be processed in < 50ms including multi-band fusion, phase alignment, multistatic fusion, coherence gating, and pose tracking. This budget has never been measured.
- The Python `process_csi_data` method is `async`, but all the numpy operations inside are synchronous and CPU-bound. The `await` is cosmetic -- it does not yield to the event loop during computation.
- The Doppler extraction iterates over the phase cache on every call. With `max_history_size=500`, this means constructing a 500-element numpy array from a deque on each frame.
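The "cosmetic `await`" point can be made observable with a heartbeat task that records the longest gap between its own ticks. This is a minimal sketch assuming Python 3.9+; `cpu_bound_stage` is a hypothetical busy-loop stand-in for the synchronous numpy work, and none of these names come from the codebase.

```python
import asyncio
import time


def cpu_bound_stage(duration_s: float) -> None:
    """Hypothetical stand-in for synchronous numpy work: spins without yielding."""
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        pass


async def process_inline(duration_s: float) -> None:
    # Declared async, but nothing inside awaits, so the event loop is blocked.
    cpu_bound_stage(duration_s)


async def process_offloaded(duration_s: float) -> None:
    # Runs the same work in a worker thread so the loop can keep scheduling.
    await asyncio.to_thread(cpu_bound_stage, duration_s)


async def longest_stall(worker, duration_s: float = 0.2, tick_s: float = 0.005) -> float:
    """Return the longest gap between heartbeat ticks while `worker` runs."""
    worst = 0.0

    async def heartbeat() -> None:
        nonlocal worst
        last = time.perf_counter()
        while True:
            await asyncio.sleep(tick_s)
            now = time.perf_counter()
            worst = max(worst, now - last)
            last = now

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)           # let the heartbeat take its first timestamp
    await worker(duration_s)
    await asyncio.sleep(tick_s * 2)  # give the heartbeat a chance to record the gap
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return worst


def measure(worker) -> float:
    return asyncio.run(longest_stall(worker))


if __name__ == "__main__":
    print(f"inline stall:    {measure(process_inline) * 1000:.0f} ms")
    print(f"offloaded stall: {measure(process_offloaded) * 1000:.0f} ms")
```

With pure-Python work the worker thread still competes for the GIL, so the offloaded stall is reduced rather than eliminated. NumPy releases the GIL in many operations, and a process pool is another option if it does not in the hot path.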
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| T-01 | P0 | Benchmark the Rust `RuvSensePipeline` end-to-end latency for a single frame with 4 nodes and 56 subcarriers; assert total processing time < 50ms on x86_64 | Benchmark |
| T-02 | P0 | Benchmark the Python `CSIProcessor.process_csi_data` method for a single frame and assert it completes in < 25ms (leaving budget for I/O and networking) | Benchmark |
| T-03 | P1 | Profile the Doppler extraction path with `max_history_size=500`: measure time spent in `list(self._phase_cache)` and `np.array(cache_list[-window:])` | Benchmark |
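| T-04 | P1 | Run the Python CSI processor under `asyncio.run(..., debug=True)` and confirm it does not block the event loop for > 10ms per frame; set the loop's `slow_callback_duration` to 0.01 so that asyncio logs any slower callback (the setting only takes effect in debug mode) | Integration |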
T2: Concurrency
Finding: The Rust system uses tokio for async runtime with features = ["full"]. The Python API uses FastAPI (async) with uvicorn workers. The ESP32 firmware uses FreeRTOS tasks. The DisasterResponse::running flag uses AtomicBool for thread-safe scanning control.
Risk: HIGH
- The `DisasterResponse` struct is not `Send + Sync` safe by default (it contains `dyn EventStore` behind an `Arc`, but the struct itself is not wrapped in a `Mutex`). If `start_scanning` is called from multiple threads, the mutable self-reference causes a data race.
- The Python `get_database_manager` uses a module-level global `_db_manager` with no thread-safety protection. With multiple uvicorn workers, each worker gets its own instance (process isolation), but within a single worker, concurrent requests could race on initialization.
- The ESP32 firmware uses FreeRTOS event groups for WiFi state but the CSI callback runs in the WiFi driver context. If the callback takes too long (e.g., edge processing), it blocks WiFi reception.
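One way to close the initialization race within a single worker is double-checked locking around the lazy construction. The sketch below is illustrative only: this `DatabaseManager` is a hypothetical stand-in that counts `initialize()` calls, and `ManagerProvider` is not a name from the codebase. It also doubles as the shape of a T-06 style check.

```python
import asyncio
from typing import List, Optional


class DatabaseManager:
    """Hypothetical stand-in; counts how many times initialize() actually runs."""

    init_calls = 0

    async def initialize(self) -> None:
        await asyncio.sleep(0.01)  # simulated connection setup; yields to other tasks
        DatabaseManager.init_calls += 1


class ManagerProvider:
    """Lazily creates one DatabaseManager, guarded by double-checked locking."""

    def __init__(self) -> None:
        self._manager: Optional[DatabaseManager] = None
        self._lock = asyncio.Lock()

    async def get(self) -> DatabaseManager:
        if self._manager is None:            # fast path once initialized
            async with self._lock:
                if self._manager is None:    # re-check after acquiring the lock
                    manager = DatabaseManager()
                    await manager.initialize()
                    self._manager = manager  # publish only after init succeeds
        return self._manager


async def race(task_count: int = 10) -> List[DatabaseManager]:
    provider = ManagerProvider()  # created inside the running loop
    return await asyncio.gather(*(provider.get() for _ in range(task_count)))


if __name__ == "__main__":
    managers = asyncio.run(race())
    print(DatabaseManager.init_calls, len({id(m) for m in managers}))
```

An `asyncio.Lock` only coordinates tasks on one event loop. If the accessor can also be reached from OS threads, a `threading.Lock` (or initialization at application startup) would be needed instead.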
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| T-05 | P0 | Run `cargo test` under Miri (or ThreadSanitizer) for the `wifi-densepose-mat` crate to detect data races in `DisasterResponse` | CI |
| T-06 | P1 | Call `DatabaseManager.initialize()` concurrently from 10 async tasks and confirm only one initialization occurs (no double-init race) | Integration |
| T-07 | P1 | Measure the CSI callback execution time on ESP32 and confirm it completes in < 1ms to avoid blocking the WiFi driver | Hardware/Benchmark |
| T-08 | P2 | Start and stop `DisasterResponse::start_scanning` from two different tokio tasks simultaneously and confirm no panic or deadlock | Unit |
T3: Scheduling and Timeouts
Finding: The MAT scan interval is configurable (scan_interval_ms, default 500ms, minimum 100ms). The database connection pool has pool_timeout=30s and pool_recycle=3600s. Redis has socket_timeout=5s and connect_timeout=5s.
Risk: MEDIUM
- The ESP32 WiFi reconnection has `MAX_RETRY=10` but no backoff strategy. Ten rapid reconnection attempts could flood the AP.
- No timeout on the `scan_cycle` method itself. If detection takes longer than `scan_interval_ms`, cycles overlap without back-pressure.
- The `pool_recycle=3600` means database connections are recycled every hour. In a long-running deployment, this causes periodic connection churn.
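For the missing backoff, one common option is capped exponential backoff with full jitter. The firmware itself is C, so the sketch below only illustrates the delay schedule in Python; `base_s` and `cap_s` are hypothetical parameters chosen for the example, and only the retry limit of 10 comes from the finding.

```python
import random
from typing import List, Optional

MAX_RETRY = 10  # retry limit named in the finding


def backoff_delays(
    max_retry: int = MAX_RETRY,
    base_s: float = 0.5,    # hypothetical first-attempt ceiling
    cap_s: float = 30.0,    # hypothetical upper bound on any single delay
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one randomized delay per retry attempt (capped exponential, full jitter)."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retry):
        ceiling = min(cap_s, base_s * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))  # spread nodes out to avoid synchronized retries
    return delays


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(rng=random.Random(0)), start=1):
        print(f"attempt {attempt}: wait {delay:.2f}s")
```

The jitter matters in a multi-node deployment: without it, nodes that lose the AP at the same moment retry in lockstep.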
Test Ideas:
| # | Priority | Test Idea | Automation |
|---|---|---|---|
| T-09 | P1 | Set `scan_interval_ms=100` (minimum) and run a scan cycle that takes 200ms to complete; confirm the system does not accumulate a backlog of overlapping cycles | Unit |
| T-10 | P2 | Simulate 10 WiFi disconnects in rapid succession on ESP32 and confirm the retry counter increments correctly and stops at MAX_RETRY=10 | Integration/Hardware |
| T-11 | P2 | Keep the API running for 2 hours and confirm database pool recycling does not cause request failures during connection rotation | Integration |
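The expected outcome of T-09 depends on which back-pressure policy the scanner is meant to follow, which the source does not state. The sketch below models one candidate policy: never overlap, and start the next cycle at the first interval boundary at or after the previous cycle finishes, dropping the ticks in between. `schedule_cycles` is a hypothetical helper, and durations are assumed to be positive.

```python
from typing import List, Tuple


def schedule_cycles(interval_ms: int, durations_ms: List[int]) -> Tuple[List[int], int]:
    """Return (start times in ms, number of dropped ticks) under a skip-missed-ticks policy."""
    starts = []
    skipped = 0
    next_start = 0
    for duration in durations_ms:
        starts.append(next_start)
        finish = next_start + duration
        # Earliest interval boundary at or after the finish time (ceiling division).
        boundary = -(-finish // interval_ms) * interval_ms
        # Ticks that fell while this cycle was still running are dropped, not queued.
        skipped += max(0, (boundary - next_start) // interval_ms - 1)
        next_start = boundary
    return starts, skipped


if __name__ == "__main__":
    starts, skipped = schedule_cycles(100, [200, 200, 50, 50])
    print(starts, skipped)
```

With a 100 ms interval and two 200 ms cycles, this policy drops one tick per slow cycle and never accumulates a backlog, which gives T-09 a concrete assertion to check against.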
Product Coverage Outline (PCO)
| # | Testable Element | Reference | Product Factor(s) |
|---|---|---|---|
| 1 | Cargo workspace build integrity | Cargo.toml, 18 crates | Structure |
| 2 | WASM-edge crate exclusion gap | Cargo.toml exclude | Structure |
| 3 | Dependency vulnerability surface | 30+ external crates | Structure |
| 4 | CSI processing pipeline determinism | csi_processor.py, verify.py | Function, Data |
| 5 | Human detection accuracy | _calculate_detection_confidence | Function |
| 6 | Vital signs extraction boundaries | BreathingExtractor, HeartRateExtractor | Function, Data |
| 7 | MAT ensemble classification | EnsembleClassifier | Function |
| 8 | Error chain preservation | CSIProcessingError, MatError | Function |
| 9 | Event store silent error discard | scan_cycle let _ = | Function |
| 10 | Authentication and secrets management | Settings.secret_key, AuthMiddleware | Function |
| 11 | Readiness probe accuracy | /health/ready hardcoded True | Function, Interfaces |
| 12 | State machine transition enforcement | DeviceStatus, SessionStatus | Function |
| 13 | CSI data shape validation | CSIData ndarray shapes | Data |
| 14 | ESP32 binary protocol parsing | Esp32CsiParser | Data, Interfaces |
| 15 | Database failover correctness | PostgreSQL -> SQLite | Data, Platform |
| 16 | Proof-of-reality cross-platform | verify.py, Rust equivalent | Data |
| 17 | REST API contract consistency | FastAPI, Axum MAT API | Interfaces |
| 18 | WebSocket connection management | connection_manager.py | Interfaces |
| 19 | UDP CSI transport reliability | stream_sender.c, aggregator | Interfaces |
| 20 | Cross-platform compilation | Linux, macOS, Windows, WASM, ESP32 | Platform |
| 21 | Hardware compatibility matrix | ESP32-S3 4MB/8MB, mmWave | Platform |
| 22 | External service dependencies | PostgreSQL, Redis, libtorch | Platform |
| 23 | Deployment automation | Missing Dockerfile | Operations |
| 24 | OTA firmware update path | ota_update.c | Operations |
| 25 | Health endpoint performance | psutil.cpu_percent blocking | Operations |
| 26 | Multi-node stress testing | 10+ concurrent ESP32 streams | Operations, Time |
| 27 | Real-time latency budget | 50ms target at 20 Hz | Time |
| 28 | Async processing correctness | CPU-bound in async context | Time |
| 29 | Thread safety and data races | DisasterResponse, DatabaseManager | Time |
| 30 | Scan cycle timing overlap | scan_interval_ms vs processing time | Time |
Test Data Suggestions
Test Data for Structure-Based Tests
- Cargo.toml with intentionally broken dependency versions to test build failure modes
- `.rs` files at exactly 500 lines and 501 lines to test line-count policy enforcement
- A workspace member list with a typo in the path to test error reporting
Test Data for Function-Based Tests
- 1,000 CSI frames from `sample_csi_data.json` as baseline input
- Synthetic CSI frames with known Doppler shifts (1 Hz, 2 Hz, 5 Hz, 10 Hz)
- Vital signs signals at physiological extremes: 8 bpm breathing (sleep apnea boundary), 200 bpm heart rate (tachycardia)
- Empty CSI frames (all zeros), single-subcarrier frames, maximum-subcarrier frames (256)
- EnsembleClassifier inputs at confidence boundary: 0.499, 0.500, 0.501
Test Data for Data-Based Tests
- 100,000 CSI frames for database stress testing (~83 minutes at 20 Hz)
- Duplicate `(device_id, sequence_number, timestamp_ns)` tuples for constraint testing
- CSIData with mismatched array shapes (`amplitude.shape != (num_antennas, num_subcarriers)`)
- SQLite database files at 100 MB, 1 GB, and 10 GB for scaling tests
Test Data for Interface-Based Tests
- Valid and malformed ADR-018 binary frames (truncated, corrupted, oversized)
- Spoofed MAC addresses in UDP frames for security testing
- 100 concurrent WebSocket connections with varying message rates
- OpenAPI specification exported from FastAPI for contract validation
Test Data for Platform-Based Tests
- Cross-compiled binaries for aarch64, x86_64, wasm32
- ESP32-S3 4MB partition tables with all features enabled (should overflow)
- MR60BHA2 radar serial output samples (synthetic)
Test Data for Operations-Based Tests
- Docker compose configuration with PostgreSQL + Redis + API
- Firmware OTA images (valid, corrupted, oversized)
- 10-node ESP32 mesh simulation traffic capture
Test Data for Time-Based Tests
- CSI frames with monotonically increasing timestamps at exactly 50ms intervals
- CSI frames with jittered timestamps (+/- 10ms, +/- 25ms, +/- 50ms)
- Phase cache at sizes: 0, 1, 2, 63, 64, 65, 499, 500 (boundary values for Doppler window)
Suggestions for Exploratory Test Sessions
Exploratory Test Sessions: Structure
- Session: Crate Dependency Graph Walk -- Starting from `wifi-densepose-cli`, trace every transitive dependency and look for diamond dependencies, version conflicts, or unnecessary coupling between crates that should be independent.
- Session: Feature Flag Combinatorics -- Systematically toggle feature flags on `wifi-densepose-train` (tch-backend on/off) and `wifi-densepose-core` (std/serde/async) and build each combination. Look for compilation failures, missing exports, or confusing error messages.
Exploratory Test Sessions: Function
- Session: Detection Confidence Calibration -- Feed the CSI processor a sequence of frames that transitions from empty room to one person to two people. Observe how the confidence score evolves. Look for oscillation, slow convergence, or failure to distinguish scenarios.
- Session: MAT Disaster Scenario Walkthrough -- Set up a full MAT scan with 3 zones, inject synthetic CSI data representing 5 survivors at varying depths (0.5m, 2m, 5m). Observe triage classification, alert generation, and event store entries. Look for missing events or incorrect triage.
Exploratory Test Sessions: Data
- Session: Database Failover Chaos -- Start the API with PostgreSQL, insert data, kill PostgreSQL, observe failover to SQLite, insert more data, restart PostgreSQL, and examine whether the system recovers. Look for data loss, schema incompatibilities, or stuck states.
- Session: Proof of Reality Deep Dive -- Run `verify.py --verbose` and `verify.py --audit` on a fresh checkout. Modify one line of `csi_processor.py` (e.g., change a threshold) and re-run verify. Look for how quickly the hash changes and whether the error message identifies what changed.
Exploratory Test Sessions: Interfaces
- Session: API Fuzzing Marathon -- Use `schemathesis` or `restler` against the running FastAPI application for 30 minutes. Focus on edge cases: empty bodies, huge payloads (10 MB JSON), unicode in string fields, negative numbers in integer fields. Track every 500 response.
- Session: ESP32 Protocol Mismatch Hunt -- Capture real UDP traffic from an ESP32-S3, modify bytes at various offsets, and feed them to the `Esp32CsiParser`. Look for panics, undefined behavior, or incorrect but accepted frames.
Exploratory Test Sessions: Platform
- Session: macOS CoreWLAN Availability -- On a macOS machine, attempt to use the `mac_wifi.swift` sensing module. Look for compilation issues, missing entitlements, or WiFi permission dialogs that block unattended operation.
- Session: WASM in Browser -- Build `wifi-densepose-wasm` and load it in Chrome, Firefox, and Safari. Call `MatDashboard` methods from the JavaScript console. Look for WASM memory limits, missing `web-sys` features, or browser-specific failures.
Exploratory Test Sessions: Operations
- Session: First-Time Setup Experience -- Follow the README as a new developer on a clean Ubuntu 22.04 VM. Document every step that fails, every missing dependency, and every confusing error. Measure total time from `git clone` to first passing test.
- Session: Firmware Provisioning End-to-End -- Use the `provision.py` script to configure a real ESP32-S3 with WiFi credentials. Monitor serial output. Disconnect and reconnect. Look for edge cases in NVS persistence, WiFi credential storage, and recovery from bad configuration.
Exploratory Test Sessions: Time
- Session: Latency Budget Profiling -- Instrument the Rust `RuvSensePipeline` with `tracing` spans on each stage (multiband, phase_align, multistatic, coherence, pose_tracker). Run 1,000 frames and produce a flame graph. Identify which stage consumes the most of the 50ms budget.
- Session: Concurrent Scanning Stress -- Start `DisasterResponse::start_scanning` with `continuous_monitoring=true` and `scan_interval_ms=100`. While scanning, call `push_csi_data` from a separate thread at 200 Hz. Look for data races, queue overflow, or missed scans.
Clarifying Questions
Suggestions based on general risk patterns and analysis of the existing codebase:
Structure
- What is the intended relationship between the Python v1 API and the Rust `wifi-densepose-api` stub? Is the Rust API planned to replace Python, or will they coexist?
- Why is `wifi-densepose-wasm-edge` excluded from the workspace? Are its tests run in a separate CI job, or are they not run at all?
Function
- What is the acceptable false positive rate for human detection? What is the acceptable false negative rate for MAT survivor detection? These are not documented anywhere.
- The `HeartRateExtractor` bandpass filter starts at 0.8 Hz (48 bpm). Is this intentional, given that athletic resting heart rates can be 40 bpm (0.67 Hz)?
- The `smoothing_factor` of 0.9 introduces ~500ms lag at 20 Hz. Is this acceptable for the pose tracking use case, or should it be configurable per-mode?
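The ~500ms figure in the smoothing question can be checked with a short calculation. The sketch assumes the smoother is a first-order exponential moving average of the form y[n] = a*y[n-1] + (1-a)*x[n] with a = `smoothing_factor`; that form is an assumption, not something confirmed from the code. Under it, the time constant is -1/ln(a) samples, which the common approximation 1/(1-a) rounds to 10 samples, or 500 ms at 20 Hz.

```python
import math


def ema_time_constant_samples(smoothing_factor: float) -> float:
    """Samples for the step response of y[n] = a*y[n-1] + (1-a)*x[n] to reach 1 - 1/e."""
    return -1.0 / math.log(smoothing_factor)


def ema_lag_ms(smoothing_factor: float, frame_rate_hz: float) -> float:
    """Time constant expressed in milliseconds at the given frame rate."""
    return ema_time_constant_samples(smoothing_factor) * 1000.0 / frame_rate_hz


if __name__ == "__main__":
    exact = ema_lag_ms(0.9, 20.0)
    approx = 1.0 / (1.0 - 0.9) * 1000.0 / 20.0
    print(f"exact: {exact:.1f} ms, approximation: {approx:.0f} ms")
```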
Data
- What is the data retention policy for CSI frames in PostgreSQL? At 20 Hz per device, storage grows at ~2.7 GB/day per device (estimated). Who is responsible for archival?
- Is there a plan to create a Rust-equivalent proof-of-reality test to ensure the Rust signal processing pipeline matches the Python pipeline output?
Interfaces
- Does the ADR-018 binary protocol include a version byte? If the firmware and server are at different protocol versions, how is this detected?
- What is the WebSocket message format for pose data streaming? Is it documented in an ADR or schema file?
- Is there authentication on the UDP CSI data stream, or can any device on the network inject frames into the aggregator?
Platform
- Is ARM64 (e.g., Raspberry Pi 4/5) a supported deployment target for the server? If so, has `openblas-static` been validated on ARM64?
- Are there plans for an Android or iOS mobile app, or is the `wifi-densepose-desktop` crate the only non-server deployment target?
Operations
- Is there a Docker image on Docker Hub as mentioned in the pre-merge checklist? If so, what is the image name and how is it built?
- What is the firmware signing process for OTA updates? Is there a code-signing key, and how is it managed?
- Who monitors the `/health/health` endpoint in production? Is there an alerting integration (PagerDuty, Opsgenie, etc.)?
Time
- Has the 20 Hz (50ms per frame) latency budget ever been measured on actual hardware with real CSI data? What is the measured P99 latency?
- What happens when `scan_cycle` takes longer than `scan_interval_ms`? Does the next cycle start immediately, or is there a backlog mechanism?
- The ESP32 CSI callback runs in the WiFi driver context. What is the maximum allowed execution time before WiFi reception is impacted?
Assessment Quality Metrics
| Metric | Value | Target | Status |
|---|---|---|---|
| SFDIPOT categories covered | 7/7 | 7/7 | PASS |
| Test ideas generated | 57 | 50+ | PASS |
| P0 (Critical) | 10 (17.5%) | 8-12% | PASS (slightly above due to safety-critical MAT domain) |
| P1 (High) | 20 (35.1%) | 20-30% | PASS |
| P2 (Medium) | 20 (35.1%) | 35-45% | PASS |
| P3 (Low) | 7 (12.3%) | 20-30% | BELOW (complex system with fewer trivial tests) |
| Automation: Unit | 22 (38.6%) | 30-40% | PASS |
| Automation: Integration | 19 (33.3%) | -- | PASS |
| Automation: E2E | 5 (8.8%) | <=50% | PASS |
| Automation: Benchmark | 5 (8.8%) | -- | N/A |
| Automation: Human Exploration | 6 (10.5%) | >=10% | PASS |
| Clarifying questions | 18 | 10+ | PASS |
| Exploratory sessions | 14 | 7+ (one per factor) | PASS |
Priority Summary: Top 10 Actions
- T-01/T-02 (P0): Benchmark real-time processing latency against the 50ms budget. The entire system's viability depends on this.
- F-01/F-02 (P0): Establish baseline false positive/negative rates for human detection with known test data.
- T-05 (P0): Run ThreadSanitizer on the MAT crate to detect data races in the multi-threaded scanning path.
- P-01 (P0): Add macOS and Windows CI runners. A 6-platform project tested on 1 platform is a risk multiplier.
- I-08 (P0): Add protocol version detection to the ESP32 parser to prevent silent data corruption from version mismatches.
- S-08/D-09 (P0): Ensure proof-of-reality runs on every PR touching the signal processing pipeline.
- F-12 (P0): Validate that weak secrets are rejected at startup, not silently accepted.
- O-06 (P0): Document and automate the developer setup experience. A system this complex needs reproducible environments.
- F-04 (P1): Test MAT ensemble classifier at confidence boundaries. In disaster response, boundary behavior determines life-or-death decisions.
- I-01 (P0): Generate and validate OpenAPI contract. Two API implementations (Python + Rust) without a shared contract will inevitably diverge.
Assessment generated using James Bach's HTSM Product Factors framework (SFDIPOT). All findings are based on static analysis of the codebase at commit 85434229 on the qe-reports branch. Risk ratings reflect both probability and impact, with the MAT safety-critical use case amplifying severity for all Function and Time findings.