diff --git a/docs/research/architecture/decision-tree.md b/docs/research/architecture/decision-tree.md new file mode 100644 index 00000000..ed3b1ddd --- /dev/null +++ b/docs/research/architecture/decision-tree.md @@ -0,0 +1,205 @@ +# Three-Tier Node — Decision Tree + +| Field | Value | +|--------------|------------------------------------------------------------------------| +| **Status** | Reference — informs whether/how to adopt the three-tier proposal | +| **Date** | 2026-04-25 | +| **Companion**| `architecture/three-tier-rust-node.md`, `sota/2026-Q2-rf-sensing-and-edge-rust.md` | + +This document maps each load-bearing decision in the three-tier proposal +to (a) what it depends on, (b) what evidence would justify yes/no, and +(c) which ADR slot would house the decision once made. It is intentionally +short — the prose lives in the SOTA survey and the seed exploration. + +--- + +## 1. Load-bearing vs independent decisions + +Six decisions are **load-bearing** — they unblock or block other +decisions: + +| # | Decision | Blocks | +|----|----------------------------------|------------------------------------------| +| L1 | Per-node BOM ceiling | Hardware split, Pi shape, all ADRs below | +| L2 | Single-MCU vs dual-MCU node | Sensor-MCU runtime, ISR strategy | +| L3 | One-Pi-per-node vs one-per-cluster | OTA shape, secure-boot story, BOM | +| L4 | CSI no_std maturity gate | Sensor-MCU language choice | +| L5 | Mesh control-plane technology | Comms MCU choice (S3 vs C6) | +| L6 | Heavy-compute SoC choice | Secure-boot path, ML model class | + +Five decisions are **independent** of the three-tier shape and can be +made in parallel: + +| # | Decision | +|----|----------------------------------| +| I1 | LoRa fallback chip (SX1262 vs LR1121) | +| I2 | Charger / PMIC (BQ24074 vs BQ25798) | +| I3 | QUIC vs MQTT-over-TLS for backhaul | +| I4 | OTA mechanism per die | +| I5 | Provisioning protocol (BLE vs USB) | + +--- + +## 2. 
Decision tree (Mermaid) + +```mermaid +flowchart TD + L1{"L1: BOM ceiling per node?"} + L1 -->|"<= $15"| KEEP_TODAY["Keep ADR-028 single-S3 node.
Three-tier proposal is out of budget."] + L1 -->|"$15-$30"| L3 + L1 -->|"> $30"| L3 + + L3{"L3: Heavy compute per node
or per cluster?"} + L3 -->|"per cluster (1 Pi / 3-6 nodes)"| HYBRID["Hybrid path:
single-S3 sensor + cluster Pi.
Cheapest viable upgrade."] + L3 -->|"per node"| L2 + + L2{"L2: Single-MCU or dual-MCU
per node?"} + L2 -->|"single MCU"| L4_SINGLE["ADR-081 already covers this.
Investigate WHY a dual-MCU is needed."] + L2 -->|"dual MCU (sensor + comms)"| L4 + + L4{"L4: Is no_std CSI capture
production-quality?"} + L4 -->|"no / unknown"| L4_NO["Hold dual-MCU shape until
esp-csi-rs / esp-radio matches
esp_wifi_set_csi_rx_cb in jitter & quality."] + L4 -->|"yes (benchmarked)"| L5 + + L5{"L5: Mesh control plane:
WiFi or 802.15.4?"} + L5 -->|"WiFi (ESP-WIFI-MESH)"| L5_WIFI["Comms MCU = ESP32-S3.
Stays on existing ADR-029 shape."] + L5 -->|"802.15.4 (Thread)"| L5_THREAD["Comms MCU = ESP32-C6.
Hybrid: WiFi data + Thread control."] + + L6{"L6: Heavy compute SoC?"} + L6 -->|"Pi Zero 2W"| L6_ZERO["dm-verity + signed FIT.
NOT immutable-ROM secure boot."] + L6 -->|"CM4 / Pi 5"| L6_CM4["RPi-foundation secure boot path.
+~$30-50 BOM."] + + HYBRID --> L6 + L5_WIFI --> L6 + L5_THREAD --> L6 + + L4_NO -.->|"if gated long-term"| HYBRID + + style KEEP_TODAY fill:#cfe + style HYBRID fill:#cfe + style L4_NO fill:#fec + style L4_SINGLE fill:#cfe +``` + +The tree's recommended cheapest-first path is: +**L1 → L3 (per-cluster) → HYBRID**, which keeps today's ESP32-S3 sensor +nodes and adds one Pi per 3–6 nodes. This captures most of the QUIC / +ML / secure-boot value without re-spinning the per-node PCB. + +--- + +## 3. Decision detail — what evidence justifies each branch + +### L1 — Per-node BOM ceiling + +| Branch | Evidence required | ADR slot | +|-----------------------|--------------------------------------------------------------------|--------------------------------------| +| ≤ $15 | Today's $9 BOM, ADR-028 witness; deployment-cost analysis | No new ADR — keep ADR-028 baseline | +| $15–$30 | Cost analysis showing single-MCU + cluster-Pi path < $30 | New ADR (e.g., ADR-083) | +| > $30 | Deployment-cost analysis showing per-node Pi pays for itself | Two ADRs (per-node Pi, BOM revision) | + +### L2 — Single vs dual MCU per node + +| Branch | Evidence required | ADR slot | +|--------------|--------------------------------------------------------------------------------------------|--------------------------------| +| Single MCU | ADR-081 5-layer kernel measurements (already 60 byte feature packets, 0.003% CPU at 5 Hz) | No new ADR — keep ADR-081 | +| Dual MCU | Measured ISR-jitter problem on single-MCU node; or no_std-CSI maturity demonstrated | New ADR (firmware split) | + +### L3 — Per-node vs per-cluster heavy compute + +| Branch | Evidence required | ADR slot | +|---------------|-----------------------------------------------------------------------------------------------|--------------------------------| +| Per cluster | Throughput math: 6 nodes × 5 Hz × 60 B = 1.8 KB/s per cluster; well within USB/Ethernet to Pi | New ADR (cluster-Pi shape) | +| Per node | Need: per-node ML, 
per-node QUIC, per-node secure boot, deployment without LAN gateway | New ADR (per-node Pi shape) | + +### L4 — CSI no_std maturity gate + +| Branch | Evidence required | ADR slot | +|------------|--------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------| +| Mature | esp-csi-rs (or replacement) on real S3 board: matches esp_wifi_set_csi_rx_cb capture rate, frame-loss, ISR-jitter | Phase-4 of ADR-081 + a `no_std` migration ADR | +| Not mature | Side-by-side benchmark shows ≥10% drop in capture quality, or ISR-jitter > 100 µs | Defer — remain on ESP-IDF C path | + +### L5 — Mesh control-plane technology + +| Branch | Evidence required | ADR slot | +|-----------------|--------------------------------------------------------------------------------------------------------------|---------------------------------------------| +| ESP-WIFI-MESH | ≤ 25-node target; existing ADR-029 + ADR-073 hold | No new ADR — keep ADR-029 | +| Thread | ≥ 50-node target; field test showing ESP-WIFI-MESH degradation; comms-MCU change to ESP32-C6 acceptable | New ADR (Thread control plane) | +| `esp-mesh-lite` | Wanting IP-layer routing for QUIC + WiFi homogeneity, but staying on S3 | New ADR (mesh-lite migration) | + +### L6 — Heavy-compute SoC choice + +| Branch | Evidence required | ADR slot | +|------------|--------------------------------------------------------------------------------------------------------------|-----------------------------------------| +| Pi Zero 2W | Buildroot + dm-verity + signed FIT meets the threat model; cost / power matters more than ROM-rooted boot | New ADR (Pi Zero 2W image / OTA) | +| CM4 / Pi 5 | True ROM-rooted secure boot is deployment-required (e.g., regulated environment) | New ADR (CM4 image / OTA) | + +--- + +## 4. Independent decisions — make in parallel + +Each of these can be evaluated in isolation; none depend on the L-decisions. 
+ +| # | Decision | Default recommendation | ADR slot | +|----|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------| +| I1 | LoRa fallback chip | **SX1262.** LR1121 only if global / 2.4 GHz / satellite roaming is a deployment requirement. (SOTA §6) | ADR (LoRa fallback) | +| I2 | PMIC choice | **BQ24074 if panel ≤ 2 W**, **BQ25798 if panel ≥ 5 W or solar-only**. SPV1050 only for sub-watt energy harvesting. (SOTA §7) | ADR (power path) | +| I3 | Backhaul protocol | **QUIC (`quinn` + `rustls`)** if bidirectional / large payload / mobile-network handoff matters. **MQTT-over-TLS** for low-rate publish-only. (SOTA §5) | ADR (backhaul) | +| I4 | OTA per die | **`embassy-boot` two-slot** on no_std MCUs. **ESP-IDF native OTA** on ESP-IDF MCUs. **A/B + signed FIT** on Pi. (SOTA §3, §9) | ADR (OTA) | +| I5 | Provisioning protocol | **BLE provisioning via `esp-idf-svc`** for any in-field reprovisioning; **USB / serial** for factory provisioning only. (No SOTA section — well-trodden ground.) | ADR (provisioning) | + +--- + +## 5. Recommended ADR sequence + +If the three-tier proposal is partially adopted, the recommended ADR +sequence is **outside-in** — address the cheapest, most independent +decisions first, gate the load-bearing ones on real evidence: + +1. **Independent ADRs first** (any order): + - I1 LoRa fallback chip choice. + - I2 Power-path / PMIC choice (probably BQ24074 if panel stays ≤ 2 W, + BQ25798 otherwise). + - I3 QUIC vs MQTT-over-TLS (likely MQTT for the heartbeat-only case, + QUIC if model updates and fleet sync are real). +2. **Per-cluster-Pi ADR** (L3, hybrid branch) — the high-value, low-cost + first step. One Pi per 3–6 nodes. Captures most of the ML/QUIC/ + secure-boot value at minimal per-sensor BOM impact. +3. 
**Mesh control-plane ADR** (L5) — only if deployments target > 25 + nodes. Otherwise stays on ESP-WIFI-MESH per ADR-029. +4. **CSI no_std maturity benchmark ADR** (L4 evidence) — investigate, + but do not commit to dual-MCU until benchmarked. +5. **Dual-MCU node ADR** (L2) — only after L4 evidence + a clear ML or + ISR-jitter problem on the single-MCU node. +6. **Three-tier-PCB ADR** (full proposal) — last, only if BOM / threat- + model / scale all justify it. + +This ordering deliberately keeps the bulk of the deployable surface on +today's ADR-028 / ADR-081 baseline while letting each separable +upgrade be evaluated on its own evidence. + +--- + +## 6. Out-of-scope for this document + +- **Re-evaluating ADR-029 mesh choices** beyond mentioning Thread as + alternative — that belongs in a Mesh-control-plane ADR. +- **Specific PCB layout** of any of the candidate boards. +- **Cloud-side architecture** (gateway, fleet-sync target, time-series + storage). Out of scope of the node architecture proposal. +- **Cross-environment domain generalization (ADR-027)** — orthogonal to + the hardware shape. +- **Multistatic fusion algorithms** (`wifi-densepose-ruvector::viewpoint`) + — orthogonal to the hardware shape. + +--- + +## 7. References to other documents in this set + +- `architecture/three-tier-rust-node.md` — the seed proposal. +- `sota/2026-Q2-rf-sensing-and-edge-rust.md` — SOTA evidence per topic. +- `architecture/implementation-plan.md` — earlier (2026-04-02) GOAP plan + for ESP32-S3 + Pi Zero 2 W; the three-tier proposal is most usefully + read as an extension of this plan. +- `architecture/ruvsense-multistatic-fidelity-architecture.md` — + multistatic fusion architecture, orthogonal to node hardware shape. 
diff --git a/docs/research/architecture/three-tier-rust-node.md b/docs/research/architecture/three-tier-rust-node.md new file mode 100644 index 00000000..3906c7a7 --- /dev/null +++ b/docs/research/architecture/three-tier-rust-node.md @@ -0,0 +1,434 @@ +# Three-Tier Rust Node — Exploratory Architecture + +| Field | Value | +|--------------|------------------------------------------------------------------------| +| **Status** | Exploratory / not yet decided | +| **Date** | 2026-04-25 | +| **Authors** | ruv (proposal), filed by goal-planner research agent | +| **Classifies as** | Speculative architectural alternative to ADR-028 / ADR-081 baseline | +| **Companion**| `docs/research/sota/2026-Q2-rf-sensing-and-edge-rust.md` (SOTA), `docs/research/architecture/decision-tree.md` (decisions) | + +> **Reading note.** This document files a long architectural exploration the +> author wrote before any commitment. It is intentionally optimistic in places +> and will be tempered by the SOTA survey filed alongside it. The decision +> tree document maps each load-bearing claim to the evidence that would +> justify acting on it. Nothing in this document supersedes ADR-028 (the +> capability audit) or ADR-081 (the 5-layer adaptive kernel). Both already +> describe a working, single-MCU node; this document describes a +> hypothetical *three-tier* node that would replace it on PCBs that ship +> Pi-class compute next to two ESP32-class radios on a solar-powered HAT. + +--- + +## 1. ADRs this proposal would touch + +If pursued, this proposal evolves the following decisions. None are +overturned outright; all need re-read in this light. + +- **ADR-028 — ESP32 Capability Audit.** Today's witnessed node is a single + ESP32-S3 streaming raw ADR-018 frames over UDP. A three-tier node changes + the audit subject from "one MCU" to "two MCUs + a Pi", with implications + for the witness bundle, firmware-manifest hashes, and per-node BOM. 
+- **ADR-081 — Adaptive CSI Mesh Firmware Kernel.** The 5-layer kernel + already separates radio abstraction (L1), adaptive control (L2), mesh + plane (L3), feature extraction (L4), and Rust handoff (L5). A three-tier + node would split L1–L2 onto a no_std sensor MCU, L3 onto an ESP-IDF + comms MCU, and Layer-5+ Rust workload onto the Pi. The split is + compatible with the kernel; it is a deployment shape rather than a + redesign. +- **ADR-018 — ESP32 Dev Implementation.** ADR-018 binary CSI frames remain + the wire format between the sensor MCU and whoever consumes them. The + three-tier proposal tightens the contract: ADR-018 frames flow from + sensor MCU into the comms MCU only, never directly off the node. +- **ADR-029 / ADR-031 — Multistatic and sensing-first RF mode.** A + hardware-gated Pi Zero 2W enables the sensing-first mode to actually + hibernate the heavy compute, which ADR-031's power model assumes but the + current node cannot deliver because heavy compute lives off-node. +- **ADR-032 — Multistatic mesh security hardening.** HMAC-SHA256 beacon + auth + SipHash-2-4 frame integrity in ADR-032 already cover the + inter-node bus. The proposal adds Secure Boot V2 + flash encryption + at-rest on each MCU, and a signed Pi A/B image, which are *complements* + to ADR-032, not substitutes. + +--- + +## 2. Motivating thesis + +A WiFi/RF sensing node has three jobs that prefer three different +runtimes: + +1. **Strict-real-time radio capture and DSP** — sub-millisecond ISR + discipline, no allocator surprises, predictable interrupt latency. +2. **Networking, OTA, mesh, time sync** — TCP/IP, TLS, BLE provisioning, + ESP-WIFI-MESH, OTA bootloaders, NVS. The full battery of WiFi-stack + features that come with ESP-IDF and FreeRTOS. +3. **Heavy compute, ML inference, storage, fleet sync** — gigabytes of + model weights, vision inference, persistent storage, QUIC-based fleet + sync, optional cloud APIs. 
+
+Today's RuView node tries to fit jobs 1 and 2 onto one ESP32-S3, and job 3
+either runs on a separate machine (the "sensing-server" host) or is
+absent. The thesis of this proposal is that **collapsing all three onto
+a single PCB but onto three separate dies** captures most of the
+"single node" simplicity without sacrificing the runtime properties of
+each layer. Concretely:
+
+- **Sensor MCU**: ESP32-S3, no_std, `esp-hal` + Embassy + `heapless` +
+  `postcard`. ISR-driven CSI capture, channel hopping, short-window DSP.
+  No WiFi stack of its own (WiFi connectivity lives on the comms MCU); a
+  private UART or SPI link to the comms MCU carries serialized frames.
+  *(See SOTA survey, §3, for the ISR-safety caveat that tempers this.)*
+- **Comms MCU**: second ESP32-S3, ESP-IDF, `esp-idf-svc` + `esp-idf-sys`,
+  TLS/HTTPS/OTA/ESP-WIFI-MESH, NVS provisioning, BLE provisioning, LoRa
+  fallback. Owns the "outside world."
+- **Pi Zero 2W**: *normally power-gated*. Wakes on event from the comms
+  MCU, runs heavy ML or fleet-sync work, optionally streams QUIC to a
+  gateway, then power-gates again. `tokio` + `quinn` + `rustls` + `axum`.
+
+A single PCB, a single 1S Li-ion + 2 W solar + linear charger, a single
+enclosure. Three separate dies, each running the runtime it is
+actually good at.
+
+---
+
+## 3. Hardware shape (proposed)
+
+### 3.1 Bill of materials (per node, target)
+
+| Slot | Part | Notes |
+|---------------------|--------------------------------------------------|---------------------------------------------------|
+| Sensor MCU | ESP32-S3-WROOM-1 (8 MB flash, 8 MB PSRAM) | no_std, Embassy, esp-radio. Always-on. |
+| Comms MCU | ESP32-S3-MINI-1 or -WROOM-1 (4 MB flash) | ESP-IDF, ESP-WIFI-MESH, OTA, TLS. Mostly-on. |
+| Heavy compute | Pi Zero 2W (512 MB RAM) | Power-gated by default. Wake on event. |
+| LoRa fallback | Semtech SX1262 module | Heartbeat + recovery only. Sub-GHz. |
+| Charger / PMIC | TI BQ24074 (linear) or BQ25798 (buck-boost MPPT) | See SOTA §7 for trade-off. |
+| Battery | 1S Li-ion 18650 (3.0 Ah class) | Standard cell, easy to source. |
+| Solar panel | ~2 W, 6 V, IP-rated | Roof-mount or window-mount. |
+| Pi power gate | Logic-level P-FET high-side switch + ESP GPIO | Hard-cut when idle (350 mA → ~0 mA). |
+| Inter-MCU bus | UART or SPI between sensor MCU and comms MCU | Postcard-framed binary on a 4-wire link. |
+| Comms-to-Pi bus | UART (115200–921600 bps) or SPI | Pi-side `tokio-serial`/`spidev`. |
+| Enclosure | IP54 or IP65 with antenna pass-through | - |
+| Estimated BOM | $40–55 | At small build qty; falls with volume. |
+
+This is roughly 4–6× the ~$9 single-S3 node, which is the largest
+single mark against the proposal. See §7.1 for whether the cost makes
+sense.
+
+### 3.2 Power-state hierarchy (proposed)
+
+| State | Sensor MCU | Comms MCU | Pi Zero 2W | Approx draw |
+|----------------|------------------|-----------------|------------------|-----------------|
+| Deep idle | light sleep | DTIM-modulated | hard-off | < 5 mA |
+| Sample window | active CSI | passive listen | hard-off | ~80 mA |
+| Event publish | active CSI | TX burst | hard-off | ~150 mA peak |
+| Escalation | active CSI | TX + bring-up | booting | ~350 mA peak |
+| ML in progress | active CSI | passive | inferencing | ~450 mA |
+| Recovery | sleep | LoRa heartbeat | hard-off | ~30 mA |
+
+The Pi is treated as the heavyweight worker that **must** be
+hard-power-gated, not soft-suspended, when not in use. ARM SoCs leak in
+suspend; a 350 mA "off" leakage destroys solar viability.
+
+### 3.3 Energy budget sketch
+
+- **Daily load** (sketch, *not measured*): ~1.4 Wh/day assuming Pi wakes
+  ≤ 2 minutes/day on average, sensor MCU light-sleeps when idle, comms
+  MCU DTIM-3 most of the time.
+- **Daily harvest**: 2 W panel × 4 peak sun hours (PSH) × 0.7 system
+  efficiency ≈ 5.6 Wh/day in the seasonal worst case for mid-latitudes.
+
+Headroom is roughly 4×.
If a deployment skews colder/cloudier, or the +inter-MCU bus runs hotter, headroom is 2–3×. SOTA §7 covers whether +the linear-charger + supercap-buffered topology actually delivers this +math, or whether MPPT is needed on a panel this small. + +--- + +## 4. Software shape (proposed) + +### 4.1 Sensor MCU — no_std embedded Rust + +| Concern | Crate(s) | +|----------------------|--------------------------------------------------------------| +| HAL / async runtime | `esp-hal` 1.x + Embassy executor | +| Time / timers | `embassy-time` | +| Static allocations | `heapless` (`Vec`, `String`, `Deque`, MPMC channels) | +| Wire format | `postcard` over `serde` for compact, schema-stable bytes | +| CRC | `crc` crate (already used host-side for the L4 packet check) | +| RF capture | `esp-radio` (the rename of `esp-wifi`) — CSI hooks via PR | +| Inter-MCU bus | `embassy-uart` or `embedded-hal-async` SPI | +| Power management | `esp-hal::system::sleep::*` + light-sleep wake on GPIO/timer | + +Boundary: the sensor MCU does **not** initialize a WiFi stack. It owns +the PHY for CSI capture only. All actual WiFi connectivity is on the +comms MCU. This is the load-bearing simplification of the proposal: it +sidesteps the embassy-on-ESP-IDF ISR-safety question by not running +ESP-IDF on this die at all. 
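
The wire-format and CRC rows in the §4.1 table imply a small framing layer on the inter-MCU link. The sketch below is one way that layer could look, not the project's actual format: the 2-byte little-endian length prefix, the CRC-16/CCITT-FALSE trailer, and the function names `crc16_ccitt_false`, `encode_frame`, and `decode_frame` are all assumptions made for illustration. It uses only `std` so it can be exercised on a host; the payload stands in for a `postcard`-serialized message.

```rust
/// CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection.
/// (Assumed checksum; the proposal only says "CRC-checked".)
fn crc16_ccitt_false(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &byte in data {
        crc ^= (byte as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}

/// Hypothetical frame layout: [len: u16 LE][payload][crc: u16 LE],
/// with the CRC computed over the length prefix and the payload.
fn encode_frame(payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > u16::MAX as usize {
        return None; // payload does not fit the 2-byte length prefix
    }
    let len = payload.len() as u16;
    let mut out = Vec::with_capacity(payload.len() + 4);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
    let crc = crc16_ccitt_false(&out);
    out.extend_from_slice(&crc.to_le_bytes());
    Some(out)
}

/// Returns the payload slice if the length field and CRC both check out.
fn decode_frame(frame: &[u8]) -> Option<&[u8]> {
    if frame.len() < 4 {
        return None; // too short to hold a length prefix and a CRC
    }
    let len = u16::from_le_bytes([frame[0], frame[1]]) as usize;
    if frame.len() != len + 4 {
        return None; // truncated or over-long frame
    }
    let (body, tail) = frame.split_at(len + 2);
    let expected = u16::from_le_bytes([tail[0], tail[1]]);
    if crc16_ccitt_false(body) != expected {
        return None; // corrupted in transit
    }
    Some(&body[2..])
}
```

On the sensor MCU itself the `Vec` would presumably be swapped for a fixed-capacity buffer such as `heapless::Vec`, and the hand-rolled CRC for the `crc` crate listed in the table; both substitutions are left out here to keep the sketch dependency-free.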
+
+### 4.2 Comms MCU: std + ESP-IDF Rust
+
+| Concern | Crate(s) |
+|----------------------|--------------------------------------------------------------------------|
+| FreeRTOS bindings | `esp-idf-sys` |
+| Service abstractions | `esp-idf-svc` (HTTPS, OTA, NVS, mDNS, BLE, MQTT, ESP-NOW) |
+| Async runtime | `esp-idf-svc::timer::EspTaskTimerService` (NOT Embassy directly; see §7.2) |
+| TLS | mbedTLS via `esp-idf-svc` |
+| Mesh | ESP-WIFI-MESH (or ESP-MESH-LITE; see SOTA §8) |
+| OTA | ESP-IDF native OTA (signed images, A/B partitions) |
+| LoRa fallback | `lora-phy` or vendor C driver via `esp-idf-sys` |
+| Inter-MCU bus | UART driver (`esp-idf-svc::uart`) framed with postcard |
+| BLE provisioning | NimBLE via `esp-idf-svc` |
+
+The comms MCU is the *only* die that needs the full WiFi-stack security
+surface. That makes it the obvious place to enforce Secure Boot V2 +
+flash encryption + signed OTA.
+
+### 4.3 Pi Zero 2W: std Rust on Linux
+
+| Concern | Crate(s) |
+|----------------------|-----------------------------------------------------------------------|
+| Async runtime | `tokio` |
+| QUIC | `quinn` + `rustls` |
+| HTTP server (local) | `axum` |
+| RPC to comms MCU | `tokio-serial` (UART) or `spidev` (SPI), framed with postcard |
+| ML inference | `tract` (ONNX), `candle` (PyTorch-flavored), or `ort` (ONNX Runtime) |
+| Persistent storage | `sled` or `redb` |
+| OS | Buildroot-based custom image, A/B partitions, dm-verity, signed |
+
+Crucial constraint: the Pi runs **Buildroot**, not Raspberry Pi OS. The
+Raspberry Pi Foundation does not officially support secure boot on the
+Pi Zero 2W; the secure-boot path is Pi 4/5-only. The cleanest path on a
+Pi Zero 2W is Buildroot + signed FIT image + dm-verity on the rootfs +
+A/B partitions for OTA. See SOTA §9 for the realistic version of this.
+ +### 4.4 OTA on three dies + +| Die | OTA mechanism | +|--------------|-----------------------------------------------------------------------| +| Sensor MCU | `embassy-boot`-style two-slot OTA, signed images, ed25519 verification| +| Comms MCU | ESP-IDF native OTA, signed by project key, dual app partitions | +| Pi Zero 2W | A/B rootfs, signed FIT, fwupd or homemade `update-agent` binary | + +OTA is the area where the three-tier shape is most defensible. Each die's +update is a separate, independently rollback-able artifact. The comms +MCU acts as the *broker* — it pulls signed images for all three dies, +verifies them, and pushes them onto the sensor MCU and Pi over their +respective buses. + +--- + +## 5. Networking shape (proposed) + +Three concentric rings: + +1. **Inner ring — node-local IPC.** Postcard over UART/SPI between the + three dies. Length-prefixed, CRC-checked, no encryption (it's on a + trace, not a wire). +2. **Middle ring — RuView mesh.** ESP-WIFI-MESH (or ESP-MESH-LITE) + between comms MCUs across nodes, carrying L3 mesh-plane messages + from ADR-081 (TIME_SYNC, ROLE_ASSIGN, CHANNEL_PLAN, FEATURE_DELTA, + HEALTH, ANOMALY_ALERT). Authenticated with HMAC-SHA256 per ADR-032. +3. **Outer ring — backhaul.** QUIC from the Pi to a gateway/cloud + target (`quinn` + `rustls`), with the gateway optionally being + another node's Pi acting as a fusion-relay. LoRa is the *fallback* + ring for heartbeats and recovery commands when the WiFi mesh is + degraded. + +LoRa duty-cycle math (EU868 1% in the relevant sub-band, US915 dwell- +time-only) is friendly to "20 bytes every minute" heartbeats; at SF7, +125 kHz, the airtime is ~40 ms per packet — far under the 36 s/hour +EU868 limit. See SOTA §6 for the citation. + +--- + +## 6. Security posture (proposed) + +The proposal layers four mechanisms on each MCU: + +- **Secure Boot V2** — RSA-3072 or ECDSA signed bootloader, immutable + primary key digest in eFuse. 
+- **Flash encryption** — AES-XTS-256 with per-device key burned in eFuse, + hardware-isolated. +- **Disabled ROM download** — `DIS_DOWNLOAD_MODE` fuse blown after + provisioning so the device cannot be coerced back into a UART-ROM + state. +- **Signed OTA images** — separate signing key from the secure-boot key, + per-image rollback counter, anti-rollback eFuse counter. + +On the Pi: dm-verity over a read-only rootfs, signed FIT image with the +RPi-foundation-blessed (where possible) bootcode, A/B partitions, and a +signed manifest of the three dies' image hashes shipped together. The +comms MCU validates the manifest before consuming any image. + +This is **complementary** to ADR-032's HMAC-SHA256 + SipHash-2-4 mesh +hardening — those protect frames in flight; Secure Boot + flash +encryption protect images at rest. + +--- + +## 7. Honest critique of this proposal + +This section is required by the project conventions. The companion SOTA +survey expands each of these. + +### 7.1 The cost story is bad before volume + +A single ESP32-S3 node is ~$9 today. A three-tier node is closer to +$40–55. RuView's design point of "many cheap nodes" rewards low BOM. The +three-tier shape is justified only if each node *also* replaces a +sensing-server host (i.e., a Pi or laptop running the sensing pipeline) +that would have cost more than the marginal Pi-on-each-node. In a +deployment with 3 nodes feeding one $80 host, the host already amortizes +across the nodes. In a 50-node deployment, the math changes. + +### 7.2 The embassy-on-ESP-IDF ISR-safety question is real + +The proposal *avoids* this question by giving the sensor MCU a no_std +runtime instead of putting embassy on top of esp-idf-svc. The reason +this matters: per esp-idf-svc maintainers, **embassy-executor is not +ISR-safe** in the esp-idf-svc setup (it relies on `critical-section`, +which on esp-idf-hal is implemented over FreeRTOS task suspension). 
On +no_std with `esp-hal`, embassy is fine; on top of ESP-IDF, it is not. +The two-MCU split is the cleanest engineering answer to the question; +the alternative is keeping ESP-IDF on the single MCU (today's design) +and not introducing embassy at all. SOTA §3 documents the citation. + +### 7.3 esp-radio replaces esp-wifi, and CSI no_std support is partial + +The crate that the sensor MCU would use to capture CSI (in the +`esp-rs/esp-hal` 1.x ecosystem) was renamed to `esp-radio`. Third-party +`esp-csi-rs` exists and targets no_std but is described as +"early development." The 5-layer kernel today runs on top of ESP-IDF +v5.4 in C — a bird in the hand. Migrating CSI capture to no_std is a +distinct project, not a side effect of the three-tier shape. SOTA §2 +covers the maturity matrix. + +### 7.4 The Pi Zero 2W secure-boot story is weaker than the proposal implies + +The Raspberry Pi Foundation's official secure-boot path is **Pi 4 / Pi 5 +only**, with a USB-rooted RSA chain. There is no official secure-boot +bring-up document for the Pi Zero 2W. Buildroot + signed FIT + dm-verity +gets you most of the threat surface — but the proposal's "Pi 4 + buildroot +is the strongest path" line is not a Pi Zero 2W story. If true secure +boot matters for the deployment, the heavy-compute die should arguably +be a Pi 4 Compute Module (CM4) and not a Pi Zero 2W. SOTA §9 covers it. + +### 7.5 ESP-WIFI-MESH at 50–500 nodes is an open question + +Espressif documents up to 1,000 nodes and 25 layers as theoretical limits +for ESP-WIFI-MESH, with a recommended fan-out of 6 per node. There is +limited public evidence of stable 100+ node deployments in adversarial +RF environments. Comms-MCU mesh handling at scale is *not free*: the +mesh stack runs in the comms MCU's main loop, sharing CPU with TLS, OTA, +and BLE. SOTA §8 covers BLE Mesh / Thread / Zigbee comparison. 
None of
+those replaces the WiFi radio that CSI capture depends on, but they could replace
+ESP-WIFI-MESH for control-plane traffic if scale becomes a problem.
+
+### 7.6 MPPT vs linear charger at 2 W panel
+
+The proposal's BQ24074-based linear-charger topology is fine for a 2 W
+panel; the efficiency loss vs MPPT is real but small at this scale.
+At 2 W, the MPPT die (BQ25798) silicon, inductor, and code complexity
+costs partly cancel its efficiency gain. SOTA §7 has the math.
+
+### 7.7 The QUIC outer ring is overkill for the heartbeat case
+
+QUIC is a strong choice when the Pi has lots of bursty data and is
+behind a NAT or on flaky cellular. For a node that wakes 2 minutes/day
+and emits a few KB of summarized features, MQTT-over-TLS or even
+plain HTTPS is simpler and adequate. QUIC's value goes up if the Pi
+also runs bidirectional model updates or large-batch fleet sync.
+See SOTA §5.
+
+---
+
+## 8. What evidence would justify acting on this proposal
+
+This section maps to the decision tree in
+`docs/research/architecture/decision-tree.md`. The short version:
+
+1. **Per-node cost ceiling.** Decide the BOM ceiling per node. The
+   three-tier shape only makes sense above ~$30/node and at deployments
+   where the host computer is *not* a separate cost.
+2. **CSI no_std maturity gate.** `esp-csi-rs` (or the replacement under
+   `esp-radio`) must demonstrate equivalent capture quality to today's
+   `esp_wifi_set_csi_rx_cb`-based path on a real ESP32-S3 board, with
+   ISR-jitter measured. Until this is verified, the sensor-MCU Rust
+   story remains a risk.
+3. **Inter-MCU bus saturation.** Postcard-framed UART/SPI between the
+   sensor MCU and comms MCU must carry ADR-018 frames at the target
+   capture rate without backpressure-induced drops at the sensor MCU.
+4. **Pi power-gate budget.** Measured leakage of the gated Pi Zero 2W,
+   with proven cold-boot wake-up under 5 s, is required before the
+   energy budget closes.
+5. 
**Mesh scale evidence.** A 12+ node ESP-WIFI-MESH (or alternative) + field test at sustained 1–10 Hz `rv_feature_state_t` upload is + required to validate the middle ring at >>3 nodes. +6. **Secure-boot path on Pi Zero 2W.** Either accept that the Pi cannot + be fully secure-booted, or upgrade the heavy-compute die to a CM4 / + CM5 / Pi 5 if true secure boot is a deployment requirement. + +--- + +## 9. Open questions + +The proposal as written elides answers to these: + +- **Why two ESP32-S3 dies and not one ESP32-S3 plus one ESP32-C6?** The + C6 is RISC-V, has 802.15.4 + WiFi 6, and would let the comms MCU + handle BLE Mesh / Thread / Zigbee natively. The two-S3 split chose + homogeneity and Xtensa toolchain; the C6 split chooses richer + protocol coverage on the comms die. +- **Is the sensor MCU strictly necessary?** Today, the single-MCU node + (ADR-028 / ADR-081) handles CSI capture and ESP-IDF networking on one + S3, in C, and works. The two-MCU-on-board case is justified mainly by + *ISR purity* and *Rust no_std*, not by a missing capability today. +- **Why a Pi Zero 2W rather than the Pi being the gateway?** The + proposal puts a Pi *on every node*. A more conservative shape is one + Pi per *site* (or per cluster of 3–6 nodes), with the nodes staying + single-MCU. That keeps the BOM near today's $9/node for sensors, + isolates heavy compute, and concentrates secure boot on a smaller + number of more capable dies. This is the deployment shape implicit in + ADR-031's sensing-first mode and is worth comparing head-to-head. +- **What does a single 50-node deployment cost** under each of: today's + shape (one S3 + one host), one-Pi-per-site (one S3 + one Pi per ~6 + nodes), and the proposal (3-die-per-node)? The cost crossover point + determines which architecture is correct. + +--- + +## 10. Recommendation + +This document records the proposal accurately. It does not recommend +adopting it. The recommendation, if a decision is forced, is: + +1. 
**Do not build a three-tier-per-node PCB now.** The current shape + (single ESP32-S3 + ADR-081 5-layer kernel) is the witnessed system. +2. **Investigate one-Pi-per-site as the cheaper variant** (proposal §9 + bullet 3). It captures most of the heavy-compute and QUIC-backhaul + benefits at a fraction of the BOM. +3. **Spend the first chunk of effort on the three "evidence" gates from + §8** — CSI no_std maturity, ESP-WIFI-MESH at scale, and Pi + secure-boot reality — *before* committing to a hardware re-spin. +4. **Reserve the three-tier shape** for a future "RuView Pro" SKU + targeting deployments where per-node BOM is not the dominant cost + and full secure-boot + dm-verity at the edge is mandatory. + +The decision tree document codifies these gates as branch points so +they can be checked off independently rather than as one large +all-or-nothing ADR. + +--- + +## 11. Companion documents + +- **SOTA survey.** `docs/research/sota/2026-Q2-rf-sensing-and-edge-rust.md` + — citations, primary sources, what's true in 2026 for each load-bearing + claim above. +- **Decision tree.** `docs/research/architecture/decision-tree.md` — the + Mermaid map from each load-bearing decision to its dependencies and + ADR slot. +- **Existing implementation plan.** `docs/research/architecture/implementation-plan.md` + — the ESP32-S3 + Pi Zero 2W goal-state plan from 2026-04-02. The + three-tier proposal is most usefully read as an evolution of *that* + plan rather than a replacement of ADR-028. 
diff --git a/docs/research/sota/2026-Q2-rf-sensing-and-edge-rust.md b/docs/research/sota/2026-Q2-rf-sensing-and-edge-rust.md new file mode 100644 index 00000000..af82a5c4 --- /dev/null +++ b/docs/research/sota/2026-Q2-rf-sensing-and-edge-rust.md @@ -0,0 +1,601 @@ +# SOTA Survey — RF Sensing and Edge Rust (2026 Q2) + +| Field | Value | +|--------------|------------------------------------------------------------------------| +| **Status** | Reference / informs `architecture/three-tier-rust-node.md` | +| **Date** | 2026-04-25 | +| **Author** | goal-planner research agent | +| **Scope** | What's true in 2026, what holds up in the three-tier proposal, what to reconsider | +| **Word target** | ~3,500 words | + +> **Conventions.** Each section answers (a) what's true in 2026, (b) what +> claims in the three-tier proposal hold up, (c) what to reconsider, and +> (d) primary references. Where no primary source could be located, the +> claim is explicitly marked **"no primary source found, mark as +> conjecture."** + +--- + +## 1. WiFi CSI through-wall pose / occupancy estimation + +### 1.1 What's true in 2026 + +The CSI-to-pose literature has matured along four axes since +DensePose-from-WiFi (2022) lit the fuse: + +- **Lightweight architectures.** WiFlow (Feb 2026) demonstrated a + spatio-temporal-decoupled network with 4.82 M parameters, 0.47 GFLOPs, + PCK@20 = 97.0% and MPJPE ≈ 8 mm on the random-split MM-Fi benchmark, + 3–4× smaller than WPformer and ~25× smaller than WiSPPN. +- **Domain generalization.** PerceptAlign (DT-Pose) and the + cross-environment evaluation in MM-Fi made the cross-subject and + cross-layout numbers honest. PerceptAlign reports MPJPE 222 mm on Scene + 4 and 317 mm on Scene 5 in cross-layout test, beating prior SOTA by + >50% — but those are still order-of-magnitude worse than in-domain. 
+- **Topological priors.** GraphPose-Fi (2025) and topology-constrained + decoders (DT-Pose) explicitly use the human skeleton as a graph, + improving plausibility under occlusion. +- **Multistatic geometry.** RuView's own ADR-029/ADR-031 line is the + practical multistatic story; ISAC-Fi (Aug 2024) and the multistatic + ISAC-MIMO papers (2024–2025) describe similar geometry as a 6G research + topic. IEEE 802.11bf-2025 (published 26 September 2025) is the + standardization vector. + +### 1.2 What holds up + +The proposal's claim that "3–6 ESP32-S3 nodes can do meaningful pose +work" is consistent with WiFlow's network sizes (4.82 M params, INT8 +~5 MB) and with the MM-Fi multi-link benchmark. The CSI pipeline does +not need a Pi *per node* to run inference; one Pi per cluster is +sufficient. RuView's existing ESP32-mesh + sensing-server already +demonstrates the shape. + +### 1.3 What to reconsider + +- **Through-wall claims are still aggressive.** Published WiFi sensing + papers focus on line-of-sight or single-wall cases; published + through-multiple-walls numbers in 2025–2026 are scarce. The + three-tier proposal's "through-wall" framing should be tempered to + "through-thin-wall" without primary evidence. *No primary source + found for through-multiple-walls, mark as conjecture.* +- **Nexmon-on-Pi is not obviously a win.** Nexmon CSI on a Pi 4 captures + up to 80 MHz BW on Broadcom chips and gives more subcarriers per frame + than ESP32, but the Pi platform has no equivalent of ESP32 Secure Boot + V2, and the Broadcom firmware-patch path is fragile across kernel + releases. RuView's existing ESP32-S3 mesh already beats Nexmon-on-Pi + on cost, security posture, and provisioning. +- **USRP/SDR is overkill for occupancy and pose**, and is far over the + proposal's BOM ceiling. It would only become attractive for + research-grade beamforming or sub-cm ranging. 
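
The MPJPE and PCK figures quoted in §1.1 are easier to compare across papers with the metric definitions in hand. The following is a minimal sketch, assuming joints are given as `(x, y, z)` tuples in millimetres. The function names and the toy skeleton are illustrative and not taken from any of the cited papers. Note that PCK normalization differs between benchmarks (the threshold is usually a percentage of a reference body length), so the threshold is passed in explicitly here.

```python
import math
from typing import Sequence

Joint = Sequence[float]  # (x, y, z) in millimetres


def mpjpe(pred: Sequence[Joint], truth: Sequence[Joint]) -> float:
    """Mean per-joint position error: average Euclidean distance per joint."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def pck(pred: Sequence[Joint], truth: Sequence[Joint], threshold: float) -> float:
    """Fraction of joints whose error is within `threshold` (same units as the joints)."""
    hits = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= threshold)
    return hits / len(truth)


# Toy three-joint skeleton (hypothetical values, not from any dataset).
truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 12.0)]

print(f"MPJPE = {mpjpe(pred, truth):.2f} mm")
print(f"PCK within 10 mm = {pck(pred, truth, threshold=10.0):.2f}")
```

Read this way, an in-domain MPJPE near 8 mm and a cross-layout MPJPE above 200 mm describe very different regimes, even when both are reported as improvements over prior work.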
+ +### 1.4 Primary references + +- WiFlow: [arXiv:2602.08661](https://arxiv.org/html/2602.08661) — Feb 2026. +- DT-Pose: [arXiv:2501.09411](https://arxiv.org/abs/2501.09411) — Jan 2025. +- GraphPose-Fi: [arXiv:2511.19105](https://arxiv.org/abs/2511.19105) — Nov 2025. +- Geometry-aware cross-layout HPE: [arXiv:2601.12252](https://arxiv.org/html/2601.12252). +- Nexmon CSI: [seemoo-lab/nexmon_csi](https://github.com/seemoo-lab/nexmon_csi). + +--- + +## 2. IEEE 802.11bf and multistatic ISAC + +### 2.1 What's true in 2026 + +**IEEE Std 802.11bf-2025 was published 26 September 2025** and is the +ratified amendment for WLAN sensing in license-exempt bands 1–7.125 GHz +and >45 GHz. The 3rd SA Ballot Recirculation closed 16 January 2025 +with 98% approval. P802.11bf/D8.0 (March 2025) was the last public +draft. The standard defines sensing operation on top of HE/EHT PHYs and +on the DMG/EDMG (60 GHz) PHYs. + +3GPP RAN #108 (June 2025) admitted ISAC into the 6G study scope as a +"Day 1" 6G feature. ISAC-Fi (Aug 2024) demonstrated *monostatic* sensing +over commodity WiFi by repurposing the communication waveform. +Multistatic ISAC over cell-free MIMO (2024–2025) is the analytical +direction. + +### 2.2 What holds up + +The three-tier proposal's framing of "WiFi mesh + multistatic sensing" +is well-aligned with where the standard is moving. ADR-029's existing +multistatic mode and ADR-073's multifrequency mesh scan are the kind of +pre-standard implementations that 802.11bf is now codifying. + +### 2.3 What to reconsider + +- **802.11bf does not turn an ESP32 into an 802.11bf sensor.** It + defines a *protocol* for sensing-aware exchanges between APs and + STAs. Off-the-shelf ESP32-S3 silicon was designed before the standard; + CSI extraction on ESP32 will keep being a side channel, not a + standards-blessed feature, until Espressif ships a chip with the + 802.11bf MAC primitives. 
*No primary source found for an Espressif + 802.11bf-aware product, mark as conjecture.* +- **ISAC-Fi's monostatic-on-commodity-WiFi result** is interesting but + requires PHY changes; not a path to ESP32 today. +- **The proposal should claim "802.11bf-compatible feature set" rather + than "802.11bf-compliant"** until silicon exists. + +### 2.4 Primary references + +- IEEE 802.11bf-2025: [standards.ieee.org](https://standards.ieee.org/ieee/802.11bf/11574/). +- ISAC-Fi: [arXiv:2408.09851](https://arxiv.org/abs/2408.09851). +- IEEE 802.11bf overview paper: [arXiv:2207.04859](https://arxiv.org/pdf/2207.04859). +- NIST overview: [nist.gov/publications/ieee-80211bf](https://www.nist.gov/publications/ieee-80211bf-enabling-widespread-adoption-wi-fi-sensing). + +--- + +## 3. Embedded Rust ecosystem for ESP32-S3 (2026) + +### 3.1 What's true in 2026 + +The esp-rs ecosystem has matured, with several crates renamed along the way: + +- **`esp-hal` is at 1.x.** `esp-hal 1.0.0` shipped in October 2025, and + `1.1.0` followed in the same 1.x line. The 1.x line stabilized the HAL + APIs and async drivers, with + the constraint that "async drivers can no longer be sent between + cores and executors." +- **`esp-wifi` was renamed to `esp-radio`** in the 1.x line. The + scheduler functionality moved to a new crate `esp-rtos`. Existing + `esp-wifi` references in tutorials are pre-1.x. +- **Embassy on ESP** is split: on no_std ESP-HAL it's a first-class + citizen, but the Embassy team and Espressif explicitly steer Embassy + use *toward* `esp-rtos` over time. +- **Embassy on top of `esp-idf-svc` (std)** has a documented gotcha: + **embassy-executor is not ISR-safe** because it depends on + `critical-section`, which `esp-idf-hal` implements over FreeRTOS task + suspension. The recommended std executor is `edge-executor` or the + built-in `esp-idf-hal` executor. +- **CSI capture on no_std** via `esp-csi-rs` (third-party crate) exists + but is documented as "still in early development." 
The + production-blessed CSI path remains `esp_wifi_set_csi_rx_cb()` in + ESP-IDF C — exactly what `firmware/esp32-csi-node/main/csi_collector.c` + uses today. + +### 3.2 What holds up + +The three-tier proposal's choice to put the **sensor MCU on no_std** +(`esp-hal` + Embassy) avoids the ESP-IDF ISR-safety question entirely, +which is the right architectural answer to a real problem. The proposal +is correct that `heapless` + `postcard` + `embassy-time` is the modern +no_std default. + +### 3.3 What to reconsider + +- **Update the toolchain names.** The proposal lists `esp-wifi`; in 1.x + this is `esp-radio`. It lists `embassy-executor` on the comms MCU + by implication; on the comms MCU the executor must be + `edge-executor` or `esp-idf-hal`'s built-in executor, not Embassy. +- **CSI maturity is the gating risk.** `esp-csi-rs` is early + development and the production CSI path is still C. Migrating CSI to + no_std Rust is a project unto itself, not a free side effect of + splitting the dies. +- **`esp-idf-svc` parity with C ESP-IDF is good but not 100%.** OTA, + HTTPS, NVS, BLE provisioning, ESP-WIFI-MESH all have wrappers. Some + niche ESP-IDF C APIs still need `esp-idf-sys` raw FFI. This is fine + but means the comms MCU is not "all-Rust" — there's a layer of unsafe + wrapping at the bottom. + +### 3.4 Primary references + +- esp-hal releases: [github.com/esp-rs/esp-hal/releases](https://github.com/esp-rs/esp-hal/releases). +- esp-idf-svc CHANGELOG: [github.com/esp-rs/esp-idf-svc/blob/master/CHANGELOG.md](https://github.com/esp-rs/esp-idf-svc/blob/master/CHANGELOG.md). +- Embassy ISR-safety gotcha: [esp-idf-svc#342](https://github.com/esp-rs/esp-idf-svc/issues/342) and esp-idf-svc CHANGELOG. +- esp-csi-rs crate: [crates.io/crates/esp-csi-rs](https://crates.io/crates/esp-csi-rs). +- Embassy Book: [embassy.dev/book](https://embassy.dev/book/). + +--- + +## 4. 
Edge ML for CSI on ESP32-class hardware + +### 4.1 What's true in 2026 + +- **TFLite Micro on ESP32-S3** is the most-cited path. Reported + numbers: wake-word inference at 50–60 ms latency, model size ~240 KB + flash, ~350 KB RAM. INT8 quantization reportedly delivers >6× speedup + over float on S3. Espressif's `esp-tflite-micro` is the reference + port. +- **`tract`** (Sonos's pure-Rust ONNX/NNEF runtime) targets std Linux + primarily; there is no widely-adopted no_std no-alloc port. +- **`candle`** (Hugging Face's Pytorch-flavored Rust ML library) is std + Linux/macOS/Windows; not designed for MCU class. +- **ONNX Runtime (`ort` Rust binding)** is a wrapper over the C++ + runtime; on ARMv8 (Pi Zero 2W) it works, on Xtensa it does not. +- **ESP-DL** is Espressif's own DL framework for ESP32-S2/S3, optimized + for the AI extensions of the Xtensa LX7 (which ESP32-S3 has). It is C, + not Rust. + +For a 4.82 M-param INT8 WiFlow at 0.47 GFLOPs: + +- On a Pi Zero 2W (Cortex-A53 quad, NEON), inference is plausibly in + the 50–100 ms range. *No primary measurement found for WiFlow on Pi + Zero 2W; mark as conjecture.* +- On an ESP32-S3 (Xtensa LX7, 240 MHz, AI extensions), even INT8 4.82M + is outside the 8 MB flash + 8 MB PSRAM envelope when intermediate + tensors are counted. WiFlow on S3 would require additional pruning or + a smaller model class. + +### 4.2 What holds up + +The proposal's split between "sensor MCU does ISR-clean DSP" and "Pi +runs the model" is the right shape. ML inference at the WiFlow scale is +*not* an ESP32 workload in 2026. + +### 4.3 What to reconsider + +- **The sensor MCU's ML role should be tiny-feature inference, not + pose.** Motion classification, presence binary, anomaly thresholding — + the ADR-039 Tier-0/Tier-1 outputs — fit on ESP32-S3 with TFLite Micro + or hand-written DSP. They do not fit `tract` or `candle` no_std. 
+- **For Rust-on-MCU-ML**, the realistic path is hand-rolled INT8 + inference (RuView's `wifi-densepose-nn` already has FFI hooks) or a + Rust port of a tiny TFLM-style runtime. **No mainstream Rust + no_std-no_alloc ONNX runtime exists in production at 2026 Q2.** +- **The Pi Zero 2W's 512 MB RAM is adequate for WiFlow but tight for larger + pose models.** A CM4/CM5 with 4 GB unlocks Hugging-Face-class models; + whether the deployment needs that is a use-case question. + +### 4.4 Primary references + +- esp-tflite-micro: [github.com/espressif/esp-tflite-micro](https://github.com/espressif/esp-tflite-micro). +- ESP32-S3 TFLite Micro practical guide: [zediot.com](https://zediot.com/blog/esp32-s3-tensorflow-lite-micro/). +- WiFlow architecture (parameters/FLOPs): [arXiv:2602.08661](https://arxiv.org/html/2602.08661). +- ESP32-S3 TinyML INT8 speedup: [zediot.com TinyML optimization](https://zediot.com/blog/esp32-s3-tinyml-optimization/). + +--- + +## 5. QUIC for IoT backhaul + +### 5.1 What's true in 2026 + +- **`quinn` + `rustls` is the production Rust QUIC stack.** Both target + std Linux, both work fine on ARMv8 (Pi Zero 2W). `rustls` is + FIPS-validatable via the AWS-LC backend. +- **MQTT-over-QUIC is the emerging IoT pattern.** EMQX 5.x and NanoMQ + both ship MQTT-over-QUIC; published benchmarks show comparable or + better tail-latency than MQTT-over-TLS-over-TCP, especially under + packet loss and mobile-network handoff conditions. +- **For low-rate telemetry** (a few KB at minute granularity), the + difference between QUIC and TLS-over-TCP is small in steady-state. The + win is in connection-establishment cost (~1 RTT for QUIC vs ~2 RTT for + TCP + TLS 1.3 or ~3 RTT for TCP + TLS 1.2) and in + graceful behavior across IP changes. + +### 5.2 What holds up + +The proposal's choice of `quinn` for the Pi-to-cloud ring is sound and +matches what EMQX, NanoMQ, and Microsoft (MsQuic) are converging on. +`rustls` is a strong default. 
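
The connection-establishment difference noted in §5.1 can be put into wall-clock terms with a back-of-envelope sketch. The round-trip counts are the usual figures for a fresh handshake (1 RTT for QUIC, 2 for TCP plus TLS 1.3, 3 for TCP plus TLS 1.2). The 150 ms link RTT and the four wake-ups per day are hypothetical values chosen for illustration, not measurements from the proposal.

```python
# Handshake round trips before the first application byte can be sent.
HANDSHAKE_RTTS = {
    "QUIC (1-RTT)": 1,
    "TCP + TLS 1.3": 2,
    "TCP + TLS 1.2": 3,
}


def setup_delay_s(rtt_s: float, handshake_rtts: int) -> float:
    """Connection-setup delay in seconds for a given link RTT."""
    return rtt_s * handshake_rtts


RTT_S = 0.150        # hypothetical cellular backhaul RTT
WAKES_PER_DAY = 4    # hypothetical number of Pi wake-ups per day

delays = {name: setup_delay_s(RTT_S, n) for name, n in HANDSHAKE_RTTS.items()}
for name, delay in delays.items():
    print(f"{name}: {delay * 1e3:.0f} ms per connection, "
          f"{delay * WAKES_PER_DAY:.2f} s per day")
```

At a handful of connections per day the absolute saving is well under a few seconds, which is consistent with the point that the handshake advantage matters mostly when connections are frequent or the link is lossy.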
+ +### 5.3 What to reconsider + +- **Heartbeat-only deployments don't need QUIC.** If the Pi wakes 2 + minutes/day to push aggregated features, an MQTT-over-TLS publish on + port 8883 is one library, well-supported, and cheaper to operate. +- **QUIC pays off when bidirectional or large-payload traffic is real.** + Model updates, fleet sync, on-demand video — these are the cases + where the 1-RTT handshake and connection-migration matter. +- **Don't terminate QUIC inside the comms MCU.** ESP-IDF has no + production QUIC stack; QUIC belongs on the Pi or gateway, not on the + MCU. + +### 5.4 Primary references + +- quinn: [docs.rs/quinn](https://docs.rs/quinn). +- MQTT-over-QUIC IIoT evaluation: [MDPI Sensors 21:5737](https://www.mdpi.com/1424-8220/21/17/5737). +- EMQX MQTT trends: [emqx.com 2025 trends](https://www.emqx.com/en/blog/mqtt-trends-for-2025-and-beyond). + +--- + +## 6. LoRa for sensor mesh fallback + +### 6.1 What's true in 2026 + +- **SX1262** — Semtech's mainstream Gen-2 sub-GHz LoRa transceiver, + +22 dBm TX, 4.2 mA RX. The default for low-rate, long-range battery + applications. Mature ecosystem, low BOM cost, supported by `lora-phy` + and most Meshtastic boards. +- **LR1110** — adds GNSS scan + WiFi scan. Designed for asset-tracking + workflows where the device opportunistically reports GNSS+WiFi + fingerprints to a cloud-side resolver. +- **LR1121** — Gen-3, sub-GHz + 2.4 GHz + S/L-band satellite. ~4.5 dB + better Sub-GHz sensitivity vs SX1262. Cost premium and more system + complexity. +- **Duty cycles**: EU868 imposes 1% in most sub-bands and 0.1% in the + 863–865 MHz sub-band. US915 uses dwell-time (400 ms) instead of + duty-cycle limits. Raw-LoRa peer-to-peer must still respect the + regional regulatory constraint, even though LoRaWAN is not on the + wire. + +For a 20-byte heartbeat at SF7, BW 125 kHz, the airtime is about 46–57 ms +(CR 4/5, 8-symbol preamble; the exact value depends on header mode and +CRC). At the EU868 1% duty cycle, that's 36 s/hour available, or roughly +640–780 heartbeats per hour as a theoretical maximum. 
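
The heartbeat airtime and the duty-cycle budget can be checked with the standard Semtech LoRa time-on-air formula. The sketch below is a minimal calculation, not driver code. The text does not specify the packet settings, so the defaults here (explicit header, CRC on, coding rate 4/5, 8-symbol preamble, low-data-rate optimization off) are assumptions. Under those assumptions a 20-byte SF7/BW125 packet comes to about 57 ms, which allows about 636 packets per hour at a 1% duty cycle.

```python
import math


def lora_airtime_s(payload_bytes: int, sf: int = 7, bw_hz: int = 125_000,
                   cr: int = 1, preamble_syms: int = 8,
                   explicit_header: bool = True, crc: bool = True,
                   low_data_rate_opt: bool = False) -> float:
    """Time on air in seconds for one LoRa packet.

    `cr` is the coding-rate index: 1 means 4/5, 4 means 4/8.
    """
    t_sym = (2 ** sf) / bw_hz                      # symbol duration
    de = 1 if low_data_rate_opt else 0
    ih = 0 if explicit_header else 1
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * ih
    payload_syms = 8 + max(math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0)
    return (preamble_syms + 4.25 + payload_syms) * t_sym


def max_packets_per_hour(airtime_s: float, duty_cycle: float = 0.01) -> int:
    """Upper bound on packets per hour under a regulatory duty-cycle limit."""
    return math.floor(3600 * duty_cycle / airtime_s)


airtime = lora_airtime_s(20)                       # 20-byte heartbeat, SF7, BW 125 kHz
print(f"airtime: {airtime * 1e3:.1f} ms")
print(f"max heartbeats/hour at 1% duty cycle: {max_packets_per_hour(airtime)}")
```

Switching to an implicit header with CRC off shortens the packet to roughly 46 ms, so the choice of framing moves the hourly ceiling noticeably. Either way a once-per-minute heartbeat uses a small fraction of the budget.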
+ +### 6.2 What holds up + +SX1262 for fallback heartbeats is the correct, well-priced choice. The +proposal's "bytes per minute" framing is well within EU868 1% and US915 +dwell-time budgets. + +### 6.3 What to reconsider + +- **LR1121 is not justified for fallback heartbeats.** The + satellite/2.4 GHz capabilities are deployment-shape choices, not + fallback-radio choices. +- **Raw LoRa P2P, not LoRaWAN.** The proposal already implies P2P; this + should be explicit. LoRaWAN gateways add infrastructure cost without + improving fallback reliability, and they don't help direct + node-to-node fallback recovery. +- **LoRa cannot carry raw CSI at any meaningful rate.** SF7 BW125 + raw rate is ~5.5 kbps; ADR-081 `rv_feature_state_t` at 5 Hz is 2.4 + kbps gross (300 B/s), which fits under the raw rate but stays within + duty-cycle limits only if compressed and gated. + Raw ADR-018 frames at 100 KB/s/node are not LoRa-shaped. + +### 6.4 Primary references + +- Semtech SX1262 datasheet via DigiKey: [forum.digikey.com LoRa breakdown](https://forum.digikey.com/t/lora-hardware-breakdown-key-chips-and-modules-for-iot-applications/52243). +- LR1121 / SX1262 / LR2021 comparison: [nicerf.com](https://www.nicerf.com/news/lr2021-vs-sx1262-vs-lr1121.html). +- TTN duty cycle reference: [thethingsnetwork.org](https://www.thethingsnetwork.org/docs/lorawan/duty-cycle/). +- TTN regional EU863-870: [thethingsnetwork.org regional](https://www.thethingsnetwork.org/docs/lorawan/regional-parameters/eu868/). + +--- + +## 7. Solar + Li-ion power-path for 350 mA bursty IoT loads + +### 7.1 What's true in 2026 + +- **TI BQ24074** — small, simple, linear charger; dual input + (DC + USB); has the input-voltage-limit feature that crudely + approximates MPPT for small panels. Adafruit's "Universal" charger + product is built on it. Low silicon cost, no inductors. +- **TI BQ25798** — newer (2025-class) buck-boost charger with **true + Voc-sampling MPPT**, dual-input, supports 1–4S Li-ion, 5 A capability, + 3.6–24 V input range. 
Adafruit launched a development module in May + 2025. +- **Analog Devices LTC4015** — multi-chemistry, two-phase MPPT (15-min + global sweep + 1-second local dither). High-cost, high-capability; + overkill for sub-5 W panels. +- **STMicroelectronics SPV1050** — purpose-built for sub-watt IoT solar (e.g. + energy-harvesting sensors). Constant-voltage-ratio MPPT, 70 mA solar + / 100 mA USB charge limit. Best for *very small* (<1 W) panels and + micro-energy budgets. + +### 7.2 What holds up + +For a 2 W panel and a node whose load bursts to 350 mA, the +BQ24074 (linear) is sufficient. The proposal's choice is fine. + +### 7.3 What to reconsider + +- **MPPT becomes attractive when panel power × variability is high.** + At 2 W, the efficiency delta between linear-with-input-voltage-limit + and true MPPT is on the order of 10–20% in cloudy conditions. For a + 4× harvest-to-load headroom, this is not the binding constraint. +- **If the deployment ever scales to a 5–10 W panel** (e.g., to support + a Pi that wakes more often than 2 minutes/day), BQ25798's MPPT pays + off. +- **A super-cap on the input rail** is cheap insurance against the Pi's + ~350 mA boot inrush; the proposal should consider one. + +### 7.4 Primary references + +- BQ25798 launch coverage (Adafruit, May 2025): [blog.adafruit.com](https://blog.adafruit.com/2025/05/15/eye-on-npi-ti-bq25798-i2c-controlled-1-to-4-cell-5-a-buck-boost-battery-charger-mppt-for-solar-panels-eyeonnpi-digikey-digikey-adafruit/). +- BQ25798 datasheet: [ti.com](https://www.ti.com/lit/ds/symlink/bq25798.pdf). +- BQ24074 product (Adafruit): [adafruit.com/product/4755](https://www.adafruit.com/product/4755). +- SPV1050 application reference: [DFRobot wiki](https://wiki.dfrobot.com/dfr0579/). + +--- + +## 8. Mesh routing alternatives to ESP-WIFI-MESH + +### 8.1 What's true in 2026 + +- **ESP-WIFI-MESH** documents support up to ~1,000 nodes in 25 layers, + with a recommended fan-out of 6/node (hardware AP-mode limit is 10). 
+ Espressif's own newer `esp-mesh-lite` is the lighter, IP-layer-routable + alternative. +- **Thread / OpenThread** — IPv6-native 802.15.4 mesh, self-healing, + designed for 250+ node networks per partition. Strong scalability and + security story. Hardware: ESP32-C6, ESP32-H2, Nordic nRF52840, Silicon + Labs EFR32. +- **Zigbee** — 802.15.4 like Thread, but with a much older application + layer. Scales reasonably to ~100 nodes in practice, with congestion + challenges in dense deployments. +- **BLE Mesh** — managed flooding, optimized for sporadic traffic. Good + for ~50 nodes; not the right shape for always-on infrastructure. + +### 8.2 What holds up + +For < 25-node deployments, ESP-WIFI-MESH (or `esp-mesh-lite`) is the +direct continuation of today's RuView mesh and the proposal's choice is +defensible. + +### 8.3 What to reconsider + +- **For 50–500 node deployments, Thread is the better fit.** It was + designed for that scale; ESP-WIFI-MESH was not. Using Thread *for the + control plane* (TIME_SYNC, ROLE_ASSIGN, CHANNEL_PLAN, HEALTH) while + keeping ADR-018 CSI frames on WiFi is a viable hybrid. +- **The comms MCU choice changes.** ESP-WIFI-MESH stays on ESP32-S3. + Thread/Zigbee/BLE Mesh prefer ESP32-C6 (which has 802.15.4 + WiFi 6) + or a separate radio. The proposal's two-S3 die choice forecloses on + this hybrid; a one-S3 + one-C6 split is worth evaluating. +- **Thread's IPv6-native routing pairs nicely with QUIC.** Both speak + IP; ESP-WIFI-MESH does not (it uses its own L2-style routing and + bridges IP). + +### 8.4 Primary references + +- ESP-WIFI-MESH overview: [docs.espressif.com](https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/esp-wifi-mesh.html). +- esp-mesh-lite: [github.com/espressif/esp-mesh-lite](https://github.com/espressif/esp-mesh-lite). +- Silicon Labs benchmarking: [silabs.com mesh-performance](https://www.silabs.com/wireless/multiprotocol/mesh-performance). 
+- Bluetooth/Thread/Zigbee comparison: [eetimes.com](https://www.eetimes.com/bluetooth-thread-zigbee-mesh-compared/). +- Zigbee vs Matter-over-Thread (2026): [arXiv:2603.04221](https://arxiv.org/html/2603.04221v1). + +--- + +## 9. Pi Zero 2W secure-boot reality + +### 9.1 What's true in 2026 + +- **Raspberry Pi Foundation's official secure-boot path is Pi 4 / Pi 5 + / CM4.** It uses the RPi-bootloader ROM, USB-rooted RSA chain, and + the `usbboot` tooling. There is no equivalent on the Pi Zero 2W + (BCM2710A1). +- **Buildroot does support Pi Zero 2W** (April 2025 defconfig update + uses the same ARM64 `bcm2711_defconfig` as the Pi 4). +- **dm-verity + signed FIT image** is the realistic Pi-Zero-2W path: + buildroot produces a read-only rootfs, dm-verity covers it with a + signed Merkle tree, the boot partition has signed kernel/initramfs. + This delivers integrity but not "secure boot" in the immutable-ROM + sense. +- **A/B partitioning for OTA** is straightforward in buildroot. + `swupdate` and `RAUC` are the well-known frameworks; both work on Pi + Zero 2W. + +### 9.2 What holds up + +The proposal's "buildroot, not Raspberry Pi OS" instinct is correct. +RPi OS does not support secure boot on any Pi. + +### 9.3 What to reconsider + +- **The "Pi 4 + buildroot is the strongest path" line is true but not a + Pi Zero 2W story.** If true secure boot with an immutable ROM-rooted + chain is required, the heavy-compute die should be a CM4 or Pi 5, not + a Pi Zero 2W. +- **For the proposal's deployment shape** (mostly-off Pi, infrequent + wake-ups), dm-verity + signed FIT + A/B is probably enough threat + coverage and avoids the cost of a CM4. Document this as an explicit + tradeoff, not as "the strongest path." +- **The update agent can be `fwupd`** (package-manager-style) or a + self-rolled "update-agent" binary signed by the project key. Either + works; the self-rolled agent fits the homogeneous Rust toolchain better. 
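
The integrity guarantee that dm-verity provides in §9.1 rests on a hash tree over the read-only rootfs, with only the root hash needing a signature. The sketch below illustrates that idea in a few lines. It is not the dm-verity on-disk format (real dm-verity salts its hashes and packs many hashes per tree node), and the function name, block size, and stand-in image are illustrative assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per data block (illustrative choice)


def hash_tree_root(image: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Return the SHA-256 root of a binary hash tree over fixed-size blocks."""
    blocks = [image[i:i + block_size]
              for i in range(0, len(image), block_size)] or [b""]
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2:                  # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


rootfs = bytes(range(256)) * 48             # 12 KiB stand-in for a read-only rootfs
trusted_root = hash_tree_root(rootfs)       # the value that would be signed at build time

tampered = bytearray(rootfs)
tampered[5000] ^= 0x01                      # flip one bit inside the second block

print("clean image verifies:   ", hash_tree_root(rootfs) == trusted_root)
print("tampered image verifies:", hash_tree_root(bytes(tampered)) == trusted_root)
```

Because any single-bit change alters the root, signing the root hash is enough to detect modification of the rootfs. What this scheme cannot do is prove that the verifier itself is trustworthy, which is the gap an immutable ROM-rooted chain closes on the CM4 and Pi 5.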
+ +### 9.4 Primary references + +- Raspberry Pi USB-boot secure-boot example: [github.com/raspberrypi/usbboot](https://github.com/raspberrypi/usbboot/blob/master/secure-boot-example/README.md). +- Raspberry Pi forum on secure boot: [forums.raspberrypi.com 352061](https://forums.raspberrypi.com/viewtopic.php?t=352061). +- Buildroot Pi Zero 2W defconfig (April 2025): [lists.buildroot.org](https://lists.buildroot.org/pipermail/buildroot/2025-April/776753.html). + +--- + +## 10. Cross-cutting takeaways + +A short list of items that affect more than one section: + +1. **The biggest single risk in the proposal is the no_std CSI maturity + gate.** If `esp-csi-rs` (or whatever replaces it under `esp-radio`) + does not match `esp_wifi_set_csi_rx_cb` in capture quality and + ISR-jitter, the sensor-MCU shape collapses back to "C ESP-IDF on the + sensor MCU too" and the value of the split shrinks. +2. **The cost story improves dramatically if the heavy-compute die is + shared across nodes.** "One Pi per cluster of 6" is closer to today's + $9-per-sensor BOM at the per-sensor edge while still adding the + QUIC/ML/secure-boot story at the cluster level. +3. **IEEE 802.11bf-2025's ratification** changes the regulatory and + ecosystem landscape but does not change what off-the-shelf ESP32 + silicon can do today. RuView's pre-standard work (ADR-029, ADR-073, + ADR-081) is well-aligned with the standard's direction; nothing in + the proposal makes it more or less compatible. +4. **The right "comms MCU" might be ESP32-C6 instead of a second S3.** + C6 has 802.15.4 (Thread/Zigbee), WiFi 6, and BLE 5.4. For a + deployment that scales beyond ~25 nodes, the Thread control plane is + a meaningful upgrade. +5. **Power gating the Pi is the load-bearing power decision.** Soft + suspend leaks; hard FET cut does not. The proposal's instinct is + right, but the supercap/transient story has to be designed in. + +--- + +## 11. 
Items where no primary source was found + +This section is required by the project conventions and lists each +non-trivial claim where a primary source could not be located in this +research pass: + +- **Through-multiple-walls CSI pose accuracy at room scale.** Published + papers focus on line-of-sight or single-wall environments. *Mark as + conjecture for now.* +- **WiFlow inference latency on Pi Zero 2W (Cortex-A53).** Estimated at + 50–100 ms; no measurement found. *Mark as conjecture; benchmark + before claiming.* +- **Espressif silicon roadmap for 802.11bf-aware MAC primitives.** No + public announcement from Espressif as of 2026 Q2. *Mark as + conjecture.* +- **Pi Zero 2W gated cold-boot wake-up time under 5 s with the proposed + buildroot image.** Mentioned in the proposal as a constraint, no + measurement found. *Mark as benchmark target.* +- **ESP-WIFI-MESH stable-state tested deployment beyond ~25 nodes.** + Espressif documents 1,000-node theoretical ceilings but published + third-party deployment data at scale is sparse. *Mark as conjecture + pending field test.* + +--- + +## 12. Source list + +(Primary references are inlined per-section. This is the unique +domains list for quick reuse.) 
+ +- IEEE Standards Association — `standards.ieee.org` +- arXiv — `arxiv.org` +- IEEE Xplore — `ieeexplore.ieee.org` +- Espressif documentation — `docs.espressif.com` +- Espressif GitHub — `github.com/espressif` +- esp-rs project — `github.com/esp-rs`, `crates.io/crates/esp-csi-rs`, + `docs.rs/esp-idf-hal` +- Embassy project — `embassy.dev` +- The Things Network — `thethingsnetwork.org` +- Texas Instruments — `ti.com` +- Adafruit — `adafruit.com`, `blog.adafruit.com` +- Buildroot — `lists.buildroot.org` +- Silicon Labs — `silabs.com` +- DigiKey forum — `forum.digikey.com` +- NIST — `nist.gov` +- MDPI Sensors — `mdpi.com` +- EMQ technical blog — `emqx.com` +- Raspberry Pi forum / GitHub — `forums.raspberrypi.com`, + `github.com/raspberrypi/usbboot` +- nicerf comparison guide — `nicerf.com` +- DFRobot wiki — `wiki.dfrobot.com` + +--- + +## Sources + +- [WiFlow: A Lightweight WiFi-based Continuous Human Pose Estimation Network](https://arxiv.org/html/2602.08661) +- [Towards Robust and Realistic Human Pose Estimation via WiFi Signals (DT-Pose)](https://arxiv.org/abs/2501.09411) +- [Graph-based 3D Human Pose Estimation using WiFi Signals (GraphPose-Fi)](https://arxiv.org/abs/2511.19105) +- [IEEE 802.11bf-2025](https://standards.ieee.org/ieee/802.11bf/11574/) +- [An Overview on IEEE 802.11bf: WLAN Sensing](https://arxiv.org/pdf/2207.04859) +- [IEEE 802.11bf NIST page](https://www.nist.gov/publications/ieee-80211bf-enabling-widespread-adoption-wi-fi-sensing) +- [ISAC-Fi: Enabling Full-Fledged Monostatic Sensing Over Wi-Fi](https://arxiv.org/abs/2408.09851) +- [Multistatic ISAC Macro–Micro Cooperation](https://www.mdpi.com/1424-8220/24/8/2498) +- [esp-rs/esp-hal releases](https://github.com/esp-rs/esp-hal/releases) +- [esp-idf-svc CHANGELOG](https://github.com/esp-rs/esp-idf-svc/blob/master/CHANGELOG.md) +- [esp-idf-svc Embassy ISR-safety issue #342](https://github.com/esp-rs/esp-idf-svc/issues/342) +- [esp-csi-rs crate](https://crates.io/crates/esp-csi-rs) +- [Embassy 
Book](https://embassy.dev/book/) +- [esp-tflite-micro](https://github.com/espressif/esp-tflite-micro) +- [ESP32-S3 TFLite Micro practical guide](https://zediot.com/blog/esp32-s3-tensorflow-lite-micro/) +- [ESP32-S3 TinyML Optimization](https://zediot.com/blog/esp32-s3-tinyml-optimization/) +- [quinn QUIC](https://docs.rs/quinn) +- [MQTT-over-QUIC IIoT evaluation (MDPI)](https://www.mdpi.com/1424-8220/21/17/5737) +- [MQTT trends for 2025 (EMQ)](https://www.emqx.com/en/blog/mqtt-trends-for-2025-and-beyond) +- [LoRa SX1262 / LR1121 / LR2021 comparison](https://www.nicerf.com/news/lr2021-vs-sx1262-vs-lr1121.html) +- [LoRa hardware breakdown (DigiKey)](https://forum.digikey.com/t/lora-hardware-breakdown-key-chips-and-modules-for-iot-applications/52243) +- [LoRaWAN duty cycle (TTN)](https://www.thethingsnetwork.org/docs/lorawan/duty-cycle/) +- [LoRaWAN regional EU868 (TTN)](https://www.thethingsnetwork.org/docs/lorawan/regional-parameters/eu868/) +- [BQ25798 launch coverage (Adafruit/DigiKey)](https://blog.adafruit.com/2025/05/15/eye-on-npi-ti-bq25798-i2c-controlled-1-to-4-cell-5-a-buck-boost-battery-charger-mppt-for-solar-panels-eyeonnpi-digikey-digikey-adafruit/) +- [BQ25798 datasheet](https://www.ti.com/lit/ds/symlink/bq25798.pdf) +- [BQ24074 product page](https://www.adafruit.com/product/4755) +- [SPV1050 reference](https://wiki.dfrobot.com/dfr0579/) +- [ESP-WIFI-MESH guide](https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/esp-wifi-mesh.html) +- [esp-mesh-lite](https://github.com/espressif/esp-mesh-lite) +- [Silicon Labs mesh benchmarking](https://www.silabs.com/wireless/multiprotocol/mesh-performance) +- [Bluetooth/Thread/Zigbee comparison (EE Times)](https://www.eetimes.com/bluetooth-thread-zigbee-mesh-compared/) +- [Zigbee vs Matter-over-Thread (arXiv 2603.04221)](https://arxiv.org/html/2603.04221v1) +- [Raspberry Pi USB-boot secure-boot example](https://github.com/raspberrypi/usbboot/blob/master/secure-boot-example/README.md) +- [Raspberry 
Pi forum: secure boot](https://forums.raspberrypi.com/viewtopic.php?t=352061) +- [Buildroot Pi Zero 2 W defconfig (April 2025)](https://lists.buildroot.org/pipermail/buildroot/2025-April/776753.html) +- [Nexmon CSI](https://github.com/seemoo-lab/nexmon_csi)