fix(docker): UDP relay for multi-source ESP32 on Docker Desktop Windows (#502)

Docker Desktop on Windows multiplexes inbound UDP from multiple source
IPs onto a single virtual socket, silently dropping packets from all but
one ESP32 node. This makes multi-node sensing setups appear to work
(WebSocket connects, packets flow on the host) while only one node's CSI
ever reaches the container.

Adds scripts/udp-relay.py (stdlib only) which collapses multi-source UDP
to a single loopback source so Docker's forwarding accepts every packet.
Verified locally: 6 packets from 3 distinct source ports all arrive at
the receiver from a single relay socket.

Updates docker/docker-compose.yml with an inline comment pointing
Windows users at the relay + 5006:5005 mapping. Linux/macOS hosts are
unaffected and need no changes.

Also documents the workaround alongside fixes for #188 (UI 404 from
relative --ui-path) and #438 (boot loop on --edge-tier 1/2 against
pre-v0.4.3.1 firmware) as new sections 9-11 of docs/TROUBLESHOOTING.md.
Supersedes the docs-only PR #413.

Closes #374, #386
Refs #188, #438, #301
rUv 2026-05-17 18:01:44 -04:00 committed by GitHub
parent e22a24714a
commit d33962eff2
GPG key ID: B5690EEEBB952194
3 changed files with 187 additions and 1 deletions

docker/docker-compose.yml

@@ -9,7 +9,18 @@ services:
     ports:
       - "3000:3000"   # REST API
       - "3001:3001"   # WebSocket
-      - "5005:5005/udp" # ESP32 UDP
+      # ESP32 UDP. On Linux/macOS this works with multiple ESP32 nodes out of
+      # the box. On Docker Desktop for Windows, multi-source UDP is collapsed
+      # to one source IP at the WSL/Hyper-V boundary, so all-but-one node's
+      # frames are silently dropped (issue #374, #386).
+      #
+      # Windows workaround: change this to "5006:5005/udp" and run the host
+      # relay so every datagram arrives from the same loopback source:
+      #
+      #   python scripts/udp-relay.py --listen-port 5005 --forward-port 5006
+      #
+      # See docs/TROUBLESHOOTING.md §9 for details.
+      - "5005:5005/udp"
     environment:
       - RUST_LOG=info
       # CSI_SOURCE controls the data source for the sensing server.

docs/TROUBLESHOOTING.md

@@ -109,3 +109,75 @@ ssh thyhack@100.90.238.87
**Symptom:** Plugging into the right USB-C port (when facing the board with USB-C toward you) shows no serial device on the host.

**Fix:** Use the left USB-C port. On most ESP32-S3-DevKitC boards, the left port is the USB-to-UART bridge (CP2102/CH340) used for flashing and serial monitor. The right port is the native USB (USB-JTAG) which requires different drivers and isn't used by the RuView firmware.

---

## 9. Docker Desktop on Windows drops UDP from multiple ESP32 nodes

**Symptom:** Two or more ESP32 nodes are flashed, provisioned, and visibly transmitting on the network: `tcpdump`/Wireshark on the Windows host shows datagrams from every node, but inside the Docker container only one source IP arrives. `/api/v1/sensing/latest` shows a single node and the live UI freezes or only tracks one body. Reported in #374 (4-node bench) and reproduced in #386 (6-node demo, RuView v0.7.0).

**Root cause:** Docker Desktop on Windows runs the engine inside a WSL2 / Hyper-V VM. Inbound UDP from the host LAN is forwarded through `vpnkit` / `vEthernet`, and datagrams from multiple source IPs are multiplexed onto a single virtual socket. The first source IP "wins"; datagrams from any other source are silently dropped at the VM boundary. This is a Docker Desktop limitation, not a sensing-server bug: `host.docker.internal` and `--network host` do not help (host networking is not implemented for the Linux engine on Windows).

**Fix:** Run the bundled UDP relay on the host so every forwarded datagram arrives from the same loopback source IP, which Docker passes through unchanged.

```powershell
# 1. Start the relay (PowerShell or any terminal)
python scripts/udp-relay.py --listen-port 5005 --forward-port 5006

# 2. Edit docker/docker-compose.yml: change the ESP32 UDP mapping from
#      - "5005:5005/udp"
#    to
#      - "5006:5005/udp"

# 3. Bring the stack up
docker compose -f docker/docker-compose.yml up
```
ESP32 nodes still target the host on `--target-ip <host>:5005`, so no firmware re-provisioning is needed. The relay is `scripts/udp-relay.py` (stdlib only, no extra dependencies). Run it with `--verbose` to confirm that every node's source IP appears in the relay log; the container then sees all datagrams arriving from the relay's single ephemeral source port.
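
The collapse-to-one-source behaviour can also be checked on loopback without any hardware. The following is a minimal sketch, not part of the repository: `relay_n` and `smoke_test` are hypothetical helpers that reimplement the relay's receive-and-resend loop in-process rather than invoking `scripts/udp-relay.py`, and every port is OS-assigned.

```python
import socket
import threading


def relay_n(rx, tx, forward_addr, count):
    """Forward exactly `count` datagrams from rx to forward_addr via tx."""
    for _ in range(count):
        data, _src = rx.recvfrom(65535)
        tx.sendto(data, forward_addr)


def smoke_test(num_senders=3, per_sender=2):
    """Send from several source ports through one relay socket.

    Returns (distinct sender ports, sorted payloads, sources seen by receiver).
    """
    # Stand-in for the port Docker would forward into the container.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(5.0)

    # Relay sockets: rx is the listen side, tx is the single forwarding socket.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    rx.settimeout(5.0)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    total = num_senders * per_sender
    worker = threading.Thread(
        target=relay_n, args=(rx, tx, receiver.getsockname(), total), daemon=True
    )
    worker.start()

    # Each sender socket gets its own ephemeral source port, like a separate node.
    senders = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
               for _ in range(num_senders)]
    sender_ports = set()
    for i, s in enumerate(senders):
        for j in range(per_sender):
            s.sendto(f"node{i}-pkt{j}".encode(), rx.getsockname())
        sender_ports.add(s.getsockname()[1])

    payloads, seen_sources = [], set()
    for _ in range(total):
        data, src = receiver.recvfrom(65535)
        payloads.append(data)
        seen_sources.add(src)

    worker.join(timeout=5.0)
    for s in senders + [rx, tx, receiver]:
        s.close()
    return len(sender_ports), sorted(payloads), seen_sources
```

Calling `smoke_test()` should report several sender ports but only one source address at the receiver, which is the property the workaround relies on.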

**Prevention:** Linux and macOS hosts are unaffected; the relay only needs to run on Docker Desktop for Windows. If Docker Desktop ships per-source UDP forwarding (tracked at [docker/for-win#1144](https://github.com/docker/for-win/issues/1144) and related), this workaround can be retired.

**Prior art:** PR #413 (`txhno`) proposed a docs-only writeup of the same workaround; this entry supersedes it.

---

## 10. `404` on the visualization page when running sensing-server

**Symptom:** `sensing-server` starts cleanly, logs `HTTP server listening on http://localhost:3000`, but loading `http://localhost:3000/` (or `/ui/index.html`) returns `404 Not Found`. Reported in #188.

**Root cause:** The default `--ui-path ../../ui` is resolved relative to the process's *current working directory*, not the binary's location. When the binary is launched from anywhere other than `crates/wifi-densepose-sensing-server/`, the relative path doesn't reach the UI assets and Axum's static file handler returns 404.

**Fix:** Pass an absolute UI path, run the binary from the crate directory, or use the Docker image (which bundles the UI under `/app/ui`).

```bash
# Option A: absolute path (recommended for production)
sensing-server --source esp32 --udp-port 5005 --http-port 3000 \
    --ws-port 3001 --ui-path /absolute/path/to/ui

# Option B: run from the crate dir (works for local dev / cargo run)
cd v2/crates/wifi-densepose-sensing-server
cargo run -- --source esp32

# Option C: Docker (no path config needed)
docker compose -f docker/docker-compose.yml up sensing-server
```

**Prevention:** Track future work in #188 to fall back to a path resolved relative to the executable when the cwd-relative path doesn't exist, so the binary works regardless of where it's launched.
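
The fallback described above could look roughly like the sketch below. It is a Python illustration only, not the server's Rust code: `resolve_ui_path` and its parameters are hypothetical names, and the lookup order (working directory first, then the executable's directory) is an assumption about how such a fallback might behave. In a real process the two base directories would come from `Path.cwd()` and the directory of the running binary.

```python
from pathlib import Path
from typing import Optional


def resolve_ui_path(ui_path: str, cwd: Path, exe_dir: Path) -> Optional[Path]:
    """Return the first existing UI directory, or None if nothing matches.

    Hypothetical lookup order (an assumption, not the shipped behaviour):
      1. an absolute path is used as-is;
      2. a relative path is tried against the current working directory;
      3. then against the directory containing the executable.
    """
    candidate = Path(ui_path)
    if candidate.is_absolute():
        return candidate if candidate.is_dir() else None
    for base in (cwd, exe_dir):
        resolved = (base / candidate).resolve()
        if resolved.is_dir():
            return resolved
    # Neither base reaches the assets: this is the case that surfaces as a 404.
    return None
```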

---

## 11. Boot loop on `--edge-tier 1` or `--edge-tier 2`

**Symptom:** ESP32-S3 boots normally with `--edge-tier 0`, but flashing the same firmware with `--edge-tier 1` or `2` produces a boot loop. Serial output reaches `cpu_start` and `heap_init`, then resets repeatedly. Reported in #438 against firmware `v0.4.3.1-esp32-3-g66e2fa083-dir`.

**Root cause:** Edge tiers 1 and 2 enable the on-device DSP pipeline on Core 1. In the affected build, the `edge_dsp` task ran a tight per-frame loop without yielding, so the FreeRTOS task watchdog tripped on Core 1 and panicked. Tier 0 is passthrough only and doesn't activate the pipeline, so the watchdog never fires there.

**Fix:** Flash the [v0.4.3.1-esp32](https://github.com/ruvnet/RuView/releases/tag/v0.4.3.1-esp32) release or later; the DSP task yield fixes have shipped on `main` since the build in the report.

```bash
# Verify what version you're on (look for "App version" in serial output on boot).
# Requires pyserial; replace COM7 with your board's serial port.
python -m serial.tools.miniterm COM7 115200
# Expect: "App version: v0.4.3.1-esp32" or higher
```

If the boot loop persists on a release build, capture a full serial trace including the watchdog backtrace and reopen #438 with the new build hash.

scripts/udp-relay.py (new file)

@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""
UDP relay for Docker Desktop on Windows (issue #374, #386).

Docker Desktop on Windows multiplexes inbound UDP from multiple source IPs to
a single source IP inside the container, which causes packets from all but one
ESP32 node to be silently dropped at the WSL/Hyper-V boundary.

This relay listens on the host, then re-emits each datagram from its own
single socket back to a localhost port that Docker forwards into the
container. Because every forwarded datagram now has the same source IP/port
(the relay's loopback socket), Docker passes them all through.

Usage:
    # Default: listen on host:5005, forward to 127.0.0.1:5006
    # Container should be started with -p 5006:5005/udp.
    python scripts/udp-relay.py

    # Custom ports
    python scripts/udp-relay.py --listen-port 5005 --forward-port 5006

    # Verbose (one line per packet)
    python scripts/udp-relay.py --verbose
"""

import argparse
import socket
import sys
import time


def run_relay(listen_host: str, listen_port: int, forward_host: str,
              forward_port: int, stats_interval: float, verbose: bool) -> int:
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        rx.bind((listen_host, listen_port))
    except OSError as e:
        print(f"udp-relay: failed to bind {listen_host}:{listen_port}: {e}",
              file=sys.stderr)
        return 1

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    forward_addr = (forward_host, forward_port)

    print(f"udp-relay: listening on {listen_host}:{listen_port} "
          f"-> forwarding to {forward_host}:{forward_port}")
    print("udp-relay: collapses multi-source UDP to a single loopback source "
          "so Docker Desktop on Windows forwards every packet (issue #374).")

    sources: dict[tuple[str, int], int] = {}
    total = 0
    last_stats = time.monotonic()

    try:
        while True:
            data, src = rx.recvfrom(65535)
            tx.sendto(data, forward_addr)
            total += 1
            sources[src] = sources.get(src, 0) + 1

            if verbose:
                print(f"udp-relay: {src[0]}:{src[1]} -> "
                      f"{forward_host}:{forward_port} ({len(data)}B)")

            now = time.monotonic()
            if now - last_stats >= stats_interval:
                print(f"udp-relay: forwarded {total} pkts from "
                      f"{len(sources)} sources in last {stats_interval:.0f}s")
                sources.clear()
                total = 0
                last_stats = now
    except KeyboardInterrupt:
        print("udp-relay: stopping")
        return 0
    finally:
        rx.close()
        tx.close()


def main() -> int:
    p = argparse.ArgumentParser(description=__doc__,
                                formatter_class=argparse.RawDescriptionHelpFormatter)
    p.add_argument("--listen-host", default="0.0.0.0",
                   help="Host interface to bind (default: 0.0.0.0)")
    p.add_argument("--listen-port", type=int, default=5005,
                   help="Port the ESP32 nodes send to (default: 5005)")
    p.add_argument("--forward-host", default="127.0.0.1",
                   help="Where to forward packets (default: 127.0.0.1)")
    p.add_argument("--forward-port", type=int, default=5006,
                   help="Port Docker maps into the container (default: 5006)")
    p.add_argument("--stats-interval", type=float, default=10.0,
                   help="Seconds between stats lines (default: 10)")
    p.add_argument("--verbose", action="store_true",
                   help="Log every forwarded packet")
    args = p.parse_args()
    return run_relay(args.listen_host, args.listen_port, args.forward_host,
                     args.forward_port, args.stats_interval, args.verbose)


if __name__ == "__main__":
    sys.exit(main())