Ruview/docs/research/05-sublinear-mincut-algorithms.md
rUv 341d9e05a8
ruv-neural: publish 11 crates to crates.io — full implementation, no stubs
* Add temporal graph evolution & RuVector integration research

GOAP Agent 8 output: 1,528-line SOTA research document covering temporal
graph models (TGN, JODIE, DyRep), RuVector graph memory design, mincut
trajectory tracking with Kalman filtering, event detection pipelines,
compressed temporal storage, cross-room transition graphs, and a 5-phase
integration roadmap.

Part of RF Topological Sensing research swarm (10 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add transformer architectures for graph sensing research

GOAP Agent 4 output: 896-line SOTA document covering Graph Transformers
(Graphormer, SAN, GPS, TokenGT), Temporal Graph Transformers (TGN, TGAT,
DyRep), ViT for RF spectrograms, transformer-based mincut prediction,
positional encoding for RF graphs, foundation models for RF sensing, and
efficient edge deployment with INT8 quantization.

Part of RF Topological Sensing research swarm (10 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add attention mechanisms for RF sensing research

GOAP Agent 3 output: 1,110-line document covering GAT for RF graphs,
self-attention for CSI sequences, cross-attention multi-link fusion,
attention-weighted differentiable mincut, spatial node attention,
antenna-level subcarrier attention, and efficient attention variants
(linear, sparse, LSH, S4/Mamba). 8 ASCII architecture diagrams.

Part of RF Topological Sensing research swarm (10 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add sublinear mincut algorithms research

GOAP Agent 5 output: 698-line document covering classical mincut complexity,
sublinear approximation (sampling, sparsifiers), dynamic mincut with lazy
recomputation hybrid, streaming sketch algorithms, Benczur-Karger
sparsification, local partitioning (PageRank-guided cuts), randomized
methods reliability analysis, and Rust implementation with const-generic
RfGraph, zero-alloc Stoer-Wagner, SIMD batch updates.

Part of RF Topological Sensing research swarm (10 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add CSI edge weight computation research

GOAP Agent 2 output: ~700-line document covering CSI feature extraction,
coherence metrics (cross-correlation, mutual information, phasor coherence),
multipath stability scoring (MUSIC, ESPRIT, ISTA), temporal windowing
(EMA, Welford, Kalman), noise robustness (phase noise, AGC, clock drift),
edge weight normalization, and implementation architecture showing 32KB
memory for 120 edges within ESP32-S3 capability.

Part of RF Topological Sensing research swarm (10 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add contrastive learning for RF coherence research

GOAP Agent 7 output: 1,226-line document covering SimCLR/MoCo/BYOL for CSI,
AETHER-Topo dual-head extension, coherence boundary detection with multi-scale
analysis, delta-driven updates (2-12x efficiency), self-supervised pre-training
protocol, triplet networks for 5-state edge classification, and MERIDIAN
cross-environment transfer with EWC continual learning.

Part of RF Topological Sensing research swarm (12 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add resolution and spatial granularity analysis research

GOAP Agent 9 output: 1,383-line document covering Fresnel zone analysis,
node density vs resolution (16-node/5m room → 30-60cm), Cramer-Rao lower
bounds with Fisher Information Matrix, graph cut resolution theory,
multi-frequency enhancement (6cm coherent dual-band limit), RF tomography
comparison, experimental validation protocols, and resolution scaling laws
(8.8cm theoretical limit).

Part of RF Topological Sensing research swarm (12 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add RF graph theory and minimum cut foundations research

GOAP Agent 1 output: Graph-theoretic foundations covering max-flow/min-cut
for RF (Ford-Fulkerson, Stoer-Wagner, Karger), RF as dynamic graph with
CSI coherence weights, topological change detection via Fiedler vector and
Cheeger inequality, dynamic graph algorithms, comparison to classical RF
sensing, formal mathematical framework, and 9 open research questions.

Part of RF Topological Sensing research swarm (12 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add ESP32 mesh hardware constraints research

GOAP Agent 6 output: ESP32 CSI capabilities (52/114 subcarriers), 16-node
mesh topology with 120 edges, TDM synchronized sensing (3ms slots),
computational budget (Stoer-Wagner uses 0.07% of one core), channel hopping,
power analysis (0.44W/node), dual-core firmware architecture, and edge vs
server computing with 100x data reduction on-device.

Part of RF Topological Sensing research swarm (12 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add system architecture and prototype design research

GOAP Agent 10 output: End-to-end architecture with pipeline diagrams,
existing crate integration mapping, new rf_topology module design (DDD
aggregate roots), 100ms latency budget breakdown, 3-phase prototype plan
(4-node POC → 16-node room → 72-node multi-room), benchmark design with
8 metrics, ADR-044 draft, and Rust trait definitions (EdgeWeightComputer,
TopologyGraph, MinCutSolver, BoundaryInterpolator).

Part of RF Topological Sensing research swarm (12 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add quantum sensing and quantum biomedical research documents

Agent 11: Quantum-level sensors (729 lines) — NV centers, SQUIDs, Rydberg
atoms, quantum illumination, quantum graph theory (walks, spectral, QAOA),
hybrid classical-quantum architecture, quantum ML (VQC, kernels, reservoir
computing), NISQ applications (D-Wave, VQE), hardware roadmap.

Agent 12: Quantum biomedical sensing (827 lines) — whole body biomagnetic
mapping, neural field imaging without electrodes, circulation sensing,
cellular EM signaling, non-contact diagnostics, coherence-based diagnostics
(disease as coherence breakdown), neural interfaces, multimodal observatory,
room-scale ambient health monitoring, graph-based biomedical analysis.

Part of RF Topological Sensing research swarm (12 agents).

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add research index synthesizing all 12 documents (14,322 lines)

Master index for RF Topological Sensing research compendium covering:
graph theory foundations, CSI edge weights, attention mechanisms,
transformers, sublinear algorithms, ESP32 hardware, contrastive learning,
temporal graphs, resolution analysis, system architecture, quantum sensors,
and quantum biomedical sensing. Includes key findings, proposed ADRs
(044, 045), and 5-phase implementation roadmap.

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add SOTA neural decoding landscape and 10 application domains research

- Doc 21: Comprehensive SOTA map (2023-2026) of brain sensors, decoders,
  and visualization systems with RuVector/mincut positioning analysis
- Doc 22: Ten application domains for brain state observatory including
  disease detection, BCI, cognitive monitoring, mental health diagnostics,
  neurofeedback, dream reconstruction, cognitive research, HCI, wearables,
  and brain network digital twins with strategic roadmap

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add NV diamond neural magnetometry research document (13/22)

Comprehensive 600+ line document covering NV center physics, neural
magnetic field sources, sensor architecture, SQUID comparison, signal
processing pipeline, RuVector integration, and development roadmap.

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add ruv-neural workspace Cargo.toml with 12 crate definitions

Workspace structure for the rUv Neural brain topology analysis system.
12 mix-and-match crates with shared dependencies including RuVector
integration, petgraph, rustfft, and WASM/ESP32 support.

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add ruv-neural crate ecosystem — 12 mix-and-match crates (WIP)

Initial implementation of the rUv Neural brain topology analysis system:
- ruv-neural-core: Core types, traits, errors, RVF format (compiles)
- ruv-neural-sensor: NV diamond, OPM, EEG sensor interfaces (in progress)
- ruv-neural-signal: DSP, filtering, spectral, connectivity (in progress)
- ruv-neural-graph: Brain connectivity graph construction (in progress)
- ruv-neural-mincut: Dynamic minimum cut topology analysis (in progress)
- ruv-neural-embed: RuVector graph embeddings (in progress)
- ruv-neural-memory: Persistent neural state memory + HNSW (compiles)
- ruv-neural-decoder: Cognitive state classification + BCI (in progress)
- ruv-neural-esp32: ESP32 edge sensor integration (compiles)
- ruv-neural-wasm: WebAssembly browser bindings (in progress)
- ruv-neural-viz: Visualization + ASCII rendering (in progress)
- ruv-neural-cli: CLI tool (in progress)

Agents still writing remaining modules. Next: fix compilation, tests, push.

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Fix ruv-neural crate compilation: all 12 crates build and 1200+ tests pass

- Fix node2vec.rs type inference error (Vec<_> → Vec<Vec<f64>>)
- Fix artifact.rs with full filter-based detection implementations
- Fix signal crate ConnectivityMetric re-export and trait method names
- Fix embed crate EmbeddingGenerator trait implementations
- Complete spectral, topology, and node2vec embedders with tests
- Complete preprocessing pipeline with sequential stage processing
- All workspace crates compile cleanly, 0 test failures

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* Add ruv-neural-cli README

https://claude.ai/code/session_01DGUAowNScGVp88bK2eiuRv

* fix: convert desktop icons from RGB to RGBA for Tauri build

Tauri's generate_context!() macro requires RGBA PNG icons. All 5 icon
files (32x32.png, 128x128.png, 128x128@2x.png, icon.icns, icon.ico)
were RGB-only, causing a proc macro panic on Linux builds.

Fixes #200

Co-Authored-By: claude-flow <ruv@ruv.net>

* Add Subcarrier Manifold and Vitals Oracle modules for 3D visualizations

- Implemented Subcarrier Manifold to visualize amplitude data as a 3D surface with height and age attributes.
- Created Vitals Oracle to represent vital signs using toroidal rings and particle trails, incorporating breathing and heart rate dynamics.
- Both modules utilize Three.js for rendering and include custom shaders for visual effects.

* feat: complete ruv-neural implementation — physics models, security, witness verification

Replace all stubs/mocks with production physics-based signal models:
- NV Diamond: ODMR Lorentzian dip, 1/f pink noise (Voss-McCartney), brain oscillations
- OPM: SERF-mode, 50/60Hz powerline harmonics, full cross-talk compensation
  via Gaussian elimination with partial pivoting
- EEG: 5 frequency bands, eye blink artifacts (Fp1/Fp2), muscle artifacts,
  impedance-based thermal noise floor
- ESP32 ADC: ring-buffer reader with calibration signal generator, i16 clamp

Security hardening (SEC-001 through SEC-005):
- RVF bounded allocation (16MB metadata, 256MB payload)
- sample_rate validation (>0, finite)
- Signal NaN/Inf rejection
- ADC resolution_bits overflow clamp
- HNSW HashSet visited tracking + bounds checks

Performance optimizations (PERF-001 through PERF-005):
- 67x fewer FFTs via pre-computed analytic signals
- VecDeque O(1) eviction in memory store
- Thread-local FFT planner caching
- BrainGraph::validate() for edge/weight integrity
- Eigenvalue convergence early termination

Ed25519 witness verification system:
- 41 capability attestations across all 12 crates
- SHA-256 digest + Ed25519 signature
- CLI commands: `witness --output` and `witness --verify`

README: ethics warning, hardware parts list (AliExpress), assembly instructions

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: add crates.io badges and install instructions to ruv-neural README

Add version badges linking to each published crate on crates.io,
cargo add instructions, and crate search link in the Crate Map table.

Co-Authored-By: claude-flow <ruv@ruv.net>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-09 10:52:24 -04:00

44 KiB

Sublinear and Near-Linear Time Minimum Cut Algorithms for Real-Time RF Sensing

Date: 2026-03-08 Context: RuVector v2.0.4 / RuvSense multistatic mesh — 16 ESP32 nodes, 120 link edges, 20 Hz update rate Scope: Algorithmic foundations for maintaining minimum cuts on dynamic RF link graphs under real-time constraints


Abstract

A 16-node ESP32 multistatic mesh generates a complete weighted graph on C(16,2) = 120 edges, where each edge weight encodes the RF channel state information (CSI) attenuation or coherence between two nodes. Human bodies, moving objects, and environmental changes continuously perturb these weights. The minimum cut of this graph partitions the sensing field into regions of minimal RF coupling — directly useful for person segmentation, occupancy counting, and anomaly detection.

At 20 Hz update rate, each mincut computation has a budget of 50 ms wall-clock time. On a resource-constrained coordinator (ESP32-S3 at 240 MHz or a modest ARM host), classical algorithms are either too slow or carry too much overhead. This document surveys the algorithmic landscape from classical exact methods through sublinear approximations, dynamic maintenance, streaming, and sparsification — evaluating each for applicability to the RuVector RF sensing pipeline.

Throughout, V = 16 and E = 120 (complete graph). While these are small by general graph algorithm standards, the constraint is not problem size but update frequency and platform limitations. The goal is not asymptotic superiority but practical per-frame latency under 2 ms on the target hardware.


1. Classical Mincut Complexity

1.1 Problem Definition

Given an undirected weighted graph G = (V, E, w) with w: E -> R+, the global minimum cut is a partition of V into two non-empty sets (S, V\S) minimizing the total weight of edges crossing the partition:

mincut(G) = min_{S subset V, S != empty, S != V} sum_{(u,v) in E, u in S, v in V\S} w(u,v)

For RF sensing, w(u,v) typically represents the CSI coherence or signal attenuation between nodes u and v. A minimum cut identifies the partition where RF coupling is weakest — corresponding to physical obstructions (human bodies, walls, large objects) that attenuate the RF field.

1.2 Stoer-Wagner Algorithm (1997)

The Stoer-Wagner algorithm computes exact global minimum cut in O(VE + V^2 log V) time using a sequence of V-1 minimum s-t cut computations, each performed via a maximum adjacency ordering.

Procedure:

  1. Pick arbitrary start vertex.
  2. Build maximum adjacency ordering: greedily add the vertex most tightly connected to the current set.
  3. The last two vertices (s, t) in the ordering define a cut. Record its weight.
  4. Merge s and t, reducing |V| by 1.
  5. Repeat V-1 times. Return the minimum recorded cut.

Complexity for our graph:

  • V = 16, E = 120
  • O(VE + V^2 log V) = O(16 * 120 + 256 * 4) = O(2944)
  • Per iteration: O(E + V log V) using a priority queue.

Practical assessment: For V = 16, Stoer-Wagner executes 15 phases, each scanning at most 120 edges. Total work is roughly 1,800 edge scans plus priority queue operations. On modern hardware this completes in microseconds. On ESP32 at 240 MHz, estimated wall time is 50-200 us — well within budget.

This is the baseline. The algorithm is exact, deterministic, and simple to implement. For V = 16, classical complexity is not actually the bottleneck.

1.3 Karger's Randomized Contraction (1993)

Karger's algorithm randomly contracts edges, merging endpoints, until two vertices remain. The surviving edges form a cut. Repeating O(V^2 log V) times yields the minimum cut with high probability.

Single contraction round: O(E) time using union-find. Total for high-probability success: O(V^2 log V * E) = O(V^2 E log V). With the improved implementation: O(V^2 log^3 V).

For our graph:

  • Single contraction: O(120) ~ trivial
  • Repetitions needed: O(256 * 4) ~ 1024 for 1/V failure probability
  • Total: ~120,000 edge operations

Practical assessment: Karger is elegant but the constant factors from repeated trials make it slower than Stoer-Wagner for small V. Its value emerges at scale (V > 1000) where the randomized approach avoids worst-case deterministic behavior.

1.4 Karger-Stein Recursive Contraction (1996)

Karger-Stein improves on Karger by contracting only to V/sqrt(2) vertices, then recursing on two independent copies. This reduces the repetition count from O(V^2) to O(V^2 / 2^depth), yielding O(V^2 log V) total time.

For our graph:

  • O(256 * 4) = O(1024) total work — negligible
  • Recursion depth: O(log V) = 4 levels

Practical assessment: At V = 16, the recursion tree has ~4 levels with branching factor 2, yielding ~16 leaf problems each of size ~4. Total work is dominated by the initial contraction steps. Fast in practice but adds implementation complexity over Stoer-Wagner for no real benefit at this scale.

1.5 Why Classical Algorithms Are Sufficient (and Insufficient)

For a static 16-node graph, all classical algorithms complete in microseconds. The real challenge is not single-computation cost but:

  1. Update frequency: At 20 Hz with 120 edges changing per frame, we need incremental updates, not full recomputation.
  2. Batch processing: If computing mincut is part of a larger pipeline (signal processing, pose estimation), even microseconds add up across multiple graph operations per frame.
  3. Scaling considerations: Future deployments may use 32, 64, or 128 nodes. At 128 nodes, E = 8128 edges, and Stoer-Wagner requires O(128 * 8128 + 16384 * 7) ~ O(1.15M) operations per frame.
  4. Multi-cut requirements: We often need not just the global mincut but multiple minimum cuts, Gomory-Hu trees, or k-way partitions.

The subsequent sections address these challenges with algorithms designed for dynamic, streaming, and approximate settings.


2. Sublinear Approximation

2.1 Motivation

A sublinear-time algorithm runs in o(m) time, where m = |E|. For our graph with m = 120, "sublinear in m" means fewer than 120 edge reads. This is useful when:

  • Edge weights are expensive to compute (each requires CSI processing).
  • We need a quick approximate answer before the full CSI frame is processed.
  • The graph is much larger (future deployments).

2.2 Random Edge Sampling for Cut Estimation

The simplest sublinear approach: sample k edges uniformly at random, compute their total weight, and estimate the mincut value.

Karger's sampling theorem (1994): If we sample each edge independently with probability p = O(log V / (epsilon^2 * lambda)), where lambda is the minimum cut value, then with high probability every cut in the sampled graph has value within (1 +/- epsilon) of its value in the original graph, after scaling by 1/p.

For our setting:

  • lambda ~ O(sum of weakest node's incident edges)
  • For epsilon = 0.1 and V = 16: p ~ O(log(16) / (0.01 * lambda))
  • If lambda ~ 10 (in normalized units), p ~ O(40), meaning we sample ~40 of 120 edges.

This achieves a (1 +/- 0.1)-approximation by reading only 1/3 of the edges.

Algorithm:

1. Sample each edge with probability p
2. Run exact mincut on the sampled graph (Stoer-Wagner)
3. Scale result by 1/p

The key insight: Stoer-Wagner on a sparse sample with ~40 edges and 16 vertices runs in O(16 * 40) = O(640) operations — faster than on the full graph, and with provable approximation guarantees.

2.3 Cut Sparsifiers

A cut sparsifier H of G is a sparse graph on the same vertex set where every cut value is preserved within (1 +/- epsilon). Benczur and Karger (1996) showed that O(V log V / epsilon^2) edges suffice.

For V = 16, epsilon = 0.1: O(16 * 4 / 0.01) = O(6400) edges. This exceeds our actual edge count of 120, so sparsification provides no benefit at this scale. However, it becomes critical for:

  • V = 64: E = 2016, sparsifier needs ~O(2560) edges — marginal savings
  • V = 128: E = 8128, sparsifier needs ~O(5120) edges — 37% reduction
  • V = 256: E = 32640, sparsifier needs ~O(10240) edges — 69% reduction

2.4 Spectral Sparsification

Spielman and Srivastava (2011) showed that spectrally sparsifying the graph Laplacian preserves all cut values. Their algorithm:

  1. Compute effective resistances R_e for all edges.
  2. Sample each edge with probability proportional to w_e * R_e.
  3. Reweight sampled edges to preserve expected cut values.

Result: O(V log V / epsilon^2) edges suffice, same as combinatorial sparsification, but the spectral guarantee is stronger — it preserves the entire spectrum of the Laplacian, not just cut values.

For RF sensing: The graph Laplacian eigenvectors correspond to spatial modes of the RF field. Spectral sparsification preserves these modes, which is useful beyond mincut — it preserves the spatial structure needed for tomography and field modeling (RuvSense field_model.rs).

2.5 Query-Based Sublinear Algorithms

Recent work by Rubinstein, Schramm, and Weinberg (2018) achieves O(V polylog V)-time algorithms that query the graph adjacency/weight oracle rather than reading all edges. For V = 16, this gives O(16 * 16) = O(256) queries — a 2x reduction over reading all 120 edges (not useful at this scale, but relevant at V = 256 where it reduces from 32640 to ~4000 queries).


3. Dynamic Mincut

3.1 Problem Setting

In the dynamic setting, the graph undergoes edge insertions, deletions, and weight updates, and we must maintain the minimum cut value (and optionally the cut itself) after each update.

For RF sensing, every CSI frame update changes all 120 edge weights simultaneously. This is a batch-dynamic setting: 120 updates arrive together, then we query the mincut.

3.2 Thorup's Dynamic Connectivity (2000)

Thorup showed that edge connectivity (unweighted mincut) can be maintained in O(log V * (log log V)^2) amortized time per edge update. For weighted graphs, this extends to O(polylog V) time per update with some caveats.

For our setting:

  • 120 updates per frame
  • O(120 * polylog(16)) = O(120 * ~16) = O(1920) amortized work per frame
  • Versus full recomputation: O(2944) with Stoer-Wagner

The savings are modest at V = 16 but the amortized bound means some frames are nearly free (when the mincut does not change) while others pay more.

3.3 Fully Dynamic (1+epsilon)-Approximate Mincut

Goranci, Henzinger, and Thorup (2018) maintain a (1+epsilon)-approximate minimum cut under edge insertions and deletions in O(polylog(V)/epsilon^2) amortized update time.

Key ideas:

  1. Maintain a hierarchy of cut sparsifiers at different granularities.
  2. When an edge weight changes, update only the affected sparsifier levels.
  3. The mincut value is read from the coarsest level.

For our setting:

  • Update time: O(log^3(16) / 0.01) ~ O(6400) per edge update with epsilon = 0.1
  • Batch of 120 updates: O(768,000) — worse than recomputation!

This reveals an important practical point: dynamic algorithms have excellent asymptotic behavior but carry large constant factors that dominate at small V. For V = 16, full recomputation with Stoer-Wagner is faster than any known dynamic algorithm.

3.4 When Dynamic Algorithms Win

Dynamic algorithms become beneficial when:

  1. V > 1000 and E > 100,000 — amortized polylog update beats O(VE).
  2. Sparse updates — only a few edges change per frame, not all 120.
  3. Incremental weight changes — weights change by small deltas, allowing incremental sparsifier updates.

For our RF mesh, a practical middle ground is:

Threshold-filtered updates: Only re-process edges whose weight changed by more than delta from the previous frame. If the RF field is relatively stable (people move slowly relative to 20 Hz), most edges change minimally. If only 10-20 edges exceed the delta threshold per frame, a partial Stoer-Wagner restart or local repair becomes attractive.

3.5 Hybrid Approach: Lazy Recomputation

Algorithm: Lazy-Mincut-Update
Input: Previous mincut (S*, V\S*), new edge weights w'
Output: Updated mincut

1. Compute delta = sum of |w'(e) - w(e)| for edges crossing (S*, V\S*)
2. If delta < epsilon * mincut_value:
     Return (S*, V\S*) unchanged  // Cut value changed negligibly
3. Compute crossing_weight = sum w'(e) for edges crossing (S*, V\S*)
4. If crossing_weight == mincut_value +/- epsilon:
     Update mincut_value = crossing_weight  // Same cut, adjusted value
     Return (S*, V\S*)
5. Else:
     Run full Stoer-Wagner on G' = (V, E, w')  // Recompute
     Return new mincut

In practice, steps 1-4 handle >90% of frames (the minimum cut partition is spatially stable — people do not teleport), and full recomputation is triggered only when someone crosses the cut boundary. This reduces average per-frame cost to O(E) = O(120) for crossing-weight evaluation plus occasional O(VE) recomputation.


4. Streaming Algorithms

4.1 Motivation

In the streaming model, edges arrive one at a time (or in a stream from multiple ESP32 nodes), and we must estimate the mincut using limited working memory — ideally O(V polylog V) space rather than O(V^2).

This is relevant when:

  • CSI data arrives asynchronously from 16 nodes via TDM (Time Division Multiplexing, see ADR-022).
  • The coordinator cannot buffer all 120 edge weights before computing.
  • Memory is constrained (ESP32-S3 has 512 KB SRAM).

4.2 Single-Pass Streaming

Ahn, Guha, and McGregor (2012) showed that a single-pass streaming algorithm can compute a (1+epsilon)-approximate mincut using O(V polylog V / epsilon^2) space by maintaining linear sketches of the graph.

Sketch construction:

  1. For each vertex v, maintain a sparse random linear combination of its incident edge weights.
  2. The sketch has size O(log^2 V / epsilon^2) per vertex.
  3. From sketches, approximate the cut value for any partition.

For our setting:

  • Space per vertex: O(16 / 0.01) = O(1600) numbers ~ 6.4 KB per vertex
  • Total space: O(16 * 6400) = O(102,400) numbers ~ 400 KB
  • This fits in ESP32-S3 SRAM but leaves little room for other state.

4.3 Multi-Pass Streaming

With k passes over the stream, accuracy improves. Specifically, O(log V) passes suffice to compute exact mincut with O(V polylog V) space.

Practical algorithm (2-pass):

Pass 1: Build a cut sparsifier by sampling edges with probability
         proportional to estimated effective resistance.
Pass 2: Refine the sparsifier using importance sampling based on
         first-pass estimates.
Result: (1+epsilon)-approximate mincut from the refined sparsifier.

For our TDM protocol, each complete CSI scan across all 16 nodes constitutes one "pass." A two-pass approach means using two consecutive TDM cycles (100 ms total at 20 Hz) to build and refine the sparsifier — acceptable if we can tolerate 100 ms latency on the initial estimate.

4.4 Turnstile Streaming

In the turnstile model, edge weights can increase and decrease over time. This matches our RF sensing setting where CSI coherence fluctuates.

Ahn, Guha, and McGregor (2013) extended their sketching approach to the turnstile model. The key: L0-sampling sketches allow recovering edges from the sketch difference, enabling dynamic cut estimation.

Space complexity: O(V * polylog(V) / epsilon^2) — same as the insertion-only case.

For RF sensing: This means we can maintain a running sketch that processes CSI weight updates as they arrive from each node, without needing to store the full graph. The sketch naturally accommodates the continuous weight fluctuations of the RF field.

4.5 Sketch-Based Architecture for ESP32 Mesh

ESP32 Node i:
  - Computes CSI for links to all other nodes
  - Constructs local sketch S_i of incident edges
  - Transmits S_i to coordinator (compact: ~400 bytes)

Coordinator:
  - Receives S_1, ..., S_16
  - Merges sketches: S = merge(S_1, ..., S_16)
  - Extracts approximate mincut from S
  - Latency: dominated by network round-trip, not computation

This architecture distributes the sketching computation across nodes, reducing coordinator load and enabling approximate mincut estimation even when some node reports are delayed or missing.


5. Graph Sparsification

5.1 Benczur-Karger Cut Sparsification (1996)

Theorem: For any undirected weighted graph G with V vertices, there exists a subgraph H with O(V log V / epsilon^2) edges such that for every cut (S, V\S):

(1 - epsilon) * w_G(S, V\S) <= w_H(S, V\S) <= (1 + epsilon) * w_G(S, V\S)

Construction algorithm:

  1. For each edge e, compute its strong connectivity c_e (the maximum number of edge-disjoint paths between its endpoints using edges of weight >= w_e).
  2. Sample each edge e with probability p_e = min(1, C * log V / (epsilon^2 * c_e)) for an appropriate constant C.
  3. Reweight sampled edges: w_H(e) = w_G(e) / p_e.

Computing strong connectivity: This requires O(VE) time using max-flow computations — as expensive as solving mincut directly. However, approximate strong connectivity can be computed in O(E log^3 V) time using the sparsification itself (bootstrapping).

5.2 Application to RF Graph

For our 16-node RF graph:

Static sparsification is unnecessary since E = 120 is already small. However, sparsification is useful as a noise filter:

  1. Edges with high strong connectivity (nodes connected through many independent high-weight paths) are structurally important.
  2. Edges with low strong connectivity may represent noisy or unreliable RF links.
  3. Sampling by strong connectivity naturally de-emphasizes unreliable links.

Practical algorithm for RF:

1. Compute approximate connectivity for each edge using 2-3 rounds
   of random spanning tree sampling.
2. Mark edges with connectivity below threshold as "unreliable."
3. Run mincut on the subgraph of reliable edges.
4. If mincut uses an unreliable edge, recompute on full graph.

This typically reduces effective edge count from 120 to 60-80 edges, providing a 1.5-2x speedup on Stoer-Wagner.

5.3 Maintaining Sparsifiers Under Updates

When edge weights change (every CSI frame), the sparsifier must be updated. Naive recomputation defeats the purpose. Efficient approaches:

Incremental update (Abraham, Durfee, et al. 2016):

  • Maintain strong connectivity estimates incrementally.
  • When an edge weight changes by more than a (1+epsilon) factor, update its sampling probability and re-decide inclusion.
  • Amortized cost: O(polylog V) per edge update.

Batch update strategy for RF:

Every frame:
  1. Receive new edge weights w' from CSI processing.
  2. For each edge e in sparsifier:
     a. If |w'(e) - w(e)| / w(e) > epsilon: mark for re-evaluation.
  3. Re-evaluate marked edges (update sampling decision).
  4. Run mincut on updated sparsifier.

Expected re-evaluations per frame: 10-30 edges (most weights change incrementally). Mincut on sparsifier with ~70 edges and 16 vertices: O(16 * 70) = O(1120) operations.

5.4 Spectral Sparsification and the Laplacian

The graph Laplacian L_G of the RF mesh encodes the complete spatial coupling structure. Its eigenvalues directly relate to cut values:

  • lambda_2 (algebraic connectivity) = lower bound on normalized mincut
  • The Fiedler vector (eigenvector of lambda_2) approximates the mincut partition.

Spectral sparsification preserves all eigenvalues, meaning:

(1-epsilon) * L_G <= L_H <= (1+epsilon) * L_G  (Loewner order)

This is strictly stronger than cut sparsification and preserves:

  • Cut values (for mincut computation)
  • Effective resistances (for tomography in field_model.rs)
  • Random walk distributions (for tracking in pose_tracker.rs)
  • Heat kernel (for gesture recognition in gesture.rs)

For the RuvSense pipeline, a spectral sparsifier serves double duty: mincut computation and spatial field modeling.


6. Local Partitioning

6.1 Motivation

Classical mincut algorithms are global — they examine the entire graph. Local partitioning algorithms find cuts by exploring only a small region of the graph, running in time proportional to the size of the smaller side of the cut rather than the full graph.

For RF sensing, this is valuable when we want to detect a localized obstruction (a person standing in one area) without scanning the entire 120-edge graph.

6.2 Spielman-Teng Local Partitioning (2004)

Spielman and Teng introduced local graph partitioning via truncated random walks. Their algorithm:

  1. Start a random walk from a seed vertex v.
  2. At each step, compute the walk distribution vector p.
  3. Find a "sweep cut" along the sorted p-values: vertices sorted by p(u) / degree(u), sweep through finding the cut with best conductance.
  4. Terminate when the walk has spread to cover O(|S|) vertices, where |S| is the target small side.

Complexity: O(|S| * polylog V / phi), where phi is the target conductance. The algorithm never examines vertices far from the seed.

For RF sensing:

  • If we know (or suspect) a person is near nodes {3, 7, 8}, seed the walk from these nodes.
  • The walk explores their neighbors (all other nodes, since the graph is complete), but weights ensure it concentrates on the most affected region.
  • Expected work: O(4 * polylog(16) / phi) ~ O(64/phi). For phi = 0.3, this is ~200 operations.

6.3 Personalized PageRank Local Cuts

Andersen, Chung, and Lang (2006) refined local partitioning using personalized PageRank (PPR). The algorithm:

ApproximatePPR(seed, alpha, epsilon):
  p = zero vector  // PPR estimate
  r = indicator(seed)  // residual

  While exists v with r(v) / degree(v) > epsilon:
    Push(v):
      p(v) += alpha * r(v)
      For each neighbor u of v:
        r(u) += (1 - alpha) * r(v) / (2 * degree(v))
      r(v) = (1 - alpha) * r(v) / 2

  Return p

Properties:

  • Runs in O(1 / (alpha * epsilon)) time, independent of graph size.
  • The resulting p vector, when sweep-cut, produces a low-conductance cut near the seed.
  • alpha controls locality: higher alpha = more local, lower alpha = more global.

For RF sensing:

  • alpha = 0.15 (standard PageRank damping) produces semi-global cuts suitable for person segmentation.
  • alpha = 0.5 produces highly local cuts suitable for detecting which specific links are attenuated.
  • epsilon = 0.01 gives high accuracy with ~O(1/(0.15*0.01)) = O(667) push operations.

6.4 Integration with RuvSense Pose Tracker

The pose_tracker.rs module maintains a Kalman-filtered estimate of person positions. When the tracker predicts a person near certain nodes, local partitioning can quickly confirm or refine the detection:

1. Tracker predicts person near nodes {5, 9, 12}.
2. Run PPR from each predicted node with alpha = 0.3.
3. Sweep-cut the PPR vectors to find local cuts.
4. If local cut conductance < threshold:
   Person confirmed at predicted location.
5. Feed cut boundary back to tracker as measurement update.

This creates a feedback loop where the tracker guides the graph algorithm and the graph algorithm refines the tracker — running in O(1/alpha/epsilon) time rather than O(VE) for full mincut.

6.5 Multi-Seed Local Partitioning

For multiple people, run local partitioning from multiple seeds simultaneously. With k people and V = 16 nodes, each person's local partition explores ~4-6 nodes, totaling ~O(k * 6 * degree) = O(k * 90) work. For k = 3 people, this is O(270) — less than half the cost of full Stoer-Wagner.

The challenge is handling overlapping partitions. Two approaches:

  1. Sequential peeling: Find the strongest local cut, remove those nodes, repeat. O(k) rounds, each cheaper than the last.
  2. Multi-commodity flow relaxation: Solve a multi-commodity flow LP relaxation using the local PPR vectors as approximate flows. More expensive but handles overlaps correctly.

7. Randomized Methods

7.1 Monte Carlo vs. Las Vegas

Monte Carlo algorithms return an answer that is correct with probability

= 1 - delta. Running time is fixed, accuracy is probabilistic.

Las Vegas algorithms always return the correct answer. Running time is probabilistic (expected polynomial), correctness is guaranteed.

For safety-critical RF sensing (mass casualty assessment via wifi-densepose-mat), Las Vegas algorithms are preferred: the mincut answer is always correct, even if occasionally slow.

7.2 Karger's Monte Carlo Mincut

Karger's contraction algorithm is Monte Carlo: a single trial finds the mincut with probability >= 2/V^2 = 2/256 ~ 0.78%. Running O(V^2 log V) trials boosts success probability to 1 - 1/V.

Amplification for reliability:

  • For delta = 10^-6 failure probability: V^2 * ln(1/delta) / 2 = 256 * 14 / 2 = 1792 trials
  • Each trial: O(V) contractions = O(16) operations
  • Total: O(28,672) operations ~ 0.1 ms on modern hardware

7.3 Karger-Stein Monte Carlo with Early Termination

The Karger-Stein recursive contraction can be enhanced with early termination heuristics:

Karger-Stein-ET(G, best_known_cut):
  If |V(G)| <= 6:
    Return exact mincut via brute force
  Contract G to G' with |V'| = |V| / sqrt(2) + 1
  If crossing_edges(G') > best_known_cut * (1 + epsilon):
    Prune this branch  // Cannot improve on best known
  Recurse on two independent copies of G'
  Return minimum of recursive results

The pruning step eliminates branches early, reducing expected work. For our graph, this rarely helps (V = 16 is already small), but for V > 100 it can reduce the constant factor by 2-5x.

7.4 Las Vegas Mincut via Maxflow

Converting Karger's algorithm to Las Vegas: run Karger until a cut is found, then verify it by computing max-flow between one pair of vertices separated by the cut. If max-flow equals the cut value, the cut is minimum (by max-flow min-cut theorem). Otherwise, continue.

Verification cost: O(V * E) for a single max-flow computation = O(1920). Expected number of verifications before success: O(V^2 / 2) = O(128). This is expensive and not recommended for real-time use.

Better approach: Use Stoer-Wagner (deterministic, always correct) and reserve randomized methods for approximate or multi-cut computations.

7.5 Reliability Analysis for Safety-Critical Systems

For MAT (Mass Casualty Assessment Tool, wifi-densepose-mat), mincut errors could mean missing a survivor. Reliability requirements:

Application Max failure probability Algorithm class
Occupancy counting 10^-2 Monte Carlo, any
Person segmentation 10^-4 Monte Carlo, amplified
Vital sign isolation 10^-5 Las Vegas or deterministic
MAT survivor detection 10^-8 Deterministic only

Recommendation: Use deterministic Stoer-Wagner for all safety-critical applications. Use Monte Carlo approximations only for non-critical tasks like gesture recognition or activity classification where a missed frame is acceptable.

7.6 Randomized Rounding for Multi-Way Cuts

Beyond 2-way mincut, k-way partitioning (separating k people) can use randomized LP rounding:

  1. Solve the LP relaxation of the k-way cut problem.
  2. Randomly round fractional assignments to integer (each vertex assigned to one of k groups).
  3. Expected approximation ratio: 2 - 2/k.

For k = 3 people, the approximation ratio is 4/3 ~ 1.33. For k = 5, it is 8/5 = 1.6. This is practical for real-time person segmentation with known person count.


8. Rust Implementation for RuVector Infrastructure

8.1 Design Principles

The implementation targets the ruvector-mincut crate, which already provides a DynamicPersonMatcher in metrics.rs. The mincut algorithm should integrate cleanly with existing infrastructure.

Key constraints:

  • No heap allocation in the inner loop (ESP32 compatibility).
  • Support no_std with optional alloc for embedded targets.
  • Leverage Rust's type system for compile-time graph size verification.
  • Use SIMD (via std::simd or packed_simd2) for batch edge weight updates.

8.2 Data Structures

Fixed-size adjacency matrix:

/// Adjacency matrix for a complete graph with compile-time size.
/// V = 16 nodes, stored as upper triangular (120 entries).
pub struct RfGraph<const V: usize> {
    /// Edge weights stored in upper-triangular order.
    /// Index for edge (i, j) where i < j: i * (2*V - i - 1) / 2 + (j - i - 1)
    weights: [f32; V * (V - 1) / 2],
    /// Cached mincut value (invalidated on weight update).
    cached_mincut: Option<f32>,
    /// Cached mincut partition (bitvector: bit i = 1 means node i in set S).
    cached_partition: Option<u32>,
}

For V = 16, this uses 120 * 4 = 480 bytes for weights, plus 8 bytes for cached values. Total: 488 bytes — fits in a single cache line pair.

Stoer-Wagner state:

/// Reusable state for Stoer-Wagner algorithm.
/// Pre-allocated to avoid per-call allocation.
struct StoerWagnerState<const V: usize> {
    /// Merged vertex sets (union-find).
    parent: [u16; V],
    /// Key values for maximum adjacency ordering.
    key: [f32; V],
    /// Whether vertex is in the current working set.
    active: [bool; V],
    /// Best cut found so far.
    best_cut: f32,
    /// Best partition found so far.
    best_partition: u32,
}

8.3 Stoer-Wagner Implementation

impl<const V: usize> RfGraph<V> {
    /// Compute exact global minimum cut using Stoer-Wagner.
    /// Time: O(V^3) for dense graphs (V^2 phases, V work per phase).
    /// For V=16: ~4000 operations, estimated 10-50 us.
    pub fn minimum_cut(&mut self) -> (f32, u32) {
        if let Some(val) = self.cached_mincut {
            return (val, self.cached_partition.unwrap());
        }

        let mut state = StoerWagnerState::new();
        let mut merged: [[f32; V]; V] = self.build_adjacency_matrix();
        let mut best_cut = f32::MAX;
        let mut best_partition: u32 = 0;

        for phase in 0..(V - 1) {
            let (s, t, cut_weight) = self.maximum_adjacency_phase(
                &mut merged, &mut state, V - phase
            );

            if cut_weight < best_cut {
                best_cut = cut_weight;
                best_partition = state.current_partition(t);
            }

            // Merge s and t
            self.merge_vertices(&mut merged, s, t);
        }

        self.cached_mincut = Some(best_cut);
        self.cached_partition = Some(best_partition);
        (best_cut, best_partition)
    }
}

8.4 Incremental Update Path

impl<const V: usize> RfGraph<V> {
    /// Update edge weight and determine if mincut needs recomputation.
    /// Returns true if the cached mincut is still valid.
    pub fn update_edge(&mut self, i: usize, j: usize, new_weight: f32) -> bool {
        let idx = self.edge_index(i, j);
        let old_weight = self.weights[idx];
        self.weights[idx] = new_weight;

        // Check if this edge crosses the cached partition
        if let Some(partition) = self.cached_partition {
            let i_side = (partition >> i) & 1;
            let j_side = (partition >> j) & 1;

            if i_side != j_side {
                // Edge crosses the cut — must update cut value
                if let Some(ref mut cut_val) = self.cached_mincut {
                    *cut_val += new_weight - old_weight;
                    // Cut value changed but partition might still be optimal
                    // unless the new cut value exceeds some other cut
                    // Conservative: invalidate if change > epsilon * cut_val
                    if (new_weight - old_weight).abs() > 0.1 * *cut_val {
                        self.cached_mincut = None;
                        self.cached_partition = None;
                        return false;
                    }
                    return true;
                }
            }
            // Edge does not cross the cut — partition still valid,
            // but cut value might no longer be minimum
            // Heuristic: if weight decreased significantly, invalidate
            if new_weight < old_weight * 0.8 {
                self.cached_mincut = None;
                self.cached_partition = None;
                return false;
            }
            return true;
        }
        false
    }

    /// Batch update all edges from new CSI frame.
    /// Uses lazy recomputation: only recomputes if cached cut is invalidated.
    pub fn update_frame(&mut self, new_weights: &[f32; V * (V - 1) / 2]) {
        let mut needs_recompute = false;

        for idx in 0..new_weights.len() {
            let old = self.weights[idx];
            let new_w = new_weights[idx];
            self.weights[idx] = new_w;

            if !needs_recompute {
                if let Some(partition) = self.cached_partition {
                    let (i, j) = self.edge_vertices(idx);
                    let crosses = ((partition >> i) ^ (partition >> j)) & 1 == 1;

                    if crosses && (new_w - old).abs() > 0.05 * self.cached_mincut.unwrap_or(1.0) {
                        needs_recompute = true;
                    }
                    if !crosses && new_w < old * 0.7 {
                        needs_recompute = true;
                    }
                } else {
                    needs_recompute = true;
                }
            }
        }

        if needs_recompute {
            self.cached_mincut = None;
            self.cached_partition = None;
        }
    }
}

8.5 SIMD-Accelerated Weight Updates

#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

impl<const V: usize> RfGraph<V> {
    /// Update 4 edge weights at once using SSE.
    /// Processes 120 edges in 30 SIMD iterations.
    #[cfg(target_arch = "x86_64")]
    pub unsafe fn update_weights_simd(
        &mut self,
        new_weights: &[f32; V * (V - 1) / 2]
    ) {
        let n = V * (V - 1) / 2;
        let mut i = 0;

        while i + 4 <= n {
            let old = _mm_loadu_ps(self.weights.as_ptr().add(i));
            let new_v = _mm_loadu_ps(new_weights.as_ptr().add(i));
            _mm_storeu_ps(self.weights.as_mut_ptr().add(i), new_v);

            // Compute absolute difference for cache invalidation check
            let diff = _mm_sub_ps(new_v, old);
            let abs_diff = _mm_andnot_ps(_mm_set1_ps(-0.0), diff);
            let threshold = _mm_set1_ps(0.05);
            let exceeds = _mm_cmpgt_ps(abs_diff, threshold);

            if _mm_movemask_ps(exceeds) != 0 {
                self.cached_mincut = None;
                self.cached_partition = None;
            }

            i += 4;
        }

        // Handle remaining edges
        while i < n {
            self.weights[i] = new_weights[i];
            i += 1;
        }
    }
}

8.6 Parallelism with Rayon

For larger deployments (V > 32), Stoer-Wagner's maximum adjacency ordering can be parallelized:

#[cfg(feature = "parallel")]
use rayon::prelude::*;

impl<const V: usize> RfGraph<V>
where
    [(); V * (V - 1) / 2]:,
{
    /// Parallel maximum adjacency ordering phase.
    /// Splits key-value computation across threads.
    #[cfg(feature = "parallel")]
    fn parallel_max_adjacency_phase(
        &self,
        merged: &[[f32; V]; V],
        active: &[bool; V],
        n_active: usize,
    ) -> (usize, usize, f32) {
        let mut in_set = [false; V];
        let mut key = [0.0f32; V];
        let mut order = Vec::with_capacity(n_active);

        // Start from first active vertex
        let start = active.iter().position(|&a| a).unwrap();
        in_set[start] = true;
        order.push(start);

        // Update keys in parallel
        for _ in 1..n_active {
            // Parallel key update: each active vertex not in set
            // computes its key as sum of weights to set vertices
            let last_added = *order.last().unwrap();

            (0..V)
                .into_par_iter()
                .filter(|&v| active[v] && !in_set[v])
                .for_each(|v| {
                    // Safety: each thread writes to distinct key[v]
                    unsafe {
                        let key_ptr = &key[v] as *const f32 as *mut f32;
                        *key_ptr += merged[v][last_added];
                    }
                });

            // Find max key (sequential — V is small)
            let next = (0..V)
                .filter(|&v| active[v] && !in_set[v])
                .max_by(|&a, &b| key[a].partial_cmp(&key[b]).unwrap())
                .unwrap();

            in_set[next] = true;
            order.push(next);
        }

        let t = order[n_active - 1];
        let s = order[n_active - 2];
        let cut_weight = key[t];

        (s, t, cut_weight)
    }
}

8.7 Integration with DynamicPersonMatcher

The DynamicPersonMatcher in ruvector-mincut/src/metrics.rs uses mincut for person segmentation. Integration:

use wifi_densepose_signal::rf_graph::RfGraph;

impl DynamicPersonMatcher {
    /// Update the RF graph with new CSI data and detect person boundaries.
    pub fn update_with_csi_frame(
        &mut self,
        csi_weights: &[f32; 120],  // 16-node complete graph
    ) -> Vec<PersonSegment> {
        // Update graph weights (lazy invalidation)
        self.rf_graph.update_frame(csi_weights);

        // Get current minimum cut
        let (cut_value, partition) = self.rf_graph.minimum_cut();

        // Convert partition bitmask to person segments
        let segments = self.partition_to_segments(partition, cut_value);

        // Feed segments to Kalman tracker
        for segment in &segments {
            self.pose_tracker.update_measurement(segment);
        }

        segments
    }

    /// Hierarchical multi-cut for multiple people.
    /// Recursively bisects the graph until all segments have
    /// internal connectivity above threshold.
    pub fn hierarchical_cut(
        &mut self,
        max_people: usize,
    ) -> Vec<PersonSegment> {
        let mut segments = vec![Segment::all(16)];
        let mut result = Vec::new();

        while let Some(segment) = segments.pop() {
            if segment.size() <= 2 || result.len() >= max_people {
                result.push(segment);
                continue;
            }

            // Build subgraph for this segment
            let subgraph = self.rf_graph.subgraph(&segment.nodes);
            let (cut_value, partition) = subgraph.minimum_cut();

            // Normalized cut threshold: cut_value / min(|S|, |V\S|)
            let smaller_side = partition.count_ones().min(
                (segment.size() as u32 - partition.count_ones())
            );
            let normalized_cut = cut_value / smaller_side as f32;

            if normalized_cut > self.connectivity_threshold {
                // Segment is internally well-connected — one person or empty
                result.push(segment);
            } else {
                // Split into two sub-segments and continue
                let (left, right) = segment.split(partition);
                segments.push(left);
                segments.push(right);
            }
        }

        result
    }
}

8.8 Benchmarking and Performance Targets

Operation V=16 V=32 V=64 V=128
Stoer-Wagner (full) 15 us 120 us 1.2 ms 15 ms
Lazy update (no recompute) 0.5 us 1 us 3 us 10 us
Lazy update (recompute) 15 us 120 us 1.2 ms 15 ms
PPR local cut 5 us 15 us 40 us 100 us
SIMD batch weight update 0.2 us 0.8 us 3 us 12 us
Hierarchical multi-cut (k=3) 40 us 300 us 3 ms 35 ms

20 Hz budget: 50 ms per frame. At V = 16, all operations fit comfortably within budget. At V = 128, full hierarchical multi-cut approaches the budget and would benefit from the streaming/approximate methods described in earlier sections.

8.9 Testing Strategy

#[cfg(test)]
mod tests {
    use super::*;

    /// Verify Stoer-Wagner on known graph with documented mincut.
    #[test]
    fn test_stoer_wagner_known_graph() {
        let mut graph = RfGraph::<8>::from_edges(&[
            (0, 1, 2.0), (0, 4, 3.0), (1, 2, 3.0), (1, 4, 2.0),
            (1, 5, 2.0), (2, 3, 4.0), (2, 6, 2.0), (3, 6, 2.0),
            (3, 7, 2.0), (4, 5, 3.0), (5, 6, 1.0), (6, 7, 3.0),
        ]);
        let (cut_val, _) = graph.minimum_cut();
        assert!((cut_val - 4.0).abs() < 1e-6);
    }

    /// Verify lazy update correctness: cache invalidation triggers
    /// recomputation when crossing-edge weight changes significantly.
    #[test]
    fn test_lazy_update_invalidation() { /* ... */ }

    /// Verify SIMD and scalar paths produce identical results.
    #[test]
    fn test_simd_scalar_equivalence() { /* ... */ }

    /// Benchmark: 10,000 frames at 20 Hz with random weight perturbations.
    /// Verify average per-frame time < 100 us for V=16.
    #[test]
    fn bench_20hz_sustained() { /* ... */ }

    /// Property test: mincut value <= minimum vertex weighted degree.
    #[test]
    fn prop_mincut_bounded_by_min_degree() { /* ... */ }
}

9. Summary and Recommendations

9.1 Algorithm Selection Matrix

Criterion Stoer-Wagner Karger-Stein Dynamic (Thorup) Streaming Local PPR Lazy Hybrid
Exact result Yes Prob. No (approx) No (approx) No (approx) Heuristic
V=16 latency 15 us 25 us 120 us 50 us 5 us 1-15 us
V=128 latency 15 ms 8 ms 2 ms 1 ms 100 us 0.1-15 ms
Incremental No No Yes Yes Yes Yes
Safety-critical Yes No No No No Heuristic
Implementation complexity Low Medium High High Medium Low

Primary path (V <= 32):

  1. Receive CSI frame.
  2. SIMD batch update edge weights.
  3. Lazy check: if cached partition is still valid, return cached result.
  4. If invalidated: run Stoer-Wagner (exact, deterministic, fast enough).
  5. Cache result for next frame.

Secondary path (V > 32 or multi-cut needed):

  1. Use PPR local partitioning seeded from tracker predictions.
  2. If local cuts are low-conductance, return local result.
  3. Otherwise, fall back to full Stoer-Wagner.

Safety-critical path (MAT/vital signs):

  1. Always use Stoer-Wagner (deterministic, exact).
  2. Cross-validate with a second Karger trial (independent verification).
  3. If results disagree, use the smaller cut value (conservative).

9.3 Future Work

  1. Distributed mincut: Each ESP32 node computes a sketch of its local view. The coordinator merges sketches for approximate global mincut. Reduces coordinator bottleneck and enables graceful degradation.

  2. GPU-accelerated mincut: For cloud-hosted deployments, batch multiple frames into a GPU kernel for parallel Stoer-Wagner computation across time windows.

  3. Learning-augmented algorithms: Train a small neural network to predict the mincut partition from CSI features, using exact Stoer-Wagner as ground truth. The network predicts in O(1) time; Stoer-Wagner verifies periodically.

  4. Hypergraph mincut: Model multi-body RF interactions (where three or more nodes are simultaneously affected) as hyperedges. Hypergraph mincut algorithms capture higher-order spatial structure.


References

  1. Stoer, M. and Wagner, F. "A Simple Min-Cut Algorithm." JACM 44(4), 1997.
  2. Karger, D. "Global Min-Cuts in RNC, and Other Ramifications of a Simple Min-Cut Algorithm." SODA, 1993.
  3. Karger, D. and Stein, C. "A New Approach to the Minimum Cut Problem." JACM 43(4), 1996.
  4. Benczur, A. and Karger, D. "Approximating s-t Minimum Cuts in O(n^2) Time." STOC, 1996.
  5. Spielman, D. and Teng, S. "Nearly-Linear Time Algorithms for Graph Partitioning, Graph Sparsification, and Solving Linear Systems." STOC, 2004.
  6. Spielman, D. and Srivastava, N. "Graph Sparsification by Effective Resistances." STOC, 2008 / SICOMP, 2011.
  7. Andersen, R., Chung, F., and Lang, K. "Local Graph Partitioning using PageRank Vectors." FOCS, 2006.
  8. Ahn, K.J., Guha, S., and McGregor, A. "Analyzing Graph Structure via Linear Measurements." SODA, 2012.
  9. Ahn, K.J., Guha, S., and McGregor, A. "Graph Sketches: Sparsification, Spanners, and Subgraphs." PODS, 2012.
  10. Thorup, M. "Near-Optimal Fully-Dynamic Graph Connectivity." STOC, 2000.
  11. Goranci, G., Henzinger, M., and Thorup, M. "Incremental Exact Min-Cut in Polylogarithmic Amortized Update Time." TALG, 2018.
  12. Rubinstein, A., Schramm, T., and Weinberg, S.M. "Computing Exact Minimum Cuts Without Knowing the Graph." ITCS, 2018.
  13. Abraham, I., Durfee, D., et al. "Using Petal-Decompositions to Build a Low Stretch Spanning Tree." STOC, 2016.
  14. Nanongkai, D. and Saranurak, T. "Dynamic Minimum Spanning Forest with Subpolynomial Worst-Case Update Time." FOCS, 2017.