From c19a33ee1cb5be2429a4e10bd9ddb983993a72e2 Mon Sep 17 00:00:00 2001
From: rUv
Date: Sat, 25 Apr 2026 23:08:05 -0400
Subject: [PATCH] =?UTF-8?q?docs(adr):=20ADR-084=20=E2=80=94=20RaBitQ=20sim?=
 =?UTF-8?q?ilarity=20sensor=20for=20CSI/pose/memory=20(proposed)=20(#429)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adopt RaBitQ-style binary sketches as a first-class cheap similarity
sensor at four points in the RuView pipeline: AETHER re-ID hot-cache
filter, per-room novelty / drift detection, mesh-exchange compression,
and privacy-preserving event logs.

Implementation home is ruvector-core::quantization::BinaryQuantized
(already vendored, already SIMD-accelerated NEON+POPCNT, 32x
compression, 1-bit sign quantization + hamming distance), re-exported
through a thin RuView-flavored API in wifi-densepose-ruvector::sketch.

Pattern at every site: dense embedding -> RaBitQ sketch -> hamming
pre-filter to top-K -> full-precision refinement only on miss.
Decision boundary unchanged; sketch is a sensor that gates *which*
comparisons run, not *what* they decide.

Acceptance test (per source proposal):
- sketch compare cost reduction: 8x-30x vs full float
- top-K candidate coverage: >= 90% agreement with full-float pass
- end-to-end accuracy regression: < 1 percentage point

Site-by-site rollback if any criterion fails at a given site;
remaining sites continue.

Five implementation passes, each independently testable: ruvector
module wrap, AETHER re-ID pre-filter, cluster-Pi novelty sensor,
mesh-exchange compression, privacy log. Sensor MCU unchanged; sketches
happen at the cluster Pi (ADR-083). Validation requires acceptance
numbers on >= 3 of 5 passes.

Open question (out-of-scope until pass-1 benchmark): whether RuView
embeddings need a Johnson-Lindenstrauss / RaBitQ-paper randomized
rotation before sign-quantization, or whether pure 1-bit sign
quantization (today's BinaryQuantized) is sufficient.
---
 docs/adr/ADR-084-rabitq-similarity-sensor.md | 276 +++++++++++++++++++
 1 file changed, 276 insertions(+)
 create mode 100644 docs/adr/ADR-084-rabitq-similarity-sensor.md

diff --git a/docs/adr/ADR-084-rabitq-similarity-sensor.md b/docs/adr/ADR-084-rabitq-similarity-sensor.md
new file mode 100644
index 00000000..4cd7c56d
--- /dev/null
+++ b/docs/adr/ADR-084-rabitq-similarity-sensor.md
@@ -0,0 +1,276 @@
+# ADR-084: RaBitQ Similarity Sensor for CSI / Pose / Memory Routing
+
+| Field          | Value                                                                                   |
+|----------------|-----------------------------------------------------------------------------------------|
+| **Status**     | Proposed |
+| **Date**       | 2026-04-26 |
+| **Authors**    | ruv |
+| **Refines**    | ADR-024 (AETHER re-ID embeddings), ADR-027 (cross-environment domain generalization), ADR-076 (CSI spectrogram embeddings), ADR-081 (5-layer firmware kernel) |
+| **Companion**  | ADR-083 (per-cluster Pi compute hop) |
+| **Implements** | `vendor/ruvector/crates/ruvector-core/src/quantization.rs::BinaryQuantized` |
+
+## Context
+
+RuView's signal pipeline already produces several **dense float
+embeddings** at different layers:
+
+- AETHER 128-d re-ID embeddings on each `PoseTrack` (ADR-024)
+- 64–256-d CSI spectrogram embeddings (ADR-076)
+- per-room field-model eigenmode vectors (ADR-030)
+- per-frame multistatic fused vectors (ADR-029)
+
+Every one of these eventually answers the same shape of question:
+**"have I seen something like this before?"** Today the answer is
+computed by full float dot-product / Mahalanobis comparisons against a
+candidate set. That cost grows linearly with stored vectors and
+quadratically when used inside dynamic-mincut graph maintenance,
+re-identification re-scoring, and cross-environment domain detection.
+
+The vendored `ruvector-core` crate already ships a 1-bit quantization
+(`BinaryQuantized`, 32× compression, SIMD popcnt + hamming distance)
+that follows the same scheme as the **RaBitQ** family of binary
+sketches, without RaBitQ's randomized rotation (see Open questions):
+a vector is reduced to one bit per dimension, compared via
+hamming distance, and used as a coarse pre-filter before full
+precision refinement. The same module also exposes `ScalarQuantized`
+(int8, 4×) and `ProductQuantized` (PQ, 8–16×), so the tiered
+quantization story is already implemented; the *deployment pattern* is
+not.
+
+The user observation that motivates this ADR: **RaBitQ-style sketches
+are not just a vector compression trick; they are a cheap similarity
+sensor.** Used as a sensor, they unlock:
+
+- always-on novelty / anomaly gating that wakes heavy CNNs only on
+  meaningful change
+- cluster-Pi memory routing (which shard / room / model to query first)
+- cross-node mesh exchange of compressed sketches instead of raw vectors
+- privacy-preserving event logs (sketches, not reconstructable signals)
+
+This ADR formalizes the deployment pattern across the RuView stack and
+commits to `ruvector-core::quantization::BinaryQuantized` as the
+canonical implementation.
+
+## Decision
+
+Adopt **RaBitQ-style binary sketches as a first-class, cheap
+similarity sensor** at four points in the RuView pipeline:
+
+1. **CSI / pose embedding hot-cache filter** at the cluster Pi.
+2. **Drift / novelty sensor** between live observation and a
+   per-room normal-state bank.
+3. **Mesh-exchange compression** between sensor nodes when reporting
+   cross-cluster events.
+4. **Privacy-preserving event log** at the cluster Pi and gateway.
+
+The canonical pattern at every point is:
+
+```text
+dense embedding ──► RaBitQ sketch ──► hamming/popcnt compare
+                                        ├──► candidate set (top-K)
+                                        └──► novelty score (0..1)
+                                                    │
+                                                    ▼
+                       ┌── below threshold ──► emit summary, no escalation
+                       │
+                       └── above threshold ──► full-precision refinement
+                                                  ├──► ruvector mincut / HNSW
+                                                  ├──► AETHER re-ID rescoring
+                                                  └──► pose model / CNN wake
+```
+
+### Implementation home
+
+- **Sketch type and SIMD primitives**:
+  `vendor/ruvector/crates/ruvector-core/src/quantization.rs::BinaryQuantized`,
+  already implemented and already SIMD-accelerated (NEON on aarch64,
+  POPCNT on x86_64). Re-export through a new
+  `crates/wifi-densepose-ruvector/src/sketch.rs` module so consumers in
+  `signal`, `train`, `mat`, and `sensing-server` see a stable
+  RuView-flavored API and don't bind directly to the vendor crate.
+
+- **Per-room normal-state bank**: lives at the cluster Pi (ADR-083),
+  not on the sensor MCU. Sensor MCUs continue to emit dense embeddings
+  in the existing `rv_feature_state_t` packet shape; sketching happens
+  on the Pi, where the candidate bank is.
+
+- **Sketch versioning**: each sketch carries a 16-bit `sketch_version`
+  field so the Pi can tell incompatible sketches apart when the
+  embedding model is upgraded. The version is bumped on every
+  embedding-model change.
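
The sketch type and version check described in the bullets above can be illustrated with a minimal, dependency-free example. It is not the vendored `BinaryQuantized` API, whose signatures are not shown in this ADR; the struct layout, the bit packing, and the `Option<u32>` return type are assumptions made for illustration. Only the names `Sketch::from_embedding`, `Sketch::distance`, `sketch_version`, and `embedding_dim` come from this document.

```rust
/// Hypothetical RuView-side sketch type (illustration only, not the
/// vendored `BinaryQuantized`).
#[derive(Debug, Clone, PartialEq)]
pub struct Sketch {
    /// One bit per embedding dimension, packed LSB-first into u64 words.
    bits: Vec<u64>,
    /// Dimension of the source embedding (assumed to fit in 16 bits).
    embedding_dim: u16,
    /// Bumped whenever the embedding model changes.
    sketch_version: u16,
}

impl Sketch {
    /// 1-bit sign quantization: bit i is set when embedding[i] > 0.0.
    pub fn from_embedding(embedding: &[f32], sketch_version: u16) -> Self {
        let mut bits = vec![0u64; (embedding.len() + 63) / 64];
        for (i, &x) in embedding.iter().enumerate() {
            if x > 0.0 {
                bits[i / 64] |= 1u64 << (i % 64);
            }
        }
        Sketch {
            bits,
            embedding_dim: embedding.len() as u16,
            sketch_version,
        }
    }

    /// Hamming distance via XOR + popcount, or `None` when the two
    /// sketches come from different model versions or dimensions.
    pub fn distance(&self, other: &Sketch) -> Option<u32> {
        if self.sketch_version != other.sketch_version
            || self.embedding_dim != other.embedding_dim
        {
            return None;
        }
        Some(
            self.bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| (a ^ b).count_ones())
                .sum(),
        )
    }
}
```

Returning `None` on a version mismatch pushes the caller toward re-sketching or a full-precision comparison instead of silently comparing bits produced by different embedding models.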
+
+### Where the sensor sits in the pipeline
+
+| Pipeline stage | Today (full float) | With RaBitQ similarity sensor |
+|---|---|---|
+| AETHER re-ID match | full 128-d cosine on every active track × candidate | hamming pre-filter to top-K, then full cosine on K |
+| Mincut subcarrier selection | full graph re-evaluation | sketch-flagged "likely-changed" boundary edges, full mincut on those |
+| CSI room fingerprint | trained classifier on full embedding | sketch hamming to per-room sketch, classifier on miss |
+| Field-model novelty (ADR-030) | residual-energy threshold | sketch novelty as second gate before SVD redo |
+| Mesh / inter-cluster sync | dense embedding broadcast | sketch broadcast; full vector only on miss |
+| Event log retention | full embedding stored | sketch + witness hash stored; raw embedding ephemeral |
+
+In every row, the **decision boundary is unchanged**: full precision
+still owns the final answer. The sketch is a sensor that only gates
+which comparisons run, not what they decide.
+
+### Acceptance criterion (per the source proposal)
+
+The system-level acceptance test is:
+
+> RaBitQ should reduce compare cost by **8× to 30×** while preserving
+> top-k decisions well enough that full refinement changes **fewer
+> than 10%** of final results.
+
+Concretely, this means:
+
+- Sketch compare must be measurably at least **8× cheaper** than the
+  float comparison it replaces (criterion bench in `signal/`).
+- Top-K candidate set chosen by sketch must contain ≥ 90% of the
+  candidates the full-float pass would have picked (offline replay
+  against recorded CSI).
+- End-to-end pose / re-ID accuracy must regress by **less than 1
+  percentage point** vs the full-float baseline on the existing
+  evaluation set.
+
+If any of these three fails, the sensor is rolled back at that point in
+the pipeline and the failing site reverts to full float; the rest of
+the pipeline keeps using sketches. This is point-by-point, not
+all-or-nothing.
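
The three acceptance numbers above can be checked mechanically per site. The following is a minimal sketch under stated assumptions: `SiteMeasurement`, its field names, and `site_passes` are hypothetical and not part of any existing RuView crate; the thresholds (8×, 90%, 1 percentage point) are the ones stated in this section.

```rust
use std::collections::HashSet;

/// Fraction of the full-float top-K that also appears in the
/// sketch-selected top-K (1.0 when the float set is empty).
pub fn topk_coverage(float_topk: &[usize], sketch_topk: &[usize]) -> f64 {
    if float_topk.is_empty() {
        return 1.0;
    }
    let chosen: HashSet<usize> = sketch_topk.iter().copied().collect();
    let hits = float_topk.iter().filter(|id| chosen.contains(*id)).count();
    hits as f64 / float_topk.len() as f64
}

/// Hypothetical per-site measurement record (all names are assumptions).
pub struct SiteMeasurement {
    /// Mean cost of one full-float comparison, in nanoseconds.
    pub float_compare_ns: f64,
    /// Mean cost of one sketch comparison, in nanoseconds.
    pub sketch_compare_ns: f64,
    /// Output of `topk_coverage` averaged over the replay set.
    pub topk_coverage: f64,
    /// End-to-end accuracy of the full-float baseline, in percent.
    pub baseline_accuracy_pct: f64,
    /// End-to-end accuracy with the sketch pre-filter, in percent.
    pub sketch_accuracy_pct: f64,
}

/// True when the site meets all three acceptance numbers; a `false`
/// result means this site alone reverts to full float.
pub fn site_passes(m: &SiteMeasurement) -> bool {
    let speedup = m.float_compare_ns / m.sketch_compare_ns;
    let regression_pp = m.baseline_accuracy_pct - m.sketch_accuracy_pct;
    speedup >= 8.0 && m.topk_coverage >= 0.90 && regression_pp < 1.0
}
```

Evaluating each site independently mirrors the point-by-point rollback: one failing `SiteMeasurement` does not affect the verdict for any other site.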
+
+## Consequences
+
+### Positive
+
+- **Cheaper hot path everywhere a "have I seen this" question lives.**
+  AETHER re-ID, mincut maintenance, room fingerprinting, novelty
+  detection, mesh sync, and event-log retention all run a 32×-smaller,
+  popcnt-friendly comparison first.
+- **Always-on anomaly gating becomes affordable.** The CNN / pose
+  model only wakes when sketch novelty crosses a threshold. The
+  per-node energy budget is expected to drop in steady-state quiet rooms.
+- **Privacy story improves.** Event logs and inter-cluster mesh
+  traffic carry sketches and witness hashes, not reconstructable
+  embeddings. The 1-bit quantization is *not* invertible to the
+  original CSI.
+- **Composes cleanly with ADR-083.** The cluster Pi is the natural
+  home for the sketch bank; sensor MCUs remain unchanged.
+- **No new dependency.** `BinaryQuantized` is already in the vendored
+  `ruvector-core` and already SIMD-accelerated.
+
+### Negative / risks
+
+- **Sketch quality depends on embedding distribution.** Pure 1-bit
+  sign quantization (which `BinaryQuantized` implements) works best
+  when the embedding space is roughly zero-centered and isotropic.
+  AETHER and CSI spectrogram embeddings need to be benchmarked for
+  this assumption; if either fails, a randomized rotation
+  (Johnson-Lindenstrauss / RaBitQ-paper-style) must be added before
+  sketching. Out-of-scope for this ADR; tracked as a follow-up if
+  the acceptance test fails.
+- **Top-K coverage degrades for small candidate sets.** With < 16
+  candidates, the sketch compare can pick the wrong top-K. Site-by-site
+  fallback to full float is part of the rollout plan.
+- **Sketch-version skew during model upgrades.** A model change
+  invalidates all stored sketches; the cluster Pi must re-sketch the
+  candidate bank when `sketch_version` bumps. Cost is bounded but
+  non-zero.
+
+### Neutral
+
+- ADR-024, ADR-027, ADR-029, ADR-030, ADR-076 are unchanged in
+  *what* they compute.
+  They gain a sketch pre-filter at the comparison
+  step.
+- ADR-082's confirmed-track output filter is upstream of the sketch
+  layer; it stays correct.
+
+## Implementation
+
+The implementation lands in five passes, each independently testable.
+Every pass is gated by the acceptance criterion above; if a pass fails,
+that site rolls back and the rest continue.
+
+1. **`wifi-densepose-ruvector::sketch` module.** Re-export
+   `BinaryQuantized` plus a thin RuView-flavored API
+   (`Sketch::from_embedding`, `Sketch::distance`, `SketchBank::topk`).
+   Add `sketch_version: u16` and `embedding_dim: u16` fields to the
+   public type. Criterion benches: sketch ↔ float compare-cost ratio.
+
+2. **AETHER re-ID pre-filter.** In
+   `wifi-densepose-signal/src/ruvsense/pose_tracker.rs`, before
+   computing the full 128-d cosine across active tracks × candidates,
+   sketch both sides and reduce to top-K via hamming. Bench: re-ID
+   pass time per frame, ID-stability under cross-room transitions.
+
+3. **Cluster-Pi novelty sensor.** In
+   `wifi-densepose-sensing-server`, maintain a per-room
+   `SketchBank` of "normal-state" sketches; on each incoming
+   `rv_feature_state_t`, compute the embedding sketch, score novelty
+   against the bank, and emit `novelty_score` as a new field on the
+   WebSocket update envelope. The heavy CNN wake gate uses this score.
+
+4. **Mesh-exchange compression.** Inter-cluster broadcasts (the
+   ADR-066 swarm-bridge channel) carry sketch + witness instead of
+   the full embedding when novelty is low. The full embedding is only
+   exchanged when novelty crosses the threshold.
+
+5. **Privacy-preserving event log.** Event log table on the cluster
+   Pi stores `(sketch_bytes, sketch_version, novelty_score,
+   witness_sha256)` instead of raw embeddings. Existing log readers
+   are unchanged in API; only the storage layer changes.
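
Passes 2 and 3 above share one mechanism: a hamming pre-filter to top-K, a full-precision cosine on the survivors, and a novelty score derived from the closest sketch. A minimal, dependency-free sketch of that mechanism follows. `SketchBank::topk` is named in pass 1, but its signature here, the `novelty` and `best_match` methods, and the definition of novelty as minimum hamming distance divided by dimension are all assumptions for illustration, not the planned API.

```rust
/// Pack sign bits (bit i set when v[i] > 0.0) into u64 words.
fn sign_bits(v: &[f32]) -> Vec<u64> {
    let mut bits = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            bits[i / 64] |= 1u64 << (i % 64);
        }
    }
    bits
}

fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Hypothetical per-room bank (illustration only).
pub struct SketchBank {
    dim: usize,
    sketches: Vec<Vec<u64>>,
    /// Full vectors kept for the full-precision refinement step.
    embeddings: Vec<Vec<f32>>,
}

impl SketchBank {
    pub fn new(dim: usize) -> Self {
        SketchBank { dim, sketches: Vec::new(), embeddings: Vec::new() }
    }

    pub fn insert(&mut self, embedding: Vec<f32>) {
        assert_eq!(embedding.len(), self.dim);
        self.sketches.push(sign_bits(&embedding));
        self.embeddings.push(embedding);
    }

    /// Indices of the k entries closest to `query` in hamming distance
    /// (ties broken by insertion order).
    pub fn topk(&self, query: &[f32], k: usize) -> Vec<usize> {
        let q = sign_bits(query);
        let mut scored: Vec<(u32, usize)> = self
            .sketches
            .iter()
            .enumerate()
            .map(|(i, s)| (hamming(&q, s), i))
            .collect();
        scored.sort_unstable();
        scored.into_iter().take(k).map(|(_, i)| i).collect()
    }

    /// Assumed novelty definition: smallest hamming distance to the
    /// bank divided by the dimension; 1.0 for an empty bank.
    pub fn novelty(&self, query: &[f32]) -> f32 {
        let q = sign_bits(query);
        self.sketches
            .iter()
            .map(|s| hamming(&q, s))
            .min()
            .map_or(1.0, |d| d as f32 / self.dim as f32)
    }

    /// Full cosine on the K survivors only; returns (index, cosine).
    pub fn best_match(&self, query: &[f32], k: usize) -> Option<(usize, f32)> {
        self.topk(query, k)
            .into_iter()
            .map(|i| (i, cosine(query, &self.embeddings[i])))
            .max_by(|a, b| a.1.total_cmp(&b.1))
    }
}
```

In this arrangement the cosine in `best_match` still makes the final decision; `topk` only limits how many cosines are computed, which is the property the pipeline table relies on.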
+
+Each pass adds tests: a property test (sketch ↔ float top-K agreement
+≥ 90%), a criterion bench (≥ 8× compare cost reduction), and an
+end-to-end accuracy regression test (< 1 pp drop).
+
+## Validation
+
+This ADR is **proposed**, not accepted. Acceptance requires the three
+acceptance numbers above to hold on **at least three of the five
+implementation passes** (the sites where the bulk of the load sits:
+AETHER re-ID, cluster-Pi novelty, and event log). The mesh-exchange
+pass and the mincut pre-filter (a site in the pipeline table, not one
+of the five passes) are nice-to-haves; they can ship afterward if
+their per-site numbers hold.
+
+Validation runs against:
+
+- the existing 1,539-test workspace suite (must stay green)
+- a new `tests/integration/rabitq_sketch_pipeline.rs` integration test
+  driving recorded CSI through the full pipeline with and without
+  sketches, comparing top-K decisions and end-to-end pose accuracy
+- ESP32-S3 on COM7: the sensor MCU is unchanged and sketching happens
+  at the cluster Pi, so this validation is a smoke test that the
+  sensor → Pi UDP path still works after the cluster Pi gains the
+  sketch bank
+
+## Related
+
+- **ADR-024** (Accepted): AETHER re-ID embeddings. Primary consumer
+  of the sketch pre-filter.
+- **ADR-027** (Accepted): Cross-environment domain generalization
+  (MERIDIAN). Per-room sketch bank is the natural data structure for
+  domain detection.
+- **ADR-030** (Proposed): RuvSense persistent field model. Sketch
+  novelty is the cheap second gate before SVD recompute.
+- **ADR-066**: Swarm bridge to coordinator. Inter-cluster sketch
+  exchange.
+- **ADR-076** (Accepted): CSI spectrogram embeddings. Sketch
+  consumer; embedding source.
+- **ADR-081** (Accepted): 5-layer adaptive CSI mesh firmware kernel.
+  Sensor MCU unchanged by this ADR; sketches happen at the cluster Pi.
+- **ADR-083** (Proposed): Per-cluster Pi compute hop. Defines the
+  device class that hosts the sketch bank.
+
+## Open questions
+
+- **Does `BinaryQuantized` need a randomized rotation pre-pass for
+  RuView's embedding distributions?** Pure sign quantization assumes
+  zero-centered, isotropic embeddings. If AETHER / spectrogram
+  distributions are skewed (likely for spectrogram), add a
+  `randomized_rotation` pre-pass following the original RaBitQ paper
+  (Gao & Long, SIGMOD 2024). To be decided after the pass-1 benchmark.
+- **Sketch dimension target.** Default to the embedding's native
+  dimension (128 for AETHER, up to 256 for spectrogram).
+  Higher-dimensional sketches (Johnson-Lindenstrauss-projected to 512)
+  trade compute for recall; benchmark before committing.
+- **Per-room vs per-deployment sketch banks.** Defaulting to per-room
+  for novelty detection. Cross-room re-ID may want a shared bank;
+  decide once cross-room AETHER traces are available.
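
To make the first open question concrete, the following sketch shows what a `randomized_rotation` pre-pass could look like: a seeded pseudo-random orthogonal matrix, built by orthonormalizing Gaussian rows, is applied to the embedding before sign quantization. Everything here is an assumption for illustration: the function names, the xorshift generator, and the Gram-Schmidt construction are stand-ins, not the RaBitQ paper's reference implementation and not an existing `ruvector-core` API. A rotation does not re-center the data, so a skewed distribution would still need its mean subtracted first.

```rust
/// Tiny xorshift64* generator so the example stays dependency-free
/// (illustration only, not suitable for cryptographic use).
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545F4914F6CDD1D)
    }

    /// Uniform sample in (0, 1], so `ln` below never sees zero.
    fn next_unit(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn next_gaussian(&mut self) -> f64 {
        let (u1, u2) = (self.next_unit(), self.next_unit());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Build a dim x dim orthogonal matrix (rows are orthonormal) by
/// running modified Gram-Schmidt over Gaussian random rows.
pub fn random_rotation(dim: usize, seed: u64) -> Vec<Vec<f64>> {
    let mut rng = XorShift(seed.max(1)); // xorshift state must be non-zero
    let mut rows: Vec<Vec<f64>> = Vec::with_capacity(dim);
    while rows.len() < dim {
        let mut v: Vec<f64> = (0..dim).map(|_| rng.next_gaussian()).collect();
        // Remove the component along every row accepted so far.
        for r in &rows {
            let proj: f64 = v.iter().zip(r).map(|(a, b)| a * b).sum();
            for (a, b) in v.iter_mut().zip(r) {
                *a -= proj * b;
            }
        }
        let norm = v.iter().map(|a| a * a).sum::<f64>().sqrt();
        if norm > 1e-9 {
            for a in v.iter_mut() {
                *a /= norm;
            }
            rows.push(v);
        }
    }
    rows
}

/// Apply the rotation; the result is what would be sign-quantized.
pub fn rotate(rotation: &[Vec<f64>], embedding: &[f32]) -> Vec<f32> {
    rotation
        .iter()
        .map(|row| {
            row.iter()
                .zip(embedding)
                .map(|(r, &x)| r * x as f64)
                .sum::<f64>() as f32
        })
        .collect()
}
```

Because the matrix is orthogonal, norms and angles between embeddings are preserved, so the full-precision refinement step would see the same cosines whether or not the rotation is applied; only the sign bits change. The same seed must be used for every sketch in a bank, which suggests tying the seed to `sketch_version`.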