docs(adr): ADR-084 — RaBitQ similarity sensor for CSI/pose/memory (proposed) (#429)

Adopt RaBitQ-style binary sketches as a first-class cheap similarity sensor at four points in the RuView pipeline: AETHER re-ID hot-cache filter, per-room novelty / drift detection, mesh-exchange compression, and privacy-preserving event logs.

Implementation home is ruvector-core::quantization::BinaryQuantized (already vendored, already SIMD-accelerated NEON+POPCNT, 32x compression, 1-bit sign quantization + hamming distance), re-exported through a thin RuView-flavored API in wifi-densepose-ruvector::sketch.

Pattern at every site: dense embedding -> RaBitQ sketch -> hamming pre-filter to top-K -> full-precision refinement only on miss. Decision boundary unchanged; sketch is a sensor that gates *which* comparisons run, not *what* they decide.

Acceptance test (per source proposal):
- sketch compare cost reduction: 8x-30x vs full float
- top-K candidate coverage: >= 90% agreement with full-float pass
- end-to-end accuracy regression: < 1 percentage point

Site-by-site rollback if any criterion fails at a given site; remaining sites continue.

Five implementation passes, each independently testable: ruvector module wrap, AETHER re-ID pre-filter, cluster-Pi novelty sensor, mesh-exchange compression, privacy log. Sensor MCU unchanged; sketches happen at the cluster Pi (ADR-083). Validation requires acceptance numbers on >= 3 of 5 passes.

Open question (out-of-scope until pass-1 benchmark): whether RuView embeddings need a Johnson-Lindenstrauss / RaBitQ-paper randomized rotation before sign-quantization, or whether pure 1-bit sign quantization (today's BinaryQuantized) is sufficient.
2026-05-18 23:59:27 +00:00 · 2026-04-25 23:08:05 -04:00 · 2026-04-25 23:08:05 -04:00 · c19a33ee1c
commit c19a33ee1c
parent 259939b7ec
1 changed files with 276 additions and 0 deletions
--- a/docs/adr/ADR-084-rabitq-similarity-sensor.md
+++ b/docs/adr/ADR-084-rabitq-similarity-sensor.md
@@ -0,0 +1,276 @@
+# ADR-084: RaBitQ Similarity Sensor for CSI / Pose / Memory Routing
+
+| Field          | Value                                                                                   |
+|----------------|-----------------------------------------------------------------------------------------|
+| **Status**     | Proposed                                                                                |
+| **Date**       | 2026-04-26                                                                              |
+| **Authors**    | ruv                                                                                     |
+| **Refines**    | ADR-024 (AETHER re-ID embeddings), ADR-027 (cross-environment domain generalization), ADR-076 (CSI spectrogram embeddings), ADR-081 (5-layer firmware kernel) |
+| **Companion**  | ADR-083 (per-cluster Pi compute hop)                                                    |
+| **Implements** | `vendor/ruvector/crates/ruvector-core/src/quantization.rs::BinaryQuantized`             |
+
+## Context
+
+RuView's signal pipeline already produces several **dense float
+embeddings** at different layers:
+
+- AETHER 128-d re-ID embeddings on each `PoseTrack` (ADR-024)
+- 64–256-d CSI spectrogram embeddings (ADR-076)
+- per-room field-model eigenmode vectors (ADR-030)
+- per-frame multistatic fused vectors (ADR-029)
+
+Every one of these eventually answers the same shape of question:
+**"have I seen something like this before?"** Today the answer is
+computed by full float dot-product / Mahalanobis comparisons against a
+candidate set. That cost grows linearly with stored vectors and
+quadratically when used inside dynamic-mincut graph maintenance,
+re-identification re-scoring, and cross-environment domain detection.
+
+The vendored `ruvector-core` crate already ships a 1-bit quantization
+(`BinaryQuantized`, 32× compression, SIMD popcnt + hamming distance)
+that follows the same basic recipe as the **RaBitQ** family of binary
+sketches: a vector is reduced to one bit per dimension, compared via
+hamming distance, and used as a coarse pre-filter before
+full-precision refinement. It is pure sign quantization and does not
+apply RaBitQ's randomized rotation (see Open questions). The same
+module also exposes `ScalarQuantized` (int8, 4×) and
+`ProductQuantized` (PQ, 8–16×), so the tiered quantization story is
+already implemented; the *deployment pattern* is not.
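+
+To make the 1-bit pattern concrete, here is a minimal Rust sketch of
+sign quantization, hamming comparison, and the 32× size arithmetic.
+The function names (`sign_sketch`, `hamming`, `compression_ratio`) and
+the `x > 0.0` bit rule are illustrative assumptions, not the
+`BinaryQuantized` API, and the loops are scalar rather than SIMD.
+
+```rust
+/// Pack one sign bit per dimension (assumed rule: bit = 1 when x > 0).
+pub fn sign_sketch(embedding: &[f32]) -> Vec<u8> {
+    let mut bits = vec![0u8; (embedding.len() + 7) / 8];
+    for (i, &x) in embedding.iter().enumerate() {
+        if x > 0.0 {
+            bits[i / 8] |= 1u8 << (i % 8);
+        }
+    }
+    bits
+}
+
+/// Hamming distance: XOR the packed bytes and count the set bits.
+pub fn hamming(a: &[u8], b: &[u8]) -> u32 {
+    assert_eq!(a.len(), b.len(), "sketches must have equal length");
+    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
+}
+
+/// Size ratio between an f32 embedding and its 1-bit sketch.
+pub fn compression_ratio(dim: usize) -> f32 {
+    let float_bytes = dim * std::mem::size_of::<f32>();
+    let sketch_bytes = (dim + 7) / 8;
+    float_bytes as f32 / sketch_bytes as f32
+}
+```
+
+For a 128-d `f32` embedding this is 512 bytes reduced to 16 bytes,
+which is where the 32× figure comes from.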
+
+The user observation that motivates this ADR: **RaBitQ-style sketches
+are not just a vector compression trick — they are a cheap similarity
+sensor.** Used as a sensor, they unlock:
+
+- always-on novelty / anomaly gating that wakes heavy CNNs only on
+  meaningful change
+- cluster-Pi memory routing (which shard / room / model to query first)
+- cross-node mesh exchange of compressed sketches instead of raw vectors
+- privacy-preserving event logs (sketches, not reconstructable signals)
+
+This ADR formalizes the deployment pattern across the RuView stack and
+commits to `ruvector-core::quantization::BinaryQuantized` as the
+canonical implementation.
+
+## Decision
+
+Adopt **RaBitQ-style binary sketches as a first-class, cheap
+similarity sensor** at four points in the RuView pipeline:
+
+1. **CSI / pose embedding hot-cache filter** at the cluster Pi.
+2. **Drift / novelty sensor** between live observation and a
+   per-room normal-state bank.
+3. **Mesh-exchange compression** between sensor nodes when reporting
+   cross-cluster events.
+4. **Privacy-preserving event log** at the cluster Pi and gateway.
+
+The canonical pattern at every point is:
+
+```text
+dense embedding  ──►  RaBitQ sketch  ──►  hamming/popcnt compare
+                                       ├──►  candidate set (top-K)
+                                       └──►  novelty score (0..1)
+                                              │
+                                              ▼
+                          ┌── below threshold ──►  emit summary, no escalation
+                          │
+                          └── above threshold ──►  full-precision refinement
+                                                     ├──►  ruvector mincut / HNSW
+                                                     ├──►  AETHER re-ID rescoring
+                                                     └──►  pose model / CNN wake
+```
+
+### Implementation home
+
+- **Sketch type and SIMD primitives**:
+  `vendor/ruvector/crates/ruvector-core/src/quantization.rs::BinaryQuantized`
+  — already implemented, already SIMD-accelerated (NEON on aarch64,
+  POPCNT on x86_64). Re-export through a new
+  `crates/wifi-densepose-ruvector/src/sketch.rs` module so consumers in
+  `signal`, `train`, `mat`, and `sensing-server` see a stable
+  RuView-flavored API and don't bind directly to the vendor crate.
+
+- **Per-room normal-state bank**: lives at the cluster Pi (ADR-083),
+  not on the sensor MCU. Sensor MCUs continue to emit dense embeddings
+  in the existing `rv_feature_state_t` packet shape; sketching happens
+  on the Pi where the candidate bank is.
+
+- **Sketch versioning**: each sketch carries a 16-bit `sketch_version`
+  field so the Pi can tell incompatible sketches apart when an
+  embedding model upgrades. Bumped on every embedding-model change.
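+
+One way the version check could look, assuming the `sketch_version`
+and `embedding_dim` fields proposed for the public type. The `Sketch`
+layout, the `SketchError` enum, and the `Result`-returning `distance`
+are hypothetical; this ADR does not fix the final API.
+
+```rust
+/// Hypothetical versioned sketch: packed sign bits plus metadata.
+#[derive(Debug, Clone, PartialEq)]
+pub struct Sketch {
+    pub sketch_version: u16,
+    pub embedding_dim: u16,
+    pub bits: Vec<u8>,
+}
+
+/// Reasons two sketches cannot be compared (names are assumptions).
+#[derive(Debug, PartialEq)]
+pub enum SketchError {
+    VersionMismatch { left: u16, right: u16 },
+    DimMismatch { left: u16, right: u16 },
+}
+
+impl Sketch {
+    /// Hamming distance, refused when version or dimension differ.
+    pub fn distance(&self, other: &Sketch) -> Result<u32, SketchError> {
+        if self.sketch_version != other.sketch_version {
+            return Err(SketchError::VersionMismatch {
+                left: self.sketch_version,
+                right: other.sketch_version,
+            });
+        }
+        if self.embedding_dim != other.embedding_dim {
+            return Err(SketchError::DimMismatch {
+                left: self.embedding_dim,
+                right: other.embedding_dim,
+            });
+        }
+        Ok(self
+            .bits
+            .iter()
+            .zip(&other.bits)
+            .map(|(a, b)| (a ^ b).count_ones())
+            .sum())
+    }
+}
+```
+
+Returning an error instead of a distance keeps a stale bank entry from
+silently scoring as "near" or "far" after an embedding-model upgrade.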
+
+### Where the sensor sits in the pipeline
+
+| Pipeline stage | Today (full float) | With RaBitQ similarity sensor |
+|---|---|---|
+| AETHER re-ID match | full 128-d cosine on every active track × candidate | hamming pre-filter to top-K, then full cosine on K |
+| Mincut subcarrier selection | full graph re-evaluation | sketch-flagged "likely-changed" boundary edges, full mincut on those |
+| CSI room fingerprint | trained classifier on full embedding | sketch hamming to per-room sketch, classifier on miss |
+| Field-model novelty (ADR-030) | residual-energy threshold | sketch novelty as second gate before SVD redo |
+| Mesh / inter-cluster sync | dense embedding broadcast | sketch broadcast; full vector only on miss |
+| Event log retention | full embedding stored | sketch + witness hash stored; raw embedding ephemeral |
+
+In every row, the **decision boundary is unchanged** — full precision
+still owns the final answer. The sketch is a sensor that only gates
+which comparisons run, not what they decide.
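+
+The gating idea can be sketched as a two-stage match: hamming distance
+picks the top-K candidates, and full cosine similarity alone decides
+among them. All names here (`best_match`, `sign_bits`, `cosine`) are
+illustrative, and candidate sketches are recomputed on every call for
+brevity, whereas a real bank would cache them.
+
+```rust
+/// Pack sign bits (assumed rule: bit = 1 when x > 0).
+fn sign_bits(v: &[f32]) -> Vec<u8> {
+    let mut bits = vec![0u8; (v.len() + 7) / 8];
+    for (i, &x) in v.iter().enumerate() {
+        if x > 0.0 {
+            bits[i / 8] |= 1u8 << (i % 8);
+        }
+    }
+    bits
+}
+
+fn hamming(a: &[u8], b: &[u8]) -> u32 {
+    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
+}
+
+fn cosine(a: &[f32], b: &[f32]) -> f32 {
+    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
+    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
+    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
+    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
+}
+
+/// Index of the best candidate: hamming chooses which K comparisons
+/// run, full-precision cosine decides the winner among those K.
+pub fn best_match(query: &[f32], candidates: &[Vec<f32>], k: usize) -> Option<usize> {
+    let q = sign_bits(query);
+    let mut by_hamming: Vec<(u32, usize)> = candidates
+        .iter()
+        .enumerate()
+        .map(|(i, c)| (hamming(&q, &sign_bits(c)), i))
+        .collect();
+    by_hamming.sort_unstable(); // nearest first, ties broken by index
+    by_hamming
+        .into_iter()
+        .take(k)
+        .map(|(_, i)| (i, cosine(query, &candidates[i])))
+        .max_by(|a, b| a.1.total_cmp(&b.1))
+        .map(|(i, _)| i)
+}
+```
+
+With a K that is too small the true best candidate may never reach the
+cosine stage, which is why top-K coverage is one of the acceptance
+numbers.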
+
+### Acceptance criterion (per the source proposal)
+
+The system-level acceptance test is:
+
+> RaBitQ should reduce compare cost by **8× to 30×** while preserving
+> top-k decisions well enough that full refinement changes **fewer
+> than 10%** of final results.
+
+Concretely, this means:
+
+- Sketch compare must be measurably **at least 8× cheaper** than the
+  float comparison it replaces (Criterion bench in `signal/`).
+- Top-K candidate set chosen by sketch must contain ≥ 90% of the
+  candidates the full-float pass would have picked (offline replay
+  against recorded CSI).
+- End-to-end pose / re-ID accuracy must regress by **less than 1
+  percentage point** vs the full-float baseline on the existing
+  evaluation set.
+
+If any of these three fail, the sensor is rolled back at that point in
+the pipeline and the failing site reverts to full float; the rest of
+the pipeline keeps using sketches. This is point-by-point, not
+all-or-nothing.
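+
+A per-site gate over the three numbers might look like the following.
+The `SiteMeasurement` struct, its field names, and the way timings are
+collected are assumptions for illustration; only the thresholds (8×,
+90%, 1 pp) come from this ADR.
+
+```rust
+use std::collections::HashSet;
+
+/// Fraction of the full-float top-K that the sketch top-K also contains.
+pub fn topk_coverage(sketch_topk: &[usize], float_topk: &[usize]) -> f64 {
+    if float_topk.is_empty() {
+        return 1.0;
+    }
+    let chosen: HashSet<usize> = sketch_topk.iter().copied().collect();
+    let hits = float_topk.iter().filter(|i| chosen.contains(*i)).count();
+    hits as f64 / float_topk.len() as f64
+}
+
+/// Hypothetical measurements gathered for one pipeline site.
+pub struct SiteMeasurement {
+    pub float_compare_ns: f64,
+    pub sketch_compare_ns: f64,
+    pub coverage: f64,
+    pub baseline_accuracy_pct: f64,
+    pub sketch_accuracy_pct: f64,
+}
+
+/// True when the site meets all three criteria; false means roll back
+/// this site only and leave the other sites on sketches.
+pub fn site_passes(m: &SiteMeasurement) -> bool {
+    let speedup = m.float_compare_ns / m.sketch_compare_ns;
+    let regression_pp = m.baseline_accuracy_pct - m.sketch_accuracy_pct;
+    speedup >= 8.0 && m.coverage >= 0.90 && regression_pp < 1.0
+}
+```
+
+Evaluating `site_passes` once per site, rather than once for the whole
+pipeline, mirrors the point-by-point rollback rule.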
+
+## Consequences
+
+### Positive
+
+- **Cheaper hot path everywhere a "have I seen this" question lives.**
+  AETHER re-ID, mincut maintenance, room fingerprinting, novelty
+  detection, mesh sync, and event-log retention all run a 32×-smaller,
+  popcnt-friendly comparison first.
+- **Always-on anomaly gating becomes affordable.** The CNN / pose
+  model only wakes when sketch novelty crosses a threshold. Energy
+  budget per node drops materially in steady-state quiet rooms.
+- **Privacy story improves.** Event logs and inter-cluster mesh
+  traffic carry sketches and witness hashes, not reconstructable
+  embeddings. The 1-bit quantization keeps only the sign of each
+  dimension and discards magnitudes, so a sketch is *not* invertible
+  to the original embedding or CSI.
+- **Composes cleanly with ADR-083.** The cluster Pi is the natural
+  home for the sketch bank; sensor MCUs remain unchanged.
+- **No new dependency.** `BinaryQuantized` is already in the vendored
+  `ruvector-core` and already SIMD-accelerated.
+
+### Negative / risks
+
+- **Sketch quality depends on embedding distribution.** Pure 1-bit
+  sign quantization (which `BinaryQuantized` implements) works best
+  when the embedding space is roughly zero-centered and isotropic.
+  AETHER and CSI spectrogram embeddings need to be benchmarked for
+  this assumption; if either fails, a randomized rotation
+  (Johnson-Lindenstrauss / RaBitQ-paper-style) must be added before
+  sketching. Out-of-scope for this ADR; tracked as a follow-up if
+  the acceptance test fails.
+- **Top-K coverage degrades for small candidate sets.** With < 16
+  candidates, the sketch compare can select the wrong top-K members.
+  Site-by-site fallback to full float is part of the rollout plan.
+- **Sketch-version skew during model upgrades.** A model change
+  invalidates all stored sketches; the cluster Pi must re-sketch the
+  candidate bank when `sketch_version` bumps. Cost is bounded but
+  non-zero.
+
+### Neutral
+
+- ADR-024, ADR-027, ADR-029, ADR-030, ADR-076 are unchanged in
+  *what* they compute. They gain a sketch pre-filter at the comparison
+  step.
+- ADR-082's confirmed-track output filter is upstream of the sketch
+  layer; it stays correct.
+
+## Implementation
+
+The implementation lands in five passes, each independently testable.
+Every pass is gated by the acceptance criterion above; if any fail,
+that site rolls back and the rest continue.
+
+1. **`wifi-densepose-ruvector::sketch` module.** Re-export
+   `BinaryQuantized` plus a thin RuView-flavored API
+   (`Sketch::from_embedding`, `Sketch::distance`, `SketchBank::topk`).
+   Add `sketch_version: u16` and `embedding_dim: u16` fields to the
+   public type. Criterion benches: sketch ↔ float compare-cost ratio.
+
+2. **AETHER re-ID pre-filter.** In
+   `wifi-densepose-signal/src/ruvsense/pose_tracker.rs`, before
+   computing the full 128-d cosine across active tracks × candidates,
+   sketch both sides and reduce to top-K via hamming. Bench: re-ID
+   pass time per frame, ID-stability under cross-room transitions.
+
+3. **Cluster-Pi novelty sensor.** In
+   `wifi-densepose-sensing-server`, maintain a per-room
+   `SketchBank` of "normal-state" sketches; on each incoming
+   `rv_feature_state_t`, sketch the embedding, score novelty
+   against the bank, and emit `novelty_score` as a new field on the
+   WebSocket update envelope. The heavy CNN wake gate uses this score.
+
+4. **Mesh-exchange compression.** Inter-cluster broadcasts (the
+   ADR-066 swarm-bridge channel) carry sketch + witness instead of
+   the full embedding when novelty is low. The full embedding is
+   exchanged only when novelty crosses the threshold.
+
+5. **Privacy-preserving event log.** Event log table on the cluster
+   Pi stores `(sketch_bytes, sketch_version, novelty_score,
+   witness_sha256)` instead of raw embeddings. The API for existing
+   log readers is unchanged; only the storage layer is rewritten.
+
+Each pass adds tests: a property test (sketch ↔ float top-K agreement
+≥ 90%), a criterion bench (≥ 8× compare cost reduction), and an
+end-to-end accuracy regression test (< 1 pp drop).
+
+## Validation
+
+This ADR is **proposed**, not accepted. Acceptance requires the three
+acceptance numbers above to hold on **at least three of the five
+implementation passes** (the sites where the bulk of the load sits:
+AETHER re-ID, cluster-Pi novelty, and event log). The mesh-exchange
+pass is a nice-to-have, as is the mincut pre-filter from the pipeline
+table (which is not one of the five passes); they can ship afterward
+if their per-site numbers hold.
+
+Validation runs against:
+
+- the existing 1,539-test workspace suite (must stay green)
+- a new `tests/integration/rabitq_sketch_pipeline.rs` integration test
+  driving recorded CSI through the full pipeline with and without
+  sketches, comparing top-K decisions and end-to-end pose accuracy
+- ESP32-S3 on COM7 — sensor MCU unchanged; sketch happens at the
+  cluster Pi, so this validation is a smoke test that the
+  sensor → Pi UDP path still works after the cluster Pi gains the
+  sketch bank
+
+## Related
+
+- **ADR-024** (Accepted) — AETHER re-ID embeddings. Primary consumer
+  of the sketch pre-filter.
+- **ADR-027** (Accepted) — Cross-environment domain generalization
+  (MERIDIAN). Per-room sketch bank is the natural data structure for
+  domain detection.
+- **ADR-030** (Proposed) — RuvSense persistent field model. Sketch
+  novelty is the cheap second gate before SVD recompute.
+- **ADR-066** — Swarm bridge to coordinator. Inter-cluster sketch
+  exchange.
+- **ADR-076** (Accepted) — CSI spectrogram embeddings. Sketch
+  consumer; embedding source.
+- **ADR-081** (Accepted) — 5-layer adaptive CSI mesh firmware kernel.
+  Sensor MCU unchanged by this ADR; sketches happen at the cluster Pi.
+- **ADR-083** (Proposed) — Per-cluster Pi compute hop. Defines the
+  device class that hosts the sketch bank.
+
+## Open questions
+
+- **Does `BinaryQuantized` need a randomized rotation pre-pass for
+  RuView's embedding distributions?** Pure sign quantization assumes
+  zero-centered, isotropic embeddings. If AETHER / spectrogram
+  distributions are skewed (likely for spectrogram), add a
+  `randomized_rotation` pre-pass following the original RaBitQ paper
+  (Gao & Long, SIGMOD 2024). Decided after pass-1 benchmark.
+- **Sketch dimension target.** Default to the embedding's native
+  dimension (128 for AETHER, 256 for spectrogram). Higher-dimensional
+  sketches (Johnson-Lindenstrauss-projected to 512) trade compute for
+  recall; benchmark before committing.
+- **Per-room vs per-deployment sketch banks.** Defaulting to per-room
+  for novelty detection. Cross-room re-ID may want a shared bank;
+  decide once cross-room AETHER traces are available.
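+
+If the rotation pre-pass turns out to be needed, a cheap structured
+stand-in is random sign flips followed by a normalized Walsh-Hadamard
+transform. This is not the RaBitQ paper's method, which uses a random
+orthogonal matrix; it is a swapped-in technique shown only to
+illustrate the shape of a `randomized_rotation` step. The function
+names, the xorshift seed handling, and the power-of-two length
+requirement are assumptions.
+
+```rust
+/// In-place fast Walsh-Hadamard transform, scaled by 1/sqrt(n) so the
+/// transform is orthonormal (vector norms are preserved).
+pub fn fwht(v: &mut [f32]) {
+    let n = v.len();
+    assert!(n.is_power_of_two(), "length must be a power of two");
+    let mut h = 1;
+    while h < n {
+        for block in (0..n).step_by(2 * h) {
+            for i in block..block + h {
+                let (a, b) = (v[i], v[i + h]);
+                v[i] = a + b;
+                v[i + h] = a - b;
+            }
+        }
+        h *= 2;
+    }
+    let scale = 1.0 / (n as f32).sqrt();
+    for x in v.iter_mut() {
+        *x *= scale;
+    }
+}
+
+/// Deterministic +1/-1 signs from a seed (xorshift64), so every node
+/// sharing the seed applies the same rotation.
+pub fn random_signs(n: usize, seed: u64) -> Vec<f32> {
+    let mut s = seed.max(1); // xorshift state must be non-zero
+    (0..n)
+        .map(|_| {
+            s ^= s << 13;
+            s ^= s >> 7;
+            s ^= s << 17;
+            if (s & 1) == 0 { 1.0 } else { -1.0 }
+        })
+        .collect()
+}
+
+/// Flip signs, then mix all dimensions with the Hadamard transform.
+pub fn rotate(embedding: &[f32], signs: &[f32]) -> Vec<f32> {
+    let mut out: Vec<f32> = embedding.iter().zip(signs).map(|(x, s)| x * s).collect();
+    fwht(&mut out);
+    out
+}
+```
+
+Both AETHER (128) and the largest spectrogram size (256) are powers of
+two, so this form would apply without padding; the rotated vector is
+what would then be sign-quantized.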