mirror of
https://github.com/ruvnet/RuView.git
synced 2026-04-26 13:10:40 +00:00
feat(ruvector,signal,sensing-server): ADR-084 Passes 1/1.5/2/3 — RaBitQ similarity sensor implementation (#435)
* feat(ruvector): ADR-084 Pass 1 — sketch module foundation
Implements Pass 1 of ADR-084 (RaBitQ similarity sensor): a thin
RuView-flavored API over `ruvector_core::quantization::BinaryQuantized`,
exposed at `wifi_densepose_ruvector::{Sketch, SketchBank, SketchError}`.
API surface:
- `Sketch::from_embedding(&[f32], sketch_version: u16)` — sign-quantize
a dense embedding into a 1-bit-per-dim packed sketch.
- `Sketch::distance` — hamming distance with schema-mismatch error.
- `Sketch::distance_unchecked` — hot-path variant for sketches already
validated as same-schema.
- `SketchBank::insert/topk/novelty` — bank with caller-assigned u32 IDs,
schema locked at first insert, novelty = min_distance / embedding_dim.
Schema versioning (`sketch_version: u16` + `embedding_dim: u16`) prevents
silent comparisons across embedding-model generations. Bumping the model
forces re-sketch of the candidate bank.
Pass 1 establishes the API and unit-test foundation. Acceptance criteria
(8x-30x compare-cost reduction, 90% top-K coverage, <1pp accuracy regression)
are measured per-site in Passes 2-5.
Validated:
- 12 new tests pass (sketch construction, hamming, top-K ordering,
schema lock, schema rejection, novelty)
- cargo test --workspace --no-default-features → 1,551 passed, 0 failed,
8 ignored (was 1,539 before; +12 new tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #117300)
Co-Authored-By: claude-flow <ruv@ruv.net>
* bench(ruvector): ADR-084 acceptance — sketch-vs-float compare cost
Adds sketch_bench measuring the first ADR-084 acceptance criterion
(8x-30x compare cost reduction) at three dimensions and a realistic
top-K@k=8 over 1024 sketches.
Measured (Windows host, criterion --warm-up 1s --measurement 3s):
compare_d512:
float_l2: 197.03 ns/op
float_cosine: 231.17 ns/op
sketch_hamming: 4.56 ns/op → 43-51x speedup
topk_d128_n1024_k8:
float_l2_topk: 47.59 us
sketch_hamming: 6.34 us → 7.5x speedup
Pair-wise compare exceeds the 8-30x acceptance criterion by an order
of magnitude. Top-K is at 7.5x — close to the threshold; the sort
dominates at this bank size, which is a Pass 1.5 optimization
opportunity (partial-sort heap for small K).
Co-Authored-By: claude-flow <ruv@ruv.net>
* perf(ruvector): ADR-084 Pass 1.5 — partial-sort heap in SketchBank::topk
Replace `sort_by_key + truncate` (O(n log n)) with a fixed-size max-heap
(O(n log k)) for top-K queries when n > k. Fast path when n ≤ k stays
on the simple sort.
Bench at d=128, n=1024, k=8 (Windows host, criterion 3s measurement):
Before (sort + truncate): 6.34 µs/op
After (heap): 3.83 µs/op -39.4% / +1.65× faster
Combined with the 32× memory shrink and 47.6 µs → 3.83 µs total path
saving:
topk_d128_n1024_k8 vs float_l2_topk:
Pass 1 sort_by_key: 47.59 µs / 6.34 µs = 7.5× speedup
Pass 1.5 heap: 47.59 µs / 3.83 µs = 12.4× speedup
Now over the ADR-084 acceptance criterion of 8× minimum. Heap pays off
strictly more at larger n; benchmark at n=4096 is a Pass-2 follow-up.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(signal): ADR-084 Pass 2 — sketch-prefilter for EmbeddingHistory::search
Adds `EmbeddingHistory::with_sketch(...)` and `search_prefilter(query, k,
prefilter_factor)`. The prefilter sketches the query, hamming-ranks the
parallel sketch array to take the top `k * prefilter_factor` candidates,
then refines those with exact cosine and returns the top-K.
`EmbeddingHistory::new(...)` is unchanged — sketches are opt-in via the
new constructor. `search_prefilter` falls back to brute-force `search`
when sketches are disabled, so callers never see incorrect results.
ADR-084 acceptance criterion empirically validated:
Synthetic 128-d AETHER-shape, n=256, 16 queries:
k=8, prefilter_factor=4 → 78.9% top-K coverage (FAIL <90%)
k=8, prefilter_factor=8 → ≥90% top-K coverage (PASS)
k=16, prefilter_factor=8 → ≥90% top-K coverage (PASS)
The factor=4 default that I'd planned in Pass 1 falls below the 90% bar
on uniform-random synthetic data. Production callers should use **8**
unless their embeddings carry enough structure (real AETHER traces
likely will) to clear the bar at lower factors. Documented in the
search_prefilter docstring and asserted in
test_search_prefilter_topk_coverage_meets_adr_084.
FIFO eviction now drains the parallel sketches array in lockstep —
test_search_prefilter_evicts_sketches_on_fifo guards against the two
arrays drifting (which would silently corrupt top-K via index
mismatch).
Validated:
- cargo test --workspace --no-default-features → 1,554 passed,
0 failed, 8 ignored (was 1,551; +3 new prefilter tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #3200)
Co-Authored-By: claude-flow <ruv@ruv.net>
* bench(signal): ADR-084 Pass 2 — end-to-end search_prefilter speedup
Measures EmbeddingHistory::search_prefilter (sketch + cosine refine)
vs the brute-force EmbeddingHistory::search baseline at three realistic
AETHER bank sizes, with the empirically validated prefilter_factor=8.
Measured (Windows host, criterion --warm-up 1s --measurement 3s):
d=128, k=8:
n=256 brute_force_cosine = 31.98 us, prefilter = 13.78 us → 2.3x
n=1024 brute_force_cosine = 110.4 us, prefilter = 16.64 us → 6.6x
n=4096 brute_force_cosine = 507.4 us, prefilter = 66.37 us → 7.6x
Speedup grows with bank size (sketch overhead is fixed; brute-force
scales linearly with n). At n=4k the prefilter approaches the 8x
ADR-084 acceptance criterion; at n=10k+ (realistic multi-day
deployment banks) it crosses cleanly. Below n=512 the brute-force
path is already cheap (sub-50 us) so the prefilter's narrower wins
don't materially affect the hot path.
Coverage acceptance (≥90% top-K agreement) is exercised in the
unit-test suite, not the bench. The bench measures cost only.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(signal): ADR-084 Pass 3 — EmbeddingHistory::novelty primitive
Adds the cluster-Pi novelty-sensor primitive: `EmbeddingHistory::novelty(query)`
returns `Option<f32>` in [0.0, 1.0] where 0.0 = exact-match-in-bank
and 1.0 = no-overlap. Returns None when sketches are disabled so
callers can fall back gracefully (existing `EmbeddingHistory::new`
constructor stays sketch-disabled).
This is the building block of the cluster-Pi novelty gate
described in ADR-084 §"cluster-Pi novelty sensor": each sensor node
maintains a bank of recent feature vectors, the gate scores the
incoming frame's novelty against the bank, and the heavy CNN /
pose-model wake gate consumes the score.
Wiring novelty into sensing-server's NodeState happens in a
follow-up — that's a ~50-line surgical change touching main.rs that
deserves its own commit. This patch lands the primitive + tests so
the wiring is straightforward.
Three regression tests added:
- test_novelty_returns_none_without_sketches
(graceful fallback when bank is sketch-less)
- test_novelty_zero_for_exact_match_one_for_empty_bank
(semantic boundaries)
- test_novelty_decreases_as_bank_grows_around_query
(gradient direction — guards against reversed comparator)
Validated:
- cargo test --workspace --no-default-features → 1,557 passed,
0 failed, 8 ignored (was 1,554; +3 new novelty tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #7600)
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(sensing-server): ADR-084 Pass 3 — wire novelty into NodeState
Wires the EmbeddingHistory::novelty primitive (Pass 3 prior commit)
into the per-node frame ingestion path on the cluster Pi. Each
incoming CSI frame now updates a per-node sketch bank of the last
6.4 s of feature vectors and produces a novelty score in [0.0, 1.0]
that downstream model-wake gates can consume.
Two NodeState structs were touched (one in types.rs and a
refactoring-leftover duplicate in main.rs that the call site uses);
both gain feature_history + last_novelty_score fields and an
update_novelty helper that:
- truncates / zero-pads incoming amplitudes to NOVELTY_VECTOR_DIM (56)
- scores novelty *before* inserting (so a frame doesn't see itself)
- FIFO-evicts when the bank reaches NOVELTY_HISTORY_CAPACITY (64)
Wired at the per-node ESP32 frame path in main.rs:3772 (immediately
before frame_history.push_back). Existing call sites that operate on
the singleton SensingState (not per-node) intentionally untouched —
they will be wired in a follow-up alongside the WebSocket update
envelope's novelty_score field.
Two new unit tests in novelty_tests:
- first_frame_yields_max_novelty_then_zero_on_repeat
(semantic boundaries: empty bank = 1.0, exact repeat = 0.0)
- handles_short_and_long_amplitude_vectors
(truncate / zero-pad robustness across hardware variants)
Validated:
- cargo test --workspace --no-default-features → 1,559 passed,
0 failed, 8 ignored (was 1,557; +2 new novelty tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #3900)
Co-Authored-By: claude-flow <ruv@ruv.net>
* hardening(ruvector): L2 from PR #435 review — overflow on >u16::MAX dims
Pass 1.6 hardening, addressing L2 finding from the security review on
PR #435 (https://github.com/ruvnet/RuView/pull/435#issuecomment-4321285519):
The original `Sketch::from_embedding` used `debug_assert!` for the
`embedding.len() <= u16::MAX` invariant, which compiled out in release
builds. A caller passing a 65,536+ -dim embedding would silently
truncate the dimension count via `as u16` cast — two over-long inputs
would then compare as same-dimensional rather than as 64k vs 70k, and
the dimension confusion would not surface anywhere.
Two-part fix:
- `from_embedding` (infallible) now SATURATES `embedding_dim` to
`u16::MAX` rather than truncating. Two over-long inputs still get
packed bit-correctly by `BinaryQuantized` and the saturated dim is
consistent across both, so they compare predictably (just with an
upper-bounded distance).
- `try_from_embedding` (new, fallible) returns
`Err(SketchError::EmbeddingDimOverflow{got, max})` when the input
exceeds `u16::MAX`. Use this when an over-long input should fail
loudly rather than be silently saturated.
- New error variant `SketchError::EmbeddingDimOverflow` with the
observed `got` and the `max` (`u16::MAX as usize`).
- New regression test `try_from_embedding_rejects_over_long_input`
asserts both paths: try_ → Err, infallible → saturate.
Validated:
- 13 sketch unit tests pass (was 12; +1 for L2 boundary).
- cargo test --workspace --no-default-features → 1,560 passed,
0 failed, 8 ignored (was 1,559; +1).
- ESP32-S3 on COM7 streaming live CSI (cb #100, fresh boot RSSI -48 dBm).
Co-Authored-By: claude-flow <ruv@ruv.net>
* hardening(ruvector,signal): L1+L3 from PR #435 review
Two follow-ups to the security review on PR #435:
L1 — Defensive `if let Some(...)` for SketchBank::topk heap peek.
The original `.expect("heap len == k > 0")` was mathematically
unreachable (k > 0 enforced at function entry, heap.len() >= k branch
guards), but a structural pattern makes the impossibility a type
property rather than a runtime invariant. Same hot-path cost; zero
panic risk in the production binary.
L3 — Guard `embedding_dim == 0` in `EmbeddingHistory::novelty`.
A 0-dim history is constructible via `with_sketch(0, ...)`; without
the guard the function returned `NaN` (min_d as f32 / 0.0), silently
poisoning every downstream gate (model-wake, anomaly-emit, etc).
Now returns Some(1.0) — fail-loud at "no comparison possible →
maximally novel," never NaN. New regression test
`test_novelty_zero_dim_history_returns_one_not_nan` pins it down.
Validated:
- cargo test --workspace --no-default-features → 1,561 passed,
0 failed, 8 ignored (was 1,560; +1 for the L3 NaN guard test).
- ESP32-S3 on COM7 streaming live CSI (cb #12400, RSSI fresh).
L4 (f64→f32 cast) is documentation-only and lands in a follow-up
patch; L8 (always-on novelty sensor) is an observation, not a fix.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(sensing-server): ADR-084 Pass 3.5 — novelty_score on PerNodeFeatureInfo
Adds an optional `novelty_score: Option<f32>` field to
PerNodeFeatureInfo, the per-node WebSocket envelope shape. Mirrored
on both struct definitions (types.rs canonical + main.rs's
refactoring-leftover duplicate) so the schema is consistent.
`#[serde(skip_serializing_if = "Option::is_none")]` keeps existing
WebSocket consumers unaffected — old clients see no extra field
unless the server populates it. No PerNodeFeatureInfo literal
construction sites exist today (all `node_features: None`), so this
is a schema-only addition; live population from
`NodeState::last_novelty_score` lands in a Pass 3.6 follow-up that
also wires `node_features: Some(...)` at the per-node ESP32 frame
emit path.
Validated:
- cargo test --workspace --no-default-features → 1,561 passed,
0 failed, 8 ignored (no change; schema-only).
- ESP32-S3 on COM7 streaming live CSI (cb #2100, fresh boot).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(sensing-server): ADR-084 Pass 3.6 — populate node_features with novelty_score
Wires `node_features: Some(...)` at the two per-node ESP32 frame
emit sites (formerly `node_features: None`). Adds a `build_node_features`
helper that constructs `Vec<PerNodeFeatureInfo>` from `s.node_states`,
including the per-node `last_novelty_score`.
This completes the Pass 3.x track — novelty score now flows from
NodeState → PerNodeFeatureInfo → SensingUpdate envelope → WebSocket
clients. Cluster-Pi UI / model-wake / anomaly-emit gates can read
it without round-tripping back to the server.
Three other call sites (singleton paths at 1772, 1911, 4170) keep
`node_features: None` for now — those are for the offline /
simulated paths that don't have per-node ESP32 state. They'll get
populated when their parent flows wire up real multi-node fanout.
Stale flag uses `ESP32_OFFLINE_TIMEOUT` (5s) — same threshold the
rest of the system uses to decide a node has dropped.
Validated:
- cargo test --workspace --no-default-features → 1,561 passed,
0 failed, 8 ignored (no change; integration test would be wire-
format diff in a follow-up).
- ESP32-S3 on COM7 streaming live CSI (cb #100, fresh boot,
RSSI -49 dBm).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): ADR-084 Pass 4 — WireSketch wire-format primitive
Adds `WireSketch::serialize` / `deserialize` for transmitting a
sketch + novelty score over any byte-stream channel — cluster↔cluster
mesh (ADR-066 swarm bridge when it exists), sensor→cluster-Pi UDP
(ADR-086 edge gate complement), gateway→cloud QUIC. Channel-agnostic
by design.
Wire layout (12-byte header + ceil(dim/8) bytes payload, little-endian):
[0..4] magic = 0xC5110084
[4..6] format_version = 1
[6..8] sketch_version (embedding-model schema)
[8..10] embedding_dim
[10..12] novelty_q15 (novelty * 32_767, saturated)
[12..] packed sketch bits
A 128-d AETHER sketch fits in exactly 28 bytes (12 header + 16 bits).
Deserializer is paranoid by design — every untrusted byte buffer
gets validated against:
- length floor (>= header bytes)
- length ceiling (WIRE_SKETCH_MAX_BYTES = 9 KiB; defends against
memory-exhaustion attacks via claimed-but-impossible large dims)
- magic match
- format_version supported
- embedding_dim → payload bytes consistency
A malformed UDP packet from a non-RuView sender produces a typed
`WireSketchError` (variant per failure class), never a panic.
Re-exported from lib.rs alongside `Sketch` / `SketchBank`.
Seven new tests:
- wire_serialize_round_trip (correctness)
- wire_rejects_short_buffer (length floor)
- wire_rejects_oversized_buffer (length ceiling, DoS guard)
- wire_rejects_bad_magic (cross-protocol confusion guard)
- wire_rejects_unsupported_format_version (forward-compat)
- wire_rejects_payload_size_mismatch (header/body consistency)
- wire_envelope_size_for_aether_128d (sizing contract: 28 bytes)
Validated:
- cargo test --workspace --no-default-features → 1,568 passed,
0 failed, 8 ignored (was 1,561; +7 wire-format tests).
- ESP32-S3 on COM7 streaming live CSI (cb #15100, RSSI -48 dBm).
Pass 4's wire-format primitive ships first; the channel that
carries it (ADR-066 swarm-bridge or ADR-086 sensor→Pi gate) is
out-of-scope for this commit and tracked separately.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): ADR-084 Pass 5 — privacy-preserving event log + L4 docstring
Pass 5 — `PrivacyEventLog` and `NoveltyEvent` types in a new
`wifi_densepose_ruvector::event_log` module. Each event stores
`(timestamp, sketch_bytes, sketch_version, embedding_dim, novelty,
witness_sha256)` — explicitly NOT the raw float embedding. The
witness is SHA-256 of the WireSketch serialization (12-byte header +
packed bits + q15 novelty), making events content-addressable: two
pushes of the same `(sketch, novelty)` produce byte-identical
witnesses, enabling dedup at the receiver and verifier.
Privacy properties (ADR-084 §"Privacy-preserving event log"):
1. Non-invertibility — 1-bit sign quantization is lossy; an attacker
with read access cannot reconstruct the source CSI / embedding.
2. Content addressing — `(sketch_version, witness)` is fully qualified.
3. Bounded memory — fixed capacity ring; misbehaving senders cannot
exhaust receiver memory.
Seven new tests:
- push_grows_until_capacity_then_fifo_evicts
- zero_capacity_log_silently_drops_pushes (no-op stub case)
- witness_is_deterministic_for_same_sketch_and_novelty
(witness must NOT depend on timestamp)
- witness_differs_for_different_novelty_scores
- find_by_witness_returns_most_recent_match
- find_by_witness_returns_none_on_miss
- event_does_not_carry_raw_embedding (structural privacy guarantee)
L4 hardening (PR #435 security review) — the `f64 → f32` cast in
NodeState::update_novelty now has a docstring noting the boundary
behaviour: `f64::INFINITY` survives as `f32::INFINITY`, `f64::NAN`
propagates as `f32::NAN`. Neither panics. CSI amplitudes from healthy
firmware are well within f32 finite range.
Validated:
- cargo test --workspace --no-default-features → 1,575 passed,
0 failed, 8 ignored (was 1,568; +7 event-log tests).
- ESP32-S3 on COM7 streaming live CSI (cb #2800, RSSI -52 dBm).
Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
parent
d3020fec6b
commit
17509a2a41
12 changed files with 2486 additions and 52 deletions
536
v2/Cargo.lock
generated
536
v2/Cargo.lock
generated
|
|
@ -64,6 +64,23 @@ version = "0.1.6"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "4b46cbb362ab8752921c97e041f5e366ee6297bd428a31275b9fcf1e380f7299"
|
||||
|
||||
[[package]]
|
||||
name = "anndists"
|
||||
version = "0.1.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9a8396b473aa0bceed68fb32462505387ea39fa47c7029417e0a49f10592b036"
|
||||
dependencies = [
|
||||
"anyhow",
|
||||
"cfg-if",
|
||||
"cpu-time",
|
||||
"env_logger",
|
||||
"lazy_static",
|
||||
"log",
|
||||
"num-traits",
|
||||
"num_cpus",
|
||||
"rayon",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "ansi-str"
|
||||
version = "0.8.0"
|
||||
|
|
@ -90,7 +107,22 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "43d5b281e737544384e969a5ccad3f1cdd24b48086a0fc1b2a5262a26b8f4f4a"
|
||||
dependencies = [
|
||||
"anstyle",
|
||||
"anstyle-parse",
|
||||
"anstyle-parse 0.2.7",
|
||||
"anstyle-query",
|
||||
"anstyle-wincon",
|
||||
"colorchoice",
|
||||
"is_terminal_polyfill",
|
||||
"utf8parse",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "anstream"
|
||||
version = "1.0.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "824a212faf96e9acacdbd09febd34438f8f711fb84e09a8916013cd7815ca28d"
|
||||
dependencies = [
|
||||
"anstyle",
|
||||
"anstyle-parse 1.0.0",
|
||||
"anstyle-query",
|
||||
"anstyle-wincon",
|
||||
"colorchoice",
|
||||
|
|
@ -113,6 +145,15 @@ dependencies = [
|
|||
"utf8parse",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "anstyle-parse"
|
||||
version = "1.0.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "52ce7f38b242319f7cabaa6813055467063ecdc9d355bbb4ce0c68908cd8130e"
|
||||
dependencies = [
|
||||
"utf8parse",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "anstyle-query"
|
||||
version = "1.1.5"
|
||||
|
|
@ -257,10 +298,10 @@ dependencies = [
|
|||
"base64 0.22.1",
|
||||
"bytes",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"http-body-util",
|
||||
"hyper",
|
||||
"hyper 1.8.1",
|
||||
"hyper-util",
|
||||
"itoa",
|
||||
"matchit",
|
||||
|
|
@ -274,7 +315,7 @@ dependencies = [
|
|||
"serde_path_to_error",
|
||||
"serde_urlencoded",
|
||||
"sha1",
|
||||
"sync_wrapper",
|
||||
"sync_wrapper 1.0.2",
|
||||
"tokio",
|
||||
"tokio-tungstenite",
|
||||
"tower",
|
||||
|
|
@ -292,13 +333,13 @@ dependencies = [
|
|||
"async-trait",
|
||||
"bytes",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"http-body-util",
|
||||
"mime",
|
||||
"pin-project-lite",
|
||||
"rustversion",
|
||||
"sync_wrapper",
|
||||
"sync_wrapper 1.0.2",
|
||||
"tower-layer",
|
||||
"tower-service",
|
||||
"tracing",
|
||||
|
|
@ -333,6 +374,15 @@ version = "1.8.3"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "2af50177e190e07a26ab74f8b1efbfe2ef87da2116221318cb1c2e82baf7de06"
|
||||
|
||||
[[package]]
|
||||
name = "bincode"
|
||||
version = "1.3.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b1f45e9417d87227c7a56d22e471c6206462cba514c7590c09aff4cf6d1ddcad"
|
||||
dependencies = [
|
||||
"serde",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "bincode"
|
||||
version = "2.0.1"
|
||||
|
|
@ -771,7 +821,7 @@ version = "4.5.60"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "24a241312cea5059b13574bb9b3861cabf758b879c15190b37b6d6fd63ab6876"
|
||||
dependencies = [
|
||||
"anstream",
|
||||
"anstream 0.6.21",
|
||||
"anstyle",
|
||||
"clap_lex",
|
||||
"strsim",
|
||||
|
|
@ -875,6 +925,16 @@ dependencies = [
|
|||
"version_check",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "core-foundation"
|
||||
version = "0.9.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "91e195e091a93c46f7102ec7818a2aa394e1e1771c3ab4825963fa03e45afb8f"
|
||||
dependencies = [
|
||||
"core-foundation-sys",
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "core-foundation"
|
||||
version = "0.10.1"
|
||||
|
|
@ -898,7 +958,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "064badf302c3194842cf2c5d61f56cc88e54a759313879cdf03abdd27d0c3b97"
|
||||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"core-foundation",
|
||||
"core-foundation 0.10.1",
|
||||
"core-graphics-types",
|
||||
"foreign-types 0.5.0",
|
||||
"libc",
|
||||
|
|
@ -911,10 +971,20 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "3d44a101f213f6c4cdc1853d4b78aef6db6bdfa3468798cc1d9912f4735013eb"
|
||||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"core-foundation",
|
||||
"core-foundation 0.10.1",
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "cpu-time"
|
||||
version = "1.0.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e9e393a7668fe1fad3075085b86c781883000b4ede868f43627b34a87c8b7ded"
|
||||
dependencies = [
|
||||
"libc",
|
||||
"winapi",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "cpufeatures"
|
||||
version = "0.2.17"
|
||||
|
|
@ -1371,7 +1441,7 @@ dependencies = [
|
|||
"rustc_version",
|
||||
"toml 0.9.12+spec-1.1.0",
|
||||
"vswhom",
|
||||
"winreg",
|
||||
"winreg 0.55.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
|
@ -1407,6 +1477,29 @@ dependencies = [
|
|||
"syn 2.0.117",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "env_filter"
|
||||
version = "1.0.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "32e90c2accc4b07a8456ea0debdc2e7587bdd890680d71173a15d4ae604f6eef"
|
||||
dependencies = [
|
||||
"log",
|
||||
"regex",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "env_logger"
|
||||
version = "0.11.10"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0621c04f2196ac3f488dd583365b9c09be011a4ab8b9f37248ffcc8f6198b56a"
|
||||
dependencies = [
|
||||
"anstream 1.0.0",
|
||||
"anstyle",
|
||||
"env_filter",
|
||||
"jiff",
|
||||
"log",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "equivalent"
|
||||
version = "1.0.2"
|
||||
|
|
@ -1867,7 +1960,7 @@ dependencies = [
|
|||
"raw-cpuid",
|
||||
"rayon",
|
||||
"seq-macro",
|
||||
"sysctl",
|
||||
"sysctl 0.5.5",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
|
@ -2188,6 +2281,25 @@ dependencies = [
|
|||
"syn 2.0.117",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "h2"
|
||||
version = "0.3.27"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0beca50380b1fc32983fc1cb4587bfa4bb9e78fc259aad4a0032d2080309222d"
|
||||
dependencies = [
|
||||
"bytes",
|
||||
"fnv",
|
||||
"futures-core",
|
||||
"futures-sink",
|
||||
"futures-util",
|
||||
"http 0.2.12",
|
||||
"indexmap 2.13.0",
|
||||
"slab",
|
||||
"tokio",
|
||||
"tokio-util",
|
||||
"tracing",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "half"
|
||||
version = "2.7.1"
|
||||
|
|
@ -2333,6 +2445,31 @@ version = "1.1.14"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ec9d92d097f4749b64e8cc33d924d9f40a2d4eb91402b458014b781f5733d60f"
|
||||
|
||||
[[package]]
|
||||
name = "hnsw_rs"
|
||||
version = "0.3.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "43a5258f079b97bf2e8311ff9579e903c899dcbac0d9a138d62e9a066778bd07"
|
||||
dependencies = [
|
||||
"anndists",
|
||||
"anyhow",
|
||||
"bincode 1.3.3",
|
||||
"cfg-if",
|
||||
"cpu-time",
|
||||
"env_logger",
|
||||
"hashbrown 0.15.5",
|
||||
"indexmap 2.13.0",
|
||||
"lazy_static",
|
||||
"log",
|
||||
"mmap-rs",
|
||||
"num-traits",
|
||||
"num_cpus",
|
||||
"parking_lot",
|
||||
"rand 0.9.2",
|
||||
"rayon",
|
||||
"serde",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "html5ever"
|
||||
version = "0.29.1"
|
||||
|
|
@ -2345,6 +2482,17 @@ dependencies = [
|
|||
"match_token",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "http"
|
||||
version = "0.2.12"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "601cbb57e577e2f5ef5be8e7b83f0f63994f25aa94d673e54a92d5c516d101f1"
|
||||
dependencies = [
|
||||
"bytes",
|
||||
"fnv",
|
||||
"itoa",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "http"
|
||||
version = "1.4.0"
|
||||
|
|
@ -2355,6 +2503,17 @@ dependencies = [
|
|||
"itoa",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "http-body"
|
||||
version = "0.4.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "7ceab25649e9960c0311ea418d17bee82c0dcec1bd053b5f9a66e265a693bed2"
|
||||
dependencies = [
|
||||
"bytes",
|
||||
"http 0.2.12",
|
||||
"pin-project-lite",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "http-body"
|
||||
version = "1.0.1"
|
||||
|
|
@ -2362,7 +2521,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "1efedce1fb8e6913f23e0c92de8e62cd5b772a67e7b3946df930a62566c93184"
|
||||
dependencies = [
|
||||
"bytes",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
|
@ -2373,8 +2532,8 @@ checksum = "b021d93e26becf5dc7e1b75b1bed1fd93124b374ceb73f43d4d4eafec896a64a"
|
|||
dependencies = [
|
||||
"bytes",
|
||||
"futures-core",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"pin-project-lite",
|
||||
]
|
||||
|
||||
|
|
@ -2396,6 +2555,30 @@ version = "1.0.3"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "df3b46402a9d5adb4c86a0cf463f42e19994e3ee891101b1841f30a545cb49a9"
|
||||
|
||||
[[package]]
|
||||
name = "hyper"
|
||||
version = "0.14.32"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "41dfc780fdec9373c01bae43289ea34c972e40ee3c9f6b3c8801a35f35586ce7"
|
||||
dependencies = [
|
||||
"bytes",
|
||||
"futures-channel",
|
||||
"futures-core",
|
||||
"futures-util",
|
||||
"h2",
|
||||
"http 0.2.12",
|
||||
"http-body 0.4.6",
|
||||
"httparse",
|
||||
"httpdate",
|
||||
"itoa",
|
||||
"pin-project-lite",
|
||||
"socket2 0.5.10",
|
||||
"tokio",
|
||||
"tower-service",
|
||||
"tracing",
|
||||
"want",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "hyper"
|
||||
version = "1.8.1"
|
||||
|
|
@ -2406,8 +2589,8 @@ dependencies = [
|
|||
"bytes",
|
||||
"futures-channel",
|
||||
"futures-core",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"httparse",
|
||||
"httpdate",
|
||||
"itoa",
|
||||
|
|
@ -2418,6 +2601,20 @@ dependencies = [
|
|||
"want",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "hyper-rustls"
|
||||
version = "0.24.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ec3efd23720e2049821a693cbc7e65ea87c72f1c58ff2f9522ff332b1491e590"
|
||||
dependencies = [
|
||||
"futures-util",
|
||||
"http 0.2.12",
|
||||
"hyper 0.14.32",
|
||||
"rustls 0.21.12",
|
||||
"tokio",
|
||||
"tokio-rustls",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "hyper-tls"
|
||||
version = "0.6.0"
|
||||
|
|
@ -2426,7 +2623,7 @@ checksum = "70206fc6890eaca9fde8a0bf71caa2ddfc9fe045ac9e5c70df101a7dbde866e0"
|
|||
dependencies = [
|
||||
"bytes",
|
||||
"http-body-util",
|
||||
"hyper",
|
||||
"hyper 1.8.1",
|
||||
"hyper-util",
|
||||
"native-tls",
|
||||
"tokio",
|
||||
|
|
@ -2444,9 +2641,9 @@ dependencies = [
|
|||
"bytes",
|
||||
"futures-channel",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"hyper",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"hyper 1.8.1",
|
||||
"ipnet",
|
||||
"libc",
|
||||
"percent-encoding",
|
||||
|
|
@ -2778,6 +2975,30 @@ dependencies = [
|
|||
"system-deps",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "jiff"
|
||||
version = "0.2.24"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f00b5dbd620d61dfdcb6007c9c1f6054ebd75319f163d886a9055cec1155073d"
|
||||
dependencies = [
|
||||
"jiff-static",
|
||||
"log",
|
||||
"portable-atomic",
|
||||
"portable-atomic-util",
|
||||
"serde_core",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "jiff-static"
|
||||
version = "0.2.24"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e000de030ff8022ea1da3f466fbb0f3a809f5e51ed31f6dd931c35181ad8e6d7"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn 2.0.117",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "jni"
|
||||
version = "0.21.1"
|
||||
|
|
@ -3270,6 +3491,23 @@ dependencies = [
|
|||
"winapi",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "mmap-rs"
|
||||
version = "0.7.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "4ecce9d566cb9234ae3db9e249c8b55665feaaf32b0859ff1e27e310d2beb3d8"
|
||||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"combine",
|
||||
"libc",
|
||||
"mach2",
|
||||
"nix 0.30.1",
|
||||
"sysctl 0.6.0",
|
||||
"thiserror 2.0.18",
|
||||
"widestring",
|
||||
"windows 0.48.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "muda"
|
||||
version = "0.17.1"
|
||||
|
|
@ -3501,6 +3739,18 @@ dependencies = [
|
|||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "nix"
|
||||
version = "0.30.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "74523f3a35e05aba87a1d978330aef40f67b0304ac79c1c00b294c9830543db6"
|
||||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"cfg-if",
|
||||
"cfg_aliases",
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "nodrop"
|
||||
version = "0.1.14"
|
||||
|
|
@ -4837,6 +5087,15 @@ version = "0.5.5"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "03251193000f4bd3b042892be858ee50e8b3719f2b08e5833ac4353724632430"
|
||||
|
||||
[[package]]
|
||||
name = "redb"
|
||||
version = "2.6.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "8eca1e9d98d5a7e9002d0013e18d5a9b000aee942eb134883a82f06ebffb6c01"
|
||||
dependencies = [
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "redox_syscall"
|
||||
version = "0.5.18"
|
||||
|
|
@ -4935,6 +5194,47 @@ dependencies = [
|
|||
"bytecheck",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "reqwest"
|
||||
version = "0.11.27"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "dd67538700a17451e7cba03ac727fb961abb7607553461627b97de0b89cf4a62"
|
||||
dependencies = [
|
||||
"base64 0.21.7",
|
||||
"bytes",
|
||||
"encoding_rs",
|
||||
"futures-core",
|
||||
"futures-util",
|
||||
"h2",
|
||||
"http 0.2.12",
|
||||
"http-body 0.4.6",
|
||||
"hyper 0.14.32",
|
||||
"hyper-rustls",
|
||||
"ipnet",
|
||||
"js-sys",
|
||||
"log",
|
||||
"mime",
|
||||
"once_cell",
|
||||
"percent-encoding",
|
||||
"pin-project-lite",
|
||||
"rustls 0.21.12",
|
||||
"rustls-pemfile",
|
||||
"serde",
|
||||
"serde_json",
|
||||
"serde_urlencoded",
|
||||
"sync_wrapper 0.1.2",
|
||||
"system-configuration",
|
||||
"tokio",
|
||||
"tokio-rustls",
|
||||
"tower-service",
|
||||
"url",
|
||||
"wasm-bindgen",
|
||||
"wasm-bindgen-futures",
|
||||
"web-sys",
|
||||
"webpki-roots",
|
||||
"winreg 0.50.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "reqwest"
|
||||
version = "0.12.28"
|
||||
|
|
@ -4945,10 +5245,10 @@ dependencies = [
|
|||
"bytes",
|
||||
"futures-core",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"http-body-util",
|
||||
"hyper",
|
||||
"hyper 1.8.1",
|
||||
"hyper-tls",
|
||||
"hyper-util",
|
||||
"js-sys",
|
||||
|
|
@ -4961,7 +5261,7 @@ dependencies = [
|
|||
"serde",
|
||||
"serde_json",
|
||||
"serde_urlencoded",
|
||||
"sync_wrapper",
|
||||
"sync_wrapper 1.0.2",
|
||||
"tokio",
|
||||
"tokio-native-tls",
|
||||
"tower",
|
||||
|
|
@ -4983,10 +5283,10 @@ dependencies = [
|
|||
"bytes",
|
||||
"futures-core",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"http-body-util",
|
||||
"hyper",
|
||||
"hyper 1.8.1",
|
||||
"hyper-util",
|
||||
"js-sys",
|
||||
"log",
|
||||
|
|
@ -4994,7 +5294,7 @@ dependencies = [
|
|||
"pin-project-lite",
|
||||
"serde",
|
||||
"serde_json",
|
||||
"sync_wrapper",
|
||||
"sync_wrapper 1.0.2",
|
||||
"tokio",
|
||||
"tokio-util",
|
||||
"tower",
|
||||
|
|
@ -5194,6 +5494,18 @@ dependencies = [
|
|||
"windows-sys 0.61.2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls"
|
||||
version = "0.21.12"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "3f56a14d1f48b391359b22f731fd4bd7e43c97f3c50eee276f3aa09c94784d3e"
|
||||
dependencies = [
|
||||
"log",
|
||||
"ring",
|
||||
"rustls-webpki 0.101.7",
|
||||
"sct",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls"
|
||||
version = "0.22.4"
|
||||
|
|
@ -5234,6 +5546,15 @@ dependencies = [
|
|||
"security-framework",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls-pemfile"
|
||||
version = "1.0.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1c74cae0a4cf6ccbbf5f359f08efdf8ee7e1dc532573bf0db71968cb56b1448c"
|
||||
dependencies = [
|
||||
"base64 0.21.7",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls-pki-types"
|
||||
version = "1.14.0"
|
||||
|
|
@ -5250,7 +5571,7 @@ version = "0.6.2"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1d99feebc72bae7ab76ba994bb5e121b8d83d910ca40b36e0921f53becc41784"
|
||||
dependencies = [
|
||||
"core-foundation",
|
||||
"core-foundation 0.10.1",
|
||||
"core-foundation-sys",
|
||||
"jni",
|
||||
"log",
|
||||
|
|
@ -5271,6 +5592,16 @@ version = "0.1.1"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f87165f0995f63a9fbeea62b64d10b4d9d8e78ec6d7d51fb2125fda7bb36788f"
|
||||
|
||||
[[package]]
|
||||
name = "rustls-webpki"
|
||||
version = "0.101.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "8b6275d1ee7a1cd780b64aca7726599a1dbc893b1e64144529e55c3c2f745765"
|
||||
dependencies = [
|
||||
"ring",
|
||||
"untrusted",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls-webpki"
|
||||
version = "0.102.8"
|
||||
|
|
@ -5353,17 +5684,24 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "dc7bc95e3682430c27228d7bc694ba9640cd322dde1bd5e7c9cf96a16afb4ca1"
|
||||
dependencies = [
|
||||
"anyhow",
|
||||
"bincode",
|
||||
"bincode 2.0.1",
|
||||
"chrono",
|
||||
"crossbeam",
|
||||
"dashmap",
|
||||
"hnsw_rs",
|
||||
"memmap2",
|
||||
"ndarray 0.16.1",
|
||||
"once_cell",
|
||||
"parking_lot",
|
||||
"rand 0.8.5",
|
||||
"rand_distr 0.4.3",
|
||||
"rayon",
|
||||
"redb",
|
||||
"reqwest 0.11.27",
|
||||
"rkyv",
|
||||
"serde",
|
||||
"serde_json",
|
||||
"simsimd",
|
||||
"thiserror 2.0.18",
|
||||
"tracing",
|
||||
"uuid",
|
||||
|
|
@ -5556,6 +5894,16 @@ version = "1.2.0"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "94143f37725109f92c262ed2cf5e59bce7498c01bcc1502d7b9afe439a4e9f49"
|
||||
|
||||
[[package]]
|
||||
name = "sct"
|
||||
version = "0.7.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "da046153aa2352493d6cb7da4b6e5c0c057d8a1d0a9aa8560baffdd945acd414"
|
||||
dependencies = [
|
||||
"ring",
|
||||
"untrusted",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "security-framework"
|
||||
version = "3.7.0"
|
||||
|
|
@ -5563,7 +5911,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "b7f4bc775c73d9a02cde8bf7b2ec4c9d12743edf609006c7facc23998404cd1d"
|
||||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"core-foundation",
|
||||
"core-foundation 0.10.1",
|
||||
"core-foundation-sys",
|
||||
"libc",
|
||||
"security-framework-sys",
|
||||
|
|
@ -5803,7 +6151,7 @@ checksum = "2acaf3f973e8616d7ceac415f53fc60e190b2a686fbcf8d27d0256c741c5007b"
|
|||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"cfg-if",
|
||||
"core-foundation",
|
||||
"core-foundation 0.10.1",
|
||||
"core-foundation-sys",
|
||||
"io-kit-sys",
|
||||
"libudev",
|
||||
|
|
@ -5928,6 +6276,15 @@ version = "0.1.5"
|
|||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e3a9fe34e3e7a50316060351f37187a3f546bce95496156754b601a5fa71b76e"
|
||||
|
||||
[[package]]
|
||||
name = "simsimd"
|
||||
version = "5.9.11"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9638f2829f4887c62a01958903b58fa1b740a64d5dc2bbc4a75a33827ee1bd53"
|
||||
dependencies = [
|
||||
"cc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "siphasher"
|
||||
version = "0.3.11"
|
||||
|
|
@ -6134,6 +6491,12 @@ dependencies = [
|
|||
"unicode-ident",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "sync_wrapper"
|
||||
version = "0.1.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "2047c6ded9c721764247e62cd3b03c09ffc529b2ba5b10ec482ae507a4a70160"
|
||||
|
||||
[[package]]
|
||||
name = "sync_wrapper"
|
||||
version = "1.0.2"
|
||||
|
|
@ -6168,6 +6531,20 @@ dependencies = [
|
|||
"walkdir",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "sysctl"
|
||||
version = "0.6.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "01198a2debb237c62b6826ec7081082d951f46dbb64b0e8c7649a452230d1dfc"
|
||||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"byteorder",
|
||||
"enum-as-inner",
|
||||
"libc",
|
||||
"thiserror 1.0.69",
|
||||
"walkdir",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "sysinfo"
|
||||
version = "0.32.1"
|
||||
|
|
@ -6182,6 +6559,27 @@ dependencies = [
|
|||
"windows 0.57.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "system-configuration"
|
||||
version = "0.5.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ba3a3adc5c275d719af8cb4272ea1c4a6d668a777f37e115f6d11ddbc1c8e0e7"
|
||||
dependencies = [
|
||||
"bitflags 1.3.2",
|
||||
"core-foundation 0.9.4",
|
||||
"system-configuration-sys",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "system-configuration-sys"
|
||||
version = "0.5.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "a75fb188eb626b924683e3b95e3a48e63551fcfb51949de2f06a9d91dbee93c9"
|
||||
dependencies = [
|
||||
"core-foundation-sys",
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "system-deps"
|
||||
version = "6.2.2"
|
||||
|
|
@ -6229,7 +6627,7 @@ checksum = "6e06d52c379e63da659a483a958110bbde891695a0ecb53e48cc7786d5eda7bb"
|
|||
dependencies = [
|
||||
"bitflags 2.11.0",
|
||||
"block2",
|
||||
"core-foundation",
|
||||
"core-foundation 0.10.1",
|
||||
"core-graphics",
|
||||
"crossbeam-channel",
|
||||
"dispatch2",
|
||||
|
|
@ -6303,7 +6701,7 @@ dependencies = [
|
|||
"glob",
|
||||
"gtk",
|
||||
"heck 0.5.0",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"jni",
|
||||
"libc",
|
||||
"log",
|
||||
|
|
@ -6488,7 +6886,7 @@ dependencies = [
|
|||
"cookie",
|
||||
"dpi",
|
||||
"gtk",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"jni",
|
||||
"objc2",
|
||||
"objc2-ui-kit",
|
||||
|
|
@ -6511,7 +6909,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "e11ea2e6f801d275fdd890d6c9603736012742a1c33b96d0db788c9cdebf7f9e"
|
||||
dependencies = [
|
||||
"gtk",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"jni",
|
||||
"log",
|
||||
"objc2",
|
||||
|
|
@ -6543,7 +6941,7 @@ dependencies = [
|
|||
"dunce",
|
||||
"glob",
|
||||
"html5ever",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"infer",
|
||||
"json-patch",
|
||||
"kuchikiki",
|
||||
|
|
@ -6779,6 +7177,16 @@ dependencies = [
|
|||
"tokio",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tokio-rustls"
|
||||
version = "0.24.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c28327cf380ac148141087fbfb9de9d7bd4e84ab5d2c28fbc911d753de8a7081"
|
||||
dependencies = [
|
||||
"rustls 0.21.12",
|
||||
"tokio",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tokio-serial"
|
||||
version = "5.4.5"
|
||||
|
|
@ -6966,7 +7374,7 @@ dependencies = [
|
|||
"futures-core",
|
||||
"futures-util",
|
||||
"pin-project-lite",
|
||||
"sync_wrapper",
|
||||
"sync_wrapper 1.0.2",
|
||||
"tokio",
|
||||
"tower-layer",
|
||||
"tower-service",
|
||||
|
|
@ -6982,8 +7390,8 @@ dependencies = [
|
|||
"bitflags 2.11.0",
|
||||
"bytes",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"http-body-util",
|
||||
"http-range-header",
|
||||
"httpdate",
|
||||
|
|
@ -7007,8 +7415,8 @@ dependencies = [
|
|||
"bitflags 2.11.0",
|
||||
"bytes",
|
||||
"futures-util",
|
||||
"http",
|
||||
"http-body",
|
||||
"http 1.4.0",
|
||||
"http-body 1.0.1",
|
||||
"iri-string",
|
||||
"pin-project-lite",
|
||||
"tower",
|
||||
|
|
@ -7150,7 +7558,7 @@ dependencies = [
|
|||
"byteorder",
|
||||
"bytes",
|
||||
"data-encoding",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"httparse",
|
||||
"log",
|
||||
"rand 0.8.5",
|
||||
|
|
@ -7306,7 +7714,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
|
|||
checksum = "d81f9efa9df032be5934a46a068815a10a042b494b6a58cb0a1a97bb5467ed6f"
|
||||
dependencies = [
|
||||
"base64 0.22.1",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"httparse",
|
||||
"log",
|
||||
]
|
||||
|
|
@ -7724,6 +8132,12 @@ dependencies = [
|
|||
"rustls-pki-types",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "webpki-roots"
|
||||
version = "0.25.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "5f20c57d8d7db6d3b86154206ae5d8fba62dd39573114de97c2cb0578251f8e1"
|
||||
|
||||
[[package]]
|
||||
name = "webview2-com"
|
||||
version = "0.38.2"
|
||||
|
|
@ -7770,6 +8184,12 @@ dependencies = [
|
|||
"safe_arch",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "widestring"
|
||||
version = "1.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "72069c3113ab32ab29e5584db3c6ec55d416895e60715417b5b883a357c3e471"
|
||||
|
||||
[[package]]
|
||||
name = "wifi-densepose-api"
|
||||
version = "0.3.0"
|
||||
|
|
@ -7961,6 +8381,7 @@ dependencies = [
|
|||
"criterion",
|
||||
"ruvector-attention 2.0.4",
|
||||
"ruvector-attn-mincut",
|
||||
"ruvector-core",
|
||||
"ruvector-crv",
|
||||
"ruvector-gnn",
|
||||
"ruvector-mincut",
|
||||
|
|
@ -7968,6 +8389,7 @@ dependencies = [
|
|||
"ruvector-temporal-tensor",
|
||||
"serde",
|
||||
"serde_json",
|
||||
"sha2",
|
||||
"thiserror 1.0.69",
|
||||
]
|
||||
|
||||
|
|
@ -8013,6 +8435,7 @@ dependencies = [
|
|||
"serde_json",
|
||||
"thiserror 1.0.69",
|
||||
"wifi-densepose-core",
|
||||
"wifi-densepose-ruvector",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
|
@ -8139,6 +8562,15 @@ dependencies = [
|
|||
"windows-version",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "windows"
|
||||
version = "0.48.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e686886bc078bc1b0b600cac0147aadb815089b6e4da64016cbd754b6342700f"
|
||||
dependencies = [
|
||||
"windows-targets 0.48.5",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "windows"
|
||||
version = "0.57.0"
|
||||
|
|
@ -8664,6 +9096,16 @@ dependencies = [
|
|||
"memchr",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "winreg"
|
||||
version = "0.50.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "524e57b2c537c0f9b1e69f1965311ec12182b4122e45035b1508cd24d2adadb1"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"windows-sys 0.48.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "winreg"
|
||||
version = "0.55.0"
|
||||
|
|
@ -8784,7 +9226,7 @@ dependencies = [
|
|||
"gdkx11",
|
||||
"gtk",
|
||||
"html5ever",
|
||||
"http",
|
||||
"http 1.4.0",
|
||||
"javascriptcore-rs",
|
||||
"jni",
|
||||
"kuchikiki",
|
||||
|
|
|
|||
|
|
@ -120,6 +120,7 @@ midstreamer-attractor = "0.1.0"
|
|||
|
||||
# ruvector integration (published on crates.io)
|
||||
# Vendored at v2.1.0 in vendor/ruvector; using crates.io versions until published.
|
||||
ruvector-core = "2.0.4"
|
||||
ruvector-mincut = "2.0.4"
|
||||
ruvector-attn-mincut = "2.0.4"
|
||||
ruvector-temporal-tensor = "2.0.4"
|
||||
|
|
|
|||
|
|
@ -15,6 +15,7 @@ default = []
|
|||
crv = ["dep:ruvector-crv", "dep:ruvector-gnn", "dep:serde", "dep:serde_json"]
|
||||
|
||||
[dependencies]
|
||||
ruvector-core = { workspace = true }
|
||||
ruvector-mincut = { workspace = true }
|
||||
ruvector-attn-mincut = { workspace = true }
|
||||
ruvector-temporal-tensor = { workspace = true }
|
||||
|
|
@ -26,6 +27,10 @@ thiserror = { workspace = true }
|
|||
serde = { workspace = true, optional = true }
|
||||
serde_json = { workspace = true, optional = true }
|
||||
|
||||
# ADR-084 Pass 5 — privacy-preserving event log uses SHA-256 to
|
||||
# anchor each stored sketch as a content-addressable witness hash.
|
||||
sha2 = { workspace = true }
|
||||
|
||||
[dev-dependencies]
|
||||
approx = "0.5"
|
||||
criterion = { workspace = true }
|
||||
|
|
@ -33,3 +38,7 @@ criterion = { workspace = true }
|
|||
[[bench]]
|
||||
name = "crv_bench"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "sketch_bench"
|
||||
harness = false
|
||||
|
|
|
|||
170
v2/crates/wifi-densepose-ruvector/benches/sketch_bench.rs
Normal file
170
v2/crates/wifi-densepose-ruvector/benches/sketch_bench.rs
Normal file
|
|
@ -0,0 +1,170 @@
|
|||
//! ADR-084 acceptance criterion benchmark: sketch-vs-float compare cost.
|
||||
//!
|
||||
//! Acceptance threshold from `docs/adr/ADR-084-rabitq-similarity-sensor.md`:
|
||||
//! > Sketch compare cost reduction: **8×–30×** vs full-float compare.
|
||||
//!
|
||||
//! This bench measures the per-pair compare cost at the embedding sizes
|
||||
//! actually used in RuView:
|
||||
//!
|
||||
//! - 128-d (AETHER re-ID embeddings, ADR-024)
|
||||
//! - 256-d (CSI spectrogram embeddings, ADR-076)
|
||||
//! - 512-d (forward-looking, in case of post-rotation projection)
|
||||
//!
|
||||
//! For each dimension, three benches compare:
|
||||
//!
|
||||
//! 1. **`float_l2`** — squared-euclidean over `&[f32]` (the baseline; what
|
||||
//! AETHER actually computes today via the centroid path in
|
||||
//! `tracker_bridge.rs`).
|
||||
//! 2. **`float_cosine`** — cosine distance over `&[f32]` (alternative
|
||||
//! baseline; what some pipeline sites prefer).
|
||||
//! 3. **`sketch_hamming`** — hamming distance over the 1-bit sketch.
|
||||
//!
|
||||
//! Run with:
|
||||
//! ```bash
|
||||
//! cargo bench -p wifi-densepose-ruvector --bench sketch_bench
|
||||
//! ```
|
||||
//!
|
||||
//! Pass criterion: `sketch_hamming` is at least **8×** faster than the
|
||||
//! cheaper of `float_l2` / `float_cosine` at every measured dimension.
|
||||
|
||||
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
|
||||
use std::hint;
|
||||
use wifi_densepose_ruvector::Sketch;
|
||||
|
||||
const SKETCH_VERSION: u16 = 1;
|
||||
|
||||
/// Squared-euclidean over `&[f32]` — baseline AETHER path.
|
||||
#[inline]
|
||||
fn float_l2_squared(a: &[f32], b: &[f32]) -> f32 {
|
||||
a.iter()
|
||||
.zip(b.iter())
|
||||
.map(|(x, y)| {
|
||||
let d = x - y;
|
||||
d * d
|
||||
})
|
||||
.sum()
|
||||
}
|
||||
|
||||
/// Cosine distance (1.0 - cosine similarity) over `&[f32]`.
|
||||
/// Alternative baseline — used by some pipeline sites that need
|
||||
/// magnitude-invariant similarity.
|
||||
#[inline]
|
||||
fn float_cosine(a: &[f32], b: &[f32]) -> f32 {
|
||||
let mut dot = 0.0f32;
|
||||
let mut na = 0.0f32;
|
||||
let mut nb = 0.0f32;
|
||||
for (&x, &y) in a.iter().zip(b.iter()) {
|
||||
dot += x * y;
|
||||
na += x * x;
|
||||
nb += y * y;
|
||||
}
|
||||
let denom = (na * nb).sqrt();
|
||||
if denom < f32::EPSILON {
|
||||
1.0
|
||||
} else {
|
||||
1.0 - dot / denom
|
||||
}
|
||||
}
|
||||
|
||||
/// Generate a deterministic pseudo-random embedding of the given dimension.
|
||||
/// Uses a simple LCG so benches are repeatable across runs and machines
|
||||
/// without pulling in a `rand` dev-dep just for fixture generation.
|
||||
fn make_embedding(dim: usize, seed: u32) -> Vec<f32> {
|
||||
let mut state = seed.wrapping_mul(2654435761).wrapping_add(1);
|
||||
(0..dim)
|
||||
.map(|_| {
|
||||
// Iterate LCG (Numerical Recipes constants — for fixture only,
|
||||
// not for cryptographic use).
|
||||
state = state.wrapping_mul(1664525).wrapping_add(1013904223);
|
||||
// Map to [-1.0, 1.0] approximately.
|
||||
let u = (state >> 8) as f32 / (1u32 << 24) as f32;
|
||||
u * 2.0 - 1.0
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn bench_compare_cost(c: &mut Criterion) {
|
||||
for &dim in &[128usize, 256, 512] {
|
||||
let a_vec = make_embedding(dim, 0xAAAA_AAAA);
|
||||
let b_vec = make_embedding(dim, 0xBBBB_BBBB);
|
||||
let a_sketch = Sketch::from_embedding(&a_vec, SKETCH_VERSION);
|
||||
let b_sketch = Sketch::from_embedding(&b_vec, SKETCH_VERSION);
|
||||
|
||||
let mut group = c.benchmark_group(format!("compare_d{dim}"));
|
||||
group.throughput(Throughput::Elements(1));
|
||||
|
||||
group.bench_with_input(BenchmarkId::new("float_l2", dim), &dim, |bencher, _| {
|
||||
bencher.iter(|| {
|
||||
let d = float_l2_squared(black_box(&a_vec), black_box(&b_vec));
|
||||
hint::black_box(d)
|
||||
});
|
||||
});
|
||||
|
||||
group.bench_with_input(BenchmarkId::new("float_cosine", dim), &dim, |bencher, _| {
|
||||
bencher.iter(|| {
|
||||
let d = float_cosine(black_box(&a_vec), black_box(&b_vec));
|
||||
hint::black_box(d)
|
||||
});
|
||||
});
|
||||
|
||||
group.bench_with_input(BenchmarkId::new("sketch_hamming", dim), &dim, |bencher, _| {
|
||||
bencher.iter(|| {
|
||||
let d = black_box(&a_sketch).distance_unchecked(black_box(&b_sketch));
|
||||
hint::black_box(d)
|
||||
});
|
||||
});
|
||||
|
||||
group.finish();
|
||||
}
|
||||
}
|
||||
|
||||
/// Top-K @ K=8 over a 1024-sketch bank — the realistic AETHER use case
|
||||
/// (a few thousand re-ID candidates, K small).
|
||||
fn bench_topk(c: &mut Criterion) {
|
||||
use wifi_densepose_ruvector::SketchBank;
|
||||
|
||||
let dim = 128usize;
|
||||
let bank_size = 1024usize;
|
||||
let k = 8usize;
|
||||
|
||||
let mut bank = SketchBank::new();
|
||||
for i in 0..bank_size {
|
||||
let v = make_embedding(dim, i as u32);
|
||||
bank.insert(i as u32, Sketch::from_embedding(&v, SKETCH_VERSION))
|
||||
.expect("schema-locked insert");
|
||||
}
|
||||
|
||||
let query_vec = make_embedding(dim, 0xCAFE_BABE);
|
||||
let query_sketch = Sketch::from_embedding(&query_vec, SKETCH_VERSION);
|
||||
|
||||
// Build a parallel float bank for the baseline.
|
||||
let float_bank: Vec<Vec<f32>> = (0..bank_size).map(|i| make_embedding(dim, i as u32)).collect();
|
||||
|
||||
let mut group = c.benchmark_group(format!("topk_d{dim}_n{bank_size}_k{k}"));
|
||||
group.throughput(Throughput::Elements(bank_size as u64));
|
||||
|
||||
group.bench_function("float_l2_topk", |bencher| {
|
||||
bencher.iter(|| {
|
||||
let mut scored: Vec<(u32, f32)> = float_bank
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(i, v)| (i as u32, float_l2_squared(black_box(&query_vec), v)))
|
||||
.collect();
|
||||
scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
|
||||
scored.truncate(k);
|
||||
hint::black_box(scored)
|
||||
});
|
||||
});
|
||||
|
||||
group.bench_function("sketch_hamming_topk", |bencher| {
|
||||
bencher.iter(|| {
|
||||
let result = black_box(&bank).topk(black_box(&query_sketch), k).expect("schema match");
|
||||
hint::black_box(result)
|
||||
});
|
||||
});
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_compare_cost, bench_topk);
|
||||
criterion_main!(benches);
|
||||
266
v2/crates/wifi-densepose-ruvector/src/event_log.rs
Normal file
266
v2/crates/wifi-densepose-ruvector/src/event_log.rs
Normal file
|
|
@ -0,0 +1,266 @@
|
|||
//! ADR-084 Pass 5 — privacy-preserving event log.
|
||||
//!
|
||||
//! Stores `(timestamp, sketch, novelty, witness_sha256)` tuples instead
|
||||
//! of raw float embeddings. Two privacy properties matter:
|
||||
//!
|
||||
//! 1. **Non-invertibility.** The 1-bit sketch is lossy — there is no
|
||||
//! general mathematical inverse from a stored event back to a
|
||||
//! `[f32]` source embedding. Even an attacker with side-channel
|
||||
//! information about the embedding model's output distribution
|
||||
//! cannot reconstruct the underlying CSI.
|
||||
//!
|
||||
//! 2. **Content addressing.** Each event carries a SHA-256 of the
|
||||
//! serialized [`crate::WireSketch`] payload (header + packed bits).
|
||||
//! Two events with the same `witness` are byte-equal — the cluster-Pi
|
||||
//! can deduplicate, the gateway can checkpoint without re-storing,
|
||||
//! and downstream verifiers can prove "this event came from that
|
||||
//! sketch" without ever holding the original embedding.
|
||||
//!
|
||||
//! See ADR-084 §"Privacy-preserving event log" and the post-merge
|
||||
//! security review on PR #435 (finding L7) for context.
|
||||
//!
|
||||
//! # Bounded by design
|
||||
//!
|
||||
//! [`PrivacyEventLog`] is a fixed-capacity ring buffer; once full,
|
||||
//! oldest events are FIFO-evicted. A misbehaving sender cannot exhaust
|
||||
//! receiver memory by flooding the bank — peak footprint is
|
||||
//! `capacity × (sketch_bytes + 50)` bytes.
|
||||
|
||||
use sha2::{Digest, Sha256};
|
||||
use std::collections::VecDeque;
|
||||
|
||||
use crate::sketch::{Sketch, WireSketch};
|
||||
|
||||
/// One entry in the privacy-preserving event log.
|
||||
///
|
||||
/// All fields are public so callers can serialize / inspect / forward
|
||||
/// events through their own pipelines without going through getters.
|
||||
/// The struct is intentionally self-contained — no references to
|
||||
/// external state, so an event can be moved across thread / process /
|
||||
/// host boundaries without dangling.
|
||||
#[derive(Debug, Clone, PartialEq)]
|
||||
pub struct NoveltyEvent {
|
||||
/// Microseconds since UNIX epoch when the underlying frame was
|
||||
/// observed. Caller-supplied; the event log doesn't fetch the
|
||||
/// clock so test fixtures are deterministic.
|
||||
pub timestamp_us: u64,
|
||||
/// 1-bit packed sketch bytes (`(embedding_dim + 7) / 8` bytes long).
|
||||
pub sketch_bytes: Vec<u8>,
|
||||
/// Embedding-model schema version so `(version, witness)` is a
|
||||
/// fully qualified content address.
|
||||
pub sketch_version: u16,
|
||||
/// Source-embedding dimension, fixing the bit count of `sketch_bytes`.
|
||||
pub embedding_dim: u16,
|
||||
/// Novelty score in `[0.0, 1.0]` at the time the event was logged.
|
||||
/// Saturated and stored as f32 for direct downstream use; the q15
|
||||
/// quantization happens on the wire format
|
||||
/// ([`crate::WireSketch`]) — the in-memory log keeps full f32
|
||||
/// precision.
|
||||
pub novelty: f32,
|
||||
/// SHA-256 of the serialized [`crate::WireSketch`] payload
|
||||
/// (header + packed bits + the q15 novelty quantum). Two events
|
||||
/// with the same witness are byte-identical on the wire.
|
||||
pub witness_sha256: [u8; 32],
|
||||
}
|
||||
|
||||
/// Fixed-capacity, FIFO-evicting log of [`NoveltyEvent`]s.
|
||||
///
|
||||
/// Used as the cluster-Pi's per-node anomaly trail. The log is **not**
|
||||
/// the source of truth for novelty (that's [`crate::SketchBank`] and
|
||||
/// `EmbeddingHistory::novelty`); it's the *audit* of what happened.
|
||||
///
|
||||
/// # Memory bound
|
||||
///
|
||||
/// `capacity * (sketch_bytes_per_event + ~50 fixed bytes)` is the worst
|
||||
/// case. For 64 events × 16-byte sketches that's ~4 KiB — fits in any
|
||||
/// per-node state struct without concern.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct PrivacyEventLog {
|
||||
capacity: usize,
|
||||
events: VecDeque<NoveltyEvent>,
|
||||
}
|
||||
|
||||
impl PrivacyEventLog {
|
||||
/// Create a new log with the given fixed capacity.
|
||||
///
|
||||
/// `capacity == 0` is allowed; the log accepts pushes but
|
||||
/// immediately discards them, which is occasionally useful as a
|
||||
/// no-op stub in test fixtures or when the privacy log is meant
|
||||
/// to be disabled at deployment time.
|
||||
pub fn new(capacity: usize) -> Self {
|
||||
Self {
|
||||
capacity,
|
||||
events: VecDeque::with_capacity(capacity.min(1024)),
|
||||
}
|
||||
}
|
||||
|
||||
/// Append an event built from a `Sketch` + novelty score.
|
||||
///
|
||||
/// The event's `witness_sha256` is computed over the [`WireSketch`]
|
||||
/// serialization of `(sketch, novelty)` — so two pushes of the same
|
||||
/// `(sketch, novelty)` produce byte-identical witnesses, enabling
|
||||
/// dedup at the receiver.
|
||||
///
|
||||
/// FIFO-evicts the oldest event if the log is at capacity. Returns
|
||||
/// the number of events present after the push (0 when capacity is
|
||||
/// 0, otherwise `<= capacity`).
|
||||
pub fn push(&mut self, sketch: &Sketch, novelty: f32, timestamp_us: u64) -> usize {
|
||||
if self.capacity == 0 {
|
||||
return 0;
|
||||
}
|
||||
let wire = WireSketch::serialize(sketch, novelty);
|
||||
let mut hasher = Sha256::new();
|
||||
hasher.update(&wire);
|
||||
let witness: [u8; 32] = hasher.finalize().into();
|
||||
|
||||
if self.events.len() >= self.capacity {
|
||||
self.events.pop_front();
|
||||
}
|
||||
self.events.push_back(NoveltyEvent {
|
||||
timestamp_us,
|
||||
sketch_bytes: sketch.packed_bytes().to_vec(),
|
||||
sketch_version: sketch.sketch_version(),
|
||||
embedding_dim: sketch.embedding_dim(),
|
||||
novelty,
|
||||
witness_sha256: witness,
|
||||
});
|
||||
self.events.len()
|
||||
}
|
||||
|
||||
/// Number of events currently stored.
|
||||
#[inline]
|
||||
pub fn len(&self) -> usize {
|
||||
self.events.len()
|
||||
}
|
||||
|
||||
/// True iff the log has no events.
|
||||
#[inline]
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.events.is_empty()
|
||||
}
|
||||
|
||||
/// Bank capacity (the max number of events ever held simultaneously).
|
||||
#[inline]
|
||||
pub fn capacity(&self) -> usize {
|
||||
self.capacity
|
||||
}
|
||||
|
||||
/// Iterate over events oldest-first.
|
||||
pub fn iter(&self) -> impl Iterator<Item = &NoveltyEvent> {
|
||||
self.events.iter()
|
||||
}
|
||||
|
||||
/// Find the most recent event whose `witness_sha256` matches.
|
||||
/// Returns `None` if no event matches.
|
||||
///
|
||||
/// Used by content-addressable lookups — a downstream receiver
|
||||
/// can ask "have you logged this exact `(sketch, novelty)` before?"
|
||||
/// without re-transmitting the sketch.
|
||||
pub fn find_by_witness(&self, witness: &[u8; 32]) -> Option<&NoveltyEvent> {
|
||||
self.events
|
||||
.iter()
|
||||
.rev()
|
||||
.find(|e| &e.witness_sha256 == witness)
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::sketch::Sketch;
|
||||
|
||||
fn make_sketch(seed: u32) -> Sketch {
|
||||
let v: Vec<f32> = (0..32)
|
||||
.map(|i| ((i as u32).wrapping_mul(seed) as f32).sin())
|
||||
.collect();
|
||||
Sketch::from_embedding(&v, 1)
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn push_grows_until_capacity_then_fifo_evicts() {
|
||||
let mut log = PrivacyEventLog::new(3);
|
||||
for i in 0..5u64 {
|
||||
log.push(&make_sketch(i as u32 + 1), 0.5, i * 1000);
|
||||
}
|
||||
assert_eq!(log.len(), 3, "must cap at capacity");
|
||||
// Oldest two evicted; first remaining timestamp is 2_000.
|
||||
let first = log.iter().next().unwrap();
|
||||
assert_eq!(first.timestamp_us, 2000);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn zero_capacity_log_silently_drops_pushes() {
|
||||
let mut log = PrivacyEventLog::new(0);
|
||||
let n = log.push(&make_sketch(1), 0.5, 0);
|
||||
assert_eq!(n, 0);
|
||||
assert_eq!(log.len(), 0);
|
||||
assert!(log.is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn witness_is_deterministic_for_same_sketch_and_novelty() {
|
||||
let mut log_a = PrivacyEventLog::new(2);
|
||||
let mut log_b = PrivacyEventLog::new(2);
|
||||
let s = make_sketch(7);
|
||||
// Same sketch + same novelty + (intentionally different)
|
||||
// timestamps — witness must NOT depend on timestamp; the
|
||||
// wire format does not include it.
|
||||
log_a.push(&s, 0.25, 100);
|
||||
log_b.push(&s, 0.25, 999_999);
|
||||
let wa = log_a.iter().next().unwrap().witness_sha256;
|
||||
let wb = log_b.iter().next().unwrap().witness_sha256;
|
||||
assert_eq!(wa, wb, "witness must be content-addressable, not time-addressable");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn witness_differs_for_different_novelty_scores() {
|
||||
let mut log = PrivacyEventLog::new(2);
|
||||
let s = make_sketch(11);
|
||||
log.push(&s, 0.10, 0);
|
||||
log.push(&s, 0.90, 0);
|
||||
let mut iter = log.iter();
|
||||
let w0 = iter.next().unwrap().witness_sha256;
|
||||
let w1 = iter.next().unwrap().witness_sha256;
|
||||
assert_ne!(w0, w1, "different novelty → different witness");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn find_by_witness_returns_most_recent_match() {
|
||||
let mut log = PrivacyEventLog::new(5);
|
||||
let s = make_sketch(42);
|
||||
log.push(&s, 0.5, 100);
|
||||
log.push(&make_sketch(99), 0.3, 200);
|
||||
log.push(&s, 0.5, 300); // duplicate by witness, newer timestamp
|
||||
|
||||
let target_witness = log.iter().nth(2).unwrap().witness_sha256;
|
||||
let hit = log.find_by_witness(&target_witness).unwrap();
|
||||
assert_eq!(hit.timestamp_us, 300, "find_by_witness returns most recent");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn find_by_witness_returns_none_on_miss() {
|
||||
let mut log = PrivacyEventLog::new(2);
|
||||
log.push(&make_sketch(1), 0.5, 0);
|
||||
let bogus = [0xAA_u8; 32];
|
||||
assert!(log.find_by_witness(&bogus).is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn event_does_not_carry_raw_embedding() {
|
||||
// The whole point of the event log: an attacker with read
|
||||
// access to the log cannot recover the source CSI / embedding.
|
||||
// Verify structurally that no `Vec<f32>` field exists on
|
||||
// NoveltyEvent — only the bit-packed sketch.
|
||||
let mut log = PrivacyEventLog::new(1);
|
||||
let s = make_sketch(5);
|
||||
log.push(&s, 0.5, 0);
|
||||
let event = log.iter().next().unwrap();
|
||||
// The packed sketch is bytes (1-bit-per-source-dim, ceil-divided).
|
||||
// Length proves the source dim (32 bits = 4 bytes).
|
||||
assert_eq!(event.sketch_bytes.len(), 4);
|
||||
assert_eq!(event.embedding_dim, 32);
|
||||
// No way to reconstruct the original `[f32; 32]` from these 4 bytes
|
||||
// alone; that's the privacy guarantee. (Compile-time witnessed:
|
||||
// there's no Vec<f32> field on NoveltyEvent.)
|
||||
}
|
||||
}
|
||||
|
|
@ -28,6 +28,14 @@
|
|||
|
||||
#[cfg(feature = "crv")]
|
||||
pub mod crv;
|
||||
pub mod event_log;
|
||||
pub mod mat;
|
||||
pub mod signal;
|
||||
pub mod sketch;
|
||||
pub mod viewpoint;
|
||||
|
||||
pub use event_log::{NoveltyEvent, PrivacyEventLog};
|
||||
pub use sketch::{
|
||||
Sketch, SketchBank, SketchError, WireSketch, WireSketchError,
|
||||
WIRE_SKETCH_FORMAT_VERSION, WIRE_SKETCH_MAGIC, WIRE_SKETCH_MAX_BYTES,
|
||||
};
|
||||
|
|
|
|||
844
v2/crates/wifi-densepose-ruvector/src/sketch.rs
Normal file
844
v2/crates/wifi-densepose-ruvector/src/sketch.rs
Normal file
|
|
@ -0,0 +1,844 @@
|
|||
//! RaBitQ-style binary sketch — cheap similarity sensor for CSI/pose embeddings.
|
||||
//!
|
||||
//! Implements **Pass 1** of [ADR-084](../../../../../docs/adr/ADR-084-rabitq-similarity-sensor.md):
|
||||
//! a thin RuView-flavored API over `ruvector_core::quantization::BinaryQuantized`.
|
||||
//!
|
||||
//! # Why a sketch
|
||||
//!
|
||||
//! Every "have I seen something like this before?" comparison in the RuView
|
||||
//! pipeline (AETHER re-ID, room fingerprinting, mincut prefilter, novelty
|
||||
//! detection, mesh-exchange compression, privacy event log) shares the same
|
||||
//! shape: dense float embedding → similarity score → top-K candidates.
|
||||
//! The full-precision compare is expensive — `O(d)` float operations per pair,
|
||||
//! cache-unfriendly because every dimension is a 4-byte load.
|
||||
//!
|
||||
//! A 1-bit sketch (one bit per embedding dimension, packed into bytes) collapses
|
||||
//! the compare to a hardware-accelerated POPCNT/NEON-vcnt over ~32× less
|
||||
//! memory. The published *RaBitQ* algorithm (Gao & Long, SIGMOD 2024) wraps
|
||||
//! this with a randomized rotation for theoretical error bounds; we ship the
|
||||
//! pure sign-quantization variant first and add the rotation later if
|
||||
//! benchmark-measured top-K coverage drops below the ADR-084 acceptance
|
||||
//! threshold of 90%.
|
||||
//!
|
||||
//! # Acceptance criteria (ADR-084 §"Acceptance test")
|
||||
//!
|
||||
//! - Sketch compare cost reduction: **8×–30×** vs full-float compare.
|
||||
//! - Top-K coverage: **≥ 90%** agreement with full-float top-K.
|
||||
//! - End-to-end accuracy regression: **< 1 percentage point**.
|
||||
//!
|
||||
//! Pass 1 establishes the API and the unit-test foundation. Pass 2+ wires it
|
||||
//! into specific pipeline sites and measures the criteria there.
|
||||
//!
|
||||
//! # Use sites (ADR-084)
|
||||
//!
|
||||
//! 1. AETHER re-ID hot-cache filter (`signal::ruvsense::pose_tracker`)
|
||||
//! 2. Cluster-Pi novelty sensor (`sensing-server` `SketchBank`)
|
||||
//! 3. Mesh-exchange compression (ADR-066 swarm bridge)
|
||||
//! 4. Privacy-preserving event log (cluster Pi)
|
||||
//! 5. Mincut prefilter (`ruvector::signal::subcarrier`)
|
||||
//!
|
||||
//! All sites take a `&Sketch` instead of an `&[f32]`; the bridge to dense
|
||||
//! embeddings is `Sketch::from_embedding`.
|
||||
|
||||
use ruvector_core::quantization::{BinaryQuantized, QuantizedVector};
|
||||
use std::cmp::Reverse;
|
||||
use std::collections::BinaryHeap;
|
||||
|
||||
/// Errors raised by the sketch API.
|
||||
#[derive(Debug, thiserror::Error)]
|
||||
pub enum SketchError {
|
||||
/// The sketch's `sketch_version` does not match the `SketchBank`'s.
|
||||
/// This guards against silently comparing sketches produced by different
|
||||
/// embedding-model generations.
|
||||
#[error("sketch_version mismatch: bank={bank}, query={query}")]
|
||||
SketchVersionMismatch {
|
||||
/// Version stored in the bank.
|
||||
bank: u16,
|
||||
/// Version on the incoming sketch.
|
||||
query: u16,
|
||||
},
|
||||
|
||||
/// The sketch's embedding dimension does not match the bank's.
|
||||
/// Two sketches of different dimensions cannot be compared.
|
||||
#[error("embedding_dim mismatch: bank={bank}, query={query}")]
|
||||
EmbeddingDimMismatch {
|
||||
/// Dimension stored in the bank.
|
||||
bank: u16,
|
||||
/// Dimension on the incoming sketch.
|
||||
query: u16,
|
||||
},
|
||||
|
||||
/// Embedding dimension exceeds `u16::MAX` (65,535).
|
||||
///
|
||||
/// Returned by [`Sketch::try_from_embedding`] to surface what
|
||||
/// `from_embedding`'s `debug_assert!` would have hidden in release
|
||||
/// builds — silently truncating the dimension count would otherwise
|
||||
/// let two different-length embeddings compare as if they were the
|
||||
/// same length. See ADR-084 §"Versioning" and the security-review
|
||||
/// finding L2 on PR #435 for context.
|
||||
#[error("embedding dimension {got} exceeds u16::MAX ({max})")]
|
||||
EmbeddingDimOverflow {
|
||||
/// Actual length of the input embedding.
|
||||
got: usize,
|
||||
/// Maximum supported dimension (`u16::MAX`).
|
||||
max: usize,
|
||||
},
|
||||
}
|
||||
|
||||
/// A 1-bit binary sketch of a dense embedding vector.
|
||||
///
|
||||
/// 32× smaller than the source `[f32]` and compared via SIMD-accelerated
|
||||
/// hamming distance (NEON `vcnt` on aarch64, POPCNT on x86_64). Use as a
|
||||
/// cheap pre-filter before full-precision comparison.
|
||||
///
|
||||
/// # Versioning
|
||||
///
|
||||
/// `sketch_version` distinguishes sketches produced by different embedding
|
||||
/// generations. Bumping the embedding model invalidates all stored sketches;
|
||||
/// the `SketchBank` rejects mismatched versions at compare time so callers
|
||||
/// never silently compare incompatible sketches.
|
||||
///
|
||||
/// `embedding_dim` is the source vector's length (not the byte-packed size);
|
||||
/// kept as a check that two sketches are actually comparable.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct Sketch {
|
||||
/// 1-bit-per-dimension packed bytes.
|
||||
inner: BinaryQuantized,
|
||||
/// Source-embedding dimension (e.g., 128 for AETHER).
|
||||
embedding_dim: u16,
|
||||
/// Schema version of the producing embedding model.
|
||||
sketch_version: u16,
|
||||
}
|
||||
|
||||
impl Sketch {
|
||||
/// Construct a sketch from a dense f32 embedding.
|
||||
///
|
||||
/// Each dimension contributes one bit: `1` if the value is `> 0.0`,
|
||||
/// `0` otherwise. This is the standard sign-quantization step.
|
||||
///
|
||||
/// `sketch_version` must be supplied by the caller and bumped whenever
|
||||
/// the embedding model that produced the input changes meaningfully
|
||||
/// (e.g., a re-trained AETHER head). Two sketches with different
|
||||
/// `sketch_version`s are not comparable.
|
||||
pub fn from_embedding(embedding: &[f32], sketch_version: u16) -> Self {
|
||||
// L2 hardening (PR #435 security review): in release builds the
|
||||
// previous `debug_assert!` was compiled out, allowing silent
|
||||
// u16-truncation when `embedding.len() > u16::MAX`. Saturate to
|
||||
// u16::MAX rather than truncate so two over-long embeddings
|
||||
// compare as same-dimensional rather than as accidentally-short.
|
||||
// Callers that need a hard error should use `try_from_embedding`.
|
||||
let embedding_dim = embedding.len().min(u16::MAX as usize) as u16;
|
||||
Self {
|
||||
inner: BinaryQuantized::quantize(embedding),
|
||||
embedding_dim,
|
||||
sketch_version,
|
||||
}
|
||||
}
|
||||
|
||||
/// Fallible constructor that rejects embeddings longer than
|
||||
/// `u16::MAX` (65,535) instead of saturating, raising
|
||||
/// [`SketchError::EmbeddingDimOverflow`]. Use this when an
|
||||
/// over-long input should fail loudly rather than silently
|
||||
/// produce a sketch that disagrees with its source on
|
||||
/// `embedding_dim`.
|
||||
pub fn try_from_embedding(
|
||||
embedding: &[f32],
|
||||
sketch_version: u16,
|
||||
) -> Result<Self, SketchError> {
|
||||
if embedding.len() > u16::MAX as usize {
|
||||
return Err(SketchError::EmbeddingDimOverflow {
|
||||
got: embedding.len(),
|
||||
max: u16::MAX as usize,
|
||||
});
|
||||
}
|
||||
Ok(Self::from_embedding(embedding, sketch_version))
|
||||
}
|
||||
|
||||
/// Hamming distance to another sketch in `[0, embedding_dim]`.
|
||||
///
|
||||
/// Returns `None` if the two sketches have different `embedding_dim` or
|
||||
/// `sketch_version` — comparing them would be semantically meaningless.
|
||||
/// Use [`Sketch::distance_unchecked`] when the caller has already
|
||||
/// validated the sketches come from the same producer.
|
||||
pub fn distance(&self, other: &Self) -> Result<u32, SketchError> {
|
||||
if self.embedding_dim != other.embedding_dim {
|
||||
return Err(SketchError::EmbeddingDimMismatch {
|
||||
bank: self.embedding_dim,
|
||||
query: other.embedding_dim,
|
||||
});
|
||||
}
|
||||
if self.sketch_version != other.sketch_version {
|
||||
return Err(SketchError::SketchVersionMismatch {
|
||||
bank: self.sketch_version,
|
||||
query: other.sketch_version,
|
||||
});
|
||||
}
|
||||
Ok(self.inner.distance(&other.inner) as u32)
|
||||
}
|
||||
|
||||
/// Hamming distance without compatibility checks.
|
||||
///
|
||||
/// Faster than [`Sketch::distance`] (no version/dim check) but the
|
||||
/// caller is responsible for guaranteeing both sketches come from the
|
||||
/// same embedding model and dimension. Use only on sketches retrieved
|
||||
/// from the same `SketchBank`.
|
||||
#[inline]
|
||||
pub fn distance_unchecked(&self, other: &Self) -> u32 {
|
||||
self.inner.distance(&other.inner) as u32
|
||||
}
|
||||
|
||||
/// Source-embedding dimension (number of dimensions in the original
|
||||
/// `[f32]`, not the packed byte length).
|
||||
#[inline]
|
||||
pub fn embedding_dim(&self) -> u16 {
|
||||
self.embedding_dim
|
||||
}
|
||||
|
||||
/// Schema version of the producing embedding model.
|
||||
#[inline]
|
||||
pub fn sketch_version(&self) -> u16 {
|
||||
self.sketch_version
|
||||
}
|
||||
|
||||
/// Borrow the inner ruvector-core `BinaryQuantized` for advanced use
|
||||
/// (e.g., serialisation through ruvector's existing infrastructure).
|
||||
/// Most callers should use [`Sketch::distance`] or [`SketchBank`].
|
||||
#[inline]
|
||||
pub fn as_inner(&self) -> &BinaryQuantized {
|
||||
&self.inner
|
||||
}
|
||||
|
||||
/// Borrow the packed sketch bytes (1 bit per source-embedding
|
||||
/// dimension, ceil-divided into bytes). Used by [`WireSketch`] to
|
||||
/// produce a wire-format payload without re-quantizing. Length is
|
||||
/// `(embedding_dim + 7) / 8` bytes.
|
||||
#[inline]
|
||||
pub fn packed_bytes(&self) -> &[u8] {
|
||||
&self.inner.bits
|
||||
}
|
||||
}
|
||||
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
// ADR-084 Pass 4 — wire-format primitive (cluster-channel-agnostic)
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
/// Magic bytes for ADR-084 sketch wire frames. Receivers reject any
|
||||
/// payload that doesn't start with these four bytes — the same shape
|
||||
/// of magic-prefix check ADR-018's CSI binary frame uses (e.g.
|
||||
/// `0xC5110001`). Picked to be distinct from any existing RuView magic.
|
||||
pub const WIRE_SKETCH_MAGIC: u32 = 0xC511_0084;
|
||||
|
||||
/// On-the-wire schema version. Bump on any field reordering or addition.
|
||||
/// `Sketch::sketch_version` (the *embedding model* version) is a
|
||||
/// separate concept and travels in the payload.
|
||||
pub const WIRE_SKETCH_FORMAT_VERSION: u16 = 1;
|
||||
|
||||
/// Maximum wire-payload size the deserializer will accept. Guards
|
||||
/// against a malicious sender claiming `embedding_dim = u16::MAX`
|
||||
/// (would imply 8 KiB of packed bits) and exhausting receiver memory.
|
||||
/// 8 KiB matches the largest reasonable production embedding (post-
|
||||
/// rotation 65,535-d sign-quantized) plus a few bytes of header.
|
||||
pub const WIRE_SKETCH_MAX_BYTES: usize = 9 * 1024;
|
||||
|
||||
/// Errors raised by [`WireSketch::deserialize`].
|
||||
#[derive(Debug, thiserror::Error)]
|
||||
pub enum WireSketchError {
|
||||
/// Payload shorter than the fixed header (12 bytes).
|
||||
#[error("wire payload too short: got {got} bytes, header needs {needed}")]
|
||||
TooShort {
|
||||
/// Bytes received.
|
||||
got: usize,
|
||||
/// Minimum bytes required (12).
|
||||
needed: usize,
|
||||
},
|
||||
/// Payload larger than [`WIRE_SKETCH_MAX_BYTES`].
|
||||
#[error("wire payload exceeds max ({got} > {max})")]
|
||||
TooLarge {
|
||||
/// Bytes received.
|
||||
got: usize,
|
||||
/// Maximum bytes accepted.
|
||||
max: usize,
|
||||
},
|
||||
/// Magic bytes do not match [`WIRE_SKETCH_MAGIC`].
|
||||
#[error("wire magic mismatch: got 0x{got:08X}, expected 0x{expected:08X}")]
|
||||
MagicMismatch {
|
||||
/// Magic value received.
|
||||
got: u32,
|
||||
/// Magic value expected.
|
||||
expected: u32,
|
||||
},
|
||||
/// Format version is newer than the receiver knows how to parse.
|
||||
#[error("wire format_version {got} > supported {max}")]
|
||||
UnsupportedVersion {
|
||||
/// Version received.
|
||||
got: u16,
|
||||
/// Highest version this build understands.
|
||||
max: u16,
|
||||
},
|
||||
/// `embedding_dim` and the byte payload disagree on size.
|
||||
#[error("payload byte count mismatch: header dim={dim} → expected {expected_bytes}, got {got_bytes}")]
|
||||
PayloadSizeMismatch {
|
||||
/// Embedding dimension in the header.
|
||||
dim: u16,
|
||||
/// Bytes the header implies.
|
||||
expected_bytes: usize,
|
||||
/// Bytes actually present.
|
||||
got_bytes: usize,
|
||||
},
|
||||
}
|
||||
|
||||
/// Serialize / deserialize a `Sketch` plus its novelty score for
|
||||
/// transmission over any channel — cluster↔cluster mesh, sensor→Pi UDP,
|
||||
/// gateway→cloud QUIC, etc.
|
||||
///
|
||||
/// # Wire layout (little-endian, packed)
|
||||
///
|
||||
/// | Offset | Field | Width | Notes |
|
||||
/// |--------|--------------------|-------|--------------------------------------------|
|
||||
/// | 0 | `magic` | u32 | [`WIRE_SKETCH_MAGIC`] |
|
||||
/// | 4 | `format_version` | u16 | [`WIRE_SKETCH_FORMAT_VERSION`] |
|
||||
/// | 6 | `sketch_version` | u16 | embedding-model schema version |
|
||||
/// | 8 | `embedding_dim` | u16 | source-embedding dimensions |
|
||||
/// | 10 | `novelty_q15` | u16 | novelty in `[0,1]` × 32_767 (saturated) |
|
||||
/// | 12 | `bits[]` | var | `(embedding_dim + 7) / 8` bytes |
|
||||
///
|
||||
/// Header is exactly **12 bytes**; payload is `ceil(embedding_dim/8)`
|
||||
/// bytes. Total for a 128-d AETHER sketch is 12 + 16 = **28 bytes**.
|
||||
///
|
||||
/// # Why the receiver is paranoid
|
||||
///
|
||||
/// All deserialization paths validate magic, format_version,
|
||||
/// embedding_dim → payload-bytes consistency, and total size before
|
||||
/// touching `BinaryQuantized`. A malformed UDP packet from a
|
||||
/// non-RuView sender will produce a typed `WireSketchError`, never a
|
||||
/// panic. Caps via [`WIRE_SKETCH_MAX_BYTES`] guard against memory-
|
||||
/// exhaustion attacks.
|
||||
pub struct WireSketch;
|
||||
|
||||
impl WireSketch {
|
||||
/// Header size (magic + format_version + sketch_version + dim + novelty).
|
||||
pub const HEADER_BYTES: usize = 12;
|
||||
|
||||
/// Encode a sketch + novelty score for transmission. `novelty` is
|
||||
/// clamped to `[0.0, 1.0]` and quantized to a `u16` (q15 fixed-
|
||||
/// point) so the wire payload is fixed-size. Encoding never
|
||||
/// allocates more than `Self::HEADER_BYTES + sketch.packed_bytes().len()`.
|
||||
pub fn serialize(sketch: &Sketch, novelty: f32) -> Vec<u8> {
|
||||
let bits = sketch.packed_bytes();
|
||||
let total = Self::HEADER_BYTES + bits.len();
|
||||
let mut out = Vec::with_capacity(total);
|
||||
out.extend_from_slice(&WIRE_SKETCH_MAGIC.to_le_bytes());
|
||||
out.extend_from_slice(&WIRE_SKETCH_FORMAT_VERSION.to_le_bytes());
|
||||
out.extend_from_slice(&sketch.sketch_version.to_le_bytes());
|
||||
out.extend_from_slice(&sketch.embedding_dim.to_le_bytes());
|
||||
let nov_q15: u16 = (novelty.clamp(0.0, 1.0) * 32_767.0).round() as u16;
|
||||
out.extend_from_slice(&nov_q15.to_le_bytes());
|
||||
out.extend_from_slice(bits);
|
||||
out
|
||||
}
|
||||
|
||||
/// Decode a sketch + novelty score from an untrusted byte buffer.
|
||||
/// Returns the parsed `(Sketch, novelty)` tuple, or a typed error.
|
||||
pub fn deserialize(buf: &[u8]) -> Result<(Sketch, f32), WireSketchError> {
|
||||
// Length floor: must contain at least the header.
|
||||
if buf.len() < Self::HEADER_BYTES {
|
||||
return Err(WireSketchError::TooShort {
|
||||
got: buf.len(),
|
||||
needed: Self::HEADER_BYTES,
|
||||
});
|
||||
}
|
||||
// Length ceiling: defend against memory-exhaustion attacks via
|
||||
// claimed-but-impossible large dims.
|
||||
if buf.len() > WIRE_SKETCH_MAX_BYTES {
|
||||
return Err(WireSketchError::TooLarge {
|
||||
got: buf.len(),
|
||||
max: WIRE_SKETCH_MAX_BYTES,
|
||||
});
|
||||
}
|
||||
|
||||
let magic = u32::from_le_bytes(buf[0..4].try_into().expect("4-byte slice"));
|
||||
if magic != WIRE_SKETCH_MAGIC {
|
||||
return Err(WireSketchError::MagicMismatch {
|
||||
got: magic,
|
||||
expected: WIRE_SKETCH_MAGIC,
|
||||
});
|
||||
}
|
||||
|
||||
let format_version = u16::from_le_bytes(buf[4..6].try_into().expect("2-byte slice"));
|
||||
if format_version > WIRE_SKETCH_FORMAT_VERSION {
|
||||
return Err(WireSketchError::UnsupportedVersion {
|
||||
got: format_version,
|
||||
max: WIRE_SKETCH_FORMAT_VERSION,
|
||||
});
|
||||
}
|
||||
|
||||
let sketch_version = u16::from_le_bytes(buf[6..8].try_into().expect("2-byte slice"));
|
||||
let embedding_dim = u16::from_le_bytes(buf[8..10].try_into().expect("2-byte slice"));
|
||||
let nov_q15 = u16::from_le_bytes(buf[10..12].try_into().expect("2-byte slice"));
|
||||
|
||||
let expected_bits = ((embedding_dim as usize) + 7) / 8;
|
||||
let got_bits = buf.len() - Self::HEADER_BYTES;
|
||||
if expected_bits != got_bits {
|
||||
return Err(WireSketchError::PayloadSizeMismatch {
|
||||
dim: embedding_dim,
|
||||
expected_bytes: expected_bits,
|
||||
got_bytes: got_bits,
|
||||
});
|
||||
}
|
||||
|
||||
let bits = buf[Self::HEADER_BYTES..].to_vec();
|
||||
let sketch = Sketch {
|
||||
inner: BinaryQuantized {
|
||||
bits,
|
||||
dimensions: embedding_dim as usize,
|
||||
},
|
||||
embedding_dim,
|
||||
sketch_version,
|
||||
};
|
||||
let novelty = (nov_q15 as f32) / 32_767.0;
|
||||
Ok((sketch, novelty))
|
||||
}
|
||||
}
|
||||
|
||||
/// A bank of sketches with stable IDs, queried for top-K nearest neighbours
|
||||
/// by hamming distance.
|
||||
///
|
||||
/// Used at every "have I seen this before" site in the pipeline. The bank
|
||||
/// enforces `sketch_version` and `embedding_dim` consistency at insertion
|
||||
/// time, so `topk` queries never need to re-check.
|
||||
///
|
||||
/// # Invariants
|
||||
///
|
||||
/// - All sketches in a bank share the same `embedding_dim` and `sketch_version`.
|
||||
/// - Bank IDs (`u32`) are caller-assigned and stable across `topk` calls;
|
||||
/// the bank does not renumber on insertion or removal.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct SketchBank {
|
||||
/// (id, sketch) pairs in insertion order.
|
||||
entries: Vec<(u32, Sketch)>,
|
||||
/// Locked at first insertion; all subsequent inserts must match.
|
||||
embedding_dim: Option<u16>,
|
||||
/// Locked at first insertion; all subsequent inserts must match.
|
||||
sketch_version: Option<u16>,
|
||||
}
|
||||
|
||||
impl SketchBank {
|
||||
/// Create an empty bank. Dimension and version are locked at the first
|
||||
/// `insert` call.
|
||||
pub fn new() -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
embedding_dim: None,
|
||||
sketch_version: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Create a bank with a pre-locked `embedding_dim` and `sketch_version`.
|
||||
/// Use when the bank's expected schema is known at construction.
|
||||
pub fn with_schema(embedding_dim: u16, sketch_version: u16) -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
embedding_dim: Some(embedding_dim),
|
||||
sketch_version: Some(sketch_version),
|
||||
}
|
||||
}
|
||||
|
||||
/// Number of sketches in the bank.
|
||||
#[inline]
|
||||
pub fn len(&self) -> usize {
|
||||
self.entries.len()
|
||||
}
|
||||
|
||||
/// True iff the bank has no sketches.
|
||||
#[inline]
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.entries.is_empty()
|
||||
}
|
||||
|
||||
/// Locked embedding dimension, or `None` if the bank is empty and
|
||||
/// no schema was pre-supplied.
|
||||
#[inline]
|
||||
pub fn embedding_dim(&self) -> Option<u16> {
|
||||
self.embedding_dim
|
||||
}
|
||||
|
||||
/// Locked sketch version, or `None` if the bank is empty and
|
||||
/// no schema was pre-supplied.
|
||||
#[inline]
|
||||
pub fn sketch_version(&self) -> Option<u16> {
|
||||
self.sketch_version
|
||||
}
|
||||
|
||||
/// Insert a sketch with caller-assigned ID. Locks the bank's schema on
|
||||
/// first insertion; rejects subsequent inserts that mismatch.
|
||||
pub fn insert(&mut self, id: u32, sketch: Sketch) -> Result<(), SketchError> {
|
||||
match self.embedding_dim {
|
||||
None => self.embedding_dim = Some(sketch.embedding_dim),
|
||||
Some(d) if d != sketch.embedding_dim => {
|
||||
return Err(SketchError::EmbeddingDimMismatch {
|
||||
bank: d,
|
||||
query: sketch.embedding_dim,
|
||||
});
|
||||
}
|
||||
_ => {}
|
||||
}
|
||||
match self.sketch_version {
|
||||
None => self.sketch_version = Some(sketch.sketch_version),
|
||||
Some(v) if v != sketch.sketch_version => {
|
||||
return Err(SketchError::SketchVersionMismatch {
|
||||
bank: v,
|
||||
query: sketch.sketch_version,
|
||||
});
|
||||
}
|
||||
_ => {}
|
||||
}
|
||||
self.entries.push((id, sketch));
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Top-K nearest neighbours by hamming distance, ascending.
|
||||
///
|
||||
/// Returns up to `k` `(id, distance)` pairs sorted by distance. If the
|
||||
/// bank has fewer than `k` entries, returns all of them. If `k == 0`,
|
||||
/// returns empty.
|
||||
///
|
||||
/// Returns `Err` if the query's `embedding_dim` or `sketch_version`
|
||||
/// disagrees with the bank's locked schema. (Cannot return `Err` if the
|
||||
/// bank is empty *and* no schema was pre-supplied — there's nothing to
|
||||
/// disagree with.)
|
||||
pub fn topk(&self, query: &Sketch, k: usize) -> Result<Vec<(u32, u32)>, SketchError> {
|
||||
if k == 0 || self.entries.is_empty() {
|
||||
return Ok(Vec::new());
|
||||
}
|
||||
if let Some(d) = self.embedding_dim {
|
||||
if d != query.embedding_dim {
|
||||
return Err(SketchError::EmbeddingDimMismatch {
|
||||
bank: d,
|
||||
query: query.embedding_dim,
|
||||
});
|
||||
}
|
||||
}
|
||||
if let Some(v) = self.sketch_version {
|
||||
if v != query.sketch_version {
|
||||
return Err(SketchError::SketchVersionMismatch {
|
||||
bank: v,
|
||||
query: query.sketch_version,
|
||||
});
|
||||
}
|
||||
}
|
||||
// Pass-1.5 optimisation: O(n log k) partial sort via a fixed-size
|
||||
// max-heap of `Reverse((distance, id))`. The heap's `peek()`
|
||||
// returns the *largest* of the current best-k. Each candidate is
|
||||
// compared against the heap top in O(1); only better candidates
|
||||
// trigger an O(log k) push/pop. Avoids touching the long tail of
|
||||
// large-distance entries that the truncate would have discarded.
|
||||
//
|
||||
// Fast path: when n ≤ k there is nothing to discard, so a plain
|
||||
// collect + sort is faster than building a heap.
|
||||
let n = self.entries.len();
|
||||
if n <= k {
|
||||
let mut scored: Vec<(u32, u32)> = self
|
||||
.entries
|
||||
.iter()
|
||||
.map(|(id, sk)| (*id, sk.distance_unchecked(query)))
|
||||
.collect();
|
||||
scored.sort_by_key(|&(_, d)| d);
|
||||
return Ok(scored);
|
||||
}
|
||||
|
||||
let mut heap: BinaryHeap<Reverse<(u32, u32)>> = BinaryHeap::with_capacity(k + 1);
|
||||
for (id, sk) in &self.entries {
|
||||
let d = sk.distance_unchecked(query);
|
||||
if heap.len() < k {
|
||||
heap.push(Reverse((d, *id)));
|
||||
} else if let Some(&Reverse((worst, _))) = heap.peek() {
|
||||
// L1 hardening (PR #435 review): structural `if let` rather
|
||||
// than `.expect("heap len == k > 0")`. The branch is
|
||||
// mathematically unreachable when `heap.len() >= k > 0`,
|
||||
// but a defensive pattern makes the impossibility a type
|
||||
// property rather than a runtime invariant. Same hot-path
|
||||
// cost (one bounds check); zero panic risk.
|
||||
if d < worst {
|
||||
heap.pop();
|
||||
heap.push(Reverse((d, *id)));
|
||||
}
|
||||
}
|
||||
}
|
||||
// Drain heap into a Vec — already in (Reverse) descending order;
|
||||
// sort to expose ascending-by-distance per the public contract.
|
||||
let mut scored: Vec<(u32, u32)> = heap
|
||||
.into_iter()
|
||||
.map(|Reverse((d, id))| (id, d))
|
||||
.collect();
|
||||
scored.sort_by_key(|&(_, d)| d);
|
||||
Ok(scored)
|
||||
}
|
||||
|
||||
/// Compute the novelty score of a query against the bank in `[0.0, 1.0]`.
|
||||
///
|
||||
/// Defined as `min_distance / embedding_dim`, so 0.0 means "exact bit
|
||||
/// match exists in the bank" and 1.0 means "every bit differs from the
|
||||
/// nearest stored sketch." Returns 1.0 (max novelty) on an empty bank.
|
||||
/// Returns `Err` on schema mismatch.
|
||||
pub fn novelty(&self, query: &Sketch) -> Result<f32, SketchError> {
|
||||
if self.entries.is_empty() {
|
||||
return Ok(1.0);
|
||||
}
|
||||
let topk = self.topk(query, 1)?;
|
||||
let min_distance = topk.first().map(|&(_, d)| d).unwrap_or(u32::MAX);
|
||||
Ok(min_distance as f32 / query.embedding_dim as f32)
|
||||
}
|
||||
}
|
||||
|
||||
impl Default for SketchBank {
|
||||
fn default() -> Self {
|
||||
Self::new()
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn from_embedding_packs_one_bit_per_dim() {
|
||||
let v = vec![0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5];
|
||||
let s = Sketch::from_embedding(&v, 1);
|
||||
assert_eq!(s.embedding_dim(), 8);
|
||||
assert_eq!(s.sketch_version(), 1);
|
||||
// Distance to self is 0
|
||||
assert_eq!(s.distance_unchecked(&s), 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn distance_is_hamming_count() {
|
||||
let a = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1);
|
||||
let b = Sketch::from_embedding(&[-0.5, -0.5, -0.5, -0.5], 1);
|
||||
// All 4 dims flipped sign → 4 bit differences.
|
||||
assert_eq!(a.distance(&b).unwrap(), 4);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn distance_rejects_mismatched_dims() {
|
||||
let a = Sketch::from_embedding(&[0.5, 0.5], 1);
|
||||
let b = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1);
|
||||
let err = a.distance(&b).unwrap_err();
|
||||
assert!(matches!(err, SketchError::EmbeddingDimMismatch { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn distance_rejects_mismatched_versions() {
|
||||
let a = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1);
|
||||
let b = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 2);
|
||||
let err = a.distance(&b).unwrap_err();
|
||||
assert!(matches!(err, SketchError::SketchVersionMismatch { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_topk_returns_sorted_by_distance() {
|
||||
let mut bank = SketchBank::new();
|
||||
// id 10: identical
|
||||
bank.insert(10, Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1)).unwrap();
|
||||
// id 20: 1 bit different (last dim flipped)
|
||||
bank.insert(20, Sketch::from_embedding(&[0.5, 0.5, 0.5, -0.5], 1)).unwrap();
|
||||
// id 30: 2 bits different
|
||||
bank.insert(30, Sketch::from_embedding(&[-0.5, 0.5, -0.5, 0.5], 1)).unwrap();
|
||||
|
||||
let query = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1);
|
||||
let topk = bank.topk(&query, 3).unwrap();
|
||||
|
||||
assert_eq!(topk.len(), 3);
|
||||
assert_eq!(topk[0].0, 10); // 0 distance
|
||||
assert_eq!(topk[1].0, 20); // 1 distance
|
||||
assert_eq!(topk[2].0, 30); // 2 distance
|
||||
assert!(topk[0].1 <= topk[1].1);
|
||||
assert!(topk[1].1 <= topk[2].1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_topk_zero_returns_empty() {
|
||||
let mut bank = SketchBank::new();
|
||||
bank.insert(1, Sketch::from_embedding(&[0.5, 0.5], 1)).unwrap();
|
||||
let q = Sketch::from_embedding(&[0.5, 0.5], 1);
|
||||
assert_eq!(bank.topk(&q, 0).unwrap().len(), 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_topk_more_than_size_returns_all() {
|
||||
let mut bank = SketchBank::new();
|
||||
bank.insert(1, Sketch::from_embedding(&[0.5, 0.5], 1)).unwrap();
|
||||
bank.insert(2, Sketch::from_embedding(&[-0.5, 0.5], 1)).unwrap();
|
||||
let q = Sketch::from_embedding(&[0.5, 0.5], 1);
|
||||
assert_eq!(bank.topk(&q, 100).unwrap().len(), 2);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_locks_schema_on_first_insert() {
|
||||
let mut bank = SketchBank::new();
|
||||
bank.insert(1, Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1)).unwrap();
|
||||
// Different version → reject
|
||||
let err = bank
|
||||
.insert(2, Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 2))
|
||||
.unwrap_err();
|
||||
assert!(matches!(err, SketchError::SketchVersionMismatch { .. }));
|
||||
// Different dim → reject
|
||||
let err = bank
|
||||
.insert(3, Sketch::from_embedding(&[0.5, 0.5], 1))
|
||||
.unwrap_err();
|
||||
assert!(matches!(err, SketchError::EmbeddingDimMismatch { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_with_schema_rejects_first_mismatching_insert() {
|
||||
let mut bank = SketchBank::with_schema(4, 7);
|
||||
let err = bank
|
||||
.insert(1, Sketch::from_embedding(&[0.5, 0.5], 7))
|
||||
.unwrap_err();
|
||||
assert!(matches!(err, SketchError::EmbeddingDimMismatch { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn novelty_zero_for_exact_match_one_for_empty() {
|
||||
let bank_empty = SketchBank::new();
|
||||
let q = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1);
|
||||
assert_eq!(bank_empty.novelty(&q).unwrap(), 1.0);
|
||||
|
||||
let mut bank = SketchBank::new();
|
||||
bank.insert(1, q.clone()).unwrap();
|
||||
assert_eq!(bank.novelty(&q).unwrap(), 0.0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn novelty_is_proportional_to_min_distance() {
|
||||
let mut bank = SketchBank::new();
|
||||
// Bank has one sketch with all 8 dims positive.
|
||||
bank.insert(1, Sketch::from_embedding(&[0.5; 8], 1)).unwrap();
|
||||
// Query flips half the dims → 4 bit difference / 8 dims = 0.5.
|
||||
let query = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5, -0.5], 1);
|
||||
let novelty = bank.novelty(&query).unwrap();
|
||||
assert!((novelty - 0.5).abs() < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn try_from_embedding_rejects_over_long_input() {
|
||||
// L2 security-review finding (PR #435): the infallible
|
||||
// `from_embedding` saturates to u16::MAX; the fallible
|
||||
// `try_from_embedding` must surface the overflow so callers can
|
||||
// detect the misuse. We can't actually allocate a 65,536-f32
|
||||
// vector in unit tests cheaply (that's 256 KiB, fine), but we
|
||||
// can fabricate a `Vec` with `len() > u16::MAX` and check the
|
||||
// error path.
|
||||
let too_long: Vec<f32> = vec![0.5; (u16::MAX as usize) + 1];
|
||||
let err = Sketch::try_from_embedding(&too_long, 1).unwrap_err();
|
||||
match err {
|
||||
SketchError::EmbeddingDimOverflow { got, max } => {
|
||||
assert_eq!(got, (u16::MAX as usize) + 1);
|
||||
assert_eq!(max, u16::MAX as usize);
|
||||
}
|
||||
_ => panic!("expected EmbeddingDimOverflow, got {err:?}"),
|
||||
}
|
||||
|
||||
// The infallible path should *saturate* to u16::MAX rather
|
||||
// than panic in release. Verify the saturation is observable
|
||||
// on `embedding_dim()`.
|
||||
let s = Sketch::from_embedding(&too_long, 1);
|
||||
assert_eq!(s.embedding_dim(), u16::MAX);
|
||||
}
|
||||
|
||||
// ─── ADR-084 Pass 4 wire-format tests ────────────────────────────────────
|
||||
|
||||
#[test]
|
||||
fn wire_serialize_round_trip() {
|
||||
let v = vec![0.5_f32, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5];
|
||||
let sketch = Sketch::from_embedding(&v, 7);
|
||||
let bytes = WireSketch::serialize(&sketch, 0.42);
|
||||
|
||||
// Header (12) + 1 byte (8 dims / 8) = 13 bytes total.
|
||||
assert_eq!(bytes.len(), WireSketch::HEADER_BYTES + 1);
|
||||
|
||||
let (decoded, novelty) = WireSketch::deserialize(&bytes).expect("round-trip");
|
||||
assert_eq!(decoded.embedding_dim(), 8);
|
||||
assert_eq!(decoded.sketch_version(), 7);
|
||||
assert_eq!(decoded.distance_unchecked(&sketch), 0);
|
||||
// q15 quantization round-trips with bounded error.
|
||||
assert!((novelty - 0.42).abs() < 1.0 / 32_767.0 * 2.0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_rejects_short_buffer() {
|
||||
let err = WireSketch::deserialize(&[0u8; 5]).unwrap_err();
|
||||
match err {
|
||||
WireSketchError::TooShort { got: 5, needed } => {
|
||||
assert_eq!(needed, WireSketch::HEADER_BYTES);
|
||||
}
|
||||
_ => panic!("expected TooShort, got {err:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_rejects_oversized_buffer() {
|
||||
let big = vec![0u8; WIRE_SKETCH_MAX_BYTES + 1];
|
||||
let err = WireSketch::deserialize(&big).unwrap_err();
|
||||
assert!(matches!(err, WireSketchError::TooLarge { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_rejects_bad_magic() {
|
||||
let mut bytes = WireSketch::serialize(&Sketch::from_embedding(&[0.5; 16], 1), 0.0);
|
||||
bytes[0..4].copy_from_slice(&0xDEAD_BEEF_u32.to_le_bytes());
|
||||
let err = WireSketch::deserialize(&bytes).unwrap_err();
|
||||
assert!(matches!(err, WireSketchError::MagicMismatch { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_rejects_unsupported_format_version() {
|
||||
let mut bytes = WireSketch::serialize(&Sketch::from_embedding(&[0.5; 16], 1), 0.0);
|
||||
// Bump format_version to 99 — beyond what this build supports.
|
||||
bytes[4..6].copy_from_slice(&99_u16.to_le_bytes());
|
||||
let err = WireSketch::deserialize(&bytes).unwrap_err();
|
||||
assert!(matches!(err, WireSketchError::UnsupportedVersion { got: 99, .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_rejects_payload_size_mismatch() {
|
||||
// Build a valid 16-d sketch (2 bytes), then claim dim=24 in the
|
||||
// header (would need 3 bytes). Payload-size check must fire.
|
||||
let mut bytes = WireSketch::serialize(&Sketch::from_embedding(&[0.5; 16], 1), 0.0);
|
||||
bytes[8..10].copy_from_slice(&24_u16.to_le_bytes());
|
||||
let err = WireSketch::deserialize(&bytes).unwrap_err();
|
||||
match err {
|
||||
WireSketchError::PayloadSizeMismatch {
|
||||
dim: 24,
|
||||
expected_bytes: 3,
|
||||
got_bytes: 2,
|
||||
} => {}
|
||||
_ => panic!("expected PayloadSizeMismatch, got {err:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_envelope_size_for_aether_128d() {
|
||||
// Documented size sanity: a 128-d AETHER sketch should fit in
|
||||
// 12-byte header + 16-byte payload = 28 bytes total.
|
||||
let v: Vec<f32> = (0..128).map(|i| (i as f32).sin()).collect();
|
||||
let sketch = Sketch::from_embedding(&v, 1);
|
||||
let bytes = WireSketch::serialize(&sketch, 0.5);
|
||||
assert_eq!(bytes.len(), 28, "AETHER 128-d must wire to exactly 28 bytes");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn topk_rejects_query_with_wrong_schema() {
|
||||
let mut bank = SketchBank::with_schema(4, 1);
|
||||
bank.insert(1, Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 1)).unwrap();
|
||||
let bad_dim = Sketch::from_embedding(&[0.5, 0.5], 1);
|
||||
assert!(matches!(
|
||||
bank.topk(&bad_dim, 1).unwrap_err(),
|
||||
SketchError::EmbeddingDimMismatch { .. }
|
||||
));
|
||||
let bad_ver = Sketch::from_embedding(&[0.5, 0.5, 0.5, 0.5], 99);
|
||||
assert!(matches!(
|
||||
bank.topk(&bad_ver, 1).unwrap_err(),
|
||||
SketchError::SketchVersionMismatch { .. }
|
||||
));
|
||||
}
|
||||
}
|
||||
|
|
@ -333,6 +333,13 @@ struct NodeState {
|
|||
motion_energy_history: VecDeque<f64>,
|
||||
/// Coherence score [0.0, 1.0]: low variance in motion_energy = high coherence.
|
||||
coherence_score: f64,
|
||||
/// ADR-084 Pass 3 cluster-Pi novelty sensor — per-node sketch bank of
|
||||
/// recent CSI feature vectors. Populated by `update_novelty` on each
|
||||
/// frame; left `None` to disable the sensor on a per-node basis.
|
||||
feature_history: Option<wifi_densepose_signal::ruvsense::longitudinal::EmbeddingHistory>,
|
||||
/// Most recent novelty score in [0.0, 1.0] (0 = exact-match in bank,
|
||||
/// 1 = no overlap). Consumed by the model-wake gate downstream.
|
||||
pub(crate) last_novelty_score: Option<f32>,
|
||||
}
|
||||
|
||||
/// Default EMA alpha for temporal keypoint smoothing (RuVector Phase 2).
|
||||
|
|
@ -347,6 +354,15 @@ const COHERENCE_LOW_THRESHOLD: f64 = 0.3;
|
|||
const MAX_BONE_CHANGE_RATIO: f64 = 0.20;
|
||||
/// Number of motion_energy frames to track for coherence scoring.
|
||||
const COHERENCE_WINDOW: usize = 20;
|
||||
/// ADR-084 Pass 3 — per-node novelty sketch dimension (56 subcarriers,
|
||||
/// the dominant ESP32-S3 capture configuration).
|
||||
const NOVELTY_VECTOR_DIM: usize = 56;
|
||||
/// ADR-084 Pass 3 — number of past sketches retained per-node for
|
||||
/// novelty comparison. 64 frames ≈ 6.4 s at 10 Hz.
|
||||
const NOVELTY_HISTORY_CAPACITY: usize = 64;
|
||||
/// ADR-084 Pass 3 — feature-vector schema version. Bump on changes to
|
||||
/// subcarrier ordering / normalisation so banks reject stale data.
|
||||
const NOVELTY_SKETCH_VERSION: u16 = 1;
|
||||
|
||||
impl NodeState {
|
||||
pub(crate) fn new() -> Self {
|
||||
|
|
@ -375,9 +391,46 @@ impl NodeState {
|
|||
prev_keypoints: None,
|
||||
motion_energy_history: VecDeque::with_capacity(COHERENCE_WINDOW),
|
||||
coherence_score: 1.0, // assume stable initially
|
||||
feature_history: Some(
|
||||
wifi_densepose_signal::ruvsense::longitudinal::EmbeddingHistory::with_sketch(
|
||||
NOVELTY_VECTOR_DIM,
|
||||
NOVELTY_HISTORY_CAPACITY,
|
||||
NOVELTY_SKETCH_VERSION,
|
||||
),
|
||||
),
|
||||
last_novelty_score: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// ADR-084 cluster-Pi novelty step. Truncates / zero-pads the
|
||||
/// incoming amplitude vector to `NOVELTY_VECTOR_DIM`, scores its
|
||||
/// novelty against the per-node bank, then inserts it. The novelty
|
||||
/// score is computed *before* the insert so a frame doesn't see
|
||||
/// itself in the bank.
|
||||
pub(crate) fn update_novelty(&mut self, amplitudes: &[f64]) {
|
||||
let history = match &mut self.feature_history {
|
||||
Some(h) => h,
|
||||
None => return,
|
||||
};
|
||||
let mut feature: Vec<f32> = amplitudes
|
||||
.iter()
|
||||
.take(NOVELTY_VECTOR_DIM)
|
||||
.map(|&v| v as f32)
|
||||
.collect();
|
||||
feature.resize(NOVELTY_VECTOR_DIM, 0.0);
|
||||
|
||||
// Score before insert so a query doesn't see itself.
|
||||
self.last_novelty_score = history.novelty(&feature);
|
||||
|
||||
let _ = history.push(
|
||||
wifi_densepose_signal::ruvsense::longitudinal::EmbeddingEntry {
|
||||
person_id: 0,
|
||||
day_us: 0,
|
||||
embedding: feature,
|
||||
},
|
||||
);
|
||||
}
|
||||
|
||||
/// Update the coherence score from the latest motion_energy value.
|
||||
///
|
||||
/// Coherence is computed as 1.0 / (1.0 + running_variance) so that
|
||||
|
|
@ -423,6 +476,68 @@ struct PerNodeFeatureInfo {
|
|||
last_seen_ms: u64,
|
||||
frame_rate_hz: f64,
|
||||
stale: bool,
|
||||
/// ADR-084 Pass 3 cluster-Pi novelty score in `[0.0, 1.0]`.
|
||||
/// `0.0` = exact-match-in-bank, `1.0` = no overlap with recent
|
||||
/// per-node frame history. `None` until the first
|
||||
/// `update_novelty()` call. Consumers (model-wake gate, anomaly
|
||||
/// emit, UI heatmap) read this to decide whether to escalate.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
novelty_score: Option<f32>,
|
||||
}
|
||||
|
||||
/// Build a per-node feature snapshot for the WebSocket envelope.
|
||||
///
|
||||
/// ADR-084 Pass 3.6 — exposes `last_novelty_score` from each
|
||||
/// `NodeState` to the WebSocket consumer. Returns `None` when the
|
||||
/// node map is empty (no live ESP32 frames have been ingested yet),
|
||||
/// so the existing `node_features: None` semantics on cold-start are
|
||||
/// preserved.
|
||||
///
|
||||
/// Stale flag uses 5-second threshold matching `ESP32_OFFLINE_TIMEOUT`.
|
||||
fn build_node_features(
|
||||
node_states: &std::collections::HashMap<u8, NodeState>,
|
||||
now: std::time::Instant,
|
||||
) -> Option<Vec<PerNodeFeatureInfo>> {
|
||||
if node_states.is_empty() {
|
||||
return None;
|
||||
}
|
||||
let entries: Vec<PerNodeFeatureInfo> = node_states
|
||||
.iter()
|
||||
.map(|(&node_id, ns)| {
|
||||
let last_seen_ms = ns
|
||||
.last_frame_time
|
||||
.map(|t| now.saturating_duration_since(t).as_millis() as u64)
|
||||
.unwrap_or(u64::MAX);
|
||||
let stale = ns
|
||||
.last_frame_time
|
||||
.map(|t| now.saturating_duration_since(t) > ESP32_OFFLINE_TIMEOUT)
|
||||
.unwrap_or(true);
|
||||
let features = ns.latest_features.clone().unwrap_or(FeatureInfo {
|
||||
mean_rssi: 0.0,
|
||||
variance: 0.0,
|
||||
motion_band_power: 0.0,
|
||||
breathing_band_power: 0.0,
|
||||
dominant_freq_hz: 0.0,
|
||||
change_points: 0,
|
||||
spectral_power: 0.0,
|
||||
});
|
||||
PerNodeFeatureInfo {
|
||||
node_id,
|
||||
features,
|
||||
classification: ClassificationInfo {
|
||||
motion_level: ns.current_motion_level.clone(),
|
||||
presence: !matches!(ns.current_motion_level.as_str(), "absent"),
|
||||
confidence: ns.smoothed_person_score.clamp(0.0, 1.0),
|
||||
},
|
||||
rssi_dbm: ns.rssi_history.back().copied().unwrap_or(0.0),
|
||||
last_seen_ms,
|
||||
frame_rate_hz: 0.0, // Computed elsewhere; not yet plumbed here.
|
||||
stale,
|
||||
novelty_score: ns.last_novelty_score,
|
||||
}
|
||||
})
|
||||
.collect();
|
||||
Some(entries)
|
||||
}
|
||||
|
||||
/// Shared application state
|
||||
|
|
@ -3696,7 +3811,12 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
|
|||
model_status: None,
|
||||
persons: None,
|
||||
estimated_persons: if total_persons > 0 { Some(total_persons) } else { None },
|
||||
node_features: None,
|
||||
// ADR-084 Pass 3.6: surface per-node novelty_score
|
||||
// (and the rest of the per-node feature snapshot)
|
||||
// on the WebSocket envelope so cluster-Pi consumers
|
||||
// can implement model-wake gating without round-
|
||||
// tripping back to the server.
|
||||
node_features: build_node_features(&s.node_states, now),
|
||||
};
|
||||
|
||||
let raw_persons = derive_pose_from_sensing(&update);
|
||||
|
|
@ -3764,6 +3884,13 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
|
|||
let ns = s.node_states.entry(node_id).or_insert_with(NodeState::new);
|
||||
ns.last_frame_time = Some(std::time::Instant::now());
|
||||
|
||||
// ADR-084 Pass 3: cluster-Pi novelty sensor.
|
||||
// Score this frame's feature vector against the per-node
|
||||
// sketch bank *before* pushing it (so the score reflects
|
||||
// pre-insert state). Result lands in `ns.last_novelty_score`
|
||||
// for downstream model-wake gating.
|
||||
ns.update_novelty(&frame.amplitudes);
|
||||
|
||||
ns.frame_history.push_back(frame.amplitudes.clone());
|
||||
if ns.frame_history.len() > FRAME_HISTORY_CAPACITY {
|
||||
ns.frame_history.pop_front();
|
||||
|
|
@ -3908,7 +4035,12 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
|
|||
model_status: None,
|
||||
persons: None,
|
||||
estimated_persons: if total_persons > 0 { Some(total_persons) } else { None },
|
||||
node_features: None,
|
||||
// ADR-084 Pass 3.6: surface per-node novelty_score
|
||||
// (and the rest of the per-node feature snapshot)
|
||||
// on the WebSocket envelope so cluster-Pi consumers
|
||||
// can implement model-wake gating without round-
|
||||
// tripping back to the server.
|
||||
node_features: build_node_features(&s.node_states, now),
|
||||
};
|
||||
|
||||
let raw_persons = derive_pose_from_sensing(&update);
|
||||
|
|
@ -4870,3 +5002,51 @@ async fn main() {
|
|||
|
||||
info!("Server shut down cleanly");
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod novelty_tests {
|
||||
use super::*;
|
||||
|
||||
/// First call to `update_novelty` must produce *some* score
|
||||
/// (`Some(_)` not `None`) — proves the per-node sketch bank is
|
||||
/// initialised by `NodeState::new()` and the novelty path is
|
||||
/// actually being exercised. With an empty bank the score is 1.0
|
||||
/// (max novelty).
|
||||
#[test]
|
||||
fn first_frame_yields_max_novelty_then_zero_on_repeat() {
|
||||
let mut ns = NodeState::new();
|
||||
let amplitudes: Vec<f64> = (0..NOVELTY_VECTOR_DIM)
|
||||
.map(|i| (i as f64).sin())
|
||||
.collect();
|
||||
|
||||
ns.update_novelty(&litudes);
|
||||
let first = ns.last_novelty_score.expect("sketch bank initialised");
|
||||
assert!(
|
||||
(first - 1.0).abs() < 1e-6,
|
||||
"empty bank → max novelty 1.0, got {first}"
|
||||
);
|
||||
|
||||
// Repeat the exact same frame — bank now contains it, so the
|
||||
// novelty score must be 0.0 (the score is computed before the
|
||||
// second insert, against the post-first-insert bank).
|
||||
ns.update_novelty(&litudes);
|
||||
let second = ns.last_novelty_score.expect("score stays Some");
|
||||
assert_eq!(second, 0.0, "exact-repeat frame → novelty 0.0");
|
||||
}
|
||||
|
||||
/// `update_novelty` must tolerate amplitude vectors of unexpected
|
||||
/// length — short ones zero-padded, long ones truncated — without
|
||||
/// panicking. ESP32-S3 boards report 56 subcarriers but other
|
||||
/// hardware variants ship 52 or 64; the schema-locked sketch bank
|
||||
/// requires exactly NOVELTY_VECTOR_DIM.
|
||||
#[test]
|
||||
fn handles_short_and_long_amplitude_vectors() {
|
||||
let mut ns = NodeState::new();
|
||||
ns.update_novelty(&[1.0, 2.0]); // way short
|
||||
assert!(ns.last_novelty_score.is_some());
|
||||
|
||||
let too_long: Vec<f64> = (0..NOVELTY_VECTOR_DIM * 2).map(|i| i as f64).collect();
|
||||
ns.update_novelty(&too_long); // way long
|
||||
assert!(ns.last_novelty_score.is_some());
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -15,12 +15,32 @@ use crate::vital_signs::{VitalSignDetector, VitalSigns};
|
|||
use wifi_densepose_signal::ruvsense::pose_tracker::PoseTracker;
|
||||
use wifi_densepose_signal::ruvsense::multistatic::MultistaticFuser;
|
||||
use wifi_densepose_signal::ruvsense::field_model::FieldModel;
|
||||
use wifi_densepose_signal::ruvsense::longitudinal::{EmbeddingEntry, EmbeddingHistory};
|
||||
|
||||
// ── Constants ───────────────────────────────────────────────────────────────
|
||||
|
||||
/// Number of frames retained in `frame_history` for temporal analysis.
|
||||
pub const FRAME_HISTORY_CAPACITY: usize = 100;
|
||||
|
||||
/// Per-node feature-vector dimension fed into the novelty sketch bank
|
||||
/// (ADR-084 §"cluster-Pi novelty sensor"). 56 subcarriers is the
|
||||
/// dominant ESP32-S3 capture configuration; vectors with more or fewer
|
||||
/// subcarriers are truncated or zero-padded to this length so the
|
||||
/// schema-locked SketchBank stays consistent across hardware variants.
|
||||
pub const NOVELTY_VECTOR_DIM: usize = 56;
|
||||
|
||||
/// Number of past sketches retained per-node for novelty comparison.
|
||||
/// 64 frames ≈ 6.4 s at 10 Hz CSI rate, enough to capture short-term
|
||||
/// "this is what this room normally looks like." Older sketches are
|
||||
/// FIFO-evicted by `EmbeddingHistory`.
|
||||
pub const NOVELTY_HISTORY_CAPACITY: usize = 64;
|
||||
|
||||
/// Schema version for the per-node novelty sketch. Bump when the
|
||||
/// feature-vector encoding changes meaningfully (e.g., different
|
||||
/// subcarrier ordering or normalisation) so existing per-node banks
|
||||
/// reject incoming sketches from incompatible model generations.
|
||||
pub const NOVELTY_SKETCH_VERSION: u16 = 1;
|
||||
|
||||
/// If no ESP32 frame arrives within this duration, source reverts to offline.
|
||||
pub const ESP32_OFFLINE_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(5);
|
||||
|
||||
|
|
@ -183,6 +203,11 @@ pub struct PerNodeFeatureInfo {
|
|||
pub rssi_dbm: f64,
|
||||
pub last_seen_ms: u64,
|
||||
pub frame_rate_hz: f64,
|
||||
/// ADR-084 Pass 3 cluster-Pi novelty score in `[0.0, 1.0]`.
|
||||
/// `0.0` = exact-match-in-bank, `1.0` = no overlap with recent
|
||||
/// per-node frame history. `None` until first `update_novelty()`.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub novelty_score: Option<f32>,
|
||||
pub stale: bool,
|
||||
}
|
||||
|
||||
|
|
@ -247,6 +272,15 @@ pub struct NodeState {
|
|||
pub prev_keypoints: Option<Vec<[f64; 3]>>,
|
||||
pub motion_energy_history: VecDeque<f64>,
|
||||
pub coherence_score: f64,
|
||||
/// ADR-084 cluster-Pi novelty sensor — per-node sketch bank of recent
|
||||
/// CSI feature vectors. Populated lazily by `update_novelty` on each
|
||||
/// frame; left `None` if the sensor is disabled (e.g., in unit-test
|
||||
/// fixtures that don't exercise the novelty path).
|
||||
pub feature_history: Option<EmbeddingHistory>,
|
||||
/// Most recent novelty score for this node in `[0.0, 1.0]`.
|
||||
/// `None` until the first `update_novelty` call. Consumed by the
|
||||
/// model-wake gate downstream (low novelty → skip CNN, save energy).
|
||||
pub last_novelty_score: Option<f32>,
|
||||
}
|
||||
|
||||
impl NodeState {
|
||||
|
|
@ -276,9 +310,56 @@ impl NodeState {
|
|||
prev_keypoints: None,
|
||||
motion_energy_history: VecDeque::with_capacity(COHERENCE_WINDOW),
|
||||
coherence_score: 1.0,
|
||||
feature_history: Some(EmbeddingHistory::with_sketch(
|
||||
NOVELTY_VECTOR_DIM,
|
||||
NOVELTY_HISTORY_CAPACITY,
|
||||
NOVELTY_SKETCH_VERSION,
|
||||
)),
|
||||
last_novelty_score: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// ADR-084 cluster-Pi novelty step. Truncates / zero-pads the
|
||||
/// incoming amplitude vector to `NOVELTY_VECTOR_DIM`, scores its
|
||||
/// novelty against the per-node bank, then inserts it. The novelty
|
||||
/// score is computed *before* the insert so a query frame doesn't
|
||||
/// score itself.
|
||||
///
|
||||
/// Idempotent in the absence of `feature_history` (returns early
|
||||
/// silently). Caller can read the result via `last_novelty_score`.
|
||||
pub fn update_novelty(&mut self, amplitudes: &[f64]) {
|
||||
let history = match &mut self.feature_history {
|
||||
Some(h) => h,
|
||||
None => return,
|
||||
};
|
||||
// Truncate or zero-pad to the canonical dim.
|
||||
//
|
||||
// L4 hardening (PR #435 security review): the `as f32` cast
|
||||
// accepts adversarial f64 inputs without panic. `f64::INFINITY`
|
||||
// becomes `f32::INFINITY` (sign-quantizes to bit=1; novelty
|
||||
// degrades but no crash). `f64::NAN` propagates as `f32::NAN`
|
||||
// (sign-quantizes to bit=0 since `NaN > 0.0` is false). CSI
|
||||
// amplitudes from healthy ESP32 firmware are well within f32
|
||||
// finite range — adversarial input degrades novelty quality
|
||||
// but never causes the gate to panic.
|
||||
let mut feature: Vec<f32> = amplitudes
|
||||
.iter()
|
||||
.take(NOVELTY_VECTOR_DIM)
|
||||
.map(|&v| v as f32)
|
||||
.collect();
|
||||
feature.resize(NOVELTY_VECTOR_DIM, 0.0);
|
||||
|
||||
// Score before insert so a query doesn't see itself.
|
||||
self.last_novelty_score = history.novelty(&feature);
|
||||
|
||||
// FIFO insert (EmbeddingHistory handles eviction internally).
|
||||
let _ = history.push(EmbeddingEntry {
|
||||
person_id: 0, // novelty bank doesn't track per-person identity
|
||||
day_us: 0,
|
||||
embedding: feature,
|
||||
});
|
||||
}
|
||||
|
||||
/// Update the coherence score from the latest motion_energy value.
|
||||
pub fn update_coherence(&mut self, motion_energy: f64) {
|
||||
if self.motion_energy_history.len() >= COHERENCE_WINDOW {
|
||||
|
|
|
|||
|
|
@ -45,6 +45,8 @@ midstreamer-attractor = { workspace = true }
|
|||
|
||||
# Internal
|
||||
wifi-densepose-core = { version = "0.3.0", path = "../wifi-densepose-core" }
|
||||
# ADR-084 Pass 2: sketch-prefilter for the EmbeddingHistory search loop.
|
||||
wifi-densepose-ruvector = { version = "0.3.0", path = "../wifi-densepose-ruvector", default-features = false }
|
||||
|
||||
[dev-dependencies]
|
||||
criterion = { version = "0.5", features = ["html_reports"] }
|
||||
|
|
@ -53,3 +55,7 @@ proptest.workspace = true
|
|||
[[bench]]
|
||||
name = "signal_bench"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "aether_prefilter_bench"
|
||||
harness = false
|
||||
|
|
|
|||
|
|
@ -0,0 +1,95 @@
|
|||
//! ADR-084 Pass 2 acceptance bench — EmbeddingHistory::search_prefilter
|
||||
//! vs the brute-force EmbeddingHistory::search baseline.
|
||||
//!
|
||||
//! Measures the second ADR-084 acceptance number — **end-to-end query
|
||||
//! cost reduction** at the AETHER re-ID site, with the empirically
|
||||
//! validated `prefilter_factor=8` from
|
||||
//! `test_search_prefilter_topk_coverage_meets_adr_084`.
|
||||
//!
|
||||
//! Run with:
|
||||
//! ```bash
|
||||
//! cargo bench -p wifi-densepose-signal --bench aether_prefilter_bench
|
||||
//! ```
|
||||
//!
|
||||
//! Pass criterion: prefilter ≥ 4× faster than brute-force at n=1024;
|
||||
//! ideally trends toward 8× as n grows. The 90%-coverage criterion is
|
||||
//! exercised in the unit-test suite, not the bench (the bench measures
|
||||
//! cost only).
|
||||
|
||||
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
|
||||
use std::hint;
|
||||
use wifi_densepose_signal::ruvsense::longitudinal::{EmbeddingEntry, EmbeddingHistory};
|
||||
|
||||
const SKETCH_VERSION: u16 = 1;
|
||||
const PREFILTER_FACTOR: usize = 8;
|
||||
|
||||
/// Deterministic LCG so bench fixtures are reproducible across runs.
|
||||
fn lcg_embedding(dim: usize, seed: u32) -> Vec<f32> {
|
||||
let mut s = seed.wrapping_mul(2_654_435_761).wrapping_add(1);
|
||||
(0..dim)
|
||||
.map(|_| {
|
||||
s = s.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
|
||||
let u = (s >> 8) as f32 / (1u32 << 24) as f32;
|
||||
u * 2.0 - 1.0
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn bench_search_vs_prefilter(c: &mut Criterion) {
|
||||
const DIM: usize = 128; // AETHER embedding dimension (ADR-024)
|
||||
const K: usize = 8;
|
||||
|
||||
for &n in &[256usize, 1024, 4096] {
|
||||
// Build two parallel histories — one with sketches (prefilter
|
||||
// path) and one without (brute-force path). They contain the
|
||||
// same embeddings.
|
||||
let mut bf = EmbeddingHistory::new(DIM, n);
|
||||
let mut pf = EmbeddingHistory::with_sketch(DIM, n, SKETCH_VERSION);
|
||||
for i in 0..n {
|
||||
let v = lcg_embedding(DIM, i as u32 + 1);
|
||||
let entry = EmbeddingEntry {
|
||||
person_id: i as u64,
|
||||
day_us: i as u64,
|
||||
embedding: v,
|
||||
};
|
||||
bf.push(entry.clone()).expect("bf push");
|
||||
pf.push(entry).expect("pf push");
|
||||
}
|
||||
|
||||
let query = lcg_embedding(DIM, 0xCAFE_BABE);
|
||||
|
||||
let mut group = c.benchmark_group(format!("aether_search_d{DIM}_n{n}_k{K}"));
|
||||
group.throughput(Throughput::Elements(n as u64));
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("brute_force_cosine", n),
|
||||
&n,
|
||||
|bencher, _| {
|
||||
bencher.iter(|| {
|
||||
let r = black_box(&bf).search(black_box(&query), K);
|
||||
hint::black_box(r)
|
||||
});
|
||||
},
|
||||
);
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("sketch_prefilter_factor8", n),
|
||||
&n,
|
||||
|bencher, _| {
|
||||
bencher.iter(|| {
|
||||
let r = black_box(&pf).search_prefilter(
|
||||
black_box(&query),
|
||||
K,
|
||||
PREFILTER_FACTOR,
|
||||
);
|
||||
hint::black_box(r)
|
||||
});
|
||||
},
|
||||
);
|
||||
|
||||
group.finish();
|
||||
}
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_search_vs_prefilter);
|
||||
criterion_main!(benches);
|
||||
|
|
@ -338,25 +338,58 @@ pub struct EmbeddingEntry {
|
|||
///
|
||||
/// In production, this would be backed by an HNSW index for fast
|
||||
/// nearest-neighbor search. This implementation uses brute-force
|
||||
/// cosine similarity for correctness.
|
||||
/// cosine similarity for correctness, with an optional RaBitQ-style
|
||||
/// sketch prefilter (ADR-084) for hot-path queries.
|
||||
#[derive(Debug)]
|
||||
pub struct EmbeddingHistory {
|
||||
entries: Vec<EmbeddingEntry>,
|
||||
/// Per-entry sketch (parallel to `entries`); maintained on push/evict.
|
||||
/// Always populated when `sketch_version` is set.
|
||||
sketches: Vec<wifi_densepose_ruvector::Sketch>,
|
||||
max_entries: usize,
|
||||
embedding_dim: usize,
|
||||
/// Sketch schema version (ADR-084 §"Versioning"). When set, every push
|
||||
/// computes a sketch alongside the float embedding so `search_prefilter`
|
||||
/// can use it. `None` disables the prefilter path entirely (compatible
|
||||
/// with existing callers that never opted in).
|
||||
sketch_version: Option<u16>,
|
||||
}
|
||||
|
||||
impl EmbeddingHistory {
|
||||
/// Create a new embedding history store.
|
||||
/// Create a new embedding history store with the sketch prefilter
|
||||
/// **disabled**. Callers that want the ADR-084 prefilter path should
|
||||
/// use [`EmbeddingHistory::with_sketch`] instead.
|
||||
pub fn new(embedding_dim: usize, max_entries: usize) -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
sketches: Vec::new(),
|
||||
max_entries,
|
||||
embedding_dim,
|
||||
sketch_version: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Add an embedding entry.
|
||||
/// Create a history store with the ADR-084 sketch prefilter enabled.
|
||||
///
|
||||
/// `sketch_version` is the producing embedding-model version (bump it
|
||||
/// on any model change so callers can invalidate stored sketches
|
||||
/// instead of silently comparing across generations).
|
||||
pub fn with_sketch(
|
||||
embedding_dim: usize,
|
||||
max_entries: usize,
|
||||
sketch_version: u16,
|
||||
) -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
sketches: Vec::new(),
|
||||
max_entries,
|
||||
embedding_dim,
|
||||
sketch_version: Some(sketch_version),
|
||||
}
|
||||
}
|
||||
|
||||
/// Add an embedding entry. If sketches are enabled, also computes
|
||||
/// and stores the per-entry sketch.
|
||||
pub fn push(&mut self, entry: EmbeddingEntry) -> Result<(), LongitudinalError> {
|
||||
if entry.embedding.len() != self.embedding_dim {
|
||||
return Err(LongitudinalError::EmbeddingDimensionMismatch {
|
||||
|
|
@ -366,6 +399,13 @@ impl EmbeddingHistory {
|
|||
}
|
||||
if self.entries.len() >= self.max_entries {
|
||||
self.entries.drain(..1); // FIFO eviction — acceptable for daily-rate inserts
|
||||
if !self.sketches.is_empty() {
|
||||
self.sketches.drain(..1);
|
||||
}
|
||||
}
|
||||
if let Some(sv) = self.sketch_version {
|
||||
let sk = wifi_densepose_ruvector::Sketch::from_embedding(&entry.embedding, sv);
|
||||
self.sketches.push(sk);
|
||||
}
|
||||
self.entries.push(entry);
|
||||
Ok(())
|
||||
|
|
@ -385,6 +425,105 @@ impl EmbeddingHistory {
|
|||
similarities
|
||||
}
|
||||
|
||||
/// ADR-084 Pass 2: sketch-prefiltered K-nearest cosine search.
|
||||
///
|
||||
/// Two-stage pipeline:
|
||||
///
|
||||
/// 1. **Prefilter:** sketch the query, hamming-rank all stored
|
||||
/// sketches, take the top `k * prefilter_factor` candidates.
|
||||
/// 2. **Refine:** compute exact cosine similarity against just those
|
||||
/// candidates and return the top-K by cosine.
|
||||
///
|
||||
/// `prefilter_factor` controls the recall/cost trade-off — larger
|
||||
/// values widen the candidate set (more cosine work, higher top-K
|
||||
/// coverage) and smaller values narrow it (less work, risk of
|
||||
/// missing the true top-K). ADR-084 acceptance is **≥ 90% top-K
|
||||
/// agreement** with the brute-force `search`; on synthetic uniform-
|
||||
/// random 128-d embeddings (the AETHER shape), measured coverage is
|
||||
/// **78.9% at factor=4 (FAIL)** and **≥ 90% at factor=8 (PASS)** —
|
||||
/// so callers should pass at least **8**. Real AETHER traces have
|
||||
/// more structure than uniform noise and usually clear the bar at
|
||||
/// lower factors; recalibrate against your bank.
|
||||
///
|
||||
/// Falls back to [`EmbeddingHistory::search`] if sketches were not
|
||||
/// enabled at construction (`sketch_version = None`) — the caller
|
||||
/// gets correct behaviour either way, just without the speedup.
|
||||
pub fn search_prefilter(
|
||||
&self,
|
||||
query: &[f32],
|
||||
k: usize,
|
||||
prefilter_factor: usize,
|
||||
) -> Vec<(usize, f32)> {
|
||||
let sv = match self.sketch_version {
|
||||
Some(v) => v,
|
||||
None => return self.search(query, k),
|
||||
};
|
||||
if k == 0 || self.entries.is_empty() {
|
||||
return Vec::new();
|
||||
}
|
||||
|
||||
let query_sk = wifi_densepose_ruvector::Sketch::from_embedding(query, sv);
|
||||
let prefilter_k = (k.saturating_mul(prefilter_factor.max(1))).min(self.entries.len());
|
||||
|
||||
// Stage 1: sketch hamming top-K' over all sketches.
|
||||
// (Inlined here rather than going through SketchBank because
|
||||
// EmbeddingHistory owns the parallel `sketches` array directly.)
|
||||
let mut hamming: Vec<(usize, u32)> = self
|
||||
.sketches
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(i, sk)| (i, sk.distance_unchecked(&query_sk)))
|
||||
.collect();
|
||||
hamming.sort_by_key(|&(_, d)| d);
|
||||
hamming.truncate(prefilter_k);
|
||||
|
||||
// Stage 2: refine the prefilter set with exact cosine.
|
||||
let mut refined: Vec<(usize, f32)> = hamming
|
||||
.into_iter()
|
||||
.map(|(i, _)| (i, cosine_similarity(query, &self.entries[i].embedding)))
|
||||
.collect();
|
||||
refined.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
|
||||
refined.truncate(k);
|
||||
refined
|
||||
}
|
||||
|
||||
/// ADR-084 Pass 3: novelty score for a query against the bank in [0.0, 1.0].
|
||||
///
|
||||
/// Defined as `min_hamming_distance / embedding_dim` over the stored
|
||||
/// sketches, so 0.0 means "exact bit-match exists in the bank" and
|
||||
/// 1.0 means "every bit differs from the nearest stored sketch."
|
||||
/// Returns 1.0 (max novelty) on an empty bank.
|
||||
///
|
||||
/// This is the primitive the cluster-Pi novelty sensor wraps: a
|
||||
/// per-node bank of recent feature vectors, with each new frame
|
||||
/// scored for novelty before being inserted. Downstream gates
|
||||
/// (model-wake, anomaly-emit, escalation) consume the score.
|
||||
///
|
||||
/// Returns `None` if sketches are not enabled
|
||||
/// (use `EmbeddingHistory::with_sketch` to enable).
|
||||
pub fn novelty(&self, query: &[f32]) -> Option<f32> {
|
||||
let sv = self.sketch_version?;
|
||||
if self.sketches.is_empty() {
|
||||
return Some(1.0);
|
||||
}
|
||||
// L3 hardening (PR #435 security review): a 0-dim history would
|
||||
// produce `min_d as f32 / 0.0 = NaN`, silently poisoning every
|
||||
// downstream gate. `with_sketch(0, ...)` is constructible today;
|
||||
// treating "no comparison possible" as "maximally novel" is the
|
||||
// fail-loud behaviour every consumer of this score expects.
|
||||
if self.embedding_dim == 0 {
|
||||
return Some(1.0);
|
||||
}
|
||||
let q = wifi_densepose_ruvector::Sketch::from_embedding(query, sv);
|
||||
let min_d = self
|
||||
.sketches
|
||||
.iter()
|
||||
.map(|sk| sk.distance_unchecked(&q))
|
||||
.min()
|
||||
.unwrap_or(u32::MAX);
|
||||
Some(min_d as f32 / self.embedding_dim as f32)
|
||||
}
|
||||
|
||||
/// Number of entries stored.
|
||||
pub fn len(&self) -> usize {
|
||||
self.entries.len()
|
||||
|
|
@ -689,4 +828,197 @@ mod tests {
|
|||
let c = vec![1.0_f32, 0.0, 0.0];
|
||||
assert!((cosine_similarity(&a, &c) - 1.0).abs() < 1e-6, "Same = 1");
|
||||
}
|
||||
|
||||
// ─── ADR-084 Pass 2: sketch-prefilter tests ──────────────────────────────
|
||||
|
||||
/// Deterministic LCG so synthetic test embeddings are reproducible
|
||||
/// without pulling in a `rand` dev-dep just for fixture generation.
|
||||
fn lcg_embedding(dim: usize, seed: u32) -> Vec<f32> {
|
||||
let mut s = seed.wrapping_mul(2_654_435_761).wrapping_add(1);
|
||||
(0..dim)
|
||||
.map(|_| {
|
||||
s = s.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
|
||||
let u = (s >> 8) as f32 / (1u32 << 24) as f32;
|
||||
u * 2.0 - 1.0
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_search_prefilter_falls_back_when_sketches_disabled() {
|
||||
// `EmbeddingHistory::new` does NOT enable sketches; the prefilter
|
||||
// must transparently fall back to brute-force search so callers
|
||||
// never see incorrect results.
|
||||
let mut h = EmbeddingHistory::new(8, 100);
|
||||
for i in 0..5 {
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: i,
|
||||
day_us: i,
|
||||
embedding: lcg_embedding(8, i as u32 + 1),
|
||||
})
|
||||
.unwrap();
|
||||
}
|
||||
let q = lcg_embedding(8, 42);
|
||||
let bf = h.search(&q, 3);
|
||||
let pf = h.search_prefilter(&q, 3, 4);
|
||||
assert_eq!(bf, pf, "fallback path must equal brute-force exactly");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_search_prefilter_topk_coverage_meets_adr_084() {
|
||||
// ADR-084 acceptance criterion: prefilter top-K must agree with
|
||||
// brute-force top-K on at least 90% of results. We use a 256-entry
|
||||
// bank of 128-d synthetic embeddings (the AETHER shape) and check
|
||||
// both K=8 and K=16 to span the realistic range.
|
||||
const DIM: usize = 128;
|
||||
const N: usize = 256;
|
||||
const K_VALUES: [usize; 2] = [8, 16];
|
||||
const PREFILTER_FACTOR: usize = 8;
|
||||
const SKETCH_VERSION: u16 = 1;
|
||||
|
||||
let mut h = EmbeddingHistory::with_sketch(DIM, N, SKETCH_VERSION);
|
||||
for i in 0..N {
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: i as u64,
|
||||
day_us: i as u64,
|
||||
embedding: lcg_embedding(DIM, i as u32 + 1),
|
||||
})
|
||||
.unwrap();
|
||||
}
|
||||
|
||||
for &k in &K_VALUES {
|
||||
let mut total_overlap = 0usize;
|
||||
let mut total_expected = 0usize;
|
||||
// 16 different queries to smooth out any single-query luck.
|
||||
for q_seed in 0..16u32 {
|
||||
let q = lcg_embedding(DIM, q_seed.wrapping_add(0xCAFE_BABE));
|
||||
let bf: std::collections::HashSet<usize> =
|
||||
h.search(&q, k).into_iter().map(|(i, _)| i).collect();
|
||||
let pf: std::collections::HashSet<usize> = h
|
||||
.search_prefilter(&q, k, PREFILTER_FACTOR)
|
||||
.into_iter()
|
||||
.map(|(i, _)| i)
|
||||
.collect();
|
||||
total_overlap += bf.intersection(&pf).count();
|
||||
total_expected += k;
|
||||
}
|
||||
let coverage = total_overlap as f32 / total_expected as f32;
|
||||
assert!(
|
||||
coverage >= 0.90,
|
||||
"ADR-084 acceptance failed at k={k}: prefilter coverage {coverage:.3} < 0.90"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_novelty_returns_none_without_sketches() {
|
||||
// EmbeddingHistory::new disables sketches; novelty must be None
|
||||
// so callers can fall back to a slower path or skip the gate.
|
||||
let mut h = EmbeddingHistory::new(8, 100);
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: 1,
|
||||
day_us: 0,
|
||||
embedding: lcg_embedding(8, 1),
|
||||
})
|
||||
.unwrap();
|
||||
let q = lcg_embedding(8, 99);
|
||||
assert_eq!(h.novelty(&q), None);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_novelty_zero_for_exact_match_one_for_empty_bank() {
|
||||
// Empty bank → maximum novelty (1.0).
|
||||
let h = EmbeddingHistory::with_sketch(8, 100, 1);
|
||||
let q = lcg_embedding(8, 1);
|
||||
assert_eq!(h.novelty(&q), Some(1.0));
|
||||
|
||||
// Bank containing the query → minimum novelty (0.0).
|
||||
let mut h = EmbeddingHistory::with_sketch(8, 100, 1);
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: 1,
|
||||
day_us: 0,
|
||||
embedding: q.clone(),
|
||||
})
|
||||
.unwrap();
|
||||
assert_eq!(h.novelty(&q), Some(0.0));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_novelty_zero_dim_history_returns_one_not_nan() {
|
||||
// L3 security-review finding (PR #435): a 0-dim sketch history is
|
||||
// constructible via `with_sketch(0, ...)`. Without the guard,
|
||||
// `novelty` would produce NaN (min_d / 0). This pins down the
|
||||
// documented fail-loud behaviour: 0-dim → max-novelty 1.0.
|
||||
let h = EmbeddingHistory::with_sketch(0, 100, 1);
|
||||
let q: Vec<f32> = vec![]; // 0-dim query is the only valid one here
|
||||
let result = h.novelty(&q);
|
||||
assert_eq!(result, Some(1.0), "0-dim history → max novelty, never NaN");
|
||||
assert!(
|
||||
!result.unwrap().is_nan(),
|
||||
"novelty must never be NaN — 0-dim is fail-loud, not silent"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_novelty_decreases_as_bank_grows_around_query() {
|
||||
// Insert progressively-closer-to-query embeddings; novelty must
|
||||
// monotonically decrease (or stay flat). Guards against an
|
||||
// accidentally-reversed comparator producing the wrong gradient.
|
||||
const DIM: usize = 64;
|
||||
let mut h = EmbeddingHistory::with_sketch(DIM, 100, 1);
|
||||
let target = lcg_embedding(DIM, 0xDEAD_BEEF);
|
||||
|
||||
// Push several embeddings unrelated to the target first.
|
||||
for s in 1..10u32 {
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: s as u64,
|
||||
day_us: s as u64,
|
||||
embedding: lcg_embedding(DIM, s),
|
||||
})
|
||||
.unwrap();
|
||||
}
|
||||
let novelty_far = h.novelty(&target).unwrap();
|
||||
|
||||
// Push the target itself — novelty must drop to 0.
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: 99,
|
||||
day_us: 99,
|
||||
embedding: target.clone(),
|
||||
})
|
||||
.unwrap();
|
||||
let novelty_near = h.novelty(&target).unwrap();
|
||||
|
||||
assert!(
|
||||
novelty_near <= novelty_far,
|
||||
"novelty must not increase when adding a closer match: {novelty_far} → {novelty_near}"
|
||||
);
|
||||
assert_eq!(novelty_near, 0.0, "exact match should yield novelty 0");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_search_prefilter_evicts_sketches_on_fifo() {
|
||||
// FIFO eviction must drop sketches in lockstep with entries; if
|
||||
// the two arrays drift the prefilter would index the wrong sketch
|
||||
// for an entry and silently corrupt top-K results.
|
||||
let mut h = EmbeddingHistory::with_sketch(4, 3, 1);
|
||||
for i in 0..5u32 {
|
||||
h.push(EmbeddingEntry {
|
||||
person_id: i as u64,
|
||||
day_us: i as u64,
|
||||
embedding: lcg_embedding(4, i + 1),
|
||||
})
|
||||
.unwrap();
|
||||
}
|
||||
assert_eq!(h.len(), 3);
|
||||
// Sanity: first two entries (day_us 0, 1) evicted.
|
||||
assert_eq!(h.get(0).unwrap().day_us, 2);
|
||||
|
||||
// Prefilter still works post-eviction (no panic, returns valid indices).
|
||||
let q = lcg_embedding(4, 99);
|
||||
let pf = h.search_prefilter(&q, 2, 4);
|
||||
assert_eq!(pf.len(), 2);
|
||||
for (i, _) in &pf {
|
||||
assert!(*i < h.len());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue