Added measure_concurrent_fed: N client threads hammering federated
search against K shards on one box. Expected result was "rayon helps
under concurrent load where single-thread bench masks it." Actual
result is different and worth recording honestly.
At n=100k, 8 clients × 300 queries:
1 shard: 810ms wall, 2,963 qps
2 shards: 960ms wall, 2,500 qps (0.84×)
4 shards: 1,350ms wall, 1,778 qps (0.60×)
More shards = LOWER concurrent throughput for this "same data split K
ways on one box" workload. Root cause: each shard reranks its own
rerank_factor × k = 200 candidates, so K-shard federation does ~K× the
rerank work. Parallel fan-out cuts scan cost but not rerank cost.
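The rerank arithmetic can be sketched as follows. `rerank_candidates` is a
hypothetical helper, and the split rerank_factor = 20, k = 10 is an
assumption that merely reproduces the 200-per-shard figure:

```rust
/// Total exact-distance reranks per query when every shard independently
/// reranks `rerank_factor * k` candidates (hypothetical helper).
fn rerank_candidates(shards: usize, rerank_factor: usize, k: usize) -> usize {
    shards * rerank_factor * k
}
```

Under those assumed values, 1 shard reranks 200 candidates per query and
4 shards rerank 800, which is the ~K× growth described above.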
Consequences documented in BENCHMARK.md:
- Don't shard for throughput on same-box same-data; shard for
reachability or memory bounds.
- Per-shard rerank factor reduction is an obvious M2 optimization:
fanning out at rerank=50 per shard when K≥2 keeps global recall above
90% while cutting the rerank cost by roughly K×. Measurement-driven,
not speculative.
- Real federation gain (disjoint data across network backends) is
genuine; this bench just doesn't measure it.
Rayon fan-out is NOT reverted — still correct for the miss-path prime
(1.97× / 3.86× speedup retained) and for remote-backend I/O overlap.
Co-Authored-By: claude-flow <ruv@ruv.net>
Iter 10 shipped the symmetric publish_bundle / refresh_from_bundle_dir
primitives with witness-authenticated handoff. The protocol is:
publisher → atomic-write table.rulake.json
reader → read, verify witness, compare, invalidate if different
Three-state refresh result (UpToDate / Invalidated / BundleMissing)
covers all the daemon's logging / alerting needs. Tampered sidecars
fail loudly instead of silently corrupting the cache.
Move the question from "still open" to "resolved in M1" and drop the
now-stale M2 placeholder.
Co-Authored-By: claude-flow <ruv@ruv.net>
Completes the sidecar loop (publish → disk → refresh). Given a key
and a directory, read the on-disk table.rulake.json and:
- UpToDate: witness matches cache pointer, nothing to do
- Invalidated: witnesses differ, cache pointer for key is dropped
- BundleMissing: no sidecar present (caller decides)
A corrupt/tampered sidecar surfaces as InvalidParameter via
RuLakeBundle::read_from_dir's witness verification — a poisoned
publish cannot silently invalidate the cache.
This is the minimal primitive a cache sidecar daemon needs. The
daemon itself is a ~10-line loop in user code: for each watched
(key, dir), call refresh_from_bundle_dir periodically or in
response to inotify events; handle the three outcomes.
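A minimal sketch of one pass of such a loop. The enum mirrors the three
outcomes above, but the closure stands in for refresh_from_bundle_dir,
whose real signature is not shown here; the (key, dir) -> Result shape and
the String error type are assumptions:

```rust
use std::path::{Path, PathBuf};

/// Mirrors the three outcomes named above; the real type lives in ruLake.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RefreshResult {
    UpToDate,
    Invalidated,
    BundleMissing,
}

/// One pass over the watch list. Returns (invalidated, missing, failed)
/// counts so the caller can log or alert on them.
fn sweep<F>(watched: &[(String, PathBuf)], mut refresh: F) -> (usize, usize, usize)
where
    F: FnMut(&str, &Path) -> Result<RefreshResult, String>,
{
    let (mut invalidated, mut missing, mut failed) = (0, 0, 0);
    for (key, dir) in watched {
        match refresh(key, dir) {
            Ok(RefreshResult::UpToDate) => {}
            Ok(RefreshResult::Invalidated) => invalidated += 1,
            Ok(RefreshResult::BundleMissing) => missing += 1,
            // Corrupt or tampered sidecar: count it, leave the cache alone.
            Err(_) => failed += 1,
        }
    }
    (invalidated, missing, failed)
}
```

A daemon would call `sweep` on a timer or on a filesystem event and decide
per deployment whether a missing bundle is an error.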
Closes the "cache sidecar daemon protocol" open question from
ADR-155. The protocol is: filesystem-based, witness-authenticated,
atomic-write on publish, three-state on refresh.
14 federation + 9 bundle + 3 fs_backend = 26 tests passing.
Co-Authored-By: claude-flow <ruv@ruv.net>
Pairs with iter 4's read_from_dir: given a registered (backend,
collection) key, emit the current table.rulake.json to a directory.
This is what a cache sidecar daemon calls when the warehouse triggers
a bundle refresh — the daemon publishes the new bundle, any serving
ruLake watching that directory swaps in the new witness on next
search.
Does NOT prime the cache — publish is a metadata emission, not a
data load. That keeps publish cheap and lets operators stage bundle
updates without moving any compressed data.
Test publish_bundle_roundtrips_through_disk: publish → read_from_dir
on a third party → witness matches what a cache prime would see.
13 federation + 9 bundle + 3 fs_backend = 25 passing. Clippy green.
Co-Authored-By: claude-flow <ruv@ruv.net>
M1 done and benchmarked. Update status from 'Proposed' → 'Accepted (M1)',
collapse the implementation-plan M1 bullet to reflect everything that
actually shipped on the branch, and move the open-question resolutions
into a dedicated "Resolved in M1" block.
New M1 evidence in the ADR:
- Intermediary tax 1.00× at n=100k on LocalBackend
- Byte-exact parity with direct RaBitQ at same (seed, rerank_factor)
- Rayon fan-out 1.97× (2-shard) / 3.86× (4-shard) prime-time speedup
- Recall@10 > 90% gate passes
- Witness-addressed cache sharing verified
- Send+Sync under 8-thread contention
Remaining open questions rewritten for M2 focus:
- Remote-backend tax measurement (Parquet-on-GCS prime)
- Cache sidecar daemon protocol for bundle handoff
- Push-down negotiation policy
- Cost accounting for pushed-down BQ work
Co-Authored-By: claude-flow <ruv@ruv.net>
8 threads × 50 queries against a shared RuLake, alternating single-shard
and federated calls. Validates:
- no deadlocks (bounded time to completion)
- no panics from the cache Mutex or backend RwLock under contention
- every returned hit is finite and the per-call result is sorted
- prime count stays at ≤ 2 (one per shard) — hits serve the rest
Closes the M3 "concurrent multi-client throughput" smoke item from
BENCHMARK.md. The Send + Sync bound on RuLake is now exercised, not
just declared.
12/12 federation + 9 bundle + 3 fs_backend tests passing (24 total).
Clippy -D warnings green.
Co-Authored-By: claude-flow <ruv@ruv.net>
First concrete adapter that reads real persistent data. Uses a simple
'ruvec1' binary format (8-byte magic + u64 count + u32 dim + records)
and takes the mtime as the generation token. This proves the full
bundle → witness → cache → search loop works against the filesystem
without pulling arrow/parquet deps — a real ParquetBackend reuses the
exact same shape, only the decoder and generation source change.
- current_bundle() reads only the 24-byte header to pick up dim —
real-backend hot-path ergonomics; a full pull per coherence check
would be catastrophic on a warehouse adapter.
- Atomic write via temp+rename so concurrent reads never observe a
torn record stream (matches the bundle sidecar write pattern).
- data_ref is 'file://<path>', anchoring the witness on the local
filesystem location — two FsBackends pointing at the same file
share the cache entry (content-addressed, per ADR-155).
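A header-only read in the spirit of current_bundle() might look like the
sketch below. It models only the fields listed above (magic, count, dim,
20 bytes in total); the 24-byte figure suggests padding or a reserved
field that is not modeled here. Little-endian integers and a magic of
"ruvec1" zero-padded to 8 bytes are assumptions:

```rust
use std::io::{self, Read};

/// Hypothetical header view for the 'ruvec1' layout described above.
#[derive(Debug, PartialEq, Eq)]
struct Header {
    count: u64,
    dim: u32,
}

/// Reads magic + count + dim without touching the record stream.
fn read_header<R: Read>(mut r: R) -> io::Result<Header> {
    let mut magic = [0u8; 8];
    r.read_exact(&mut magic)?;
    if &magic != b"ruvec1\0\0" {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "bad magic"));
    }
    let mut count = [0u8; 8];
    r.read_exact(&mut count)?;
    let mut dim = [0u8; 4];
    r.read_exact(&mut dim)?;
    Ok(Header {
        count: u64::from_le_bytes(count),
        dim: u32::from_le_bytes(dim),
    })
}
```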
Tests:
- fs_write_then_pull_roundtrip: write vectors, read them back bitwise.
- fs_bundle_has_file_uri_and_header_dim: verify witness + data_ref.
- fs_pull_rejects_bad_magic: magic-byte guard on pull.
- fs_backend_end_to_end_search_and_recache_on_mtime_bump (federation
smoke): full RuLake → FsBackend → mtime bump → re-prime cycle.
23/23 passing (9 bundle + 3 fs_backend + 11 federation). Clippy green.
Co-Authored-By: claude-flow <ruv@ruv.net>
search_federated now par_iters over targets so that cache-miss primes
(the expensive case — pulling from the backend + building a RabitqPlus
index) run concurrently per shard. Measured speedups in BENCHMARK.md:
n=100k: 1-shard prime 425ms → 2-shard 215ms (1.97×) → 4-shard 110ms (3.86×)
n= 50k: 1-shard prime 213ms → 2-shard 110ms (1.95×) → 4-shard 56ms (3.83×)
Warm-cache QPS on a single-threaded benchmark drops slightly because
rayon's par_iter startup cost is measurable when each query takes under
a millisecond. The win is in tail latency under miss and in real
remote-backend deployments where per-shard latency dominates; the bench
understates this.
Short-circuits on error (first shard to return Err wins), matching the
sequential loop's semantics.
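The fan-out shape, sketched with std scoped threads instead of rayon so
the example has no dependencies. Unlike rayon's short-circuit, this
version joins every shard before propagating the first error in shard
order. `fan_out` and the (id, distance) hit type are illustrative
assumptions:

```rust
use std::thread;

/// Run `search` once per shard on its own scoped thread and merge the hits.
/// Returns the first error in shard order if any shard fails.
fn fan_out<F>(shards: &[u32], search: F) -> Result<Vec<(u32, f32)>, String>
where
    F: Fn(u32) -> Result<Vec<(u32, f32)>, String> + Sync,
{
    let per_shard: Vec<Result<Vec<(u32, f32)>, String>> = thread::scope(|s| {
        // Spawn everything first so the shards actually run concurrently.
        let handles: Vec<_> = shards
            .iter()
            .map(|&shard| {
                let search = &search;
                s.spawn(move || search(shard))
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("shard thread panicked"))
            .collect()
    });
    let mut merged = Vec::new();
    for hits in per_shard {
        merged.extend(hits?);
    }
    merged.sort_by(|a, b| a.1.total_cmp(&b.1)); // ascending distance
    Ok(merged)
}
```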
Rayon pinned via workspace.dependencies (rayon = "1.10").
Co-Authored-By: claude-flow <ruv@ruv.net>
Direct dependency of the BQ UDF + cache sidecar daemon: the daemon
needs to read `table.rulake.json` off GCS (or a local mount) and
verify its witness before swapping in a new compressed entry.
- Atomic write via temp+rename so concurrent readers never see a
truncated sidecar (matches the pattern a warehouse-push path needs).
- Read verifies witness on-disk → malformed or tampered bundles
surface as InvalidParameter with a "witness" message.
- Canonical filename is exposed as SIDECAR_FILENAME so callers
don't hardcode the string.
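The temp+rename pattern, as a minimal sketch. `write_atomic` and the
".tmp.<pid>" suffix are illustrative, not the crate's actual helper or
naming; the rename is atomic on POSIX filesystems when source and target
are on the same mount:

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `dir/name` so readers see either the old file or the
/// complete new one, never a partial write.
fn write_atomic(dir: &Path, name: &str, bytes: &[u8]) -> io::Result<()> {
    let tmp = dir.join(format!("{name}.tmp.{}", std::process::id()));
    let mut f = File::create(&tmp)?;
    f.write_all(bytes)?;
    f.sync_all()?; // flush data before the rename makes it visible
    drop(f);
    fs::rename(&tmp, dir.join(name))
}
```

A crash before the rename leaves only a stray temp file behind, which is
why leftover .tmp.* files are harmless to readers of the canonical name.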
Tests:
- fs_roundtrip: write + read preserves witness + optional fields.
- fs_read_rejects_tampered_sidecar: edit dim on disk → read errors.
- fs_write_is_atomic_under_crash_simulation: leftover .tmp.* files
don't corrupt reads of the canonical sidecar.
19/19 passing (9 bundle + 10 federation). Clippy -D warnings green.
Co-Authored-By: claude-flow <ruv@ruv.net>
MVP shipped an unbounded cache. v1 must-have: a hard cap on the number
of distinct compressed entries, evicting the least-recently-used
*unpinned* (refcount=0) entry when the cap is exceeded.
Design note: entries pinned by a live `(backend, collection)` pointer
are never evicted, since dropping them would orphan a caller. If every
entry is pinned, the cap is temporarily exceeded rather than returning
an error. Correctness over strict bounds.
API:
- `VectorCache::with_max_entries(n)` — builder-mode cap.
- `RuLake::with_max_cache_entries(n)` — user-facing constructor flag.
- `RuLake::invalidate_cache(key)` — drop a pointer explicitly so its
entry becomes evictable.
- `CacheEntry.last_used` bumped on every search_cached; LRU picks the
oldest unpinned entry as victim.
Eviction runs opportunistically at the end of each prime when a cap
is set. Zero overhead when `max_entries == None` (default path).
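The eviction rule can be sketched as below. Field names follow this
commit, but the String key, the u64 logical clock, and the `evict_to_cap`
helper are simplifying assumptions:

```rust
use std::collections::HashMap;

/// Simplified cache entry.
struct Entry {
    refcount: usize, // live (backend, collection) pointers
    last_used: u64,  // logical clock, bumped on every hit
}

/// Evict least-recently-used unpinned entries until `cap` holds or only
/// pinned entries remain. Returns the evicted keys in eviction order.
fn evict_to_cap(entries: &mut HashMap<String, Entry>, cap: usize) -> Vec<String> {
    let mut evicted = Vec::new();
    while entries.len() > cap {
        let victim = entries
            .iter()
            .filter(|(_, e)| e.refcount == 0)
            .min_by_key(|(_, e)| e.last_used)
            .map(|(k, _)| k.clone());
        match victim {
            Some(k) => {
                entries.remove(&k);
                evicted.push(k);
            }
            None => break, // everything is pinned: exceed the cap rather than fail
        }
    }
    evicted
}
```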
Test: `lru_eviction_caps_entry_count_when_pointers_dropped` pins three
entries, invalidates one, asserts the cap=2 holds after the next
prime runs the sweep.
16/16 tests pass. Clippy clean under -D warnings.
Co-Authored-By: claude-flow <ruv@ruv.net>
Implements the reviewer's "use RVF witness chain hash as cache-key
anchor" design. Cache entries are now keyed by the RuLakeBundle
witness, not (backend_id, collection). Two backends advertising the
same logical dataset (same data_ref + seed + rerank + generation)
produce the same witness and share one compressed index.
## The change
### BackendAdapter::current_bundle() (new trait method)
Returns the backend's authoritative bundle for a collection. Default
impl synthesizes from `id() + generation()`; real backends override to
report a shared data_ref when they're replicas of the same source of
truth. LocalBackend overrides to avoid the default's pull-to-read-dim
round-trip.
### VectorCache: two-layer storage
- `entries: HashMap<WitnessKey, CacheEntry>` — content-addressed
- `pointers: HashMap<CacheKey, WitnessKey>` — (backend, collection) → witness
- `last_checked: HashMap<CacheKey, Instant>` — for Eventual-mode TTL
`CacheEntry` now carries a `refcount` so an entry is GC'd only when
its last pointer drops. New stat: `shared_hits` — incremented when a
pointer move finds the target witness already cached.
### RuLake::ensure_fresh flow
1. Eventual within TTL → skip check (fast).
2. Witness matches pointer → hit, no-op.
3. Witness mismatch, target witness already in pool (another pointer
has it) → just swap the pointer, zero prime work. This is the
cross-backend share.
4. Witness not in pool → pull + prime as before.
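The four branches above, reduced to a pure decision function. `Action`
and `decide` are hypothetical names and a u64 stands in for the witness;
the real method also performs the pull, the prime, and the pointer update:

```rust
#[derive(Debug, PartialEq, Eq)]
enum Action {
    SkipCheck,    // 1. Eventual mode, still within TTL
    Hit,          // 2. pointer already targets the current witness
    SwapPointer,  // 3. another pointer already primed this witness
    PullAndPrime, // 4. nobody has it yet
}

fn decide(within_ttl: bool, pointer: Option<u64>, current: u64, pool_has_current: bool) -> Action {
    if within_ttl {
        return Action::SkipCheck;
    }
    if pointer == Some(current) {
        return Action::Hit;
    }
    if pool_has_current {
        Action::SwapPointer
    } else {
        Action::PullAndPrime
    }
}
```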
### Prime is now race-tolerant
A concurrent thread racing to prime the same witness doesn't rebuild —
whichever thread gets the lock second observes the entry and drops
its own build. Two builds for the same witness are byte-identical by
determinism, so no data is lost.
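A minimal sketch of that insert-if-absent step, with the expensive build
done before the lock is taken. The u64 witness, the Vec<u8> payload, and
the `prime` signature are stand-ins rather than the crate's types:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Build outside the lock, then insert only if no other thread won the race.
/// Returns true when this call's build was the one stored.
fn prime<F>(pool: &Mutex<HashMap<u64, Vec<u8>>>, witness: u64, build: F) -> bool
where
    F: FnOnce() -> Vec<u8>,
{
    let built = build(); // slow path runs without holding the lock
    let mut guard = pool.lock().unwrap();
    if guard.contains_key(&witness) {
        return false; // the loser drops its own byte-identical build
    }
    guard.insert(witness, built);
    true
}
```

The trade-off is wasted CPU on the losing thread in exchange for never
blocking readers during a build.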
## Test added
`two_backends_share_cache_when_witness_matches` — uses a
`SharedLocalBackend` shim that overrides `current_bundle()` to advertise
a shared data_ref. Two distinct `LocalBackend`s behind shims report
identical witnesses; the second search finds `primes=1, shared_hits=1`
and only ONE compressed entry in the pool despite two pointers, with
`refcount_of(witness) == 2` (one reference per pointer).
## Lint + test status
```
cargo test -p ruvector-rulake --release ✓ 15/0
cargo clippy -p ruvector-rulake --release --all-targets -- -D warnings ✓ clean
cargo fmt -p ruvector-rulake -- --check ✓ clean
```
## Closes open question from earlier ADR review
"Cache invalidation drift" — the witness is now the cache-key anchor.
Backend generation bumps become witness changes; witness changes are
content-addressable so old entries can drop but shared ones survive.
"Where does freshness truth live?" — answered: in the bundle.
Co-Authored-By: claude-flow <ruv@ruv.net>
Applies the reviewer's architectural feedback (docs/research/ruLake/
chat thread): ruLake is a cache-first vector execution fabric, not a
federation engine. Federation is the cache's refill mechanism.
## Perf fix: cache prime now builds outside the lock
`VectorCache::prime()` previously built a fresh `RabitqPlusIndex`
(~400 ms at n=100k) while holding the cache mutex, serialising all
other queries. Now builds entirely before touching `inner`; the lock
is only taken to swap the finished entry in. No benchmark regression —
intermediary tax still 1.00× on LocalBackend at n=100k.
## New: bundle sidecar (`table.rulake.json`)
`ruvector_rulake::bundle` — the portable unit that defines ruLake's
reproducibility + governance scope. Flagged by the reviewer as more
important than the UDF because it's what travels between teams,
clouds, and backups.
Carries: `data_ref`, `dim`, `rotation_seed`, `rerank_factor`,
`generation`, `rvf_witness` (SHAKE-256 over the preceding fields),
`pii_policy`, `lineage_id`.
`Generation` is a serde-untagged union of `Num(u64)` (Parquet mtime,
Iceberg version, Snowflake offset) and `Opaque(String)` (UUIDs,
hashes, base64 blobs) — fixes the "u64 doesn't fit an Iceberg snapshot
id" open question from the M1 review.
Witness fn is domain-separated, length-prefixed, and verifiable via
`bundle.verify_witness()`. 6 new tests: determinism,
field-change-detection, length-prefix-anti-collision, serde roundtrip,
tamper-detection, format-version-downgrade-rejected.
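Why the length prefix matters, as a sketch of the hash pre-image only
(the SHAKE-256 step is omitted to stay dependency-free). The u64
little-endian prefix and the treatment of the domain tag as the first
field are illustrative assumptions, not the crate's wire format:

```rust
/// Append one field as (length, bytes) so field boundaries are unambiguous.
fn push_field(out: &mut Vec<u8>, field: &[u8]) {
    out.extend_from_slice(&(field.len() as u64).to_le_bytes());
    out.extend_from_slice(field);
}

/// Build the byte string that would be fed to the hash.
fn preimage(domain: &[u8], fields: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    push_field(&mut out, domain);
    for f in fields {
        push_field(&mut out, f);
    }
    out
}
```

Without the prefix, the field pairs ("ab", "c") and ("a", "bc") would
concatenate to the same bytes and collide; with it they differ.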
## New: recall-vs-brute-force gate
`rulake_recall_at_10_above_90pct_vs_brute_force`: the missing
correctness test. Builds brute-force L2 truth over 5k clustered
Gaussian vectors and asserts ruLake's recall@10 is ≥ 90% at rerank×20.
Uses the same n + cluster-count + methodology as
`ruvector-rabitq::BENCHMARK.md` so a regression shows up as a
divergence from the known-good estimator baseline.
## ADR-155 v2 — cache-first decision explicit
- Decision opens with "cache-first vector execution fabric; federation
is the refill mechanism", lifts the reviewer's 5-axis decision
matrix (cache-first wins 4/5 axes).
- New Decision §6 declares the bundle sidecar as the portable unit
(not the UDF) and documents how the witness acts as the cache-key
anchor, closing the "cache invalidation drift" failure mode.
## Test + lint status
```
cargo test -p ruvector-rulake --release ✓ 14/0
cargo clippy -p ruvector-rulake --release --all-targets -- -D warnings ✓ clean
cargo fmt -p ruvector-rulake -- --check ✓ clean
cargo run -p ruvector-rulake --release --bin rulake-demo -- --fast ✓ no regression
```
Co-Authored-By: claude-flow <ruv@ruv.net>
Added three scoped allows at lib + bin entry: `manual_div_ceil`,
`needless_range_loop`, `doc_overindented_list_items`. The first two
fire in hot-path SoA walks where the index variable is intentional
(manual bounds-unchecked access via `.add(i * n_words)`); the doc one
is a cosmetic nit. All 13 previous clippy warnings are now resolved.
cargo clippy -p ruvector-rabitq --release --all-targets -- -D warnings  ✓ clean
cargo test -p ruvector-rabitq --release                                ✓ 20 passed
cargo doc -p ruvector-rabitq --no-deps                                 ✓ clean
Co-Authored-By: claude-flow <ruv@ruv.net>
Replaces the per-entry `Vec<(usize, BinaryCode)>` storage (where each code
heap-allocated its own `Vec<u64>`) with flat struct-of-arrays:
ids: Vec<u32> — 4 B / vector
norms: Vec<f32> — 4 B / vector
packed: Vec<u64> — n × n_words contiguous slab
and adds a cos-lookup table keyed on the agreement count so the
`.cos()` call in the estimator drops to a single L1 indexed load.
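A sketch of such a table. It assumes the hyperplane-LSH angle estimate
theta = π · (1 − agree / D); the crate's exact estimator formula may
differ, and `build_cos_lut` is a hypothetical name:

```rust
use std::f32::consts::PI;

/// Precompute cos(theta) for every possible agreement count 0..=dim, so
/// the scan replaces a `.cos()` call with `lut[agree]`.
fn build_cos_lut(dim: usize) -> Vec<f32> {
    (0..=dim)
        .map(|agree| (PI * (1.0 - agree as f32 / dim as f32)).cos())
        .collect()
}
```

At D=128 the table holds 129 f32 values (516 bytes), small enough to stay
resident in L1 during a scan.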
Measured at n=100k, D=128 (same seeds, same dataset, same host):
| variant | before QPS | after QPS | Δ | r@10 |
|--------------------------|-----------:|----------:|------:|------:|
| Flat | 309 | 306 | — | 100.0%|
| RaBitQ sym no rerank | 1,176 | 3,639 | 3.09× | 8.1% |
| RaBitQ+ sym rerank×5 | 811 | 2,058 | 2.54× | 87.9% |
| RaBitQ+ sym rerank×20 | 544 | 957 | 1.76× |100.0% |
Flat's f32 baseline is unchanged (as expected — SoA only affects the
binary-code scan). Rerank×20 is now 3.13× over flat at 100% recall@10,
up from 1.76× in v1.
Memory also improved: `RabitqIndex` at n=100k drops from 5.8 MB to
2.4 MB = 21× compression vs flat (up from 8.7×), because the SoA layout
collapses the 40 B per-entry tuple+header overhead to 8 B per row.
Asymmetric path is unchanged — its O(D) scalar signed-dot-product
dominates; the SoA layout helps the outer walk but not the inner
arithmetic. SIMD gather is the next lever for that path.
Changes:
- `RabitqIndex` storage: SoA with u32 ids, f32 norms, flat u64 packed
slab. Adds `last_word_mask` (for D % 64 != 0) and `cos_lut` (D+1 f32s).
- New `RabitqIndex::symmetric_scan_topk()` — raw-pointer SoA walk with
aligned-D fast path (`D % 64 == 0` skips the last-word AND). Used by
both `RabitqIndex::search` and `RabitqPlusIndex::search`.
- `TopK::push_raw(id, score, pos)` + `into_sorted_with_pos()` so rerank
can look up `originals[pos]` in O(1) without repacking IDs.
- `RabitqAsymIndex::search` walks SoA directly (kernel still O(D)).
- `.codes()` accessor replaced with SoA accessors (`ids()`, `norms()`,
`packed()`, `cos_lut()`, `n_words()`); `codes_materialised()` returns
the boxed AoS view for back-compat at O(n) allocation cost.
- New `encode_query_packed()` — returns just the packed words so the
hot scan doesn't allocate a BinaryCode box per search.
All 20 tests still pass (including the D=100 non-aligned regression;
the fast path is gated on `last_word_mask == !0`, so unaligned D falls
to the masked code path).
BENCHMARK.md updated with before/after table and the "what changed"
narrative.
Co-Authored-By: claude-flow <ruv@ruv.net>
Fixes the four concrete bugs and three integrity issues surfaced in the
deep review of commit f2dbb6efb, adds the SIGMOD-2024-style asymmetric
IP estimator, and ships a single-source-of-truth benchmark harness whose
numbers are reproducible with one `cargo run`.
### Bug fixes
1. **Padding-safe popcount** (`quantize.rs::masked_xnor_popcount`). At
`D % 64 != 0` the zero padding of the last u64 was being counted as
matching bits, biasing the estimator. New method masks the unused MSBs
of the last word before popcount. Regression test at D=100 pins it
(raw XNOR returns 28 matches for opposite vectors; masked returns 0).
2. **Honest memory accounting**. `RabitqIndex` previously stored the
original f32 vectors unconditionally but omitted them from
`memory_bytes()`. Fixed by (a) dropping `originals` from `RabitqIndex`
entirely — rerank lives in `RabitqPlusIndex` only, and (b) including
all allocations in `memory_bytes()` for every variant. New test
`memory_accounting_is_honest` enforces `RabitqIndex < Flat` and
`RabitqPlusIndex > Flat` (since the latter truly stores both).
3. **NaN-safe sort**. Replaced every `partial_cmp().unwrap()` with
`f32::total_cmp`; a rogue NaN (possible near the `.cos()` domain
edge) now sorts to the back instead of panicking search. New test
`nan_query_does_not_panic`.
4. **Renamed misleading test**. `rabitq_recall_at_10_above_70pct` was
asserting `> 0.20`. Renamed to `rabitq_recall_above_random`.
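The padding issue from bug fix 1 can be reproduced in a few lines. This
is a re-creation for illustration, not the code in `quantize.rs`;
`masked_agreement` is a hypothetical name and it assumes valid bits
occupy the low end of the last word:

```rust
/// Count agreeing bits between two packed sign codes of `dim` bits.
/// Bits past `dim` in the last word are padding and are masked out.
fn masked_agreement(a: &[u64], b: &[u64], dim: usize) -> u32 {
    let tail = dim % 64;
    let last_mask = if tail == 0 { !0u64 } else { (1u64 << tail) - 1 };
    let last = a.len().saturating_sub(1);
    a.iter()
        .zip(b)
        .enumerate()
        .map(|(i, (x, y))| {
            let agree = !(x ^ y); // XNOR: 1 where the bits match
            if i == last {
                (agree & last_mask).count_ones()
            } else {
                agree.count_ones()
            }
        })
        .sum()
}
```

At D=100 the last word has 36 valid bits and 28 zero padding bits; the
padding XNORs to ones, which is where the spurious 28 matches came from.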
### Algorithm upgrade
5. **Asymmetric estimator** (`quantize.rs::estimated_sq_distance_asymmetric`).
Query stays in f32 (rotated once), database stays 1-bit. IP is
reconstructed by summing the rotated query's components with per-dim
signs read from the stored code and rescaling by 1/√D. O(D) per
candidate vs O(D/64) popcount — slower but tighter. Closes the gap
between this crate and the SIGMOD 2024 RaBitQ estimator (the prior
code was Charikar-2002 hyperplane-LSH on a rotated basis). Exposed
as `RabitqAsymIndex` with optional rerank.
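The inner-product reconstruction described in item 5, as a minimal
sketch. The LSB-first bit order and the `asymmetric_ip` name are
assumptions; the crate's function returns a squared distance built on
top of a term like this:

```rust
/// Estimate <q, x> from a rotated f32 query and a packed 1-bit code:
/// add q[i] where the stored sign bit is 1, subtract where it is 0,
/// then rescale by 1/sqrt(D).
fn asymmetric_ip(rotated_query: &[f32], code: &[u64]) -> f32 {
    let dim = rotated_query.len();
    let signed_sum: f32 = rotated_query
        .iter()
        .enumerate()
        .map(|(i, q)| {
            if (code[i / 64] >> (i % 64)) & 1 == 1 {
                *q
            } else {
                -*q
            }
        })
        .sum();
    signed_sum / (dim as f32).sqrt()
}
```

Each candidate costs D additions here, versus D/64 popcounts on the
symmetric path, which is the speed gap noted above.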
### Optimizations
6. **Top-k via bounded max-heap**: O(n log k) per search instead of
O(n log n) sort. Matters at n ≥ 10 000.
7. **Single query rotation per search** amortised across all candidates
for both symmetric and asymmetric paths.
8. **Stricter full-pairs orthogonality test** at D ∈ {64, 128, 256} —
previous test only checked (row 0, row 1) of a 64×64 matrix.
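Item 6 combined with the NaN-safe ordering from bug fix 3, as a
std-only sketch. `Scored` and `top_k` are illustrative names, not the
crate's `TopK` type:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// (distance, id) ordered by distance via total_cmp, so NaN cannot panic.
struct Scored(f32, u32);

impl Ord for Scored {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0).then(self.1.cmp(&other.1))
    }
}
impl PartialOrd for Scored {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Scored {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Scored {}

/// Keep the k smallest distances with a max-heap of size k: O(n log k).
fn top_k(distances: &[f32], k: usize) -> Vec<(u32, f32)> {
    let mut heap = BinaryHeap::with_capacity(k + 1);
    for (id, &d) in distances.iter().enumerate() {
        heap.push(Scored(d, id as u32));
        if heap.len() > k {
            heap.pop(); // drop the current worst (largest distance)
        }
    }
    heap.into_sorted_vec().into_iter().map(|s| (s.1, s.0)).collect()
}
```

Under `total_cmp` a positive NaN orders above every finite value, so it
is the first element popped and never reaches the result.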
### Honest benchmarks
The new `rabitq-demo` binary produces recall@1/@10/@100, QPS, memory, and
build time for all four indexes on the SAME clustered dataset, across
n ∈ {1 k, 5 k, 50 k, 100 k}. Headline numbers (Ryzen-class, single thread):
| config (n=100k, D=128) | r@10 | QPS | mem/MB | vs flat |
|---|---:|---:|---:|---:|
| FlatF32 | 100.0% | 309 | 50.4 | — |
| Sym rerank×5 | 87.9% | 811 | 56.9 | 2.6× |
| Sym rerank×20 | 100.0% | 544 | 56.9 | 1.76× |
Scaling regression of rerank×5 from 100% at n=5k to 87.9% at n=100k is
now explicitly documented (it was hidden in the previous gist).
Mem: codes-only at n=100k is 5.8 MB vs Flat's 50.4 MB = 8.7× compression
on the real index, 32× per-vector codes-vs-f32.
### Test count
- Before: 10 tests
- After: 20 tests (including 2 non-aligned-D regression tests, NaN
safety, asymmetric-vs-symmetric ordering, full-pairs orthogonality
at D=64/128/256, memory accounting, heap top-k ordering).
### Writing
- `lib.rs` doc block now honestly describes the two estimators and
doesn't claim pure-std (deps: rand, rand_distr, serde, thiserror).
- New `BENCHMARK.md` captures every number with the seed and reproducer.
- Doc comments throughout the crate reference the SIGMOD 2024 paper
accurately: the symmetric path is Charikar-style, the asymmetric is
RaBitQ-2024-style; both are shipped.
### What's NOT shipped yet (named)
- SIFT1M / GIST1M / DEEP10M benchmarks (still on Gaussian clusters).
- HNSW integration (the production shape).
- SIMD popcount via `std::arch` (scalar POPCNT is used today).
- Parallel search via `rayon` (feature-gated, off by default).
20 tests pass. Benchmark reproducer: `cargo run --release -p
ruvector-rabitq --bin rabitq-demo`.
Co-Authored-By: claude-flow <ruv@ruv.net>
Main recently merged ADR-151 (Miller-Rabin prime optimizations, PR #358)
and ADR-152 is reserved for Obsidian Brain Plugin (ADR-SYS-152), so
renumber the kalshi integration ADR to 153 to avoid collision.
- Rename docs/adr/ADR-151-kalshi-neural-trader-integration.md →
docs/adr/ADR-153-kalshi-neural-trader-integration.md
- Update 5 references: workspace Cargo.toml comment, the two kalshi
crate descriptions, the lib.rs doc-comment, and the ADR title line.
- Resolve .gitignore: keep both trailing additions (.kalshi + bench_data/).
Co-Authored-By: claude-flow <ruv@ruv.net>
New cargo examples under crates/ruvector-kalshi/examples/:
- list_markets.rs
Authenticated GET /markets against the live API. Tested against
api.elections.kalshi.com — returned 100 real markets (sports parlays,
cross-category bundles), proving the REST + sig path end-to-end.
- stream_orderbook.rs
Live WebSocket consumer. Uses ws_client::reconnect_forever +
FeedDecoder and prints canonical MarketEvents. Configurable via
argv tickers, KALSHI_MAX_EVENTS (default 50), KALSHI_WS_URL.
- live_trade.rs
Full live execution runner: WS -> FeedDecoder -> Strategy ->
CoherenceChecker -> RiskGate -> RestClient::post_order. Triple-
gated — requires KALSHI_ENABLE_LIVE=1, KALSHI_CONFIRM_LIVE=yes, and
a non-zero KALSHI_MAX_ORDERS cap before any signed request is
emitted. Conservative defaults: 0.10 Kelly fraction, 10_000¢
bankroll, 5% position cap, 2% daily-loss kill, 500 bps min edge.
Verified to fail closed when the env flag is absent.
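The triple gate in live_trade.rs amounts to a check like the sketch
below. `live_trading_allowed` is a hypothetical helper that takes the raw
env values as arguments so it can be tested without touching the process
environment:

```rust
/// True only when all three gates are open: KALSHI_ENABLE_LIVE=1,
/// KALSHI_CONFIRM_LIVE=yes, and a non-zero KALSHI_MAX_ORDERS.
fn live_trading_allowed(
    enable: Option<&str>,
    confirm: Option<&str>,
    max_orders: Option<&str>,
) -> bool {
    // A missing or unparsable cap counts as zero, so the gate stays closed.
    let cap = max_orders
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(0);
    enable == Some("1") && confirm == Some("yes") && cap > 0
}
```

A caller would pass values such as
`std::env::var("KALSHI_ENABLE_LIVE").ok().as_deref()`.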
paper_trade.rs:
- Now async (#[tokio::main]) to enable brain I/O in the fill path.
- When BRAIN_ENABLE=1, loads BRAIN_API_KEY from env or gcloud secret
BRAIN_SYSTEM_KEY and calls BrainClient::share per approved order.
- Run output unchanged: 7 intents / 1 coherence block / 6 approvals /
6 receipts / 4 replay segments / 3 retrievable.
Co-Authored-By: claude-flow <ruv@ruv.net>
AttentionScalper now supports a scaled-dot-product attention path when
AttentionScalperConfig::use_sdpa = true. Levels are encoded as
[size_log, side_sign, depth_idx_norm, 1.0] and fed into
ruvector_attention::ScaledDotProductAttention with a fixed pressure
query. The context vector's sign component becomes the signed
imbalance.
- neural-trader-strategies depends on ruvector-attention (default
features disabled so it stays portable).
- sdpa_imbalance() guards NaN/empty inputs and returns 0 on error, so a
misconfigured attention layer cannot corrupt downstream decisions.
- Geometric-decay path remains the default and is unchanged.
- 2 new tests: heavy YES → YES intent, heavy NO → NO intent, both via
the SDPA path end-to-end.
26 strategy tests pass (was 24). ruvector-kalshi 36 tests pass.
paper_trade example unchanged: 6 fills, 4 replay segments, 6 witness
receipts.
Co-Authored-By: claude-flow <ruv@ruv.net>
neural-trader-strategies:
- Depend on neural-trader-coherence.
- New coherence_bridge module: CoherenceChecker wraps a CoherenceGate
and returns CoherenceOutcome::{Pass, Block} around an Intent. On gate
error we fail closed (never authorize actuation). simple_context()
builds a plausible GateContext from a rolling price window.
- Re-export CoherenceDecision, CoherenceGate, GateConfig, GateContext,
RegimeLabel, ThresholdGate, CoherenceChecker, CoherenceOutcome.
- 3 new tests (24 total): healthy context passes, low mincut blocks,
simple_context correctly classifies volatile regime.
ruvector-kalshi:
- Depend on neural-trader-coherence and neural-trader-replay.
- examples/paper_trade.rs rewritten to include coherence pre-check and
replay storage:
FeedDecoder → MarketEvent
→ ExpectedValueKelly.on_event
→ CoherenceChecker.check (ThresholdGate tuned for Kalshi depth)
→ RiskGate.evaluate
→ intent_to_order → NewOrder
→ ReservoirStore.maybe_write(ReplaySegment)
→ InMemoryReceiptLog.append_receipt(WitnessReceipt)
Observed depth is carried across frames so ticker/trade events
inherit the mincut floor from the last snapshot. CUSUM uses only
trade/ticker mids, not per-level snapshot prices.
- Run result: 7 intents emitted, 1 coherence-blocked, 6 risk-approved,
6 witness receipts, 4 replay segments stored and retrievable.
Tests: 60 unit (36 + 24). Live /exchange/status smoke still green.
Co-Authored-By: claude-flow <ruv@ruv.net>
Signer:
- api_key now Arc<str>, signing_key now Arc<SigningKey<Sha256>>. Clone
is O(1) (atomic fetch_add) instead of a 2048-bit RSA deep-copy.
Measured at 75 ns/iter in release (1M iters) — previously bound by
RsaPrivateKey::clone which deep-copies BigUint fields.
RestClient:
- base_url + pre-computed base_path stored as Arc<str>; sig_path_for()
formats against the cached base_path instead of reqwest::Url::parse
on every request. Measured at 14 ns/iter — the old path was a full
URL parse + to_string per call.
- RestClient::clone is also O(1) as a consequence.
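The Arc-sharing idea in miniature. `Handle` is a hypothetical stand-in
for the signer and client, and the string values are placeholders; the
point is that clone bumps a reference count and the path is built from a
cached prefix instead of re-parsing a URL:

```rust
use std::sync::Arc;

/// Immutable config shared by reference count.
#[derive(Clone)]
struct Handle {
    api_key: Arc<str>,
    base_path: Arc<str>,
}

/// Format the signing path against the cached prefix; no URL parse.
fn sig_path_for(h: &Handle, endpoint: &str) -> String {
    format!("{}{}", h.base_path, endpoint)
}
```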
Benchmark example:
- examples/bench_signing.rs reports clone / sign / sig_path numbers.
Release numbers on the real Kalshi PEM:
signer.clone 75.5 ns
sign_with_ts 0.78 ms (1284 sig/s — RSA-PSS floor)
sig_path_for 13.9 ns
All 57 unit tests, paper_trade example, and live /exchange/status
smoke test pass on the optimized paths.
Co-Authored-By: claude-flow <ruv@ruv.net>
ruvector-kalshi:
- ws_client: tokio-tungstenite + futures-util. connect() signs upgrade
with RSA-PSS-SHA256; subscribe() sends a typed command; pump_frames()
routes every text frame through FeedDecoder into an mpsc<MarketEvent>
channel; reconnect_forever() does exponential backoff up to 30s.
- brain: pi.ruv.io client. SharedMemory::market_resolution builds a
redacted pattern memory; BrainClient::share POSTs with Bearer auth to
/v1/memories. Debug never leaks key material.
- URL migration: Kalshi moved from trading-api.kalshi.com to
api.elections.kalshi.com; defaults updated. GCS secret bumped.
- tests/live_smoke.rs: #[ignore]-gated GET /exchange/status. Verified
live response: 200 OK, {"exchange_active":true,"trading_active":true}.
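The reconnect schedule can be sketched as capped exponential backoff.
Only the 30 s ceiling comes from this commit; the 500 ms base, the
doubling factor, and the absence of jitter are assumptions:

```rust
use std::time::Duration;

/// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt,
/// capped at 30 s.
fn backoff_delay(attempt: u32) -> Duration {
    const BASE_MS: u64 = 500;
    const CAP_MS: u64 = 30_000;
    // checked_shl returns None once the shift would overflow the type.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(BASE_MS.saturating_mul(factor).min(CAP_MS))
}
```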
neural-trader-strategies:
- coherence_arb: pair-wise price divergence arbitrage. Configure
(reference, mirror) symbol ids; emit YES buy on mirror when the price
gap exceeds min_divergence_bps. Quarter-Kelly sizing.
- attention_scalper: multi-level order-book imbalance with geometric
level decay and EMA smoothing; emits a short YES or NO position when
the smoothed signal crosses abs_threshold. Deterministic, no ML dep.
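A rough sketch of the attention_scalper signal. The normalisation to
[-1, 1], the parameter names, and the EMA form are assumptions; only the
geometric level decay and the EMA smoothing come from the description
above:

```rust
/// Depth weighted by decay^level, best level first.
fn weighted_depth(levels: &[f64], decay: f64) -> f64 {
    levels
        .iter()
        .enumerate()
        .map(|(i, size)| size * decay.powi(i as i32))
        .sum()
}

/// Signed imbalance in [-1, 1]: positive when YES depth dominates.
fn decayed_imbalance(yes: &[f64], no: &[f64], decay: f64) -> f64 {
    let (y, n) = (weighted_depth(yes, decay), weighted_depth(no, decay));
    if y + n == 0.0 {
        0.0 // empty book: no signal
    } else {
        (y - n) / (y + n)
    }
}

/// One EMA step; `alpha` in (0, 1] weights the newest sample.
fn ema(prev: f64, sample: f64, alpha: f64) -> f64 {
    alpha * sample + (1.0 - alpha) * prev
}
```

A strategy would emit an intent once the smoothed value's magnitude
exceeds abs_threshold.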
CI:
- .github/workflows/kalshi-nightly.yml: unit tests, offline validator,
paper-trade example, and live /exchange/status smoke under --ignored.
Tests: 57 unit (ruvector-kalshi 36, neural-trader-strategies 21) + 1
live smoke. All green against the real Kalshi endpoint.
Co-Authored-By: claude-flow <ruv@ruv.net>
New crate ruvector-kalshi: RSA-PSS-SHA256 signer (PKCS#1/#8), GCS/local/env
secret loader with 5-min cache, typed REST + WS DTOs, Kalshi→MarketEvent
normalizer (reuses neural-trader-core), transport-free FeedDecoder,
reqwest-backed REST client with live-trade env gate, and an offline
sign+verify example that validates against the real PEM.
New crate neural-trader-strategies: venue-agnostic Strategy trait, Intent
type, RiskGate (position cap, daily-loss kill, concentration, min-edge,
live gate, cash check), and ExpectedValueKelly prior-driven strategy.
36 unit tests pass across both crates. End-to-end offline validation
confirmed against the real Kalshi PEM via both local and GCS sources.
Co-Authored-By: claude-flow <ruv@ruv.net>