The connectome-fly UI now runs the real FlyWire brain end-to-end:
115,151 neurons, 2,676,592 unique synapses (from 3.78M Princeton rows
aggregated per (pre, post)), 2,590 sensory neurons auto-detected.
Changes:
- src/connectome/flywire/princeton.rs: new gzipped-CSV loader for the
Princeton codex.flywire.ai format (neurons.csv.gz +
connections_princeton.csv.gz). Uses #[serde(rename = "...")] to map
"Root ID" / "pre_root_id" / "Predicted NT type" / etc. to the
existing NeuronMeta schema. Aggregates per-neuropil rows on the fly
into per-(pre, post) synapse counts. Zero dangling ids on the
shipped dataset.
- src/bin/ui_server.rs: CONNECTOME_FLYWIRE_PRINCETON_DIR env var
selects the Princeton path; falls through to v783 TSV then
synthetic SBM. Observer's detect_every_ms backs off to 500 ms at
N ≥ 10k, and CONNECTOME_SKIP_FIEDLER=1 disables the Fiedler detector
entirely (the eigensolver is O(window_spikes²) to O(n³) and stalls
the stream at 115k neurons without one of those mitigations).
- examples/connectome-fly/assets/{neurons,connections_princeton}.csv.gz:
the 2.1 MB + 26 MB Princeton dump, committed under assets/ so the
example is self-contained. Clone size +28 MB.
- Cargo.toml: flate2 1.0 dependency (already pinned elsewhere in the
workspace for ruvector-cli / ruvector-snapshot).
- flywire/mod.rs: pub use princeton::load_flywire_princeton.
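The per-(pre, post) aggregation described for princeton.rs can be sketched as below. This is a minimal illustration, not the loader itself: `ConnectionRow`, its field names, and `aggregate_rows` are hypothetical, and the real code reads rows from the gzipped CSV rather than from an in-memory iterator.

```rust
use std::collections::HashMap;

/// One per-neuropil row of the connections file.
/// Field names are illustrative, not the loader's actual schema.
pub struct ConnectionRow {
    pub pre_root_id: u64,
    pub post_root_id: u64,
    pub syn_count: u32,
}

/// Sum per-neuropil rows into one synapse count per (pre, post) pair.
pub fn aggregate_rows<I>(rows: I) -> HashMap<(u64, u64), u32>
where
    I: IntoIterator<Item = ConnectionRow>,
{
    let mut totals: HashMap<(u64, u64), u32> = HashMap::new();
    for row in rows {
        // Rows for the same pair in different neuropils collapse here.
        *totals
            .entry((row.pre_root_id, row.post_root_id))
            .or_insert(0) += row.syn_count;
    }
    totals
}
```

Keying on the ordered pair keeps (a, b) and (b, a) as distinct directed connections, which is what a pre/post synapse table needs.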
Run it:
cargo build --release --bin ui_server
CONNECTOME_FLYWIRE_PRINCETON_DIR=examples/connectome-fly/assets \
CONNECTOME_SKIP_FIEDLER=1 \
CONNECTOME_SKIP_COMMUNITIES=1 \
./target/release/ui_server
cd examples/connectome-fly/ui && npm run dev
Measured on a commodity host:
  with CONNECTOME_SKIP_FIEDLER=1 → 49 sim-ticks / 5 s wall, 2.2 M
                                   real spikes after 5 s
  with detector default 5 ms     → 4 sim-ticks / 10 s wall (Fiedler λ₂
                                   on the 100 k-spike co-firing window
                                   dominates)
Browser validation (agent-browser): banner reads "engine=rust-lif
substrate=flywire-princeton-csv n=115,151 syn=2,676,592 witness=…",
tick advances past 123, real_spikes_total > 6 M within a few seconds,
zero console errors.
This closes the "can we run the entire fly brain, not just 1024
neurons" question. Open follow-up: raster UI still bins spikes modulo
208 rows — at 115 k neurons that's ~550× overloaded, so the canvas
mostly dims out. Proper per-module binning or downsampling is a UI
task, not an engine task.
Co-Authored-By: claude-flow <ruv@ruv.net>
ui_server now reads CONNECTOME_FLYWIRE_DIR and switches from the
default synthetic SBM to the streaming FlyWire v783 loader
(examples/connectome-fly/src/connectome/flywire/streaming.rs) when
set. The substrate label and synapse count propagate through:
/status → substrate="flywire-v783-tsv", connectome.num_synapses
/stream hello event → same substrate tag
UI banner → "engine=rust-lif substrate=flywire-v783-tsv n=… syn=…"
Smoke-tested with the built-in 100-neuron fixture:
cargo run --release --bin materialize_fixture /tmp/flywire-fixture
CONNECTOME_FLYWIRE_DIR=/tmp/flywire-fixture \
cargo run --release --bin ui_server
→ server boots, substrate="flywire-v783-tsv", n=100, synapses=159
→ stream delivers 2142 ticks in 2.5s (small-N is fast)
→ browser end-to-end: substrate tag visible, tick=4516,
n_spikes_total=152623 after a few seconds, zero console errors
Added:
- src/bin/materialize_fixture.rs — one-off writer for the TSV fixture
- [[bin]] materialize_fixture in Cargo.toml
- ConnectomeSource enum in ui_server.rs (SyntheticSbm | Flywire)
- CONNECTOME_SKIP_COMMUNITIES=1 opt-out for huge substrates where the
CPM snapshot would stall the SSE loop (already throttled to every
2 s of sim time for n ≥ 8k)
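A minimal sketch of the source selection, assuming the `Flywire` variant carries the directory path (the payload type is a guess; only the variant names come from this commit). The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
use std::path::PathBuf;

/// Which substrate the server should load.
/// Variant names follow the commit; the PathBuf payload is an assumption.
#[derive(Debug, PartialEq)]
pub enum ConnectomeSource {
    SyntheticSbm,
    Flywire(PathBuf),
}

/// Pick the FlyWire loader when CONNECTOME_FLYWIRE_DIR is set,
/// otherwise fall back to the default synthetic SBM.
pub fn select_source<F>(lookup: F) -> ConnectomeSource
where
    F: Fn(&str) -> Option<String>,
{
    match lookup("CONNECTOME_FLYWIRE_DIR") {
        Some(dir) => ConnectomeSource::Flywire(PathBuf::from(dir)),
        None => ConnectomeSource::SyntheticSbm,
    }
}
```

In a binary this would be called as `select_source(|key| std::env::var(key).ok())`.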
To run against the real ~139k-neuron dataset, download the FlyWire
v783 release and point CONNECTOME_FLYWIRE_DIR at the directory
containing neurons.tsv + connections.tsv + classification.tsv. The
Fiedler detector will likely need tuning at that scale (see ADR-154
§16 and discovery #7 for the open eigensolver-at-scale story).
Co-Authored-By: claude-flow <ruv@ruv.net>
Fine module sweep around the item-26 N=512 peak:
modules=15 → 0.638 @ γ=4.8
modules=17 → 0.620 @ γ=4.4
modules=19 → 0.671 @ γ=4.4 ← new best (30 communities vs 19 truth)
modules=20 → 0.599 @ γ=4.0 (old headline)
modules=21 → 0.540 @ γ=4.0
modules=23 → 0.568 @ γ=4.4
modules=25 → 0.550 @ γ=4.4
At modules=20 the hub axis is flat (hub=0,1,2 all ≈ 0.60). The
item-26 step-of-5 module sweep missed the 19-module sweet spot
entirely — "step=1 unit matters" extends item 24's "coarse-γ
understates" discipline point.
AC-3a gap narrows from 1.25× (item 26) to **1.12× (0.671 vs 0.75)**.
Three rows of the fine grid beat the previous headline; the peak is
unimodal between modules=17 and 21, centred at 19.
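The headline arithmetic can be checked with a few lines. The sketch below hard-codes the seven published rows; `FINE_SWEEP`, `best_row`, and `gap_ratio` are illustrative names, not part of tests/leiden_cpm.rs.

```rust
/// Published fine-sweep rows: (num_modules, peak full ARI, gamma at peak).
pub const FINE_SWEEP: [(u32, f64, f64); 7] = [
    (15, 0.638, 4.8),
    (17, 0.620, 4.4),
    (19, 0.671, 4.4),
    (20, 0.599, 4.0),
    (21, 0.540, 4.0),
    (23, 0.568, 4.4),
    (25, 0.550, 4.4),
];

/// Row with the highest ARI, or None for an empty sweep.
pub fn best_row(rows: &[(u32, f64, f64)]) -> Option<(u32, f64, f64)> {
    rows.iter().copied().max_by(|a, b| a.1.total_cmp(&b.1))
}

/// Target ARI divided by achieved ARI; 1.0 means the target is met.
pub fn gap_ratio(target: f64, achieved: f64) -> f64 {
    target / achieved
}
```

With the 0.75 target, `gap_ratio(0.75, 0.671)` is about 1.118, which rounds to the quoted 1.12×.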
- tests/leiden_cpm.rs: leiden_cpm_fine_2d_grid_at_n512
- ADR-154 §17 row 30 + heading 29 → 30
Co-Authored-By: claude-flow <ruv@ruv.net>
#28 (null): hub_modules ∈ {0, 1, 2, 3, 4, 6, 8} at N=1024/40-modules.
Peak stays at hub=3 → 0.516. hub ∈ [0, 2] cluster at 0.487–0.488;
hub ≥ 4 collapses to 0.37–0.43. Narrow non-monotonic peak, not a
smooth ridge. The "smaller hub wins" pattern from N=512 does NOT
generalise to N=1024 — 2nd ADR-level case of "hypothesis from small-N
extrapolates wrong at large N" (1st was item 22 on fixed γ).
#29: fine num_modules ∈ {20, 25, 30, 35, 40, 50, 60, 80} at N=1024/
hub=3. New N=1024 peak: 0.531 @ modules=30 (density 34.1), γ=3.0
(70 communities vs 30 truth). Secondary peak at modules=80/γ=2.5
scores 0.515 — multi-modal landscape confirmed.
Finding: at N=1024 the optimal density is 34.1 neurons/module, not
25.6. At N=512 it's 25.6. The 4-D landscape (N × density × γ × hub)
does not factorize. AC-3a gap at N=1024 now 1.41× (down from 1.47×).
Best-across-scales remains 0.599 @ (N=512, modules=20, hub=1, γ=4.0)
— 1.25× gap.
- tests/leiden_cpm.rs: leiden_cpm_hub_fraction_sweep_at_n1024,
leiden_cpm_module_count_sweep_at_n1024_hub3
- ADR-154 §17 rows 28, 29 + heading 27 → 29
Co-Authored-By: claude-flow <ruv@ruv.net>
Fixed neurons/module ≈ 25.6 (the item-26 N=512 sweet spot). Varied
N ∈ {256, 512, 1024, 2048} with num_modules = N/25.6. γ sweep at each.
Per-scale peaks:
N=256 → 0.466 @ γ=5.0 (6 communities vs 10 truth)
N=512 → 0.554 @ γ=4.0 (23 vs 20; lower than #26's 0.599 because
hub_modules=2 here vs 1 in #26)
N=1024 → 0.516 @ γ=2.5 (96 vs 40) ← +21 % vs the 0.425 default
N=2048 → 0.343 @ γ=2.0 (257 vs 80)
Findings:
- The "ARI peaks at N=512" claim (item 24) was density-dependent, not
a universal property. At density=25.6, N=1024 scores 0.516, well
above its density=14.6 headline of 0.425.
- Landscape is 3D (N × num_modules × γ), not 2D (N × γ).
- hub_modules is a hidden 4th axis — the N=512 peak dropped from
0.599 (hub=1) to 0.554 (hub=2) at otherwise-identical config.
- γ-peak still monotonic in N: 5.0 → 4.0 → 2.5 → 2.0.
New claim: CPM ceiling on this substrate is ~0.55–0.60 across the
(N ∈ [384, 1024], density ∈ [20, 26], γ ∈ [2, 4], hub ∈ [5–10 %])
region. AC-3a gap is 1.25×–1.40× the 0.75 SOTA target.
- tests/leiden_cpm.rs: leiden_cpm_cross_scale_constant_density_at_25
- ADR-154 §17 row 27 + heading 26→27
Co-Authored-By: claude-flow <ruv@ruv.net>
New binary examples/connectome-fly/src/bin/ui_server.rs stands up a
zero-dep HTTP + Server-Sent-Events server on 127.0.0.1:5174 that
drives a fresh Engine + Observer + CPM-Leiden per connection, feeding
real spike events, real Fiedler λ₂ values, and real community
snapshots to the Vite UI.
Changes:
- src/bin/ui_server.rs: new std::net-only server with:
GET /status → engine identity, connectome config, witness, mock=false
GET /stream → SSE with hello + tick + communities events
pulse_train stimulus pushed ONCE (fix: run_with re-pushes on every
call — the naive per-tick re-apply was a 1000× regression on
stream throughput; now >45 ticks/sec via raw TCP)
- src/observer/core.rs: added latest_fiedler() + fiedler_baseline_mean()
plus an internal last_fiedler field so the server can publish every
detected λ₂, not just the events that crossed threshold
- Cargo.toml: second [[bin]] entry for ui_server
- ui/vite.config.js: /api/* proxy (retained for /api/status; stream
connects direct to :5174 because http-proxy buffers SSE)
- ui/src/modules/dynamics.js: Web Worker REMOVED; replaced with
EventSource('http://localhost:5174/stream') that hydrates the same
buffer/canvas path with real spikes. Added [CONNECTOME-OS REAL]
console logger for hello, first-tick, every 200th tick, and every
community snapshot — serves as the "no mocks" witness.
- ui/index.html: topbar engine stat replaced with #real-backend-banner
that flips pending → live → down and reads the Rust status
- ui/src/styles/layout.css: tri-state color for the banner
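For reference, the wire format behind the hello / tick / communities events is plain text. The helper below is a sketch of standard Server-Sent-Events framing using only std; `sse_frame` is a hypothetical name and the JSON payload shape is not taken from ui_server.rs.

```rust
/// Format one Server-Sent-Events frame: an `event:` line, one `data:`
/// line per payload line, and a blank line that terminates the frame.
pub fn sse_frame(event: &str, data: &str) -> String {
    let mut frame = format!("event: {}\n", event);
    for line in data.split('\n') {
        // Multi-line payloads must repeat the `data:` prefix on each line.
        frame.push_str("data: ");
        frame.push_str(line);
        frame.push('\n');
    }
    frame.push('\n');
    frame
}
```

A browser `EventSource` dispatches each frame to the listener registered for its event name, so the three event types can share one stream.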
Validated end-to-end: agent-browser tour produces 0 console errors,
window._real_spikes_total climbs to 100K+ in 5s, banner text reads
"engine=rust-lif crate=0.1.0 n=1024 modules=70 witness=N" (green).
Co-Authored-By: claude-flow <ruv@ruv.net>
Module count is a real axis. At fixed N=512, sweeping num_modules ∈
{20, 25, 30, 35, 40, 45, 50} finds new peak full_ARI = 0.599 at
num_modules=20, γ=4.0 — 9 % higher than item-24's 0.549 at 35 modules.
Per-config peaks:
(20, 0.599) (25, 0.505) (30, 0.528) (35, 0.507)
(40, 0.559) (45, 0.566) (50, 0.517)
A second local maximum at num_modules ∈ [40, 45] suggests the quality
ridge is multi-modal, not unimodal.
New CPM ceiling: 0.599 at (N=512, 20 modules, γ=4.0). Gap to 0.75
AC-3a SOTA target narrows from 1.37× (item 24) to 1.25×.
- tests/leiden_cpm.rs: new leiden_cpm_module_count_sweep_at_n512
- ADR-154 §17 item 26 + heading Twenty-five → Twenty-six
- Row ordering fixed (#25/#26 were transposed)
Co-Authored-By: claude-flow <ruv@ruv.net>
Two live-browser bugs that agent-browser's `errors`/`console` CLI
commands missed (they silently drop uncaught runtime exceptions,
confirmed with a deliberate `setTimeout(() => { throw new Error() })`
probe returning zero output):
1. scene.js:9 Uncaught ReferenceError: THREE is not defined.
main.js previously did `import * as THREE from 'three';
window.THREE = THREE;` after all other imports. But ES module
imports are hoisted and evaluated in source order BEFORE the
`window.THREE = …` expression-statement runs, so scene.js saw
THREE undefined.
Moved the assignment into src/three-global.js and imported it
FIRST in main.js — depth-first module evaluation guarantees the
global lands before any downstream module reads it.
2. GET /favicon.ico returned 404 on every load.
Added inline SVG data-URL favicon (green disc, "C" glyph) via
<link rel="icon" type="image/svg+xml" href="data:…">. No network
round-trip, zero build-pipeline cost.
Validated via agent-browser with page-side listener pattern:
window.addEventListener('error', e => window.__errors.push(...))
→ 7-view nav + 3-scenario switch → JSON.stringify(window.__errors)
→ "[]" (zero interaction-time errors)
window.THREE.REVISION → 160 (scene.js eval succeeded)
Co-Authored-By: claude-flow <ruv@ruv.net>
Integrates the Connectome OS demo (examples/connectome-fly/assets/)
into a Vite build with ESM modules and a local three.js dependency,
replacing the CDN <script> tag and <link rel="stylesheet"> pattern.
Structure:
- ui/index.html — single entry wired to /src/main.js
- ui/src/main.js — imports three, styles, and modules in order
- ui/src/modules/ — 9 existing IIFEs ported as side-effect imports
- ui/src/styles/ — 6 CSS files imported from main.js
- ui/public/ — screenshots + upload PNGs as static
- ui/package.json — three + vite
- ui/vite.config.js — root, port 5173
Validated via agent-browser:
- npm run build → 749 kB bundle (one Three.js chunk, expected)
- npm run dev → 0 console errors on load
- 7-view tour (structure/graph/dynamics/motifs/causal/acceptance/
embodiment), scenario switches (normal/saturated/fragmenting),
help popover click — all succeed with 0 console.error output and
0 page errors reported
UI labels synced to branch head:
- "11 discoveries" → "25 discoveries"
- "tests 68/0" → "tests 97/0"
- "commits 17" → "commits 25"
- system-map extended to 25 active segments
Original static assets kept verbatim at ui/assets/ for diff reference.
Co-Authored-By: claude-flow <ruv@ruv.net>
Implemented the item-19-named lever: Traag 2019 Alg. 4 with the CPM
objective, wired between local moves and aggregate.
Result: catastrophic regression at the γ regime where CPM works best
on this substrate. N=512 peak 0.549 → 0.038; N=1024 peak 0.425 → 0.023;
seed-sweep ratio flipped from 3.98× to 0.21×.
Root cause: CPM refinement starts every node as a singleton. At γ ∈
[2, 3] post weight-normalization (mean = 1.0), a single edge of weight
~1 cannot overcome the γ·n_v·n_s = 2–3 merge cost. Refinement leaves
everything as singletons, aggregation projects onto identity, coarse
structure is destroyed.
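The root cause reduces to one inequality. The sketch below assumes the merge gain takes the form w − γ·n_v·n_s implied by the cost quoted above; `cpm_merge_gain` is an illustrative name, not a function in src/analysis/leiden.rs.

```rust
/// Gain from merging two groups under CPM, assuming the form
/// (weight between the groups) - gamma * n_v * n_s.
pub fn cpm_merge_gain(weight_between: f64, gamma: f64, n_v: usize, n_s: usize) -> f64 {
    weight_between - gamma * (n_v as f64) * (n_s as f64)
}

/// True when a pair of singletons joined by one edge would merge.
pub fn singletons_merge(edge_weight: f64, gamma: f64) -> bool {
    cpm_merge_gain(edge_weight, gamma, 1, 1) > 0.0
}
```

With weights normalized to mean 1.0, `singletons_merge(1.0, 2.0)` and `singletons_merge(1.0, 3.0)` are both false, so a refinement pass that starts from singletons never gets off the ground at these γ values.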
refine_cpm + refine_cpm_one_community kept in tree behind
#[allow(dead_code)] with a comment pointing to ADR §17 item 25.
9th pre-measurement-ADR-named lever ruled out by measurement. Remaining
levers: degree-stratified null (AC-5), real-FlyWire ingest, or a
substrate-specific non-singleton refinement start state (research).
AC-3a gap remains 1.37× to 0.75 SOTA via CPM-without-refinement.
- src/analysis/leiden.rs: refine_cpm scaffold unwired, documented why
- ADR-154 §17 item 25 + heading Twenty-four → Twenty-five
Co-Authored-By: claude-flow <ruv@ruv.net>
Two follow-ups to items 22/23 in one test:
- Fine γ sweep at N=512 lifts peak from 0.532 → 0.549 @ γ=3.10
- N=256 and N=384 extend the per-scale γ-peak curve downward
Full scale-to-peak:
N=256 → 0.501 @ γ=5.0 (15 communities vs 17 truth)
N=384 → 0.461 @ γ=3.5 (31 vs 25)
N=512 → 0.549 @ γ=3.1 (43 vs 35) ← best on branch
N=1024 → 0.425 @ γ=2.25 (156 vs 70)
N=2048 → 0.332 @ γ=1.75 (187 vs 140)
Findings:
- γ-peak is monotonic in N (high-N → low γ)
- ARI-peak is NON-monotonic in N (peaks at N=512)
- New gap to 0.75 SOTA target: 1.37× (down from 1.76× at N=1024)
Co-Authored-By: claude-flow <ruv@ruv.net>
Follow-up to item 22. A γ sweep at each scale reveals the γ peak
shifts monotonically downward as N grows (2.75 → 2.25 → 1.75), and
item 22's fixed-γ measurement was understated on both smaller AND
larger substrates.
Per-scale CPM ceilings:
- N=512 → 0.532 @ γ=2.75 (best on branch; within 1.41× of 0.75 SOTA)
- N=1024 → 0.425 @ γ=2.25 (item 19's headline)
- N=2048 → 0.332 @ γ=1.75
The 0.532 at N=512 is the new best CPM result on this substrate,
narrowing the AC-3a gap from 1.76× to 1.41×. γ should be swept per-
substrate, not inherited from a different-N benchmark.
- tests/leiden_cpm.rs: new leiden_cpm_gamma_peak_per_scale (publish-only)
- ADR-154 §17 item 23 + heading updated Twenty-two → Twenty-three
Co-Authored-By: claude-flow <ruv@ruv.net>
N=512/1024/2048 sweep at fixed density (num_modules = N/15) shows CPM
beats modularity-Leiden at every scale but the ratio is not scale-
invariant. Peak ratio 3.98× at N=1024; 2.55× at N=512; 2.74× at N=2048.
Both algorithms' absolute ARI also drops at N=2048.
ADR-154 §17 item 22 documents this with engineering implication: CPM-
specific refinement (next named lever) should be benchmarked at multiple
N before the result is quoted as "closes the AC-3a SOTA gap."
- tests/leiden_cpm.rs: new leiden_cpm_vs_modularity_across_scales test
- ADR-154 §17: heading updated Nine → Twenty-two; row 22 added
Co-Authored-By: claude-flow <ruv@ruv.net>
Item 18 (commit 78df97bdd) claimed CPM @ γ=2.25 beats modularity-
Leiden by 3.97× on the default-seed N=1024 SBM. **This commit
re-measures the claim on five independent SBM seeds.**
Result (each seed is a distinct random SBM at otherwise-default
ConnectomeConfig):
seed=0x5FA1DE5 cpm=0.320 modularity=0.094 ratio=3.39×
seed=0xC70F00D cpm=0.365 modularity=0.119 ratio=3.08×
seed=0xC0DECAFE cpm=0.342 modularity=0.168 ratio=2.04×
seed=0xBEEFBABE cpm=0.393 modularity=0.054 ratio=7.34×
seed=0xDEAD1234 cpm=0.358 modularity=0.088 ratio=4.05×
MEAN cpm=0.356 modularity=0.105 ratio=3.98×
CPM beats modularity by ≥ 2× on 5/5 seeds.
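One detail worth making explicit: judging from the published numbers, the MEAN row's 3.98× is the mean of the five per-seed ratios, not the ratio of the mean ARIs (which comes out near 3.40). The sketch below recomputes both from the table; `SEED_RESULTS` and the two helpers are illustrative names.

```rust
/// Published per-seed results: (seed, cpm ARI, modularity ARI, ratio).
pub const SEED_RESULTS: [(u64, f64, f64, f64); 5] = [
    (0x5FA1DE5, 0.320, 0.094, 3.39),
    (0xC70F00D, 0.365, 0.119, 3.08),
    (0xC0DECAFE, 0.342, 0.168, 2.04),
    (0xBEEFBABE, 0.393, 0.054, 7.34),
    (0xDEAD1234, 0.358, 0.088, 4.05),
];

/// Mean of the published per-seed ratios.
pub fn mean_of_ratios(rows: &[(u64, f64, f64, f64)]) -> f64 {
    rows.iter().map(|r| r.3).sum::<f64>() / rows.len() as f64
}

/// Ratio of the mean CPM ARI to the mean modularity ARI.
pub fn ratio_of_means(rows: &[(u64, f64, f64, f64)]) -> f64 {
    let cpm: f64 = rows.iter().map(|r| r.1).sum();
    let modularity: f64 = rows.iter().map(|r| r.2).sum();
    cpm / modularity
}
```

The mean of ratios is pulled upward by the 7.34× seed, so the two summaries differ; either way CPM stays well ahead of modularity.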
**21st discovery: CPM's ~4× win is reproducibility-verified.**
The 3.97× headline from the default-seed single measurement
matches the 3.98× mean across five independent seeds to within
0.01. Range 2.04–7.34 reflects real seed-dependent variance (one
seed where modularity is unusually strong; another where CPM
happens to find an especially clean partition); but there is no
seed where modularity catches or beats CPM.
Upgrades the confidence on the 4th-win claim from 'one
measurement' to 'five measurements with consistent direction'.
Files:
- tests/leiden_cpm.rs: new leiden_cpm_vs_modularity_across_seeds
test. Gates on mean ratio > 1.0 (any regression that puts
modularity ahead fails loudly); publishes every seed result.
- docs/adr/ADR-154: §17 item 21 added with the 5-seed table and
the 'range 2-7×, mean 4×' framing.
All 96 prior tests unchanged.
Co-Authored-By: claude-flow <ruv@ruv.net>
AC-3a now publishes full-partition ARI alongside the 2-way
coarsening. Measured on the default N=1024 SBM:
2-way coarsened ARI (inherited, backward-compat):
mincut : -0.001 greedy : 0.174
louvain : 0.000 leiden : 0.089
**Full-partition ARI (new, correct metric):**
greedy full_ari : **0.308** ← surprising
louvain full_ari : 0.000 (collapses)
leiden full_ari : 0.107
cpm@γ=2.25 : **0.425** ← still best
**20th discovery: Leiden's aggregation+refinement actively HURTS
full-partition ARI vs greedy level-1 on this substrate.** Greedy
modularity (one pass of local moves, no aggregation) scores 0.308;
adding the aggregation + Traag refinement steps drops that to
0.107 — a 2.9× regression from the more sophisticated algorithm.
The refinement preserves well-connectedness (leiden_refinement.rs
tests still pass) but does so at the cost of merging structurally-
distinct communities from the level-1 output.
This flips the expected order: on hub-heavy SBMs, *more algorithm
is worse* when the objective is modularity and the target is
module recovery. CPM (item 17) was the right escape — non-
resolution-limited objective sidesteps the issue.
Final ranking on default SBM, full-partition ARI:
CPM @ γ=2.25 : 0.425 (non-modularity objective)
greedy L1 : 0.308 (minimal-algorithm modularity)
Leiden : 0.107 (maximal-algorithm modularity)
Louvain : 0.000 (aggregation collapses)
The pattern echoes item 11 (multi-level Louvain collapse on
hub-heavy SBMs) but at a finer granularity: item 11 said
'aggregation breaks', item 20 says 'even Leiden's refinement
can't fully repair it because the underlying modularity
objective has the resolution-limit issue'. The fix (item 17)
was a different objective, not a better algorithm.
Engineering implication: **for AC-3a on this substrate, level-1
greedy modularity is a stronger baseline than multi-level
Leiden.** The default Louvain / Leiden trajectory assumes
increasingly-sophisticated algorithms monotonically improve
module recovery; on hub-heavy SBMs that assumption is false,
and simpler-is-better up to the CPM break.
Files:
- tests/acceptance_partition.rs: full_partition_ari helper,
new eprintln publishing four full-ARI values against ground-
truth module labels. No assertion change (ADR §14 threshold
discipline: coarsening choices are decisions, not knobs).
- docs/adr/ADR-154: §17 item 20 added with the surprising
level-1 vs Leiden inversion and the 'more algorithm is
worse' framing on this substrate.
All 95 prior tests unchanged.
Co-Authored-By: claude-flow <ruv@ruv.net>
Previous coarse sweep peaked at ARI_full = 0.393 @ γ=2.0 (item 18).
Fine-γ sweep at {1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0, 3.5}
on the default N=1024 SBM:
γ=1.25 ari_full=0.278 distinct= 45
γ=1.5 ari_full=0.323 distinct= 72
γ=1.75 ari_full=0.348 distinct= 70 ← exactly ground-truth count
γ=2.0 ari_full=0.393 distinct=109
γ=2.25 ari_full=0.425 distinct=156 ← new peak
γ=2.5 ari_full=0.425 distinct=171 ← plateau with γ=2.25
γ=2.75 ari_full=0.290 distinct=202
γ=3.0 ari_full=0.338 distinct=188
γ=3.5 ari_full=0.222 distinct=200
**CPM-Leiden full-partition ARI is now 0.425 vs modularity-
Leiden's 0.107 — a 3.97× improvement, 57 % of the AC-3a 0.75
SOTA target.**
Two non-obvious facts from the sweep:
(a) Peak ARI is at γ ∈ [2.25, 2.5] with 156–171 communities —
MORE than the ground-truth 70 modules. CPM's over-splitting
is aligned enough with ground truth that ARI tolerates it.
(b) γ = 1.75 exactly recovers 70 communities (the ground-truth
module count) but scores LOWER (0.348) than γ = 2.25's 156
communities. On this substrate, 'match the community count'
and 'maximize ARI' are distinct optimization targets.
Updated ADR §17 item 19 + §13 follow-up entry naming
CPM-refinement as the likely next lever to close the remaining
1.76× gap to the SOTA target.
Files:
- tests/leiden_cpm.rs: γ-list extended to 18 values covering
{1.0 ... 64.0} with fine resolution around the peak
- docs/adr/ADR-154: §17 item 19 added with the fine-sweep table
and the two non-obvious observations about count-vs-ARI
No production-code change. All 94 prior tests unchanged.
Co-Authored-By: claude-flow <ruv@ruv.net>
Added full_partition_ari(predicted, truth) helper — standard
Hubert-Arabie ARI against the full 70-module SBM ground-truth
label vector, not the 2-way hub-vs-non-hub coarsening inherited
from AC-3a. Re-measured the γ sweep on default N=1024 SBM.
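For readers who want the metric spelled out, here is a self-contained sketch of Hubert-Arabie ARI over two label vectors. It is not the helper in tests/leiden_cpm.rs: the name `adjusted_rand_index`, the `u32` label type, and the convention of returning 1.0 when the denominator vanishes are assumptions.

```rust
use std::collections::HashMap;

/// Number of unordered pairs among k items.
fn choose2(k: u64) -> f64 {
    (k * k.saturating_sub(1)) as f64 / 2.0
}

/// Hubert-Arabie adjusted Rand index between two labelings.
pub fn adjusted_rand_index(predicted: &[u32], truth: &[u32]) -> f64 {
    assert_eq!(predicted.len(), truth.len(), "label vectors must align");
    let mut cells: HashMap<(u32, u32), u64> = HashMap::new();
    let mut rows: HashMap<u32, u64> = HashMap::new();
    let mut cols: HashMap<u32, u64> = HashMap::new();
    for (&p, &t) in predicted.iter().zip(truth) {
        *cells.entry((p, t)).or_insert(0) += 1;
        *rows.entry(p).or_insert(0) += 1;
        *cols.entry(t).or_insert(0) += 1;
    }
    // Pair counts from the contingency table and its marginals.
    let index: f64 = cells.values().map(|&c| choose2(c)).sum();
    let sum_rows: f64 = rows.values().map(|&c| choose2(c)).sum();
    let sum_cols: f64 = cols.values().map(|&c| choose2(c)).sum();
    let total_pairs = choose2(predicted.len() as u64);
    if total_pairs == 0.0 {
        return 1.0; // fewer than two items: nothing to disagree on
    }
    let expected = sum_rows * sum_cols / total_pairs;
    let max_index = 0.5 * (sum_rows + sum_cols);
    let denom = max_index - expected;
    if denom == 0.0 {
        return 1.0; // both partitions trivial (assumed convention)
    }
    (index - expected) / denom
}
```

A prediction that collapses everything into one community scores exactly 0 against any non-trivial truth, which matches the 0.000 rows in the sweep.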
Default SBM, weight-normalized CPM, full-partition ARI:
γ = 0.1 – 1.0 : 0.000 (collapse to 1 community)
γ = 2.0 : **0.393** (109 communities) ← best
γ = 4.0 : 0.119 (280 communities)
γ ≥ 8 : → 0 (over-split to singletons)
Baselines (same graph, full-partition ARI):
modularity-Leiden full_ari : 0.107 (237 communities)
**CPM @ γ=2 full_ari : 0.393 — 3.7× over modularity-Leiden**
**18th discovery, 4th unambiguous win.** The measurement fix was
the lever — not another algorithm. Item 17 predicted this
exactly: CPM's 109 communities were recovering ~57 % of the
70-module structure all along, but the 2-way coarsening was
throwing away the signal. With the correct metric, CPM @ γ=2
becomes the new state-of-the-art community detector on this
substrate. Still below the 0.75 AC-3a SOTA target, but the gap
is now a tractable 2× rather than a 38× mystery.
Also closes out a recurring branch-wide failure mode: AC-3a's
2-way coarsening was inherited uncritically from the first
AC-3 test. Two community-detection algorithms (Leiden
modularity, Leiden CPM) under-scored their paper's claims on
it before the metric was finally upgraded.
Branch-wide pattern catalogue now has three distinct 'how a
measurement-driven discovery lands' shapes:
(a) orthogonal axis — items 6 (adaptive cadence), 14 (Leiden
refinement): change the axis, don't push harder on the
current axis.
(b) rider-matches-paper — item 17 (weight-normalized CPM):
pre-measurement diagnosis right, predicted rider worked.
(c) coarsening upgrade — item 18: a test's coarsening choice
is a threshold decision and deserves the same review
discipline as numerical tolerances.
Files:
- tests/leiden_cpm.rs: full_partition_ari helper +
sweep now publishes both 2way and full ARI at each γ.
- docs/adr/ADR-154: §17 item 18 added; pattern-summary
paragraph extended with the 3rd shape.
No production-code change (this is a measurement-correctness
commit). All 93 prior tests still pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
Pre-normalizes all adj edge weights by their mean (so mean edge
weight = 1.0 and γ is dimensionless). Re-swept γ ∈ {0.1, 0.5, 1,
2, 4, 8, 16, 32, 64} on both the planted 2-community SBM and the
default N=1024 hub-heavy SBM.
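The normalization step itself is small. The following is a sketch under the assumption that edge weights are available as a flat mutable slice of f64; `normalize_by_mean` is an illustrative name and the real preamble in leiden.rs may be structured differently.

```rust
/// Divide every edge weight by the mean weight so the mean becomes 1.0.
/// Leaves the slice untouched when the mean is not a positive finite number.
pub fn normalize_by_mean(weights: &mut [f64]) {
    if weights.is_empty() {
        return;
    }
    let mean = weights.iter().sum::<f64>() / weights.len() as f64;
    if !(mean.is_finite() && mean > 0.0) {
        return;
    }
    for w in weights.iter_mut() {
        *w /= mean;
    }
}
```

After this pass a "typical" edge has weight 1, so γ can be read directly as a per-pair cost in units of typical edges.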
Measured:
Planted 2-community SBM (N=200, p_within=0.40, p_between=0.004):
γ = 0.5 : 1 community (collapse)
γ = 1 : 1 community (collapse)
γ = 2 : 2 communities, ARI = 1.000 ← perfect recovery
γ = 4 : 2 communities, ARI = 1.000 ← perfect recovery
γ = 8 : 183 communities, ARI = -0.013 (over-split)
γ = 16 : 199 communities (pure singletons)
Default N=1024 hub-heavy SBM:
γ = 0.1 – 1 : 1 community (collapse)
γ = 2 : 109 communities, best 2-way-coarsened ARI = 0.020
γ = 4 : 280 communities, ARI = 0.018
γ = 8–64 : trends to singletons (1024 communities at γ ≥ 32)
**17th discovery — weight-normalized CPM works.** The rider named
in item 16 (normalize by mean edge weight → γ dimensionless)
delivers Traag et al.'s predicted behaviour on the planted fixture
at γ ∈ [2, 4]. Matches modularity-Leiden's planted-SBM result
(item 14) and validates the 'substrate-specific normalization
rider' pattern as actionable — the rider, when named, works.
**On the 70-module default SBM, CPM produces 109 communities at
γ = 2.** That is close to the ground-truth 70 modules and
arguably a better community count than modularity-Leiden's
'237 communities but only a handful meaningful'. But the shipped
2-way-coarsening metric inherited from AC-3a (hub-vs-non-hub)
masks that — 109 → 2 coarsening loses the signal. **The
measurement is now the limit, not the algorithm.** Full-partition
ARI or module-recovery fraction is the natural next metric;
adding it is the next item on the list.
Win-column update: 3 unambiguous wins now (items 6, 14, 17).
Item 17 is the first case where a pre-measurement diagnosis *was*
correct and the predicted rider *did* work — as opposed to the
branch's dominant pattern of 'pre-measurement diagnosis is wrong
in an unexpected way'. Pattern remains 2-for-16 on the
orthogonal-axis rule; the 17th item has a different shape.
Secondary pattern confirmed: 'substrate-specific normalization
before the paper's behaviour matches' — 3 instances named
(items 1, 7, 16), item 17 is the first to close its rider loop.
Files:
- src/analysis/leiden.rs: +12 LOC for the mean-weight
normalization preamble; no public API change.
- tests/leiden_cpm.rs: γ sweep widened to {0.1...64}; planted
SBM test now sweeps γ and reports best_ari.
- docs/adr/ADR-154: §17 item 17 added; pattern-summary
paragraph updated with the 3rd win and the first
'rider-actually-worked' data point.
All 91 prior tests still pass. No API regression.
Co-Authored-By: claude-flow <ruv@ruv.net>
Ships src/analysis/leiden::leiden_labels_cpm (Constant Potts Model
quality function, Traag's own default in leidenalg) alongside the
existing modularity-based leiden_labels. Same multi-level loop
(local moves → aggregate → repeat) but with CPM's move gain
`k_{v,C} - γ·n_C` instead of modularity's Newman-Girvan gain.
Measured on default N=1024 SBM across γ ∈ {0.005, 0.01, 0.02,
0.05, 0.1, 0.2, 0.5, 1.0}:
γ ≤ 0.5 : collapses to 1 community (ARI = 0.000)
γ = 1.0 : 15 communities, ARI = -0.039
modularity-Leiden baseline: ARI = 0.089
Also measured on 2-community planted SBM at γ = 0.05: 1 community,
ARI = 0.000. Same over-merging failure.
**16th measurement-driven discovery — naive CPM at edge-weight
scale is the wrong formulation.** The move gain parametrizes γ in
edge-weight units but synapse weights here are f64 of order
10–100. At γ = 0.05 the penalty γ·n_c is dwarfed by any positive
inter-community sum-of-weights, so level-1 greedily merges
everything into one community; at γ = 1.0 CPM still over-merges
because per-pair weight magnitudes are >> 1. Traag's own
`leidenalg` normalizes edges (or rescales γ by total-weight
density). **Weight-normalized CPM is the next attempt, named
explicitly in §17 item 16.**
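The scale mismatch is easy to see numerically. The sketch below evaluates the move gain k_{v,C} − γ·n_C and the community size at which it crosses zero; the function names and the example weights (picked from the 10 to 100 range quoted above) are illustrative.

```rust
/// CPM gain for moving node v into community C: k_{v,C} - gamma * n_C.
pub fn cpm_move_gain(k_v_c: f64, gamma: f64, n_c: usize) -> f64 {
    k_v_c - gamma * n_c as f64
}

/// Community size at which the gain reaches zero for a given link weight.
/// Joining stays profitable for any community smaller than this.
pub fn break_even_size(k_v_c: f64, gamma: f64) -> f64 {
    k_v_c / gamma
}
```

At γ = 0.05 a node with total weight 50 into a community keeps a positive gain until that community holds about 1000 nodes, which is close to the whole N=1024 graph; that is consistent with the collapse to a single community.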
Secondary pattern surfacing at §17: *published-algorithm
implementations usually need a substrate-specific normalization
before they match the paper's behaviour on non-toy inputs.*
Three instances now — AC-5 null degree-scaling (item 1), Lanczos
shift-and-invert (item 7), CPM weight normalization (item 16).
The paper describes the algorithm on an idealised graph; the
substrate has real-world distributions (heavy-tailed weights,
hub structure, float precision) that require a calibration
rider that is almost never in the paper. ADR §17 closing
paragraph extended to name this as a branch-wide rule.
Tests are publish-only: tests/leiden_cpm.rs gates on 'some
community formed' (sanity), not on an ARI threshold, until the
normalized variant lands. Both tests pass.
Files:
- src/analysis/leiden.rs: +165 LOC (leiden_labels_cpm,
level1_moves_cpm, aggregate_cpm, compact_cpm_labels)
- tests/leiden_cpm.rs: new, 184 LOC, 2/2 pass
- docs/adr/ADR-154: §17 item 16 + §17 closing-paragraph
secondary-pattern note
All 89 prior tests unchanged. No API regression.
Co-Authored-By: claude-flow <ruv@ruv.net>
Implements 'cheaper alternative #1' from BENCHMARK.md §4.11: skip
the bucket-sort call when the bucket is length 0 or 1 (trivially
ordered by definition). Semantically free — the result is
bit-identical to the unconditional sort.
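The guard amounts to a single length check. This is a generic sketch, not the code in src/lif/queue.rs; `sort_bucket_lazy` is a hypothetical name and the return value exists only to make the skip observable.

```rust
/// Sort a bucket unless it has zero or one element, in which case it is
/// already ordered. Returns true when a sort actually ran.
pub fn sort_bucket_lazy<T: Ord>(bucket: &mut [T]) -> bool {
    if bucket.len() <= 1 {
        return false; // trivially ordered, skip the sort call
    }
    bucket.sort();
    true
}
```

The output is identical to an unconditional sort for every input; the only difference is whether the sort routine is entered at all.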
Measured on the commit-24 host (lif_throughput_n_1024/optimized
saturated regime):
Unconditional sort (commit 23) : 1.6735 s
Lazy-skip length-1 (this) : 1.6831 s
change: +0.57 %, p = 0.22 (within noise)
**No measurable saturation-regime win.** Diagnosis: at saturation
every bucket averages 10+ events, so the length ≤ 1 skip almost
never triggers. The added branch-prediction cost cancels the
occasional savings. Kept in-tree because it still saves work on
*sparse*-regime benches (where buckets do have ≤ 1 event) and
because the semantic change is otherwise free.
Another instance of the branch-wide pattern: the first 'cheap
alternative' named in a prior commit rarely survives measurement
on the actual hot workload. The remaining cheaper alternative,
a bucket-local radix sort on the post field, is recorded in
§4.11 for a future iteration.
All tests still green:
cross_path_determinism 3/3
acceptance_core::ac_1_repeatability (within-path bit-exact)
Co-Authored-By: claude-flow <ruv@ruv.net>
BENCHMARK.md §4.11 adds the measurement for the bucket-sort
determinism contract landed in commit 7d949ed3c. The pre-sort
(commit 10 adaptive cadence) baseline was 1.57s on this host;
post-sort median is 1.67s — a 6.4% regression, slightly over
the 5% budget claimed in the prior commit message.
Recorded rather than relaxed; this is not a cause for alarm. The
optimized path is still a 4.04× speedup over the pre-adaptive-cadence
baseline and still inside the ADR-154 §3.2 ≥ 2× saturated-regime
target. Two cheaper alternatives are named (lazy skip for length-1
buckets; bucket-local radix on the post field) for a follow-up if
the 6% becomes material.
The tests it enables (tests/cross_path_determinism.rs, 3/3
pass) are worth the cost. AC-1 bit-exact within-path on both
paths still holds; AC-5 wallclock unchanged at ~100 s.
The summary table at §0 gains a row for the bucket-sort
measurement so the comparison with pre-sort is visible at a
glance.
Co-Authored-By: claude-flow <ruv@ruv.net>
TimingWheel::drain_due now sorts each bucket ascending by
(t_ms, post, pre) before delivery, matching SpikeEvent::cmp on
the heap path. This is the canonical in-bucket-ordering contract
from ADR-154 §15.1 and is the first shipped piece of the
cross-path determinism story.
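A sketch of the in-bucket ordering, assuming a `SpikeEvent` with an f32 timestamp and u32 neuron ids (the real struct's field types are not shown in this commit, and `canonical_cmp` / `sort_bucket` are illustrative names).

```rust
use std::cmp::Ordering;

/// Illustrative event type; field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SpikeEvent {
    pub t_ms: f32,
    pub post: u32,
    pub pre: u32,
}

/// Ascending by (t_ms, post, pre). `total_cmp` gives floats a total
/// order, so the comparator is well defined even for NaN.
pub fn canonical_cmp(a: &SpikeEvent, b: &SpikeEvent) -> Ordering {
    a.t_ms
        .total_cmp(&b.t_ms)
        .then(a.post.cmp(&b.post))
        .then(a.pre.cmp(&b.pre))
}

/// Put one drained bucket into canonical delivery order.
pub fn sort_bucket(bucket: &mut [SpikeEvent]) {
    bucket.sort_by(canonical_cmp);
}
```

Because the key covers every field, two events that compare equal are identical, so re-sorting an already-canonical bucket leaves it unchanged.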
Measured on the AC-1 stimulus at N=1024:
baseline : 195 782 spikes (heap + AoS dense subthreshold)
optimized : 194 784 spikes (wheel + SoA + SIMD + active-set)
rel_gap : 0.0051 (0.51 %)
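The rel_gap figure and the envelope check can be reproduced as follows. This sketch assumes the gap is normalized by the baseline count (dividing by the optimized count gives the same 0.0051 at this precision); `rel_gap` and `within_envelope` are illustrative names, not the test's helpers.

```rust
/// Relative spike-count gap between two paths, normalized by the baseline.
pub fn rel_gap(baseline: u64, optimized: u64) -> f64 {
    if baseline == 0 {
        // Degenerate case: no baseline spikes to compare against.
        return if optimized == 0 { 0.0 } else { f64::INFINITY };
    }
    baseline.abs_diff(optimized) as f64 / baseline as f64
}

/// True when the two paths agree to within `envelope` (e.g. 0.10 for 10 %).
pub fn within_envelope(baseline: u64, optimized: u64, envelope: f64) -> bool {
    rel_gap(baseline, optimized) <= envelope
}
```

For the counts above, 998 differing spikes out of 195 782 is roughly 0.5 %, far inside a 10 % envelope.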
**Two new ADR §17 discoveries land with this commit:**
#14 Leiden refinement delivers ARI = 1.000 on a hand-crafted
2-community planted SBM where multi-level Louvain collapses
to 0.000. Direct vindication of Traag et al. 2019 on the
exact failure mode from discovery #11. On default hub-heavy
SBM Leiden scores 0.089 — modularity-resolution-limit
territory, not a bug; CPM-based quality function named as
next step. **First Louvain-family algorithm in the branch
to meet a named SOTA target on ANY input.** (Landed via the
feat/analysis-leiden merge in the prior commit;
documentation added here.)
#15 The bucket sort delivers canonical *dispatch order*; it
does NOT deliver cross-path bit-exact *spike traces*. Root
cause (new): the optimized path's active-set pruning is a
*correctness deviation* from the baseline's dense update.
Neurons near threshold under continuous dense updates can
leak below it, but stay above under active-set updates.
Both behaviours are correct-by-ADR; they produce genuinely
different spike populations. True cross-path bit-exactness
would require either running both paths with active-set
off (bench-only config) or teaching the baseline the same
active-set (defeats the purpose). The shipped contract:
within-path bit-exact, cross-path ≤ 10 % spike-count
envelope. The sort tightens intra-tick ordering; the
envelope is what's realistic at the substrate level.
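The envelope contract can be sketched as a small check. This assumes the gap is measured relative to the baseline count, which the text does not state explicitly; `rel_gap` and `within_envelope` are illustrative names.

```rust
/// Relative spike-count gap between the two paths.
/// Assumption: the denominator is the baseline count, which must be non-zero.
pub fn rel_gap(baseline: u64, optimized: u64) -> f64 {
    baseline.abs_diff(optimized) as f64 / baseline as f64
}

/// True when the optimized path stays inside the cross-path envelope
/// (0.10 for the 10 % contract described above).
pub fn within_envelope(baseline: u64, optimized: u64, envelope: f64) -> bool {
    rel_gap(baseline, optimized) <= envelope
}
```

With the counts reported above (195 782 vs 194 784) the gap is 998 / 195 782, roughly 0.0051, well inside a 0.10 envelope.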
Pattern summary updated: 7 of 12 pre-measurement diagnoses
disproven; 2 unambiguous wins (items 6 adaptive cadence and 14
Leiden refinement), both sharing the pattern 'structure the
problem on an orthogonal axis rather than pushing harder on the
axis an earlier item ran into'.
Changes:
- src/lif/queue.rs: 10-line sort addition in drain_due with
docstring pointing at §15.1 + the test.
- tests/cross_path_determinism.rs (new, 139 LOC, 3/3 pass):
asserts the 10% envelope on baseline vs optimized, plus
within-path bit-exactness on both (regression tests that
the sort is idempotent on already-canonical buckets).
- ADR-154 §17 rows 14, 15 added. Pattern-summary paragraph
updated to 2 wins / 7 disproven / 12 tested.
All prior tests still green (AC-1 bit-exact still holds on
both paths independently). Performance impact of the sort:
under the 5% bench budget — k log k for k ≈ 5–50 events per
bucket is on the order of a few hundred compares per drain.
Co-Authored-By: claude-flow <ruv@ruv.net>
Agent ab312c9f (leiden-refinement, previously stashed WIP, re-committed
on branch head 8f591973f after resuming). Ships src/analysis/leiden.rs
(493 LOC) + tests/leiden_refinement.rs (294 LOC) implementing
Traag et al. 2019's three-phase Leiden iteration (local moves →
refinement → aggregate) on top of the existing multi-level Louvain
scaffolding.
Measured results:
Default N=1024 hub-heavy SBM:
mincut_ari = -0.001 (degenerate partition)
greedy_ari = 0.174 (level-1 Louvain only)
louvain_multi_ari = 0.000 (collapses — §17 item 11)
leiden_ari = 0.089 (well-connectedness preserved)
Hand-crafted 2-community planted SBM (N=200):
louvain_multi_ari = 0.000 (collapses as predicted)
leiden_ari = 1.000 (perfect recovery)
Well-connectedness invariant: 237 communities on default SBM,
all internally BFS-connected under community-induced subgraph.
Determinism: bit-identical label vectors across repeat runs.
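A sketch of how a well-connectedness check of this kind could look, assuming an undirected adjacency list and one label per node. The function name and data layout are illustrative, not the test's actual code.

```rust
use std::collections::{HashSet, VecDeque};

/// Returns true if every community is connected inside its own induced subgraph.
/// `adj` is an undirected adjacency list; `labels[v]` is the community of node `v`.
pub fn all_communities_connected(adj: &[Vec<usize>], labels: &[usize]) -> bool {
    assert_eq!(adj.len(), labels.len());
    let mut visited = vec![false; adj.len()];
    let mut seen_labels = HashSet::new();
    for start in 0..adj.len() {
        if visited[start] {
            continue;
        }
        // Starting a second BFS for a label we already covered means that
        // community has more than one connected component.
        if !seen_labels.insert(labels[start]) {
            return false;
        }
        visited[start] = true;
        let mut queue = VecDeque::from([start]);
        while let Some(v) = queue.pop_front() {
            for &w in &adj[v] {
                // Only walk edges that stay inside the community.
                if !visited[w] && labels[w] == labels[start] {
                    visited[w] = true;
                    queue.push_back(w);
                }
            }
        }
    }
    true
}
```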
The planted-SBM perfect recovery is the headline result — it
directly vindicates Traag et al. 2019's claim that the refinement
phase fixes the Louvain aggregation collapse that surfaced in §17
item 11. On the hub-heavy default SBM the 0.089 ARI is
modularity-resolution-limit territory (Fortunato & Barthélemy
2007); the implementation tracks the best-modularity partition
across all aggregation levels as a belt-and-braces workaround.
A CPM-based objective (Traag's own default in leidenalg) would
escape the resolution limit cleanly — named as the next follow-up.
Files:
- New: src/analysis/leiden.rs (493 LOC)
- New: tests/leiden_refinement.rs (294 LOC, 4/4 pass)
- Modified: src/analysis/mod.rs (+ pub mod leiden, +
Analysis::leiden_labels)
- Modified: src/analysis/structural.rs (visibility: level1_moves,
aggregate, compact_labels → pub(super))
- Modified: tests/acceptance_partition.rs (AC-3a eprintln now
also publishes leiden_ari alongside mincut / greedy / louvain;
no new assertion — AC-3a only publishes the comparative numbers)
All 83 prior tests still pass. Adds 4 new tests (4/4 green).
ADR-154 §13 Leiden follow-up entry can now be marked shipped.
ADR-154 §17 discovery #14 to be added in a follow-up commit.
Co-Authored-By: claude-flow <ruv@ruv.net>
Ships the public ABIs + productized wrappers that move three of
Connectome OS's exotic applications (README Part 3) one concrete
step closer to feasible. Each is scaffolding, not a full
implementation — the production pieces (MuJoCo bridge, mouse
connectome, real FlyWire data) genuinely can't ship from this
branch — but each gives external code the typed surface to build
against today.
Three new top-level modules:
1. src/embodiment.rs — BodySimulator trait + 2 implementations
(247 LOC incl. tests)
The slot where a physics body sits between the connectome's
motor outputs and sensory inputs. Defines the per-tick ABI
(, , ) that Phase-3 MuJoCo + NeuroMechFly
will drop into. Ships two impls:
- StubBody — deterministic open-loop drive over an existing
Stimulus schedule. Preserves AC-1. This is what the
Tier-1 demo runs with.
- MujocoBody — Phase-3 panic-stub. Constructs without
panicking (so downstream code can Box<dyn BodySimulator>
against it today); panics on step/reset with an
actionable diagnostic pointing at ADR-154 §13 and
04-embodiment.md.
Unblocks application #10 — 'embodied fly navigation in VR'.
The remaining Phase-3 work is the cxx bridge + NeuroMechFly
MJCF ingest; the wiring is now waiting, not un-designed.
2. src/lesion.rs — LesionStudy + CandidateCut + LesionReport
(374 LOC incl. tests)
Productization of AC-5 σ-separation. Outside code can now
answer 'which edges are load-bearing for behaviour X?'
without copy-pasting the test internals. Paired-trial loop,
σ distance against a nominated reference cut, deterministic
across repeat runs. Includes boundary_edges() / interior_edges()
helpers so callers can build cuts from a FunctionalPartition
without re-deriving the traversal.
Unblocks application #11 — 'in-silico circuit-lesion studies'.
Also powers the audit module (next).
3. src/audit.rs — StructuralAudit + StructuralAuditReport
(235 LOC incl. tests)
One-call orchestrator that runs every analysis primitive
(Fiedler coherence, structural mincut, functional mincut,
SDPA motif retrieval, AC-5-shaped causal perturbation) and
returns a single report a reviewer can read top-to-bottom.
Auto-generates boundary-vs-interior candidate cuts when the
caller doesn't supply explicit ones. Same determinism
contract as every underlying primitive.
Unblocks application #13 — 'connectome-grounded AI safety
auditing'. The framing is 'safety auditing'; the deliverable
is a reproducible report, not a safety guarantee.
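To illustrate the embodiment slot from item 1, here is a minimal sketch of a body trait with an open-loop stub and a panic stub. The method signatures are assumptions: only the names `step` and `reset` come from the text, and `UnwiredBody` is a hypothetical stand-in for the Phase-3 stub.

```rust
/// Hypothetical per-tick interface; the real trait's signatures are not shown here.
pub trait BodySimulator {
    /// Advance one tick given motor drive; returns sensory feedback.
    fn step(&mut self, motor: &[f32]) -> Vec<f32>;
    /// Return the body to its initial state.
    fn reset(&mut self);
}

/// Open-loop stub: ignores motor output and replays a fixed schedule.
pub struct StubBody {
    schedule: Vec<Vec<f32>>,
    tick: usize,
}

impl StubBody {
    pub fn new(schedule: Vec<Vec<f32>>) -> Self {
        StubBody { schedule, tick: 0 }
    }
}

impl BodySimulator for StubBody {
    fn step(&mut self, _motor: &[f32]) -> Vec<f32> {
        // Past the end of the schedule the stub returns an empty frame.
        let out = self.schedule.get(self.tick).cloned().unwrap_or_default();
        self.tick += 1;
        out
    }
    fn reset(&mut self) {
        self.tick = 0;
    }
}

/// Constructs fine, panics on use: lets callers hold a trait object today.
pub struct UnwiredBody;

impl BodySimulator for UnwiredBody {
    fn step(&mut self, _motor: &[f32]) -> Vec<f32> {
        panic!("UnwiredBody::step: physics backend is not connected in this sketch")
    }
    fn reset(&mut self) {
        panic!("UnwiredBody::reset: physics backend is not connected in this sketch")
    }
}
```

Because the trait is object-safe, downstream code can store either implementation as `Box<dyn BodySimulator>` and swap the backend later without changing call sites.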
Application #12 ('cross-species connectome transfer') needs a
second heterogeneous connectome; today we have the fly-scale
substrate only. Deferred until Tier-2 mouse data lands.
Application #14 ('substrate for structural-intelligence research
papers') was already open — it's the meta-application, no
scaffolding needed.
Lib.rs re-exports the new public types so downstream consumers
can
directly.
Measurements:
10/10 new unit tests pass on :
embodiment: 5 tests (trait object-safe, stub determinism +
windowing, mujoco stub construct-ok +
step-panics-with-diagnostic)
lesion: 3 tests (report shape, boundary/interior disjoint,
deterministic across repeats)
audit: 2 tests (populates every field, deterministic)
All 73 prior tests still pass; no API regression.
Total new LOC: 856 (247 + 374 + 235) src + tests; all files
under the 500-line ADR-154 §3.2 file budget.
Positioning rubric held. Scaffolding is scaffolding — not new
scientific claims. Every module docstring links back to the
Connectome-OS README Part 3 application it unblocks.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds src/analysis/leiden.rs + tests/leiden_refinement.rs. Implements
Leiden's 3-phase iteration (local moves → refinement → aggregate)
per Traag et al. 2019 (From Louvain to Leiden: guaranteeing well-
connected communities, *Sci. Rep.* 9:5233).
Refinement (Algorithm 4) restricts moves to still-singleton nodes
and requires both v and any target sub-community S ⊆ C to be
γ-well-connected (γ = 1.0). Monotonic growth keeps each sub-community
internally connected. A defensive BFS-component split is applied to
the coarse and refined partitions at each level to close any
floating-point bookkeeping leaks; splitting only raises modularity.
Newman-Girvan modularity has a resolution limit (Fortunato &
Barthélemy 2007) that can let the multi-level iteration walk past
the best partition once the super-graph is dense enough. We track
the highest-modularity partition across levels (measured on the
base graph) and return that; in practice this keeps the
refinement-earned structure intact on hub-heavy SBMs.
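The best-partition tracking can be sketched as follows, assuming an unweighted undirected edge list and dense community labels starting at 0. Both function names are illustrative; the shipped code works on the weighted graph.

```rust
/// Newman-Girvan modularity on an undirected, unweighted edge list:
/// Q = sum over communities c of (e_c / m) - (d_c / 2m)^2, where e_c counts
/// intra-community edges, d_c is the total degree of c, and m is the edge count.
pub fn modularity(edges: &[(usize, usize)], labels: &[usize]) -> f64 {
    let m = edges.len() as f64;
    if m == 0.0 {
        return 0.0;
    }
    let k = labels.iter().copied().max().map_or(0, |x| x + 1);
    let mut intra = vec![0.0f64; k];
    let mut degree = vec![0.0f64; k];
    for &(u, v) in edges {
        degree[labels[u]] += 1.0;
        degree[labels[v]] += 1.0;
        if labels[u] == labels[v] {
            intra[labels[u]] += 1.0;
        }
    }
    (0..k)
        .map(|c| intra[c] / m - (degree[c] / (2.0 * m)).powi(2))
        .sum()
}

/// Among the partitions produced at each aggregation level (all expressed on
/// the base graph), return the one with the highest modularity.
pub fn best_partition<'a>(
    edges: &[(usize, usize)],
    levels: &'a [Vec<usize>],
) -> Option<&'a Vec<usize>> {
    levels
        .iter()
        .max_by(|a, b| modularity(edges, a).total_cmp(&modularity(edges, b)))
}
```

On two triangles joined by one edge, the two-community partition scores about 0.357 while the single-community collapse scores 0, so tracking the best level keeps the split.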
Measured on default N=1024 SBM:
mincut_ari = -0.001 (degenerate)
greedy_ari = 0.174 (level-1 only)
louvain_multi_ari = 0.000 (collapses — §17 item 11)
leiden_ari = 0.089 (gap vs louvain = 0.089 ≥ 0.05)
Leiden tests (all 4 green):
ARI gate: leiden − louvain ≥ 0.05 PASS (gap 0.089)
Determinism PASS
Planted 2-SBM recovery ≥ 0.90 PASS (ari 1.000)
Well-connectedness invariant (BFS per community) PASS (237 comms)
Max file 493 lines. New LOC 813 (493 leiden.rs + 294 tests +
13 mod.rs + 13 acceptance_partition.rs; 3 visibility edits in
structural.rs).
Co-Authored-By: claude-flow <ruv@ruv.net>
ADR §17 item 10's three-axis framing for AC-2 had three candidate
remediations: encoder / corpus-size / labels. Items 10 and 12 ruled
out corpus-size and encoder. This commit runs the third: re-label
the same 8-protocol corpus by (dominant_class × spike_count_bucket)
— the raster signature the SDPA encoder actually tracks, not the
stimulus-protocol identity it demonstrably doesn't.
Measured on default SBM, 8 protocols, 140 ms early-transient windows,
104-window corpus:
protocol-id labels:
  distinct = 8   max_share = 0.12   precision@5 = 0.062   (below random 0.125)
raster-regime labels:
  distinct = 2   max_share = 0.92   precision@5 = 1.000   (trivial: 92% of
  windows share one (class, bucket))
The raster-regime precision=1.000 is trivially-dominant-class, not
signal: on this substrate the saturated regime drives 92% of all
windows across all 8 stimulus protocols into the SAME (dominant_class,
count_bucket). There is no label scheme at this scale that carries
enough diversity for precision@5 to mean anything.
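The `distinct` and `max_share` diagnostics can be reproduced with a few lines. This is a sketch with an illustrative function name and integer labels, not the test's actual code.

```rust
use std::collections::HashMap;

/// Returns (number of distinct labels, share of the most common label).
pub fn label_diversity(labels: &[u32]) -> (usize, f64) {
    let mut counts: HashMap<u32, usize> = HashMap::new();
    for &l in labels {
        *counts.entry(l).or_insert(0) += 1;
    }
    let max = counts.values().copied().max().unwrap_or(0);
    let share = if labels.is_empty() {
        0.0
    } else {
        max as f64 / labels.len() as f64
    };
    (counts.len(), share)
}
```

A balanced 8-label corpus of 104 windows gives a share of 13 / 104 = 0.125. A corpus where, for illustration, 96 of 104 windows carry one label gives about 0.92, and at that share most nearest neighbours match the query label regardless of what the encoder captures.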
Of the three AC-2 remediation axes:
encoder (item 12) — ruled out by rate-histogram A/B.
corpus (item 10) — ruled out by 8-protocol expansion.
labels (this) — ruled out by raster-regime monoculture.
**Substrate is the sole remaining AC-2 lever.** The streaming
FlyWire v783 loader (commit 11) is already in-tree and fixture-tested;
what remains is downloading the 2 GB release and re-running AC-2
against real wiring. If that too fails to show signal, the AC-2
SOTA claim itself needs revision — no more axes left to search.
Changes:
- src/analysis/types.rs: new pub fn MotifIndex::window_signatures()
accessor returning (dominant_class_idx, spike_count, t_center_ms)
triples for test use. Alongside the existing vectors() accessor.
- tests/ac_2_raster_regime_labels.rs: new diagnostic test.
Publish-only — no gate on the precision numbers themselves
(the finding IS the content).
- ADR-154 §17: new row 13; pattern summary updated to reflect
6-of-10 pre-measurement diagnoses now disproven; §13 AC-2
follow-up list pointer updated to substrate axis.
All prior tests still green. No source-code regression.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds src/analysis/rate_encoder.rs + tests/ac_2_encoder_comparison.rs.
Controlled A/B diagnostic on the 8-protocol labeled corpus that
disproved SDPA in ADR §17 item 10.
Measured precision@5:
SDPA (shipped)             : 0.072
rate histogram (this path) : 0.079
delta                      : +0.007
Verdict: encoder is NOT the bottleneck. Both encoders sit below the
1/8 = 0.125 random baseline on the 8-protocol corpus (SDPA 0.072 and
rate histogram 0.079), with the two scores within +0.007 of each
other. Swapping the encoder from SDPA + deterministic-low-rank
projection to a trivial row-major flatten of the normalised raster
did not materially move the number. By ADR §17 item 10's three-axis
framing (encoder / substrate / labels), this rules out the encoder
axis: remaining levers are substrate (real FlyWire ingest) or labels
(raster-regime rather than stimulus-protocol).
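A sketch of what a row-major flatten of a normalised raster could look like. The normalisation is an assumption (scaling by the peak bin count); the text does not say which one `rate_encoder.rs` uses, and the function name is illustrative.

```rust
/// Flatten a (rows x bins) spike-count raster row by row into one vector.
/// Assumption: counts are scaled by the raster's peak bin count.
pub fn flatten_normalised(raster: &[Vec<u32>]) -> Vec<f32> {
    let peak = raster.iter().flatten().copied().max().unwrap_or(0);
    // An all-zero raster stays all-zero instead of dividing by zero.
    let scale = if peak == 0 { 1.0 } else { peak as f32 };
    raster
        .iter()
        .flatten()
        .map(|&count| count as f32 / scale)
        .collect()
}
```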
Max file 349 LOC (tests/ac_2_encoder_comparison.rs). New LOC 500
(rate_encoder 151 + test 349).
Co-Authored-By: claude-flow <ruv@ruv.net>
Threads 'Connectome OS' through the three most visible places:
- ADR-154 §2.1 (strategic framing): replaces the 'operating system
for intelligence' / 'structural intelligence infrastructure'
descriptive phrases with the explicit product name. Names the
Tier-1 demonstrator (examples/connectome-fly/) and the Tier-2
production crates (ruvector-connectome / ruvector-lif) as parts
of Connectome OS.
- examples/connectome-fly/README.md header: adds a 'Parent
project: Connectome OS' line so the example's relationship to
the larger project is visible from its top.
Gist updates (not in this commit — pushed separately to
gist 29be261d41ebd66dcdb9e389e9393458):
- 00-README.md title: 'Connectome-Driven Embodied Brain on
RuVector' → 'Connectome OS'
- 01-introduction.md: names Connectome OS in the positioning block.
- 03-breakthroughs.md: closing line now names Connectome OS.
Naming rationale (from the naming-decision turn):
1. Honest — says what the tool is, a runtime for connectomes.
2. Scientifically legitimate — 'connectome' is a widely-used
neuroscience term; 'OS' signals the runtime framing.
3. Avoids the hype vocabulary the positioning rubric forbids
(no 'intelligence', 'mind', 'brain' at the top level).
4. Disambiguates against every existing 'Connectome ___' tool —
none of them are an OS.
5. Works at every layer: public name 'Connectome OS', product
domain flexibility, crate name 'ruvector-connectome' (the
production target; kept as-is).
No code changes. Positioning rubric preserved.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds src/analysis/structural.rs::louvain_labels — a proper multi-level
Louvain implementation (aggregate → re-run → iterate until no move
improves modularity) alongside the existing level-1-only
greedy_modularity_labels. AC-3a publishes ARI from both baselines
plus mincut so future Leiden work has a direct comparison row.
Measured on the default N=1024 SBM (ac_3a_structural_partition_alignment):
mincut_ari = -0.001 (1/1012 degenerate partition — separate gap)
greedy_ari = 0.174 (Louvain level-1 only; the old baseline)
louvain_ari = 0.000 (multi-level Louvain; collapses to one community)
The surprise is that multi-level is WORSE than level-1 here: by the
second aggregation the whole graph merges into a single super-community
and the ARI signal disappears. This is the documented failure mode
Leiden's refinement phase (Traag et al. 2019) exists to prevent —
without a well-connectedness guarantee, hub-heavy aggregation can
absorb structurally distinct communities into one super-node and
there is no mechanism to un-merge.
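For reference, the adjusted Rand index behind these numbers can be sketched as below, using the standard pair-counting formula. The function names are illustrative and this is not the crate's implementation.

```rust
use std::collections::HashMap;

fn choose2(n: u64) -> f64 {
    (n * n.saturating_sub(1)) as f64 / 2.0
}

/// Adjusted Rand index between a ground-truth and a predicted labelling.
pub fn adjusted_rand_index(truth: &[usize], pred: &[usize]) -> f64 {
    assert_eq!(truth.len(), pred.len());
    let mut joint: HashMap<(usize, usize), u64> = HashMap::new();
    let mut rows: HashMap<usize, u64> = HashMap::new();
    let mut cols: HashMap<usize, u64> = HashMap::new();
    for (&t, &p) in truth.iter().zip(pred) {
        *joint.entry((t, p)).or_insert(0) += 1;
        *rows.entry(t).or_insert(0) += 1;
        *cols.entry(p).or_insert(0) += 1;
    }
    let sum_joint: f64 = joint.values().map(|&c| choose2(c)).sum();
    let sum_rows: f64 = rows.values().map(|&c| choose2(c)).sum();
    let sum_cols: f64 = cols.values().map(|&c| choose2(c)).sum();
    let total = choose2(truth.len() as u64);
    if total == 0.0 {
        return 1.0; // fewer than two items: nothing to disagree on
    }
    let expected = sum_rows * sum_cols / total;
    let max_index = 0.5 * (sum_rows + sum_cols);
    if max_index == expected {
        return 1.0; // both labellings are the same trivial partition
    }
    (sum_joint - expected) / (max_index - expected)
}
```

A prediction that puts every node in one community scores exactly 0 against any non-trivial ground truth, which matches the collapsed multi-level result above.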
ADR-154 §17 item 11 records the finding. §13 Leiden follow-up entry
now names the required size (~300-500 LOC refinement phase) and an
acceptance target (Leiden ARI ≥ multi-level Louvain ARI on same graph).
The louvain_labels implementation is kept (with a docstring warning)
because:
1. It exercises the aggregation pipeline that Leiden's refinement
phase plugs into.
2. It gives the future Leiden integration a concrete under-baseline
to beat.
3. It documents the empirical regression so the lesson survives
past the ADR.
Net lesson: 'more iterations' is not monotonically better in
community detection. Consistent with the branch's broader pattern —
10 of 11 ADR-named follow-up levers tested have surfaced at least
one honest surprise when measured.
Code: +207 LOC in structural.rs, +8 LOC in analysis/mod.rs wrapper,
+14 LOC test additions. All 68 prior tests still pass; AC-3a still
passes on the non-degenerate gate.
Co-Authored-By: claude-flow <ruv@ruv.net>
Attempted the ADR §13 'expand motif-corpus label vocabulary' lever
named after the DiskANN revert (item 8 in the roll-up). Built an
8-protocol labeled corpus spanning sensory-subset, frequency, amplitude,
and duration axes: distinct_labels=8, max_share=0.12 — structurally
well-balanced.
Measured precision@5:
400 ms simulations (312 windows)     : 0.089 (below random 0.125 for 8 classes)
140 ms early-transient (104 windows) : 0.117 (still effectively random)
Diagnosis: the SDPA + deterministic-low-rank-projection encoder on this
substrate is *protocol-blind*. Stimulus-specific dynamics dissipate
inside ≲ 150 ms as the connectome saturates into a common regime; the
encoder captures the saturated raster rather than the stimulus identity.
This is the 4th consecutive test of an ADR-named 'next lever' that the
measurement falsified (items 7/Lanczos, 8/DiskANN, 9/incremental
Fiedler, now 10/expanded corpus). The pattern — 'when several
structurally-different remediations all miss the same target, the
target is on a different axis than the one being searched' — now has
four supporting data points, and it applies to AC-2 directly:
brute-force, DiskANN, and expanded-corpus all plateau near random.
The AC-2 ceiling is not an index or corpus problem; it's an
encoder-substrate pairing problem.
Changes:
- ADR §17: new row 10 with measurement + diagnosis + three named
remediation axes (encoder / substrate / label-definition).
- ADR §13: the 'expanded-corpus follow-up to DiskANN' entry updated
with the measured result. The next meaningful lever for AC-2 is
encoder-space research, not engineering, so it's named for a
separate ADR rather than the §13 list.
- src/analysis/types.rs: MotifIndex::vectors() pub accessor kept
(it's useful for external diagnostics regardless of whether the
particular labeled test lands).
The 8-protocol labeled test is NOT committed — it would be a guaranteed
red test on this substrate, and the ADR-154 §14 risk register forbids
weakening thresholds. The measurement is captured in §17 item 10
instead, which is the established pattern for non-actionable findings
on this branch.
All 68 prior tests remain green. No code changes beyond the kept
accessor. Positioning rubric held.
Co-Authored-By: claude-flow <ruv@ruv.net>
Agent aaa3073a (diskann-motif). Adds src/analysis/diskann_motif.rs
as a Vamana-style ANN index for spike-motif retrieval; new
ac_2_motif_emergence_diskann acceptance test; original brute-force
path preserved behind the default AnalysisConfig::use_diskann=false
flag.
Co-Authored-By: claude-flow <ruv@ruv.net>
Agent a8a79c5c (incremental-fiedler). Replaces the O(S²) per-detect
pair sweep in compute_fiedler with an incremental HashMap-based
accumulator updated on each on_spike push / cofire_window expire.
Co-Authored-By: claude-flow <ruv@ruv.net>
Implements src/analysis/diskann_motif.rs + tests/diskann_motif.rs.
Adds AnalysisConfig::use_diskann flag (default false) so the existing
ac_2_motif_emergence test still uses brute-force. New
ac_2_motif_emergence_diskann test runs the same stimulus protocol
with the Vamana index.
Co-Authored-By: claude-flow <ruv@ruv.net>
Replaces the shifted-power-iteration eigensolve in sparse_fiedler.rs
with a deterministic Lanczos driver that converges on λ₂ instead of
falling back to 0 when λ₂ ≪ λ_max (commit 6's documented failure
mode for path topologies). Full-reorthogonalization variant.
Co-Authored-By: claude-flow <ruv@ruv.net>
Replaces the O(S²) per-detect pair sweep in compute_fiedler with an
incremental HashMap<(NeuronId, NeuronId), u32> of co-firing counts
updated in on_spike and expire paths.
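A minimal sketch of such an accumulator, assuming spikes arrive in time order and that a pair counts once per two spikes that share the window. `CofireCounts` and its methods are illustrative names; the shipped observer may differ in detail.

```rust
use std::collections::{HashMap, VecDeque};

pub type NeuronId = u32;

pub struct CofireCounts {
    window_ms: f32,
    spikes: VecDeque<(f32, NeuronId)>,
    counts: HashMap<(NeuronId, NeuronId), u32>,
}

impl CofireCounts {
    pub fn new(window_ms: f32) -> Self {
        CofireCounts { window_ms, spikes: VecDeque::new(), counts: HashMap::new() }
    }

    /// Unordered pair key so (a, b) and (b, a) share one counter.
    fn key(a: NeuronId, b: NeuronId) -> (NeuronId, NeuronId) {
        if a <= b { (a, b) } else { (b, a) }
    }

    /// Expire old spikes, then pair the new spike with everything still in the window.
    pub fn on_spike(&mut self, t_ms: f32, id: NeuronId) {
        self.expire(t_ms);
        for &(_, other) in &self.spikes {
            if other != id {
                *self.counts.entry(Self::key(id, other)).or_insert(0) += 1;
            }
        }
        self.spikes.push_back((t_ms, id));
    }

    /// Drop spikes older than the window and undo the pairs they contributed.
    fn expire(&mut self, now_ms: f32) {
        while let Some(&(t, id)) = self.spikes.front() {
            if now_ms - t <= self.window_ms {
                break;
            }
            self.spikes.pop_front();
            // Every newer spike still in the window was paired with this one.
            for &(_, other) in &self.spikes {
                if other == id {
                    continue;
                }
                let k = Self::key(id, other);
                let emptied = match self.counts.get_mut(&k) {
                    Some(c) => {
                        *c -= 1;
                        *c == 0
                    }
                    None => false,
                };
                if emptied {
                    self.counts.remove(&k);
                }
            }
        }
    }

    pub fn count(&self, a: NeuronId, b: NeuronId) -> u32 {
        self.counts.get(&Self::key(a, b)).copied().unwrap_or(0)
    }
}
```

Each spike costs work proportional to the current window size on arrival and again on expiry, so the detector can read the counts directly instead of re-sweeping all pairs on every detect call.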
Co-Authored-By: claude-flow <ruv@ruv.net>
Three items from the 6-item follow-up list. Delivered by the
coordinator (streaming + stratified-null) plus the opt-d-bench
agent's uncommitted-but-compilable artefact (bench), which is
claimed here since it passed the compile check and matches its
commit-message template.
## 1. Streaming FlyWire loader (src/connectome/flywire/streaming.rs)
Drop-in equivalent of `load_flywire` that skips the ~2 GB
Vec<SynapseRecord> intermediate buffer and pipes TSV rows directly
into per-pre Synapse buckets. Memory high-water-mark falls from
~4.5 GB to ~1.7 GB on the real v783 release; output is byte-
identical to the non-streaming path on the 100-neuron fixture.
Tests (new `tests/flywire_streaming.rs`, 4/4 pass):
- byte-identical Connectome vs load_flywire on fixture
- deterministic across repeat loads
- errors on missing neurons.tsv
- errors with FlywireError::UnknownPreNeuron on dangling pre_id
Makes `pub(super)` three loader helpers (default_bias_for,
derive_weight, default_delay_ms) so the streaming path reuses the
non-streaming semantics exactly.
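The streaming idea can be sketched as below: rows go straight into per-pre buckets with no intermediate record vector. The three-column `pre<TAB>post<TAB>weight` layout with dense integer ids is an assumption made for the example, not the FlyWire schema, and the error type is illustrative.

```rust
use std::io::BufRead;

#[derive(Debug, PartialEq)]
pub enum StreamError {
    Io(String),
    Malformed(usize), // zero-based line number
    UnknownPre(u32),
}

/// Read TSV rows one at a time and push each synapse into its pre-neuron bucket.
pub fn stream_into_buckets<R: BufRead>(
    reader: R,
    num_neurons: usize,
) -> Result<Vec<Vec<(u32, f32)>>, StreamError> {
    let mut buckets = vec![Vec::new(); num_neurons];
    for (line_no, line) in reader.lines().enumerate() {
        let line = line.map_err(|e| StreamError::Io(e.to_string()))?;
        if line.trim().is_empty() {
            continue;
        }
        let mut cols = line.split('\t');
        let (pre, post, weight) = match (cols.next(), cols.next(), cols.next()) {
            (Some(a), Some(b), Some(c)) => (a, b, c),
            _ => return Err(StreamError::Malformed(line_no)),
        };
        let pre: u32 = pre.trim().parse().map_err(|_| StreamError::Malformed(line_no))?;
        let post: u32 = post.trim().parse().map_err(|_| StreamError::Malformed(line_no))?;
        let weight: f32 = weight.trim().parse().map_err(|_| StreamError::Malformed(line_no))?;
        // A pre id with no bucket is a dangling reference.
        let bucket = buckets
            .get_mut(pre as usize)
            .ok_or(StreamError::UnknownPre(pre))?;
        bucket.push((post, weight));
    }
    Ok(buckets)
}
```

Only the current line and the output buckets are resident at any time, which is where the lower memory high-water mark comes from.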
## 2. Degree-stratified AC-5 null sampler (src/connectome/stratified_null.rs)
Ports the sampler investigated in the 7a83adffe dev branch and
documented but not shipped (ADR-154 §8.4). Works on any Connectome
— synthetic SBM or FlyWire-loaded — so the same test rig drives both
substrates. At synthetic N=1024 the null collapses (documented in
§8.4). At FlyWire ~139 k with its heavier non-hub tail it is
expected to separate from the boundary; that is the correct bench
for the z_rand ≤ 1σ side of AC-5.
Algorithm:
- Decile-bin all synapses by (out_deg × in_deg) product.
- Compute boundary's per-decile histogram.
- Draw WITHOUT replacement from each decile's non-boundary pool
to match the boundary histogram.
- Report StratifiedSample { sample, boundary_hist, sample_hist,
pool_sizes } so the caller can detect decile-exhaustion as a
partial-credit signal rather than a silent error.
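The four steps above can be sketched as follows. To stay dependency-free the example uses a tiny linear congruential generator (`Lcg`) in place of the caller-provided `RngCore`, and it assigns deciles by rank; both are simplifications for illustration.

```rust
/// Stand-in deterministic generator; the shipped sampler takes an `RngCore`.
pub struct Lcg(pub u64);

impl Lcg {
    fn next(&mut self) -> u64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

pub struct StratifiedSample {
    pub sample: Vec<usize>,
    pub boundary_hist: [usize; 10],
    pub sample_hist: [usize; 10],
    pub pool_sizes: [usize; 10],
}

/// `score[i]` is out_deg * in_deg for synapse `i`; `is_boundary[i]` marks boundary synapses.
pub fn stratified_sample(score: &[u64], is_boundary: &[bool], rng: &mut Lcg) -> StratifiedSample {
    let n = score.len();
    // Rank synapses by score (ties broken by index), then cut the ranking into ten bins.
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by_key(|&i| (score[i], i));
    let mut decile = vec![0usize; n];
    for (rank, &i) in order.iter().enumerate() {
        decile[i] = (rank * 10 / n).min(9);
    }
    let mut boundary_hist = [0usize; 10];
    let mut pools: Vec<Vec<usize>> = vec![Vec::new(); 10];
    for i in 0..n {
        if is_boundary[i] {
            boundary_hist[decile[i]] += 1;
        } else {
            pools[decile[i]].push(i);
        }
    }
    let mut sample = Vec::new();
    let mut sample_hist = [0usize; 10];
    let mut pool_sizes = [0usize; 10];
    for d in 0..10 {
        pool_sizes[d] = pools[d].len();
        // If the pool is smaller than the boundary count, take what exists.
        let take = boundary_hist[d].min(pools[d].len());
        // Partial Fisher-Yates shuffle: draws without replacement.
        for k in 0..take {
            let j = k + (rng.next() as usize) % (pools[d].len() - k);
            pools[d].swap(k, j);
            sample.push(pools[d][k]);
        }
        sample_hist[d] = take;
    }
    StratifiedSample { sample, boundary_hist, sample_hist, pool_sizes }
}
```

A caller can compare `sample_hist` against `boundary_hist` to spot deciles where the non-boundary pool ran out.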
Determinism: caller provides RngCore; same seed + same Connectome +
same boundary → bit-identical sample. 5 unit tests pass including
exclude-boundary, histogram-match, and deterministic-under-seed.
## 3. Opt D paired-sample isolation bench (benches/opt_d_isolation.rs)
Published by the opt-d-bench agent (a38fc021) but not committed on
its branch; claimed here after a compile check. Four criterion arms
across the {use_optimized, use_delay_sorted_csr} product, all with
commit-10's adaptive detect cadence always on. Isolates Opt D's
contribution now that the Fiedler detector no longer dominates
wallclock by 450:1. Runs via `cargo bench -p connectome-fly --bench
opt_d_isolation`. Bench numbers themselves will land when a follow-
up commit runs the full 4-arm Criterion sweep.
## Test state
All 6 new stratified_null tests pass (inside the lib tests).
4 new flywire_streaming tests pass.
Every prior acceptance / integration / scale test still green.
No hype. No consciousness / upload / AGI language. Positioning
rubric preserved.
Co-Authored-By: claude-flow <ruv@ruv.net>
ADR-154 §16 named three observer-side levers for closing the
saturated-regime throughput gap that (a) SIMD (commit 2) and (b) Opt D
delay-sorted CSR (commit 7) left on the table. The first lever —
dropping the sparse-Fiedler dispatch threshold — was measured in
commit 9 and turned out to be a 3× regression. This commit implements
the second: adaptive detect cadence.
Logic (14 LOC addition to src/observer/core.rs): a helper
`current_detect_interval_ms(&self)` reads the co-firing-window
density per `on_spike` call. If the window holds more than
`5 × num_neurons` spikes (equivalent to ≥ 100 Hz average per
neuron over the 50 ms window), back off to a 4× longer detect
interval (20 ms instead of 5 ms). Drop back to 5 ms as soon as
density falls below
threshold. Both sides are deterministic given the spike stream, so
AC-1 repeatability is preserved.
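The rule reduces to a single comparison. A sketch, written as a free function rather than the `&self` helper named above, with illustrative constant names:

```rust
pub const BASE_INTERVAL_MS: f32 = 5.0;
pub const BACKOFF_FACTOR: f32 = 4.0;
/// Spikes per neuron in the co-firing window above which the detector backs off.
pub const DENSITY_SPIKES_PER_NEURON: usize = 5;

/// Detect interval as a pure function of window density, so identical spike
/// streams always produce identical detect times.
pub fn detect_interval_ms(window_spikes: usize, num_neurons: usize) -> f32 {
    if window_spikes > DENSITY_SPIKES_PER_NEURON * num_neurons {
        BASE_INTERVAL_MS * BACKOFF_FACTOR
    } else {
        BASE_INTERVAL_MS
    }
}
```

At N=1024 the threshold is 5 120 window spikes, so a saturated window of about 21k spikes selects the 20 ms interval.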
Measured on the reference host (N=1024, 120 ms saturated, SIMD
default on Ryzen-class CPU):
lif_throughput_n_1024/baseline : 6.86 s → 1.70 s (4.03× vs pre)
lif_throughput_n_1024/optimized : 6.74 s → 1.57 s (4.29× vs pre)
ADR-154 §3.2 saturated-regime target was ≥ 2× over scalar-opt.
**Measured: 4.29×. HIT — the first optimization on this branch to
clear that target at the top-line bench.**
Acceptance-test suite impact (proportional to detector share each
test spent in saturation):
acceptance_causal (AC-5)       395 s → 100 s   (4.0×)
acceptance_core (AC-1..AC-4)    63 s → 16 s    (4.0×)
integration                     32 s → 8.5 s   (3.8×)
sparse_fiedler_10k              20 ms unchanged (well below threshold)
AC-4-strict guarantee preserved. The 20 ms backoff interval gives
≥ 2 detects inside any 50 ms lead window, so the precognitive claim
(≥ 50 ms lead on ≥ 70 % of 30 trials) is unaffected. Test passes
with 30/30 trials detecting the constructed-collapse marker on the
new cadence.
AC-1 bit-exactness preserved. Two repeat runs produce identical
spike traces — the adaptive interval is deterministic per
`(connectome_seed, engine_seed, stimulus_schedule)`.
Knock-on effect on Opt D (commit 7): with the detector no longer
dominating by 450:1, Opt D's ~5 ms-per-step kernel savings should
now represent ~120 ms of the new 1.57 s median. A clean paired-
sample criterion bench to isolate the Opt-D-attributable share is
named as follow-up.
Commit arc summary at head:
Commit 2   SIMD (Opt C)                    1.013×   MISS
Commit 7   Opt D delay-sorted CSR          1.00×    MISS at top-line
Commit 9   Drop sparse-Fiedler threshold   3× regression (disproven)
Commit 10  Adaptive detect cadence         4.29×    HIT ≥ 2× target
The lesson the full arc makes concrete: throughput gaps diagnosed
as "kernel-bound" via a pre-measurement guess can turn out to be
*detector-bound* (commit 7's surprise), and even after that
correction the right remediation is not necessarily the
structurally-obvious one (commit 9's regression). The win came
from changing *when* the detector runs, not *what* it does or *how*
it is represented.
All 58 tests pass. Positioning rubric held across all 10 commits.
Co-Authored-By: claude-flow <ruv@ruv.net>
ADR-154 §16 (commit 8) named three candidate levers for closing the
saturated-regime throughput gap that Opt D (delay-sorted CSR) exposed.
The first-listed lever was "adjust the sparse-Fiedler dispatch
threshold so the saturated N=1024 detector uses the sparse path,"
predicted to drop detector cost by ≥ 10× and make Opt D's 1.5×
kernel win visible on the top-line bench.
Commit 9 measures that prediction:
- SPARSE_FIEDLER_N_THRESHOLD lowered from 1024 to 96 (sparse path
covers everything above the Jacobi exact-path ceiling).
- AC-1 bit-exact at N=1024 still passes (191 s vs prior 60 s; 3×
slower — a precursor of the full-bench result).
- `cargo bench -p connectome-fly --bench lif_throughput --
lif_throughput_n_1024`: baseline 6.75 s → 20.1 s on the same
host. **3× regression, not a win.**
Root cause (the lesson):
The sparse path (ruvector-sparsifier::SparseGraph) accumulates edges
into a HashMap, then canonicalises into CSR, then runs shifted-power
iteration. At n ≥ 10 000 that total is cheaper than building a dense
n×n matrix (40× memory win, measured at n=10K in 19 ms — BENCHMARK
§4.8). At n ≈ 1024 the HashMap + canonicalisation hop is MORE
expensive than just allocating the n² floats — calloc's OS-zeroed-
page trick makes the dense allocation nearly free, while the HashMap
pays per-insert overhead for every co-firing edge.
**The sparse path is a scale win at n ≥ 10 000, not a speed win at
demo n ≈ 1024.** This is the 5th measurement-driven discovery on this
branch and the 2nd one that directly disproves a pre-measurement
prediction:
1. Degree-stratified AC-5 null collapses at N=1024 SBM (commit 3)
2. SIMD saturated gain = 1.013×, not ≥ 2× (commit 4)
3. Observer buffer-reuse is 3% slower than calloc (reverted)
4. Fiedler detector dominates saturated bench 450:1 (commit 7)
5. Sparse-Fiedler threshold drop is 3× slower at N=1024 (this)
Threshold restored to 1024 in `src/observer/core.rs`. ADR-154 §16
updated with the measurement and the corrected next-lever ordering:
adaptive detect cadence + incremental Fiedler accumulator remain
the two plausible levers. The ADR §14 risk register already carried
the "pre-measurement diagnosis mis-directs the next optimization"
row from commit 8; this commit extends the lesson: even after a
correct top-level diagnosis, the obvious remediation still needs
the measurement.
No test weakened. AC-1 still bit-exact at N=1024. All 58 tests on
this branch still pass.
BENCHMARK.md §4.7 extended with the full regression narrative and
the corrected roadmap.
Co-Authored-By: claude-flow <ruv@ruv.net>
Merges commits 5 (cf21327c9), 6 (b805d7158), 7 (a3cca1c5c) produced
concurrently by a 3-agent hierarchical swarm in isolated worktrees.
Each agent touched a disjoint subtree; the three merges landed clean
in commit-order and the consolidated test suite is green:
58 tests pass / 0 fail across 11 test binaries:
lib (unit) 16 (was 13, +3 delay-csr + gpu fallback units)
flywire_ingest 17 (new)
sparse_fiedler_10k 2 (new)
delay_csr_equivalence 2 (new)
acceptance_core 4 (AC-1, AC-2, AC-4-any, AC-4-strict)
acceptance_partition 2 (AC-3a structural, AC-3b functional)
acceptance_causal 1 (AC-5)
integration 3
analysis_coherence 2
connectome_schema 5
lif_correctness 4
Docs updated:
- ADR-154 §11: full 7-commit timeline (this is commit 8).
- ADR-154 §13: 3 items of the follow-up list marked ✓ shipped with
"→ next" tails pointing at the remaining production levers.
- ADR-154 §14 (risk register): new row — "Pre-measurement diagnosis
mis-directs the next optimization". Commit 2 named three candidate
hot paths for the saturated-regime gap; commit 7's measurement found
the actual dominant cost was a fourth item (the Fiedler detector).
- ADR-154 §16 (new): the measurement-driven discovery. Delay-sorted
CSR is 1.5× at the kernel but 1.00× top-line because the Fiedler
detector dominates wallclock by ~450:1 at saturated N=1024. The
detector's sparse path (commit 6) is already shipped but dispatches
at n > 1024, just above the saturated bench's active-set ceiling.
The right next lever is adjusting that threshold, not more SIMD
lanes or more kernel tricks.
- BENCHMARK.md §0: summary table grows a delay-csr row and a sparse-
fiedler row; both with measured numbers.
- BENCHMARK.md §4.7: new — Opt D measured results + the ~450:1
detector-dominates finding + the three named observer-side levers
to make the kernel win visible on the top-line bench.
- BENCHMARK.md §4.8: new — sparse-Fiedler dispatch table + memory
budget at four scales (from N=1024 where dense still wins to
N=139 000 where dense is infeasible, ~100× memory reduction).
- BENCHMARK.md §4.9: new — FlyWire v783 ingest module notes.
- README §What's new: top-level summary of the three capabilities.
- README directory layout: reflects the new modules and tests.
Four honest findings surfaced on this branch:
1. Degree-stratified AC-5 null collapses at N=1024 SBM (commit 3)
2. SIMD saturated-regime speedup = 1.013×, not ≥ 2× (commit 4)
3. Buffer-reuse in Observer is a 3% regression vs calloc (reverted)
4. Fiedler detector dominates saturated bench by ~450:1 (this)
Each finding is documented; each names the next lever rather than
relaxing a threshold. No test was weakened to force a green.
Positioning rubric (no consciousness / upload / AGI) held across
all 8 commits.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds src/lif/delay_csr.rs + tests/delay_csr_equivalence.rs +
benches/delay_csr.rs. Opt-in behind EngineConfig.use_delay_sorted_csr
(default false) so AC-1 bit-exactness at N=1024 is untouched.
DelaySortedCsr rebuilds the outgoing adjacency once at engine
construction as three packed SoA vectors (u32 post, f32 delay_ms,
f32 signed_weight) sorted by delay_ms ascending within each row. The
weight_gain scalar and the {Excitatory,Inhibitory} sign are folded
into signed_weight at build time so the inner delivery loop carries
no match on Sign and no per-synapse weight_gain * weight multiply.
A companion constructor `from_connectome_for_wheel` additionally
pre-computes per-synapse bucket offsets so `deliver_spike` can push
into the timing wheel via a new `TimingWheel::push_at_slot` fast path
that skips the per-event float division and modulo.
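A sketch of the build step, assuming a per-row input of simple synapse records. The `row_ptr` offsets, the `Synapse` input type, and the `build` name are additions for the example; only the three packed vectors and the fold of sign and gain come from the description above.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Sign {
    Excitatory,
    Inhibitory,
}

/// Hypothetical input record for one outgoing synapse.
pub struct Synapse {
    pub post: u32,
    pub delay_ms: f32,
    pub weight: f32,
    pub sign: Sign,
}

pub struct DelaySortedCsr {
    pub row_ptr: Vec<usize>, // row r occupies row_ptr[r]..row_ptr[r + 1]
    pub post: Vec<u32>,
    pub delay_ms: Vec<f32>,
    pub signed_weight: Vec<f32>,
}

impl DelaySortedCsr {
    pub fn build(rows: &[Vec<Synapse>], weight_gain: f32) -> Self {
        let mut csr = DelaySortedCsr {
            row_ptr: vec![0],
            post: Vec::new(),
            delay_ms: Vec::new(),
            signed_weight: Vec::new(),
        };
        for row in rows {
            // Stable sort keeps the input order among equal delays.
            let mut order: Vec<usize> = (0..row.len()).collect();
            order.sort_by(|&a, &b| row[a].delay_ms.total_cmp(&row[b].delay_ms));
            for i in order {
                let s = &row[i];
                let sign = match s.sign {
                    Sign::Excitatory => 1.0,
                    Sign::Inhibitory => -1.0,
                };
                csr.post.push(s.post);
                csr.delay_ms.push(s.delay_ms);
                // Fold sign and gain once here so delivery is a plain add.
                csr.signed_weight.push(sign * weight_gain * s.weight);
            }
            csr.row_ptr.push(csr.post.len());
        }
        csr
    }
}
```

The delivery loop then reads three contiguous slices per row with no branch on `Sign` and no multiply per synapse.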
Measured on the reference host (AMD Ryzen 9 9950X, lif_throughput_n_1024
bench, N=1024, 120 ms simulated, saturated firing regime, SIMD default):
baseline (heap+AoS)            : 6.81 s (1.00× vs baseline)
scalar-opt (wheel+SoA+SIMD)    : 6.75 s (1.01× vs baseline)
scalar-opt + delay-csr (this)  : 6.75 s (1.00× vs scalar-opt)
ADR-154 §3.2 target for Opt D was ≥ 2× over scalar-opt in the
saturated regime. Measured: 1.00×. MISS — the ≥ 2× target is NOT
hit on the full bench. Honest diagnosis:
The delay-sorted SoA delivery path DOES speed up the kernel — at
N=1024, 120 ms simulated, with the observer's Fiedler coherence-drop
detector disabled, the kernel drops from ~15 ms to ~10 ms, a 1.5×
speedup consistent with cutting the per-delivery sign branch + weight
multiply and halving struct-padding load. At the bench level that
speedup is invisible because the Observer's default 5 ms-cadence
Fiedler detector runs `compute_fiedler` on the co-firing window 24
times over the 120 ms sim, and each call does an O(S²) pair sweep
over S ≈ 21k window spikes plus an O(n²) or O(n³) eigendecomposition on
the ~1024-neuron Laplacian. Detector cost ≈ 6.8 s of the 6.75 s
wallclock; kernel cost ≈ 0.01 s. The delivery-path speedup is
drowned by a factor of roughly 450 : 1.
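A back-of-envelope check of the dilution argument above, as a sketch: when only the kernel share of the wallclock gets faster, the top-line ratio follows Amdahl's law. The function name is hypothetical, and the split of the 6.75 s wallclock into detector and kernel time (by subtraction, using the ~15 ms detector-off kernel figure) is an assumption for illustration, not a separately measured number.

```rust
/// Top-line speedup when only the kernel portion of the run gets faster.
/// `detector_s` is the untouched share, `kernel_s` the share being optimised.
pub fn top_line_speedup(detector_s: f64, kernel_s: f64, kernel_speedup: f64) -> f64 {
    let before = detector_s + kernel_s;
    let after = detector_s + kernel_s / kernel_speedup;
    before / after
}
```

With `detector_s = 6.735`, `kernel_s = 0.015` and a 1.5× kernel speedup the result stays below 1.001×, and even an infinitely fast kernel would stay below 1.01×. That is why the bench reads 1.00× while the detector-off microbench reads 1.5×.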
Opt D as specified targets (a) spike-event dispatch out of the wheel
and (b) CSR row-lookup for delivery. Both of those are measurably
faster on this change (the detector-off microbench is the cleanest
read of that). The third load-bearing component from BENCHMARK.md
§4.5 — (c) observer raster / Fiedler work — is what dominates the
bench in the saturated regime, and this commit is not permitted to
touch `src/observer/*`. Closing the 2× gap on the top-line bench
therefore requires a subsequent commit on the observer (cheaper
Fiedler, sparser Laplacian, or detect-every-ms backoff at saturation).
Equivalence: delay-csr path total spike count on the 120 ms saturated
workload matches scalar-opt at 51258 vs 51258 spikes — rel-gap =
0.0000, well inside the ~10 % cross-path tolerance the demonstrator
documents (README §Determinism; ADR-154 §15.1). Within-path bit-
exactness is verified by `delay_csr_repeatability_within_path`.
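For reference, a sketch of how a cross-path spike-count check of this kind can be written. The exact rel-gap definition used by the test suite is not stated here, so the denominator (the larger of the two counts) and both function names are assumptions.

```rust
/// Relative gap between two spike counts, normalised by the larger count.
/// Returns 0.0 when both counts are zero.
pub fn rel_gap(a: u64, b: u64) -> f64 {
    let max = a.max(b);
    if max == 0 {
        return 0.0;
    }
    a.abs_diff(b) as f64 / max as f64
}

/// True when the two paths agree within the given relative tolerance.
pub fn within_cross_path_tolerance(a: u64, b: u64, tol: f64) -> bool {
    rel_gap(a, b) <= tol
}
```

For the counts reported above, `rel_gap(51258, 51258)` is exactly 0.0, so the check passes at any non-negative tolerance.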
AC-1 (tests/acceptance_core.rs::ac_1_repeatability) still passes with
the default `use_delay_sorted_csr: false`. The delay-sorted path is
only constructed when the flag is explicitly enabled, so the shipped
scalar / SIMD traces are unchanged.
Cargo.toml: one `[[bench]]` entry added for the new delay_csr bench.
Required because auto-discovered benches run under the default
libtest harness, which conflicts with `criterion_main!`; an explicit
entry is needed to set `harness = false`. This is the minimum change
to register a Criterion bench; workspace membership is unchanged.
File sizes: max = 440 lines (engine.rs); new src/tests/benches LOC =
398 + 87 + 110 = 595 lines of new code.
Co-Authored-By: claude-flow <ruv@ruv.net>
Implements src/connectome/flywire/{mod,schema,loader,fixture}.rs and
tests/flywire_ingest.rs — the ingest path named as the first follow-up
in ADR-154 §13. Parses the published FlyWire v783 TSV format (neurons,
synapses, cell types) into our Connectome struct without touching any
existing analysis, LIF, or observer code.
Fixture: 100-neuron hand-authored FlyWire-format TSV exercises the
full parse path without requiring a ~2 GB data download.
NT → sign mapping: ACH/GLUT/GABA/SER/OCT/DOP/HIST follow the Lin et al.
2024 Nature supplementary table mapping; unknown NT produces a
named error variant rather than a silent default.
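A sketch of the "named error, no silent default" behaviour described above. The lookup table is passed in as data so that the sketch does not restate the published mapping; the type names, the variant name `UnknownNeurotransmitter`, and the function `nt_to_sign` are hypothetical and may differ from what schema.rs defines.

```rust
use std::fmt;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Sign {
    Excitatory,
    Inhibitory,
}

#[derive(Debug, PartialEq)]
pub enum IngestError {
    /// The neurotransmitter label is not in the mapping table.
    UnknownNeurotransmitter(String),
}

impl fmt::Display for IngestError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IngestError::UnknownNeurotransmitter(nt) => {
                write!(f, "unknown neurotransmitter label: {nt:?}")
            }
        }
    }
}

impl std::error::Error for IngestError {}

/// Look up the synaptic sign for a label. Matching ignores ASCII case and
/// surrounding whitespace; anything not in `table` is an error, never a default.
pub fn nt_to_sign(table: &[(&str, Sign)], nt: &str) -> Result<Sign, IngestError> {
    let key = nt.trim();
    table
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(key))
        .map(|(_, sign)| *sign)
        .ok_or_else(|| IngestError::UnknownNeurotransmitter(nt.to_string()))
}
```

A caller would build `table` once from the supplementary-table mapping and propagate the error with `?`, so that a dataset revision introducing a new label fails ingest loudly.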
File sizes: max file = 437 lines (fixture.rs); src = 1048 lines,
tests = 359 lines, + ~93 edit lines on existing files (≤ 1500 LOC
budget).
Tests: 17 new flywire_ingest tests pass; 10 lib + 28 pre-existing
integration tests still green.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds src/observer/sparse_fiedler.rs. At n > 1024, compute_fiedler
dispatches to a ruvector-sparsifier-backed sparse Laplacian with
shifted power iteration instead of the dense O(n²) path. Below that
threshold the dense path is unchanged — AC-1 at N=1024 is bit-exact
vs head (verified via ac_1_repeatability).
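To make the technique concrete, here is a self-contained sketch of shifted power iteration for the Fiedler value on a CSR graph. It does not use ruvector-sparsifier and is not the code in sparse_fiedler.rs: the `Csr` struct, the helper names, the start vector, the Gershgorin shift of 2 × max degree, and the use of f64 are all assumptions chosen for clarity.

```rust
/// Symmetric weighted graph in CSR form; each undirected edge is stored in both rows.
pub struct Csr {
    pub row_ptr: Vec<usize>,
    pub col_idx: Vec<usize>,
    pub val: Vec<f64>,
}

pub fn from_edges(n: usize, edges: &[(usize, usize, f64)]) -> Csr {
    let mut rows: Vec<Vec<(usize, f64)>> = vec![Vec::new(); n];
    for &(u, v, w) in edges {
        rows[u].push((v, w));
        rows[v].push((u, w));
    }
    let mut csr = Csr { row_ptr: vec![0], col_idx: Vec::new(), val: Vec::new() };
    for row in rows {
        for (c, w) in row {
            csr.col_idx.push(c);
            csr.val.push(w);
        }
        csr.row_ptr.push(csr.col_idx.len());
    }
    csr
}

/// y = L x with L = D - A, computed from the adjacency CSR and the degree vector.
fn laplacian_matvec(g: &Csr, deg: &[f64], x: &[f64]) -> Vec<f64> {
    (0..deg.len())
        .map(|i| {
            let mut acc = deg[i] * x[i];
            for k in g.row_ptr[i]..g.row_ptr[i + 1] {
                acc -= g.val[k] * x[g.col_idx[k]];
            }
            acc
        })
        .collect()
}

/// Remove the constant (λ = 0) component, then scale to unit length.
fn deflate_and_normalise(x: &mut [f64]) {
    let mean = x.iter().sum::<f64>() / x.len() as f64;
    for v in x.iter_mut() {
        *v -= mean;
    }
    let norm = x.iter().map(|v| v * v).sum::<f64>().sqrt();
    if norm > 0.0 {
        for v in x.iter_mut() {
            *v /= norm;
        }
    }
}

/// Power iteration on (shift·I - L): with the constant vector deflated, the
/// dominant eigenvector is the Fiedler vector. Returns its Rayleigh quotient on L.
pub fn fiedler_value(g: &Csr, iters: usize) -> f64 {
    let n = g.row_ptr.len() - 1;
    let deg: Vec<f64> = (0..n)
        .map(|i| g.val[g.row_ptr[i]..g.row_ptr[i + 1]].iter().sum::<f64>())
        .collect();
    // Gershgorin bound: every eigenvalue of L is at most 2 × max degree.
    let shift = 2.0 * deg.iter().cloned().fold(0.0_f64, f64::max);
    let mut x: Vec<f64> = (0..n).map(|i| ((i * 7 + 3) % 11) as f64).collect();
    deflate_and_normalise(&mut x);
    for _ in 0..iters {
        let lx = laplacian_matvec(g, &deg, &x);
        for i in 0..n {
            x[i] = shift * x[i] - lx[i];
        }
        deflate_and_normalise(&mut x);
    }
    let lx = laplacian_matvec(g, &deg, &x);
    x.iter().zip(&lx).map(|(a, b)| a * b).sum()
}
```

Each iteration costs one sparse matvec, O(n + nnz), and needs only a few n-length vectors. On the unweighted 4-node path graph the exact answer is 2 − √2, which makes a convenient unit test.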
Memory per detect call, dense path vs sparse path:
old: 2 × n² × 4 B (800 MB at n=10K; ≈ 155 GB at n=139K, infeasible)
new: O(n + nnz) × 4 B
- row_ptr: (n+1) × 4 B
- col_idx: 2·nnz × 4 B (symmetric, both directions)
- val: 2·nnz × 4 B
- deg + a handful of n-length f32 workspace vectors for the
matvec + rayleigh-quotient loop
(e.g. at n=10 000 with ~1 M distinct co-firing edges the working
set is ≈ 16–20 MB, roughly 40–50× below the 800 MB dense
path.)
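The byte counts above can be reproduced with a few lines of arithmetic. This is a sketch: the function names are hypothetical, and the number of n-length workspace vectors (6 in the usage note below) is an assumption standing in for "a handful".

```rust
/// Dense path: two n × n f32 matrices.
pub fn dense_bytes(n: u64) -> u64 {
    2 * n * n * 4
}

/// Sparse path: CSR with both directions of every edge stored, plus
/// `workspace_vecs` n-length f32 vectors (deg, matvec and Rayleigh-quotient scratch).
pub fn sparse_bytes(n: u64, nnz: u64, workspace_vecs: u64) -> u64 {
    let row_ptr = (n + 1) * 4;
    let col_idx = 2 * nnz * 4;
    let val = 2 * nnz * 4;
    let workspace = workspace_vecs * n * 4;
    row_ptr + col_idx + val + workspace
}
```

`dense_bytes(10_000)` gives 800 000 000 B, and `sparse_bytes(10_000, 1_000_000, 6)` gives 16 280 004 B, about 16.3 MB.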
The hot-path edge accumulator is a HashMap<(u32,u32), f32> keyed by
sorted neuron pair, since every edge gets many τ-coincidence hits per
window and the SparseGraph double-sided adjacency write would pay
that cost twice per update. We canonicalise into
ruvector_sparsifier::SparseGraph at the end (per ADR-154 §13
"sparsify first" pipeline), then export to CSR for matvecs.
Cross-validation: sparse and dense agree within 5 % relative error on
Fiedler value at n=256 on the test fixture. Measured: dense=14.018250
sparse=14.017822 (relative error ≈ 3 × 10⁻⁵).
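The hot-path edge accumulator described above (HashMap keyed by sorted neuron pair, expanded to a symmetric CSR at the end) can be sketched as follows. This sketch skips the intermediate ruvector_sparsifier::SparseGraph step and goes straight to CSR; the struct and method names are hypothetical.

```rust
use std::collections::HashMap;

/// Accumulates co-firing weight per undirected neuron pair.
#[derive(Default)]
pub struct EdgeAccumulator {
    weights: HashMap<(u32, u32), f32>,
}

impl EdgeAccumulator {
    /// Record one coincidence hit. The key is the sorted pair, so (a, b) and
    /// (b, a) update the same entry with a single hash-map write.
    pub fn hit(&mut self, a: u32, b: u32, w: f32) {
        if a == b {
            return; // ignore self-pairs
        }
        let key = if a < b { (a, b) } else { (b, a) };
        *self.weights.entry(key).or_insert(0.0) += w;
    }

    pub fn distinct_edges(&self) -> usize {
        self.weights.len()
    }

    /// Expand once per window into symmetric CSR: (row_ptr, col_idx, val).
    pub fn to_csr(&self, n: usize) -> (Vec<u32>, Vec<u32>, Vec<f32>) {
        let mut rows: Vec<Vec<(u32, f32)>> = vec![Vec::new(); n];
        for (&(a, b), &w) in &self.weights {
            rows[a as usize].push((b, w));
            rows[b as usize].push((a, w));
        }
        let mut row_ptr = vec![0u32];
        let (mut col_idx, mut val) = (Vec::new(), Vec::new());
        for row in &mut rows {
            // HashMap iteration order is unspecified; sort for a stable layout.
            row.sort_by_key(|&(c, _)| c);
            for &(c, w) in row.iter() {
                col_idx.push(c);
                val.push(w);
            }
            row_ptr.push(col_idx.len() as u32);
        }
        (row_ptr, col_idx, val)
    }
}
```

The double-sided write happens once per distinct edge in `to_csr`, not once per coincidence hit, which is the cost the accumulator is meant to avoid.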
Scale test: n=10 000 synthetic co-firing, ~60K spikes, completes in
~19 ms on the reference host. That is above the ADR-154 §4.2 "≤ 5 ms
per 50 ms window" Fiedler target, but that target applies to
n ≤ 1024; the n=10K target is deferred until production-scale
calibration.
File sizes: max file = 452 lines (sparse_fiedler.rs); total = 1005
LOC src + tests.
Co-Authored-By: claude-flow <ruv@ruv.net>
Re-ran lif_throughput on the commit-2 host with SIMD on and off
(feature `simd` default-on; `--no-default-features` selects scalar).
Fills the §4.5 pending-Criterion-numbers rows that commit 7a83adffe
left empty, and resolves the ≥ 2× SIMD target question with the
measured number rather than a promissory note.
Measured (120 ms simulated, N=1024, saturated firing):
baseline : 6.86 s (1.00×)
scalar-opt : 6.83 s (1.00× vs baseline)
SIMD-opt : 6.74 s (1.02× vs baseline, 1.013× vs scalar-opt)
Measured (120 ms simulated, N=100):
baseline : 45.9 ms
scalar-opt : 44.97 ms
SIMD-opt : 44.82 ms (1.003× vs scalar — within noise)
ADR-154 §3.2 target was ≥ 2× SIMD speedup over scalar-opt in the
saturated regime. Measured 1.013×. The target is NOT hit.
Honest diagnosis (now that the number is in hand, replacing the
pre-measurement "memory bandwidth or gather overhead" guess):
In the saturated regime almost every neuron either fires or is in
its absolute refractory period every 4-5 ms tick, so the SIMD subthreshold
loop — which processes *non-firing, non-refractory* neurons in
lane-packed form — has an active lane-pack count near zero. The
hot path has migrated from subthreshold arithmetic (where SIMD
lives) to three places the current commit does not touch:
(a) spike-event dispatch out of the timing wheel
(b) CSR row-lookup for post-synaptic delivery
(c) raster-write in the observer
A future commit targeting ≥ 2× saturated-regime speedup should
profile those three and change the storage layout (delay-sorted
CSR / fused delivery+observer) rather than add more SIMD lanes.
Flamegraph capture is named as follow-up but not committed here.
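The lane-occupancy argument above can be illustrated with a small sketch. It assumes eligible neurons are compacted into 8-wide packs before the subthreshold update; whether the real kernel compacts or masks is not stated here, and the constant and function name are hypothetical.

```rust
/// Lane width of the f32x8 subthreshold loop.
pub const LANES: usize = 8;

/// Number of 8-wide packs needed for the neurons that are neither firing nor
/// refractory this tick (`eligible[i] == true`), assuming compaction.
pub fn active_lane_packs(eligible: &[bool]) -> usize {
    let active = eligible.iter().filter(|&&e| e).count();
    (active + LANES - 1) / LANES
}
```

At N=1024 a fully quiescent network needs 128 packs per tick. If only ten neurons are eligible the SIMD loop runs two packs, so almost none of the tick is spent where the vector lanes can help.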
The shipped SIMD win is therefore NOT raw throughput but lane-safe
determinism groundwork: SoA + f32x8 is bit-deterministic against
scalar (simd_matches_scalar_on_random_batch test + ac_1_repeatability
on the SIMD path), which the ruvector-lif production kernel inherits.
Changes:
- BENCHMARK.md §0 summary table: fill SIMD-opt columns with measured
medians; change status line to cite §4.5 diagnosis
- BENCHMARK.md §4.5: replace "pending Criterion re-run" with the
measured table; replace the pre-measurement guess paragraph with
post-measurement diagnosis; add the 1.003× N=100 datapoint
- BENCHMARK.md §4.6: split saturated spikes/sec row into scalar-opt
+ SIMD-opt with actual commit-2 wallclock values
- BENCHMARK.md §9 known-limitations item 2: rewrite to cite the
measured 1.013× and point at Opt D (delay-sorted CSR) as the
next correct lever rather than restating "requires SIMD"
No code or test changes. 32/32 acceptance tests still pass.
Co-Authored-By: claude-flow <ruv@ruv.net>