mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-22 19:56:25 +00:00
* fix: batch 1 — deadlock, AVX-512 gating, Windows case-collisions
Closes #437: VectorDb::delete in ruvector-router-core acquired the stats
RwLock twice in one statement. parking_lot::RwLock is non-reentrant, so
the second .write() deadlocked against the first guard's lifetime. Bind
the guard once.
Closes #438: Gate AVX-512 intrinsics behind a new `simd-avx512` Cargo
feature (default-on). Lets downstream consumers on stable Rust 1.77–1.88
(before avx512f stabilization in 1.89) opt out without forcing nightly:
cargo build --no-default-features --features simd,storage,hnsw,api-embeddings,parallel
Runtime dispatch falls back to AVX2 + FMA when the feature is disabled.
All 4 #[target_feature(enable = "avx512f")] sites + 4 dispatch branches
updated. Both feature configurations verified to compile cleanly; all
18 simd_intrinsics tests pass.
Closes #458: Rename two pairs of case-colliding research artifacts under
docs/research/claude-code-rvsource/versions/v2.1.x/tree/react_memo_cache_sentinel/
that broke `git clone` on Windows/NTFS:
tmux.js → tmux_lc.js (TMUX.js kept)
type.js → type_lc.js (Type.js kept)
modules-manifest.json updated to match.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(brain): observable hydration + larger page-error budget (issue #464)
Bisect outcome: source diff between the 2026-04-14 working revision
(00203-brv → 22,005 memories) and current main (00204-92l → 10,227)
is whitespace-only (cargo fmt 2026-04-24 + clippy 2026-04-25). No
semantic change in store.rs, types.rs, or graph.rs. BrainMemory schema
is byte-identical. So the regression is environmental, surfacing
through a code path that has no observability today.
Two changes:
1. load_from_firestore() now emits per-collection counters so the next
deploy is diagnosable instead of a black box:
Hydrate brain_memories: considered=N accepted=M rejected_parse=K
First 5 parse errors are logged with the serde_json error so any
live schema drift surfaces immediately.
2. firestore_list MAX_PAGE_ERRORS raised 3 → 8. Hydration crosses ~75
pages of 300 docs each; 3 transient OAuth-refresh blips at the
wrong moment terminated the load at ~10K, consistent with the
reported 10,227 number. 8 still bounds runaway behaviour while
tolerating realistic blip rates.
The actual environmental cause is recoverable from one deploy with the
new logs in place. Until then, traffic stays on 00203-brv (which is
what the rollback already did).
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(router-core): HNSW result-heap inversion, prune drops oldest, k > ef_search (#430)
Three correctness bugs in crates/ruvector-router-core/src/index.rs that
together collapsed recall@1 at scale:
1. `Neighbor::Ord` is reversed so BinaryHeap acts as a min-heap. Correct
for `candidates` (pop closest unexplored first), but WRONG for the
`result` heap — peek returned the BEST candidate, so the eviction
path kept dropping the best item instead of the worst whenever the
set was full. Wrap result in `std::cmp::Reverse<Neighbor>` so
peek/pop return the furthest item (the actual eviction target). This
is the primary recall@1 fix.
2. Per-insert connection pruning used `truncate(m)`, which keeps the
OLDEST m connections — including dropping the just-pushed edge when
it landed past index m. Switch to `drain(0..len-m)` so the freshly
inserted edge always survives.
3. `search()` capped at `ef_search` regardless of caller's k. With
default ef_search=10 and k=25, results were silently 10. Raise ef
to `max(ef_search, k)` before invoking search_knn_internal.
New tests:
- `test_recall_at_1_with_biased_insertion_order`: 1024 vectors,
biased insertion order (the topology that historically exposed the
bug); asserts recall@1 ≥ 95% AND ≥ 80% distinct ids across queries.
- `test_k_exceeds_ef_search_default`: 50 vectors, default ef_search=10,
k=25; asserts 25 results returned.
All 19 router-core tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(npm): publish pipeline — dist/ guaranteed + dual ESM/CJS pi-brain (#462/#415/#376/#372)
@ruvector/pi-brain 0.1.1 → 0.1.2 (closes #462, #372):
* Add `prepack` hook so dist/ is always built before publish — tarballs
on 0.1.0/0.1.1 shipped without dist/ because `tsc` never ran.
* Add a second tsconfig (tsconfig.cjs.json) that emits CommonJS to
dist/cjs/ alongside the ESM build in dist/. A generated
dist/cjs/package.json carries {"type":"commonjs"} so Node treats
that subtree as CJS regardless of the package-level "type":"module".
* Expand the exports map with import + require + default conditions
so ruvector@0.2.x's CJS MCP server (Node 20.x, no require(ESM)
until 22.12) can require() the package. Add subpath exports for
./mcp and ./client.
* Verified locally: dist/cjs/index.js loads via `require()` and
dist/index.js loads via dynamic `import()`.
@ruvector/rvf-wasm 0.1.5 → 0.1.6 (closes #415):
* pkg/rvf_wasm.js contains ESM syntax (`import.meta.url`,
`export default`). The old exports map pointed `require` at this
file, which fails on every CJS consumer. Mark the package
explicitly `"type": "module"`, drop the `require` condition (the
`.mjs` build is the canonical one), and add a `./wasm` subpath for
consumers that want the raw bytes.
ruvector npm 0.2.25 (extends #376 mitigation):
* Add `prepack` mirroring `prepublishOnly` so `npm pack` (and CI
smoke tests that run pack) regenerate dist/ + run verify-dist.
Without this, `npm pack` skips prepublishOnly, masking
missing-dist regressions until publish.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(mcp): hooks_route_enhanced in-process — drop spawnSync (#463/#422)
The hooks_route_enhanced MCP tool shelled out via
execSync('npx ruvector hooks route-enhanced …', { timeout: 30000 })
which deterministically timed out: npx's package-resolution and
bin-launch overhead can spike past 30s on cold-cache machines, even
though the underlying work finishes in ~500ms. Callers got
deterministic `spawnSync /bin/sh ETIMEDOUT`.
The sibling hooks_route tool (reported as working in #463) uses
intel.route() directly. Mirror that pattern: call intel.route(), then
inline the same coverage-router + AST-parser signal enrichment the CLI
does. No subprocess, no timeout, no npx dependency.
Falls back gracefully when coverage-router or ast-parser aren't
installed (try/catch around each optional enhancement, same as the
CLI handler).
Co-Authored-By: claude-flow <ruv@ruv.net>
* ci: regression guard for 9 issues + fixes for 5 latent regressions it surfaced
New workflow .github/workflows/regression-guard.yml runs on every push +
PR. Each job pins one of these issue classes shut:
#437 reentrant-rwlock-double-write
Forbids `x.write()…x.(write|read)()` and `x.read()…x.write()` in
a single statement (parking_lot is non-reentrant). PCRE
backreference matches only same-lock cases.
#458 case-insensitive-collisions
Fails if `git ls-files` has any two paths that match after
lowercasing — Windows clones drop one of each silently.
#438 ruvector-core-no-avx512-builds-on-stable
cargo check ruvector-core with AND without the simd-avx512
feature so the AVX-512 gating doesn't regress.
#430 hnsw-recall-at-1
Runs the new recall@1 (biased insertion / 1024 vectors) test
and the k > ef_search test in release mode.
#462 / #376 npm-publish-pipeline
npm pack each shipped package and assert every entry referenced
by main/module/types/exports is actually inside the tarball.
#463 / #422 no-npx-execSync-in-mcp-server
Forbids execSync('npx ruvector …') anywhere in the MCP server.
#256 shell-injection-in-mcp-server
Flags any exec*/spawn* call that interpolates ${args.X} without
wrapping in sanitizeShellArg(...).
#267 no-systemtime-in-wasm-crates
Crates named *wasm* with ungated SystemTime::now / Instant::now
calls are rejected (the wasm32-unknown-unknown panic class).
#359 no-hardcoded-workspaces-paths
Devcontainer-only `/workspaces/ruvector` literals are banned
from .github/workflows, .claude/settings*, and scripts/publish/.
Adding the guard surfaced five real, already-present regressions of
these classes — fixed in this commit:
* crates/prime-radiant/src/coherence/engine.rs (3 sites):
self.stats.write().X = self.stats.read().X - 1 in the same
statement — exactly issue #437's shape on a different lock. Bind
the write guard once.
* crates/ruvector-wasm/src/lib.rs:465 (benchmark fn):
used std::time::Instant which panics on wasm32 (issue #267).
Switch to js_sys::Date::now().
* scripts/publish/publish-router-wasm.sh + check-and-publish-router-wasm.sh:
hardcoded /workspaces/ruvector paths (issue #359). Resolve REPO_ROOT
from BASH_SOURCE instead.
Co-Authored-By: claude-flow <ruv@ruv.net>
* ci: narrow scope of two guards to avoid pre-existing-debt false positives
After the first PR run two guards caught existing technical debt rather
than fresh regressions:
* no-npx-execSync-in-mcp-server flagged 10 other execSync('npx
ruvector …') sites (ast-analyze, coverage-route, graph-mincut,
security-scan, git-churn, …) which predate issue #463 and are a
distinct concern (some legitimately need subprocess). Narrow the
guard to the EXACT regression — execSync inside the
hooks_route_enhanced case body — using awk to extract that case's
body before grepping. Rename: no-npx-execSync-in-route-enhanced.
* npm-publish-pipeline failed at npm install (peer-dep ERESOLVE).
Add --legacy-peer-deps. The point of this guard is the tarball
content, not the install graph.
Co-Authored-By: claude-flow <ruv@ruv.net>
* style: cargo fmt --all (mechanical, pre-existing diffs on main + my new code)
Workspace had 11 files with rustfmt diffs predating this branch, plus
one new diff in store.rs from the hydration counters added in
|
||
|---|---|---|
| .. | ||
| benchmark | ||
| build | ||
| ci | ||
| deploy | ||
| lib | ||
| patches/hnsw_rs | ||
| publish | ||
| test | ||
| training | ||
| validate | ||
| analyze-evolution.js | ||
| analyze-ham10000.js | ||
| build-solver.sh | ||
| check_brain_status.sh | ||
| claude-code-decompile.sh | ||
| claude-code-rvf-corpus.sh | ||
| create-brainpedia.py | ||
| deploy-crawl-phase1.sh | ||
| deploy-dragnes.sh | ||
| deploy-gemini-agents.sh | ||
| deploy-wet-job.sh | ||
| deploy_brain_services.sh | ||
| deploy_trainer.sh | ||
| discover_and_train.sh | ||
| gemini-agents.js | ||
| generate-rvf-manifest.py | ||
| historical-crawl-import.sh | ||
| publish-rvf.sh | ||
| README.md | ||
| rebuild-all-versions.mjs | ||
| run_mincut_bench.sh | ||
| seed-brain-all.py | ||
| seed-brain.rs | ||
| seed-specialized.py | ||
| setup-gcs-examples.sh | ||
| sql-audit-v3.sql | ||
| swarm_train_15.sh | ||
| sync-lockfile.sh | ||
| train-lora.py | ||
| train_brain.sh | ||
| training_orchestrator.sh | ||
| upvote_memories.py | ||
| vote-boost.py | ||
| wet-filter-inject.js | ||
| wet-full-import.sh | ||
| wet-job.yaml | ||
| wet-orchestrate.sh | ||
| wet-processor.sh | ||
RuVector Automation Scripts
This directory contains automation scripts organized by purpose.
📁 Directory Structure
scripts/
├── README.md # This file
├── benchmark/ # Performance benchmarking
├── build/ # Build utilities
├── ci/ # CI/CD automation
├── deploy/ # Deployment scripts
├── patches/ # Patch files
├── publish/ # Package publishing
├── test/ # Testing scripts
└── validate/ # Validation & verification
🚀 Deployment
Scripts for deploying to production.
| Script | Description |
|---|---|
deploy/deploy.sh |
Comprehensive deployment (crates.io + npm) |
deploy/test-deploy.sh |
Test deployment without publishing |
deploy/DEPLOYMENT.md |
Full deployment documentation |
deploy/DEPLOYMENT-QUICKSTART.md |
Quick deployment guide |
Usage:
# Full deployment
./scripts/deploy/deploy.sh
# Dry run
./scripts/deploy/deploy.sh --dry-run
# Test deployment
./scripts/deploy/test-deploy.sh
📦 Publishing
Scripts for publishing packages to registries.
| Script | Description |
|---|---|
publish/publish-all.sh |
Publish all packages |
publish/publish-crates.sh |
Publish Rust crates to crates.io |
publish/publish-cli.sh |
Publish CLI package |
publish/publish-router-wasm.sh |
Publish router WASM package |
publish/check-and-publish-router-wasm.sh |
Check and publish router WASM |
Usage:
# Set credentials first
export CRATES_API_KEY="your-crates-io-token"
export NPM_TOKEN="your-npm-token"
# Publish all
./scripts/publish/publish-all.sh
# Publish crates only
./scripts/publish/publish-crates.sh
📊 Benchmarking
Performance benchmarking scripts.
| Script | Description |
|---|---|
benchmark/run_benchmarks.sh |
Run core benchmarks |
benchmark/run_llm_benchmarks.sh |
Run LLM inference benchmarks |
Usage:
# Run core benchmarks
./scripts/benchmark/run_benchmarks.sh
# Run LLM benchmarks
./scripts/benchmark/run_llm_benchmarks.sh
🧪 Testing
Testing and validation scripts.
| Script | Description |
|---|---|
test/test-wasm.mjs |
Test WASM bindings |
test/test-graph-cli.sh |
Test graph CLI commands |
test/test-all-graph-commands.sh |
Test all graph commands |
test/test-docker-package.sh |
Test Docker packaging |
Usage:
# Test WASM
node ./scripts/test/test-wasm.mjs
# Test graph CLI
./scripts/test/test-graph-cli.sh
✅ Validation
Package and build verification scripts.
| Script | Description |
|---|---|
validate/validate-packages.sh |
Validate package configs |
validate/validate-packages-simple.sh |
Simple package validation |
validate/verify-paper-impl.sh |
Verify paper implementation |
validate/verify_hnsw_build.sh |
Verify HNSW build |
Usage:
# Validate packages
./scripts/validate/validate-packages.sh
# Verify HNSW
./scripts/validate/verify_hnsw_build.sh
🔄 CI/CD
Continuous integration scripts.
| Script | Description |
|---|---|
ci/ci-sync-lockfile.sh |
Auto-fix lock files in CI |
ci/sync-lockfile.sh |
Sync package-lock.json |
ci/install-hooks.sh |
Install git hooks |
Usage:
# Install git hooks (recommended)
./scripts/ci/install-hooks.sh
# Sync lockfile
./scripts/ci/sync-lockfile.sh
🛠️ Build
Build utility scripts located in build/.
🩹 Patches
Patch files for dependencies located in patches/.
🚀 Quick Start
For Development
-
Install git hooks (recommended):
./scripts/ci/install-hooks.sh -
Run tests:
./scripts/test/test-wasm.mjs
For Deployment
-
Set credentials:
export CRATES_API_KEY="your-crates-io-token" export NPM_TOKEN="your-npm-token" -
Dry run first:
./scripts/deploy/deploy.sh --dry-run -
Deploy:
./scripts/deploy/deploy.sh
🔐 Security
Never commit credentials! Always use environment variables or .env file.
See deploy/DEPLOYMENT.md for security best practices.