ruvector

mirror of https://github.com/ruvnet/RuVector.git synced 2026-05-25 23:24:03 +00:00

Author	SHA1	Message	Date
ruvnet	96d8fdc172	chore(workspace): cargo fmt - mechanical whitespace fix across 427 files. Pre-existing rustfmt drift across the workspace was blocking CI's `Rustfmt` check on PR #373 and PR #377. Running plain `cargo fmt` reformats 427 files; no semantic changes, no logic changes, no behavior changes, just what rustfmt already wanted. None of the touched files are in ruvector-rabitq, ruvector-rulake, or the new mirror-rulake workflow; those were already fmt-clean per the per-crate checks on commits `5a4b0d782`, `5f32fd450`, `f5003bc7b`. Drift is in cognitum-gate-kernel, mcp-brain, nervous-system, prime-radiant, ruqu-core, ruvector-attention, ruvector-mincut, ruvix/* and sub-crates, plus several examples. Verified post-fmt: cargo check -p ruvector-rabitq -p ruvector-rulake → clean; cargo clippy -p ... -p ... --all-targets -- -D warnings → clean; cargo test -p ... -p ... --release → 82/82 pass. Intentionally does NOT touch clippy drift: many more warnings (missing docs, precision-loss casts, too-many-args, unsafe-safety-docs) are spread across unrelated crates, each category a cross-cutting design decision that deserves its own review. With this commit Rustfmt CI goes green on PR #373 and PR #377. Clippy will still fail; that is the honest pre-existing state, left for a separate dedicated PR. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-04-24 10:44:02 -04:00
rUv	0092507646	feat(decompiler): LLM weight decompiler + API prober (ADR-138) Model weight decompilation: - GGUF v2/v3 parser (self-contained, no ruvllm dep) - Safetensors JSON header parser - Architecture inference from tensor shapes (GQA, FFN, vocab) - Tokenizer extraction, quantization detection - Witness chain for model provenance - 6 integration tests, behind `model` feature flag API probing (live tested): - Probes Claude, OpenAI, Gemini APIs without weight access - Detects: streaming, tools, system_prompt, vision capabilities - Measures: latency, tokens/sec, tokenizer type - Model fingerprinting via self-identification + math tests - Verified: Gemini 2.0 Flash (556ms, 46 tok/s, all caps detected) CLI: npx ruvector decompile --model file.gguf npx ruvector decompile --api gemini-2.0-flash 78 Rust tests passing. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-04-03 19:08:30 +00:00
rUv	6fe406aae5	feat(decompiler): graph-derived hierarchical folder structure (Phase 7) Folder structure emerges from the dependency graph — not hardcoded keywords. tree.rs (362 lines): - Agglomerative clustering on inter-module edge weights - TF-IDF naming: most discriminative strings name each folder - Recursive depth control (configurable max_depth, min_folder_size) inferrer.rs: infer_folder_name() with TF-IDF scoring types.rs: ModuleTree struct, hierarchical config options run_on_cli.rs: --output-dir prints folder tree to disk module-splitter.js: JS-side tree builder with same approach Key principle: tightly-coupled code shares a folder, MinCut boundaries become folder boundaries, names from context. 59 tests passing, zero warnings. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-04-03 03:26:54 +00:00
rUv	84e1886451	feat(decompiler): GPU training pipeline for neural name inference (ADR-136) Training pipeline: - generate-deobfuscation-data.mjs: 1,200+ training pairs from fixtures + synthetic - train-deobfuscator.py: 6M param transformer (3 layers, 4 heads, 128 embed) - export-to-rvf.py: PyTorch → ONNX → GGUF Q4 → RVF OVERLAY - launch-gpu-training.sh: GCloud L4 GPU (--local, --cloud-run, --spot) - Dockerfile.deobfuscator: pytorch/pytorch:2.2.0-cuda12.1 Decompiler integration: - NeuralInferrer behind optional `neural` feature flag - model_path in DecompileConfig - Falls through to pattern-based when model unavailable - Zero binary impact without feature flag All tests pass, cargo check clean with and without neural feature. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-04-03 02:08:19 +00:00
rUv	1c8bec729e	fix(decompiler): review fixes, benchmarks, real-world validation Bugs fixed: - assert!() in witness verification → proper Err return - Swapped property-to-name mappings in inferrer - Escape sequences in beautifier indent_braces - Doc comments: SHAKE-256 → SHA3-256 (correct hash function) Performance: - Cached regex compilation via once_cell::Lazy (7 regexes) - HashSet for O(1) lookups (was Vec O(n)) - Optimized hex encoding with lookup table - Added ES module export support Benchmarks (criterion): - 1KB: 58μs parse, 230μs pipeline - 10KB: 581μs parse, 1.7ms pipeline - 100KB: 5.4ms parse, 26.2ms pipeline - 1MB: 53.5ms parse (linear scaling) Real-world: Claude Code cli.js (10.53 MB): - 27,477 declarations, 601,653 edges - 1,344 HIGH confidence names (5.2%) - 5,843 MEDIUM confidence names (22.8%) - 24.6s total pipeline time OSS fixtures: lodash, express, redux with self-learning loop Co-Authored-By: claude-flow <ruv@ruv.net>	2026-04-03 00:47:13 +00:00
rUv	19578402e3	feat(decompiler): MinCut-based JS decompiler with witness chains (ADR-135) 5-phase decompilation pipeline: 1. Regex-based parser extracts declarations, strings, property accesses 2. MinCut graph partitioning detects original module boundaries 3. Name inference with confidence scoring (HIGH/MEDIUM/LOW) 4. V3 source map generation (browser DevTools compatible) 5. SHAKE-256 Merkle witness chains for cryptographic provenance Ground-truth validation: - 5 test fixtures (Express, MCP Server, React, Multi-Module, Tools) - Self-learning feedback loop via learn_from_ground_truth() - 14 tests, all passing SOTA research document covering JSNice, DeGuard, cross-version fingerprinting, and RuVector's unique advantage combining MinCut, IIT Phi, SONA, and HNSW for decompilation. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-04-03 00:04:36 +00:00
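The witness chains named in the decompiler commits above can be pictured as a hash chain in which each pipeline phase commits to its own output and to the previous link. The following is a minimal sketch under stated assumptions, not the repository's implementation: `WitnessEntry`, `append`, and `verify` are hypothetical names, the structure is a plain linear chain rather than a Merkle tree, and the standard library's `DefaultHasher` is a non-cryptographic stand-in for the SHA3-256 digest the commits mention. Verification returns an `Err` instead of panicking, in the spirit of the `assert!()` fix described in commit 1c8bec729e.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One link in a hypothetical witness chain (illustrative, not the crate's type).
#[derive(Debug, Clone)]
struct WitnessEntry {
    phase: String,
    payload_digest: u64,
    prev_digest: u64,
    digest: u64,
}

/// Stand-in digest. ASSUMPTION: a real chain would use a cryptographic
/// hash such as SHA3-256; DefaultHasher is NOT collision resistant.
fn digest_bytes(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Digest of a link: binds the phase name, its payload digest,
/// and the digest of the previous link.
fn link_digest(phase: &str, payload_digest: u64, prev_digest: u64) -> u64 {
    let mut h = DefaultHasher::new();
    phase.hash(&mut h);
    payload_digest.hash(&mut h);
    prev_digest.hash(&mut h);
    h.finish()
}

/// Append a new link for `phase`, committing to `payload`.
fn append(chain: &mut Vec<WitnessEntry>, phase: &str, payload: &[u8]) {
    // The first link uses 0 as its "previous" digest.
    let prev_digest = chain.last().map_or(0, |e| e.digest);
    let payload_digest = digest_bytes(payload);
    let digest = link_digest(phase, payload_digest, prev_digest);
    chain.push(WitnessEntry {
        phase: phase.to_string(),
        payload_digest,
        prev_digest,
        digest,
    });
}

/// Recompute every link; report the first inconsistency as an error.
fn verify(chain: &[WitnessEntry]) -> Result<(), String> {
    let mut expected_prev = 0u64;
    for (i, e) in chain.iter().enumerate() {
        if e.prev_digest != expected_prev {
            return Err(format!("entry {} ({}) does not link to its predecessor", i, e.phase));
        }
        if e.digest != link_digest(&e.phase, e.payload_digest, e.prev_digest) {
            return Err(format!("entry {} ({}) has an inconsistent digest", i, e.phase));
        }
        expected_prev = e.digest;
    }
    Ok(())
}
```

Appending one entry per phase (parse, partition, name inference, and so on) and calling `verify` afterwards would flag any link whose recorded digests no longer match.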

6 commits
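Commit 0092507646 mentions a self-contained GGUF v2/v3 parser. As a rough illustration of what the first step of such a parser might look like, the sketch below reads the fixed-size GGUF header: the 4-byte magic `GGUF`, a `u32` version, then `u64` tensor and metadata key/value counts. The names `GgufHeader` and `parse_gguf_header` are hypothetical and not taken from the repository, and the sketch assumes little-endian files only.

```rust
/// Fixed-size GGUF header fields (illustrative struct, not the crate's type).
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

/// Read a little-endian u32 at `offset`. The caller guarantees bounds.
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[offset..offset + 4]);
    u32::from_le_bytes(buf)
}

/// Read a little-endian u64 at `offset`. The caller guarantees bounds.
fn read_u64_le(bytes: &[u8], offset: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[offset..offset + 8]);
    u64::from_le_bytes(buf)
}

/// Parse the 24-byte header at the start of a GGUF v2 or v3 file.
/// ASSUMPTION: little-endian layout; big-endian files are not handled.
fn parse_gguf_header(bytes: &[u8]) -> Result<GgufHeader, String> {
    if bytes.len() < 24 {
        return Err(format!("need at least 24 bytes, got {}", bytes.len()));
    }
    if &bytes[0..4] != b"GGUF" {
        return Err("bad magic: expected \"GGUF\"".to_string());
    }
    let version = read_u32_le(bytes, 4);
    if version != 2 && version != 3 {
        return Err(format!("unsupported GGUF version {}", version));
    }
    Ok(GgufHeader {
        version,
        tensor_count: read_u64_le(bytes, 8),
        metadata_kv_count: read_u64_le(bytes, 16),
    })
}
```

The metadata key/value pairs and tensor descriptors follow this header; architecture inference of the kind the commit describes would work from those later sections, which this sketch does not cover.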