Commit graph

12 commits

Author SHA1 Message Date
rUv
36f2599774 feat(training): source map extraction + v2 model (83.67% val accuracy)
- Extract 14,198 training pairs from 6,941 source maps in node_modules
- Train v2 model (4-layer, 192-dim, 6-head transformer, 1.9M params)
- Val accuracy: 83.67% (up from 75.72%), exact match: 12.3% (up from 0.1%)
- Export weights.bin (7.3MB) for Rust runtime inference
- Add decompiler dashboard (React + Tailwind + Vite)
- Add runnable RVF (7,350 vectors, 49 segments, witness chain)
- Update evaluate-model.py to support configurable model architectures
- All 13 Rust tests pass, all 45 RVF files have valid SFVR headers

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-03 04:57:47 +00:00
rUv
db988c90e5 feat(decompiler): pure Rust transformer inference — zero ML dependencies
transformer.rs (416 lines): complete forward pass in std Rust
- Multi-head self-attention with padding mask
- GELU activation, layer norm, softmax
- Loads weights from simple binary format (2.6MB)
- Zero external deps — just f32 math

neural.rs: Backend enum (Transformer/ONNX/Stub)
- .bin → pure Rust (always available, no feature flag)
- .onnx → ort (behind neural feature flag)
- .gguf/.rvf → stub for future RuvLLM integration

export-weights-bin.py: PyTorch → binary weight dump
- 42 tensors, 673,152 parameters, 2.6MB output

56 tests passing, zero warnings.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-03 02:41:47 +00:00
rUv
d5b3be56b8 feat(decompiler): ONNX Runtime neural inference + 8,226 training pairs
Neural inference (behind `neural` feature flag):
- Full ONNX Runtime integration via `ort` crate
- Loads .onnx models, encodes context as byte tensors
- Softmax confidence scoring, character-level decoding
- Falls back to pattern-based when model unavailable

Training data expansion: 1,602 → 8,226 pairs
- 200+ function names, 90+ class names, 170+ variable names
- 16 minifier styles, 5 context variations per entry
- Extracted identifier dictionaries (381 lines)

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-03 02:30:41 +00:00
rUv
84e1886451 feat(decompiler): GPU training pipeline for neural name inference (ADR-136)
Training pipeline:
- generate-deobfuscation-data.mjs: 1,200+ training pairs from fixtures + synthetic
- train-deobfuscator.py: 6M param transformer (3 layers, 4 heads, 128 embed)
- export-to-rvf.py: PyTorch → ONNX → GGUF Q4 → RVF OVERLAY
- launch-gpu-training.sh: GCloud L4 GPU (--local, --cloud-run, --spot)
- Dockerfile.deobfuscator: pytorch/pytorch:2.2.0-cuda12.1

Decompiler integration:
- NeuralInferrer behind optional `neural` feature flag
- model_path in DecompileConfig
- Falls through to pattern-based when model unavailable
- Zero binary impact without feature flag

All tests pass, cargo check clean with and without neural feature.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-03 02:08:19 +00:00
rUv
385eb17d08 feat(training): ADR-129 RuvLTRA training pipeline — calibration, SFT, benchmarks, HF publishing
* docs(adr): update ADR-129 — all phases executing, Phase 4 publishing complete

- Phase 1 Calibration: Complete (all 4 models, benchmarks uploaded to HF)
- Phase 2 SFT: Executing on L4 GPU (rank-16, 2 epochs)
- Phase 3 Benchmarks: Executing (release gates + L4 benchmark job)
- Phase 4 Publishing: Complete (TQ configs + benchmarks + README updates on HF)

Benchmark results (L4 GPU):
- ruvltra-small: 75.4 tok/s
- ruvltra-medium: 62.6 tok/s
- ruvltra-claude-code: 67.1 tok/s

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: add training pipeline and release gates to root README

Add Continuous Training & Optimization section (ADR-129) to the
capabilities table: nightly training, 7-gate release checks,
TurboQuant profiling, training corpus.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(training): include training corpus in Docker build context

The SFT job failed because merged_corpus.jsonl was not in the Docker
image. Copy it to scripts/training/data/training/ so it's included
in the COPY . /app/ step.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(training): handle raw text corpus format in SFT pipeline

The training corpus uses a flat 'text' field (brain memories, ADRs)
rather than chat messages or Alpaca instruction format. Add handler
that converts raw text to completion-style messages for SFT.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 07:58:07 -04:00
rUv
ace0b276d2 fix(training): use torch 2.5.1+cu124 (2.3.1 unavailable on cu124 index)
Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 14:26:28 +00:00
rUv
dc53af1640 fix(training): add libgomp1, optimize Dockerfile for cache + CUDA wheels
- Add libgomp1 (required by llama-cpp-python OpenMP)
- Use PyTorch cu124 index for proper CUDA wheel
- Set default CMD with --model-id for Cloud Run execution
- Consolidate pip installs for Docker layer cache efficiency

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 14:20:54 +00:00
rUv
7f3e0ea1dd fix(training): use prebuilt llama-cpp-python CUDA wheel
The pip install of llama-cpp-python from source requires ninja + cmake
for CUDA compilation. Use the prebuilt wheel from the cu124 index instead.
Falls back to source install, then transformers-only mode.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 13:44:48 +00:00
rUv
7407f78230 refactor(training): use ruvllm-native tooling instead of llama.cpp
- Rewrite run_calibration.py to use gguf Python package + llama-cpp-python
  prebuilt wheels instead of compiling llama.cpp from source
- Simplify Dockerfile: single-stage, pip install only, no CUDA compilation
  (build time: ~5min vs 20+min)
- Update ADR-129 with tooling decision section explaining ruvllm-native choice
- Remove llama-imatrix and llama-quantize binary dependencies

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 13:40:14 +00:00
rUv
58f620ca28 fix(training): use 3600s timeout for GPU Cloud Run jobs
GPU-enabled Cloud Run jobs have a maximum timeout of 1 hour.
The previous 7200s (2hr) setting was rejected by the API.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 12:21:58 +00:00
rUv
737b18e772 feat: add nightly continuous learning pipeline (ADR-129)
- nightly_train.sh: 5-phase nightly pipeline (export brain learnings,
  contamination check, incremental LoRA, release gates, push to HF)
- Updated deploy_training.sh with nightly Cloud Run job + scheduler
- Updated ADR-129 with nightly continuous learning section

Schedule: daily 03:00 UTC, ~$4/day, skips if <10 new records.
All 7 release gates must pass before publishing.

Ref: #310

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 02:30:25 +00:00
rUv
f12e6c1584 feat: implement ADR-129 training pipeline and TurboQuant sidecar infra
Training tooling:
- release_gate.py: Automated 7-gate ship/no-ship checker (G1-G7)
- export_training_data.py: Dataset export with governance (schema,
  dedup, quality scoring, contamination check)
- contamination_check.py: 13-gram eval contamination detection
- run_calibration.py: Phase 1 imatrix + TurboQuant profiling
- run_sft.py: Phase 2 LoRA SFT + DPO training
- deploy_training.sh: Cloud Run job creation + Vertex AI setup
- Dockerfile: GPU training image (transformers + peft + trl)

Rust infrastructure:
- turboquant_profile.rs: .turboquant.json sidecar config loading,
  per-layer TQ config discovery, default profiles

Ref: ADR-129, #310

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 02:27:32 +00:00