Commit graph

2 commits

Author SHA1 Message Date
rUv
7407f78230 refactor(training): use ruvllm-native tooling instead of llama.cpp
- Rewrite run_calibration.py to use gguf Python package + llama-cpp-python
  prebuilt wheels instead of compiling llama.cpp from source
- Simplify Dockerfile: single-stage, pip install only, no CUDA compilation
  (build time: ~5min vs 20+min)
- Update ADR-129 with tooling decision section explaining ruvllm-native choice
- Remove llama-imatrix and llama-quantize binary dependencies

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 13:40:14 +00:00
rUv
f12e6c1584 feat: implement ADR-129 training pipeline and TurboQuant sidecar infra
Training tooling:
- release_gate.py: Automated 7-gate ship/no-ship checker (G1-G7)
- export_training_data.py: Dataset export with governance (schema,
  dedup, quality scoring, contamination check)
- contamination_check.py: 13-gram eval contamination detection
- run_calibration.py: Phase 1 imatrix + TurboQuant profiling
- run_sft.py: Phase 2 LoRA SFT + DPO training
- deploy_training.sh: Cloud Run job creation + Vertex AI setup
- Dockerfile: GPU training image (transformers + peft + trl)

Rust infrastructure:
- turboquant_profile.rs: .turboquant.json sidecar config loading,
  per-layer TQ config discovery, default profiles

Ref: ADR-129, #310

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 02:27:32 +00:00