ruvector/crates/ruvector-cnn/docs
Reuven 9f80f7298f feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization
Complete implementation of INT8 quantization for ruvector-cnn:

Phase 1 - Core Infrastructure:
- QuantizationParams, QuantizationScheme, QuantizationMode
- QuantizedTensor<i8> with quantize/dequantize methods
- CalibrationMethod (MinMax, Percentile, MSE, Entropy)
- 34 unit tests passing
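
A minimal sketch of the quantize/dequantize round trip with MinMax calibration, assuming asymmetric per-tensor INT8. The `Params` struct and the function names are hypothetical stand-ins for illustration, not the crate's actual `QuantizationParams` / `QuantizedTensor` API:

```rust
/// Hypothetical stand-in for the crate's quantization parameters
/// (asymmetric, per-tensor): real = (q - zero_point) * scale.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Params {
    pub scale: f32,
    pub zero_point: i32,
}

/// MinMax calibration: map the observed range, widened to include 0.0,
/// onto the INT8 range [-128, 127].
pub fn calibrate_min_max(data: &[f32]) -> Params {
    let min = data.iter().copied().fold(0.0f32, f32::min);
    let max = data.iter().copied().fold(0.0f32, f32::max);
    // Guard against a zero-width range (all-zero or empty input).
    let range = (max - min).max(f32::EPSILON);
    let scale = range / 255.0;
    let zero_point = ((-128.0 - min / scale).round() as i32).clamp(-128, 127);
    Params { scale, zero_point }
}

/// Quantize with round-to-nearest and saturation to the i8 range.
pub fn quantize(data: &[f32], p: Params) -> Vec<i8> {
    data.iter()
        .map(|&x| ((x / p.scale).round() as i32 + p.zero_point).clamp(-128, 127) as i8)
        .collect()
}

/// Map INT8 values back to approximate real values.
pub fn dequantize(q: &[i8], p: Params) -> Vec<f32> {
    q.iter()
        .map(|&v| (v as i32 - p.zero_point) as f32 * p.scale)
        .collect()
}
```

Because the calibrated range always contains 0.0, zero is exactly representable, which matters for zero padding in convolutions.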

Phase 2 - INT8 Kernels:
- Scalar reference: conv2d, depthwise_conv2d, matmul, requantize
- AVX2 SIMD: _mm256_maddubs_epi16 for 2-4x speedup
- ARM NEON: vmull_s8, vpadalq_s16 for 2-3x speedup
- WASM SIMD128: i8x16 operations for 1.5-2x speedup
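
A scalar reference for the matmul and requantize steps might look like the sketch below. It assumes symmetric quantization (zero-point 0 on both inputs), row-major layout, and a single float multiplier, conventionally `input_scale * weight_scale / output_scale`. The function names and signatures are illustrative and are not taken from the crate:

```rust
/// Scalar INT8 matmul: C[m x n] = A[m x k] * B[k x n], row-major,
/// with products accumulated in i32 to avoid overflow.
pub fn matmul_i8(a: &[i8], b: &[i8], m: usize, k: usize, n: usize) -> Vec<i32> {
    assert_eq!(a.len(), m * k, "A must be m x k");
    assert_eq!(b.len(), k * n, "B must be k x n");
    let mut c = vec![0i32; m * n];
    for i in 0..m {
        for p in 0..k {
            let a_ip = a[i * k + p] as i32;
            for j in 0..n {
                c[i * n + j] += a_ip * b[p * n + j] as i32;
            }
        }
    }
    c
}

/// Requantize i32 accumulators back to i8: scale, round to nearest,
/// add the output zero-point, then saturate to [-128, 127].
pub fn requantize(acc: &[i32], multiplier: f32, out_zero_point: i32) -> Vec<i8> {
    acc.iter()
        .map(|&v| {
            let scaled = (v as f32 * multiplier).round() as i32;
            (scaled + out_zero_point).clamp(-128, 127) as i8
        })
        .collect()
}
```

A SIMD kernel can be checked against a reference of this shape by comparing outputs element by element on the same inputs.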

Phase 3 - Graph Rewrite Passes:
- GR-1: BatchNorm fusion into Conv weights
- GR-2: Zero-point correction pre-computation
- GR-3: Q/DQ node insertion at FP32/INT8 boundaries
- GR-4: ReLU/HardSwish fusion with LUT
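
As an illustration of GR-1, the standard BatchNorm folding identity is w' = w * gamma / sqrt(var + eps) and b' = (b - mean) * gamma / sqrt(var + eps) + beta, applied per output channel. The sketch below assumes FP32 weights stored with the output channel as the outermost dimension; the `BatchNorm` struct and `fold_batch_norm` are hypothetical names, not the crate's pass:

```rust
/// Hypothetical per-channel BatchNorm parameters (inference mode).
pub struct BatchNorm {
    pub gamma: Vec<f32>,
    pub beta: Vec<f32>,
    pub mean: Vec<f32>,
    pub var: Vec<f32>,
    pub eps: f32,
}

/// Fold BatchNorm into the preceding convolution in place, so that
/// conv'(x) == bn(conv(x)) for every input x.
/// `weights` is laid out as [out_channels][everything else].
pub fn fold_batch_norm(weights: &mut [f32], bias: &mut [f32], bn: &BatchNorm) {
    let out_channels = bias.len();
    assert!(out_channels > 0, "need at least one output channel");
    assert_eq!(bn.gamma.len(), out_channels);
    assert_eq!(bn.beta.len(), out_channels);
    assert_eq!(bn.mean.len(), out_channels);
    assert_eq!(bn.var.len(), out_channels);
    assert_eq!(weights.len() % out_channels, 0);
    let per_channel = weights.len() / out_channels;

    for oc in 0..out_channels {
        // Per-channel multiplier applied by BatchNorm.
        let s = bn.gamma[oc] / (bn.var[oc] + bn.eps).sqrt();
        for w in &mut weights[oc * per_channel..(oc + 1) * per_channel] {
            *w *= s;
        }
        bias[oc] = (bias[oc] - bn.mean[oc]) * s + bn.beta[oc];
    }
}
```

Folding before calibration means the quantization ranges are computed on the weights that actually run at inference time.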

Phase 4 - Quantized Layers:
- QuantizedConv2d with per-channel quantization
- QuantizedDepthwiseConv2d for MobileNet
- QuantizedLinear for FC layers
- QuantizedMaxPool2d/AvgPool2d
- QuantizedResidualAdd with scale alignment
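
Scale alignment in a quantized residual add can be sketched as follows: the two INT8 inputs generally carry different scales, so each is mapped to a common real-valued domain before the sum is requantized to the output scale. This version assumes symmetric quantization (zero-point 0) and uses float arithmetic for clarity; `residual_add_i8` is a hypothetical helper, not the `QuantizedResidualAdd` implementation:

```rust
/// Add two symmetric INT8 tensors with different scales and
/// requantize the sum to `out_scale`, saturating to [-128, 127].
pub fn residual_add_i8(
    a: &[i8],
    a_scale: f32,
    b: &[i8],
    b_scale: f32,
    out_scale: f32,
) -> Vec<i8> {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");
    assert!(out_scale > 0.0, "output scale must be positive");
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| {
            // Bring both operands to real values, then add.
            let real = x as f32 * a_scale + y as f32 * b_scale;
            // Requantize to the output scale.
            ((real / out_scale).round() as i32).clamp(-128, 127) as i8
        })
        .collect()
}
```

An integer-only variant would replace the float scales with fixed-point multipliers, at the cost of a more involved rounding scheme.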

Phase 6 - Tests & Benchmarks:
- quality_validation.rs: cosine similarity ≥0.995
- acceptance_gates.rs: 7 ADR-091 gates
- kernel_equivalence.rs: SIMD vs scalar validation
- int8_bench.rs: Criterion benchmarks
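
A cosine-similarity quality check of the kind named above could be sketched like this, comparing FP32 reference outputs against dequantized INT8 outputs. The helper names are hypothetical and the actual contents of quality_validation.rs are not shown here; only the 0.995 threshold comes from the list above:

```rust
/// Cosine similarity between two equal-length vectors, accumulated
/// in f64 for stability. Returns 0.0 if either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (x, y) = (x as f64, y as f64);
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    (dot / (norm_a.sqrt() * norm_b.sqrt())) as f32
}

/// Illustrative gate: the quantized output passes if it stays within
/// the cosine-similarity threshold of the FP32 reference.
pub fn passes_quality_gate(fp32_out: &[f32], int8_out: &[f32]) -> bool {
    cosine_similarity(fp32_out, int8_out) >= 0.995
}
```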

Performance targets (relative to the FP32 baseline):
- 2.5x latency improvement (MobileNetV3)
- 4x memory reduction
- <1% accuracy degradation

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-12 14:45:52 -04:00
ADR-091-PHASE-2.1-COMPLETE.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00
ADR-091-PHASE-3-IMPLEMENTATION.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00
ADR-091-PHASE-4-IMPLEMENTATION.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00
ADR-091_PHASE_6_SUMMARY.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00
GRAPH_REWRITE_SUMMARY.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00
INT8_KERNELS_IMPLEMENTATION.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00
INT8_QUANTIZATION_DESIGN.md docs(cnn): add INT8 quantization design document 2026-03-12 10:22:35 -04:00
QUANTIZED_LAYERS_USAGE.md feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization 2026-03-12 14:45:52 -04:00