ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (#12332)

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-06-01 14:29:33 +00:00

* Add block interleaving support for Q4_K quantization

* Remove whitespaces and fix CI/CD issues

* Update pointer of bsums from int16_t to const int16_t

* Add vector version of quantize_q8_K_4x8 function

* Update code formatting based on review comments
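
The diff for this commit is not shown on this page, so the following is only a generic sketch of what "block interleaving" usually means, not the code added to `ggml-cpu-aarch64.cpp`. The helper name `interleave_blocks`, the byte-vector representation of a block, and the chunk size are all assumptions made for illustration. The idea is to take the same-index block from several rows and store them chunk by chunk in one contiguous buffer, so a SIMD kernel can load data from several rows with one sequential read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ggml): interleave one equally sized byte
// block from each row, `chunk` bytes at a time.
// Output layout: row0[0..chunk), row1[0..chunk), ..., row0[chunk..2*chunk), ...
std::vector<uint8_t> interleave_blocks(const std::vector<std::vector<uint8_t>> & rows,
                                       std::size_t chunk) {
    assert(!rows.empty() && chunk > 0);
    const std::size_t block_bytes = rows[0].size();
    assert(block_bytes % chunk == 0);  // each block must split into whole chunks

    std::vector<uint8_t> out;
    out.reserve(block_bytes * rows.size());

    for (std::size_t off = 0; off < block_bytes; off += chunk) {
        for (const auto & row : rows) {
            assert(row.size() == block_bytes);  // all rows contribute same-size blocks
            const auto first = row.begin() + static_cast<std::ptrdiff_t>(off);
            out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(chunk));
        }
    }
    return out;
}
```

With two 4-byte rows `{0,1,2,3}` and `{10,11,12,13}` and a chunk size of 2, the sketch produces `{0,1,10,11,2,3,12,13}`. A real quantized block also carries scales and other per-block fields, which this sketch ignores.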

Srihari-mcw 2025-03-20 17:05:34 +05:30 • committed by GitHub

parent 732b5fbf5e

commit 3d82dbcbce

No known key found for this signature in database

GPG key ID: B5690EEEBB952194

1 changed file with 1493 additions and 12 deletions

ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp (1505 lines changed)

File diff suppressed because it is too large.
