koboldcpp/ggml
neha-ha a6cc43c286
ggml-webgpu: updated matrix-vector multiplication (#21738)
* merged properly, but slow q3_k and q5_k with u32 indexing

* Start on new mat-vec

* New format float paths working

* Working q4_0

* Work on remaining legacy q-types

* port k-quants to new matvec

* remove old shader

* Remove old constants, format

* remove accidental file

---------

Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
2026-04-20 07:37:17 -07:00
cmake           ggml: backend-agnostic tensor parallelism (experimental) (#19378)  2026-04-09 16:42:19 +02:00
include         CUDA: manage NCCL communicators in context (#21891)                2026-04-15 15:58:40 +02:00
src             ggml-webgpu: updated matrix-vector multiplication (#21738)         2026-04-20 07:37:17 -07:00
.gitignore      vulkan : cmake integration (#8119)                                 2024-07-13 18:12:39 +02:00
CMakeLists.txt  cmake: remove CMP0194 policy to restore MSVC builds (#21934)       2026-04-19 10:25:05 +03:00