kvcache-ai-ktransformers/ktransformers/ktransformers_ext/cuda
2025-02-22 09:00:09 +00:00
custom_gguf      optimize gguf dequant, save mem, support Q2_K   2025-02-22 06:13:01 +00:00
gptq_marlin      [ADD] support multi-gpu qlen>1 q5_k             2024-08-12 11:41:26 +00:00
binding.cpp      fix merge bug, this branch also padding Marlin  2025-02-22 09:00:09 +00:00
setup.py         [ADD] support multi-gpu qlen>1 q5_k             2024-08-12 11:41:26 +00:00
test_dequant.py  optimize gguf dequant, save mem, support Q2_K   2025-02-22 06:13:01 +00:00