kvcache-ai-ktransformers/ktransformers/util
2024-08-22 15:10:06 +08:00
cuda_graph_runner.py  [ADD] support multi-gpu qlen>1 q5_k               2024-08-12 11:41:26 +00:00
custom_gguf.py        [fix] f16 dequantize device ignored               2024-08-22 15:10:06 +08:00
textstream.py         Initial commit                                    2024-07-27 16:06:58 +08:00
utils.py              [feature] experts can be injected using CPUInfer  2024-08-14 16:10:54 +08:00