kvcache-ai-ktransformers/ktransformers/ktransformers_ext
wkgcass b2bff17775 fix numa cpu distribution
The NUMA node location is calculated from the total number of worker threads,
so we should always use the actual number of threads instead of a min()-capped value.
2025-02-26 14:49:57 +08:00
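A minimal C++ sketch of the idea behind this fix, with hypothetical names and numbers (this is not the actual cpu_backend code): if a worker thread's NUMA node is derived by splitting the thread-index range over the node count, then feeding a min()-capped thread count into that split maps the last threads onto a node index that does not exist, so the per-node distribution comes out wrong.

// Hypothetical illustration, not the ktransformers implementation.
#include <cstdio>

// Map worker `thread_id` onto one of `numa_node_count` nodes by splitting
// the thread-index range evenly over the given thread count.
static int numa_node_for_thread(int thread_id, int thread_count, int numa_node_count) {
    return thread_id * numa_node_count / thread_count;
}

int main() {
    const int numa_nodes     = 2;
    const int actual_threads = 6;  // threads actually spawned
    const int capped_threads = 4;  // e.g. min(actual_threads, some_limit)

    // With the capped count, threads 4 and 5 land on "node 2",
    // which does not exist on a 2-node machine.
    for (int t = 0; t < actual_threads; ++t) {
        std::printf("thread %d -> node %d (actual count) vs node %d (min()-capped count)\n",
                    t,
                    numa_node_for_thread(t, actual_threads, numa_nodes),
                    numa_node_for_thread(t, capped_threads, numa_nodes));
    }
    return 0;
}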
bench              [feature] release 0.1.3                               2024-08-28 16:11:43 +00:00
cmake              [feature] support python 310 and multi instruction    2024-07-31 13:58:17 +00:00
cpu_backend        fix numa cpu distribution                             2025-02-26 14:49:57 +08:00
cuda               Ensure backward compatibility with Torch 2.2          2025-02-24 21:55:30 +08:00
examples           [feature] release 0.1.3                               2024-08-28 16:11:43 +00:00
operators          optimize gguf dequant, save mem, support Q2_K         2025-02-22 06:13:01 +00:00
triton             fix fp8 multi gpu; update FQA                         2025-02-25 10:52:29 +00:00
CMakeLists.txt     feat: Support Moore Threads GPU                       2025-02-19 18:26:55 +08:00
ext_bindings.cpp   [feature] release 0.1.3                               2024-08-28 16:11:43 +00:00