vrr/kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2025-09-09 05:54:06 +00:00

Author	SHA1	Message	Date
wkgcass	b2bff17775	fix numa cpu distribution. The numa node location would be calculated based on the total number of worker threads. So we should always use the actual number of threads instead of using a min() op.	2025-02-26 14:49:57 +08:00
Xiaodong Ye	18b1d18367	musa: support bf16 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-02-23 10:19:19 +08:00
Xiaodong Ye	2207f6cd14	feat: Support Moore Threads GPU Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-02-19 18:26:55 +08:00
liam	098602b08f	⚡ v0.2 ongoing	2025-02-09 22:41:14 +08:00
chenht2022	14869b55ad	Adapt Windows	2024-10-09 11:08:32 +00:00
Yap Sok Ann	6666d62237	Use cond var to avoid busy loop	2024-09-11 16:10:54 +07:00
chenxl	4d1d561d28	[feature] release 0.1.3	2024-08-28 16:11:43 +00:00
chenxl	650c368c18	Merge remote-tracking branch 'upstream/main' into develop-0.1.2	2024-08-12 12:31:49 +00:00
Atream	3c675af61a	Update task_queue.h	2024-08-12 20:06:19 +08:00
chenxl	f5f79f5c0e	[ADD] support multi-gpu qlen>1 q5_k	2024-08-12 11:41:26 +00:00
chenht2022	c1cc7d2cd2	1) Linear and MLP operators support qlen>1; 2) All operators now share a single memory buffer; 3) Refactor CPUInfer submit/sync logic.	2024-08-08 09:04:36 +00:00
chenxl	1d9d397525	fix some bug in compile in linux	2024-08-08 15:34:19 +08:00
Atream	0a2fd52cea	support windows; support q4_0 and q5_0 dequant on cpu; Add CopyRight from pygguf (It was added before, but disappear after merge); Add some TODO in the code.	2024-08-08 15:34:02 +08:00
chenxl	18c42e67df	Initial commit	2024-07-27 16:06:58 +08:00

14 commits