vrr/kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2025-09-07 21:19:51 +00:00

Author	SHA1	Message	Date
Jiaqi Liao	05f6cede37	Merge pull request #943 from SkqLiao/main fix benchmark params for human eval benchmark	2025-03-20 18:49:34 +08:00
SkqLiao	6d4626a5d9	fix params	2025-03-20 18:48:51 +08:00
Atream	633af5d235	Update gate.py	2025-03-20 14:54:01 +08:00
SkqLiao	8cc4df980e	use DeepSeek V3 instead of R1 for benchmarking	2025-03-20 11:59:03 +08:00
Jiaqi Liao	32a91c78c1	Merge pull request #935 from SkqLiao/main Fix benchmarking slow issue on self-hosted actions	2025-03-20 10:14:37 +08:00
SkqLiao	19c824f9d0	change cpu-infer due to actual cpu cores on self-hosted server.	2025-03-20 10:10:52 +08:00
Jiaqi Liao	649489dc67	Merge pull request #931 from SkqLiao/main Add Human Eval Benchmark Test for CI/CD	2025-03-19 21:35:24 +08:00
SkqLiao	bc369b256c	add CI/CD for human eval score benchmarking	2025-03-19 21:25:21 +08:00
Atream	b453333f60	Update gate.py	2025-03-19 16:14:54 +08:00
Atream	44599229cd	Update gate.py	2025-03-19 12:16:48 +08:00
Atream	114995355b	fix-gate-compile	2025-03-19 11:27:18 +08:00
Atream	167506b779	Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml	2025-03-17 17:05:01 +08:00
Atream	c9a0c44213	Update DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml	2025-03-17 17:03:52 +08:00
liam	19f058ec9e	🔧 update multi-gpu-fp8-linear and multi-gpu marlin yaml	2025-03-17 15:08:12 +08:00
Azure-Tang	85c32fdd10	Fix rocm example yaml	2025-03-15 22:27:02 -04:00
Azure-Tang	4a31237346	fix rocm compilation	2025-03-15 12:34:03 -04:00
Atream	3934b9dfc1	rollback-triton-prefill	2025-03-15 14:21:21 +00:00
ZiWei Yuan	9b76cab1a5	Merge pull request #898 from kvcache-ai/develop-0.2.3post2 Release 0.2.3post2	2025-03-15 18:11:42 +08:00
liam	b5ef7c26dc	🔖 release v0.2.3post2	2025-03-15 18:04:10 +08:00
Azure	117a8d2f2a	fix compilation	2025-03-14 19:49:20 +00:00
SkqLiao	0f1684c28d	local chat for cicd test	2025-03-15 02:31:19 +08:00
Azure	3986e2d2cf	Merge pull request #178 from fxzjshm/hip [Feat] Port to ROCm/HIP	2025-03-15 02:31:07 +08:00
Azure-Tang	e5b001d76f	Update readme; Format code; Add example yaml.	2025-03-14 14:25:52 -04:00
Atream	a889288fc1	use compile for gate, slight performance improvement	2025-03-14 12:43:28 +00:00
Azure-Tang	ed8437413b	merge main; Add torch q8 linear	2025-03-14 05:52:07 -04:00
Atream	6f43bbe55f	fix-singleton	2025-03-14 04:16:53 +00:00
Lander-Hatsune	d166fb9f6e	cpuinfer: filter repeated backend instantiation	2025-03-10 22:03:04 +08:00
Atream	09c043d8a6	Merge pull request #842 from BITcyman/fix-openai_chat_completion [fix] thread context bug	2025-03-07 22:56:19 +08:00
BITcyman	08a8b553d6	[fix] thread context bug	2025-03-07 14:52:16 +00:00
Atream	f8c1821f1d	Update __init__.py	2025-03-07 22:08:48 +08:00
Atream	d453c320f1	fix flashinfer precision	2025-03-07 14:07:00 +00:00
BITcyman	299c4dca64	[update] support openai chat completion api	2025-03-07 08:51:09 +00:00
ZiWei Yuan	63b1c8525b	Merge pull request #820 from kvcache-ai/develop-0.2.3 Develop 0.2.3 ready to release	2025-03-06 14:46:09 +08:00
liam	8eeb6dd432	⚡ update compile option for avx512vpopcntdq	2025-03-06 12:18:04 +08:00
chenmz00	b2ba795cfd	fix: list models API Fix the list models API to match the corresponding OpenAI API format.	2025-03-05 21:49:27 +08:00
liam	9c343b4f71	🔖 release v0.2.3	2025-03-05 20:24:11 +08:00
liam	848fe8ab97	⚡ release v0.2.3	2025-03-05 20:21:04 +08:00
Azure	d7becadcf7	Merge branch 'develop-0.2.3' of https://github.com/kvcache-ai/ktransformers into develop-0.2.3	2025-03-05 09:26:23 +00:00
Azure	662c1e4c14	small fix about max new token	2025-03-05 09:25:41 +00:00
liam	dc10480ef6	⚡ add humaneval support	2025-03-04 20:54:49 +08:00
Yi Pan	01755a60c0	fix: wrong shape in KLinearMarlin.	2025-03-03 17:34:45 +08:00
Atream	8963ae7817	Update __init__.py	2025-03-03 16:49:50 +08:00
wang jiahao	48b9800790	Merge pull request #759 from 3wweiweiwu/fix_top_p_typo fix typo for top_p	2025-03-02 13:58:11 +08:00
1668068727@qq.com	7cdf8139f0	fix ollama api temperature bug	2025-03-02 13:55:26 +08:00
Wix Woo	3aa0cfc29d	fix typo for top_p	2025-03-01 20:15:36 +00:00
Atream	ca1dc1e7d1	Merge branch 'main' into main	2025-03-01 23:24:10 +08:00
宁鹏涛	71286ec1c0	Update local_chat.py Fix config.architectures[0] == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM" always being true	2025-03-01 21:52:48 +08:00
Atream	fa03ea48dd	Merge branch 'main' into feat-chunk-prefill-flashinfer	2025-03-01 11:35:09 +00:00
Atream	f35e8d41d8	support chunk prefill, support 139K context for 24G VRAM	2025-03-01 11:28:25 +00:00
liam	80e0536fb0	Merge branch 'main' of https://github.com/KMSorSMS/ktransformers into main	2025-03-01 00:12:21 +08:00

230 commits