vrr/kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2025-09-14 09:09:42 +00:00

Author	SHA1	Message	Date
Atream	44599229cd	Update gate.py	2025-03-19 12:16:48 +08:00
Atream	114995355b	fix-gate-compile	2025-03-19 11:27:18 +08:00
Atream	167506b779	Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml	2025-03-17 17:05:01 +08:00
Atream	c9a0c44213	Update DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml	2025-03-17 17:03:52 +08:00
liam	19f058ec9e	🔧 update multi-gpu-fp8-linear and multi-gpu marlin yaml	2025-03-17 15:08:12 +08:00
Azure-Tang	85c32fdd10	Fix rocm example yaml	2025-03-15 22:27:02 -04:00
Azure-Tang	4a31237346	fix rocm compilation	2025-03-15 12:34:03 -04:00
Atream	3934b9dfc1	rollback-triton-prefill	2025-03-15 14:21:21 +00:00
ZiWei Yuan	9b76cab1a5	Merge pull request #898 from kvcache-ai/develop-0.2.3post2 Release 0.2.3post2	2025-03-15 18:11:42 +08:00
liam	b5ef7c26dc	🔖 release v0.2.3post2	2025-03-15 18:04:10 +08:00
Azure	117a8d2f2a	fix compilation	2025-03-14 19:49:20 +00:00
SkqLiao	0f1684c28d	local chat for cicd test	2025-03-15 02:31:19 +08:00
Azure	3986e2d2cf	Merge pull request #178 from fxzjshm/hip [Feat] Port to ROCm/HIP	2025-03-15 02:31:07 +08:00
Azure-Tang	e5b001d76f	Update readme; Format code; Add example yaml.	2025-03-14 14:25:52 -04:00
Atream	a889288fc1	use compile for gate, slight performance improvement	2025-03-14 12:43:28 +00:00
Azure-Tang	ed8437413b	merge main; Add torch q8 linear	2025-03-14 05:52:07 -04:00
Atream	6f43bbe55f	fix-singleton	2025-03-14 04:16:53 +00:00
Lander-Hatsune	d166fb9f6e	cpuinfer: filter repeated backend instantiation	2025-03-10 22:03:04 +08:00
Yuhao Tsui	e5694f91c0	Merge branch 'kvcache-ai:main' into main	2025-03-10 09:10:28 +08:00
Atream	09c043d8a6	Merge pull request #842 from BITcyman/fix-openai_chat_completion [fix] thread context bug	2025-03-07 22:56:19 +08:00
BITcyman	08a8b553d6	[fix] thread context bug	2025-03-07 14:52:16 +00:00
Atream	f8c1821f1d	Update __init__.py	2025-03-07 22:08:48 +08:00
Atream	d453c320f1	fix flashinfer precision	2025-03-07 14:07:00 +00:00
BITcyman	299c4dca64	[update] support openai chat completion api	2025-03-07 08:51:09 +00:00
ZiWei Yuan	63b1c8525b	Merge pull request #820 from kvcache-ai/develop-0.2.3 Develop 0.2.3 ready to release	2025-03-06 14:46:09 +08:00
liam	8eeb6dd432	⚡ update compile option for avx512vpopcntdq	2025-03-06 12:18:04 +08:00
Yuhao Tsui	d050d8655f	Update completions.py	2025-03-06 11:16:33 +08:00
chenmz00	b2ba795cfd	fix: list models API Fix the list models API to match the corresponding OpenAI API format.	2025-03-05 21:49:27 +08:00
liam	9c343b4f71	🔖 release v0.2.3	2025-03-05 20:24:11 +08:00
liam	848fe8ab97	⚡ release v0.2.3	2025-03-05 20:21:04 +08:00
Azure	d7becadcf7	Merge branch 'develop-0.2.3' of https://github.com/kvcache-ai/ktransformers into develop-0.2.3	2025-03-05 09:26:23 +00:00
Azure	662c1e4c14	small fix about max new token	2025-03-05 09:25:41 +00:00
liam	dc10480ef6	⚡ add humaneval support	2025-03-04 20:54:49 +08:00
Yi Pan	01755a60c0	fix: wrong shape in KLinearMarlin.	2025-03-03 17:34:45 +08:00
Atream	8963ae7817	Update __init__.py	2025-03-03 16:49:50 +08:00
wang jiahao	48b9800790	Merge pull request #759 from 3wweiweiwu/fix_top_p_typo fix typo for top_p	2025-03-02 13:58:11 +08:00
1668068727@qq.com	7cdf8139f0	fix ollama api temperature bug	2025-03-02 13:55:26 +08:00
Wix Woo	3aa0cfc29d	fix typo for top_p	2025-03-01 20:15:36 +00:00
Atream	ca1dc1e7d1	Merge branch 'main' into main	2025-03-01 23:24:10 +08:00
宁鹏涛	71286ec1c0	Update local_chat.py: fix config.architectures[0] == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM" always evaluating to true	2025-03-01 21:52:48 +08:00
Atream	fa03ea48dd	Merge branch 'main' into feat-chunk-prefill-flashinfer	2025-03-01 11:35:09 +00:00
Atream	f35e8d41d8	support chunk prefill, support 139K context for 24G VRAM	2025-03-01 11:28:25 +00:00
liam	80e0536fb0	Merge branch 'main' of https://github.com/KMSorSMS/ktransformers into main	2025-03-01 00:12:21 +08:00
liam	8ddc990668	⚡ fix server cache lens	2025-03-01 00:09:57 +08:00
Shuaiyi	a34a25d5cc	Delete unused code	2025-02-27 13:18:19 +00:00
wang jiahao	7a19f3b781	Merge pull request #721 from kvcache-ai/fix_temperature fix temperature	2025-02-27 21:01:21 +08:00
qiyuxinlin	22df52e94e	fix temperature	2025-02-27 21:00:44 +08:00
Atream	85e2cc7bf4	Merge pull request #719 from kvcache-ai/fix-use-generation-json use generation config from json file in official repo	2025-02-27 19:49:41 +08:00
Atream	e645d84794	use generation config from json file in official repo	2025-02-27 11:48:34 +00:00
lazymio	b121ca4df8	Fix according to upstream changes	2025-02-27 18:11:35 +08:00
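
Note on commit 71286ec1c0: the message describes a condition of the form `x == "A" or "B"` that is always true. The diff itself is not shown in this listing, so the following is only a minimal sketch of that class of bug and one common way to fix it; the function names `is_deepseek_buggy` and `is_deepseek_fixed` are hypothetical and do not come from the repository.

```python
def is_deepseek_buggy(arch: str) -> bool:
    # Python parses this as (arch == "DeepseekV2ForCausalLM") or "DeepseekV3ForCausalLM".
    # The right-hand operand is a non-empty string, which is truthy,
    # so the whole expression is truthy for every value of arch.
    return bool(arch == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM")


def is_deepseek_fixed(arch: str) -> bool:
    # A membership test compares arch against each candidate name.
    # (Assumed fix; the actual change in the commit may differ.)
    return arch in ("DeepseekV2ForCausalLM", "DeepseekV3ForCausalLM")
```

With a non-DeepSeek architecture string, the first function still returns `True`, while the second returns `False`.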

323 commits