Commit graph

862 commits

Author SHA1 Message Date
qiyuxinlin
74bb7fdcf6 Merge remote-tracking branch 'dev/support-amx-2' 2025-04-28 18:46:51 +00:00
qiyuxinlin
be4b27e841 update doc 2025-04-28 18:24:15 +00:00
djw
33cbd47086 support qwen3 2025-04-28 18:15:35 +00:00
djw
68c2b2e6e6 support qwen3 2025-04-28 18:02:07 +00:00
djw
0da3792b27 support qwen3 2025-04-28 14:05:24 +00:00
djw
3f9bbf1181 support qwen3, dont speak human language 2025-04-28 08:44:47 +00:00
Chengyu Qiu
ba92cf1a3b
Merge pull request #1204 from emmanuel-ferdman/main
Change install.md and Update reference to optimize rules directory
2025-04-28 15:10:14 +08:00
Chen Hongtao
27e3b2b98d
Merge pull request #1202 from PC-DOS/main
Replaced Chinese comments in iqk_mul_mat.inc with English to avoid breaking MSVC compiling
2025-04-27 14:36:37 +08:00
Emmanuel Ferdman
cb80cb31a6
Update reference to optimize rules directory
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-04-26 01:43:18 -07:00
PC-DOS
5379e68f19 Replaced Chinese comments with English to avoid breaking MSVC compiling 2025-04-26 03:20:23 +08:00
PC-DOS
5b4d9c41ac Replaced Chinese comments with English to avoid breaking MSVC compiling 2025-04-26 03:18:01 +08:00
chenht2022
f3d842a0ca support AMX 2025-04-25 14:47:16 +00:00
ZiWei Yuan
a7b995365e
Merge pull request #1197 from jizhilong/jizhilong-patch-1
fix: make cpufeature a local import
2025-04-25 14:50:58 +08:00
liam
82920e7943 :spakles: update requirements for cpufeature 2025-04-25 06:49:56 +00:00
wang jiahao
b90362b5e6
Merge pull request #1198 from kvcache-ai/fix-max_new_tokens
fix load default max_new_tokens
2025-04-25 12:22:41 +08:00
qiyuxinlin
7af83f9efb fix load default max_new_tokens 2025-04-25 04:20:12 +00:00
jzl
9a759e9fb8
fix: make cpufeature a local import 2025-04-25 11:42:38 +08:00
Atream
67042d11e3
Merge pull request #1193 from kvcache-ai/fix-chat-template-encoding
fix chat template encoding
2025-04-23 22:44:46 -06:00
Atream
46493789eb
fix chat template encoding 2025-04-24 12:44:16 +08:00
wang jiahao
449a83dff6
Merge pull request #1183 from kvcache-ai/check-para
add check-para
2025-04-23 16:27:18 +08:00
Alisehen
f7d939313b Merge remote-tracking branch 'origin/main' into check-para 2025-04-23 02:40:14 +00:00
Alisehen
99540ad01f add check parameters 2025-04-23 02:38:43 +00:00
wang jiahao
7e4813e8ad
Merge pull request #1184 from kvcache-ai/update_param
change test
2025-04-22 20:55:11 +08:00
qiyuxinlin
3a044e6b14 change test 2025-04-22 12:50:39 +00:00
Alisehen
c995bdbbfa add check-para 2025-04-22 09:30:08 +00:00
wang jiahao
739358789e
Merge pull request #1182 from kvcache-ai/fix-kill-balance_serve
kill serve lead to kill sched and engine
2025-04-22 17:28:06 +08:00
qiyuxinlin
4f9950e30c kill serve lead to kill sched and engine 2025-04-22 09:25:44 +00:00
wang jiahao
4c41f3a35f
Merge pull request #1180 from kvcache-ai/update_param
update speed test
2025-04-22 15:39:57 +08:00
qiyuxinlin
b17ab8653c update speed test 2025-04-22 07:38:05 +00:00
wang jiahao
485588017b
Merge pull request #1177 from kvcache-ai/update_param
Update param
2025-04-22 10:14:36 +08:00
qiyuxinlin
f5287e908a fix no balance_serve import error 2025-04-22 02:11:18 +00:00
qiyuxinlin
03a65d6bea roll back ktransformers backend, add max_tokens, max_completion_tokens param 2025-04-21 12:55:37 +00:00
wang jiahao
a1162eea01
Merge pull request #1158 from Creeper-MZ/function_call
Update Function call
2025-04-19 16:31:37 +08:00
Creeper-MZ
133ba746e9 Optimize prompts to resolve some Deepseek r1 compatibility issues

fix non stream
2025-04-19 01:20:27 -04:00
Atream
34c199403b
Merge pull request #1170 from onepick/fix-cmake-error
Fix cmake config error
2025-04-18 07:51:03 -06:00
wang jiahao
0892d37d2d
Merge pull request #1172 from kvcache-ai/move_create_sched
Move KV cache creation to balance_serve
2025-04-18 18:19:29 +08:00
qiyuxinlin
38e841900d Move KV cache creation to balance_serve 2025-04-18 10:10:07 +00:00
onepick
c5edd3fdf0 Fix cmake config error
Signed-off-by: onepick <jiajuku12@163.com>
2025-04-18 15:43:03 +08:00
Atream
e44c45e782
Merge pull request #1163 from cyhasuka/main
Enh: Make Ollama perf data more accurate, consistent with OpenAI's implementation
2025-04-18 00:50:58 -06:00
Atream
08f0bd5e13
Merge pull request #1168 from kvcache-ai/Atream-patch-1
remove hard code max_length
2025-04-17 22:40:28 -06:00
Atream
e6fb4d5a58
remove hard code max_length 2025-04-18 12:11:18 +08:00
Jianwei Dong
22a30d707d
Merge pull request #1167 from kvcache-ai/update-llama4-tutorial-patch-1
update llama4 tutorial
2025-04-18 11:44:11 +08:00
djw
dfaf2b20fb update llama4 tutorial 2025-04-18 03:42:48 +00:00
Creeper-MZ
62c4023160 Fixed #1155 2025-04-17 10:21:51 -04:00
Yuhao Tsui
eff5bbc202
Merge branch 'kvcache-ai:main' into main 2025-04-17 22:01:31 +08:00
Creeper-MZ
4fb19bfcae Update chat.py 2025-04-17 09:19:14 -04:00
ZiWei Yuan
8770b6d573
Merge pull request #1159 from onepick/fix-rocm-build-error
Fix some build error for ROCM
2025-04-17 19:57:44 +08:00
onepick
6a7624fe4a Change the logic to build device since cuda is as default
Signed-off-by: onepick <jiajuku12@163.com>
2025-04-17 19:44:05 +08:00
Yuhao Tsui
8ce34b3b5c
Modify the performance calculation module
Modify the performance data calculation module from estimation to retrieving from `raw_usage`.
2025-04-17 16:57:53 +08:00
wang jiahao
6e4da83d4b
Merge pull request #978 from cyhasuka/main
Feat: Support Non-streaming chat in Ollama backend
2025-04-17 14:34:35 +08:00