Atream | cadd55078f | Merge pull request #295 from kvcache-ai/update-wechatgroup (Update wechatgroup) | 2025-02-14 19:52:05 +08:00
Atream | e153d78227 | Add files via upload | 2025-02-14 19:51:11 +08:00
Atream | 96e6dff7ac | Delete WeChatGrouop.png | 2025-02-14 19:49:14 +08:00
Atream | 885a91e7db | Merge pull request #294 from kvcache-ai/feat-fast-MLA (Feat fast mla) | 2025-02-14 19:40:36 +08:00
Atream | 1084d4e4b4 | linux support triton MLA kernel | 2025-02-14 11:38:55 +00:00
Azure | 6738908699 | Merge pull request #280 from Azure-Tang/main ([fix] Fix incorrect image content in the document) | 2025-02-14 17:12:14 +08:00
Azure | 1b1f417267 | Fix incorrect image content in the document | 2025-02-14 09:04:22 +00:00
Azure | f4bb374eaf | Merge pull request #254 from Azure-Tang/main ([Update] Add V3/R1 8 gpu yaml example) | 2025-02-14 11:54:14 +08:00
Azure | 95c81eaf01 | Merge branch 'kvcache-ai:main' into main | 2025-02-14 11:53:52 +08:00
Atream | 0a9c59922a | Merge pull request #255 from kvcache-ai/update-wechatgroup (Add files via upload) | 2025-02-14 11:08:59 +08:00
Atream | ce7210321a | Add files via upload | 2025-02-14 11:06:56 +08:00
Azure | b7653b9c4f | add V3/R1 8 gpu yaml example | 2025-02-14 02:56:13 +00:00
Azure | e612b14739 | Merge pull request #247 from liugddx/patch-1 ([Doc]Fix dead link problem) | 2025-02-14 10:37:32 +08:00
Azure | ae5d9e11a9 | Merge pull request #227 from hrz6976/main (Add a lock to server inference()) | 2025-02-14 10:35:11 +08:00
Guangdong Liu | e65be580ab | Fix dead link problem | 2025-02-14 09:57:57 +08:00
Atream | bb35dc5b0d | init support for MLA using Attention kernel | 2025-02-13 15:01:14 +00:00
ZiWei Yuan | a456e25a54 | Merge pull request #200 from devin2255/main (add README_ZH.md) | 2025-02-13 22:22:25 +08:00
Hand Sonic | e490265242 | feat: add GitHub Actions workflow for building Docker image | 2025-02-13 22:09:49 +08:00
dhliu | d04b570fb5 | edit README_ZH.md && add DeepseekR1_V3_tutorial_zh.md | 2025-02-13 21:14:44 +08:00
Atream | aa21edd2fe | Merge pull request #230 from kvcache-ai/updata-wechatgroup-1 (Updata wechatgroup 1) | 2025-02-13 19:33:51 +08:00
Atream | 5fb9d65512 | Add files via upload | 2025-02-13 19:33:01 +08:00
Atream | ade346e09a | Delete WeChatGrouop.png | 2025-02-13 19:31:46 +08:00
Atream | 127965494c | Merge pull request #229 from kvcache-ai/updata-wechatgroup (Add files via upload) | 2025-02-13 19:31:13 +08:00
Atream | 30e8e6a32a | Add files via upload | 2025-02-13 19:30:39 +08:00
hrz6976 | 2c3dcd9774 | Add a lock to server inference() | 2025-02-13 10:05:22 +00:00
ZiWei Yuan | 76b081879a | Merge pull request #224 from kvcache-ai/server_support (Server support) | 2025-02-13 17:28:08 +08:00
liam | 8d5ebe49ab | 📝 ⚡ fix some debug output and update doc | 2025-02-13 17:25:12 +08:00
liam | ad2c52d72a | 📝 update doc | 2025-02-13 17:16:27 +08:00
Azure | 8324e7fd9b | Merge pull request #220 from TensorBlock/main (Add optimization config for Deepseek V3/R1 with 4 GPUs) | 2025-02-13 16:41:39 +08:00
liam | c74453d8ca | 📝 add doc support and fix bug in qwen2 | 2025-02-13 16:37:43 +08:00
MorphisZhang | aea4243712 | Add optimization config for Deepseek V3/R1 with 4 GPUs | 2025-02-13 16:32:28 +08:00
dhliu | 318c88cbeb | add README_ZH.md | 2025-02-13 12:43:06 +08:00
Atream | 8bad019ef2 | Merge pull request #180 from lusipad/patch-1 (doc: fix clerical error) | 2025-02-13 10:25:30 +08:00
Atream | 0905d2e270 | Merge pull request #189 from Kattos/main (fix typo in README.md) | 2025-02-13 10:24:01 +08:00
ZiWei Yuan | 9b5fd55a3c | Merge pull request #190 from kvcache-ai/KMSorSMS-patch-2 (Update README.md) | 2025-02-13 10:18:08 +08:00
ZiWei Yuan | 36ab3d7e6c | Update README.md (update png) | 2025-02-13 10:17:56 +08:00
cuichengyi | 01655f7500 | fix typo in README.md | 2025-02-13 10:12:04 +08:00
Atream | a0c16db352 | Merge pull request #183 from kvcache-ai/update-WeChatgroup (Update we chatgroup) | 2025-02-13 09:16:30 +08:00
Atream | 78cc219274 | Delete WeChatGrouop.jpg | 2025-02-13 09:15:57 +08:00
Atream | ea76f7910a | Add files via upload | 2025-02-13 09:15:30 +08:00
lusipad | 8384badc69 | doc: fix clerical error | 2025-02-13 07:27:27 +08:00
fxzjshm | c1f13a69ed | Correctly import compat layer from llama.cpp (Signed-off-by: fxzjshm <fxzjshm@163.com>) | 2025-02-13 03:15:22 +08:00
fxzjshm | 38e5dbc895 | Fix symbol lookup (Signed-off-by: fxzjshm <fxzjshm@163.com>) | 2025-02-13 03:14:35 +08:00
fxzjshm | ae76a729d8 | gptq_marlin: temporarily disable on AMD ROCm (Signed-off-by: fxzjshm <fxzjshm@163.com>) | 2025-02-13 02:03:22 +08:00
fxzjshm | 4cda45433f | Don't add CUDA version to version in case not for CUDA (Signed-off-by: fxzjshm <fxzjshm@163.com>) | 2025-02-13 00:59:28 +08:00
fxzjshm | 21fca5a326 | Add compat layer from llama.cpp (Signed-off-by: fxzjshm <fxzjshm@163.com>) | 2025-02-13 00:58:59 +08:00
Atream | 9a3d4c290c | Merge pull request #170 from feeeei/main (Update release date info) | 2025-02-12 18:27:11 +08:00
feeeei | e7dd5b250d | Update release date info | 2025-02-12 17:47:22 +08:00
Azure | 9e42f33c29 | Merge pull request #166 from kvcache-ai/update-yaml ([Update] Update FAQ to address common questions) | 2025-02-12 17:00:04 +08:00
Azure | 101db0e9de | Merge branch 'main' into update-yaml | 2025-02-12 08:56:03 +00:00