8 commits

Author      SHA1        Message  Date
qiyuxinlin  38e841900d  Move KV cache creation to balance_serve  2025-04-18 10:10:07 +00:00
Atream      25cee5810e  add balance-serve, support concurrence  2025-03-31 22:55:32 +08:00
ceerrep     ee24eb8dc3  fix: fix server for triton kernel  2025-02-17 18:08:45 +08:00
ceerrep     bb0ccc7b1a  feat: add prefix cache for server  2025-02-17 00:10:55 +08:00
anyanqilin  2d67016d14  wjh-change  2024-11-04 14:02:19 +08:00
liam        dd1d8667f3  : refactor local_chat and fix message slice bug in server  2024-11-04 14:02:19 +08:00
TangJingqi  170b7a6001  fix server don't accept yaml path as param; fix server static cache device problem  2024-08-21 14:19:43 +08:00
chenxl      18c42e67df  Initial commit  2024-07-27 16:06:58 +08:00