Commit graph

17 commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| qiyuxinlin | c6aa379de2 | support safetensor load, delete architectures argument | 2025-05-09 10:38:29 +00:00 |
| Atream | 7adb7281f4 | fix-cache-lens | 2025-04-30 03:37:43 +00:00 |
| djw | 33cbd47086 | support qwen3 | 2025-04-28 18:15:35 +00:00 |
| djw | 3f9bbf1181 | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |
| qiyuxinlin | 64de784328 | format kvc2, delete quant_configs, move model_configs to ~/.ktransformers | 2025-04-08 10:06:07 +00:00 |
| Atream | 25cee5810e | add balance-serve, support concurrence | 2025-03-31 22:55:32 +08:00 |
| Atream | f35e8d41d8 | support chunk prefill, support 139K context for 24G VRAM | 2025-03-01 11:28:25 +00:00 |
| ceerrep | f639fbc19e | feat: basic api key support | 2025-02-25 14:11:39 +08:00 |
| ceerrep | bb0ccc7b1a | feat: add prefix cache for server | 2025-02-17 00:10:55 +08:00 |
| liam | 4385e85096 | support force thinking | 2025-02-12 12:43:53 +08:00 |
| liam | 6f3a39be08 | update force_think config | 2025-02-12 12:10:16 +08:00 |
| liam | e536e1420d | update force_think | 2025-02-12 11:42:55 +08:00 |
| liam | 04cebec4bb | rm opt config path default value and fix some config logic bug | 2024-11-14 20:02:30 +08:00 |
| liam | a148da2cfe | : rm sensitive info in config.yaml, add readme of makefile. support old model_path config | 2024-11-04 14:02:19 +08:00 |
| anyanqilin | 9a2e7057c8 | wjh fix change | 2024-11-04 14:02:19 +08:00 |
| anyanqilin | 2d67016d14 | wjh-change | 2024-11-04 14:02:19 +08:00 |
| liam | dd1d8667f3 | : refactor local_chat and fix message slice bug in server | 2024-11-04 14:02:19 +08:00 |