Commit graph

426 commits

Author SHA1 Message Date
Yi Pan
01755a60c0
fix: wrong shape in KLinearMarlin. 2025-03-03 17:34:45 +08:00
Atream
4e43e8a4ee
Merge pull request #775 from kvcache-ai/Atream-patch-1
Update __init__.py
2025-03-03 16:50:02 +08:00
Atream
8963ae7817
Update __init__.py 2025-03-03 16:49:50 +08:00
Atream
659583a92c
Merge pull request #770 from SkqLiao/main
Introduce Testing Jobs for kTransformers Setup on Self-Hosted Runner
2025-03-03 14:35:26 +08:00
Jiaqi Liao
7d96e2eeba fix environment name 2025-03-03 14:01:40 +08:00
Jiaqi Liao
3d62579a6a fix environment name 2025-03-03 13:55:40 +08:00
Jiaqi Liao
5fe0d138ca add install job for self-host testing 2025-03-03 13:52:13 +08:00
wang jiahao
48b9800790
Merge pull request #759 from 3wweiweiwu/fix_top_p_typo
fix typo for top_p
2025-03-02 13:58:11 +08:00
wang jiahao
bb54b68e5c
Merge pull request #761 from kvcache-ai/fix-server-bug
fix ollama api temperature bug
2025-03-02 13:56:46 +08:00
1668068727@qq.com
7cdf8139f0 fix ollama api temperature bug 2025-03-02 13:55:26 +08:00
Wix Woo
3aa0cfc29d fix typo for top_p 2025-03-01 20:15:36 +00:00
Atream
69382e58f9
Merge pull request #313 from MuWinds/main
Update:Solve  `torch.backends.cuda.sdp_kernel()` is deprecated.
2025-03-01 23:24:44 +08:00
Atream
ca1dc1e7d1
Merge branch 'main' into main 2025-03-01 23:24:10 +08:00
Atream
505f4e2c35
Merge pull request #753 from ningpengtao-coder/main
Update local_chat.py
2025-03-01 23:22:13 +08:00
宁鹏涛
71286ec1c0
Update local_chat.py
Fix config.architectures[0] == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM" always evaluating to true
2025-03-01 21:52:48 +08:00
Atream
761de49843
Merge pull request #751 from kvcache-ai/Atream-patch-2
Update DeepseekR1_V3_tutorial.md
2025-03-01 19:57:00 +08:00
Atream
735873a32a
Update DeepseekR1_V3_tutorial.md 2025-03-01 19:56:46 +08:00
Atream
bd33a59ecf
Merge pull request #750 from kvcache-ai/feat-chunk-prefill-flashinfer
Support chunk prefill. Support 139K context for DeepSeek-R1 139K with in 24G VRAM.
2025-03-01 19:50:52 +08:00
Atream
fa03ea48dd Merge branch 'main' into feat-chunk-prefill-flashinfer 2025-03-01 11:35:09 +00:00
Atream
f35e8d41d8 support chunk prefill, support 139K context for 24G VRAM 2025-03-01 11:28:25 +00:00
ZiWei Yuan
511958d49c
Merge pull request #743 from KMSorSMS/main
fix cache_lens bug in server and rm test prompt.txt
2025-03-01 00:17:53 +08:00
liam
80e0536fb0 Merge branch 'main' of https://github.com/KMSorSMS/ktransformers into main 2025-03-01 00:12:21 +08:00
liam
8ddc990668 fix server cache lens 2025-03-01 00:09:57 +08:00
Atream
494469d4c5
Merge pull request #722 from ZhangShuaiyi/remove_unused
Delete duplicate code
2025-02-28 15:04:21 +08:00
liam
71f4599dee 📝 rm test_prompt 2025-02-28 11:44:49 +08:00
ZiWei Yuan
1264f9407b
Merge pull request #732 from KMSorSMS/main
 fox docker build
2025-02-28 11:28:06 +08:00
liam
a0e7afa432 fox docker build 2025-02-28 11:25:34 +08:00
Azure
add415124f
Merge pull request #731 from Azure-Tang/update-template
[fix] Fix template name
2025-02-28 11:19:52 +08:00
Azure
bc52969918 fix name 2025-02-28 03:17:33 +00:00
Azure
0439cb36d4
Merge pull request #730 from Azure-Tang/update-template
[UPDATE] Update ZH/EN issue template
2025-02-28 11:10:29 +08:00
Azure
31b01f5b99 update ZH/EN template 2025-02-28 03:09:06 +00:00
Shuaiyi
a34a25d5cc Delete unused code 2025-02-27 13:18:19 +00:00
wang jiahao
7a19f3b781
Merge pull request #721 from kvcache-ai/fix_temperature
fix temperature
2025-02-27 21:01:21 +08:00
qiyuxinlin
22df52e94e fix temperature 2025-02-27 21:00:44 +08:00
Atream
85e2cc7bf4
Merge pull request #719 from kvcache-ai/fix-use-generation-json
use generation config from json file in official repo
2025-02-27 19:49:41 +08:00
Atream
e645d84794 use generation config from json file in official repo 2025-02-27 11:48:34 +00:00
wang jiahao
5e3c6b4f97
Merge pull request #644 from wtdcode/temperature_top_p_from_request
Allow temperature and top_p from /v1/chat/completions
2025-02-27 18:13:13 +08:00
lazymio
b121ca4df8
Fix according to upstream changes 2025-02-27 18:11:35 +08:00
wang jiahao
26f7b4af11
Merge branch 'main' into temperature_top_p_from_request 2025-02-27 18:08:55 +08:00
Azure
1f28f75f55
Merge pull request #717 from kvcache-ai/issue-template
Update issue templates
2025-02-27 18:02:34 +08:00
Azure
c61805dd0a
Update issue templates 2025-02-27 17:53:27 +08:00
Atream
50c691297f
Merge pull request #622 from akemimadoka/fix-msvc
Fix missing macro definition for KTRANSFORMERS_USE_CUDA and <chrono> includes on MSVC
2025-02-27 17:42:00 +08:00
Atream
0422152cf3
Merge pull request #670 from akemimadoka/fix-win
Fix RuntimeError on Windows caused by integer overflow in np.prod
2025-02-27 17:40:27 +08:00
Atream
798e1d0cfa
Merge pull request #532 from xv44586/fix-sse-formatting
fix: fix SSE formatting
2025-02-27 12:19:23 +08:00
Atream
f403cde6d4
Merge pull request #650 from ceerRep/main
feat: basic api key support
2025-02-27 12:16:53 +08:00
Atream
1d5d5faef6
Merge pull request #626 from cyhasuka/main
Feat: Clear cache during weight loading to prevent OOM on GPUs with <=8GB VRAM
2025-02-27 12:13:10 +08:00
Atream
8db6a4d402
Merge branch 'main' into main 2025-02-27 12:12:32 +08:00
wang jiahao
3c8c580580
Merge pull request #691 from swu-hyk/ollama_api_chat
feat:implementation of chat routing for Ollama
2025-02-27 11:17:48 +08:00
Azure
ca93cf7548
Merge pull request #702 from Azure-Tang/update-readme
[UPDATE] Update documents.
2025-02-26 23:45:24 +08:00
Azure
c05ebb74b1 Update fp8 doc; Update install.md broken link 2025-02-26 15:43:08 +00:00