Yi Pan
|
01755a60c0
|
fix: wrong shape in KLinearMarlin.
|
2025-03-03 17:34:45 +08:00 |
|
Atream
|
4e43e8a4ee
|
Merge pull request #775 from kvcache-ai/Atream-patch-1
Update __init__.py
|
2025-03-03 16:50:02 +08:00 |
|
Atream
|
8963ae7817
|
Update __init__.py
|
2025-03-03 16:49:50 +08:00 |
|
Atream
|
659583a92c
|
Merge pull request #770 from SkqLiao/main
Introduce Testing Jobs for kTransformers Setup on Self-Hosted Runner
|
2025-03-03 14:35:26 +08:00 |
|
Jiaqi Liao
|
7d96e2eeba
|
fix environment name
|
2025-03-03 14:01:40 +08:00 |
|
Jiaqi Liao
|
3d62579a6a
|
fix environment name
|
2025-03-03 13:55:40 +08:00 |
|
Jiaqi Liao
|
5fe0d138ca
|
add install job for self-host testing
|
2025-03-03 13:52:13 +08:00 |
|
wang jiahao
|
48b9800790
|
Merge pull request #759 from 3wweiweiwu/fix_top_p_typo
fix typo for top_p
|
2025-03-02 13:58:11 +08:00 |
|
wang jiahao
|
bb54b68e5c
|
Merge pull request #761 from kvcache-ai/fix-server-bug
fix ollama api temperature bug
|
2025-03-02 13:56:46 +08:00 |
|
1668068727@qq.com
|
7cdf8139f0
|
fix ollama api temperature bug
|
2025-03-02 13:55:26 +08:00 |
|
Wix Woo
|
3aa0cfc29d
|
fix typo for top_p
|
2025-03-01 20:15:36 +00:00 |
|
Atream
|
69382e58f9
|
Merge pull request #313 from MuWinds/main
Update:Solve `torch.backends.cuda.sdp_kernel()` is deprecated.
|
2025-03-01 23:24:44 +08:00 |
|
Atream
|
ca1dc1e7d1
|
Merge branch 'main' into main
|
2025-03-01 23:24:10 +08:00 |
|
Atream
|
505f4e2c35
|
Merge pull request #753 from ningpengtao-coder/main
Update local_chat.py
|
2025-03-01 23:22:13 +08:00 |
|
宁鹏涛
|
71286ec1c0
|
Update local_chat.py
Fix: config.architectures[0] == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM" always evaluates to true
|
2025-03-01 21:52:48 +08:00 |
|
Atream
|
761de49843
|
Merge pull request #751 from kvcache-ai/Atream-patch-2
Update DeepseekR1_V3_tutorial.md
|
2025-03-01 19:57:00 +08:00 |
|
Atream
|
735873a32a
|
Update DeepseekR1_V3_tutorial.md
|
2025-03-01 19:56:46 +08:00 |
|
Atream
|
bd33a59ecf
|
Merge pull request #750 from kvcache-ai/feat-chunk-prefill-flashinfer
Support chunk prefill. Support 139K context for DeepSeek-R1 within 24G VRAM.
|
2025-03-01 19:50:52 +08:00 |
|
Atream
|
fa03ea48dd
|
Merge branch 'main' into feat-chunk-prefill-flashinfer
|
2025-03-01 11:35:09 +00:00 |
|
Atream
|
f35e8d41d8
|
support chunk prefill, support 139K context for 24G VRAM
|
2025-03-01 11:28:25 +00:00 |
|
ZiWei Yuan
|
511958d49c
|
Merge pull request #743 from KMSorSMS/main
fix cache_lens bug in server and rm test prompt.txt
|
2025-03-01 00:17:53 +08:00 |
|
liam
|
80e0536fb0
|
Merge branch 'main' of https://github.com/KMSorSMS/ktransformers into main
|
2025-03-01 00:12:21 +08:00 |
|
liam
|
8ddc990668
|
⚡ fix server cache lens
|
2025-03-01 00:09:57 +08:00 |
|
Atream
|
494469d4c5
|
Merge pull request #722 from ZhangShuaiyi/remove_unused
Delete duplicate code
|
2025-02-28 15:04:21 +08:00 |
|
liam
|
71f4599dee
|
📝 rm test_prompt
|
2025-02-28 11:44:49 +08:00 |
|
ZiWei Yuan
|
1264f9407b
|
Merge pull request #732 from KMSorSMS/main
⚡ fox docker build
|
2025-02-28 11:28:06 +08:00 |
|
liam
|
a0e7afa432
|
⚡ fox docker build
|
2025-02-28 11:25:34 +08:00 |
|
Azure
|
add415124f
|
Merge pull request #731 from Azure-Tang/update-template
[fix] Fix template name
|
2025-02-28 11:19:52 +08:00 |
|
Azure
|
bc52969918
|
fix name
|
2025-02-28 03:17:33 +00:00 |
|
Azure
|
0439cb36d4
|
Merge pull request #730 from Azure-Tang/update-template
[UPDATE] Update ZH/EN issue template
|
2025-02-28 11:10:29 +08:00 |
|
Azure
|
31b01f5b99
|
update ZH/EN template
|
2025-02-28 03:09:06 +00:00 |
|
Shuaiyi
|
a34a25d5cc
|
Delete unused code
|
2025-02-27 13:18:19 +00:00 |
|
wang jiahao
|
7a19f3b781
|
Merge pull request #721 from kvcache-ai/fix_temperature
fix temperature
|
2025-02-27 21:01:21 +08:00 |
|
qiyuxinlin
|
22df52e94e
|
fix temperature
|
2025-02-27 21:00:44 +08:00 |
|
Atream
|
85e2cc7bf4
|
Merge pull request #719 from kvcache-ai/fix-use-generation-json
use generation config from json file in official repo
|
2025-02-27 19:49:41 +08:00 |
|
Atream
|
e645d84794
|
use generation config from json file in official repo
|
2025-02-27 11:48:34 +00:00 |
|
wang jiahao
|
5e3c6b4f97
|
Merge pull request #644 from wtdcode/temperature_top_p_from_request
Allow temperature and top_p from /v1/chat/completions
|
2025-02-27 18:13:13 +08:00 |
|
lazymio
|
b121ca4df8
|
Fix according to upstream changes
|
2025-02-27 18:11:35 +08:00 |
|
wang jiahao
|
26f7b4af11
|
Merge branch 'main' into temperature_top_p_from_request
|
2025-02-27 18:08:55 +08:00 |
|
Azure
|
1f28f75f55
|
Merge pull request #717 from kvcache-ai/issue-template
Update issue templates
|
2025-02-27 18:02:34 +08:00 |
|
Azure
|
c61805dd0a
|
Update issue templates
|
2025-02-27 17:53:27 +08:00 |
|
Atream
|
50c691297f
|
Merge pull request #622 from akemimadoka/fix-msvc
Fix missing macro definition for KTRANSFORMERS_USE_CUDA and <chrono> includes on MSVC
|
2025-02-27 17:42:00 +08:00 |
|
Atream
|
0422152cf3
|
Merge pull request #670 from akemimadoka/fix-win
Fix RuntimeError on Windows caused by integer overflow in np.prod
|
2025-02-27 17:40:27 +08:00 |
|
Atream
|
798e1d0cfa
|
Merge pull request #532 from xv44586/fix-sse-formatting
fix: fix SSE formatting
|
2025-02-27 12:19:23 +08:00 |
|
Atream
|
f403cde6d4
|
Merge pull request #650 from ceerRep/main
feat: basic api key support
|
2025-02-27 12:16:53 +08:00 |
|
Atream
|
1d5d5faef6
|
Merge pull request #626 from cyhasuka/main
Feat: Clear cache during weight loading to prevent OOM on GPUs with <=8GB VRAM
|
2025-02-27 12:13:10 +08:00 |
|
Atream
|
8db6a4d402
|
Merge branch 'main' into main
|
2025-02-27 12:12:32 +08:00 |
|
wang jiahao
|
3c8c580580
|
Merge pull request #691 from swu-hyk/ollama_api_chat
feat:implementation of chat routing for Ollama
|
2025-02-27 11:17:48 +08:00 |
|
Azure
|
ca93cf7548
|
Merge pull request #702 from Azure-Tang/update-readme
[UPDATE] Update documents.
|
2025-02-26 23:45:24 +08:00 |
|
Azure
|
c05ebb74b1
|
Update fp8 doc; Update install.md broken link
|
2025-02-26 15:43:08 +00:00 |
|
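Note on commit 71286ec1c0 (Update local_chat.py): its message reports a condition of the form `config.architectures[0] == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM"` that is always true. The sketch below illustrates why, using a hypothetical `arch` variable in place of `config.architectures[0]`. The membership test shown is one common way to write the intended check and is an assumption here, not necessarily the change made in that commit.

```python
# Hypothetical stand-in for config.architectures[0]; any non-DeepSeek name works.
arch = "LlamaForCausalLM"

# Buggy form: Python parses this as
#   (arch == "DeepseekV2ForCausalLM") or "DeepseekV3ForCausalLM"
# The right operand is a non-empty string, which is truthy,
# so the expression is truthy no matter what arch holds.
buggy = bool(arch == "DeepseekV2ForCausalLM" or "DeepseekV3ForCausalLM")

# One possible correction (an assumption): compare against both names explicitly.
fixed = arch in ("DeepseekV2ForCausalLM", "DeepseekV3ForCausalLM")

print(buggy, fixed)  # True False
```

With the buggy form, any DeepSeek-specific branch would also run for unrelated architectures; the membership test restricts it to the two named classes.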