wang jiahao
f4ae7c85ed
Merge pull request #1069 from kvcache-ai/qiyuxinlin-patch-4
...
Update balance-serve.md
2025-04-07 19:20:00 +08:00
wang jiahao
2fcdbee769
Update balance-serve.md
2025-04-07 19:19:49 +08:00
Azure
77c6cc82ac
Merge pull request #1063 from aubreyli/KLinearCPUInfer.forward-fix
...
Fix TypeError when invoking KLinearCPUInfer.forward()
2025-04-07 15:10:46 +08:00
wang jiahao
6463070b16
Merge pull request #1064 from kvcache-ai/fix-temperature
...
fix temperature=0, flashinfer sample error
2025-04-07 12:32:28 +08:00
dongjw
ec03bcbd7f
fix temperature=0, flashinfer sample error
2025-04-07 12:30:47 +08:00
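For context on why temperature=0 tends to need special handling: temperature sampling divides the logits by the temperature, so a value of zero cannot go through the normal path. The sketch below shows one common guard, falling back to greedy (argmax) selection. It is a pure-Python illustration under that assumption, not the change made in this commit; `sample_token` is a hypothetical helper and does not use flashinfer.

```python
import math
import random


def sample_token(logits, temperature, rng=random):
    """Pick a token index from raw logits (hypothetical helper, not ktransformers code)."""
    if temperature == 0:
        # Dividing by zero is undefined, so treat temperature=0 as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Scale logits by temperature, then apply a numerically stable softmax.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With `temperature=0` the helper always returns the index of the largest logit; any positive temperature samples from the softmax distribution instead.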
Atream
aac0c91d02
Merge pull request #1060 from kvcache-ai/fix-compile
...
Fix compile
2025-04-07 12:10:31 +08:00
Aubrey Li
12a4c631df
Fix TypeError when invoking KLinearCPUInfer.forward()
...
Fix the following error:
  File "/home/aubrey/work/ktransformers/ktransformers/operators/linear.py", line 825, in forward
    y = self.generate_linear.forward(x, bsz_tensor)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: KLinearCPUInfer.forward() takes 2 positional arguments but 3 were given
2025-04-07 12:03:35 +08:00
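A minimal sketch of how this class of error arises: a caller passes an extra positional argument to a `forward()` that only accepts the input. The classes below are hypothetical stand-ins, not the actual ktransformers operators, and the real fix in this commit may differ from the optional-parameter approach shown here.

```python
class CpuLinearStub:
    """Hypothetical backend whose forward() accepts only the input."""

    def forward(self, x):
        return [2 * v for v in x]


class BatchAwareLinearStub:
    """Hypothetical backend that also tolerates a batch-size argument."""

    def forward(self, x, bsz_tensor=None):
        # bsz_tensor is accepted for interface compatibility and ignored here.
        return [2 * v for v in x]


def call_forward(backend, x, bsz_tensor):
    # The caller always passes two positional arguments after self.
    return backend.forward(x, bsz_tensor)


def try_call(backend):
    """Return "ok" on success, or the TypeError message on failure."""
    try:
        call_forward(backend, [1, 2], 2)
        return "ok"
    except TypeError as exc:
        return str(exc)
```

Calling `try_call(CpuLinearStub())` returns a message ending in "takes 2 positional arguments but 3 were given", because `self` counts as the first positional argument; `try_call(BatchAwareLinearStub())` returns "ok".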
Atream
fa87c1aeea
Update CMakeLists.txt
2025-04-07 10:32:30 +08:00
Atream
ad2009425c
Update CMakeLists.txt
2025-04-07 10:32:12 +08:00
Atream
f53d5aa979
Create CMakeLists.txt
2025-04-07 10:30:55 +08:00
wang jiahao
6ca743ed7a
Merge pull request #1049 from kvcache-ai/qiyuxinlin-patch-3
...
Update balance-serve.md
2025-04-05 11:49:16 +08:00
wang jiahao
6cbe044aae
Update balance-serve.md
2025-04-05 11:49:05 +08:00
wang jiahao
8a1313ca4e
Merge pull request #1044 from 255doesnotexist/patch-1
...
📝 Docs: Clarify CMake version requirement for CUDA dialects
2025-04-04 23:45:03 +08:00
255
578d3d9d09
📝 Docs: Clarify CMake version requirement for CUDA dialects
...
Adds a note explaining that default CMake versions on systems like
Ubuntu 22.04 LTS might not support newer CUDA dialects (e.g., CUDA 20),
leading to specific build errors.
Recommends installing a newer CMake via the Kitware APT repository
as a resolution. This helps users troubleshoot errors like:
"Target ... requires the language dialect 'CUDA20', but CMake does not know the compile flags..."
2025-04-04 20:11:59 +08:00
ZiWei Yuan
6617549de0
Merge pull request #1043 from kvcache-ai/KMSorSMS-patch-2
...
🔖 release v0.2.4post1
2025-04-04 16:02:43 +08:00
ZiWei Yuan
a5608dcb80
🔖 release v0.2.4post1
2025-04-04 16:01:25 +08:00
wang jiahao
af36075bda
Merge pull request #1042 from kvcache-ai/v0.2.4-fix
...
Fix bug with non-base-multiple chunk_size, update test examples, and …
2025-04-04 15:43:48 +08:00
dongjw
be84d04253
Fix bug with non-base-multiple chunk_size, update test examples, and resolve issue with writing model_config. Hugging Face URL input is still unsupported.
2025-04-04 15:41:07 +08:00
ZiWei Yuan
64e6aa026a
Merge pull request #1034 from kvcache-ai/patch_config
...
🔧 update config.yaml setting default config
2025-04-03 19:57:04 +08:00
liam
b151a98cab
🔧 update config.yaml setting default config
2025-04-03 11:55:50 +00:00
Atream
ca9695b488
Merge pull request #1033 from kvcache-ai/Atream-patch-1
...
Update modeling_deepseek_v3.py
2025-04-03 17:13:26 +08:00
Atream
e36ddc36a8
Update modeling_deepseek_v3.py
2025-04-03 17:13:06 +08:00
wang jiahao
016d11e6d4
Merge pull request #1030 from ambitiousCC/main
...
solve [Bug] #1023
2025-04-03 16:00:07 +08:00
wang jiahao
47a89ae752
Merge pull request #1031 from wangkuigang-yewu-cmss/doc-update
...
Doc update: model_path naming requirements and adding force_think to the examples
2025-04-03 15:58:12 +08:00
wangkuigang-yewu-cmss
c590583262
doc upgrade: model_path requirements and reasoning
...
* add documentation about `--model_path` requirements
* add `--force_think` in doc (most users would run R1 and would want it to provide reasoning process)
2025-04-03 15:16:56 +08:00
Qin's repo
2c3a3a1e1c
solve [Bug] #1023
...
Only modified the mixed single and double quotes in server/config/config.py
2025-04-03 14:37:32 +08:00
wang jiahao
72e8e16fa4
Merge pull request #1029 from kvcache-ai/mian-update-doc
...
fix local_chat bug and update doc
2025-04-03 12:44:59 +08:00
dongjw
1b7672937b
update install doc and fix local_chat bug
2025-04-03 12:42:41 +08:00
dongjw
ab0b0f4ea1
fix local_chat and update balance-serve and SUMMARY doc
2025-04-03 12:19:43 +08:00
ZiWei Yuan
9654bc1c5b
Merge pull request #1027 from kvcache-ai/KMSorSMS-patch-1
...
Update SUMMARY.md
2025-04-03 12:02:18 +08:00
ZiWei Yuan
a0ce48ee21
Update SUMMARY.md
2025-04-03 12:00:34 +08:00
wang jiahao
f7a8a91f46
Merge pull request #1024 from kvcache-ai/mian-update-doc
...
delete sudo install
2025-04-03 10:51:54 +08:00
dongjw
8acb270c90
delete sudo install
2025-04-03 10:46:52 +08:00
Atream
795524cacc
Merge pull request #954 from aubreyli/yaml_marlin_fix
...
yaml: fix Marlin AssertionError
2025-04-02 15:00:37 +08:00
Atream
ec12429c46
Merge pull request #1005 from fishingfly/improve/backend-error-msg
...
fix: refine backend error message to include ROCM_HOME
2025-04-02 14:54:23 +08:00
wang jiahao
ac95b6c710
Merge pull request #1015 from kvcache-ai/qiyuxinlin-patch-1
...
Update balance-serve.md
2025-04-02 14:22:30 +08:00
wang jiahao
ee179c2ad0
Update balance-serve.md
2025-04-02 14:22:15 +08:00
wang jiahao
a41d216393
Merge pull request #1013 from kvcache-ai/work-concurrent
...
In v0.2.4, we’ve added the multi-concurrency support the community has long asked for, through a major refactor of the whole architecture.
2025-04-02 14:09:10 +08:00
dongjw
4ed9744ebb
update readme
2025-04-02 14:02:57 +08:00
dongjw
b62cefaec9
update readme
2025-04-02 13:11:01 +08:00
dongjw
d41dd23b14
update Dockerfile
2025-04-02 12:10:58 +08:00
dongjw
65798994cb
update Dockerfile
2025-04-02 12:04:09 +08:00
dongjw
56a18ad02c
change tag v0.2.4
2025-04-01 21:07:13 +08:00
Azure-Tang
d98433c2d1
update git action env, add USE_BALANCE_SERVE=1
2025-04-01 12:58:28 +00:00
dongjw
5c7ed7b579
fix top_p = 0 bug
2025-04-01 20:38:33 +08:00
Azure-Tang
aeabd783b0
update git action env, add BALANCE_SERVE=1
2025-04-01 11:21:55 +00:00
Azure-Tang
31677181c3
Fix ktransformers-server flashinfer wrapper position arg issue;
...
Fix db position issue
2025-04-01 07:30:23 +00:00
Azure-Tang
203b853c75
rm KMoEGateDeepSeekV3, fall back to KMoEGate
2025-04-01 07:13:05 +00:00
Azure-Tang
3a5330b215
Merge branch 'main' into work-concurrent
2025-04-01 06:48:19 +00:00
fishingfly
7549ff335a
fix: refine backend error message to include ROCM_HOME
...
Signed-off-by: fishingfly <zhoyuzf@163.com>
2025-04-01 10:50:38 +08:00