877 commits

Author SHA1 Message Date
Jianwei Dong
1e0be68e51
Merge pull request #1096 from kvcache-ai/update-llama4-tutorial
update llama4 tutorial
2025-04-09 17:37:33 +08:00
djw
f73b4ca706 update llama4 tutorial 2025-04-09 09:36:30 +00:00
Jianwei Dong
2de96a1f05
Merge pull request #1095 from kvcache-ai/update-llama4-tutorial
update llama4 tutorial
2025-04-09 17:35:14 +08:00
djw
ecc3028c13 update llama4 tutorial 2025-04-09 09:34:04 +00:00
Azure
a74a58d864
Merge pull request #1091 from aubreyli/add_g++
balance_serve: Add g++ to compiler list
2025-04-09 14:40:30 +08:00
Yuhao Tsui
877aec858e
Merge branch 'kvcache-ai:main' into main 2025-04-09 11:46:39 +08:00
Aubrey Li
45d20fa87b balance_serve: Add g++ to compiler list
In some OS distributions, g++ exists in the following form:

  # ls -l /usr/bin/g++*
  -rwxr-xr-x 4 root root 985784 Dec  9 12:51 /usr/bin/g++

So make sure to add g++ to the compiler list as well.
2025-04-09 11:25:35 +08:00
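
The commit above concerns a compiler search list that did not include the plain `g++` name. As a rough illustration only (not the project's actual build code), a lookup that tries versioned names first and falls back to `g++` might look like the sketch below. The candidate names and the `find_compiler` helper are hypothetical.

```python
import shutil

# Hypothetical candidate list: versioned names first, plain "g++" last.
# The real list used by balance_serve may differ.
CANDIDATE_COMPILERS = ["g++-13", "g++-12", "g++-11", "g++"]


def find_compiler(candidates=CANDIDATE_COMPILERS, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print(find_compiler())
```

Passing `which` as a parameter keeps the lookup testable without depending on what is installed on the machine.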
Atream
9037bf30d5
Merge pull request #1090 from kvcache-ai/Atream-patch-1
Update attention.py
2025-04-09 10:54:37 +08:00
Atream
3b9e16cec7
Update attention.py 2025-04-09 10:54:00 +08:00
wang jiahao
94476ce5cc
Merge pull request #1085 from kvcache-ai/qiyuxinlin-patch-5
Update balance-serve.md
2025-04-08 19:19:37 +08:00
wang jiahao
23ceb1c049
Update balance-serve.md 2025-04-08 19:19:00 +08:00
wang jiahao
41ce92bb22
Merge pull request #1084 from kvcache-ai/fix-config
format kvc2, delete quant_configs, move model_configs to ~/.ktransfor…
2025-04-08 19:14:07 +08:00
qiyuxinlin
64de784328 format kvc2, delete quant_configs, move model_configs to ~/.ktransformers 2025-04-08 10:06:07 +00:00
Atream
10fd2e281f
Merge pull request #1079 from kvcache-ai/fix-compile
fix compile, add abi check to setup.py
2025-04-08 14:36:31 +08:00
Atream
9dd24ecd72 fix compile, add abi check to setup.py 2025-04-08 06:18:30 +00:00
wang jiahao
f4ae7c85ed
Merge pull request #1069 from kvcache-ai/qiyuxinlin-patch-4
Update balance-serve.md
2025-04-07 19:20:00 +08:00
wang jiahao
2fcdbee769
Update balance-serve.md 2025-04-07 19:19:49 +08:00
Azure
77c6cc82ac
Merge pull request #1063 from aubreyli/KLinearCPUInfer.forward-fix
Fix TypeError when invoke KLinearCPUInfer.forward()
2025-04-07 15:10:46 +08:00
wang jiahao
6463070b16
Merge pull request #1064 from kvcache-ai/fix-temperature
fix temperature=0, flashinfer sample error
2025-04-07 12:32:28 +08:00
dongjw
ec03bcbd7f fix temperature=0, flashinfer sample error 2025-04-07 12:30:47 +08:00
Atream
aac0c91d02
Merge pull request #1060 from kvcache-ai/fix-compile
Fix compile
2025-04-07 12:10:31 +08:00
Aubrey Li
12a4c631df Fix TypeError when invoke KLinearCPUInfer.forward()
Fix the following error:

  File "/home/aubrey/work/ktransformers/ktransformers/operators/linear.py", line 825, in forward
    y = self.generate_linear.forward(x, bsz_tensor)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: KLinearCPUInfer.forward() takes 2 positional arguments but 3 were given
2025-04-07 12:03:35 +08:00
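
The traceback above is the usual Python symptom of a caller passing one more positional argument than a method accepts. The sketch below reproduces that class of error with hypothetical stand-in classes (`NarrowLinear`, `WideLinear`), which are not the real `KLinearCPUInfer`. It shows one common way to resolve it, by accepting the extra argument as an optional parameter. The change actually made in the commit is not shown in this log.

```python
class NarrowLinear:
    """Hypothetical stand-in whose forward() accepts only `x`."""

    def forward(self, x):
        return x


class WideLinear:
    """Hypothetical stand-in whose forward() also accepts an optional bsz_tensor."""

    def forward(self, x, bsz_tensor=None):
        # bsz_tensor is accepted for interface compatibility and ignored here.
        return x


def call_forward(layer, x, bsz_tensor):
    # Mirrors the call shape in the traceback: two arguments plus the implicit self.
    return layer.forward(x, bsz_tensor)


if __name__ == "__main__":
    try:
        call_forward(NarrowLinear(), 1.0, 4)
    except TypeError as exc:
        print("TypeError:", exc)
    print(call_forward(WideLinear(), 1.0, 4))
```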
Atream
fa87c1aeea
Update CMakeLists.txt 2025-04-07 10:32:30 +08:00
Atream
ad2009425c
Update CMakeLists.txt 2025-04-07 10:32:12 +08:00
Atream
f53d5aa979
Create CMakeLists.txt 2025-04-07 10:30:55 +08:00
wang jiahao
6ca743ed7a
Merge pull request #1049 from kvcache-ai/qiyuxinlin-patch-3
Update balance-serve.md
2025-04-05 11:49:16 +08:00
wang jiahao
6cbe044aae
Update balance-serve.md 2025-04-05 11:49:05 +08:00
wang jiahao
8a1313ca4e
Merge pull request #1044 from 255doesnotexist/patch-1
📝 Docs: Clarify CMake version requirement for CUDA dialects
2025-04-04 23:45:03 +08:00
255
578d3d9d09
📝 Docs: Clarify CMake version requirement for CUDA dialects
Adds a note explaining that default CMake versions on systems like
Ubuntu 22.04 LTS might not support newer CUDA dialects (e.g., CUDA 20),
leading to specific build errors.

Recommends installing a newer CMake via the Kitware APT repository
as a resolution. This helps users troubleshoot errors like:
"Target ... requires the language dialect 'CUDA20', but CMake does not know the compile flags..."
2025-04-04 20:11:59 +08:00
ZiWei Yuan
6617549de0
Merge pull request #1043 from kvcache-ai/KMSorSMS-patch-2
🔖 release v0.2.4post1
2025-04-04 16:02:43 +08:00
ZiWei Yuan
a5608dcb80
🔖 release v0.2.4post1 2025-04-04 16:01:25 +08:00
wang jiahao
af36075bda
Merge pull request #1042 from kvcache-ai/v0.2.4-fix
Fix bug with non-base-multiple chunk_size, update test examples, and …
2025-04-04 15:43:48 +08:00
dongjw
be84d04253 Fix bug with non-base-multiple chunk_size, update test examples, and resolve issue with writing model_config. Hugging Face URL input is still unsupported. 2025-04-04 15:41:07 +08:00
ZiWei Yuan
64e6aa026a
Merge pull request #1034 from kvcache-ai/patch_config
🔧 update config.yaml setting default config
2025-04-03 19:57:04 +08:00
liam
b151a98cab 🔧 update config.yaml setting default config 2025-04-03 11:55:50 +00:00
Atream
ca9695b488
Merge pull request #1033 from kvcache-ai/Atream-patch-1
Update modeling_deepseek_v3.py
2025-04-03 17:13:26 +08:00
Atream
e36ddc36a8
Update modeling_deepseek_v3.py 2025-04-03 17:13:06 +08:00
wang jiahao
016d11e6d4
Merge pull request #1030 from ambitiousCC/main
slove [Bug] #1023
2025-04-03 16:00:07 +08:00
wang jiahao
47a89ae752
Merge pull request #1031 from wangkuigang-yewu-cmss/doc-update
Doc update: model_path naming requirements and adding force_think to the examples
2025-04-03 15:58:12 +08:00
wangkuigang-yewu-cmss
c590583262 doc upgrade: model_path requirements and reasoning
* add documentations about `--model_path` requirements
* add `--force_think` in doc (most users would run R1 and would want it to provide reasoning process)
2025-04-03 15:16:56 +08:00
Qin's repo
2c3a3a1e1c
slove [Bug] #1023
Only modified the mixed single and double quotes in server/config/config.py
2025-04-03 14:37:32 +08:00
wang jiahao
72e8e16fa4
Merge pull request #1029 from kvcache-ai/mian-update-doc
fix local_chat bug and update doc
2025-04-03 12:44:59 +08:00
dongjw
1b7672937b update install doc and fix local_chat bug 2025-04-03 12:42:41 +08:00
dongjw
ab0b0f4ea1 fix local_chat and update balance-serve and SUMMARY doc 2025-04-03 12:19:43 +08:00
ZiWei Yuan
9654bc1c5b
Merge pull request #1027 from kvcache-ai/KMSorSMS-patch-1
Update SUMMARY.md
2025-04-03 12:02:18 +08:00
ZiWei Yuan
a0ce48ee21
Update SUMMARY.md 2025-04-03 12:00:34 +08:00
wang jiahao
f7a8a91f46
Merge pull request #1024 from kvcache-ai/mian-update-doc
delete sudo install
2025-04-03 10:51:54 +08:00
dongjw
8acb270c90 delete sudo install 2025-04-03 10:46:52 +08:00
Atream
795524cacc
Merge pull request #954 from aubreyli/yaml_marlin_fix
yaml: fix Marlin AssertionError
2025-04-02 15:00:37 +08:00
Atream
ec12429c46
Merge pull request #1005 from fishingfly/improve/backend-error-msg
fix: refine backend error message to include ROCM_HOME
2025-04-02 14:54:23 +08:00