Compare commits

7 commits
v0.6.1 ... main

Author SHA1 Message Date
Benjamin F
9f34ef46e6
[fix](Qwen3 series): fix gibberish output by correcting RoPE write-back (#31) (#1959)
2026-04-27 22:04:29 +08:00
Peilin Li
0656e01ac1
[docs]: refresh KT install commands (#1958)
2026-04-27 00:45:43 +08:00
Peilin Li
07e274467a
[build]: flatten ktransformers package shim (#1955)
2026-04-25 22:08:52 +08:00
Peilin Li
bfbd0e9352
[chore]: archive kt-sft package (#1954)
2026-04-25 21:49:21 +08:00
Peilin Li
85f1ab530b
[ci]: use hosted runner for sglang-kt release
Use GitHub-hosted runners for the pure Python sglang-kt release workflow so the PyPI release is not blocked by unavailable self-hosted runners.
2026-04-25 21:05:18 +08:00
Peilin Li
bc7afff13b
[chore]: sync sglang-kt packaging fix
Update third_party/sglang to the merged sglang-kt 0.6.1 dependency metadata fix so the release workflow builds the corrected inference package.
2026-04-25 21:02:25 +08:00
Peilin Li
eeaeb7bfd7
[build]: align kt-kernel torch support with v0.6.1 release (#1948)
2026-04-24 23:45:15 +08:00
394 changed files with 51 additions and 43 deletions

@@ -24,7 +24,7 @@ permissions:
 jobs:
   build-sglang-kt:
     name: Build sglang-kt wheel
-    runs-on: [self-hosted, linux, x64]
+    runs-on: ubuntu-latest
     steps:
       - name: Checkout repository
@@ -70,7 +70,7 @@ jobs:
   publish-pypi:
     name: Publish sglang-kt to PyPI
     needs: [build-sglang-kt]
-    runs-on: [self-hosted, linux, x64]
+    runs-on: ubuntu-latest
     if: github.repository == 'kvcache-ai/ktransformers' && github.ref == 'refs/heads/main'
     environment: prod
     permissions:
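
The `runs-on` change above could be audited across the remaining workflows with a small shell helper. This is a minimal sketch, not part of the repository: the function name `find_self_hosted` and the default directory `.github/workflows` are assumptions, and it relies on a `grep` that supports `-r` (GNU and BSD grep both do).

```shell
# Hypothetical helper (not from the repository): list workflow lines that
# still request self-hosted runners.
# Assumption: workflows live under .github/workflows unless a directory is given.
find_self_hosted() {
  # -r recurses into the directory, -n prints line numbers, -E enables ERE.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -rnE 'runs-on:.*self-hosted' "${1:-.github/workflows}" || true
}
```

Calling `find_self_hosted` from the repository root prints one `file:line:text` entry per remaining self-hosted job, or nothing if every job already uses a hosted runner.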

@@ -26,7 +26,7 @@ KTransformers is a research project focused on efficient inference and fine-tuni
 * **Dec 22, 2025**: Support RL-DPO fine-tuning with LLaMA-Factory. ([Tutorial](./doc/en/SFT/DPO_tutorial.md))
 * **Dec 5, 2025**: Support Native Kimi-K2-Thinking inference ([Tutorial](./doc/en/kt-kernel/Kimi-K2-Thinking-Native.md))
 * **Nov 6, 2025**: Support Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking.md)) and fine-tune ([Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.md))
-* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/KTransformers-Fine-Tuning_User-Guide.md))
+* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md))
 * **Oct 27, 2025**: Support Ascend NPU. ([Tutorial](./doc/zh/DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md))
 * **Oct 10, 2025**: Integrating into SGLang. ([Roadmap](https://github.com/sgl-project/sglang/issues/11425), [Blog](https://lmsys.org/blog/2025-10-22-KTransformers/))
 * **Sept 11, 2025**: Support Qwen3-Next. ([Tutorial](./doc/en/Qwen3-Next.md))
@@ -87,7 +87,7 @@ pip install .
 ---
-### 🎓 [kt-sft](./kt-sft/) - Fine-Tuning Framework
+### 🎓 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md) - Fine-Tuning Framework
 KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning.
@@ -109,12 +109,15 @@ KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning
 **Quick Start:**
 ```bash
-cd kt-sft
-# Install environment following kt-sft/README.md
-USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
+cd /path/to/LLaMA-Factory
+pip install -e .
+pip install "ktransformers[sft]"
+USE_KT=1 ACCELERATE_USE_KT=true \
+accelerate launch --config_file examples/ktransformers/accelerate/fsdp2_kt_bf16.yaml \
+-m llamafactory.cli train examples/ktransformers/train_lora/deepseek_v3_lora_sft_kt.yaml
 ```
-👉 **[Full Documentation →](./kt-sft/README.md)**
+👉 **[Full Documentation →](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)**
 ---
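
Several hunks in this diff repoint relative documentation links (for example to `./doc/en/SFT/...`). A check along the following lines could catch such stale targets before they land. It is a minimal sketch under stated assumptions: the function name `check_md_links` is hypothetical, only links of the form `](./path)` are inspected, anchors after `#` are ignored, and paths containing spaces are not handled.

```shell
# Hypothetical helper (not from the repository): report relative Markdown
# links of the form "](./path)" whose target does not exist on disk.
check_md_links() {
  file="$1"
  base=$(dirname "$file")   # links are resolved relative to the file's directory
  status=0
  # Extract every "](./..." prefix up to ")" or "#", then drop the leading "](".
  targets=$(grep -oE '\]\(\./[^)#]+' "$file" | sed 's/^](//')
  for target in $targets; do
    if [ ! -e "$base/$target" ]; then
      echo "broken: $target"
      status=1
    fi
  done
  return "$status"          # 0 when all targets exist, 1 otherwise
}
```

Running `check_md_links README.md` from the repository root prints one `broken:` line per missing target and returns a non-zero status if any were found, so it can gate a CI step.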

@@ -13,13 +13,13 @@
 ## 🎯 Overview
-KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [kt-sft](./kt-sft/).
+KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md).
 ## 🔥 Updates
-* **Dec 5, 2025**: Support native Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking-Native.md))
+* **Dec 5, 2025**: Support native Kimi-K2-Thinking inference ([Tutorial](./doc/en/kt-kernel/Kimi-K2-Thinking-Native.md))
 * **Nov 6, 2025**: Support Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking.md)) and fine-tuning ([Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.md))
-* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory integration ([Tutorial](./doc/en/KTransformers-Fine-Tuning_User-Guide.md))
+* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory integration ([Tutorial](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md))
 * **Oct 27, 2025**: Support Ascend NPU ([Tutorial](./doc/zh/DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md))
 * **Oct 10, 2025**: Integrated into SGLang ([Roadmap](https://github.com/sgl-project/sglang/issues/11425), [Blog](https://lmsys.org/blog/2025-10-22-KTransformers/))
 * **Sept 11, 2025**: Support Qwen3-Next ([Tutorial](./doc/en/Qwen3-Next.md))
@@ -79,7 +79,7 @@ pip install .
 ---
-### 🎓 [kt-sft](./kt-sft/) - Fine-Tuning Framework
+### 🎓 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md) - Fine-Tuning Framework
 KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning.
@@ -101,12 +101,15 @@ KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning.
 **Quick Start:**
 ```bash
-cd kt-sft
-# Install the environment following kt-sft/README.md
-USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
+cd /path/to/LLaMA-Factory
+pip install -e .
+pip install "ktransformers[sft]"
+USE_KT=1 ACCELERATE_USE_KT=true \
+accelerate launch --config_file examples/ktransformers/accelerate/fsdp2_kt_bf16.yaml \
+-m llamafactory.cli train examples/ktransformers/train_lora/deepseek_v3_lora_sft_kt.yaml
 ```
-👉 **[Full Documentation →](./kt-sft/README.md)**
+👉 **[Full Documentation →](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)**
 ---

Some files were not shown because too many files have changed in this diff.