diff --git a/.github/workflows/release-pypi.yml b/.github/workflows/release-pypi.yml
index 50537faa..4c8965cf 100644
--- a/.github/workflows/release-pypi.yml
+++ b/.github/workflows/release-pypi.yml
@@ -107,6 +107,7 @@ jobs:
working-directory: kt-kernel
env:
CPUINFER_BUILD_ALL_VARIANTS: '1'
+ CPUINFER_ENABLE_CPPTRACE: '0'
CPUINFER_USE_CUDA: '1'
CPUINFER_CUDA_ARCHS: '80;86;89;90'
CPUINFER_CUDA_STATIC_RUNTIME: '1'
diff --git a/.github/workflows/release-sglang-kt.yml b/.github/workflows/release-sglang-kt.yml
index 0d745e3b..6f1121cf 100644
--- a/.github/workflows/release-sglang-kt.yml
+++ b/.github/workflows/release-sglang-kt.yml
@@ -24,7 +24,7 @@ permissions:
jobs:
build-sglang-kt:
name: Build sglang-kt wheel
- runs-on: [self-hosted, linux, x64]
+ runs-on: ubuntu-latest
steps:
- name: Checkout repository
@@ -70,7 +70,7 @@ jobs:
publish-pypi:
name: Publish sglang-kt to PyPI
needs: [build-sglang-kt]
- runs-on: [self-hosted, linux, x64]
+ runs-on: ubuntu-latest
if: github.repository == 'kvcache-ai/ktransformers' && github.ref == 'refs/heads/main'
environment: prod
permissions:
diff --git a/README.md b/README.md
index 9c057757..b29eb3d4 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@
A Flexible Framework for Experiencing Cutting-edge LLM Inference/Fine-tune Optimizations
- 🎯 Overview | 🚀 kt-kernel | 🎓 kt-sft | 🔥 Citation | 🚀 Roadmap(2025Q4)
+ 🎯 Overview | 🚀 kt-kernel | 🎓 kt-sft | 🔥 Citation | 🚀 Roadmap(2026Q2)
## 🎯 Overview
@@ -16,7 +16,8 @@
KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](https://github.com/kvcache-ai/ktransformers/tree/main/kt-kernel/) and [kt-sft](https://github.com/kvcache-ai/ktransformers/tree/main/kt-sft).
## 🔥 Updates
-
+* **May 6, 2026**: KTransformers at [GOSIM Paris 2026](https://paris2026.gosim.org/zh/schedule/) — "Agentic AI on Edge" track. We'll present KT's inference performance on consumer hardware.
+* **Mar 26, 2026**: Support AVX2-only CPU backend for KT-Kernel inference. ([Tutorial](./doc/en/kt-kernel/AVX2-Tutorial.md))
* **Feb 13, 2026**: MiniMax-M2.5 Day0 Support! ([Tutorial](./doc/en/MiniMax-M2.5.md))
* **Feb 12, 2026**: GLM-5 Day0 Support! ([Tutorial](./doc/en/kt-kernel/GLM-5-Tutorial.md))
* **Jan 27, 2026**: Kimi-K2.5 Day0 Support! ([Tutorial](./doc/en/Kimi-K2.5.md)) ([SFT Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.5.md))
@@ -25,7 +26,7 @@ KTransformers is a research project focused on efficient inference and fine-tuni
* **Dec 22, 2025**: Support RL-DPO fine-tuning with LLaMA-Factory. ([Tutorial](./doc/en/SFT/DPO_tutorial.md))
* **Dec 5, 2025**: Support Native Kimi-K2-Thinking inference ([Tutorial](./doc/en/kt-kernel/Kimi-K2-Thinking-Native.md))
* **Nov 6, 2025**: Support Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking.md)) and fine-tune ([Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.md))
-* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/KTransformers-Fine-Tuning_User-Guide.md))
+* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md))
* **Oct 27, 2025**: Support Ascend NPU. ([Tutorial](./doc/zh/DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md))
* **Oct 10, 2025**: Integrating into SGLang. ([Roadmap](https://github.com/sgl-project/sglang/issues/11425), [Blog](https://lmsys.org/blog/2025-10-22-KTransformers/))
* **Sept 11, 2025**: Support Qwen3-Next. ([Tutorial](./doc/en/Qwen3-Next.md))
@@ -86,7 +87,7 @@ pip install .
---
-### 🎓 [kt-sft](./kt-sft/) - Fine-Tuning Framework
+### 🎓 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md) - Fine-Tuning Framework
KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning.
@@ -108,12 +109,15 @@ KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning
**Quick Start:**
```bash
-cd kt-sft
-# Install environment following kt-sft/README.md
-USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
+cd /path/to/LLaMA-Factory
+pip install -e .
+pip install "ktransformers[sft]"
+USE_KT=1 ACCELERATE_USE_KT=true \
+ accelerate launch --config_file examples/ktransformers/accelerate/fsdp2_kt_bf16.yaml \
+ -m llamafactory.cli train examples/ktransformers/train_lora/deepseek_v3_lora_sft_kt.yaml
```
-👉 **[Full Documentation →](./kt-sft/README.md)**
+👉 **[Full Documentation →](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)**
---
diff --git a/README_ZH.md b/README_ZH.md
index e60183ae..e1f73c85 100644
--- a/README_ZH.md
+++ b/README_ZH.md
@@ -13,13 +13,13 @@
## 🎯 概览
-KTransformers 是一个专注于通过 CPU-GPU 异构计算实现大语言模型高效推理和微调的研究项目。该项目已发展为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [kt-sft](./kt-sft/)。
+KTransformers 是一个专注于通过 CPU-GPU 异构计算实现大语言模型高效推理和微调的研究项目。该项目已发展为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)。
## 🔥 更新
-* **2025 年 12 月 5 日**:支持原生 Kimi-K2-Thinking 推理([教程](./doc/en/Kimi-K2-Thinking-Native.md))
+* **2025 年 12 月 5 日**:支持原生 Kimi-K2-Thinking 推理([教程](./doc/en/kt-kernel/Kimi-K2-Thinking-Native.md))
* **2025 年 11 月 6 日**:支持 Kimi-K2-Thinking 推理([教程](./doc/en/Kimi-K2-Thinking.md))和微调([教程](./doc/en/SFT_Installation_Guide_KimiK2.md))
-* **2025 年 11 月 4 日**:KTransformers 微调 × LLaMA-Factory 集成([教程](./doc/en/KTransformers-Fine-Tuning_User-Guide.md))
+* **2025 年 11 月 4 日**:KTransformers 微调 × LLaMA-Factory 集成([教程](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md))
* **2025 年 10 月 27 日**:支持昇腾 NPU([教程](./doc/zh/DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md))
* **2025 年 10 月 10 日**:集成到 SGLang([路线图](https://github.com/sgl-project/sglang/issues/11425),[博客](https://lmsys.org/blog/2025-10-22-KTransformers/))
* **2025 年 9 月 11 日**:支持 Qwen3-Next([教程](./doc/en/Qwen3-Next.md))
@@ -79,7 +79,7 @@ pip install .
---
-### 🎓 [kt-sft](./kt-sft/) - 微调框架
+### 🎓 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md) - 微调框架
KTransformers × LLaMA-Factory 集成,用于超大型 MoE 模型微调。
@@ -101,12 +101,15 @@ KTransformers × LLaMA-Factory 集成,用于超大型 MoE 模型微调。
**快速开始:**
```bash
-cd kt-sft
-# 按照 kt-sft/README.md 安装环境
-USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
+cd /path/to/LLaMA-Factory
+pip install -e .
+pip install "ktransformers[sft]"
+USE_KT=1 ACCELERATE_USE_KT=true \
+ accelerate launch --config_file examples/ktransformers/accelerate/fsdp2_kt_bf16.yaml \
+ -m llamafactory.cli train examples/ktransformers/train_lora/deepseek_v3_lora_sft_kt.yaml
```
-👉 **[完整文档 →](./kt-sft/README.md)**
+👉 **[完整文档 →](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)**
---
diff --git a/kt-sft/.flake8 b/archive/kt-sft/.flake8
similarity index 100%
rename from kt-sft/.flake8
rename to archive/kt-sft/.flake8
diff --git a/kt-sft/.gitignore b/archive/kt-sft/.gitignore
similarity index 100%
rename from kt-sft/.gitignore
rename to archive/kt-sft/.gitignore
diff --git a/kt-sft/.gitmodules b/archive/kt-sft/.gitmodules
similarity index 100%
rename from kt-sft/.gitmodules
rename to archive/kt-sft/.gitmodules
diff --git a/kt-sft/.pylintrc b/archive/kt-sft/.pylintrc
similarity index 100%
rename from kt-sft/.pylintrc
rename to archive/kt-sft/.pylintrc
diff --git a/kt-sft/Dockerfile b/archive/kt-sft/Dockerfile
similarity index 100%
rename from kt-sft/Dockerfile
rename to archive/kt-sft/Dockerfile
diff --git a/kt-sft/Dockerfile.xpu b/archive/kt-sft/Dockerfile.xpu
similarity index 100%
rename from kt-sft/Dockerfile.xpu
rename to archive/kt-sft/Dockerfile.xpu
diff --git a/kt-sft/LICENSE b/archive/kt-sft/LICENSE
similarity index 100%
rename from kt-sft/LICENSE
rename to archive/kt-sft/LICENSE
diff --git a/kt-sft/MANIFEST.in b/archive/kt-sft/MANIFEST.in
similarity index 100%
rename from kt-sft/MANIFEST.in
rename to archive/kt-sft/MANIFEST.in
diff --git a/kt-sft/Makefile b/archive/kt-sft/Makefile
similarity index 100%
rename from kt-sft/Makefile
rename to archive/kt-sft/Makefile
diff --git a/kt-sft/README.md b/archive/kt-sft/README.md
similarity index 100%
rename from kt-sft/README.md
rename to archive/kt-sft/README.md
diff --git a/kt-sft/SECURITY.md b/archive/kt-sft/SECURITY.md
similarity index 100%
rename from kt-sft/SECURITY.md
rename to archive/kt-sft/SECURITY.md
diff --git a/kt-sft/WeChatGroup.png b/archive/kt-sft/WeChatGroup.png
similarity index 100%
rename from kt-sft/WeChatGroup.png
rename to archive/kt-sft/WeChatGroup.png
diff --git a/kt-sft/autosetup.sh b/archive/kt-sft/autosetup.sh
similarity index 100%
rename from kt-sft/autosetup.sh
rename to archive/kt-sft/autosetup.sh
diff --git a/kt-sft/book.toml b/archive/kt-sft/book.toml
similarity index 100%
rename from kt-sft/book.toml
rename to archive/kt-sft/book.toml
diff --git a/kt-sft/csrc/custom_marlin/__init__.py b/archive/kt-sft/csrc/custom_marlin/__init__.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/__init__.py
rename to archive/kt-sft/csrc/custom_marlin/__init__.py
diff --git a/kt-sft/csrc/custom_marlin/binding.cpp b/archive/kt-sft/csrc/custom_marlin/binding.cpp
similarity index 100%
rename from kt-sft/csrc/custom_marlin/binding.cpp
rename to archive/kt-sft/csrc/custom_marlin/binding.cpp
diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
similarity index 100%
rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh
similarity index 100%
rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh
rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh
diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh
similarity index 100%
rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh
rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh
diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu
similarity index 100%
rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu
rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu
diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/ops.h b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/ops.h
similarity index 100%
rename from kt-sft/csrc/custom_marlin/gptq_marlin/ops.h
rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/ops.h
diff --git a/kt-sft/csrc/custom_marlin/setup.py b/archive/kt-sft/csrc/custom_marlin/setup.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/setup.py
rename to archive/kt-sft/csrc/custom_marlin/setup.py
diff --git a/kt-sft/csrc/custom_marlin/test_cuda_graph.py b/archive/kt-sft/csrc/custom_marlin/test_cuda_graph.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/test_cuda_graph.py
rename to archive/kt-sft/csrc/custom_marlin/test_cuda_graph.py
diff --git a/kt-sft/csrc/custom_marlin/utils/__init__.py b/archive/kt-sft/csrc/custom_marlin/utils/__init__.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/utils/__init__.py
rename to archive/kt-sft/csrc/custom_marlin/utils/__init__.py
diff --git a/kt-sft/csrc/custom_marlin/utils/format24.py b/archive/kt-sft/csrc/custom_marlin/utils/format24.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/utils/format24.py
rename to archive/kt-sft/csrc/custom_marlin/utils/format24.py
diff --git a/kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py b/archive/kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py
rename to archive/kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py
diff --git a/kt-sft/csrc/custom_marlin/utils/marlin_perms.py b/archive/kt-sft/csrc/custom_marlin/utils/marlin_perms.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/utils/marlin_perms.py
rename to archive/kt-sft/csrc/custom_marlin/utils/marlin_perms.py
diff --git a/kt-sft/csrc/custom_marlin/utils/marlin_utils.py b/archive/kt-sft/csrc/custom_marlin/utils/marlin_utils.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/utils/marlin_utils.py
rename to archive/kt-sft/csrc/custom_marlin/utils/marlin_utils.py
diff --git a/kt-sft/csrc/custom_marlin/utils/quant_utils.py b/archive/kt-sft/csrc/custom_marlin/utils/quant_utils.py
similarity index 100%
rename from kt-sft/csrc/custom_marlin/utils/quant_utils.py
rename to archive/kt-sft/csrc/custom_marlin/utils/quant_utils.py
diff --git a/kt-sft/csrc/ktransformers_ext/CMakeLists.txt b/archive/kt-sft/csrc/ktransformers_ext/CMakeLists.txt
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/CMakeLists.txt
rename to archive/kt-sft/csrc/ktransformers_ext/CMakeLists.txt
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_attention.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_attention.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_linear.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_moe.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py
diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py
rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py
diff --git a/kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake b/archive/kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake
rename to archive/kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h
diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h
rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/binding.cpp b/archive/kt-sft/csrc/ktransformers_ext/cuda/binding.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/binding.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/binding.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu b/archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h b/archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/setup.py b/archive/kt-sft/csrc/ktransformers_ext/cuda/setup.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/setup.py
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/setup.py
diff --git a/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py b/archive/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py
rename to archive/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py
diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_attention.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_attention.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/examples/test_attention.py
rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_attention.py
diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_linear.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_linear.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/examples/test_linear.py
rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_linear.py
diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/examples/test_mlp.py
rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py
diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_moe.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_moe.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/examples/test_moe.py
rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_moe.py
diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py
rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py
diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py
rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py
diff --git a/kt-sft/csrc/ktransformers_ext/ext_bindings.cpp b/archive/kt-sft/csrc/ktransformers_ext/ext_bindings.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/ext_bindings.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/ext_bindings.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h
diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h
diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h
rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h
diff --git a/kt-sft/csrc/ktransformers_ext/vendors/cuda.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/cuda.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/vendors/cuda.h
rename to archive/kt-sft/csrc/ktransformers_ext/vendors/cuda.h
diff --git a/kt-sft/csrc/ktransformers_ext/vendors/hip.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/hip.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/vendors/hip.h
rename to archive/kt-sft/csrc/ktransformers_ext/vendors/hip.h
diff --git a/kt-sft/csrc/ktransformers_ext/vendors/musa.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/musa.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/vendors/musa.h
rename to archive/kt-sft/csrc/ktransformers_ext/vendors/musa.h
diff --git a/kt-sft/csrc/ktransformers_ext/vendors/vendor.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/vendor.h
similarity index 100%
rename from kt-sft/csrc/ktransformers_ext/vendors/vendor.h
rename to archive/kt-sft/csrc/ktransformers_ext/vendors/vendor.h
diff --git a/kt-sft/install-with-cache.sh b/archive/kt-sft/install-with-cache.sh
similarity index 100%
rename from kt-sft/install-with-cache.sh
rename to archive/kt-sft/install-with-cache.sh
diff --git a/kt-sft/install.bat b/archive/kt-sft/install.bat
similarity index 100%
rename from kt-sft/install.bat
rename to archive/kt-sft/install.bat
diff --git a/kt-sft/install.sh b/archive/kt-sft/install.sh
similarity index 100%
rename from kt-sft/install.sh
rename to archive/kt-sft/install.sh
diff --git a/kt-sft/ktransformers/__init__.py b/archive/kt-sft/ktransformers/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/__init__.py
rename to archive/kt-sft/ktransformers/__init__.py
diff --git a/kt-sft/ktransformers/configs/config.yaml b/archive/kt-sft/ktransformers/configs/config.yaml
similarity index 100%
rename from kt-sft/ktransformers/configs/config.yaml
rename to archive/kt-sft/ktransformers/configs/config.yaml
diff --git a/kt-sft/ktransformers/configs/log_config.ini b/archive/kt-sft/ktransformers/configs/log_config.ini
similarity index 100%
rename from kt-sft/ktransformers/configs/log_config.ini
rename to archive/kt-sft/ktransformers/configs/log_config.ini
diff --git a/kt-sft/ktransformers/configs/model_config/config.json b/archive/kt-sft/ktransformers/configs/model_config/config.json
similarity index 100%
rename from kt-sft/ktransformers/configs/model_config/config.json
rename to archive/kt-sft/ktransformers/configs/model_config/config.json
diff --git a/kt-sft/ktransformers/configs/model_config/configuration_deepseek.py b/archive/kt-sft/ktransformers/configs/model_config/configuration_deepseek.py
similarity index 100%
rename from kt-sft/ktransformers/configs/model_config/configuration_deepseek.py
rename to archive/kt-sft/ktransformers/configs/model_config/configuration_deepseek.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py
diff --git a/kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py b/archive/kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py
similarity index 100%
rename from kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py
rename to archive/kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py
diff --git a/kt-sft/ktransformers/local_chat.py b/archive/kt-sft/ktransformers/local_chat.py
similarity index 100%
rename from kt-sft/ktransformers/local_chat.py
rename to archive/kt-sft/ktransformers/local_chat.py
diff --git a/kt-sft/ktransformers/local_chat.sh b/archive/kt-sft/ktransformers/local_chat.sh
similarity index 100%
rename from kt-sft/ktransformers/local_chat.sh
rename to archive/kt-sft/ktransformers/local_chat.sh
diff --git a/kt-sft/ktransformers/lora_test_module.py b/archive/kt-sft/ktransformers/lora_test_module.py
similarity index 100%
rename from kt-sft/ktransformers/lora_test_module.py
rename to archive/kt-sft/ktransformers/lora_test_module.py
diff --git a/kt-sft/ktransformers/models/__init__.py b/archive/kt-sft/ktransformers/models/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/models/__init__.py
rename to archive/kt-sft/ktransformers/models/__init__.py
diff --git a/kt-sft/ktransformers/models/configuration_deepseek.py b/archive/kt-sft/ktransformers/models/configuration_deepseek.py
similarity index 100%
rename from kt-sft/ktransformers/models/configuration_deepseek.py
rename to archive/kt-sft/ktransformers/models/configuration_deepseek.py
diff --git a/kt-sft/ktransformers/models/configuration_deepseek_v3.py b/archive/kt-sft/ktransformers/models/configuration_deepseek_v3.py
similarity index 100%
rename from kt-sft/ktransformers/models/configuration_deepseek_v3.py
rename to archive/kt-sft/ktransformers/models/configuration_deepseek_v3.py
diff --git a/kt-sft/ktransformers/models/configuration_llama.py b/archive/kt-sft/ktransformers/models/configuration_llama.py
similarity index 100%
rename from kt-sft/ktransformers/models/configuration_llama.py
rename to archive/kt-sft/ktransformers/models/configuration_llama.py
diff --git a/kt-sft/ktransformers/models/configuration_qwen2_moe.py b/archive/kt-sft/ktransformers/models/configuration_qwen2_moe.py
similarity index 100%
rename from kt-sft/ktransformers/models/configuration_qwen2_moe.py
rename to archive/kt-sft/ktransformers/models/configuration_qwen2_moe.py
diff --git a/kt-sft/ktransformers/models/configuration_qwen3_moe.py b/archive/kt-sft/ktransformers/models/configuration_qwen3_moe.py
similarity index 100%
rename from kt-sft/ktransformers/models/configuration_qwen3_moe.py
rename to archive/kt-sft/ktransformers/models/configuration_qwen3_moe.py
diff --git a/kt-sft/ktransformers/models/custom_cache.py b/archive/kt-sft/ktransformers/models/custom_cache.py
similarity index 100%
rename from kt-sft/ktransformers/models/custom_cache.py
rename to archive/kt-sft/ktransformers/models/custom_cache.py
diff --git a/kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py b/archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py
similarity index 100%
rename from kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py
rename to archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py
diff --git a/kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py b/archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py
similarity index 100%
rename from kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py
rename to archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py
diff --git a/kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py b/archive/kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py
similarity index 100%
rename from kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py
rename to archive/kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py
diff --git a/kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py b/archive/kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py
similarity index 100%
rename from kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py
rename to archive/kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py
diff --git a/kt-sft/ktransformers/models/modeling_deepseek.py b/archive/kt-sft/ktransformers/models/modeling_deepseek.py
similarity index 100%
rename from kt-sft/ktransformers/models/modeling_deepseek.py
rename to archive/kt-sft/ktransformers/models/modeling_deepseek.py
diff --git a/kt-sft/ktransformers/models/modeling_deepseek_v3.py b/archive/kt-sft/ktransformers/models/modeling_deepseek_v3.py
similarity index 100%
rename from kt-sft/ktransformers/models/modeling_deepseek_v3.py
rename to archive/kt-sft/ktransformers/models/modeling_deepseek_v3.py
diff --git a/kt-sft/ktransformers/models/modeling_llama.py b/archive/kt-sft/ktransformers/models/modeling_llama.py
similarity index 100%
rename from kt-sft/ktransformers/models/modeling_llama.py
rename to archive/kt-sft/ktransformers/models/modeling_llama.py
diff --git a/kt-sft/ktransformers/models/modeling_mixtral.py b/archive/kt-sft/ktransformers/models/modeling_mixtral.py
similarity index 100%
rename from kt-sft/ktransformers/models/modeling_mixtral.py
rename to archive/kt-sft/ktransformers/models/modeling_mixtral.py
diff --git a/kt-sft/ktransformers/models/modeling_qwen2_moe.py b/archive/kt-sft/ktransformers/models/modeling_qwen2_moe.py
similarity index 100%
rename from kt-sft/ktransformers/models/modeling_qwen2_moe.py
rename to archive/kt-sft/ktransformers/models/modeling_qwen2_moe.py
diff --git a/kt-sft/ktransformers/models/modeling_qwen3_moe.py b/archive/kt-sft/ktransformers/models/modeling_qwen3_moe.py
similarity index 100%
rename from kt-sft/ktransformers/models/modeling_qwen3_moe.py
rename to archive/kt-sft/ktransformers/models/modeling_qwen3_moe.py
diff --git a/kt-sft/ktransformers/moe_test_module.py b/archive/kt-sft/ktransformers/moe_test_module.py
similarity index 100%
rename from kt-sft/ktransformers/moe_test_module.py
rename to archive/kt-sft/ktransformers/moe_test_module.py
diff --git a/kt-sft/ktransformers/moe_test_module_old.py b/archive/kt-sft/ktransformers/moe_test_module_old.py
similarity index 100%
rename from kt-sft/ktransformers/moe_test_module_old.py
rename to archive/kt-sft/ktransformers/moe_test_module_old.py
diff --git a/kt-sft/ktransformers/operators/RoPE.py b/archive/kt-sft/ktransformers/operators/RoPE.py
similarity index 100%
rename from kt-sft/ktransformers/operators/RoPE.py
rename to archive/kt-sft/ktransformers/operators/RoPE.py
diff --git a/kt-sft/ktransformers/operators/__init__.py b/archive/kt-sft/ktransformers/operators/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/operators/__init__.py
rename to archive/kt-sft/ktransformers/operators/__init__.py
diff --git a/kt-sft/ktransformers/operators/attention.py b/archive/kt-sft/ktransformers/operators/attention.py
similarity index 100%
rename from kt-sft/ktransformers/operators/attention.py
rename to archive/kt-sft/ktransformers/operators/attention.py
diff --git a/kt-sft/ktransformers/operators/balance_serve_attention.py b/archive/kt-sft/ktransformers/operators/balance_serve_attention.py
similarity index 100%
rename from kt-sft/ktransformers/operators/balance_serve_attention.py
rename to archive/kt-sft/ktransformers/operators/balance_serve_attention.py
diff --git a/kt-sft/ktransformers/operators/base_operator.py b/archive/kt-sft/ktransformers/operators/base_operator.py
similarity index 100%
rename from kt-sft/ktransformers/operators/base_operator.py
rename to archive/kt-sft/ktransformers/operators/base_operator.py
diff --git a/kt-sft/ktransformers/operators/cpuinfer.py b/archive/kt-sft/ktransformers/operators/cpuinfer.py
similarity index 100%
rename from kt-sft/ktransformers/operators/cpuinfer.py
rename to archive/kt-sft/ktransformers/operators/cpuinfer.py
diff --git a/kt-sft/ktransformers/operators/dynamic_attention.py b/archive/kt-sft/ktransformers/operators/dynamic_attention.py
similarity index 100%
rename from kt-sft/ktransformers/operators/dynamic_attention.py
rename to archive/kt-sft/ktransformers/operators/dynamic_attention.py
diff --git a/kt-sft/ktransformers/operators/experts.py b/archive/kt-sft/ktransformers/operators/experts.py
similarity index 93%
rename from kt-sft/ktransformers/operators/experts.py
rename to archive/kt-sft/ktransformers/operators/experts.py
index 19bbd64f..0e80bf18 100644
--- a/kt-sft/ktransformers/operators/experts.py
+++ b/archive/kt-sft/ktransformers/operators/experts.py
@@ -418,6 +418,18 @@ class KSFTExpertsCPU(torch.autograd.Function):
#stream_map:dict = {} # Manage cuda stream on different gpu
#gguf_loader:GGUFLoader = None
CPU_INFER = CPUInfer(Config().cpu_infer)
+
+ # Pinned memory buffers for training (batch mode)
+ # These are used for efficient CPU-GPU data transfer
+ _pinned_input_buf: Tensor = None # [max_tokens, hidden_size]
+ _pinned_output_buf: Tensor = None # [max_tokens, hidden_size]
+ _pinned_expert_ids_buf: Tensor = None # [max_tokens, num_experts_per_tok]
+ _pinned_weights_buf: Tensor = None # [max_tokens, num_experts_per_tok]
+ _pinned_grad_out_buf: Tensor = None # [max_tokens, hidden_size] for backward
+ _pinned_grad_in_buf: Tensor = None # [max_tokens, hidden_size] for backward
+ _pinned_buf_size: int = 0 # current buffer capacity (max_tokens)
+ _hidden_size: int = 0
+ _num_experts_per_tok: int = 0
def __init__(
self,
key: str,
@@ -449,6 +461,57 @@ class KSFTExpertsCPU(torch.autograd.Function):
self.tflops_fwd = []
self.tflops_bwd = []
+ @classmethod
+ def _ensure_pinned_buffers(cls, num_tokens: int, hidden_size: int, num_experts_per_tok: int):
+ """
+ Ensure pinned memory buffers are allocated with sufficient size.
+ Buffers are reused across calls and only reallocated if more space is needed.
+ """
+ # Check if we need to allocate or expand buffers
+ if (cls._pinned_input_buf is None or
+ num_tokens > cls._pinned_buf_size or
+ hidden_size != cls._hidden_size or
+ num_experts_per_tok != cls._num_experts_per_tok):
+
+ # Allocate with some extra capacity to reduce reallocations
+ new_size = max(num_tokens, cls._pinned_buf_size * 2) if cls._pinned_buf_size > 0 else num_tokens
+ new_size = max(new_size, 1024) # minimum 1024 tokens
+
+ # Free old buffers
+ cls._pinned_input_buf = None
+ cls._pinned_output_buf = None
+ cls._pinned_expert_ids_buf = None
+ cls._pinned_weights_buf = None
+ cls._pinned_grad_out_buf = None
+ cls._pinned_grad_in_buf = None
+
+ # Allocate new pinned buffers
+ cls._pinned_input_buf = torch.empty(
+ (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True
+ )
+ cls._pinned_output_buf = torch.empty(
+ (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True
+ )
+ cls._pinned_expert_ids_buf = torch.empty(
+ (new_size, num_experts_per_tok), dtype=torch.long, device="cpu", pin_memory=True
+ )
+ cls._pinned_weights_buf = torch.empty(
+ (new_size, num_experts_per_tok), dtype=torch.float32, device="cpu", pin_memory=True
+ )
+ cls._pinned_grad_out_buf = torch.empty(
+ (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True
+ )
+ cls._pinned_grad_in_buf = torch.empty(
+ (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True
+ )
+
+ cls._pinned_buf_size = new_size
+ cls._hidden_size = hidden_size
+ cls._num_experts_per_tok = num_experts_per_tok
+
+ print(f"[KSFTExpertsCPU] Allocated pinned memory buffers: "
+ f"size={new_size}, hidden={hidden_size}, k={num_experts_per_tok}")
+
def load(self, w: dict | nn.Parameter | tuple | None = None, device:str|None = None, warmup:bool = False):
if device:
assert device.lower() == "cpu", "KSFTExpertsCPU can only be loaded on CPU, Parameter \"device\" can be cpu or None."
@@ -548,7 +611,16 @@ class KSFTExpertsCPU(torch.autograd.Function):
KSFTExpertsCPU.expert_ids_cpu = torch.zeros((num_experts_per_tok), device="cpu", dtype=torch.long, pin_memory=True)
KSFTExpertsCPU.weights_cpu = torch.zeros((num_experts_per_tok), device="cpu", dtype=torch.float32, pin_memory=True)
KSFTExpertsCPU.output_cpu = torch.zeros((self.config.hidden_size), device="cpu", pin_memory=True, dtype=torch.bfloat16)
-
+
+ # Initialize pinned memory buffers for training (batch mode)
+ # Default capacity is 4096 tokens; the buffers grow automatically if a larger batch arrives
+ default_max_tokens = 4096
+ KSFTExpertsCPU._ensure_pinned_buffers(
+ default_max_tokens,
+ self.config.hidden_size,
+ num_experts_per_tok
+ )
+
self.gate = None
self.up = None
self.down = None
@@ -577,37 +649,68 @@ class KSFTExpertsCPU(torch.autograd.Function):
if input_tensor.size(0)==1 and torch.cuda.is_current_stream_capturing():
# TODO: this branch is unreachable, but the shape of input_tensor([1,hidden_size]) and input_tensor_cpu([hidden_size]) is not compatible
#print("capturing experts")
+ wall_t0 = time.time()
KSFTExpertsCPU.input_tensor_cpu.copy_(input_tensor, non_blocking=True)
KSFTExpertsCPU.expert_ids_cpu.copy_(expert_ids, non_blocking=True)
KSFTExpertsCPU.weights_cpu.copy_(weights, non_blocking=True)
cpu_infer.submit_with_cuda_stream(torch.cuda.current_stream().cuda_stream, moe.forward(1, expert_ids.size(1), KSFTExpertsCPU.expert_ids_cpu.data_ptr(), KSFTExpertsCPU.weights_cpu.data_ptr(), KSFTExpertsCPU.input_tensor_cpu.data_ptr(), KSFTExpertsCPU.output_cpu.data_ptr()))
cpu_infer.sync_with_cuda_stream(torch.cuda.current_stream().cuda_stream)
- t_fwd = time.time() - wall_t0
+ t_fwd = time.time() - wall_t0
KSFTExpertsCPU.output_gpu_map[out_device].copy_(KSFTExpertsCPU.output_cpu, non_blocking=True)
result = KSFTExpertsCPU.output_gpu_map[out_device]
+ # Keep CPU copies of the inputs so backward() can read them from ctx.saved_tensors
+ input_cpu = input_tensor.contiguous().cpu()
+ expert_ids_cpu = expert_ids.contiguous().cpu()
+ weights_cpu = weights.to(torch.float32).contiguous().cpu()
else:
- input_tensor = input_tensor.contiguous().cpu()
- expert_ids = expert_ids.contiguous().cpu()
- weights = weights.contiguous().to(torch.float32).cpu()
- output = torch.empty_like(input_tensor).contiguous()
- # print("success record")
+ num_tokens = input_tensor.size(0)
+ hidden_size = input_tensor.size(1)
+ num_experts_per_tok = expert_ids.size(1)
+
+ # Ensure pinned buffers are large enough
+ KSFTExpertsCPU._ensure_pinned_buffers(num_tokens, hidden_size, num_experts_per_tok)
+
+ # Use pinned memory buffers for efficient CPU-GPU transfer
+ input_buf = KSFTExpertsCPU._pinned_input_buf[:num_tokens]
+ output_buf = KSFTExpertsCPU._pinned_output_buf[:num_tokens]
+ expert_ids_buf = KSFTExpertsCPU._pinned_expert_ids_buf[:num_tokens]
+ weights_buf = KSFTExpertsCPU._pinned_weights_buf[:num_tokens]
+
+ # Copy data to pinned memory (non_blocking for async transfer)
+ input_buf.copy_(input_tensor.to(torch.bfloat16), non_blocking=True)
+ expert_ids_buf.copy_(expert_ids, non_blocking=True)
+ weights_buf.copy_(weights.to(torch.float32), non_blocking=True)
+
+ # Synchronize to ensure data is ready on CPU
+ if input_tensor.is_cuda:
+ torch.cuda.current_stream().synchronize()
+
+ # Make contiguous views for CPU computation
+ input_cpu = input_buf.contiguous()
+ expert_ids_cpu = expert_ids_buf.contiguous()
+ weights_cpu = weights_buf.contiguous()
+ output_cpu = output_buf.contiguous()
+
wall_t0 = time.time()
cpu_infer.submit(
moe.forward(
- expert_ids.size(0),
- expert_ids.size(1),
- expert_ids.data_ptr(),
- weights.data_ptr(),
- input_tensor.data_ptr(),
- output.data_ptr(),
+ expert_ids_cpu.size(0),
+ expert_ids_cpu.size(1),
+ expert_ids_cpu.data_ptr(),
+ weights_cpu.data_ptr(),
+ input_cpu.data_ptr(),
+ output_cpu.data_ptr(),
)
)
cpu_infer.sync()
- t_fwd = time.time() - wall_t0
+ t_fwd = time.time() - wall_t0
- result = output.to(device=out_device)
+ # Copy result back to GPU using pinned memory (async)
+ result = torch.empty((num_tokens, hidden_size), dtype=input_tensor.dtype, device=out_device)
+ result.copy_(output_cpu, non_blocking=True)
- ctx.save_for_backward(input_tensor, expert_ids, weights)
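+ # Save CPU tensors for backward. NOTE: these are views of the shared class-level pinned buffers, so a later forward() call may overwrite them before backward() runs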
+ ctx.save_for_backward(input_cpu, expert_ids_cpu, weights_cpu)
ctx.cpu_infer = cpu_infer
ctx.moe = moe
ctx.out_device = out_device
@@ -632,50 +735,63 @@ class KSFTExpertsCPU(torch.autograd.Function):
@staticmethod
def backward(ctx, output_grad):
# print("Go into the backward!!")
-
- # Pick back the middle results
- input_tensor, expert_ids, weights = ctx.saved_tensors
- import random
- layer_idx = random.randint(0, 10000)
- # print(f"layer_idx:{layer_idx}")
- # layer_idx = ctx.layer_idx
-
- # cpu_infer = ctx.cpu_infer
- # moe = ctx.moe
- # out_device = ctx.out_device
- # ready for computing gradient
- output_grad = output_grad.contiguous().cpu()
- input_grad = torch.empty_like(input_tensor).contiguous()
- # print(dir(cpuinfer_ext.moe.MOE))
+ # Retrieve the tensors saved in forward (they already live in pinned memory)
+ input_tensor, expert_ids, weights = ctx.saved_tensors
+
+ num_tokens = output_grad.size(0)
+ hidden_size = output_grad.size(1)
+ num_experts_per_tok = expert_ids.size(1)
+
+ # Ensure pinned buffers are large enough (should already be from forward)
+ KSFTExpertsCPU._ensure_pinned_buffers(num_tokens, hidden_size, num_experts_per_tok)
+
+ # Use pinned memory buffers for gradient transfer
+ grad_out_buf = KSFTExpertsCPU._pinned_grad_out_buf[:num_tokens]
+ grad_in_buf = KSFTExpertsCPU._pinned_grad_in_buf[:num_tokens]
+
+ # Copy output_grad to pinned memory (async)
+ grad_out_buf.copy_(output_grad.to(torch.bfloat16), non_blocking=True)
+
+ # Synchronize to ensure data is ready on CPU
+ if output_grad.is_cuda:
+ torch.cuda.current_stream().synchronize()
+
+ # Make contiguous for CPU computation
+ output_grad_cpu = grad_out_buf.contiguous()
+ input_grad_cpu = grad_in_buf.contiguous()
+
bw_start = time.time()
ctx.cpu_infer.submit(
ctx.moe.backward(
- # layer_idx,
- output_grad.size(0), # qlen
- expert_ids.size(1), # k
+ output_grad_cpu.size(0), # qlen
+ expert_ids.size(1), # k
expert_ids.data_ptr(),
weights.data_ptr(),
- input_tensor.data_ptr(),
- output_grad.data_ptr(),
- input_grad.data_ptr(),
+ input_tensor.data_ptr(),
+ output_grad_cpu.data_ptr(),
+ input_grad_cpu.data_ptr(),
)
)
ctx.cpu_infer.sync()
-
- bw_end = time.time()
- t_bw = bw_end - bw_start
-
+
+ bw_end = time.time()
+ t_bw = bw_end - bw_start
+
+ # Copy gradient back to GPU using pinned memory (async)
+ result_grad = torch.empty((num_tokens, hidden_size), dtype=output_grad.dtype, device=ctx.out_device)
+ result_grad.copy_(input_grad_cpu, non_blocking=True)
+
# ---------- FLOPs ----------
- qlen, k = ctx.saved_dims
+ qlen, k = ctx.saved_dims
flops_bw = 10 * qlen * k * H_FIXED * M_FIXED
tflops_b = flops_bw / t_bw / 1e12
# print(f"qlen:{qlen}, k:{k}")
# with open("test_V3_ESC.txt", "a", encoding="utf-8") as f:
# f.write(f"[KSFTExpertsCPU]Backward: {flops_bw/1e9:.3f} GFLOPs {tflops_b:.2f} TFLOPS {t_bw*1e3:.2f} ms\n")
-
- return input_grad.to(device=ctx.out_device), None, None, None, None, None, None
+
+ return result_grad, None, None, None, None, None, None
def unload(self):
return
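Review note on the `_ensure_pinned_buffers` hunk above: the growth policy can be read in isolation as a small pure function. The sketch below mirrors only the capacity computation from the diff (double the current capacity, honor the request, never go below 1024 tokens). The name `next_capacity` and the `minimum` parameter are hypothetical and do not exist in the patch; it involves no torch allocation.

```python
def next_capacity(requested: int, current: int, minimum: int = 1024) -> int:
    """Token capacity to allocate when `requested` tokens are needed.

    Hypothetical helper mirroring the arithmetic in _ensure_pinned_buffers:
    - if buffers already exist, grow to at least double the current capacity;
    - otherwise start from the requested size;
    - never allocate fewer than `minimum` tokens.
    """
    if current > 0:
        grown = max(requested, current * 2)
    else:
        grown = requested
    return max(grown, minimum)


# First allocation at the default size used in load()
initial = next_capacity(4096, 0)
# A slightly larger batch triggers doubling rather than an exact fit
after_growth = next_capacity(5000, initial)
```

Doubling keeps the number of reallocations logarithmic in the largest batch seen, at the cost of holding up to twice the needed pinned memory. Note that in the patch the same computation also runs when only `hidden_size` or `num_experts_per_tok` changes, so the capacity doubles in that case as well.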
diff --git a/kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py b/archive/kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py
similarity index 100%
rename from kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py
rename to archive/kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py
diff --git a/kt-sft/ktransformers/operators/flashinfer_wrapper.py b/archive/kt-sft/ktransformers/operators/flashinfer_wrapper.py
similarity index 100%
rename from kt-sft/ktransformers/operators/flashinfer_wrapper.py
rename to archive/kt-sft/ktransformers/operators/flashinfer_wrapper.py
diff --git a/kt-sft/ktransformers/operators/gate.py b/archive/kt-sft/ktransformers/operators/gate.py
similarity index 100%
rename from kt-sft/ktransformers/operators/gate.py
rename to archive/kt-sft/ktransformers/operators/gate.py
diff --git a/kt-sft/ktransformers/operators/layernorm.py b/archive/kt-sft/ktransformers/operators/layernorm.py
similarity index 100%
rename from kt-sft/ktransformers/operators/layernorm.py
rename to archive/kt-sft/ktransformers/operators/layernorm.py
diff --git a/kt-sft/ktransformers/operators/linear.py b/archive/kt-sft/ktransformers/operators/linear.py
similarity index 100%
rename from kt-sft/ktransformers/operators/linear.py
rename to archive/kt-sft/ktransformers/operators/linear.py
diff --git a/kt-sft/ktransformers/operators/mlp.py b/archive/kt-sft/ktransformers/operators/mlp.py
similarity index 100%
rename from kt-sft/ktransformers/operators/mlp.py
rename to archive/kt-sft/ktransformers/operators/mlp.py
diff --git a/kt-sft/ktransformers/operators/models.py b/archive/kt-sft/ktransformers/operators/models.py
similarity index 100%
rename from kt-sft/ktransformers/operators/models.py
rename to archive/kt-sft/ktransformers/operators/models.py
diff --git a/kt-sft/ktransformers/operators/triton_attention.py b/archive/kt-sft/ktransformers/operators/triton_attention.py
similarity index 100%
rename from kt-sft/ktransformers/operators/triton_attention.py
rename to archive/kt-sft/ktransformers/operators/triton_attention.py
diff --git a/kt-sft/ktransformers/operators/triton_attention_prefill.py b/archive/kt-sft/ktransformers/operators/triton_attention_prefill.py
similarity index 100%
rename from kt-sft/ktransformers/operators/triton_attention_prefill.py
rename to archive/kt-sft/ktransformers/operators/triton_attention_prefill.py
diff --git a/kt-sft/ktransformers/optimize/optimize.py b/archive/kt-sft/ktransformers/optimize/optimize.py
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize.py
rename to archive/kt-sft/ktransformers/optimize/optimize.py
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml
diff --git a/kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml
similarity index 100%
rename from kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml
rename to archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml
diff --git a/kt-sft/ktransformers/server/__init__.py b/archive/kt-sft/ktransformers/server/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/__init__.py
rename to archive/kt-sft/ktransformers/server/__init__.py
diff --git a/kt-sft/ktransformers/server/api/__init__.py b/archive/kt-sft/ktransformers/server/api/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/__init__.py
rename to archive/kt-sft/ktransformers/server/api/__init__.py
diff --git a/kt-sft/ktransformers/server/api/ollama/__init__.py b/archive/kt-sft/ktransformers/server/api/ollama/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/ollama/__init__.py
rename to archive/kt-sft/ktransformers/server/api/ollama/__init__.py
diff --git a/kt-sft/ktransformers/server/api/ollama/completions.py b/archive/kt-sft/ktransformers/server/api/ollama/completions.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/ollama/completions.py
rename to archive/kt-sft/ktransformers/server/api/ollama/completions.py
diff --git a/kt-sft/ktransformers/server/api/openai/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/__init__.py
rename to archive/kt-sft/ktransformers/server/api/openai/__init__.py
diff --git a/kt-sft/ktransformers/server/api/openai/assistants/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/assistants/__init__.py
rename to archive/kt-sft/ktransformers/server/api/openai/assistants/__init__.py
diff --git a/kt-sft/ktransformers/server/api/openai/assistants/assistants.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/assistants.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/assistants/assistants.py
rename to archive/kt-sft/ktransformers/server/api/openai/assistants/assistants.py
diff --git a/kt-sft/ktransformers/server/api/openai/assistants/messages.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/messages.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/assistants/messages.py
rename to archive/kt-sft/ktransformers/server/api/openai/assistants/messages.py
diff --git a/kt-sft/ktransformers/server/api/openai/assistants/runs.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/runs.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/assistants/runs.py
rename to archive/kt-sft/ktransformers/server/api/openai/assistants/runs.py
diff --git a/kt-sft/ktransformers/server/api/openai/assistants/threads.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/threads.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/assistants/threads.py
rename to archive/kt-sft/ktransformers/server/api/openai/assistants/threads.py
diff --git a/kt-sft/ktransformers/server/api/openai/endpoints/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/endpoints/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/endpoints/__init__.py
rename to archive/kt-sft/ktransformers/server/api/openai/endpoints/__init__.py
diff --git a/kt-sft/ktransformers/server/api/openai/endpoints/chat.py b/archive/kt-sft/ktransformers/server/api/openai/endpoints/chat.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/endpoints/chat.py
rename to archive/kt-sft/ktransformers/server/api/openai/endpoints/chat.py
diff --git a/kt-sft/ktransformers/server/api/openai/legacy/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/legacy/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/legacy/__init__.py
rename to archive/kt-sft/ktransformers/server/api/openai/legacy/__init__.py
diff --git a/kt-sft/ktransformers/server/api/openai/legacy/completions.py b/archive/kt-sft/ktransformers/server/api/openai/legacy/completions.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/openai/legacy/completions.py
rename to archive/kt-sft/ktransformers/server/api/openai/legacy/completions.py
diff --git a/kt-sft/ktransformers/server/api/web/__init__.py b/archive/kt-sft/ktransformers/server/api/web/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/web/__init__.py
rename to archive/kt-sft/ktransformers/server/api/web/__init__.py
diff --git a/kt-sft/ktransformers/server/api/web/system.py b/archive/kt-sft/ktransformers/server/api/web/system.py
similarity index 100%
rename from kt-sft/ktransformers/server/api/web/system.py
rename to archive/kt-sft/ktransformers/server/api/web/system.py
diff --git a/kt-sft/ktransformers/server/args.py b/archive/kt-sft/ktransformers/server/args.py
similarity index 100%
rename from kt-sft/ktransformers/server/args.py
rename to archive/kt-sft/ktransformers/server/args.py
diff --git a/kt-sft/ktransformers/server/backend/__init__.py b/archive/kt-sft/ktransformers/server/backend/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/__init__.py
rename to archive/kt-sft/ktransformers/server/backend/__init__.py
diff --git a/kt-sft/ktransformers/server/backend/args.py b/archive/kt-sft/ktransformers/server/backend/args.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/args.py
rename to archive/kt-sft/ktransformers/server/backend/args.py
diff --git a/kt-sft/ktransformers/server/backend/base.py b/archive/kt-sft/ktransformers/server/backend/base.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/base.py
rename to archive/kt-sft/ktransformers/server/backend/base.py
diff --git a/kt-sft/ktransformers/server/backend/context_manager.py b/archive/kt-sft/ktransformers/server/backend/context_manager.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/context_manager.py
rename to archive/kt-sft/ktransformers/server/backend/context_manager.py
diff --git a/kt-sft/ktransformers/server/backend/interfaces/__init__.py b/archive/kt-sft/ktransformers/server/backend/interfaces/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/interfaces/__init__.py
rename to archive/kt-sft/ktransformers/server/backend/interfaces/__init__.py
diff --git a/kt-sft/ktransformers/server/backend/interfaces/balance_serve.py b/archive/kt-sft/ktransformers/server/backend/interfaces/balance_serve.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/interfaces/balance_serve.py
rename to archive/kt-sft/ktransformers/server/backend/interfaces/balance_serve.py
diff --git a/kt-sft/ktransformers/server/backend/interfaces/exllamav2.py b/archive/kt-sft/ktransformers/server/backend/interfaces/exllamav2.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/interfaces/exllamav2.py
rename to archive/kt-sft/ktransformers/server/backend/interfaces/exllamav2.py
diff --git a/kt-sft/ktransformers/server/backend/interfaces/ktransformers.py b/archive/kt-sft/ktransformers/server/backend/interfaces/ktransformers.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/interfaces/ktransformers.py
rename to archive/kt-sft/ktransformers/server/backend/interfaces/ktransformers.py
diff --git a/kt-sft/ktransformers/server/backend/interfaces/transformers.py b/archive/kt-sft/ktransformers/server/backend/interfaces/transformers.py
similarity index 100%
rename from kt-sft/ktransformers/server/backend/interfaces/transformers.py
rename to archive/kt-sft/ktransformers/server/backend/interfaces/transformers.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/__init__.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/__init__.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/__init__.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/config.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/config.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/config.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/config.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/model_runner.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/model_runner.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/model_runner.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/model_runner.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/query_manager.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/query_manager.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/query_manager.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/query_manager.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py
diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py
rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py
diff --git a/kt-sft/ktransformers/server/balance_serve/sched_rpc.py b/archive/kt-sft/ktransformers/server/balance_serve/sched_rpc.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/sched_rpc.py
rename to archive/kt-sft/ktransformers/server/balance_serve/sched_rpc.py
diff --git a/kt-sft/ktransformers/server/balance_serve/settings.py b/archive/kt-sft/ktransformers/server/balance_serve/settings.py
similarity index 100%
rename from kt-sft/ktransformers/server/balance_serve/settings.py
rename to archive/kt-sft/ktransformers/server/balance_serve/settings.py
diff --git a/kt-sft/ktransformers/server/config/config.py b/archive/kt-sft/ktransformers/server/config/config.py
similarity index 100%
rename from kt-sft/ktransformers/server/config/config.py
rename to archive/kt-sft/ktransformers/server/config/config.py
diff --git a/kt-sft/ktransformers/server/config/log.py b/archive/kt-sft/ktransformers/server/config/log.py
similarity index 100%
rename from kt-sft/ktransformers/server/config/log.py
rename to archive/kt-sft/ktransformers/server/config/log.py
diff --git a/kt-sft/ktransformers/server/config/singleton.py b/archive/kt-sft/ktransformers/server/config/singleton.py
similarity index 100%
rename from kt-sft/ktransformers/server/config/singleton.py
rename to archive/kt-sft/ktransformers/server/config/singleton.py
diff --git a/kt-sft/ktransformers/server/crud/__init__.py b/archive/kt-sft/ktransformers/server/crud/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/crud/__init__.py
rename to archive/kt-sft/ktransformers/server/crud/__init__.py
diff --git a/kt-sft/ktransformers/server/crud/assistants/__init__.py b/archive/kt-sft/ktransformers/server/crud/assistants/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/crud/assistants/__init__.py
rename to archive/kt-sft/ktransformers/server/crud/assistants/__init__.py
diff --git a/kt-sft/ktransformers/server/crud/assistants/assistants.py b/archive/kt-sft/ktransformers/server/crud/assistants/assistants.py
similarity index 100%
rename from kt-sft/ktransformers/server/crud/assistants/assistants.py
rename to archive/kt-sft/ktransformers/server/crud/assistants/assistants.py
diff --git a/kt-sft/ktransformers/server/crud/assistants/messages.py b/archive/kt-sft/ktransformers/server/crud/assistants/messages.py
similarity index 100%
rename from kt-sft/ktransformers/server/crud/assistants/messages.py
rename to archive/kt-sft/ktransformers/server/crud/assistants/messages.py
diff --git a/kt-sft/ktransformers/server/crud/assistants/runs.py b/archive/kt-sft/ktransformers/server/crud/assistants/runs.py
similarity index 100%
rename from kt-sft/ktransformers/server/crud/assistants/runs.py
rename to archive/kt-sft/ktransformers/server/crud/assistants/runs.py
diff --git a/kt-sft/ktransformers/server/crud/assistants/threads.py b/archive/kt-sft/ktransformers/server/crud/assistants/threads.py
similarity index 100%
rename from kt-sft/ktransformers/server/crud/assistants/threads.py
rename to archive/kt-sft/ktransformers/server/crud/assistants/threads.py
diff --git a/kt-sft/ktransformers/server/exceptions.py b/archive/kt-sft/ktransformers/server/exceptions.py
similarity index 100%
rename from kt-sft/ktransformers/server/exceptions.py
rename to archive/kt-sft/ktransformers/server/exceptions.py
diff --git a/kt-sft/ktransformers/server/main.py b/archive/kt-sft/ktransformers/server/main.py
similarity index 100%
rename from kt-sft/ktransformers/server/main.py
rename to archive/kt-sft/ktransformers/server/main.py
diff --git a/kt-sft/ktransformers/server/models/__init__.py b/archive/kt-sft/ktransformers/server/models/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/__init__.py
rename to archive/kt-sft/ktransformers/server/models/__init__.py
diff --git a/kt-sft/ktransformers/server/models/assistants/__init__.py b/archive/kt-sft/ktransformers/server/models/assistants/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/assistants/__init__.py
rename to archive/kt-sft/ktransformers/server/models/assistants/__init__.py
diff --git a/kt-sft/ktransformers/server/models/assistants/assistants.py b/archive/kt-sft/ktransformers/server/models/assistants/assistants.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/assistants/assistants.py
rename to archive/kt-sft/ktransformers/server/models/assistants/assistants.py
diff --git a/kt-sft/ktransformers/server/models/assistants/messages.py b/archive/kt-sft/ktransformers/server/models/assistants/messages.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/assistants/messages.py
rename to archive/kt-sft/ktransformers/server/models/assistants/messages.py
diff --git a/kt-sft/ktransformers/server/models/assistants/run_steps.py b/archive/kt-sft/ktransformers/server/models/assistants/run_steps.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/assistants/run_steps.py
rename to archive/kt-sft/ktransformers/server/models/assistants/run_steps.py
diff --git a/kt-sft/ktransformers/server/models/assistants/runs.py b/archive/kt-sft/ktransformers/server/models/assistants/runs.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/assistants/runs.py
rename to archive/kt-sft/ktransformers/server/models/assistants/runs.py
diff --git a/kt-sft/ktransformers/server/models/assistants/threads.py b/archive/kt-sft/ktransformers/server/models/assistants/threads.py
similarity index 100%
rename from kt-sft/ktransformers/server/models/assistants/threads.py
rename to archive/kt-sft/ktransformers/server/models/assistants/threads.py
diff --git a/kt-sft/ktransformers/server/schemas/__init__.py b/archive/kt-sft/ktransformers/server/schemas/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/__init__.py
rename to archive/kt-sft/ktransformers/server/schemas/__init__.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/__init__.py b/archive/kt-sft/ktransformers/server/schemas/assistants/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/__init__.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/__init__.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/assistants.py b/archive/kt-sft/ktransformers/server/schemas/assistants/assistants.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/assistants.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/assistants.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/messages.py b/archive/kt-sft/ktransformers/server/schemas/assistants/messages.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/messages.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/messages.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/runs.py b/archive/kt-sft/ktransformers/server/schemas/assistants/runs.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/runs.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/runs.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/streaming.py b/archive/kt-sft/ktransformers/server/schemas/assistants/streaming.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/streaming.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/streaming.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/threads.py b/archive/kt-sft/ktransformers/server/schemas/assistants/threads.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/threads.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/threads.py
diff --git a/kt-sft/ktransformers/server/schemas/assistants/tool.py b/archive/kt-sft/ktransformers/server/schemas/assistants/tool.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/assistants/tool.py
rename to archive/kt-sft/ktransformers/server/schemas/assistants/tool.py
diff --git a/kt-sft/ktransformers/server/schemas/base.py b/archive/kt-sft/ktransformers/server/schemas/base.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/base.py
rename to archive/kt-sft/ktransformers/server/schemas/base.py
diff --git a/kt-sft/ktransformers/server/schemas/conversation.py b/archive/kt-sft/ktransformers/server/schemas/conversation.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/conversation.py
rename to archive/kt-sft/ktransformers/server/schemas/conversation.py
diff --git a/kt-sft/ktransformers/server/schemas/endpoints/chat.py b/archive/kt-sft/ktransformers/server/schemas/endpoints/chat.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/endpoints/chat.py
rename to archive/kt-sft/ktransformers/server/schemas/endpoints/chat.py
diff --git a/kt-sft/ktransformers/server/schemas/legacy/__init__.py b/archive/kt-sft/ktransformers/server/schemas/legacy/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/legacy/__init__.py
rename to archive/kt-sft/ktransformers/server/schemas/legacy/__init__.py
diff --git a/kt-sft/ktransformers/server/schemas/legacy/completions.py b/archive/kt-sft/ktransformers/server/schemas/legacy/completions.py
similarity index 100%
rename from kt-sft/ktransformers/server/schemas/legacy/completions.py
rename to archive/kt-sft/ktransformers/server/schemas/legacy/completions.py
diff --git a/kt-sft/ktransformers/server/utils/__init__.py b/archive/kt-sft/ktransformers/server/utils/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/server/utils/__init__.py
rename to archive/kt-sft/ktransformers/server/utils/__init__.py
diff --git a/kt-sft/ktransformers/server/utils/create_interface.py b/archive/kt-sft/ktransformers/server/utils/create_interface.py
similarity index 100%
rename from kt-sft/ktransformers/server/utils/create_interface.py
rename to archive/kt-sft/ktransformers/server/utils/create_interface.py
diff --git a/kt-sft/ktransformers/server/utils/multi_timer.py b/archive/kt-sft/ktransformers/server/utils/multi_timer.py
similarity index 100%
rename from kt-sft/ktransformers/server/utils/multi_timer.py
rename to archive/kt-sft/ktransformers/server/utils/multi_timer.py
diff --git a/kt-sft/ktransformers/server/utils/sql_utils.py b/archive/kt-sft/ktransformers/server/utils/sql_utils.py
similarity index 100%
rename from kt-sft/ktransformers/server/utils/sql_utils.py
rename to archive/kt-sft/ktransformers/server/utils/sql_utils.py
diff --git a/kt-sft/ktransformers/sft/__init__.py b/archive/kt-sft/ktransformers/sft/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/sft/__init__.py
rename to archive/kt-sft/ktransformers/sft/__init__.py
diff --git a/kt-sft/ktransformers/sft/flops_utils/__init__.py b/archive/kt-sft/ktransformers/sft/flops_utils/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/sft/flops_utils/__init__.py
rename to archive/kt-sft/ktransformers/sft/flops_utils/__init__.py
diff --git a/kt-sft/ktransformers/sft/flops_utils/custom_profile.py b/archive/kt-sft/ktransformers/sft/flops_utils/custom_profile.py
similarity index 100%
rename from kt-sft/ktransformers/sft/flops_utils/custom_profile.py
rename to archive/kt-sft/ktransformers/sft/flops_utils/custom_profile.py
diff --git a/kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py b/archive/kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py
similarity index 100%
rename from kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py
rename to archive/kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py
diff --git a/kt-sft/ktransformers/sft/lora.py b/archive/kt-sft/ktransformers/sft/lora.py
similarity index 100%
rename from kt-sft/ktransformers/sft/lora.py
rename to archive/kt-sft/ktransformers/sft/lora.py
diff --git a/kt-sft/ktransformers/sft/metrics.py b/archive/kt-sft/ktransformers/sft/metrics.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics.py
rename to archive/kt-sft/ktransformers/sft/metrics.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/__init__.py b/archive/kt-sft/ktransformers/sft/metrics_utils/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/__init__.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/__init__.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/constants.py b/archive/kt-sft/ktransformers/sft/metrics_utils/constants.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/constants.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/constants.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/env.py b/archive/kt-sft/ktransformers/sft/metrics_utils/env.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/env.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/env.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/logging.py b/archive/kt-sft/ktransformers/sft/metrics_utils/logging.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/logging.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/logging.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/misc.py b/archive/kt-sft/ktransformers/sft/metrics_utils/misc.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/misc.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/misc.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/packages.py b/archive/kt-sft/ktransformers/sft/metrics_utils/packages.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/packages.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/packages.py
diff --git a/kt-sft/ktransformers/sft/metrics_utils/ploting.py b/archive/kt-sft/ktransformers/sft/metrics_utils/ploting.py
similarity index 100%
rename from kt-sft/ktransformers/sft/metrics_utils/ploting.py
rename to archive/kt-sft/ktransformers/sft/metrics_utils/ploting.py
diff --git a/kt-sft/ktransformers/sft/monkey_patch_torch_module.py b/archive/kt-sft/ktransformers/sft/monkey_patch_torch_module.py
similarity index 100%
rename from kt-sft/ktransformers/sft/monkey_patch_torch_module.py
rename to archive/kt-sft/ktransformers/sft/monkey_patch_torch_module.py
diff --git a/kt-sft/ktransformers/sft/peft_utils/__init__.py b/archive/kt-sft/ktransformers/sft/peft_utils/__init__.py
similarity index 100%
rename from kt-sft/ktransformers/sft/peft_utils/__init__.py
rename to archive/kt-sft/ktransformers/sft/peft_utils/__init__.py
diff --git a/kt-sft/ktransformers/sft/peft_utils/lora_layer.py b/archive/kt-sft/ktransformers/sft/peft_utils/lora_layer.py
similarity index 100%
rename from kt-sft/ktransformers/sft/peft_utils/lora_layer.py
rename to archive/kt-sft/ktransformers/sft/peft_utils/lora_layer.py
diff --git a/kt-sft/ktransformers/sft/peft_utils/lora_model.py b/archive/kt-sft/ktransformers/sft/peft_utils/lora_model.py
similarity index 100%
rename from kt-sft/ktransformers/sft/peft_utils/lora_model.py
rename to archive/kt-sft/ktransformers/sft/peft_utils/lora_model.py
diff --git a/kt-sft/ktransformers/sft/peft_utils/mapping.py b/archive/kt-sft/ktransformers/sft/peft_utils/mapping.py
similarity index 100%
rename from kt-sft/ktransformers/sft/peft_utils/mapping.py
rename to archive/kt-sft/ktransformers/sft/peft_utils/mapping.py
diff --git a/kt-sft/ktransformers/sft/peft_utils/peft_model.py b/archive/kt-sft/ktransformers/sft/peft_utils/peft_model.py
similarity index 100%
rename from kt-sft/ktransformers/sft/peft_utils/peft_model.py
rename to archive/kt-sft/ktransformers/sft/peft_utils/peft_model.py
diff --git a/kt-sft/ktransformers/sft/torchviz_test.py b/archive/kt-sft/ktransformers/sft/torchviz_test.py
similarity index 100%
rename from kt-sft/ktransformers/sft/torchviz_test.py
rename to archive/kt-sft/ktransformers/sft/torchviz_test.py
diff --git a/kt-sft/ktransformers/tests/.gitignore b/archive/kt-sft/ktransformers/tests/.gitignore
similarity index 100%
rename from kt-sft/ktransformers/tests/.gitignore
rename to archive/kt-sft/ktransformers/tests/.gitignore
diff --git a/kt-sft/ktransformers/tests/AIME_2024/eval_api.py b/archive/kt-sft/ktransformers/tests/AIME_2024/eval_api.py
similarity index 100%
rename from kt-sft/ktransformers/tests/AIME_2024/eval_api.py
rename to archive/kt-sft/ktransformers/tests/AIME_2024/eval_api.py
diff --git a/kt-sft/ktransformers/tests/AIME_2024/evaluation.py b/archive/kt-sft/ktransformers/tests/AIME_2024/evaluation.py
similarity index 100%
rename from kt-sft/ktransformers/tests/AIME_2024/evaluation.py
rename to archive/kt-sft/ktransformers/tests/AIME_2024/evaluation.py
diff --git a/kt-sft/ktransformers/tests/AIME_2024/prompts.py b/archive/kt-sft/ktransformers/tests/AIME_2024/prompts.py
similarity index 100%
rename from kt-sft/ktransformers/tests/AIME_2024/prompts.py
rename to archive/kt-sft/ktransformers/tests/AIME_2024/prompts.py
diff --git a/kt-sft/ktransformers/tests/dequant_gpu.py b/archive/kt-sft/ktransformers/tests/dequant_gpu.py
similarity index 100%
rename from kt-sft/ktransformers/tests/dequant_gpu.py
rename to archive/kt-sft/ktransformers/tests/dequant_gpu.py
diff --git a/kt-sft/ktransformers/tests/dequant_gpu_t.py b/archive/kt-sft/ktransformers/tests/dequant_gpu_t.py
similarity index 100%
rename from kt-sft/ktransformers/tests/dequant_gpu_t.py
rename to archive/kt-sft/ktransformers/tests/dequant_gpu_t.py
diff --git a/kt-sft/ktransformers/tests/function_call_test.py b/archive/kt-sft/ktransformers/tests/function_call_test.py
similarity index 100%
rename from kt-sft/ktransformers/tests/function_call_test.py
rename to archive/kt-sft/ktransformers/tests/function_call_test.py
diff --git a/kt-sft/ktransformers/tests/humaneval/eval_api.py b/archive/kt-sft/ktransformers/tests/humaneval/eval_api.py
similarity index 100%
rename from kt-sft/ktransformers/tests/humaneval/eval_api.py
rename to archive/kt-sft/ktransformers/tests/humaneval/eval_api.py
diff --git a/kt-sft/ktransformers/tests/humaneval/evaluation.py b/archive/kt-sft/ktransformers/tests/humaneval/evaluation.py
similarity index 100%
rename from kt-sft/ktransformers/tests/humaneval/evaluation.py
rename to archive/kt-sft/ktransformers/tests/humaneval/evaluation.py
diff --git a/kt-sft/ktransformers/tests/humaneval/prompts.py b/archive/kt-sft/ktransformers/tests/humaneval/prompts.py
similarity index 100%
rename from kt-sft/ktransformers/tests/humaneval/prompts.py
rename to archive/kt-sft/ktransformers/tests/humaneval/prompts.py
diff --git a/kt-sft/ktransformers/tests/mmlu_pro_test.py b/archive/kt-sft/ktransformers/tests/mmlu_pro_test.py
similarity index 100%
rename from kt-sft/ktransformers/tests/mmlu_pro_test.py
rename to archive/kt-sft/ktransformers/tests/mmlu_pro_test.py
diff --git a/kt-sft/ktransformers/tests/mmlu_test.py b/archive/kt-sft/ktransformers/tests/mmlu_test.py
similarity index 100%
rename from kt-sft/ktransformers/tests/mmlu_test.py
rename to archive/kt-sft/ktransformers/tests/mmlu_test.py
diff --git a/kt-sft/ktransformers/tests/mmlu_test_multi.py b/archive/kt-sft/ktransformers/tests/mmlu_test_multi.py
similarity index 100%
rename from kt-sft/ktransformers/tests/mmlu_test_multi.py
rename to archive/kt-sft/ktransformers/tests/mmlu_test_multi.py
diff --git a/kt-sft/ktransformers/tests/score.py b/archive/kt-sft/ktransformers/tests/score.py
similarity index 100%
rename from kt-sft/ktransformers/tests/score.py
rename to archive/kt-sft/ktransformers/tests/score.py
diff --git a/kt-sft/ktransformers/tests/test_client.py b/archive/kt-sft/ktransformers/tests/test_client.py
similarity index 100%
rename from kt-sft/ktransformers/tests/test_client.py
rename to archive/kt-sft/ktransformers/tests/test_client.py
diff --git a/kt-sft/ktransformers/tests/test_pytorch_q8.py b/archive/kt-sft/ktransformers/tests/test_pytorch_q8.py
similarity index 100%
rename from kt-sft/ktransformers/tests/test_pytorch_q8.py
rename to archive/kt-sft/ktransformers/tests/test_pytorch_q8.py
diff --git a/kt-sft/ktransformers/tests/test_speed.py b/archive/kt-sft/ktransformers/tests/test_speed.py
similarity index 100%
rename from kt-sft/ktransformers/tests/test_speed.py
rename to archive/kt-sft/ktransformers/tests/test_speed.py
diff --git a/kt-sft/ktransformers/tests/triton_fp8gemm_test.py b/archive/kt-sft/ktransformers/tests/triton_fp8gemm_test.py
similarity index 100%
rename from kt-sft/ktransformers/tests/triton_fp8gemm_test.py
rename to archive/kt-sft/ktransformers/tests/triton_fp8gemm_test.py
diff --git a/kt-sft/ktransformers/util/cuda_graph_runner.py b/archive/kt-sft/ktransformers/util/cuda_graph_runner.py
similarity index 100%
rename from kt-sft/ktransformers/util/cuda_graph_runner.py
rename to archive/kt-sft/ktransformers/util/cuda_graph_runner.py
diff --git a/kt-sft/ktransformers/util/custom_gguf.py b/archive/kt-sft/ktransformers/util/custom_gguf.py
similarity index 100%
rename from kt-sft/ktransformers/util/custom_gguf.py
rename to archive/kt-sft/ktransformers/util/custom_gguf.py
diff --git a/kt-sft/ktransformers/util/custom_loader.py b/archive/kt-sft/ktransformers/util/custom_loader.py
similarity index 100%
rename from kt-sft/ktransformers/util/custom_loader.py
rename to archive/kt-sft/ktransformers/util/custom_loader.py
diff --git a/kt-sft/ktransformers/util/globals.py b/archive/kt-sft/ktransformers/util/globals.py
similarity index 100%
rename from kt-sft/ktransformers/util/globals.py
rename to archive/kt-sft/ktransformers/util/globals.py
diff --git a/kt-sft/ktransformers/util/grad_wrapper.py b/archive/kt-sft/ktransformers/util/grad_wrapper.py
similarity index 100%
rename from kt-sft/ktransformers/util/grad_wrapper.py
rename to archive/kt-sft/ktransformers/util/grad_wrapper.py
diff --git a/kt-sft/ktransformers/util/inference_state.py b/archive/kt-sft/ktransformers/util/inference_state.py
similarity index 100%
rename from kt-sft/ktransformers/util/inference_state.py
rename to archive/kt-sft/ktransformers/util/inference_state.py
diff --git a/kt-sft/ktransformers/util/modeling_rope_utils.py b/archive/kt-sft/ktransformers/util/modeling_rope_utils.py
similarity index 100%
rename from kt-sft/ktransformers/util/modeling_rope_utils.py
rename to archive/kt-sft/ktransformers/util/modeling_rope_utils.py
diff --git a/kt-sft/ktransformers/util/textstream.py b/archive/kt-sft/ktransformers/util/textstream.py
similarity index 100%
rename from kt-sft/ktransformers/util/textstream.py
rename to archive/kt-sft/ktransformers/util/textstream.py
diff --git a/kt-sft/ktransformers/util/utils.py b/archive/kt-sft/ktransformers/util/utils.py
similarity index 100%
rename from kt-sft/ktransformers/util/utils.py
rename to archive/kt-sft/ktransformers/util/utils.py
diff --git a/kt-sft/ktransformers/util/vendors.py b/archive/kt-sft/ktransformers/util/vendors.py
similarity index 100%
rename from kt-sft/ktransformers/util/vendors.py
rename to archive/kt-sft/ktransformers/util/vendors.py
diff --git a/kt-sft/ktransformers/util/weight_loader.py b/archive/kt-sft/ktransformers/util/weight_loader.py
similarity index 100%
rename from kt-sft/ktransformers/util/weight_loader.py
rename to archive/kt-sft/ktransformers/util/weight_loader.py
diff --git a/kt-sft/ktransformers/website/.browserslistrc b/archive/kt-sft/ktransformers/website/.browserslistrc
similarity index 100%
rename from kt-sft/ktransformers/website/.browserslistrc
rename to archive/kt-sft/ktransformers/website/.browserslistrc
diff --git a/kt-sft/ktransformers/website/.eslintrc.js b/archive/kt-sft/ktransformers/website/.eslintrc.js
similarity index 100%
rename from kt-sft/ktransformers/website/.eslintrc.js
rename to archive/kt-sft/ktransformers/website/.eslintrc.js
diff --git a/kt-sft/ktransformers/website/.gitignore b/archive/kt-sft/ktransformers/website/.gitignore
similarity index 100%
rename from kt-sft/ktransformers/website/.gitignore
rename to archive/kt-sft/ktransformers/website/.gitignore
diff --git a/kt-sft/ktransformers/website/README.md b/archive/kt-sft/ktransformers/website/README.md
similarity index 100%
rename from kt-sft/ktransformers/website/README.md
rename to archive/kt-sft/ktransformers/website/README.md
diff --git a/kt-sft/ktransformers/website/config.d.ts b/archive/kt-sft/ktransformers/website/config.d.ts
similarity index 100%
rename from kt-sft/ktransformers/website/config.d.ts
rename to archive/kt-sft/ktransformers/website/config.d.ts
diff --git a/kt-sft/ktransformers/website/jest.config.js b/archive/kt-sft/ktransformers/website/jest.config.js
similarity index 100%
rename from kt-sft/ktransformers/website/jest.config.js
rename to archive/kt-sft/ktransformers/website/jest.config.js
diff --git a/kt-sft/ktransformers/website/package-lock.json b/archive/kt-sft/ktransformers/website/package-lock.json
similarity index 100%
rename from kt-sft/ktransformers/website/package-lock.json
rename to archive/kt-sft/ktransformers/website/package-lock.json
diff --git a/kt-sft/ktransformers/website/package.json b/archive/kt-sft/ktransformers/website/package.json
similarity index 100%
rename from kt-sft/ktransformers/website/package.json
rename to archive/kt-sft/ktransformers/website/package.json
diff --git a/kt-sft/ktransformers/website/public/balck.ico b/archive/kt-sft/ktransformers/website/public/balck.ico
similarity index 100%
rename from kt-sft/ktransformers/website/public/balck.ico
rename to archive/kt-sft/ktransformers/website/public/balck.ico
diff --git a/kt-sft/ktransformers/website/public/config.js b/archive/kt-sft/ktransformers/website/public/config.js
similarity index 100%
rename from kt-sft/ktransformers/website/public/config.js
rename to archive/kt-sft/ktransformers/website/public/config.js
diff --git a/kt-sft/ktransformers/website/public/css/reset.css b/archive/kt-sft/ktransformers/website/public/css/reset.css
similarity index 100%
rename from kt-sft/ktransformers/website/public/css/reset.css
rename to archive/kt-sft/ktransformers/website/public/css/reset.css
diff --git a/kt-sft/ktransformers/website/public/images/assistant-avatar.png b/archive/kt-sft/ktransformers/website/public/images/assistant-avatar.png
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/assistant-avatar.png
rename to archive/kt-sft/ktransformers/website/public/images/assistant-avatar.png
diff --git a/kt-sft/ktransformers/website/public/images/avatar.png b/archive/kt-sft/ktransformers/website/public/images/avatar.png
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/avatar.png
rename to archive/kt-sft/ktransformers/website/public/images/avatar.png
diff --git a/kt-sft/ktransformers/website/public/images/bgbg.png b/archive/kt-sft/ktransformers/website/public/images/bgbg.png
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/bgbg.png
rename to archive/kt-sft/ktransformers/website/public/images/bgbg.png
diff --git a/kt-sft/ktransformers/website/public/images/logo.ico b/archive/kt-sft/ktransformers/website/public/images/logo.ico
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/logo.ico
rename to archive/kt-sft/ktransformers/website/public/images/logo.ico
diff --git a/kt-sft/ktransformers/website/public/images/logo.png b/archive/kt-sft/ktransformers/website/public/images/logo.png
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/logo.png
rename to archive/kt-sft/ktransformers/website/public/images/logo.png
diff --git a/kt-sft/ktransformers/website/public/images/three.png b/archive/kt-sft/ktransformers/website/public/images/three.png
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/three.png
rename to archive/kt-sft/ktransformers/website/public/images/three.png
diff --git a/kt-sft/ktransformers/website/public/images/user-filling.png b/archive/kt-sft/ktransformers/website/public/images/user-filling.png
similarity index 100%
rename from kt-sft/ktransformers/website/public/images/user-filling.png
rename to archive/kt-sft/ktransformers/website/public/images/user-filling.png
diff --git a/kt-sft/ktransformers/website/public/index.html b/archive/kt-sft/ktransformers/website/public/index.html
similarity index 100%
rename from kt-sft/ktransformers/website/public/index.html
rename to archive/kt-sft/ktransformers/website/public/index.html
diff --git a/kt-sft/ktransformers/website/src/App.vue b/archive/kt-sft/ktransformers/website/src/App.vue
similarity index 100%
rename from kt-sft/ktransformers/website/src/App.vue
rename to archive/kt-sft/ktransformers/website/src/App.vue
diff --git a/kt-sft/ktransformers/website/src/api/api-client.ts b/archive/kt-sft/ktransformers/website/src/api/api-client.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/api/api-client.ts
rename to archive/kt-sft/ktransformers/website/src/api/api-client.ts
diff --git a/kt-sft/ktransformers/website/src/api/assistant.ts b/archive/kt-sft/ktransformers/website/src/api/assistant.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/api/assistant.ts
rename to archive/kt-sft/ktransformers/website/src/api/assistant.ts
diff --git a/kt-sft/ktransformers/website/src/api/message.ts b/archive/kt-sft/ktransformers/website/src/api/message.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/api/message.ts
rename to archive/kt-sft/ktransformers/website/src/api/message.ts
diff --git a/kt-sft/ktransformers/website/src/api/run.ts b/archive/kt-sft/ktransformers/website/src/api/run.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/api/run.ts
rename to archive/kt-sft/ktransformers/website/src/api/run.ts
diff --git a/kt-sft/ktransformers/website/src/api/thread.ts b/archive/kt-sft/ktransformers/website/src/api/thread.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/api/thread.ts
rename to archive/kt-sft/ktransformers/website/src/api/thread.ts
diff --git a/kt-sft/ktransformers/website/src/assets/css/mixins.styl b/archive/kt-sft/ktransformers/website/src/assets/css/mixins.styl
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/css/mixins.styl
rename to archive/kt-sft/ktransformers/website/src/assets/css/mixins.styl
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/demo.css b/archive/kt-sft/ktransformers/website/src/assets/iconfont/demo.css
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/demo.css
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/demo.css
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html b/archive/kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff
diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2 b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2
similarity index 100%
rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2
rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2
diff --git a/kt-sft/ktransformers/website/src/components/chat/index.vue b/archive/kt-sft/ktransformers/website/src/components/chat/index.vue
similarity index 100%
rename from kt-sft/ktransformers/website/src/components/chat/index.vue
rename to archive/kt-sft/ktransformers/website/src/components/chat/index.vue
diff --git a/kt-sft/ktransformers/website/src/conf/config.ts b/archive/kt-sft/ktransformers/website/src/conf/config.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/conf/config.ts
rename to archive/kt-sft/ktransformers/website/src/conf/config.ts
diff --git a/kt-sft/ktransformers/website/src/locals/en.js b/archive/kt-sft/ktransformers/website/src/locals/en.js
similarity index 100%
rename from kt-sft/ktransformers/website/src/locals/en.js
rename to archive/kt-sft/ktransformers/website/src/locals/en.js
diff --git a/kt-sft/ktransformers/website/src/locals/index.js b/archive/kt-sft/ktransformers/website/src/locals/index.js
similarity index 100%
rename from kt-sft/ktransformers/website/src/locals/index.js
rename to archive/kt-sft/ktransformers/website/src/locals/index.js
diff --git a/kt-sft/ktransformers/website/src/locals/zh.js b/archive/kt-sft/ktransformers/website/src/locals/zh.js
similarity index 100%
rename from kt-sft/ktransformers/website/src/locals/zh.js
rename to archive/kt-sft/ktransformers/website/src/locals/zh.js
diff --git a/kt-sft/ktransformers/website/src/main.ts b/archive/kt-sft/ktransformers/website/src/main.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/main.ts
rename to archive/kt-sft/ktransformers/website/src/main.ts
diff --git a/kt-sft/ktransformers/website/src/router/index.ts b/archive/kt-sft/ktransformers/website/src/router/index.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/router/index.ts
rename to archive/kt-sft/ktransformers/website/src/router/index.ts
diff --git a/kt-sft/ktransformers/website/src/shims-vue.d.ts b/archive/kt-sft/ktransformers/website/src/shims-vue.d.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/shims-vue.d.ts
rename to archive/kt-sft/ktransformers/website/src/shims-vue.d.ts
diff --git a/kt-sft/ktransformers/website/src/store/index.ts b/archive/kt-sft/ktransformers/website/src/store/index.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/store/index.ts
rename to archive/kt-sft/ktransformers/website/src/store/index.ts
diff --git a/kt-sft/ktransformers/website/src/utils/copy.ts b/archive/kt-sft/ktransformers/website/src/utils/copy.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/utils/copy.ts
rename to archive/kt-sft/ktransformers/website/src/utils/copy.ts
diff --git a/kt-sft/ktransformers/website/src/utils/types.ts b/archive/kt-sft/ktransformers/website/src/utils/types.ts
similarity index 100%
rename from kt-sft/ktransformers/website/src/utils/types.ts
rename to archive/kt-sft/ktransformers/website/src/utils/types.ts
diff --git a/kt-sft/ktransformers/website/src/views/home.vue b/archive/kt-sft/ktransformers/website/src/views/home.vue
similarity index 100%
rename from kt-sft/ktransformers/website/src/views/home.vue
rename to archive/kt-sft/ktransformers/website/src/views/home.vue
diff --git a/kt-sft/ktransformers/website/tests/unit/example.spec.ts b/archive/kt-sft/ktransformers/website/tests/unit/example.spec.ts
similarity index 100%
rename from kt-sft/ktransformers/website/tests/unit/example.spec.ts
rename to archive/kt-sft/ktransformers/website/tests/unit/example.spec.ts
diff --git a/kt-sft/ktransformers/website/tsconfig.json b/archive/kt-sft/ktransformers/website/tsconfig.json
similarity index 100%
rename from kt-sft/ktransformers/website/tsconfig.json
rename to archive/kt-sft/ktransformers/website/tsconfig.json
diff --git a/kt-sft/ktransformers/website/vue.config.js b/archive/kt-sft/ktransformers/website/vue.config.js
similarity index 100%
rename from kt-sft/ktransformers/website/vue.config.js
rename to archive/kt-sft/ktransformers/website/vue.config.js
diff --git a/kt-sft/merge_tensors/merge_safetensor_gguf.py b/archive/kt-sft/merge_tensors/merge_safetensor_gguf.py
similarity index 100%
rename from kt-sft/merge_tensors/merge_safetensor_gguf.py
rename to archive/kt-sft/merge_tensors/merge_safetensor_gguf.py
diff --git a/kt-sft/pyproject.toml b/archive/kt-sft/pyproject.toml
similarity index 100%
rename from kt-sft/pyproject.toml
rename to archive/kt-sft/pyproject.toml
diff --git a/kt-sft/requirements-sft.txt b/archive/kt-sft/requirements-sft.txt
similarity index 100%
rename from kt-sft/requirements-sft.txt
rename to archive/kt-sft/requirements-sft.txt
diff --git a/kt-sft/setup.py b/archive/kt-sft/setup.py
similarity index 100%
rename from kt-sft/setup.py
rename to archive/kt-sft/setup.py
diff --git a/kt-sft/test_adapter/data_transfer.py b/archive/kt-sft/test_adapter/data_transfer.py
similarity index 100%
rename from kt-sft/test_adapter/data_transfer.py
rename to archive/kt-sft/test_adapter/data_transfer.py
diff --git a/kt-sft/test_adapter/infer_with_adapter.py b/archive/kt-sft/test_adapter/infer_with_adapter.py
similarity index 100%
rename from kt-sft/test_adapter/infer_with_adapter.py
rename to archive/kt-sft/test_adapter/infer_with_adapter.py
diff --git a/kt-sft/test_adapter/inspect_adapter.py b/archive/kt-sft/test_adapter/inspect_adapter.py
similarity index 100%
rename from kt-sft/test_adapter/inspect_adapter.py
rename to archive/kt-sft/test_adapter/inspect_adapter.py
diff --git a/kt-sft/test_adapter/pred2metrics.py b/archive/kt-sft/test_adapter/pred2metrics.py
similarity index 100%
rename from kt-sft/test_adapter/pred2metrics.py
rename to archive/kt-sft/test_adapter/pred2metrics.py
diff --git a/kt-sft/test_adapter/test_grad.py b/archive/kt-sft/test_adapter/test_grad.py
similarity index 100%
rename from kt-sft/test_adapter/test_grad.py
rename to archive/kt-sft/test_adapter/test_grad.py
diff --git a/kt-sft/test_adapter/time_test_lora_train.py b/archive/kt-sft/test_adapter/time_test_lora_train.py
similarity index 100%
rename from kt-sft/test_adapter/time_test_lora_train.py
rename to archive/kt-sft/test_adapter/time_test_lora_train.py
diff --git a/kt-sft/withoutKT_PEFT.py b/archive/kt-sft/withoutKT_PEFT.py
similarity index 100%
rename from kt-sft/withoutKT_PEFT.py
rename to archive/kt-sft/withoutKT_PEFT.py
diff --git a/archive/ktransformers/ktransformers b/archive/ktransformers/ktransformers
deleted file mode 120000
index 598751a4..00000000
--- a/archive/ktransformers/ktransformers
+++ /dev/null
@@ -1 +0,0 @@
-/home/djw/py311_717/ktransformers/ktransformers
\ No newline at end of file
diff --git a/doc/SUMMARY.md b/doc/SUMMARY.md
index 35ec1145..8d1ceb37 100644
--- a/doc/SUMMARY.md
+++ b/doc/SUMMARY.md
@@ -5,7 +5,7 @@
- [For kt-kernel](en/kt-kernel/kt-kernel_intro.md)
- [For kt-sft](en/SFT/KTransformers-Fine-Tuning_User-Guide.md)
-# Tutorial
+# Tutorial
- [kt-sft part](en/SFT/README.md)
- [Injection Tutorial](en/SFT/injection_tutorial.md)
- [kt-sft developer tech notes](en/SFT/KTransformers-Fine-Tuning_Developer-Technical-Notes.md)
@@ -19,6 +19,8 @@
- [Makefile Usage](en/makefile_usage.md) -->
- [kt-kernel part](en/kt-kernel/README.md)
- [kt-cli](en/kt-kernel/kt-cli.md)
+ - [AVX2 Backend Tutorial](en/kt-kernel/AVX2-Tutorial.md)
+ - [AVX2 Backend Tutorial (Chinese)](zh/AVX2-Tutorial_zh.md)
# FAQ
- [FAQ](en/FAQ.md)