diff --git a/.github/workflows/release-pypi.yml b/.github/workflows/release-pypi.yml index 50537faa..4c8965cf 100644 --- a/.github/workflows/release-pypi.yml +++ b/.github/workflows/release-pypi.yml @@ -107,6 +107,7 @@ jobs: working-directory: kt-kernel env: CPUINFER_BUILD_ALL_VARIANTS: '1' + CPUINFER_ENABLE_CPPTRACE: '0' CPUINFER_USE_CUDA: '1' CPUINFER_CUDA_ARCHS: '80;86;89;90' CPUINFER_CUDA_STATIC_RUNTIME: '1' diff --git a/.github/workflows/release-sglang-kt.yml b/.github/workflows/release-sglang-kt.yml index 0d745e3b..6f1121cf 100644 --- a/.github/workflows/release-sglang-kt.yml +++ b/.github/workflows/release-sglang-kt.yml @@ -24,7 +24,7 @@ permissions: jobs: build-sglang-kt: name: Build sglang-kt wheel - runs-on: [self-hosted, linux, x64] + runs-on: ubuntu-latest steps: - name: Checkout repository @@ -70,7 +70,7 @@ jobs: publish-pypi: name: Publish sglang-kt to PyPI needs: [build-sglang-kt] - runs-on: [self-hosted, linux, x64] + runs-on: ubuntu-latest if: github.repository == 'kvcache-ai/ktransformers' && github.ref == 'refs/heads/main' environment: prod permissions: diff --git a/README.md b/README.md index 9c057757..b29eb3d4 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@

A Flexible Framework for Experiencing Cutting-edge LLM Inference/Fine-tune Optimizations

- 🎯 Overview | 🚀 kt-kernel | 🎓 kt-sft | 🔥 Citation | 🚀 Roadmap(2025Q4) + 🎯 Overview | 🚀 kt-kernel | 🎓 kt-sft | 🔥 Citation | 🚀 Roadmap(2026Q2) ## 🎯 Overview @@ -16,7 +16,8 @@ KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](https://github.com/kvcache-ai/ktransformers/tree/main/kt-kernel/) and [kt-sft](https://github.com/kvcache-ai/ktransformers/tree/main/kt-sft). ## 🔥 Updates - +* **May 6, 2026**: KTransformers at [GOSIM Paris 2026](https://paris2026.gosim.org/zh/schedule/) — "Agentic AI on Edge" track. We'll present KT's inference performance on consumer hardware. +* **Mar 26, 2026**: Support AVX2-only CPU backend for KT-Kernel inference. ([Tutorial](./doc/en/kt-kernel/AVX2-Tutorial.md)) * **Feb 13, 2026**: MiniMax-M2.5 Day0 Support! ([Tutorial](./doc/en/MiniMax-M2.5.md)) * **Feb 12, 2026**: GLM-5 Day0 Support! ([Tutorial](./doc/en/kt-kernel/GLM-5-Tutorial.md)) * **Jan 27, 2026**: Kimi-K2.5 Day0 Support! ([Tutorial](./doc/en/Kimi-K2.5.md)) ([SFT Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.5.md)) @@ -25,7 +26,7 @@ KTransformers is a research project focused on efficient inference and fine-tuni * **Dec 22, 2025**: Support RL-DPO fine-tuning with LLaMA-Factory. ([Tutorial](./doc/en/SFT/DPO_tutorial.md)) * **Dec 5, 2025**: Support Native Kimi-K2-Thinking inference ([Tutorial](./doc/en/kt-kernel/Kimi-K2-Thinking-Native.md)) * **Nov 6, 2025**: Support Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking.md)) and fine-tune ([Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.md)) -* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/KTransformers-Fine-Tuning_User-Guide.md)) +* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)) * **Oct 27, 2025**: Support Ascend NPU. ([Tutorial](./doc/zh/DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md)) * **Oct 10, 2025**: Integrating into SGLang. ([Roadmap](https://github.com/sgl-project/sglang/issues/11425), [Blog](https://lmsys.org/blog/2025-10-22-KTransformers/)) * **Sept 11, 2025**: Support Qwen3-Next. ([Tutorial](./doc/en/Qwen3-Next.md)) @@ -86,7 +87,7 @@ pip install . --- -### 🎓 [kt-sft](./kt-sft/) - Fine-Tuning Framework +### 🎓 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md) - Fine-Tuning Framework KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning. @@ -108,12 +109,15 @@ KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning **Quick Start:** ```bash -cd kt-sft -# Install environment following kt-sft/README.md -USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml +cd /path/to/LLaMA-Factory +pip install -e . +pip install "ktransformers[sft]" +USE_KT=1 ACCELERATE_USE_KT=true \ + accelerate launch --config_file examples/ktransformers/accelerate/fsdp2_kt_bf16.yaml \ + -m llamafactory.cli train examples/ktransformers/train_lora/deepseek_v3_lora_sft_kt.yaml ``` -👉 **[Full Documentation →](./kt-sft/README.md)** +👉 **[Full Documentation →](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)** --- diff --git a/README_ZH.md b/README_ZH.md index e60183ae..e1f73c85 100644 --- a/README_ZH.md +++ b/README_ZH.md @@ -13,13 +13,13 @@ ## 🎯 概览 -KTransformers 是一个专注于通过 CPU-GPU 异构计算实现大语言模型高效推理和微调的研究项目。该项目已发展为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [kt-sft](./kt-sft/)。 +KTransformers 是一个专注于通过 CPU-GPU 异构计算实现大语言模型高效推理和微调的研究项目。该项目已发展为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)。 ## 🔥 更新 -* **2025 年 12 月 5 日**:支持原生 Kimi-K2-Thinking 推理([教程](./doc/en/Kimi-K2-Thinking-Native.md)) +* **2025 年 12 月 5 日**:支持原生 Kimi-K2-Thinking 推理([教程](./doc/en/kt-kernel/Kimi-K2-Thinking-Native.md)) * **2025 年 11 月 6 日**:支持 Kimi-K2-Thinking 推理([教程](./doc/en/Kimi-K2-Thinking.md))和微调([教程](./doc/en/SFT_Installation_Guide_KimiK2.md)) -* **2025 年 11 月 4 日**:KTransformers 微调 × LLaMA-Factory 集成([教程](./doc/en/KTransformers-Fine-Tuning_User-Guide.md)) +* **2025 年 11 月 4 日**:KTransformers 微调 × LLaMA-Factory 集成([教程](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)) * **2025 年 10 月 27 日**:支持昇腾 NPU([教程](./doc/zh/DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md)) * **2025 年 10 月 10 日**:集成到 SGLang([路线图](https://github.com/sgl-project/sglang/issues/11425),[博客](https://lmsys.org/blog/2025-10-22-KTransformers/)) * **2025 年 9 月 11 日**:支持 Qwen3-Next([教程](./doc/en/Qwen3-Next.md)) @@ -79,7 +79,7 @@ pip install . --- -### 🎓 [kt-sft](./kt-sft/) - 微调框架 +### 🎓 [kt-sft](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md) - 微调框架 KTransformers × LLaMA-Factory 集成,用于超大型 MoE 模型微调。 @@ -101,12 +101,15 @@ KTransformers × LLaMA-Factory 集成,用于超大型 MoE 模型微调。 **快速开始:** ```bash -cd kt-sft -# 按照 kt-sft/README.md 安装环境 -USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml +cd /path/to/LLaMA-Factory +pip install -e . +pip install "ktransformers[sft]" +USE_KT=1 ACCELERATE_USE_KT=true \ + accelerate launch --config_file examples/ktransformers/accelerate/fsdp2_kt_bf16.yaml \ + -m llamafactory.cli train examples/ktransformers/train_lora/deepseek_v3_lora_sft_kt.yaml ``` -👉 **[完整文档 →](./kt-sft/README.md)** +👉 **[完整文档 →](./doc/en/SFT/KTransformers-Fine-Tuning_User-Guide.md)** --- diff --git a/kt-sft/.flake8 b/archive/kt-sft/.flake8 similarity index 100% rename from kt-sft/.flake8 rename to archive/kt-sft/.flake8 diff --git a/kt-sft/.gitignore b/archive/kt-sft/.gitignore similarity index 100% rename from kt-sft/.gitignore rename to archive/kt-sft/.gitignore diff --git a/kt-sft/.gitmodules b/archive/kt-sft/.gitmodules similarity index 100% rename from kt-sft/.gitmodules rename to archive/kt-sft/.gitmodules diff --git a/kt-sft/.pylintrc b/archive/kt-sft/.pylintrc similarity index 100% rename from kt-sft/.pylintrc rename to archive/kt-sft/.pylintrc diff --git a/kt-sft/Dockerfile b/archive/kt-sft/Dockerfile similarity index 100% rename from kt-sft/Dockerfile rename to archive/kt-sft/Dockerfile diff --git a/kt-sft/Dockerfile.xpu b/archive/kt-sft/Dockerfile.xpu similarity index 100% rename from kt-sft/Dockerfile.xpu rename to archive/kt-sft/Dockerfile.xpu diff --git a/kt-sft/LICENSE b/archive/kt-sft/LICENSE similarity index 100% rename from kt-sft/LICENSE rename to archive/kt-sft/LICENSE diff --git a/kt-sft/MANIFEST.in b/archive/kt-sft/MANIFEST.in similarity index 100% rename from kt-sft/MANIFEST.in rename to archive/kt-sft/MANIFEST.in diff --git a/kt-sft/Makefile b/archive/kt-sft/Makefile similarity index 100% rename from kt-sft/Makefile rename to archive/kt-sft/Makefile diff --git a/kt-sft/README.md b/archive/kt-sft/README.md similarity index 100% rename from kt-sft/README.md rename to archive/kt-sft/README.md diff --git a/kt-sft/SECURITY.md b/archive/kt-sft/SECURITY.md similarity index 100% rename from kt-sft/SECURITY.md rename to archive/kt-sft/SECURITY.md diff --git a/kt-sft/WeChatGroup.png b/archive/kt-sft/WeChatGroup.png similarity index 100% rename from kt-sft/WeChatGroup.png rename to archive/kt-sft/WeChatGroup.png diff --git a/kt-sft/autosetup.sh b/archive/kt-sft/autosetup.sh similarity index 100% rename from kt-sft/autosetup.sh rename to archive/kt-sft/autosetup.sh diff --git a/kt-sft/book.toml b/archive/kt-sft/book.toml similarity index 100% rename from kt-sft/book.toml rename to archive/kt-sft/book.toml diff --git a/kt-sft/csrc/custom_marlin/__init__.py b/archive/kt-sft/csrc/custom_marlin/__init__.py similarity index 100% rename from kt-sft/csrc/custom_marlin/__init__.py rename to archive/kt-sft/csrc/custom_marlin/__init__.py diff --git a/kt-sft/csrc/custom_marlin/binding.cpp b/archive/kt-sft/csrc/custom_marlin/binding.cpp similarity index 100% rename from kt-sft/csrc/custom_marlin/binding.cpp rename to archive/kt-sft/csrc/custom_marlin/binding.cpp diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu similarity index 100% rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh similarity index 100% rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin.cuh diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh similarity index 100% rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_dtypes.cuh diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu similarity index 100% rename from kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu diff --git a/kt-sft/csrc/custom_marlin/gptq_marlin/ops.h b/archive/kt-sft/csrc/custom_marlin/gptq_marlin/ops.h similarity index 100% rename from kt-sft/csrc/custom_marlin/gptq_marlin/ops.h rename to archive/kt-sft/csrc/custom_marlin/gptq_marlin/ops.h diff --git a/kt-sft/csrc/custom_marlin/setup.py b/archive/kt-sft/csrc/custom_marlin/setup.py similarity index 100% rename from kt-sft/csrc/custom_marlin/setup.py rename to archive/kt-sft/csrc/custom_marlin/setup.py diff --git a/kt-sft/csrc/custom_marlin/test_cuda_graph.py b/archive/kt-sft/csrc/custom_marlin/test_cuda_graph.py similarity index 100% rename from kt-sft/csrc/custom_marlin/test_cuda_graph.py rename to archive/kt-sft/csrc/custom_marlin/test_cuda_graph.py diff --git a/kt-sft/csrc/custom_marlin/utils/__init__.py b/archive/kt-sft/csrc/custom_marlin/utils/__init__.py similarity index 100% rename from kt-sft/csrc/custom_marlin/utils/__init__.py rename to archive/kt-sft/csrc/custom_marlin/utils/__init__.py diff --git a/kt-sft/csrc/custom_marlin/utils/format24.py b/archive/kt-sft/csrc/custom_marlin/utils/format24.py similarity index 100% rename from kt-sft/csrc/custom_marlin/utils/format24.py rename to archive/kt-sft/csrc/custom_marlin/utils/format24.py diff --git a/kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py b/archive/kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py similarity index 100% rename from kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py rename to archive/kt-sft/csrc/custom_marlin/utils/marlin_24_perms.py diff --git a/kt-sft/csrc/custom_marlin/utils/marlin_perms.py b/archive/kt-sft/csrc/custom_marlin/utils/marlin_perms.py similarity index 100% rename from kt-sft/csrc/custom_marlin/utils/marlin_perms.py rename to archive/kt-sft/csrc/custom_marlin/utils/marlin_perms.py diff --git a/kt-sft/csrc/custom_marlin/utils/marlin_utils.py b/archive/kt-sft/csrc/custom_marlin/utils/marlin_utils.py similarity index 100% rename from kt-sft/csrc/custom_marlin/utils/marlin_utils.py rename to archive/kt-sft/csrc/custom_marlin/utils/marlin_utils.py diff --git a/kt-sft/csrc/custom_marlin/utils/quant_utils.py b/archive/kt-sft/csrc/custom_marlin/utils/quant_utils.py similarity index 100% rename from kt-sft/csrc/custom_marlin/utils/quant_utils.py rename to archive/kt-sft/csrc/custom_marlin/utils/quant_utils.py diff --git a/kt-sft/csrc/ktransformers_ext/CMakeLists.txt b/archive/kt-sft/csrc/ktransformers_ext/CMakeLists.txt similarity index 100% rename from kt-sft/csrc/ktransformers_ext/CMakeLists.txt rename to archive/kt-sft/csrc/ktransformers_ext/CMakeLists.txt diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_attention.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_attention.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_attention_torch.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_linear.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_linear_torch.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_moe.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py diff --git a/kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py b/archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py rename to archive/kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py diff --git a/kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake b/archive/kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake rename to archive/kt-sft/csrc/ktransformers_ext/cmake/FindSIMD.cmake diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.cpp diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/backend.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/cpuinfer.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.cpp diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/shared_mem_buffer.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.cpp diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/task_queue.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/README.md diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/hip.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/musa.h diff --git a/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h b/archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h rename to archive/kt-sft/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h diff --git a/kt-sft/csrc/ktransformers_ext/cuda/binding.cpp b/archive/kt-sft/csrc/ktransformers_ext/cuda/binding.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/binding.cpp rename to archive/kt-sft/csrc/ktransformers_ext/cuda/binding.cpp diff --git a/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu b/archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu rename to archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/dequant.cu diff --git a/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h b/archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h rename to archive/kt-sft/csrc/ktransformers_ext/cuda/custom_gguf/ops.h diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cu diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin.cuh diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/gptq_marlin_dtypes.cuh diff --git a/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h b/archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h rename to archive/kt-sft/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h diff --git a/kt-sft/csrc/ktransformers_ext/cuda/setup.py b/archive/kt-sft/csrc/ktransformers_ext/cuda/setup.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/setup.py rename to archive/kt-sft/csrc/ktransformers_ext/cuda/setup.py diff --git a/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py b/archive/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py rename to archive/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_attention.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_attention.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/examples/test_attention.py rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_attention.py diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_linear.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_linear.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/examples/test_linear.py rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_linear.py diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/examples/test_mlp.py rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_moe.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_moe.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/examples/test_moe.py rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_moe.py diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py diff --git a/kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py b/archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py similarity index 100% rename from kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py rename to archive/kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py diff --git a/kt-sft/csrc/ktransformers_ext/ext_bindings.cpp b/archive/kt-sft/csrc/ktransformers_ext/ext_bindings.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/ext_bindings.cpp rename to archive/kt-sft/csrc/ktransformers_ext/ext_bindings.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_sft_moe.hpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/debug_tools_sft_moe.hpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/amx.hpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/la/utils.hpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/moe.hpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp b/archive/kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/amx/sft_moe.hpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache.h diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_attn.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_load_dump.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_read_write.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/kvcache/kvcache_utils.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/conversion.h diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/linear.h diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/mlp.h diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/moe.h diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.cpp diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe.h diff --git a/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h b/archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h rename to archive/kt-sft/csrc/ktransformers_ext/operators/llamafile/sft_moe_forward_cache.h diff --git a/kt-sft/csrc/ktransformers_ext/vendors/cuda.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/cuda.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/vendors/cuda.h rename to archive/kt-sft/csrc/ktransformers_ext/vendors/cuda.h diff --git a/kt-sft/csrc/ktransformers_ext/vendors/hip.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/hip.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/vendors/hip.h rename to archive/kt-sft/csrc/ktransformers_ext/vendors/hip.h diff --git a/kt-sft/csrc/ktransformers_ext/vendors/musa.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/musa.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/vendors/musa.h rename to archive/kt-sft/csrc/ktransformers_ext/vendors/musa.h diff --git a/kt-sft/csrc/ktransformers_ext/vendors/vendor.h b/archive/kt-sft/csrc/ktransformers_ext/vendors/vendor.h similarity index 100% rename from kt-sft/csrc/ktransformers_ext/vendors/vendor.h rename to archive/kt-sft/csrc/ktransformers_ext/vendors/vendor.h diff --git a/kt-sft/install-with-cache.sh b/archive/kt-sft/install-with-cache.sh similarity index 100% rename from kt-sft/install-with-cache.sh rename to archive/kt-sft/install-with-cache.sh diff --git a/kt-sft/install.bat b/archive/kt-sft/install.bat similarity index 100% rename from kt-sft/install.bat rename to archive/kt-sft/install.bat diff --git a/kt-sft/install.sh b/archive/kt-sft/install.sh similarity index 100% rename from kt-sft/install.sh rename to archive/kt-sft/install.sh diff --git a/kt-sft/ktransformers/__init__.py b/archive/kt-sft/ktransformers/__init__.py similarity index 100% rename from kt-sft/ktransformers/__init__.py rename to archive/kt-sft/ktransformers/__init__.py diff --git a/kt-sft/ktransformers/configs/config.yaml b/archive/kt-sft/ktransformers/configs/config.yaml similarity index 100% rename from kt-sft/ktransformers/configs/config.yaml rename to archive/kt-sft/ktransformers/configs/config.yaml diff --git a/kt-sft/ktransformers/configs/log_config.ini b/archive/kt-sft/ktransformers/configs/log_config.ini similarity index 100% rename from kt-sft/ktransformers/configs/log_config.ini rename to archive/kt-sft/ktransformers/configs/log_config.ini diff --git a/kt-sft/ktransformers/configs/model_config/config.json b/archive/kt-sft/ktransformers/configs/model_config/config.json similarity index 100% rename from kt-sft/ktransformers/configs/model_config/config.json rename to archive/kt-sft/ktransformers/configs/model_config/config.json diff --git a/kt-sft/ktransformers/configs/model_config/configuration_deepseek.py b/archive/kt-sft/ktransformers/configs/model_config/configuration_deepseek.py similarity index 100% rename from kt-sft/ktransformers/configs/model_config/configuration_deepseek.py rename to archive/kt-sft/ktransformers/configs/model_config/configuration_deepseek.py diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/__init__.py diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/format_24.py diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_24_perms.py diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_perms.py diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/marlin_utils.py diff --git a/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py b/archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py rename to archive/kt-sft/ktransformers/ktransformers_ext/operators/custom_marlin/quantize/utils/quant_utils.py diff --git a/kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py b/archive/kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py similarity index 100% rename from kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py rename to archive/kt-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py diff --git a/kt-sft/ktransformers/local_chat.py b/archive/kt-sft/ktransformers/local_chat.py similarity index 100% rename from kt-sft/ktransformers/local_chat.py rename to archive/kt-sft/ktransformers/local_chat.py diff --git a/kt-sft/ktransformers/local_chat.sh b/archive/kt-sft/ktransformers/local_chat.sh similarity index 100% rename from kt-sft/ktransformers/local_chat.sh rename to archive/kt-sft/ktransformers/local_chat.sh diff --git a/kt-sft/ktransformers/lora_test_module.py b/archive/kt-sft/ktransformers/lora_test_module.py similarity index 100% rename from kt-sft/ktransformers/lora_test_module.py rename to archive/kt-sft/ktransformers/lora_test_module.py diff --git a/kt-sft/ktransformers/models/__init__.py b/archive/kt-sft/ktransformers/models/__init__.py similarity index 100% rename from kt-sft/ktransformers/models/__init__.py rename to archive/kt-sft/ktransformers/models/__init__.py diff --git a/kt-sft/ktransformers/models/configuration_deepseek.py b/archive/kt-sft/ktransformers/models/configuration_deepseek.py similarity index 100% rename from kt-sft/ktransformers/models/configuration_deepseek.py rename to archive/kt-sft/ktransformers/models/configuration_deepseek.py diff --git a/kt-sft/ktransformers/models/configuration_deepseek_v3.py b/archive/kt-sft/ktransformers/models/configuration_deepseek_v3.py similarity index 100% rename from kt-sft/ktransformers/models/configuration_deepseek_v3.py rename to archive/kt-sft/ktransformers/models/configuration_deepseek_v3.py diff --git a/kt-sft/ktransformers/models/configuration_llama.py b/archive/kt-sft/ktransformers/models/configuration_llama.py similarity index 100% rename from kt-sft/ktransformers/models/configuration_llama.py rename to archive/kt-sft/ktransformers/models/configuration_llama.py diff --git a/kt-sft/ktransformers/models/configuration_qwen2_moe.py b/archive/kt-sft/ktransformers/models/configuration_qwen2_moe.py similarity index 100% rename from kt-sft/ktransformers/models/configuration_qwen2_moe.py rename to archive/kt-sft/ktransformers/models/configuration_qwen2_moe.py diff --git a/kt-sft/ktransformers/models/configuration_qwen3_moe.py b/archive/kt-sft/ktransformers/models/configuration_qwen3_moe.py similarity index 100% rename from kt-sft/ktransformers/models/configuration_qwen3_moe.py rename to archive/kt-sft/ktransformers/models/configuration_qwen3_moe.py diff --git a/kt-sft/ktransformers/models/custom_cache.py b/archive/kt-sft/ktransformers/models/custom_cache.py similarity index 100% rename from kt-sft/ktransformers/models/custom_cache.py rename to archive/kt-sft/ktransformers/models/custom_cache.py diff --git a/kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py b/archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py similarity index 100% rename from kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py rename to archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v2.py diff --git a/kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py b/archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py similarity index 100% rename from kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py rename to archive/kt-sft/ktransformers/models/custom_modeling_deepseek_v3.py diff --git a/kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py b/archive/kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py similarity index 100% rename from kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py rename to archive/kt-sft/ktransformers/models/custom_modeling_qwen2_moe.py diff --git a/kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py b/archive/kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py similarity index 100% rename from kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py rename to archive/kt-sft/ktransformers/models/custom_modeling_qwen3_moe.py diff --git a/kt-sft/ktransformers/models/modeling_deepseek.py b/archive/kt-sft/ktransformers/models/modeling_deepseek.py similarity index 100% rename from kt-sft/ktransformers/models/modeling_deepseek.py rename to archive/kt-sft/ktransformers/models/modeling_deepseek.py diff --git a/kt-sft/ktransformers/models/modeling_deepseek_v3.py b/archive/kt-sft/ktransformers/models/modeling_deepseek_v3.py similarity index 100% rename from kt-sft/ktransformers/models/modeling_deepseek_v3.py rename to archive/kt-sft/ktransformers/models/modeling_deepseek_v3.py diff --git a/kt-sft/ktransformers/models/modeling_llama.py b/archive/kt-sft/ktransformers/models/modeling_llama.py similarity index 100% rename from kt-sft/ktransformers/models/modeling_llama.py rename to archive/kt-sft/ktransformers/models/modeling_llama.py diff --git a/kt-sft/ktransformers/models/modeling_mixtral.py b/archive/kt-sft/ktransformers/models/modeling_mixtral.py similarity index 100% rename from kt-sft/ktransformers/models/modeling_mixtral.py rename to archive/kt-sft/ktransformers/models/modeling_mixtral.py diff --git a/kt-sft/ktransformers/models/modeling_qwen2_moe.py b/archive/kt-sft/ktransformers/models/modeling_qwen2_moe.py similarity index 100% rename from kt-sft/ktransformers/models/modeling_qwen2_moe.py rename to archive/kt-sft/ktransformers/models/modeling_qwen2_moe.py diff --git a/kt-sft/ktransformers/models/modeling_qwen3_moe.py b/archive/kt-sft/ktransformers/models/modeling_qwen3_moe.py similarity index 100% rename from kt-sft/ktransformers/models/modeling_qwen3_moe.py rename to archive/kt-sft/ktransformers/models/modeling_qwen3_moe.py diff --git a/kt-sft/ktransformers/moe_test_module.py b/archive/kt-sft/ktransformers/moe_test_module.py similarity index 100% rename from kt-sft/ktransformers/moe_test_module.py rename to archive/kt-sft/ktransformers/moe_test_module.py diff --git a/kt-sft/ktransformers/moe_test_module_old.py b/archive/kt-sft/ktransformers/moe_test_module_old.py similarity index 100% rename from kt-sft/ktransformers/moe_test_module_old.py rename to archive/kt-sft/ktransformers/moe_test_module_old.py diff --git a/kt-sft/ktransformers/operators/RoPE.py b/archive/kt-sft/ktransformers/operators/RoPE.py similarity index 100% rename from kt-sft/ktransformers/operators/RoPE.py rename to archive/kt-sft/ktransformers/operators/RoPE.py diff --git a/kt-sft/ktransformers/operators/__init__.py b/archive/kt-sft/ktransformers/operators/__init__.py similarity index 100% rename from kt-sft/ktransformers/operators/__init__.py rename to archive/kt-sft/ktransformers/operators/__init__.py diff --git a/kt-sft/ktransformers/operators/attention.py b/archive/kt-sft/ktransformers/operators/attention.py similarity index 100% rename from kt-sft/ktransformers/operators/attention.py rename to archive/kt-sft/ktransformers/operators/attention.py diff --git a/kt-sft/ktransformers/operators/balance_serve_attention.py b/archive/kt-sft/ktransformers/operators/balance_serve_attention.py similarity index 100% rename from kt-sft/ktransformers/operators/balance_serve_attention.py rename to archive/kt-sft/ktransformers/operators/balance_serve_attention.py diff --git a/kt-sft/ktransformers/operators/base_operator.py b/archive/kt-sft/ktransformers/operators/base_operator.py similarity index 100% rename from kt-sft/ktransformers/operators/base_operator.py rename to archive/kt-sft/ktransformers/operators/base_operator.py diff --git a/kt-sft/ktransformers/operators/cpuinfer.py b/archive/kt-sft/ktransformers/operators/cpuinfer.py similarity index 100% rename from kt-sft/ktransformers/operators/cpuinfer.py rename to archive/kt-sft/ktransformers/operators/cpuinfer.py diff --git a/kt-sft/ktransformers/operators/dynamic_attention.py b/archive/kt-sft/ktransformers/operators/dynamic_attention.py similarity index 100% rename from kt-sft/ktransformers/operators/dynamic_attention.py rename to archive/kt-sft/ktransformers/operators/dynamic_attention.py diff --git a/kt-sft/ktransformers/operators/experts.py b/archive/kt-sft/ktransformers/operators/experts.py similarity index 93% rename from kt-sft/ktransformers/operators/experts.py rename to archive/kt-sft/ktransformers/operators/experts.py index 19bbd64f..0e80bf18 100644 --- a/kt-sft/ktransformers/operators/experts.py +++ b/archive/kt-sft/ktransformers/operators/experts.py @@ -418,6 +418,18 @@ class KSFTExpertsCPU(torch.autograd.Function): #stream_map:dict = {} # Manage cuda stream on different gpu #gguf_loader:GGUFLoader = None CPU_INFER = CPUInfer(Config().cpu_infer) + + # Pinned memory buffers for training (batch mode) + # These are used for efficient CPU-GPU data transfer + _pinned_input_buf: Tensor = None # [max_tokens, hidden_size] + _pinned_output_buf: Tensor = None # [max_tokens, hidden_size] + _pinned_expert_ids_buf: Tensor = None # [max_tokens, num_experts_per_tok] + _pinned_weights_buf: Tensor = None # [max_tokens, num_experts_per_tok] + _pinned_grad_out_buf: Tensor = None # [max_tokens, hidden_size] for backward + _pinned_grad_in_buf: Tensor = None # [max_tokens, hidden_size] for backward + _pinned_buf_size: int = 0 # current buffer capacity (max_tokens) + _hidden_size: int = 0 + _num_experts_per_tok: int = 0 def __init__( self, key: str, @@ -449,6 +461,57 @@ class KSFTExpertsCPU(torch.autograd.Function): self.tflops_fwd = [] self.tflops_bwd = [] + @classmethod + def _ensure_pinned_buffers(cls, num_tokens: int, hidden_size: int, num_experts_per_tok: int): + """ + Ensure pinned memory buffers are allocated with sufficient size. + Buffers are reused across calls and only reallocated if more space is needed. + """ + # Check if we need to allocate or expand buffers + if (cls._pinned_input_buf is None or + num_tokens > cls._pinned_buf_size or + hidden_size != cls._hidden_size or + num_experts_per_tok != cls._num_experts_per_tok): + + # Allocate with some extra capacity to reduce reallocations + new_size = max(num_tokens, cls._pinned_buf_size * 2) if cls._pinned_buf_size > 0 else num_tokens + new_size = max(new_size, 1024) # minimum 1024 tokens + + # Free old buffers + cls._pinned_input_buf = None + cls._pinned_output_buf = None + cls._pinned_expert_ids_buf = None + cls._pinned_weights_buf = None + cls._pinned_grad_out_buf = None + cls._pinned_grad_in_buf = None + + # Allocate new pinned buffers + cls._pinned_input_buf = torch.empty( + (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True + ) + cls._pinned_output_buf = torch.empty( + (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True + ) + cls._pinned_expert_ids_buf = torch.empty( + (new_size, num_experts_per_tok), dtype=torch.long, device="cpu", pin_memory=True + ) + cls._pinned_weights_buf = torch.empty( + (new_size, num_experts_per_tok), dtype=torch.float32, device="cpu", pin_memory=True + ) + cls._pinned_grad_out_buf = torch.empty( + (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True + ) + cls._pinned_grad_in_buf = torch.empty( + (new_size, hidden_size), dtype=torch.bfloat16, device="cpu", pin_memory=True + ) + + cls._pinned_buf_size = new_size + cls._hidden_size = hidden_size + cls._num_experts_per_tok = num_experts_per_tok + + print(f"[KSFTExpertsCPU] Allocated pinned memory buffers: " + f"size={new_size}, hidden={hidden_size}, k={num_experts_per_tok}") + def load(self, w: dict | nn.Parameter | tuple | None = None, device:str|None = None, warmup:bool = False): if device: assert device.lower() == "cpu", "KSFTExpertsCPU can only be loaded on CPU, Parameter \"device\" can be cpu or None." @@ -548,7 +611,16 @@ class KSFTExpertsCPU(torch.autograd.Function): KSFTExpertsCPU.expert_ids_cpu = torch.zeros((num_experts_per_tok), device="cpu", dtype=torch.long, pin_memory=True) KSFTExpertsCPU.weights_cpu = torch.zeros((num_experts_per_tok), device="cpu", dtype=torch.float32, pin_memory=True) KSFTExpertsCPU.output_cpu = torch.zeros((self.config.hidden_size), device="cpu", pin_memory=True, dtype=torch.bfloat16) - + + # Initialize pinned memory buffers for training (batch mode) + # Default size is 4096 tokens, will expand automatically if needed + default_max_tokens = 4096 + KSFTExpertsCPU._ensure_pinned_buffers( + default_max_tokens, + self.config.hidden_size, + num_experts_per_tok + ) + self.gate = None self.up = None self.down = None @@ -577,37 +649,68 @@ class KSFTExpertsCPU(torch.autograd.Function): if input_tensor.size(0)==1 and torch.cuda.is_current_stream_capturing(): # TODO: this branch is unreachable, but the shape of input_tensor([1,hidden_size]) and input_tensor_cpu([hidden_size]) is not compatible #print("capturing experts") + wall_t0 = time.time() KSFTExpertsCPU.input_tensor_cpu.copy_(input_tensor, non_blocking=True) KSFTExpertsCPU.expert_ids_cpu.copy_(expert_ids, non_blocking=True) KSFTExpertsCPU.weights_cpu.copy_(weights, non_blocking=True) cpu_infer.submit_with_cuda_stream(torch.cuda.current_stream().cuda_stream, moe.forward(1, expert_ids.size(1), KSFTExpertsCPU.expert_ids_cpu.data_ptr(), KSFTExpertsCPU.weights_cpu.data_ptr(), KSFTExpertsCPU.input_tensor_cpu.data_ptr(), KSFTExpertsCPU.output_cpu.data_ptr())) cpu_infer.sync_with_cuda_stream(torch.cuda.current_stream().cuda_stream) - t_fwd = time.time() - wall_t0 + t_fwd = time.time() - wall_t0 KSFTExpertsCPU.output_gpu_map[out_device].copy_(KSFTExpertsCPU.output_cpu, non_blocking=True) result = KSFTExpertsCPU.output_gpu_map[out_device] + # For backward compatibility, copy to CPU tensors + input_cpu = input_tensor.contiguous().cpu() + expert_ids_cpu = expert_ids.contiguous().cpu() + weights_cpu = weights.to(torch.float32).contiguous().cpu() else: - input_tensor = input_tensor.contiguous().cpu() - expert_ids = expert_ids.contiguous().cpu() - weights = weights.contiguous().to(torch.float32).cpu() - output = torch.empty_like(input_tensor).contiguous() - # print("success record") + num_tokens = input_tensor.size(0) + hidden_size = input_tensor.size(1) + num_experts_per_tok = expert_ids.size(1) + + # Ensure pinned buffers are large enough + KSFTExpertsCPU._ensure_pinned_buffers(num_tokens, hidden_size, num_experts_per_tok) + + # Use pinned memory buffers for efficient CPU-GPU transfer + input_buf = KSFTExpertsCPU._pinned_input_buf[:num_tokens] + output_buf = KSFTExpertsCPU._pinned_output_buf[:num_tokens] + expert_ids_buf = KSFTExpertsCPU._pinned_expert_ids_buf[:num_tokens] + weights_buf = KSFTExpertsCPU._pinned_weights_buf[:num_tokens] + + # Copy data to pinned memory (non_blocking for async transfer) + input_buf.copy_(input_tensor.to(torch.bfloat16), non_blocking=True) + expert_ids_buf.copy_(expert_ids, non_blocking=True) + weights_buf.copy_(weights.to(torch.float32), non_blocking=True) + + # Synchronize to ensure data is ready on CPU + if input_tensor.is_cuda: + torch.cuda.current_stream().synchronize() + + # Make contiguous views for CPU computation + input_cpu = input_buf.contiguous() + expert_ids_cpu = expert_ids_buf.contiguous() + weights_cpu = weights_buf.contiguous() + output_cpu = output_buf.contiguous() + wall_t0 = time.time() cpu_infer.submit( moe.forward( - expert_ids.size(0), - expert_ids.size(1), - expert_ids.data_ptr(), - weights.data_ptr(), - input_tensor.data_ptr(), - output.data_ptr(), + expert_ids_cpu.size(0), + expert_ids_cpu.size(1), + expert_ids_cpu.data_ptr(), + weights_cpu.data_ptr(), + input_cpu.data_ptr(), + output_cpu.data_ptr(), ) ) cpu_infer.sync() - t_fwd = time.time() - wall_t0 + t_fwd = time.time() - wall_t0 - result = output.to(device=out_device) + # Copy result back to GPU using pinned memory (async) + result = torch.empty((num_tokens, hidden_size), dtype=input_tensor.dtype, device=out_device) + result.copy_(output_cpu, non_blocking=True) - ctx.save_for_backward(input_tensor, expert_ids, weights) + # Save CPU tensors for backward (already in pinned memory) + ctx.save_for_backward(input_cpu, expert_ids_cpu, weights_cpu) ctx.cpu_infer = cpu_infer ctx.moe = moe ctx.out_device = out_device @@ -632,50 +735,63 @@ class KSFTExpertsCPU(torch.autograd.Function): @staticmethod def backward(ctx, output_grad): # print("Go into the backward!!") - - # Pick back the middle results - input_tensor, expert_ids, weights = ctx.saved_tensors - import random - layer_idx = random.randint(0, 10000) - # print(f"layer_idx:{layer_idx}") - # layer_idx = ctx.layer_idx - - # cpu_infer = ctx.cpu_infer - # moe = ctx.moe - # out_device = ctx.out_device - # ready for computing gradient - output_grad = output_grad.contiguous().cpu() - input_grad = torch.empty_like(input_tensor).contiguous() - # print(dir(cpuinfer_ext.moe.MOE)) + # Pick back the middle results (already in pinned memory from forward) + input_tensor, expert_ids, weights = ctx.saved_tensors + + num_tokens = output_grad.size(0) + hidden_size = output_grad.size(1) + num_experts_per_tok = expert_ids.size(1) + + # Ensure pinned buffers are large enough (should already be from forward) + KSFTExpertsCPU._ensure_pinned_buffers(num_tokens, hidden_size, num_experts_per_tok) + + # Use pinned memory buffers for gradient transfer + grad_out_buf = KSFTExpertsCPU._pinned_grad_out_buf[:num_tokens] + grad_in_buf = KSFTExpertsCPU._pinned_grad_in_buf[:num_tokens] + + # Copy output_grad to pinned memory (async) + grad_out_buf.copy_(output_grad.to(torch.bfloat16), non_blocking=True) + + # Synchronize to ensure data is ready on CPU + if output_grad.is_cuda: + torch.cuda.current_stream().synchronize() + + # Make contiguous for CPU computation + output_grad_cpu = grad_out_buf.contiguous() + input_grad_cpu = grad_in_buf.contiguous() + bw_start = time.time() ctx.cpu_infer.submit( ctx.moe.backward( - # layer_idx, - output_grad.size(0), # qlen - expert_ids.size(1), # k + output_grad_cpu.size(0), # qlen + expert_ids.size(1), # k expert_ids.data_ptr(), weights.data_ptr(), - input_tensor.data_ptr(), - output_grad.data_ptr(), - input_grad.data_ptr(), + input_tensor.data_ptr(), + output_grad_cpu.data_ptr(), + input_grad_cpu.data_ptr(), ) ) ctx.cpu_infer.sync() - - bw_end = time.time() - t_bw = bw_end - bw_start - + + bw_end = time.time() + t_bw = bw_end - bw_start + + # Copy gradient back to GPU using pinned memory (async) + result_grad = torch.empty((num_tokens, hidden_size), dtype=output_grad.dtype, device=ctx.out_device) + result_grad.copy_(input_grad_cpu, non_blocking=True) + # ---------- FLOPs ---------- - qlen, k = ctx.saved_dims + qlen, k = ctx.saved_dims flops_bw = 10 * qlen * k * H_FIXED * M_FIXED tflops_b = flops_bw / t_bw / 1e12 # print(f"qlen:{qlen}, k:{k}") # with open("test_V3_ESC.txt", "a", encoding="utf-8") as f: # f.write(f"[KSFTExpertsCPU]Backward: {flops_bw/1e9:.3f} GFLOPs {tflops_b:.2f} TFLOPS {t_bw*1e3:.2f} ms\n") - - return input_grad.to(device=ctx.out_device), None, None, None, None, None, None + + return result_grad, None, None, None, None, None, None def unload(self): return diff --git a/kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py b/archive/kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py similarity index 100% rename from kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py rename to archive/kt-sft/ktransformers/operators/flashinfer_batch_prefill_wrapper.py diff --git a/kt-sft/ktransformers/operators/flashinfer_wrapper.py b/archive/kt-sft/ktransformers/operators/flashinfer_wrapper.py similarity index 100% rename from kt-sft/ktransformers/operators/flashinfer_wrapper.py rename to archive/kt-sft/ktransformers/operators/flashinfer_wrapper.py diff --git a/kt-sft/ktransformers/operators/gate.py b/archive/kt-sft/ktransformers/operators/gate.py similarity index 100% rename from kt-sft/ktransformers/operators/gate.py rename to archive/kt-sft/ktransformers/operators/gate.py diff --git a/kt-sft/ktransformers/operators/layernorm.py b/archive/kt-sft/ktransformers/operators/layernorm.py similarity index 100% rename from kt-sft/ktransformers/operators/layernorm.py rename to archive/kt-sft/ktransformers/operators/layernorm.py diff --git a/kt-sft/ktransformers/operators/linear.py b/archive/kt-sft/ktransformers/operators/linear.py similarity index 100% rename from kt-sft/ktransformers/operators/linear.py rename to archive/kt-sft/ktransformers/operators/linear.py diff --git a/kt-sft/ktransformers/operators/mlp.py b/archive/kt-sft/ktransformers/operators/mlp.py similarity index 100% rename from kt-sft/ktransformers/operators/mlp.py rename to archive/kt-sft/ktransformers/operators/mlp.py diff --git a/kt-sft/ktransformers/operators/models.py b/archive/kt-sft/ktransformers/operators/models.py similarity index 100% rename from kt-sft/ktransformers/operators/models.py rename to archive/kt-sft/ktransformers/operators/models.py diff --git a/kt-sft/ktransformers/operators/triton_attention.py b/archive/kt-sft/ktransformers/operators/triton_attention.py similarity index 100% rename from kt-sft/ktransformers/operators/triton_attention.py rename to archive/kt-sft/ktransformers/operators/triton_attention.py diff --git a/kt-sft/ktransformers/operators/triton_attention_prefill.py b/archive/kt-sft/ktransformers/operators/triton_attention_prefill.py similarity index 100% rename from kt-sft/ktransformers/operators/triton_attention_prefill.py rename to archive/kt-sft/ktransformers/operators/triton_attention_prefill.py diff --git a/kt-sft/ktransformers/optimize/optimize.py b/archive/kt-sft/ktransformers/optimize/optimize.py similarity index 100% rename from kt-sft/ktransformers/optimize/optimize.py rename to archive/kt-sft/ktransformers/optimize/optimize.py diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu-4.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-multi-gpu.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat-sft-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Chat.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-multi-gpu.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx-multi-gpu.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-sft.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat-use-adapter.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V2-Lite-Chat.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-8.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-fp8-linear-ggml-experts.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-marlin.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu-4.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx-multi-gpu.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-sft-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Internlm2_5-7b-Chat-1m.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Mixtral.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B-serve.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct-multi-gpu.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-57B-A14B-Instruct.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen2-serve.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/Qwen3Moe-sft-amx.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V2-Chat.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/DeepSeek-V3-Chat.yaml diff --git a/kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml b/archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml similarity index 100% rename from kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml rename to archive/kt-sft/ktransformers/optimize/optimize_rules/xpu/Qwen3Moe-Chat.yaml diff --git a/kt-sft/ktransformers/server/__init__.py b/archive/kt-sft/ktransformers/server/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/__init__.py rename to archive/kt-sft/ktransformers/server/__init__.py diff --git a/kt-sft/ktransformers/server/api/__init__.py b/archive/kt-sft/ktransformers/server/api/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/__init__.py rename to archive/kt-sft/ktransformers/server/api/__init__.py diff --git a/kt-sft/ktransformers/server/api/ollama/__init__.py b/archive/kt-sft/ktransformers/server/api/ollama/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/ollama/__init__.py rename to archive/kt-sft/ktransformers/server/api/ollama/__init__.py diff --git a/kt-sft/ktransformers/server/api/ollama/completions.py b/archive/kt-sft/ktransformers/server/api/ollama/completions.py similarity index 100% rename from kt-sft/ktransformers/server/api/ollama/completions.py rename to archive/kt-sft/ktransformers/server/api/ollama/completions.py diff --git a/kt-sft/ktransformers/server/api/openai/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/__init__.py rename to archive/kt-sft/ktransformers/server/api/openai/__init__.py diff --git a/kt-sft/ktransformers/server/api/openai/assistants/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/assistants/__init__.py rename to archive/kt-sft/ktransformers/server/api/openai/assistants/__init__.py diff --git a/kt-sft/ktransformers/server/api/openai/assistants/assistants.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/assistants.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/assistants/assistants.py rename to archive/kt-sft/ktransformers/server/api/openai/assistants/assistants.py diff --git a/kt-sft/ktransformers/server/api/openai/assistants/messages.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/messages.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/assistants/messages.py rename to archive/kt-sft/ktransformers/server/api/openai/assistants/messages.py diff --git a/kt-sft/ktransformers/server/api/openai/assistants/runs.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/runs.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/assistants/runs.py rename to archive/kt-sft/ktransformers/server/api/openai/assistants/runs.py diff --git a/kt-sft/ktransformers/server/api/openai/assistants/threads.py b/archive/kt-sft/ktransformers/server/api/openai/assistants/threads.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/assistants/threads.py rename to archive/kt-sft/ktransformers/server/api/openai/assistants/threads.py diff --git a/kt-sft/ktransformers/server/api/openai/endpoints/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/endpoints/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/endpoints/__init__.py rename to archive/kt-sft/ktransformers/server/api/openai/endpoints/__init__.py diff --git a/kt-sft/ktransformers/server/api/openai/endpoints/chat.py b/archive/kt-sft/ktransformers/server/api/openai/endpoints/chat.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/endpoints/chat.py rename to archive/kt-sft/ktransformers/server/api/openai/endpoints/chat.py diff --git a/kt-sft/ktransformers/server/api/openai/legacy/__init__.py b/archive/kt-sft/ktransformers/server/api/openai/legacy/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/legacy/__init__.py rename to archive/kt-sft/ktransformers/server/api/openai/legacy/__init__.py diff --git a/kt-sft/ktransformers/server/api/openai/legacy/completions.py b/archive/kt-sft/ktransformers/server/api/openai/legacy/completions.py similarity index 100% rename from kt-sft/ktransformers/server/api/openai/legacy/completions.py rename to archive/kt-sft/ktransformers/server/api/openai/legacy/completions.py diff --git a/kt-sft/ktransformers/server/api/web/__init__.py b/archive/kt-sft/ktransformers/server/api/web/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/api/web/__init__.py rename to archive/kt-sft/ktransformers/server/api/web/__init__.py diff --git a/kt-sft/ktransformers/server/api/web/system.py b/archive/kt-sft/ktransformers/server/api/web/system.py similarity index 100% rename from kt-sft/ktransformers/server/api/web/system.py rename to archive/kt-sft/ktransformers/server/api/web/system.py diff --git a/kt-sft/ktransformers/server/args.py b/archive/kt-sft/ktransformers/server/args.py similarity index 100% rename from kt-sft/ktransformers/server/args.py rename to archive/kt-sft/ktransformers/server/args.py diff --git a/kt-sft/ktransformers/server/backend/__init__.py b/archive/kt-sft/ktransformers/server/backend/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/backend/__init__.py rename to archive/kt-sft/ktransformers/server/backend/__init__.py diff --git a/kt-sft/ktransformers/server/backend/args.py b/archive/kt-sft/ktransformers/server/backend/args.py similarity index 100% rename from kt-sft/ktransformers/server/backend/args.py rename to archive/kt-sft/ktransformers/server/backend/args.py diff --git a/kt-sft/ktransformers/server/backend/base.py b/archive/kt-sft/ktransformers/server/backend/base.py similarity index 100% rename from kt-sft/ktransformers/server/backend/base.py rename to archive/kt-sft/ktransformers/server/backend/base.py diff --git a/kt-sft/ktransformers/server/backend/context_manager.py b/archive/kt-sft/ktransformers/server/backend/context_manager.py similarity index 100% rename from kt-sft/ktransformers/server/backend/context_manager.py rename to archive/kt-sft/ktransformers/server/backend/context_manager.py diff --git a/kt-sft/ktransformers/server/backend/interfaces/__init__.py b/archive/kt-sft/ktransformers/server/backend/interfaces/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/backend/interfaces/__init__.py rename to archive/kt-sft/ktransformers/server/backend/interfaces/__init__.py diff --git a/kt-sft/ktransformers/server/backend/interfaces/balance_serve.py b/archive/kt-sft/ktransformers/server/backend/interfaces/balance_serve.py similarity index 100% rename from kt-sft/ktransformers/server/backend/interfaces/balance_serve.py rename to archive/kt-sft/ktransformers/server/backend/interfaces/balance_serve.py diff --git a/kt-sft/ktransformers/server/backend/interfaces/exllamav2.py b/archive/kt-sft/ktransformers/server/backend/interfaces/exllamav2.py similarity index 100% rename from kt-sft/ktransformers/server/backend/interfaces/exllamav2.py rename to archive/kt-sft/ktransformers/server/backend/interfaces/exllamav2.py diff --git a/kt-sft/ktransformers/server/backend/interfaces/ktransformers.py b/archive/kt-sft/ktransformers/server/backend/interfaces/ktransformers.py similarity index 100% rename from kt-sft/ktransformers/server/backend/interfaces/ktransformers.py rename to archive/kt-sft/ktransformers/server/backend/interfaces/ktransformers.py diff --git a/kt-sft/ktransformers/server/backend/interfaces/transformers.py b/archive/kt-sft/ktransformers/server/backend/interfaces/transformers.py similarity index 100% rename from kt-sft/ktransformers/server/backend/interfaces/transformers.py rename to archive/kt-sft/ktransformers/server/backend/interfaces/transformers.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/__init__.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/__init__.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/__init__.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/config.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/config.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/config.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/config.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/__init__.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/communication_op.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/cuda_wrapper.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/custom_all_reduce_utils.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/parallel_state.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/pynccl_wrapper.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/distributed/utils.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/forward_batch.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/model_runner.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/model_runner.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/model_runner.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/model_runner.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/query_manager.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/query_manager.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/query_manager.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/query_manager.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/__init__.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/orchestrator.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/frequency_penalty.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/min_new_tokens.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/presence_penalty.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/penaltylib/penalizers/repetition_penalty.py diff --git a/kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py b/archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py rename to archive/kt-sft/ktransformers/server/balance_serve/inference/sampling/sampler.py diff --git a/kt-sft/ktransformers/server/balance_serve/sched_rpc.py b/archive/kt-sft/ktransformers/server/balance_serve/sched_rpc.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/sched_rpc.py rename to archive/kt-sft/ktransformers/server/balance_serve/sched_rpc.py diff --git a/kt-sft/ktransformers/server/balance_serve/settings.py b/archive/kt-sft/ktransformers/server/balance_serve/settings.py similarity index 100% rename from kt-sft/ktransformers/server/balance_serve/settings.py rename to archive/kt-sft/ktransformers/server/balance_serve/settings.py diff --git a/kt-sft/ktransformers/server/config/config.py b/archive/kt-sft/ktransformers/server/config/config.py similarity index 100% rename from kt-sft/ktransformers/server/config/config.py rename to archive/kt-sft/ktransformers/server/config/config.py diff --git a/kt-sft/ktransformers/server/config/log.py b/archive/kt-sft/ktransformers/server/config/log.py similarity index 100% rename from kt-sft/ktransformers/server/config/log.py rename to archive/kt-sft/ktransformers/server/config/log.py diff --git a/kt-sft/ktransformers/server/config/singleton.py b/archive/kt-sft/ktransformers/server/config/singleton.py similarity index 100% rename from kt-sft/ktransformers/server/config/singleton.py rename to archive/kt-sft/ktransformers/server/config/singleton.py diff --git a/kt-sft/ktransformers/server/crud/__init__.py b/archive/kt-sft/ktransformers/server/crud/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/crud/__init__.py rename to archive/kt-sft/ktransformers/server/crud/__init__.py diff --git a/kt-sft/ktransformers/server/crud/assistants/__init__.py b/archive/kt-sft/ktransformers/server/crud/assistants/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/crud/assistants/__init__.py rename to archive/kt-sft/ktransformers/server/crud/assistants/__init__.py diff --git a/kt-sft/ktransformers/server/crud/assistants/assistants.py b/archive/kt-sft/ktransformers/server/crud/assistants/assistants.py similarity index 100% rename from kt-sft/ktransformers/server/crud/assistants/assistants.py rename to archive/kt-sft/ktransformers/server/crud/assistants/assistants.py diff --git a/kt-sft/ktransformers/server/crud/assistants/messages.py b/archive/kt-sft/ktransformers/server/crud/assistants/messages.py similarity index 100% rename from kt-sft/ktransformers/server/crud/assistants/messages.py rename to archive/kt-sft/ktransformers/server/crud/assistants/messages.py diff --git a/kt-sft/ktransformers/server/crud/assistants/runs.py b/archive/kt-sft/ktransformers/server/crud/assistants/runs.py similarity index 100% rename from kt-sft/ktransformers/server/crud/assistants/runs.py rename to archive/kt-sft/ktransformers/server/crud/assistants/runs.py diff --git a/kt-sft/ktransformers/server/crud/assistants/threads.py b/archive/kt-sft/ktransformers/server/crud/assistants/threads.py similarity index 100% rename from kt-sft/ktransformers/server/crud/assistants/threads.py rename to archive/kt-sft/ktransformers/server/crud/assistants/threads.py diff --git a/kt-sft/ktransformers/server/exceptions.py b/archive/kt-sft/ktransformers/server/exceptions.py similarity index 100% rename from kt-sft/ktransformers/server/exceptions.py rename to archive/kt-sft/ktransformers/server/exceptions.py diff --git a/kt-sft/ktransformers/server/main.py b/archive/kt-sft/ktransformers/server/main.py similarity index 100% rename from kt-sft/ktransformers/server/main.py rename to archive/kt-sft/ktransformers/server/main.py diff --git a/kt-sft/ktransformers/server/models/__init__.py b/archive/kt-sft/ktransformers/server/models/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/models/__init__.py rename to archive/kt-sft/ktransformers/server/models/__init__.py diff --git a/kt-sft/ktransformers/server/models/assistants/__init__.py b/archive/kt-sft/ktransformers/server/models/assistants/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/models/assistants/__init__.py rename to archive/kt-sft/ktransformers/server/models/assistants/__init__.py diff --git a/kt-sft/ktransformers/server/models/assistants/assistants.py b/archive/kt-sft/ktransformers/server/models/assistants/assistants.py similarity index 100% rename from kt-sft/ktransformers/server/models/assistants/assistants.py rename to archive/kt-sft/ktransformers/server/models/assistants/assistants.py diff --git a/kt-sft/ktransformers/server/models/assistants/messages.py b/archive/kt-sft/ktransformers/server/models/assistants/messages.py similarity index 100% rename from kt-sft/ktransformers/server/models/assistants/messages.py rename to archive/kt-sft/ktransformers/server/models/assistants/messages.py diff --git a/kt-sft/ktransformers/server/models/assistants/run_steps.py b/archive/kt-sft/ktransformers/server/models/assistants/run_steps.py similarity index 100% rename from kt-sft/ktransformers/server/models/assistants/run_steps.py rename to archive/kt-sft/ktransformers/server/models/assistants/run_steps.py diff --git a/kt-sft/ktransformers/server/models/assistants/runs.py b/archive/kt-sft/ktransformers/server/models/assistants/runs.py similarity index 100% rename from kt-sft/ktransformers/server/models/assistants/runs.py rename to archive/kt-sft/ktransformers/server/models/assistants/runs.py diff --git a/kt-sft/ktransformers/server/models/assistants/threads.py b/archive/kt-sft/ktransformers/server/models/assistants/threads.py similarity index 100% rename from kt-sft/ktransformers/server/models/assistants/threads.py rename to archive/kt-sft/ktransformers/server/models/assistants/threads.py diff --git a/kt-sft/ktransformers/server/schemas/__init__.py b/archive/kt-sft/ktransformers/server/schemas/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/__init__.py rename to archive/kt-sft/ktransformers/server/schemas/__init__.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/__init__.py b/archive/kt-sft/ktransformers/server/schemas/assistants/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/__init__.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/__init__.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/assistants.py b/archive/kt-sft/ktransformers/server/schemas/assistants/assistants.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/assistants.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/assistants.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/messages.py b/archive/kt-sft/ktransformers/server/schemas/assistants/messages.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/messages.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/messages.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/runs.py b/archive/kt-sft/ktransformers/server/schemas/assistants/runs.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/runs.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/runs.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/streaming.py b/archive/kt-sft/ktransformers/server/schemas/assistants/streaming.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/streaming.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/streaming.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/threads.py b/archive/kt-sft/ktransformers/server/schemas/assistants/threads.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/threads.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/threads.py diff --git a/kt-sft/ktransformers/server/schemas/assistants/tool.py b/archive/kt-sft/ktransformers/server/schemas/assistants/tool.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/assistants/tool.py rename to archive/kt-sft/ktransformers/server/schemas/assistants/tool.py diff --git a/kt-sft/ktransformers/server/schemas/base.py b/archive/kt-sft/ktransformers/server/schemas/base.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/base.py rename to archive/kt-sft/ktransformers/server/schemas/base.py diff --git a/kt-sft/ktransformers/server/schemas/conversation.py b/archive/kt-sft/ktransformers/server/schemas/conversation.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/conversation.py rename to archive/kt-sft/ktransformers/server/schemas/conversation.py diff --git a/kt-sft/ktransformers/server/schemas/endpoints/chat.py b/archive/kt-sft/ktransformers/server/schemas/endpoints/chat.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/endpoints/chat.py rename to archive/kt-sft/ktransformers/server/schemas/endpoints/chat.py diff --git a/kt-sft/ktransformers/server/schemas/legacy/__init__.py b/archive/kt-sft/ktransformers/server/schemas/legacy/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/legacy/__init__.py rename to archive/kt-sft/ktransformers/server/schemas/legacy/__init__.py diff --git a/kt-sft/ktransformers/server/schemas/legacy/completions.py b/archive/kt-sft/ktransformers/server/schemas/legacy/completions.py similarity index 100% rename from kt-sft/ktransformers/server/schemas/legacy/completions.py rename to archive/kt-sft/ktransformers/server/schemas/legacy/completions.py diff --git a/kt-sft/ktransformers/server/utils/__init__.py b/archive/kt-sft/ktransformers/server/utils/__init__.py similarity index 100% rename from kt-sft/ktransformers/server/utils/__init__.py rename to archive/kt-sft/ktransformers/server/utils/__init__.py diff --git a/kt-sft/ktransformers/server/utils/create_interface.py b/archive/kt-sft/ktransformers/server/utils/create_interface.py similarity index 100% rename from kt-sft/ktransformers/server/utils/create_interface.py rename to archive/kt-sft/ktransformers/server/utils/create_interface.py diff --git a/kt-sft/ktransformers/server/utils/multi_timer.py b/archive/kt-sft/ktransformers/server/utils/multi_timer.py similarity index 100% rename from kt-sft/ktransformers/server/utils/multi_timer.py rename to archive/kt-sft/ktransformers/server/utils/multi_timer.py diff --git a/kt-sft/ktransformers/server/utils/sql_utils.py b/archive/kt-sft/ktransformers/server/utils/sql_utils.py similarity index 100% rename from kt-sft/ktransformers/server/utils/sql_utils.py rename to archive/kt-sft/ktransformers/server/utils/sql_utils.py diff --git a/kt-sft/ktransformers/sft/__init__.py b/archive/kt-sft/ktransformers/sft/__init__.py similarity index 100% rename from kt-sft/ktransformers/sft/__init__.py rename to archive/kt-sft/ktransformers/sft/__init__.py diff --git a/kt-sft/ktransformers/sft/flops_utils/__init__.py b/archive/kt-sft/ktransformers/sft/flops_utils/__init__.py similarity index 100% rename from kt-sft/ktransformers/sft/flops_utils/__init__.py rename to archive/kt-sft/ktransformers/sft/flops_utils/__init__.py diff --git a/kt-sft/ktransformers/sft/flops_utils/custom_profile.py b/archive/kt-sft/ktransformers/sft/flops_utils/custom_profile.py similarity index 100% rename from kt-sft/ktransformers/sft/flops_utils/custom_profile.py rename to archive/kt-sft/ktransformers/sft/flops_utils/custom_profile.py diff --git a/kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py b/archive/kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py similarity index 100% rename from kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py rename to archive/kt-sft/ktransformers/sft/flops_utils/lora_test_utils.py diff --git a/kt-sft/ktransformers/sft/lora.py b/archive/kt-sft/ktransformers/sft/lora.py similarity index 100% rename from kt-sft/ktransformers/sft/lora.py rename to archive/kt-sft/ktransformers/sft/lora.py diff --git a/kt-sft/ktransformers/sft/metrics.py b/archive/kt-sft/ktransformers/sft/metrics.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics.py rename to archive/kt-sft/ktransformers/sft/metrics.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/__init__.py b/archive/kt-sft/ktransformers/sft/metrics_utils/__init__.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/__init__.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/__init__.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/constants.py b/archive/kt-sft/ktransformers/sft/metrics_utils/constants.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/constants.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/constants.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/env.py b/archive/kt-sft/ktransformers/sft/metrics_utils/env.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/env.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/env.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/logging.py b/archive/kt-sft/ktransformers/sft/metrics_utils/logging.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/logging.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/logging.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/misc.py b/archive/kt-sft/ktransformers/sft/metrics_utils/misc.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/misc.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/misc.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/packages.py b/archive/kt-sft/ktransformers/sft/metrics_utils/packages.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/packages.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/packages.py diff --git a/kt-sft/ktransformers/sft/metrics_utils/ploting.py b/archive/kt-sft/ktransformers/sft/metrics_utils/ploting.py similarity index 100% rename from kt-sft/ktransformers/sft/metrics_utils/ploting.py rename to archive/kt-sft/ktransformers/sft/metrics_utils/ploting.py diff --git a/kt-sft/ktransformers/sft/monkey_patch_torch_module.py b/archive/kt-sft/ktransformers/sft/monkey_patch_torch_module.py similarity index 100% rename from kt-sft/ktransformers/sft/monkey_patch_torch_module.py rename to archive/kt-sft/ktransformers/sft/monkey_patch_torch_module.py diff --git a/kt-sft/ktransformers/sft/peft_utils/__init__.py b/archive/kt-sft/ktransformers/sft/peft_utils/__init__.py similarity index 100% rename from kt-sft/ktransformers/sft/peft_utils/__init__.py rename to archive/kt-sft/ktransformers/sft/peft_utils/__init__.py diff --git a/kt-sft/ktransformers/sft/peft_utils/lora_layer.py b/archive/kt-sft/ktransformers/sft/peft_utils/lora_layer.py similarity index 100% rename from kt-sft/ktransformers/sft/peft_utils/lora_layer.py rename to archive/kt-sft/ktransformers/sft/peft_utils/lora_layer.py diff --git a/kt-sft/ktransformers/sft/peft_utils/lora_model.py b/archive/kt-sft/ktransformers/sft/peft_utils/lora_model.py similarity index 100% rename from kt-sft/ktransformers/sft/peft_utils/lora_model.py rename to archive/kt-sft/ktransformers/sft/peft_utils/lora_model.py diff --git a/kt-sft/ktransformers/sft/peft_utils/mapping.py b/archive/kt-sft/ktransformers/sft/peft_utils/mapping.py similarity index 100% rename from kt-sft/ktransformers/sft/peft_utils/mapping.py rename to archive/kt-sft/ktransformers/sft/peft_utils/mapping.py diff --git a/kt-sft/ktransformers/sft/peft_utils/peft_model.py b/archive/kt-sft/ktransformers/sft/peft_utils/peft_model.py similarity index 100% rename from kt-sft/ktransformers/sft/peft_utils/peft_model.py rename to archive/kt-sft/ktransformers/sft/peft_utils/peft_model.py diff --git a/kt-sft/ktransformers/sft/torchviz_test.py b/archive/kt-sft/ktransformers/sft/torchviz_test.py similarity index 100% rename from kt-sft/ktransformers/sft/torchviz_test.py rename to archive/kt-sft/ktransformers/sft/torchviz_test.py diff --git a/kt-sft/ktransformers/tests/.gitignore b/archive/kt-sft/ktransformers/tests/.gitignore similarity index 100% rename from kt-sft/ktransformers/tests/.gitignore rename to archive/kt-sft/ktransformers/tests/.gitignore diff --git a/kt-sft/ktransformers/tests/AIME_2024/eval_api.py b/archive/kt-sft/ktransformers/tests/AIME_2024/eval_api.py similarity index 100% rename from kt-sft/ktransformers/tests/AIME_2024/eval_api.py rename to archive/kt-sft/ktransformers/tests/AIME_2024/eval_api.py diff --git a/kt-sft/ktransformers/tests/AIME_2024/evaluation.py b/archive/kt-sft/ktransformers/tests/AIME_2024/evaluation.py similarity index 100% rename from kt-sft/ktransformers/tests/AIME_2024/evaluation.py rename to archive/kt-sft/ktransformers/tests/AIME_2024/evaluation.py diff --git a/kt-sft/ktransformers/tests/AIME_2024/prompts.py b/archive/kt-sft/ktransformers/tests/AIME_2024/prompts.py similarity index 100% rename from kt-sft/ktransformers/tests/AIME_2024/prompts.py rename to archive/kt-sft/ktransformers/tests/AIME_2024/prompts.py diff --git a/kt-sft/ktransformers/tests/dequant_gpu.py b/archive/kt-sft/ktransformers/tests/dequant_gpu.py similarity index 100% rename from kt-sft/ktransformers/tests/dequant_gpu.py rename to archive/kt-sft/ktransformers/tests/dequant_gpu.py diff --git a/kt-sft/ktransformers/tests/dequant_gpu_t.py b/archive/kt-sft/ktransformers/tests/dequant_gpu_t.py similarity index 100% rename from kt-sft/ktransformers/tests/dequant_gpu_t.py rename to archive/kt-sft/ktransformers/tests/dequant_gpu_t.py diff --git a/kt-sft/ktransformers/tests/function_call_test.py b/archive/kt-sft/ktransformers/tests/function_call_test.py similarity index 100% rename from kt-sft/ktransformers/tests/function_call_test.py rename to archive/kt-sft/ktransformers/tests/function_call_test.py diff --git a/kt-sft/ktransformers/tests/humaneval/eval_api.py b/archive/kt-sft/ktransformers/tests/humaneval/eval_api.py similarity index 100% rename from kt-sft/ktransformers/tests/humaneval/eval_api.py rename to archive/kt-sft/ktransformers/tests/humaneval/eval_api.py diff --git a/kt-sft/ktransformers/tests/humaneval/evaluation.py b/archive/kt-sft/ktransformers/tests/humaneval/evaluation.py similarity index 100% rename from kt-sft/ktransformers/tests/humaneval/evaluation.py rename to archive/kt-sft/ktransformers/tests/humaneval/evaluation.py diff --git a/kt-sft/ktransformers/tests/humaneval/prompts.py b/archive/kt-sft/ktransformers/tests/humaneval/prompts.py similarity index 100% rename from kt-sft/ktransformers/tests/humaneval/prompts.py rename to archive/kt-sft/ktransformers/tests/humaneval/prompts.py diff --git a/kt-sft/ktransformers/tests/mmlu_pro_test.py b/archive/kt-sft/ktransformers/tests/mmlu_pro_test.py similarity index 100% rename from kt-sft/ktransformers/tests/mmlu_pro_test.py rename to archive/kt-sft/ktransformers/tests/mmlu_pro_test.py diff --git a/kt-sft/ktransformers/tests/mmlu_test.py b/archive/kt-sft/ktransformers/tests/mmlu_test.py similarity index 100% rename from kt-sft/ktransformers/tests/mmlu_test.py rename to archive/kt-sft/ktransformers/tests/mmlu_test.py diff --git a/kt-sft/ktransformers/tests/mmlu_test_multi.py b/archive/kt-sft/ktransformers/tests/mmlu_test_multi.py similarity index 100% rename from kt-sft/ktransformers/tests/mmlu_test_multi.py rename to archive/kt-sft/ktransformers/tests/mmlu_test_multi.py diff --git a/kt-sft/ktransformers/tests/score.py b/archive/kt-sft/ktransformers/tests/score.py similarity index 100% rename from kt-sft/ktransformers/tests/score.py rename to archive/kt-sft/ktransformers/tests/score.py diff --git a/kt-sft/ktransformers/tests/test_client.py b/archive/kt-sft/ktransformers/tests/test_client.py similarity index 100% rename from kt-sft/ktransformers/tests/test_client.py rename to archive/kt-sft/ktransformers/tests/test_client.py diff --git a/kt-sft/ktransformers/tests/test_pytorch_q8.py b/archive/kt-sft/ktransformers/tests/test_pytorch_q8.py similarity index 100% rename from kt-sft/ktransformers/tests/test_pytorch_q8.py rename to archive/kt-sft/ktransformers/tests/test_pytorch_q8.py diff --git a/kt-sft/ktransformers/tests/test_speed.py b/archive/kt-sft/ktransformers/tests/test_speed.py similarity index 100% rename from kt-sft/ktransformers/tests/test_speed.py rename to archive/kt-sft/ktransformers/tests/test_speed.py diff --git a/kt-sft/ktransformers/tests/triton_fp8gemm_test.py b/archive/kt-sft/ktransformers/tests/triton_fp8gemm_test.py similarity index 100% rename from kt-sft/ktransformers/tests/triton_fp8gemm_test.py rename to archive/kt-sft/ktransformers/tests/triton_fp8gemm_test.py diff --git a/kt-sft/ktransformers/util/cuda_graph_runner.py b/archive/kt-sft/ktransformers/util/cuda_graph_runner.py similarity index 100% rename from kt-sft/ktransformers/util/cuda_graph_runner.py rename to archive/kt-sft/ktransformers/util/cuda_graph_runner.py diff --git a/kt-sft/ktransformers/util/custom_gguf.py b/archive/kt-sft/ktransformers/util/custom_gguf.py similarity index 100% rename from kt-sft/ktransformers/util/custom_gguf.py rename to archive/kt-sft/ktransformers/util/custom_gguf.py diff --git a/kt-sft/ktransformers/util/custom_loader.py b/archive/kt-sft/ktransformers/util/custom_loader.py similarity index 100% rename from kt-sft/ktransformers/util/custom_loader.py rename to archive/kt-sft/ktransformers/util/custom_loader.py diff --git a/kt-sft/ktransformers/util/globals.py b/archive/kt-sft/ktransformers/util/globals.py similarity index 100% rename from kt-sft/ktransformers/util/globals.py rename to archive/kt-sft/ktransformers/util/globals.py diff --git a/kt-sft/ktransformers/util/grad_wrapper.py b/archive/kt-sft/ktransformers/util/grad_wrapper.py similarity index 100% rename from kt-sft/ktransformers/util/grad_wrapper.py rename to archive/kt-sft/ktransformers/util/grad_wrapper.py diff --git a/kt-sft/ktransformers/util/inference_state.py b/archive/kt-sft/ktransformers/util/inference_state.py similarity index 100% rename from kt-sft/ktransformers/util/inference_state.py rename to archive/kt-sft/ktransformers/util/inference_state.py diff --git a/kt-sft/ktransformers/util/modeling_rope_utils.py b/archive/kt-sft/ktransformers/util/modeling_rope_utils.py similarity index 100% rename from kt-sft/ktransformers/util/modeling_rope_utils.py rename to archive/kt-sft/ktransformers/util/modeling_rope_utils.py diff --git a/kt-sft/ktransformers/util/textstream.py b/archive/kt-sft/ktransformers/util/textstream.py similarity index 100% rename from kt-sft/ktransformers/util/textstream.py rename to archive/kt-sft/ktransformers/util/textstream.py diff --git a/kt-sft/ktransformers/util/utils.py b/archive/kt-sft/ktransformers/util/utils.py similarity index 100% rename from kt-sft/ktransformers/util/utils.py rename to archive/kt-sft/ktransformers/util/utils.py diff --git a/kt-sft/ktransformers/util/vendors.py b/archive/kt-sft/ktransformers/util/vendors.py similarity index 100% rename from kt-sft/ktransformers/util/vendors.py rename to archive/kt-sft/ktransformers/util/vendors.py diff --git a/kt-sft/ktransformers/util/weight_loader.py b/archive/kt-sft/ktransformers/util/weight_loader.py similarity index 100% rename from kt-sft/ktransformers/util/weight_loader.py rename to archive/kt-sft/ktransformers/util/weight_loader.py diff --git a/kt-sft/ktransformers/website/.browserslistrc b/archive/kt-sft/ktransformers/website/.browserslistrc similarity index 100% rename from kt-sft/ktransformers/website/.browserslistrc rename to archive/kt-sft/ktransformers/website/.browserslistrc diff --git a/kt-sft/ktransformers/website/.eslintrc.js b/archive/kt-sft/ktransformers/website/.eslintrc.js similarity index 100% rename from kt-sft/ktransformers/website/.eslintrc.js rename to archive/kt-sft/ktransformers/website/.eslintrc.js diff --git a/kt-sft/ktransformers/website/.gitignore b/archive/kt-sft/ktransformers/website/.gitignore similarity index 100% rename from kt-sft/ktransformers/website/.gitignore rename to archive/kt-sft/ktransformers/website/.gitignore diff --git a/kt-sft/ktransformers/website/README.md b/archive/kt-sft/ktransformers/website/README.md similarity index 100% rename from kt-sft/ktransformers/website/README.md rename to archive/kt-sft/ktransformers/website/README.md diff --git a/kt-sft/ktransformers/website/config.d.ts b/archive/kt-sft/ktransformers/website/config.d.ts similarity index 100% rename from kt-sft/ktransformers/website/config.d.ts rename to archive/kt-sft/ktransformers/website/config.d.ts diff --git a/kt-sft/ktransformers/website/jest.config.js b/archive/kt-sft/ktransformers/website/jest.config.js similarity index 100% rename from kt-sft/ktransformers/website/jest.config.js rename to archive/kt-sft/ktransformers/website/jest.config.js diff --git a/kt-sft/ktransformers/website/package-lock.json b/archive/kt-sft/ktransformers/website/package-lock.json similarity index 100% rename from kt-sft/ktransformers/website/package-lock.json rename to archive/kt-sft/ktransformers/website/package-lock.json diff --git a/kt-sft/ktransformers/website/package.json b/archive/kt-sft/ktransformers/website/package.json similarity index 100% rename from kt-sft/ktransformers/website/package.json rename to archive/kt-sft/ktransformers/website/package.json diff --git a/kt-sft/ktransformers/website/public/balck.ico b/archive/kt-sft/ktransformers/website/public/balck.ico similarity index 100% rename from kt-sft/ktransformers/website/public/balck.ico rename to archive/kt-sft/ktransformers/website/public/balck.ico diff --git a/kt-sft/ktransformers/website/public/config.js b/archive/kt-sft/ktransformers/website/public/config.js similarity index 100% rename from kt-sft/ktransformers/website/public/config.js rename to archive/kt-sft/ktransformers/website/public/config.js diff --git a/kt-sft/ktransformers/website/public/css/reset.css b/archive/kt-sft/ktransformers/website/public/css/reset.css similarity index 100% rename from kt-sft/ktransformers/website/public/css/reset.css rename to archive/kt-sft/ktransformers/website/public/css/reset.css diff --git a/kt-sft/ktransformers/website/public/images/assistant-avatar.png b/archive/kt-sft/ktransformers/website/public/images/assistant-avatar.png similarity index 100% rename from kt-sft/ktransformers/website/public/images/assistant-avatar.png rename to archive/kt-sft/ktransformers/website/public/images/assistant-avatar.png diff --git a/kt-sft/ktransformers/website/public/images/avatar.png b/archive/kt-sft/ktransformers/website/public/images/avatar.png similarity index 100% rename from kt-sft/ktransformers/website/public/images/avatar.png rename to archive/kt-sft/ktransformers/website/public/images/avatar.png diff --git a/kt-sft/ktransformers/website/public/images/bgbg.png b/archive/kt-sft/ktransformers/website/public/images/bgbg.png similarity index 100% rename from kt-sft/ktransformers/website/public/images/bgbg.png rename to archive/kt-sft/ktransformers/website/public/images/bgbg.png diff --git a/kt-sft/ktransformers/website/public/images/logo.ico b/archive/kt-sft/ktransformers/website/public/images/logo.ico similarity index 100% rename from kt-sft/ktransformers/website/public/images/logo.ico rename to archive/kt-sft/ktransformers/website/public/images/logo.ico diff --git a/kt-sft/ktransformers/website/public/images/logo.png b/archive/kt-sft/ktransformers/website/public/images/logo.png similarity index 100% rename from kt-sft/ktransformers/website/public/images/logo.png rename to archive/kt-sft/ktransformers/website/public/images/logo.png diff --git a/kt-sft/ktransformers/website/public/images/three.png b/archive/kt-sft/ktransformers/website/public/images/three.png similarity index 100% rename from kt-sft/ktransformers/website/public/images/three.png rename to archive/kt-sft/ktransformers/website/public/images/three.png diff --git a/kt-sft/ktransformers/website/public/images/user-filling.png b/archive/kt-sft/ktransformers/website/public/images/user-filling.png similarity index 100% rename from kt-sft/ktransformers/website/public/images/user-filling.png rename to archive/kt-sft/ktransformers/website/public/images/user-filling.png diff --git a/kt-sft/ktransformers/website/public/index.html b/archive/kt-sft/ktransformers/website/public/index.html similarity index 100% rename from kt-sft/ktransformers/website/public/index.html rename to archive/kt-sft/ktransformers/website/public/index.html diff --git a/kt-sft/ktransformers/website/src/App.vue b/archive/kt-sft/ktransformers/website/src/App.vue similarity index 100% rename from kt-sft/ktransformers/website/src/App.vue rename to archive/kt-sft/ktransformers/website/src/App.vue diff --git a/kt-sft/ktransformers/website/src/api/api-client.ts b/archive/kt-sft/ktransformers/website/src/api/api-client.ts similarity index 100% rename from kt-sft/ktransformers/website/src/api/api-client.ts rename to archive/kt-sft/ktransformers/website/src/api/api-client.ts diff --git a/kt-sft/ktransformers/website/src/api/assistant.ts b/archive/kt-sft/ktransformers/website/src/api/assistant.ts similarity index 100% rename from kt-sft/ktransformers/website/src/api/assistant.ts rename to archive/kt-sft/ktransformers/website/src/api/assistant.ts diff --git a/kt-sft/ktransformers/website/src/api/message.ts b/archive/kt-sft/ktransformers/website/src/api/message.ts similarity index 100% rename from kt-sft/ktransformers/website/src/api/message.ts rename to archive/kt-sft/ktransformers/website/src/api/message.ts diff --git a/kt-sft/ktransformers/website/src/api/run.ts b/archive/kt-sft/ktransformers/website/src/api/run.ts similarity index 100% rename from kt-sft/ktransformers/website/src/api/run.ts rename to archive/kt-sft/ktransformers/website/src/api/run.ts diff --git a/kt-sft/ktransformers/website/src/api/thread.ts b/archive/kt-sft/ktransformers/website/src/api/thread.ts similarity index 100% rename from kt-sft/ktransformers/website/src/api/thread.ts rename to archive/kt-sft/ktransformers/website/src/api/thread.ts diff --git a/kt-sft/ktransformers/website/src/assets/css/mixins.styl b/archive/kt-sft/ktransformers/website/src/assets/css/mixins.styl similarity index 100% rename from kt-sft/ktransformers/website/src/assets/css/mixins.styl rename to archive/kt-sft/ktransformers/website/src/assets/css/mixins.styl diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/demo.css b/archive/kt-sft/ktransformers/website/src/assets/iconfont/demo.css similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/demo.css rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/demo.css diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html b/archive/kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/demo_index.html diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.css diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.js diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.json diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.ttf diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff diff --git a/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2 b/archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2 similarity index 100% rename from kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2 rename to archive/kt-sft/ktransformers/website/src/assets/iconfont/iconfont.woff2 diff --git a/kt-sft/ktransformers/website/src/components/chat/index.vue b/archive/kt-sft/ktransformers/website/src/components/chat/index.vue similarity index 100% rename from kt-sft/ktransformers/website/src/components/chat/index.vue rename to archive/kt-sft/ktransformers/website/src/components/chat/index.vue diff --git a/kt-sft/ktransformers/website/src/conf/config.ts b/archive/kt-sft/ktransformers/website/src/conf/config.ts similarity index 100% rename from kt-sft/ktransformers/website/src/conf/config.ts rename to archive/kt-sft/ktransformers/website/src/conf/config.ts diff --git a/kt-sft/ktransformers/website/src/locals/en.js b/archive/kt-sft/ktransformers/website/src/locals/en.js similarity index 100% rename from kt-sft/ktransformers/website/src/locals/en.js rename to archive/kt-sft/ktransformers/website/src/locals/en.js diff --git a/kt-sft/ktransformers/website/src/locals/index.js b/archive/kt-sft/ktransformers/website/src/locals/index.js similarity index 100% rename from kt-sft/ktransformers/website/src/locals/index.js rename to archive/kt-sft/ktransformers/website/src/locals/index.js diff --git a/kt-sft/ktransformers/website/src/locals/zh.js b/archive/kt-sft/ktransformers/website/src/locals/zh.js similarity index 100% rename from kt-sft/ktransformers/website/src/locals/zh.js rename to archive/kt-sft/ktransformers/website/src/locals/zh.js diff --git a/kt-sft/ktransformers/website/src/main.ts b/archive/kt-sft/ktransformers/website/src/main.ts similarity index 100% rename from kt-sft/ktransformers/website/src/main.ts rename to archive/kt-sft/ktransformers/website/src/main.ts diff --git a/kt-sft/ktransformers/website/src/router/index.ts b/archive/kt-sft/ktransformers/website/src/router/index.ts similarity index 100% rename from kt-sft/ktransformers/website/src/router/index.ts rename to archive/kt-sft/ktransformers/website/src/router/index.ts diff --git a/kt-sft/ktransformers/website/src/shims-vue.d.ts b/archive/kt-sft/ktransformers/website/src/shims-vue.d.ts similarity index 100% rename from kt-sft/ktransformers/website/src/shims-vue.d.ts rename to archive/kt-sft/ktransformers/website/src/shims-vue.d.ts diff --git a/kt-sft/ktransformers/website/src/store/index.ts b/archive/kt-sft/ktransformers/website/src/store/index.ts similarity index 100% rename from kt-sft/ktransformers/website/src/store/index.ts rename to archive/kt-sft/ktransformers/website/src/store/index.ts diff --git a/kt-sft/ktransformers/website/src/utils/copy.ts b/archive/kt-sft/ktransformers/website/src/utils/copy.ts similarity index 100% rename from kt-sft/ktransformers/website/src/utils/copy.ts rename to archive/kt-sft/ktransformers/website/src/utils/copy.ts diff --git a/kt-sft/ktransformers/website/src/utils/types.ts b/archive/kt-sft/ktransformers/website/src/utils/types.ts similarity index 100% rename from kt-sft/ktransformers/website/src/utils/types.ts rename to archive/kt-sft/ktransformers/website/src/utils/types.ts diff --git a/kt-sft/ktransformers/website/src/views/home.vue b/archive/kt-sft/ktransformers/website/src/views/home.vue similarity index 100% rename from kt-sft/ktransformers/website/src/views/home.vue rename to archive/kt-sft/ktransformers/website/src/views/home.vue diff --git a/kt-sft/ktransformers/website/tests/unit/example.spec.ts b/archive/kt-sft/ktransformers/website/tests/unit/example.spec.ts similarity index 100% rename from kt-sft/ktransformers/website/tests/unit/example.spec.ts rename to archive/kt-sft/ktransformers/website/tests/unit/example.spec.ts diff --git a/kt-sft/ktransformers/website/tsconfig.json b/archive/kt-sft/ktransformers/website/tsconfig.json similarity index 100% rename from kt-sft/ktransformers/website/tsconfig.json rename to archive/kt-sft/ktransformers/website/tsconfig.json diff --git a/kt-sft/ktransformers/website/vue.config.js b/archive/kt-sft/ktransformers/website/vue.config.js similarity index 100% rename from kt-sft/ktransformers/website/vue.config.js rename to archive/kt-sft/ktransformers/website/vue.config.js diff --git a/kt-sft/merge_tensors/merge_safetensor_gguf.py b/archive/kt-sft/merge_tensors/merge_safetensor_gguf.py similarity index 100% rename from kt-sft/merge_tensors/merge_safetensor_gguf.py rename to archive/kt-sft/merge_tensors/merge_safetensor_gguf.py diff --git a/kt-sft/pyproject.toml b/archive/kt-sft/pyproject.toml similarity index 100% rename from kt-sft/pyproject.toml rename to archive/kt-sft/pyproject.toml diff --git a/kt-sft/requirements-sft.txt b/archive/kt-sft/requirements-sft.txt similarity index 100% rename from kt-sft/requirements-sft.txt rename to archive/kt-sft/requirements-sft.txt diff --git a/kt-sft/setup.py b/archive/kt-sft/setup.py similarity index 100% rename from kt-sft/setup.py rename to archive/kt-sft/setup.py diff --git a/kt-sft/test_adapter/data_transfer.py b/archive/kt-sft/test_adapter/data_transfer.py similarity index 100% rename from kt-sft/test_adapter/data_transfer.py rename to archive/kt-sft/test_adapter/data_transfer.py diff --git a/kt-sft/test_adapter/infer_with_adapter.py b/archive/kt-sft/test_adapter/infer_with_adapter.py similarity index 100% rename from kt-sft/test_adapter/infer_with_adapter.py rename to archive/kt-sft/test_adapter/infer_with_adapter.py diff --git a/kt-sft/test_adapter/inspect_adapter.py b/archive/kt-sft/test_adapter/inspect_adapter.py similarity index 100% rename from kt-sft/test_adapter/inspect_adapter.py rename to archive/kt-sft/test_adapter/inspect_adapter.py diff --git a/kt-sft/test_adapter/pred2metrics.py b/archive/kt-sft/test_adapter/pred2metrics.py similarity index 100% rename from kt-sft/test_adapter/pred2metrics.py rename to archive/kt-sft/test_adapter/pred2metrics.py diff --git a/kt-sft/test_adapter/test_grad.py b/archive/kt-sft/test_adapter/test_grad.py similarity index 100% rename from kt-sft/test_adapter/test_grad.py rename to archive/kt-sft/test_adapter/test_grad.py diff --git a/kt-sft/test_adapter/time_test_lora_train.py b/archive/kt-sft/test_adapter/time_test_lora_train.py similarity index 100% rename from kt-sft/test_adapter/time_test_lora_train.py rename to archive/kt-sft/test_adapter/time_test_lora_train.py diff --git a/kt-sft/withoutKT_PEFT.py b/archive/kt-sft/withoutKT_PEFT.py similarity index 100% rename from kt-sft/withoutKT_PEFT.py rename to archive/kt-sft/withoutKT_PEFT.py diff --git a/archive/ktransformers/ktransformers b/archive/ktransformers/ktransformers deleted file mode 120000 index 598751a4..00000000 --- a/archive/ktransformers/ktransformers +++ /dev/null @@ -1 +0,0 @@ -/home/djw/py311_717/ktransformers/ktransformers \ No newline at end of file diff --git a/doc/SUMMARY.md b/doc/SUMMARY.md index 35ec1145..8d1ceb37 100644 --- a/doc/SUMMARY.md +++ b/doc/SUMMARY.md @@ -5,7 +5,7 @@ - [For kt-kernel](en/kt-kernel/kt-kernel_intro.md) - [For kt-sft](en/SFT/KTransformers-Fine-Tuning_User-Guide.md) -# Tutorial +# Tutorial - [kt-sft part](en/SFT/README.md) - [Injection Tutorial](en/SFT/injection_tutorial.md) - [kt-sft developer tech notes](en/SFT/KTransformers-Fine-Tuning_Developer-Technical-Notes.md) @@ -19,6 +19,8 @@ - [Makefile Usage](en/makefile_usage.md) --> - [kt-kernel part](en/kt-kernel/README.md) - [kt-cli](en/kt-kernel/kt-cli.md) + - [AVX2 Backend Tutorial](en/kt-kernel/AVX2-Tutorial.md) + - [AVX2 后端教程(中文)](zh/AVX2-Tutorial_zh.md) # FAQ - [FAQ](en/FAQ.md)