kvcache-ai/ktransformers: doc/zh
Latest commit: mrhaoxx 7a9daf0cd4
[feat](kt-kernel): support avx2 only inference for bf16 fp8 and gptq int4 (#1892)
* feat: support avx2 bf16 fp8 inference

* feat: support avx2 gptq int4 inference

* fix: numeric issues in fp8 dequant

* Tutorial avx2 (#1900)

* fix: prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines

* docs: add AVX2 tutorial for running KTransformers on AVX2-only CPUs

* Tutorial avx2 (#1901)

* fix: prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines

* docs: add AVX2 tutorial for running KTransformers on AVX2-only CPUs

* docs: update README.md

---------

Co-authored-by: Benjamin F <159887351+yyj6666667@users.noreply.github.com>
2026-03-27 14:45:02 +08:00
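The "prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines" fix above can be sketched as a small CPU-flag check before configuring CMake. The `select_arch_flag` helper and its detection logic are illustrative assumptions, not the project's actual build code; only the `-DLLAMA_AVX512=ON` option comes from the commit message.

```shell
#!/bin/sh
# Sketch (assumption, not KTransformers' real build script): only emit the
# AVX-512 CMake option when the CPU actually reports the avx512f flag, so
# AVX2-only machines build without it.
select_arch_flag() {
    # $1: space-separated CPU flag list, e.g. from `grep -m1 flags /proc/cpuinfo`
    case " $1 " in
        *" avx512f "*) echo "-DLLAMA_AVX512=ON" ;;  # AVX-512 capable CPU
        *)             echo "" ;;                   # AVX2-only: inject nothing
    esac
}

select_arch_flag "fpu avx2 avx512f"   # -> -DLLAMA_AVX512=ON
select_arch_flag "fpu avx2"           # -> (empty)
```

On a real machine the flag list would be read from `/proc/cpuinfo` (Linux) and the result appended to the `cmake` invocation.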
| File | Last commit | Date |
| --- | --- | --- |
| api/server | Initial commit | 2024-07-27 16:06:58 +08:00 |
| AVX2-Tutorial_zh.md | [feat](kt-kernel): support avx2 only inference for bf16 fp8 and gptq int4 (#1892) | 2026-03-27 14:45:02 +08:00 |
| clawdbot_integration_guide.md | [docs]: add clawd bot docs | 2026-01-30 15:51:30 +08:00 |
| DeepseekR1_V3_tutorial_zh.md | add flashinfer to cuda device | 2025-05-15 07:03:45 +00:00 |
| DeepseekR1_V3_tutorial_zh_for_Ascend_NPU.md | Npu revise benchmark results and prerequisites (#1716) | 2025-12-16 14:26:44 +08:00 |
| KTransformers-Fine-Tuning_Developer-Technical-Notes_zh.md | [feat](cmake & doc): fix bug with cmake arch detect & update doc for sft | 2025-11-04 08:46:26 +00:00 |
| KTransformers-Fine-Tuning_User-Guide_zh.md | fix: remove py310 as guide | 2025-11-08 08:54:32 +00:00 |
| Low-cost_Cloud_Training_and_Inference_KTransformers+AutoDL+LlamaFactory_An_On-demand_Cost-efficient_Integrated_Pipeline_for_Ultra-large_Model_Fine-tuning_and_Inference.pdf | Add files via upload (#1814) | 2026-01-27 17:44:50 +08:00 |
| Qwen3-MoE_tutorial_zh_for_Ascend_NPU.md | fix: qwen3-npu bugs; update: add readme-for-qwen3-npu (#1717) | 2025-12-16 14:27:04 +08:00 |
| 【云端低价训推】 KTransformers+AutoDL+LlamaFactory:随用随租的低成本超大模型「微调+推理」一体化流程.pdf | Add AutoDL Tutorial (#1801) | 2026-01-22 14:52:47 +08:00 |