kvcache-ai-ktransformers/kt-kernel/python
mrhaoxx 7a9daf0cd4
[feat](kt-kernel): support AVX2-only inference for BF16, FP8, and GPTQ INT4 (#1892)
* feat: support AVX2 BF16/FP8 inference

* feat: support AVX2 GPTQ INT4 inference

* fix: numeric issues in FP8 dequantization

* Tutorial avx2 (#1900)

* fix: prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines

* docs: add AVX2 tutorial for running KTransformers on AVX2-only CPUs

* Tutorial avx2 (#1901)

* fix: prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines

* docs: add AVX2 tutorial for running KTransformers on AVX2-only CPUs

* docs: update README.md

---------

Co-authored-by: Benjamin F <159887351+yyj6666667@users.noreply.github.com>
2026-03-27 14:45:02 +08:00
cli [fix] improve SGLang kt-kernel detection time (#1887) 2026-03-18 23:07:40 +08:00
utils [feat](kt-kernel): support AVX2-only inference for BF16, FP8, and GPTQ INT4 (#1892) 2026-03-27 14:45:02 +08:00
__init__.py [feat](kt-kernel): CPU-GPU experts sched (#1796) 2026-01-16 17:01:15 +08:00
_cpu_detect.py [feat](kt-kernel): Fix CPU instruction set variants for build & install (#1746) 2025-12-24 18:57:45 +08:00
experts.py [feat](kt-kernel): support AVX2-only inference for BF16, FP8, and GPTQ INT4 (#1892) 2026-03-27 14:45:02 +08:00
experts_base.py [feat](kt-kernel): CPU-GPU experts sched (#1796) 2026-01-16 17:01:15 +08:00
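The commits above revolve around picking the right instruction-set variant at build and install time (e.g. not injecting `-DLLAMA_AVX512=ON` on AVX2-only machines). As a rough illustration of the kind of check a module like `_cpu_detect.py` might perform, here is a minimal, hypothetical sketch that reads CPU feature flags on Linux and picks the widest supported SIMD level. The function names (`cpu_flags`, `best_isa`) and the logic are assumptions for illustration, not the repository's actual implementation.

```python
# Hypothetical sketch: choose an ISA variant from CPU feature flags.
# NOT the actual kt-kernel _cpu_detect.py implementation.
import platform


def cpu_flags() -> set:
    """Return the CPU feature-flag set from /proc/cpuinfo on Linux;
    an empty set on other platforms or on read failure."""
    if platform.system() != "Linux":
        return set()
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def best_isa(flags: set) -> str:
    """Pick the widest supported SIMD level: AVX-512 if the foundation
    flag is present, else AVX2, else a generic fallback."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "generic"


if __name__ == "__main__":
    print(best_isa(cpu_flags()))
```

On an AVX2-only machine this would select `"avx2"`, which is the situation the `-DLLAMA_AVX512=ON` fix above guards against: the build flag must match what the running CPU actually reports.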