Jianwei Dong
17d9e49dd0
Update README.md
2026-04-18 21:28:46 +08:00
Jianwei Dong
a9f28d495b
Update README.md ( #1934 )
2026-04-18 21:10:25 +08:00
mrhaoxx
7a9daf0cd4
[feat](kt-kernel): support avx2 only inference for bf16 fp8 and gptq int4 ( #1892 )
* feat: support avx2 bf16 fp8 inference
* feat: support avx2 gptq int4 inference
* fix: numeric issues in fp8 dequant
* Tutorial avx2 (#1900 )
* fix: prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines
* docs: add AVX2 tutorial for running KTransformers on AVX2-only CPUs
* Tutorial avx2 (#1901 )
* fix: prevent injecting -DLLAMA_AVX512=ON on AVX2-only machines
* docs: add AVX2 tutorial for running KTransformers on AVX2-only CPUs
* docs: update README.md
---------
Co-authored-by: Benjamin F <159887351+yyj6666667@users.noreply.github.com>
2026-03-27 14:45:02 +08:00
Jiaqi Liao
411b69bec0
Add GLM-5 Day0 Support update to README ( #1851 )
2026-02-15 11:37:12 +08:00
Jiaqi Liao
7d9943365a
Add MiniMax-M2.5 Day0 support update ( #1850 )
Added update for MiniMax-M2.5 Day0 support in the README.
2026-02-13 22:49:10 +08:00
Jiaqi Liao
2f6f7f1921
Kimi k2.5 doc ( #1812 )
* [doc]: add Kimi-K2.5 deploy&sft guide
* [doc]: add Kimi-K2.5 deploy&sft guide
2026-01-27 13:33:25 +08:00
Oql
5bd5c8f750
[fix]: fix experts-sched-Tutorial.md ( #1808 )
2026-01-23 18:06:24 +08:00
Oql
bf4c8a690b
Add Native Precision Tutorial, update worker strategy and README.md ( #1807 )
2026-01-23 18:00:13 +08:00
Jiaqi Liao
be668074de
Update tutorial links in README.md ( #1749 )
2025-12-25 14:26:10 +08:00
ZiWei Yuan
dc5feece8f
[docs]: update doc link ( #1745 )
2025-12-24 18:00:47 +08:00
ErvinXie
d8046e1bb4
Kt minimax ( #1742 )
[feat]: fp8 kernel and kt-cli support
2025-12-24 15:39:44 +08:00
mrhaoxx
e7d277d163
[docs]: refine README for dpo updates ( #1740 )
* [docs]: refine dpo tutorial
* [docs]: refine README for dpo updates
* Update doc/en/DPO_tutorial.md
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* [docs]: update website doc & refine location
---------
Co-authored-by: ErvinXie <ervinxie@foxmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: ZiWei Yuan <yzwliam@126.com>
2025-12-24 11:20:08 +08:00
ZiWei Yuan
34230eaf44
[docs]: Fix image link in README.md ( #1718 )
Updated image link to use raw GitHub URL for better accessibility.
2025-12-15 17:10:15 +08:00
ZiWei Yuan
2f1b743050
[docs]: update website doc png ( #1696 )
2025-12-11 13:01:32 +08:00
Jiaqi Liao
721b6c4c94
[docs] Update Native Kimi-K2-Thinking documentation and kt-kernel parameters ( #1671 )
2025-12-05 22:46:16 +08:00
Jiaqi Liao
1ca3a2662e
Add 9#AISoft to the list of contributors ( #1668 )
2025-12-05 15:44:04 +08:00
Peilin Li
171578a7ec
[refactor]: Change named 'KT-SFT' to 'kt-sft' ( #1626 )
* Change named 'KT-SFT' to 'kt-sft'
* [docs]: update kt-sft name
---------
Co-authored-by: ZiWei Yuan <yzwliam@126.com>
2025-11-17 11:48:42 +08:00
ZiWei Yuan
550e4986f5
[docs]: update README.md ( #1616 )
* [docs]: update README.md
2025-11-15 20:56:26 +08:00
ZiWei Yuan
7c2ad6dbca
[docs]: update README.md ( #1614 )
* [docs]: update README.md
2025-11-15 18:34:27 +08:00
ErvinXie
5179f0d634
Add roadmap link to README ( #1585 )
2025-11-10 18:15:53 +08:00
Jiaqi Liao
07322ca2bd
Refactor: restructure repository to focus on kt-kernel and KT-SFT modules ( #1583 )
* refactor repo
* fix README
2025-11-10 17:57:48 +08:00
ErvinXie
2cb1674020
Fix image reference in README.md ( #1584 )
Updated image reference in README for heterogeneous computing.
2025-11-10 17:53:41 +08:00
Jiaqi Liao
57d14d22bc
Refactor: restructure repository to focus on kt-kernel and KT-SFT modules ( #1581 )
* refactor: move legacy code to archive/ directory
- Moved ktransformers, csrc, third_party, merge_tensors to archive/
- Moved build scripts and configurations to archive/
- Kept kt-kernel, KT-SFT, doc, and README files in root
- Preserved complete git history for all moved files
* refactor: restructure repository to focus on kt-kernel and KT-SFT modules
* fix README
* fix README
* fix README
* fix README
* docs: add performance benchmarks to kt-kernel section
Add comprehensive performance data for kt-kernel to match KT-SFT's presentation:
- AMX kernel optimization: 21.3 TFLOPS (3.9× faster than PyTorch)
- Prefill phase: up to 20× speedup vs baseline
- Decode phase: up to 4× speedup
- NUMA optimization: up to 63% throughput improvement
- Multi-GPU (8×L20): 227.85 tokens/s total throughput with DeepSeek-R1 FP8
Source: https://lmsys.org/blog/2025-10-22-KTransformers/
This provides users with concrete performance metrics for both core modules,
making it easier to understand the capabilities of each component.
* refactor: improve kt-kernel performance data with specific hardware and models
Replace generic performance descriptions with concrete benchmarks:
- Specify exact hardware: 8×L20 GPU + Xeon Gold 6454S, Single/Dual-socket Xeon + AMX
- Include specific models: DeepSeek-R1-0528 (FP8), DeepSeek-V3 (671B)
- Show detailed metrics: total throughput, output throughput, concurrency details
- Match KT-SFT presentation style for consistency
This provides users with actionable performance data they can use to evaluate
hardware requirements and expected performance for their use cases.
* fix README
* docs: clean up performance table and improve formatting
* add pic for README
* refactor: simplify .gitmodules and backup legacy submodules
- Remove 7 legacy submodules from root .gitmodules (archive/third_party/*)
- Keep only 2 active submodules for kt-kernel (llama.cpp, pybind11)
- Backup complete .gitmodules to archive/.gitmodules
- Add documentation in archive/README.md for researchers who need legacy submodules
This reduces initial clone size by ~500MB and avoids downloading unused dependencies.
* refactor: move doc/ back to root directory
Keep documentation in root for easier access and maintenance.
* refactor: consolidate all images to doc/assets/
- Move kt-kernel/assets/heterogeneous_computing.png to doc/assets/
- Remove KT-SFT/assets/ (images already in doc/assets/)
- Update KT-SFT/README.md image references to ../doc/assets/
- Eliminates ~7.9MB image duplication
- Centralizes all documentation assets in one location
* fix pic path for README
2025-11-10 17:42:26 +08:00
Atream
86229c852d
Add update for Kimi-K2-Thinking support
2025-11-06 17:56:46 +08:00
ovowei
44e47ad75a
update readme.md
2025-11-05 23:30:58 +08:00
ovowei
00f038e763
update readme.md
2025-11-05 23:29:59 +08:00
ovowei
1e17d75bfd
fix
2025-10-30 10:47:05 +08:00
ovowei
ca21992e46
update readme.md. (Support Ascend NPU)
2025-10-27 20:53:06 +08:00
Atream
8ef6111ae0
Update README with Citation link
2025-10-10 19:12:31 +08:00
Atream
1e48eab7d5
Add citation section to README
Added citation section with reference to KTransformers paper.
2025-10-10 18:59:29 +08:00
Atream
e93abc93ec
Add SGLang Integration to README.md
2025-10-10 18:50:05 +08:00
Jianwei Dong
d4b3fe2427
Merge branch 'main' into support-qwen3next
2025-09-12 21:59:32 +08:00
djw
a44b710649
support qwen3 next
2025-09-11 11:55:09 +00:00
Azure
24fe61bbc3
Update date for Kimi-K2-0905 support
2025-09-05 17:47:17 +08:00
Azure-Tang
b6d36bffbb
update kimi-k2-0905
2025-09-05 03:52:43 +00:00
qiyuxinlin
1334ddc833
update readme
2025-07-25 17:02:36 +00:00
Atream
cf79c93fae
Update README.md
2025-07-11 09:35:12 +08:00
Atream
18690d819f
Update README.md
2025-07-11 09:34:07 +08:00
ErvinXie
aadf31b35d
Update README.md
2025-06-30 17:55:49 +08:00
ErvinXie
a9a72e52c3
Update README.md
2025-06-30 14:56:46 +08:00
liam Yuan
22d0d9ccb2
✨ update vendor ZTE name
2025-06-23 21:07:17 +08:00
liam Yuan
cb77b52c63
✨ update vendor support list
2025-06-23 21:00:01 +08:00
Atream
d051a14941
Update README.md
2025-05-15 10:29:43 +08:00
rnwang04
142fb7ce6c
Enable support for Intel XPU devices, add support for DeepSeek V2/V3 first
2025-05-14 19:37:27 +00:00
Atream
7ebf82a492
Update Qwen3 date
2025-04-29 09:43:13 +08:00
qiyuxinlin
a3ba63665a
update readme
2025-04-28 22:38:41 +00:00
qiyuxinlin
89823ccb1f
update readme
2025-04-28 22:34:47 +00:00
qiyuxinlin
e7763a4b59
update readme
2025-04-28 22:32:35 +00:00
qiyuxinlin
d3ebdafd4b
update readme
2025-04-28 22:31:09 +00:00
qiyuxinlin
59b0631e33
update readme
2025-04-28 22:26:38 +00:00