kvcache-ai-ktransformers/kt-kernel/python/cli
Benjamin F 8484ef8b16
[feat](kt-kernel): adapt MXFP4 MoE backend for DeepSeek-V4-Flash (#1950)
V4-Flash routed experts ship as native MXFP4 (E2M1 nibble + ue8m0 group
scale). Expose AMXFP4_KGroup_MOE through NativeMoEWrapper, add a loader
that handles V4's `layers.{L}.ffn.experts.{i}.{w1,w3,w2}.{weight,scale}`
naming and converts ue8m0 → bf16 via a lossless bit-cast, register the
model entry, and ship an end-to-end numerical validation script.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:11:53 +08:00
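The formats named in the commit are fixed by the OCP Microscaling (MX) spec: E2M1 is a 4-bit float (1 sign, 2 exponent, 1 mantissa bit) and ue8m0 is an 8-bit unsigned biased exponent encoding 2^(e-127). Since bf16 has an 8-bit exponent field, placing the ue8m0 byte there with sign 0 and mantissa 0 reproduces the scale exactly, which is why the commit can call the conversion a lossless bit-cast. A minimal stdlib-only sketch of both steps (the low-nibble-first packing order and the helper names are assumptions, not the kernel's actual code):

```python
import struct

# The eight non-negative E2M1 code points, per the OCP MX spec:
# codes 0..7 decode to these magnitudes; bit 3 is the sign.
_E2M1_POS = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)

def decode_e2m1_byte(b: int) -> list:
    """Decode one packed byte into two FP4 (E2M1) floats.
    Low nibble first is an assumption about the packing order."""
    out = []
    for code in (b & 0x0F, b >> 4):
        mag = _E2M1_POS[code & 0x7]
        out.append(-mag if code & 0x8 else mag)
    return out

def ue8m0_to_bf16_bits(e: int) -> int:
    """ue8m0 stores a biased exponent e meaning 2**(e-127); shifting it
    into the bf16 exponent field (sign=0, mantissa=0) is lossless for
    every normal exponent."""
    return e << 7

def ue8m0_to_float(e: int) -> float:
    # bf16 is the upper half of an IEEE-754 float32, so widen the
    # 16-bit pattern and reinterpret the 32-bit word as a float.
    return struct.unpack(">f", struct.pack(">I", ue8m0_to_bf16_bits(e) << 16))[0]
```

For example, `ue8m0_to_float(127)` gives the scale 1.0 and `ue8m0_to_float(128)` gives 2.0; a dequantized weight is then `decoded_e2m1 * scale` per quantization group.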
commands [fix]: fix --numa-nodes handling (#1904) 2026-03-31 17:50:22 +08:00
completions kt-cli enhancement (#1834) 2026-02-04 16:44:54 +08:00
config Kt minimax (#1742) 2025-12-24 15:39:44 +08:00
requirements Kt minimax (#1742) 2025-12-24 15:39:44 +08:00
utils [feat](kt-kernel): adapt MXFP4 MoE backend for DeepSeek-V4-Flash (#1950) 2026-04-25 18:11:53 +08:00
__init__.py chore: bump version to 0.5.3 (#1909) 2026-04-01 18:58:48 +08:00
i18n.py Fix/sglang kt detection (#1875) 2026-03-04 16:54:48 +08:00
main.py kt-cli enhancement (#1834) 2026-02-04 16:44:54 +08:00