vrr/kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2026-04-28 11:49:51 +00:00

Commit graph

Author	SHA1	Message	Date
Benjamin F	8484ef8b16	[feat](kt-kernel): adapt MXFP4 MoE backend for DeepSeek-V4-Flash (#1950) V4-Flash routed experts ship as native MXFP4 (E2M1 nibble + ue8m0 group scale). Expose AMXFP4_KGroup_MOE through NativeMoEWrapper, add a loader that handles V4's `layers.{L}.ffn.experts.{i}.{w1,w3,w2}.{weight,scale}` naming and converts ue8m0 → bf16 via a lossless bit-cast, register the model entry, and ship an end-to-end numerical validation script. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 18:11:53 +08:00
ErvinXie	9539ab91eb	Cli (#1765) * [feat]: add custom option for kt run * [feat]: depth 3	2025-12-29 15:18:42 +08:00
ErvinXie	d8046e1bb4	Kt minimax (#1742) [feat]: fp8 kernel and kt-cli support	2025-12-24 15:39:44 +08:00
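
Commit 8484ef8b16 mentions converting ue8m0 group scales to bf16 "via a lossless bit-cast". The sketch below illustrates why such a cast can be exact; it is not taken from the repository. A ue8m0 byte `e` encodes the power of two 2^(e-127), and bf16 uses the same 8-bit exponent with the same bias, so shifting the byte into the bf16 exponent field reproduces the value for e in 1..254. All function names here are hypothetical, and the E2M1 magnitude table follows the common MX convention (sign bit plus eight magnitudes). The edge codes e = 0 and e = 255 do not map cleanly under a plain shift, and how the actual loader treats them is not stated in the commit message.

```python
import numpy as np

# E2M1 magnitudes indexed by the low 3 bits of a 4-bit code; bit 3 is the sign.
# (Assumed from the usual MX element format, not from the repository.)
E2M1_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0],
                           dtype=np.float32)


def ue8m0_to_bf16_bits(scale_u8: np.ndarray) -> np.ndarray:
    """Place the 8 exponent bits into the bf16 exponent field (bits 14..7)."""
    return np.left_shift(scale_u8.astype(np.uint16), np.uint16(7))


def bf16_bits_to_float32(bits_u16: np.ndarray) -> np.ndarray:
    """Widen bf16 bit patterns to float32 by appending 16 zero mantissa bits."""
    widened = np.left_shift(bits_u16.astype(np.uint32), np.uint32(16))
    return widened.view(np.float32)


def decode_e2m1(codes_u8: np.ndarray) -> np.ndarray:
    """Decode unpacked 4-bit E2M1 codes (one code per array element)."""
    codes = codes_u8.astype(np.uint8)
    sign = np.where(codes & 0x8, -1.0, 1.0).astype(np.float32)
    return sign * E2M1_MAGNITUDES[codes & 0x7]


# Scales 2^-1, 2^0, 2^1 survive the shift exactly.
scales = bf16_bits_to_float32(
    ue8m0_to_bf16_bits(np.array([126, 127, 128], dtype=np.uint8)))

# Dequantize three codes that share the group scale 2.0.
values = decode_e2m1(np.array([0x1, 0x7, 0xA], dtype=np.uint8)) * scales[2]
```

Here `scales` holds 0.5, 1.0 and 2.0, and `values` holds 1.0, 12.0 and -2.0. The sketch decodes one code per byte; how two nibbles are packed into a byte in the V4-Flash checkpoints is not covered.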

3 commits