- Add convert_cpu_weights_ds4.py: dequantizes MXFP4 routed experts
(E2M1 + ue8m0, group size 32) on GPU and re-quantizes to AMX-INT4 on CPU.
- Document the script as Step 2 in DeepSeek-V4-Flash.md so AMX users
can opt into AMXINT4 mode instead of the default MXFP4 CPU experts.
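The MXFP4 layout named above can be illustrated with a minimal dequantization sketch: each 4-bit E2M1 code (sign, 2-bit exponent, 1-bit mantissa) decodes to one of eight magnitudes, and each group of 32 codes shares one ue8m0 scale, an unsigned 8-bit power-of-two exponent. This is a standalone illustration of the format, not the actual GPU kernel in convert_cpu_weights_ds4.py.

```python
# Illustrative MXFP4 (E2M1 values + ue8m0 group scales) dequantization.
# Positive E2M1 magnitudes for codes 0..7; bit 3 is the sign.
E2M1_LUT = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def decode_e2m1(code: int) -> float:
    """Decode one 4-bit E2M1 code into a float."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * E2M1_LUT[code & 0x7]

def dequantize_mxfp4(codes, scales, group_size=32):
    """codes: 4-bit ints; scales: one ue8m0 byte per group of `group_size`.
    ue8m0 encodes a biased power-of-two exponent: scale = 2**(e - 127)."""
    out = []
    for i, c in enumerate(codes):
        e = scales[i // group_size]
        out.append(decode_e2m1(c) * 2.0 ** (e - 127))
    return out
```

With a scale byte of 127 (scale 1.0), code `0x1` dequantizes to 0.5 and `0x9` to -0.5.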
* Add utility script to merge loose layer weights to safetensors
* Send warnings and errors to stderr
* Fix expert index parsing for MOE_INT4 and MOE_INT8
* Add CPU weight conversion support for GLM-5 and MiniMax-M2.5
* fix: remove overly restrictive MiniMax condition and deduplicate code
- Remove `args.input_type == "fp8"` from MiniMaxConverter selection so
bf16/fp16 MiniMax models no longer fall through to OnlineQuantConverter
(which doesn't handle w1/w2/w3 naming and would fail).
- Remove OnlineQuantConverter._find_expert_layers() which is identical
to the inherited ConverterBase._find_expert_layers().
- Remove redundant expert_key_filter assignment (same as base default).
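The selection fix above can be sketched as follows; the class names and argument fields (`MiniMaxConverter`, `OnlineQuantConverter`, `model_type`, `input_type`) are illustrative stand-ins, not the project's exact API.

```python
# Hypothetical converter-selection logic, showing the effect of dropping
# the overly restrictive input_type check.
def select_converter(model_type: str, input_type: str) -> str:
    # Before the fix, MiniMax matched only when input_type == "fp8", so
    # bf16/fp16 MiniMax models fell through to OnlineQuantConverter,
    # which does not handle the w1/w2/w3 expert naming and would fail.
    if model_type == "minimax":  # no input_type restriction anymore
        return "MiniMaxConverter"
    return "OnlineQuantConverter"
```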
---------
Co-authored-by: ErvinXie <ervinxie@foxmail.com>
* [feat]: Enhance CPU feature detection and support for AVX512 extensions
- Added cmake/DetectCPU.cmake for automatic CPU feature detection.
- Updated CMakeLists.txt to include auto-detection logic for AVX512 features.
- Modified install.sh to include new AVX512_VBMI option for FP8 MoE.
- Enhanced _cpu_detect.py to support progressive matching of CPU variants.
- Created scripts/check_cpu_features.py for manual CPU feature checks.
- Updated setup.py to reflect changes in CPU variant building and environment variables.
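"Progressive matching of CPU variants" can be sketched as: walk an ordered list of build variants from most to least capable and pick the first one whose required flags the CPU reports. The variant names and flag sets below are illustrative, not the exact tables in _cpu_detect.py.

```python
# Hypothetical variant table, ordered from most to least capable.
VARIANTS = [
    ("amx",         {"amx_tile", "avx512f", "avx512_vbmi"}),
    ("avx512_vbmi", {"avx512f", "avx512_vbmi"}),
    ("avx512",      {"avx512f"}),
    ("avx2",        {"avx2"}),
]

def match_variant(cpu_flags):
    """Return the first variant whose required flags are all present."""
    for name, required in VARIANTS:
        if required <= set(cpu_flags):
            return name
    return "generic"  # fallback when nothing matches
```

On Linux the flag set would typically come from the `flags` line of `/proc/cpuinfo`.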
* [fix](kt-kernel): Add conditional inclusion of FP8 MoE for AVX512 BF16 support
* [chore](kt-kernel): update project version to 0.5.0 in CMakeLists.txt and version.py
* [feat]: kt-kernel: Add resume arg to CPU weight conversion
* [docs]: kt-kernel: Document resume arg for CPU weight conversion
* [fix]: kt-kernel: Only print resume layer if in use
* [fix]: kt-kernel: Don't log skipped layers when using resume_layer
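The resume behaviour described in the last three entries can be sketched as: print the resume point only when the argument is set, and skip already-converted layers silently rather than logging each one. Function and parameter names here are illustrative, not the script's actual interface.

```python
def convert_layers(layers, resume_layer=None, log=print):
    """Hypothetical sketch of --resume handling in CPU weight conversion."""
    if resume_layer is not None:
        # Only mention the resume point when the user asked to resume.
        log(f"resuming from layer {resume_layer}")
    converted = []
    for idx, layer in enumerate(layers):
        if resume_layer is not None and idx < resume_layer:
            continue  # skip silently: no per-layer "skipped" log lines
        converted.append(layer)
    return converted
```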
* [feat]: update kt-kernel hooks and add contribution guide
* [docs]: add contributing guide
* [style]: format the python file and cpp file in kt-kernel