Mirror of https://github.com/kvcache-ai/ktransformers.git, synced 2026-04-28 11:49:51 +00:00.
V4-Flash routed experts ship as native MXFP4 (E2M1 nibble + ue8m0 group
scale). Expose AMXFP4_KGroup_MOE through NativeMoEWrapper, add a loader
that handles V4's `layers.{L}.ffn.experts.{i}.{w1,w3,w2}.{weight,scale}`
naming and converts ue8m0 → bf16 via a lossless bit-cast, register the
model entry, and ship an end-to-end numerical validation script.
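A minimal sketch of the decode path described above, assuming the OCP MX definition of E2M1 (1 sign, 2 exponent, 1 mantissa bit; magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) and ue8m0 scales of the form 2**(e − 127). Function names, the nibble packing order (low nibble first), and the group size are illustrative assumptions, not taken from the repo:

```python
import numpy as np

# E2M1 magnitude table per the OCP MX spec (index = 3-bit magnitude field).
_E2M1_MAG = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)


def ue8m0_to_bf16_bits(scale_u8: np.ndarray) -> np.ndarray:
    """Lossless bit-cast of ue8m0 (value = 2**(e - 127)) to bf16 bit patterns.

    A ue8m0 byte is exactly a bf16 exponent field, so shifting it into
    bits [14:7] (sign = 0, mantissa = 0) reproduces the same power of two
    bit-exactly -- no rounding, hence "lossless".
    """
    return scale_u8.astype(np.uint16) << 7


def dequant_mxfp4(packed: np.ndarray, scale_u8: np.ndarray, group: int = 32) -> np.ndarray:
    """Unpack E2M1 nibbles and apply per-group ue8m0 scales.

    Uses float32 for clarity; a real kernel would stay in bf16.
    Assumes the low nibble of each byte is the earlier element.
    """
    lo, hi = packed & 0xF, packed >> 4
    nibbles = np.stack([lo, hi], axis=-1).reshape(-1)      # two values per byte
    vals = np.where(nibbles & 0x8, -1.0, 1.0) * _E2M1_MAG[nibbles & 0x7]
    scales = np.exp2(scale_u8.astype(np.float32) - 127.0)  # one scale per group
    return (vals.reshape(-1, group) * scales[:, None]).reshape(-1)
```

The bit-cast is why the conversion costs nothing numerically: every representable ue8m0 value is a power of two, and every power of two in that range is exactly representable in bf16.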
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
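The loader's handling of V4's tensor naming could be sketched as a single regex over checkpoint keys; the pattern and helper below are hypothetical illustrations of the `layers.{L}.ffn.experts.{i}.{w1,w3,w2}.{weight,scale}` scheme, not the repo's actual loader code:

```python
import re

# Matches V4-Flash routed-expert tensors, e.g.
# "layers.3.ffn.experts.17.w1.weight" or "layers.3.ffn.experts.17.w2.scale".
_EXPERT_RE = re.compile(
    r"layers\.(?P<layer>\d+)\.ffn\.experts\.(?P<expert>\d+)"
    r"\.(?P<proj>w[123])\.(?P<kind>weight|scale)$"
)


def parse_expert_tensor(name: str):
    """Return (layer, expert, proj, kind) for an expert tensor, else None."""
    m = _EXPERT_RE.match(name)
    if m is None:
        return None
    return int(m["layer"]), int(m["expert"]), m["proj"], m["kind"]
```

Keys that don't match (attention weights, router gates, norms) fall through to the regular loading path, so the MXFP4 conversion only touches expert tensors.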
| Name |
|---|
| __init__.py |
| analyze_moe_model.py |
| console.py |
| debug_configs.py |
| download_helper.py |
| environment.py |
| input_validators.py |
| kv_cache_calculator.py |
| model_discovery.py |
| model_registry.py |
| model_scanner.py |
| model_table_builder.py |
| model_verifier.py |
| port_checker.py |
| quant_interactive.py |
| repo_detector.py |
| run_configs.py |
| run_interactive.py |
| sglang_checker.py |
| tuna_engine.py |
| user_model_registry.py |