Mirror of https://github.com/LostRuins/koboldcpp.git (synced 2026-04-28 03:30:20 +00:00)
* mtmd, llama : add HunyuanVL vision-language model support
  - add LLM_ARCH_HUNYUAN_VL with M-RoPE (XD-RoPE) support
  - add PROJECTOR_TYPE_HUNYUANVL with PatchMerger vision encoder
  - add HunyuanVL-specific M-RoPE position encoding for image tokens
  - add GGUF conversion for HunyuanVL vision and text models
  - add smoke test in tools/mtmd/tests.sh
* fix: fix HunyuanVL XD-RoPE h/w section order
* fix: Remove redundant code
* convert : fix HunyuanOCR / HunyuanVL conversion
  - Tested locally: both HunyuanOCR and HunyuanVL-4B convert to GGUF successfully and produce correct inference output on Metal (F16 / Q8_0).
* clip : fix -Werror=misleading-indentation in bilinear resize
* fix CI: convert_hf_to_gguf type check error
  - convert_hf_to_gguf.py: give HunyuanVLTextModel.__init__ an explicit `dir_model: Path` parameter so ty can infer the type for load_hparams instead of reporting `Unknown | None`.

Co-authored-by: wendadawen <wendadawen@tencent.com>

| File | | |
|---|---|---|
| cogvlm.cpp | ||
| conformer.cpp | ||
| deepseekocr.cpp | ||
| dotsocr.cpp | ||
| gemma4a.cpp | ||
| gemma4v.cpp | ||
| glm4v.cpp | ||
| hunyuanocr.cpp | ||
| internvl.cpp | ||
| kimik25.cpp | ||
| kimivl.cpp | ||
| llama4.cpp | ||
| llava.cpp | ||
| minicpmv.cpp | ||
| mobilenetv5.cpp | ||
| models.h | ||
| nemotron-v2-vl.cpp | ||
| paddleocr.cpp | ||
| pixtral.cpp | ||
| qwen2vl.cpp | ||
| qwen3a.cpp | ||
| qwen3vl.cpp | ||
| siglip.cpp | ||
| step3vl.cpp | ||
| whisper-enc.cpp | ||
| yasa2.cpp | ||
| youtuvl.cpp | ||