koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-17 04:09:19 +00:00

manayang 7bfe60fdf9 mtmd, llama : Update HunyuanVL vision-language model support (#22037)	2026-04-22 11:58:43 +02:00

* mtmd, llama : add HunyuanVL vision-language model support
  - add LLM_ARCH_HUNYUAN_VL with M-RoPE (XD-RoPE) support
  - add PROJECTOR_TYPE_HUNYUANVL with PatchMerger vision encoder
  - add HunyuanVL-specific M-RoPE position encoding for image tokens
  - add GGUF conversion for HunyuanVL vision and text models
  - add smoke test in tools/mtmd/tests.sh
* fix: fix HunyuanVL XD-RoPE h/w section order
* fix: Remove redundant code
* convert : fix HunyuanOCR / HunyuanVL conversion
  - Tested locally: both HunyuanOCR and HunyuanVL-4B convert to GGUF successfully and produce correct inference output on Metal (F16 / Q8_0).
* clip : fix -Werror=misleading-indentation in bilinear resize
* fix CI: convert_hf_to_gguf type check error
  - convert_hf_to_gguf.py: give HunyuanVLTextModel.__init__ an explicit `dir_model: Path` parameter so ty can infer the type for load_hparams instead of reporting `Unknown | None`.

---------

Co-authored-by: wendadawen <wendadawen@tencent.com>
..
afmoe.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
apertus.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
arcee.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
arctic.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
arwkv7.cpp	refactor : llama-model.cpp (#16252)	2025-10-31 23:40:23 +01:00
baichuan.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
bailingmoe.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
bailingmoe2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
bert.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
bitnet.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
bloom.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
chameleon.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
chatglm.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
codeshell.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
cogvlm.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
cohere2-iswa.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
command-r.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
dbrx.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
deci.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
deepseek.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
deepseek2.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
delta-net-base.cpp	graph : remove redundant GDN state transposes (#20443)	2026-03-13 22:12:54 +02:00
dots1.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
dream.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
ernie4-5-moe.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
ernie4-5.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
eurobert.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
exaone-moe.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
exaone.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
exaone4.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
falcon-h1.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
falcon.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gemma-embedding.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gemma.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gemma2-iswa.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gemma3.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gemma3n-iswa.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gemma4-iswa.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
glm4-moe.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
glm4.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
gpt2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
gptneox.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
granite-hybrid.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
granite.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
grok.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
grovemoe.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
hunyuan-dense.cpp	mtmd, llama : Update HunyuanVL vision-language model support (#22037)	2026-04-22 11:58:43 +02:00
hunyuan-moe.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
internlm2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
jais.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
jais2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
jamba.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
kimi-linear.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
lfm2.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
llada-moe.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
llada.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
llama.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
llama4.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
maincoder.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
mamba-base.cpp	model : wire up Nemotron-H tensors for NVFP4 support (#20561)	2026-03-16 09:19:16 +01:00
mamba.cpp	models : deduplicate delta-net graphs for Qwen family (#19597)	2026-02-16 14:35:04 +02:00
mimo2-iswa.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
minicpm3.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
minimax-m2.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
mistral3.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
models.h	model: using single llm_build per arch (#21970)	2026-04-16 21:10:22 +02:00
modern-bert.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
mpt.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
nemotron-h.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
nemotron.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
neo-bert.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
olmo.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
olmo2.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
olmoe.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
openai-moe-iswa.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
openelm.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
orion.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
paddleocr.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
pangu-embedded.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
phi2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
phi3.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
plamo.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
plamo2.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
plamo3.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
plm.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
qwen.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
qwen2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen2moe.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen2vl.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen3.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen3moe.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen3next.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
qwen3vl-moe.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen3vl.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
qwen35.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
qwen35moe.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
refact.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
rnd1.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
rwkv6-base.cpp	models : deduplicate delta-net graphs for Qwen family (#19597)	2026-02-16 14:35:04 +02:00
rwkv6.cpp	models : move the token embedding norms to the first layer (#20943)	2026-03-24 17:00:30 +02:00
rwkv6qwen2.cpp	refactor : llama-model.cpp (#16252)	2025-10-31 23:40:23 +01:00
rwkv7-base.cpp	models : deduplicate delta-net graphs for Qwen family (#19597)	2026-02-16 14:35:04 +02:00
rwkv7.cpp	models : move the token embedding norms to the first layer (#20943)	2026-03-24 17:00:30 +02:00
seed-oss.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
smallthinker.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
smollm3.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
stablelm.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00
starcoder.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
starcoder2.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
step35-iswa.cpp	model : support NVFP4 tensors for Gemma4 (#21971)	2026-04-16 16:51:47 +02:00
t5.cpp	model : refactor bias tensor variable names (#22079)	2026-04-18 20:12:00 +02:00
t5encoder.cpp	model: using single llm_build per arch (#21970)	2026-04-16 21:10:22 +02:00
wavtokenizer-dec.cpp	models : move the token embedding norms to the first layer (#20943)	2026-03-24 17:00:30 +02:00
xverse.cpp	model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)	2026-04-16 17:41:34 +02:00