koboldcpp/src/models
Concedo cd6788007e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build-cross.yml
#	.github/workflows/build-self-hosted.yml
#	.github/workflows/release.yml
#	examples/llama.android/lib/src/main/cpp/CMakeLists.txt
#	ggml/CMakeLists.txt
#	ggml/src/ggml-rpc/CMakeLists.txt
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/mmvq.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	scripts/sync_vendor.py
#	tests/test-chat.cpp
#	tests/test-mtmd-c-api.c
#	tools/server/README.md
2026-04-20 20:19:11 +08:00
afmoe.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
apertus.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
arcee.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
arctic.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
arwkv7.cpp refactor : llama-model.cpp (#16252) 2025-10-31 23:40:23 +01:00
baichuan.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
bailingmoe.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
bailingmoe2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
bert.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
bitnet.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
bloom.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
chameleon.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
chatglm.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
codeshell.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
cogvlm.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
cohere2-iswa.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
command-r.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
dbrx.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
deci.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
deepseek.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
deepseek2.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
delta-net-base.cpp graph : remove redundant GDN state transposes (#20443) 2026-03-13 22:12:54 +02:00
dots1.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
dream.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
ernie4-5-moe.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
ernie4-5.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
eurobert.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
exaone-moe.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
exaone.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
exaone4.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
falcon-h1.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
falcon.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gemma-embedding.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gemma.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gemma2-iswa.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gemma3.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gemma3n-iswa.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gemma4-iswa.cpp Merge branch 'upstream' into concedo_experimental 2026-04-17 22:37:37 +08:00
glm4-moe.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
glm4.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
gpt2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
gptneox.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
granite-hybrid.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
granite.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
grok.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
grovemoe.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
hunyuan-dense.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
hunyuan-moe.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
internlm2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
jais.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
jais2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
jamba.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
kimi-linear.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
lfm2.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
llada-moe.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
llada.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
llama.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
llama4.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
maincoder.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
mamba-base.cpp model : wire up Nemotron-H tensors for NVFP4 support (#20561) 2026-03-16 09:19:16 +01:00
mamba.cpp models : deduplicate delta-net graphs for Qwen family (#19597) 2026-02-16 14:35:04 +02:00
mimo2-iswa.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
minicpm3.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
minimax-m2.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
mistral3.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
models.h model: using single llm_build per arch (#21970) 2026-04-16 21:10:22 +02:00
modern-bert.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
mpt.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
nemotron-h.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
nemotron.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
neo-bert.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
olmo.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
olmo2.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
olmoe.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
openai-moe-iswa.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
openelm.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
orion.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
paddleocr.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
pangu-embedded.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
phi2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
phi3.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
plamo.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
plamo2.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
plamo3.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
plm.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
qwen.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
qwen2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen2moe.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen2vl.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen3.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen3moe.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen3next.cpp Merge branch 'upstream' into concedo_experimental 2026-04-17 22:37:37 +08:00
qwen3vl-moe.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen3vl.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
qwen35.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
qwen35moe.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
refact.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
rnd1.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
rwkv6-base.cpp models : deduplicate delta-net graphs for Qwen family (#19597) 2026-02-16 14:35:04 +02:00
rwkv6.cpp models : move the token embedding norms to the first layer (#20943) 2026-03-24 17:00:30 +02:00
rwkv6qwen2.cpp refactor : llama-model.cpp (#16252) 2025-10-31 23:40:23 +01:00
rwkv7-base.cpp models : deduplicate delta-net graphs for Qwen family (#19597) 2026-02-16 14:35:04 +02:00
rwkv7.cpp models : move the token embedding norms to the first layer (#20943) 2026-03-24 17:00:30 +02:00
seed-oss.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
smallthinker.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
smollm3.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
stablelm.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00
starcoder.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
starcoder2.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
step35-iswa.cpp model : support NVFP4 tensors for Gemma4 (#21971) 2026-04-16 16:51:47 +02:00
t5.cpp model : refactor bias tensor variable names (#22079) 2026-04-18 20:12:00 +02:00
t5encoder.cpp model: using single llm_build per arch (#21970) 2026-04-16 21:10:22 +02:00
wavtokenizer-dec.cpp models : move the token embedding norms to the first layer (#20943) 2026-03-24 17:00:30 +02:00
xverse.cpp model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245) 2026-04-16 17:41:34 +02:00