koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-31 21:39:42 +00:00

Author	SHA1	Message	Date
Saba Fallah	da3f990a47	mtmd: Add DeepSeekOCR 2 Support (#20975 ) * mtmd: DeepSeek-OCR 2 support, with multi-tile dynamic resolution * introduced clip_image_f32::add_viewsep * address PR review - drop redundant ggml_cpy ops in both deepseekocr versions build - drop no-op ggml_cont in build_sam - assert num_image_tokens deepseekocr2 - view_seperator as (1, n_embd) at conversion (for both versions) - drop redundant ggml_reshape_2d * Update tools/mtmd/models/deepseekocr2.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2026-05-29 16:13:51 +02:00
fairydreaming	1f0aa2a696	model : support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation (#23346 ) * llama : support DeepSeek V3.2 model family (with DSA lightning indexer) * convert : handle DeepseekV32ForCausalLM architecture * ggml : support for f16 GGML_OP_FILL * memory : separate hparams argument in llama_kv_cache constructor * memory : add llama_kv_cache_dsa memory (KV cache + lightning indexer cache) * llama : support for LLM_ARCH_DEEPSEEK32 * model : llama_model_deepseek32 implementation * model : merge two scale operations into one in DSA lightning indexer implementation * chore : remove unused code * model : support NVFP4 in DeepSeek V3.2 Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * memory : refactoring TODO Co-authored-by: ggerganov <ggerganov@users.noreply.github.com> --------- Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> Co-authored-by: ggerganov <ggerganov@users.noreply.github.com>	2026-05-29 10:15:17 +02:00
ghleg	dbe9c0c8ce	convert : support Gemma4ForCausalLM architecture (#23682 ) * convert : support Gemma4ForCausalLM architecture (#23674) * fix indent --------- Co-authored-by: Oleg Afonin <your.email@example.com> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-05-26 08:00:31 +03:00
Niklas Sheth	c9d98295a3	model : add support for talkie-1930-13b (#22596 ) * initial talkie support, coherent * reorder to follow convention * absorb inverse rope * stop folding scalars to improve quantization * use broadcasting instead of duplication * style cleanup * add scaling support to LoraTorchTensor; use that path in conversion * use layer_out_scale instead of embd_skip_scale	2026-05-26 07:57:38 +03:00
Piotr Wilkin (ilintar)	cc7200bf12	Refactor: convert_hf_to_gguf.py (#17114 ) * move conversion code to a dedicated conversion directory and split the files akin to the src/models architecture --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2026-05-15 15:18:12 +02:00

5 commits