koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-22 19:47:49 +00:00

Daniel Bevenius baf3cc6e1d model : clarify MTP layer comment in qwen35.cpp [no ci] (#23338) This commit attempts to clarify a code comment in graph_mtp regarding where the MTP layer is stored. The motivation for this is that it was not obvious to me what the original comment meant and hopefully this makes it clearer.		2026-05-19 18:41:44 +02:00
models	model : clarify MTP layer comment in qwen35.cpp [no ci] (#23338)	2026-05-19 18:41:44 +02:00
CMakeLists.txt	cmake: use glob to collect src/models sources (#22005)	2026-04-16 23:25:16 +02:00
llama-adapter.cpp	fix: correct misspellings in code comments (#21217)	2026-03-31 13:50:51 +02:00
llama-adapter.h	llama : re-enable manual LoRA adapter free (#19983)	2026-03-18 12:03:26 +02:00
llama-arch.cpp	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-arch.h	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-batch.cpp	kv-cache : fix M-RoPE checkpoints (#20132)	2026-03-06 08:46:51 +02:00
llama-batch.h	fix: correct misspellings in code comments (#21217)	2026-03-31 13:50:51 +02:00
llama-chat.cpp	model : add HunyuanOCR support (#21395)	2026-04-05 23:32:14 +02:00
llama-chat.h	model : add HunyuanOCR support (#21395)	2026-04-05 23:32:14 +02:00
llama-context.cpp	llama: initialize pre-norm embedding mask flag (#23256)	2026-05-18 14:20:49 +03:00
llama-context.h	llama: avoid copying logits during prompt decode in MTP (#23198)	2026-05-17 23:30:25 +08:00
llama-cparams.cpp	cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188)	2025-06-15 10:08:58 +03:00
llama-cparams.h	llama: avoid copying logits during prompt decode in MTP (#23198)	2026-05-17 23:30:25 +08:00
llama-ext.h	llama: avoid copying logits during prompt decode in MTP (#23198)	2026-05-17 23:30:25 +08:00
llama-grammar.cpp	common/grammar: fix grammar parsing issues to prevent stack overflow and hangs (#18604)	2026-03-21 18:43:35 +01:00
llama-grammar.h	common/grammar : replace problematic backtracking regex `[\s\S]*` (#18342)	2026-01-03 16:02:43 -06:00
llama-graph.cpp	llama: avoid copying logits during prompt decode in MTP (#23198)	2026-05-17 23:30:25 +08:00
llama-graph.h	llama : MTP clean-up (#23269)	2026-05-19 15:32:58 +03:00
llama-hparams.cpp	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-hparams.h	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-impl.cpp	llama : correct platform-independent loading of BOOL metadata (#21428)	2026-04-06 01:40:38 +02:00
llama-impl.h	llama : enable chunked fused GDN path (#20340)	2026-03-11 22:46:40 +02:00
llama-io.cpp	server : avoid checkpoint data host copies (#22558)	2026-05-02 18:03:25 +03:00
llama-io.h	llama : add option to save memory in device buffers (#22679)	2026-05-05 06:35:07 +03:00
llama-kv-cache-iswa.cpp	(revert) kv-cache : do not quantize SWA KV cache (#21332)	2026-04-03 09:07:01 +03:00
llama-kv-cache-iswa.h	llama: print memory breakdown on exit (#15860)	2025-09-24 16:53:48 +02:00
llama-kv-cache.cpp	ggml : implement fast walsh-hadamard transform for kv rotation (#21352) (#22631)	2026-05-05 10:05:05 +08:00
llama-kv-cache.h	kv-cache : support attention rotation for heterogeneous iSWA (#21513)	2026-04-07 20:31:28 +03:00
llama-kv-cells.h	llama: store mrope data in KV cell (#16825)	2025-10-29 18:09:18 +01:00
llama-memory-hybrid-iswa.cpp	llama : MTP clean-up (#23269)	2026-05-19 15:32:58 +03:00
llama-memory-hybrid-iswa.h	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-memory-hybrid.cpp	llama : MTP clean-up (#23269)	2026-05-19 15:32:58 +03:00
llama-memory-hybrid.h	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-memory-recurrent.cpp	llama : MTP clean-up (#23269)	2026-05-19 15:32:58 +03:00
llama-memory-recurrent.h	llama : MTP clean-up (#23269)	2026-05-19 15:32:58 +03:00
llama-memory.cpp	memory : correctly handle failure in apply() (#14438)	2025-06-30 18:03:03 +03:00
llama-memory.h	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-mmap.cpp	Update llama-mmap to use ftello/fseeko (#22497)	2026-04-30 14:17:52 -07:00
llama-mmap.h	llama: fix llama-model-saver (#20503)	2026-03-25 12:53:16 +02:00
llama-model-loader.cpp	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-model-loader.h	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-model-saver.cpp	model : NvFP4 quantized LM head support (#23046)	2026-05-16 11:09:27 +02:00
llama-model-saver.h	llama: fix llama-model-saver (#20503)	2026-03-25 12:53:16 +02:00
llama-model.cpp	llama + spec: MTP Support (#22673)	2026-05-16 20:06:23 +08:00
llama-model.h	model : NvFP4 quantized LM head support (#23046)	2026-05-16 11:09:27 +02:00
llama-quant.cpp	model: move `load_hparams` and `load_tensors` to per-model definition (#22004)	2026-05-04 12:36:59 +02:00
llama-quant.h	llama : refactor `src/llama.cpp` (#10902)	2025-01-03 10:18:53 +02:00
llama-sampler.cpp	llama : rename llama-sampling to llama-sampler (#19363)	2026-02-06 07:26:54 +01:00
llama-sampler.h	llama : rename llama-sampling to llama-sampler (#19363)	2026-02-06 07:26:54 +01:00
llama-vocab.cpp	model : add sarvam_moe architecture support (#20275)	2026-05-09 16:31:50 +02:00
llama-vocab.h	model : add sarvam_moe architecture support (#20275)	2026-05-09 16:31:50 +02:00
llama.cpp	llama : add missing call to ggml_backend_load_all() (#22752)	2026-05-07 08:24:47 +03:00
unicode-data.cpp	server : better security control for public deployments (#9776)	2024-10-08 13:27:04 +02:00
unicode-data.h	llama : reduce compile time and binary size (#9712)	2024-10-02 15:49:55 +02:00
unicode.cpp	unicode,test: add Qwen3.5 non-backtracking tokenizer handler and regr… (#22110)	2026-05-14 11:03:40 +02:00
unicode.h	vocab: fix Gemma4 tokenizer (#21343)	2026-04-03 10:33:03 +02:00