koboldcpp/src
Daniel Bevenius baf3cc6e1d
model : clarify MTP layer comment in qwen35.cpp [no ci] (#23338)
This commit attempts to clarify a code comment in graph_mtp regarding
where the MTP layer is stored.

The motivation for this is that it was not obvious to me what the
original comment meant and hopefully this makes it clearer.
2026-05-19 18:41:44 +02:00
models model : clarify MTP layer comment in qwen35.cpp [no ci] (#23338) 2026-05-19 18:41:44 +02:00
CMakeLists.txt cmake: use glob to collect src/models sources (#22005) 2026-04-16 23:25:16 +02:00
llama-adapter.cpp fix: correct misspellings in code comments (#21217) 2026-03-31 13:50:51 +02:00
llama-adapter.h llama : re-enable manual LoRA adapter free (#19983) 2026-03-18 12:03:26 +02:00
llama-arch.cpp llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-arch.h llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-batch.cpp kv-cache : fix M-RoPE checkpoints (#20132) 2026-03-06 08:46:51 +02:00
llama-batch.h fix: correct misspellings in code comments (#21217) 2026-03-31 13:50:51 +02:00
llama-chat.cpp model : add HunyuanOCR support (#21395) 2026-04-05 23:32:14 +02:00
llama-chat.h model : add HunyuanOCR support (#21395) 2026-04-05 23:32:14 +02:00
llama-context.cpp llama: initialize pre-norm embedding mask flag (#23256) 2026-05-18 14:20:49 +03:00
llama-context.h llama: avoid copying logits during prompt decode in MTP (#23198) 2026-05-17 23:30:25 +08:00
llama-cparams.cpp cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188) 2025-06-15 10:08:58 +03:00
llama-cparams.h llama: avoid copying logits during prompt decode in MTP (#23198) 2026-05-17 23:30:25 +08:00
llama-ext.h llama: avoid copying logits during prompt decode in MTP (#23198) 2026-05-17 23:30:25 +08:00
llama-grammar.cpp common/grammar: fix grammar parsing issues to prevent stack overflow and hangs (#18604) 2026-03-21 18:43:35 +01:00
llama-grammar.h common/grammar : replace problematic backtracking regex [\s\S]* (#18342) 2026-01-03 16:02:43 -06:00
llama-graph.cpp llama: avoid copying logits during prompt decode in MTP (#23198) 2026-05-17 23:30:25 +08:00
llama-graph.h llama : MTP clean-up (#23269) 2026-05-19 15:32:58 +03:00
llama-hparams.cpp llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-hparams.h llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-impl.cpp llama : correct platform-independent loading of BOOL metadata (#21428) 2026-04-06 01:40:38 +02:00
llama-impl.h llama : enable chunked fused GDN path (#20340) 2026-03-11 22:46:40 +02:00
llama-io.cpp server : avoid checkpoint data host copies (#22558) 2026-05-02 18:03:25 +03:00
llama-io.h llama : add option to save memory in device buffers (#22679) 2026-05-05 06:35:07 +03:00
llama-kv-cache-iswa.cpp (revert) kv-cache : do not quantize SWA KV cache (#21332) 2026-04-03 09:07:01 +03:00
llama-kv-cache-iswa.h llama: print memory breakdown on exit (#15860) 2025-09-24 16:53:48 +02:00
llama-kv-cache.cpp ggml : implement fast walsh-hadamard transform for kv rotation (#21352) (#22631) 2026-05-05 10:05:05 +08:00
llama-kv-cache.h kv-cache : support attention rotation for heterogeneous iSWA (#21513) 2026-04-07 20:31:28 +03:00
llama-kv-cells.h llama: store mrope data in KV cell (#16825) 2025-10-29 18:09:18 +01:00
llama-memory-hybrid-iswa.cpp llama : MTP clean-up (#23269) 2026-05-19 15:32:58 +03:00
llama-memory-hybrid-iswa.h llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-memory-hybrid.cpp llama : MTP clean-up (#23269) 2026-05-19 15:32:58 +03:00
llama-memory-hybrid.h llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-memory-recurrent.cpp llama : MTP clean-up (#23269) 2026-05-19 15:32:58 +03:00
llama-memory-recurrent.h llama : MTP clean-up (#23269) 2026-05-19 15:32:58 +03:00
llama-memory.cpp memory : correctly handle failure in apply() (#14438) 2025-06-30 18:03:03 +03:00
llama-memory.h llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-mmap.cpp Update llama-mmap to use ftello/fseeko (#22497) 2026-04-30 14:17:52 -07:00
llama-mmap.h llama: fix llama-model-saver (#20503) 2026-03-25 12:53:16 +02:00
llama-model-loader.cpp llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-model-loader.h llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-model-saver.cpp model : NvFP4 quantized LM head support (#23046) 2026-05-16 11:09:27 +02:00
llama-model-saver.h llama: fix llama-model-saver (#20503) 2026-03-25 12:53:16 +02:00
llama-model.cpp llama + spec: MTP Support (#22673) 2026-05-16 20:06:23 +08:00
llama-model.h model : NvFP4 quantized LM head support (#23046) 2026-05-16 11:09:27 +02:00
llama-quant.cpp model: move load_hparams and load_tensors to per-model definition (#22004) 2026-05-04 12:36:59 +02:00
llama-quant.h llama : refactor src/llama.cpp (#10902) 2025-01-03 10:18:53 +02:00
llama-sampler.cpp llama : rename llama-sampling to llama-sampler (#19363) 2026-02-06 07:26:54 +01:00
llama-sampler.h llama : rename llama-sampling to llama-sampler (#19363) 2026-02-06 07:26:54 +01:00
llama-vocab.cpp model : add sarvam_moe architecture support (#20275) 2026-05-09 16:31:50 +02:00
llama-vocab.h model : add sarvam_moe architecture support (#20275) 2026-05-09 16:31:50 +02:00
llama.cpp llama : add missing call to ggml_backend_load_all() (#22752) 2026-05-07 08:24:47 +03:00
unicode-data.cpp server : better security control for public deployments (#9776) 2024-10-08 13:27:04 +02:00
unicode-data.h llama : reduce compile time and binary size (#9712) 2024-10-02 15:49:55 +02:00
unicode.cpp unicode,test: add Qwen3.5 non-backtracking tokenizer handler and regr… (#22110) 2026-05-14 11:03:40 +02:00
unicode.h vocab: fix Gemma4 tokenizer (#21343) 2026-04-03 10:33:03 +02:00