koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-08 08:09:06 +00:00

Author	SHA1	Message	Date
Concedo	3210b378e8	better tool calls	2025-08-20 22:11:31 +08:00
Concedo	5a921a40f9	add overridenativecontext flag, stop nagging me	2025-08-14 22:54:45 +08:00
Concedo	4c1faf61b2	increment version (+1 squashed commits) Squashed commits: [6e5080ad2] increment version	2025-08-09 20:53:26 +08:00
Concedo	338b1fe97e	readjusted mistral and oai template, fixed compile issue on termux, updated lite, show generated token ids in debug mode	2025-08-07 21:14:48 +08:00
Concedo	34487d3c02	gpt oss harmony template	2025-08-06 11:39:40 +08:00
Concedo	e40d26b9e7	allow offloading moe to cpu with --moecpu	2025-08-05 23:42:42 +08:00
Concedo	428a07416a	cleanup some debug	2025-08-05 00:07:22 +08:00
Concedo	3284757b56	voxstral mini is really bad	2025-07-29 21:22:17 +08:00
Concedo	abf527a207	clearer multimodal capability display	2025-07-28 22:54:49 +08:00
Concedo	12a6088a65	added voxtral support, however without the magic token it hears audio as text	2025-07-28 22:35:59 +08:00
Concedo	b87864144b	no ctx shift for all mrope	2025-07-25 13:53:20 +08:00
Concedo	9f4d0f6ccf	fixed swa pp bug by retrying smaller batches	2025-07-21 23:34:22 +08:00
Concedo	6d50def409	default kv_unified to true, handle LLAMA_SET_ROWS.	2025-07-21 16:13:20 +08:00
Concedo	b028dd4e84	minor fixes	2025-07-18 13:22:59 +08:00
Concedo	f0564f9caf	updated lite, added better separators for multimodal chunks (universal)	2025-07-17 00:11:08 +08:00
Concedo	bc2877d2fe	test without g3n fix	2025-07-13 23:42:59 +08:00
Concedo	811463a704	split audio and vision detection separately	2025-07-13 17:47:15 +08:00
Concedo	dca49de059	fixed qwen2 audio issues, works fine now (+3 squashed commit) Squashed commit: [b3053a1ba] updated lite [5071630d6] fixed mtmd issues, audio works [06efa5af4] fix mtmd compile	2025-07-12 18:54:41 +08:00
Concedo	5a3b2e3921	fix for jamba models - they have recurrent layers like rwkv, so context shifting and forwarding wont work on them.	2025-07-12 18:54:40 +08:00
Concedo	e9473305d0	wip2 (+1 squashed commits) Squashed commits: [4628777b6] wip	2025-07-12 18:54:40 +08:00
Concedo	c45b8dc56f	fix for gemma3n	2025-07-10 17:39:08 +08:00
Reithan	0097de5c57	improve performance by actually applying nsigma's masking (#1602) merging, please report any issues.	2025-07-07 15:41:46 +08:00
Concedo	2e14338455	additional padding for the swa kv cache itself	2025-06-28 15:52:48 +08:00
Concedo	815d2056d9	gentoken reservations	2025-06-28 09:16:20 +08:00
Concedo	39b0699c71	fixed savestates with drafting	2025-06-27 20:35:38 +08:00
Reithan	54dde5e565	Add memoized cache to `llama_grammar_reject_candidates_for_stack` (#1615) * Add memoized cache to llama_grammar_reject_candidates_for_stack * make size cutoff more aggressive and move to outer branch * update comment * add cache reset whenever grammar is reloaded * remove explicit reference types for compiler transportability	2025-06-25 19:22:19 +08:00
Concedo	65ff041827	added more perf stats	2025-06-21 12:12:28 +08:00
Reithan	f07434f4c1	streamline grammar sampler to speed up generation while using heavy grammar (#1606)	2025-06-17 23:04:59 +08:00
Concedo	c494525b33	update deprecated apis	2025-06-13 22:21:15 +08:00
Reithan	f1c9db4174	fix-loss-of-destroyed-tokens-in-grammar-pre-pass (#1600)	2025-06-13 18:46:38 +08:00
Concedo	5bac0fb3d5	remove debug prints for now, they were kind of cluttered	2025-06-13 16:00:23 +08:00
Reithan	5af9138ebe	Improve GNBF performance by attempting culled grammar search first (#1597) * cull tokens with top_3k first before running grammar, fallback to unculled if none found * fix errors * fix improvement and test against concedo's GBNF * revert non-culling changes	2025-06-13 15:57:27 +08:00
Concedo	1cbe716e45	allow setting maingpu	2025-06-12 17:53:43 +08:00
Concedo	f6bbc350f2	various qol fixes	2025-06-05 10:26:02 +08:00
Concedo	736030bb9f	save and load state upgraded to 3 available states	2025-06-04 22:09:40 +08:00
Concedo	53f1511396	use a static buffer for kv reloads instead. also, added into lite ui	2025-06-03 22:32:46 +08:00
Concedo	4b57108508	Save KV State and Load KV State to memory added. GUI not yet updated	2025-06-03 17:46:29 +08:00
Concedo	6ce85c54d6	not working correctly	2025-06-02 22:12:10 +08:00
Concedo	8e1ebc55b5	dropped support for lora base as upstream no longer uses it. If provided it will be silently ignored	2025-06-02 12:49:53 +08:00
Concedo	51dc1cf920	added scale for text lora	2025-06-02 00:13:42 +08:00
Concedo	0c108f6054	Merge commit '34b7c0439ed0f98575cc4689dfecd98991dee8be' into concedo_experimental # Conflicts: # ggml/CMakeLists.txt # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-sycl/element_wise.cpp # ggml/src/ggml-sycl/element_wise.hpp # ggml/src/ggml-sycl/ggml-sycl.cpp # scripts/sync-ggml.last # src/CMakeLists.txt # tools/mtmd/clip.cpp	2025-05-31 12:27:45 +08:00
Concedo	f97bbdde00	fix to allow all EOGs to trigger a stop, occam's glm4 fix,	2025-05-24 22:55:11 +08:00
Concedo	ec04115ae9	swa options now available	2025-05-24 11:50:37 +08:00
Concedo	c4df151298	experimental swa flag	2025-05-23 21:33:26 +08:00
Concedo	69b5d4d4af	cursed hack for glm4, may or may not be better	2025-05-22 22:40:37 +08:00
Concedo	f125e724eb	fix off-by-one npast during some instances of fast forwarding	2025-05-22 19:51:21 +08:00
Concedo	f10574e598	debug text	2025-05-22 14:22:01 +08:00
Concedo	9f976e9c65	swa full used unless ctx shift and fast forward disabled	2025-05-21 22:47:45 +08:00
Concedo	3fefb3bdf2	Merge commit 'f0adb80bf7c2c0d80abb04f4533b5513622d9964' into concedo_experimental # Conflicts: # docs/backend/CANN.md # docs/backend/SYCL.md # docs/docker.md # examples/sycl/run-llama2.sh # examples/sycl/win-run-llama2.bat # ggml/src/ggml-sycl/ggml-sycl.cpp # tools/llama-bench/README.md	2025-05-21 19:10:57 +08:00
Concedo	8b6dfbd1be	disabling the gMask prefix for glm-4 completions	2025-05-21 17:29:24 +08:00


437 commits