koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-04 22:50:50 +00:00

Author	SHA1	Message	Date
henk717	4e30294cb1	Henk's Gemma4 31B Magic (#2096)	2026-04-06 18:49:19 +08:00
Concedo	6c937c05d9	improve ncmoe / moecpu regex	2026-04-04 23:53:13 +08:00
Concedo	db8bc40731	add some warnings if shifting fails	2026-04-04 23:16:26 +08:00
Concedo	eb3422996a	BOS fix for gemma4	2026-04-04 22:15:01 +08:00
Concedo	97f785efce	ensure BOS on vision prefix	2026-04-03 16:20:36 +08:00
Concedo	e8cffa37c8	fixed gemma4v image crashing on encode, however images are not yet working correctly	2026-04-03 15:56:35 +08:00
Concedo	0c2b679ea3	support bf16 quantkv cache type	2026-03-28 00:01:17 +08:00
Concedo	c91f350ed5	increase max images, take images from the end instead of beginning if too many images	2026-03-26 23:03:52 +08:00
Concedo	993925ba96	gracefully handle bad grammar instead of crashing	2026-03-23 17:00:53 +08:00
Concedo	07327b6c10	double n_batch size when pipeline parallel is enabled, keep u_batch the same	2026-03-21 11:22:10 +08:00
Concedo	3113e3a643	move main device print	2026-03-21 10:47:21 +08:00
Concedo	f579939057	updated lite, change smartcache snapshot behavior to conserve slots	2026-03-15 15:15:39 +08:00
Concedo	fcdf2f40d5	no need snapshot after gen is complete.	2026-03-15 12:34:48 +08:00
Concedo	33211e6edf	timing measure fixes	2026-03-15 12:23:14 +08:00
Concedo	500a1ab466	disable smartcache if slots is zero	2026-03-10 08:57:31 +08:00
Concedo	0df18d2ae2	fixed single token bans	2026-03-07 22:50:53 +08:00
JustCommitRandomness	2fbc3b2ae5	Adjust int types in format strings (#2009) * tweak format sting types This may not be all of them, but it's the ones which warn on OpenBSD * complete the changes needed to fix the format string specifers * avoid using inttypes, directly cast to size_t (u64 usually) instead --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2026-03-06 19:06:18 +08:00
Concedo	e36d7b6464	warn about RNN models not supporting antislop	2026-03-06 14:02:51 +08:00
Concedo	4f1b22c415	kv snapshots save and load last logits for correctness. added some text for musicui, updated docs	2026-03-04 21:57:28 +08:00
Concedo	54cf43ae64	rnn fix adjust	2026-03-04 10:59:51 +08:00
Concedo	707f7b37bf	optimize pp	2026-03-03 21:02:51 +08:00
Concedo	ae67caa2f7	ace qwen rep pen for codes	2026-03-02 21:18:06 +08:00
Concedo	d904b51b0f	adjust slot counts	2026-03-02 15:56:15 +08:00
Concedo	42134db6b4	finally fixed smartcache for qwen	2026-03-02 00:47:38 +08:00
Concedo	0b76f73fc2	smartcache bug seems to be fixed	2026-02-28 18:08:54 +08:00
Concedo	dd08d675f2	incomplete fix for rnn models, load state works but logits slightly different	2026-02-28 11:52:24 +08:00
Concedo	72f7e01b27	Merge commit '01d8eaa28d' into concedo_experimental # Conflicts: # build-xcframework.sh # scripts/sync_vendor.py # tests/test-backend-ops.cpp # tools/mtmd/CMakeLists.txt # tools/rpc/rpc-server.cpp	2026-02-16 15:36:59 +08:00
Concedo	9258d91b70	try initi rocblas before autofit	2026-02-11 22:29:47 +08:00
Concedo	e6d271db05	fixed typo	2026-02-09 17:12:03 +08:00
Concedo	3d3f02ef4a	revert layers if fail	2026-02-08 12:58:05 +08:00
Concedo	6bfbb5b283	allow autofit logspam in debugmode	2026-02-08 01:00:41 +08:00
Concedo	5cf21443bc	added autofit padding. autofit is now in the quick menu	2026-02-07 18:29:30 +08:00
Concedo	812da8b75d	fix autofit spamming	2026-02-07 18:01:01 +08:00
Concedo	349c461453	add stop reason for error	2026-02-04 20:23:18 +08:00
Concedo	226c79338f	handle glm4.7 flash template	2026-01-28 23:29:08 +08:00
Concedo	5c6cc02985	remove clblast, part 2	2026-01-23 14:09:46 +08:00
Concedo	3816391a74	increase logprobs returned to 10	2026-01-18 11:13:42 +08:00
Concedo	62bea5ef4f	allow overriding the devices directly	2026-01-17 19:08:06 +08:00
Concedo	983baac46b	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/vulkan.Dockerfile # .github/workflows/build.yml # ci/run.sh # examples/model-conversion/Makefile # examples/model-conversion/README.md # examples/model-conversion/scripts/causal/compare-logits.py # examples/model-conversion/scripts/embedding/run-converted-model.sh # examples/model-conversion/scripts/utils/common.py # examples/model-conversion/scripts/utils/semantic_check.py # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cuda/CMakeLists.txt # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # scripts/pr2wt.sh # scripts/sync_vendor.py # tests/test-arg-parser.cpp	2026-01-09 01:23:10 +08:00
Concedo	d8942cde14	smartcache allow custom number of slots	2026-01-02 17:19:40 +08:00
Concedo	bfa2ae7744	fixed smartcache bug when used with images	2026-01-02 00:35:05 +08:00
Concedo	51edb6ae61	allow clip fa for anything besides cuda on gpu	2026-01-01 21:09:51 +08:00
Concedo	54e419f587	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/docker.yml # docs/ops.md # docs/ops/Metal.csv # ggml/CMakeLists.txt # ggml/src/ggml-sycl/CMakeLists.txt # grammars/README.md # models/templates/llama-cpp-deepseek-r1.jinja # scripts/sync-ggml.last # tests/test-chat.cpp	2026-01-01 15:34:10 +08:00
Concedo	76ef726ec8	adaptive p sharpness to 10.0f	2025-12-31 17:28:30 +08:00
Concedo	0e26e4d354	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/ISSUE_TEMPLATE/010-bug-compilation.yml # .github/ISSUE_TEMPLATE/011-bug-results.yml # .github/ISSUE_TEMPLATE/019-bug-misc.yml # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # ggml/src/ggml-cuda/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-rpc/ggml-rpc.cpp	2025-12-28 23:47:55 +08:00
Concedo	21d801f6d5	init total weight for adaptive p	2025-12-28 15:33:06 +08:00
Concedo	27261bfc26	adaptive decay as an overridable param (+1 squashed commits) Squashed commits: [d94df7843] adaptive decay as an overridable param	2025-12-28 13:34:20 +08:00
Concedo	6548645aaa	rename power law sampler to adaptive p	2025-12-27 17:50:58 +08:00
Concedo	9bb362cce9	revised power law sampling	2025-12-27 10:59:46 +08:00
Concedo	91d8863f18	power law sampler added	2025-12-27 09:46:06 +08:00

527 commits