Commit graph

365 commits

Author SHA1 Message Date
Concedo
6b6597ebf1 allow for single token prompt processing (actual batch size 1) 2025-04-25 16:54:46 +08:00
Concedo
28a2723100 merged pixtral support, not fully working 2025-04-24 15:27:02 +08:00
Concedo
9cd6a1add2 allow mmproj to be run on cpu 2025-04-21 21:03:10 +08:00
Concedo
2ed6850c0b added override tensor 2025-04-20 20:56:17 +08:00
Concedo
c67510718e kv override option (+1 squashed commits)
Squashed commits:

[e615fc01] kv override option
2025-04-17 14:22:30 +08:00
Concedo
93a226d9e4 added prefix for llava, reverted system role in template as it degraded gemma3. truncated debug logs 2025-04-05 18:06:41 +08:00
Concedo
b3143384b4 larger warmup batch 2025-04-05 10:57:04 +08:00
Concedo
61a73347c6 fixed mrope for multiple images in qwen2vl (+1 squashed commits)
Squashed commits:

[63e4d91c] fixed mrope for multiple images in qwen2vl (+1 squashed commits)

Squashed commits:

[bb78db1e] wip fixing mrope
2025-03-30 17:23:58 +08:00
Concedo
6a709be50a replace deprecated 2025-03-27 10:27:20 +08:00
Concedo
e84596ec1a add config for default gen tokens and bos toggle 2025-03-15 19:53:06 +08:00
Concedo
4212f0b8e8 wip on multiple fixes 2025-03-15 10:50:36 +08:00
Concedo
6a1dd57435 gemma3 template, updated lite, fixed tool calling, reenable ctx shift for gemma3 2025-03-14 17:47:01 +08:00
Concedo
0db4ae6237 traded my ink for a pen 2025-03-14 11:58:15 +08:00
Concedo
52cf1ded0c remove unwanted print 2025-03-14 00:24:28 +08:00
Concedo
0460d92cc3 disable context shifting for gemma3 2025-03-13 20:28:26 +08:00
Concedo
e75539e8cb too many issues without BOS (+1 squashed commits)
Squashed commits:

[7138d941] only print bos alert in debug
2025-03-13 16:48:29 +08:00
Concedo
1ef41c2124 streamline output console log (+1 squashed commits)
Squashed commits:

[ca474bdd] streamline output console log
2025-03-13 15:33:49 +08:00
Concedo
77debb1b1b gemma3 vision works, but is using more tokens than expected - may need resizing 2025-03-13 00:31:16 +08:00
Concedo
eb1809c105 add more perf stats 2025-03-12 18:58:27 +08:00
Concedo
b0541f3652 added draft results 2025-03-10 22:03:20 +08:00
Concedo
72bc855e8a honor add bos token settings from metadata 2025-03-07 22:10:50 +08:00
Concedo
6b7d2349a7 Rewrite history to fix bad vulkan shader commits without increasing repo size
added dpe colab (+8 squashed commit)

Squashed commit:

[b8362da4] updated lite

[ed6c037d] move nsigma into the regular sampler stack

[ac5f61c6] relative filepath fixed

[05fe96ab] export template

[ed0a5a3e] nix_example.md: refactor (#1401)

* nix_example.md: add override example

* nix_example.md: drop graphics example, already basic nixos knowledge

* nix_example.md: format

* nix_example.md: Vulkan is disabled on macOS

Disabled in: 1ccd253acc

* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}

Fixes: https://github.com/LostRuins/koboldcpp/issues/1367

[675c62f7] AutoGuess: Phi 4 (mini) (#1402)

[4bf56982] phrasing

[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)

- place after nsigma and before xtc (+3 squashed commit)

Squashed commit:

[87c52b97] disable VMM from HIP

[ee8906f3] edit description

[e85c0e69] Remove Unnecessary Rep Counting (#1394)

* stop counting reps

* fix range-based initializer

* strike that - reverse it
2025-03-05 00:02:20 +08:00
Reithan
62cd9bb0b2 use range neq zero instead of lt (#1388) 2025-02-24 18:47:19 +08:00
Concedo
f2ac10c014 added nsigma to lite 2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660 add top n sigma sampler from llama.cpp (#1384)
* Add N Sigma Sampler

* update nsigma sampler chain

* xtc position fix

* remove stray newline

---------

Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00
Concedo
6d7ef10671 Merge branch 'upstream' into concedo_experimental
Re-enable qwen2vl GPU for vulkan https://github.com/ggml-org/llama.cpp/pull/11902

# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/docker.yml
#	.gitignore
#	CONTRIBUTING.md
#	Makefile
#	common/CMakeLists.txt
#	common/arg.cpp
#	common/common.cpp
#	examples/main/main.cpp
#	examples/run/run.cpp
#	examples/server/tests/README.md
#	ggml/src/ggml-cuda/mma.cuh
#	scripts/get_chat_template.py
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
2025-02-20 23:17:20 +08:00
Concedo
b162c25a5e fixed moe experts to use detected arch for key 2025-02-10 17:46:08 +08:00
Concedo
d22eca6c47 fix potential crash in autoguess 2025-02-09 12:33:28 +08:00
Concedo
e68a3cf1dc fixed some functions when no model is loaded 2025-02-08 11:15:26 +08:00
Concedo
8fef9f3fb5 reloading is working correctly. 2025-02-06 22:24:18 +08:00
Concedo
fd84b062f9 allow reuse of clip embds 2025-01-30 19:02:45 +08:00
Concedo
f4e2f4b069 disable context shift when using mrope 2025-01-30 00:36:05 +08:00
Concedo
70f1d8d746 vision can set max res (+1 squashed commits)
Squashed commits:

[938fc655] vision can set max res
2025-01-30 00:19:49 +08:00
Concedo
0e45d3bb7a quiet flags now set at load time 2025-01-25 16:46:56 +08:00
Concedo
cca4a934dd fix for chat templates and drafting 2025-01-23 11:49:40 +08:00
Concedo
0e74db7fd4 fixed another tts bug, clblast selection and quiet mode 2025-01-22 21:36:13 +08:00
Concedo
2a00ee8fa8 broken commit 2025-01-16 21:41:18 +08:00
Concedo
b3de1598e7 Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS
tts is functional (+6 squashed commit)

Squashed commit:

[22396311] wip tts

[3a883027] tts not yet working

[0dcfab0e] fix silly bug

[a378d9ef] some long overdue cleanup

[fc5a6fb5] Wip tts

[39f50497] wip TTS integration
2025-01-13 14:23:25 +08:00
Nexes the Elder
3e6ef8e0ef Probable typo (#1287) 2024-12-26 11:51:04 +08:00
Concedo
10d4fc637d fixed a bug with drafting tokens 2024-12-23 11:36:08 +08:00
Concedo
fd5100c382 fix for query param 2024-12-21 10:41:25 +08:00
Concedo
4c56b7cada Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	README.md
#	examples/gbnf-validator/gbnf-validator.cpp
#	examples/llava/clip.cpp
#	examples/run/README.md
#	examples/run/run.cpp
#	examples/server/README.md
#	ggml/src/ggml-cpu/CMakeLists.txt
#	src/llama.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-llama-grammar.cpp
2024-12-21 09:41:49 +08:00
Concedo
b7d3274523 temporarily make qwen2vl use clip on cpu for vulkan and macos 2024-12-21 09:15:31 +08:00
Concedo
bc297da91e remove unused function 2024-12-16 11:39:52 +08:00
Concedo
00d154b32b wip on qwen2vl integration, updated msvc runtimes 2024-12-15 23:58:02 +08:00
Concedo
60cd68a39d draft model sets gpu split instead of id, made mmq default for cli 2024-12-14 23:58:45 +08:00
Concedo
595cc6975f added new flags --moeexperts --failsafe --draftgpulayers and --draftgpuid 2024-12-13 17:11:59 +08:00
Concedo
00a686fc72 fixed fast forwarding context corruption after abort during prompt processing 2024-12-10 22:37:40 +08:00
Concedo
5106816eac drafted tokens debug prints 2024-12-05 17:05:20 +08:00
Concedo
e93c2427b4 allow incompatible vocab in debugmode 2024-12-01 14:11:03 +08:00