koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-11 09:34:37 +00:00

Author	SHA1	Message	Date
Concedo	e9473305d0	wip2 (+1 squashed commits) Squashed commits: [4628777b6] wip	2025-07-12 18:54:40 +08:00
Wagner Bruna	d74c16e6e0	enable flash attention for image generation (#1633)	2025-07-05 11:20:51 +08:00
Concedo	4ec0e0fd21	now accept multiple images for reference images	2025-06-28 17:30:28 +08:00
Concedo	0bd648ffa4	photomaker renamed to extra image to handle future extension	2025-06-28 10:26:06 +08:00
Concedo	65ff041827	added more perf stats	2025-06-21 12:12:28 +08:00
Wagner Bruna	08adfb53c9	Configurable VAE threshold limit (#1601) * add backend support for changing the VAE tiling threshold * trigger VAE tiling by image area instead of dimensions I've tested with GGML_VULKAN_MEMORY_DEBUG all resolutions with the same 768x768 area (even extremes like 64x9216), and many below that: all consistently allocate 6656 bytes per image pixel. As tiling is primarily useful to avoid excessive memory usage, it seems reasonable to enable VAE tiling based on area rather than maximum image side. However, as there is currently no user interface option to change it back to a lower value, it's best to maintain the default behavior for now. * replace the notile option with a configurable threshold This allows selecting a lower threshold value, reducing the peak memory usage. The legacy sdnotile parameter gets automatically converted to the new parameter, if it's the only one supplied. * simplify tiling checks, 768 default visible in launcher --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2025-06-21 10:14:57 +08:00
Concedo	4e40f2aaf4	added photomaker face cloning	2025-06-20 21:33:36 +08:00
Concedo	21881a861d	rename restrict square to sdclampedsoft	2025-06-20 15:39:55 +08:00
Wagner Bruna	f6d2d1ce5c	configurable resolution limit (#1586) * refactor image gen configuration screen * make image size limit configurable * fix resolution limits and keep dimensions closer to the original ratio * use 0.0 for the configured default image size limit This prevents the current default value from being saved into the config files, in case we later decide to adopt a different value. * export image model version when loading * restore model-specific default image size limit * change the image area restriction to be specified by a square side * move image resolution limits down to the C++ level * Revert "export image model version when loading" This reverts commit `fa65b23de3`. * Linting Fixes: PY: - Inconsistent var name sd_restrict_square -> sd_restrict_square_var - GUI swap back to using absolute row numbers for now. - fstring fix - size_limit -> side_limit inconsistency C++: - roundup_64 standalone function - refactor sd_fix_resolution variable names for clarity - move "anti crashing" hard total megapixel limit always to be applied after soft total megapixel limit instead of conditionally only when sd_restrict_square is unset * allow unsafe resolutions if debugmode is on --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2025-06-13 20:05:20 +08:00
Concedo	1cbe716e45	allow setting maingpu	2025-06-12 17:53:43 +08:00
Concedo	7d8aa31f1f	fixed embeddings, added new parameter to limit max embeddings context	2025-06-10 01:11:55 +08:00
Concedo	cfcdfd69bd	allow embeddings models to use mmap	2025-06-07 10:14:00 +08:00
Concedo	6ce85c54d6	not working correctly	2025-06-02 22:12:10 +08:00
Concedo	8e1ebc55b5	dropped support for lora base as upstream no longer uses it. If provided it will be silently ignored	2025-06-02 12:49:53 +08:00
Concedo	51dc1cf920	added scale for text lora	2025-06-02 00:13:42 +08:00
Concedo	c4df151298	experimental swa flag	2025-05-23 21:33:26 +08:00
Concedo	38a8778f24	wip cfg scale	2025-05-06 23:06:25 +08:00
Concedo	13cee48740	embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits) Squashed commits: [b9b695217] embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits) Squashed commits: [90b5d389d] embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits) Squashed commits: [fbbaa989f] embed aria2c for windows	2025-05-06 18:56:02 +08:00
Concedo	f59b5eb561	added toggle for guidance	2025-05-05 22:21:46 +08:00
Concedo	9cd6a1add2	allow mmproj to be run on cpu	2025-04-21 21:03:10 +08:00
Concedo	2ed6850c0b	added override tensor	2025-04-20 20:56:17 +08:00
Concedo	c67510718e	kv override option (+1 squashed commits) Squashed commits: [e615fc01] kv override option	2025-04-17 14:22:30 +08:00
Concedo	27f575dc83	inpainting support completed, invert mask added	2025-04-09 23:50:17 +08:00
Concedo	23339ace9b	inpainting works in kcpp!	2025-04-09 23:01:05 +08:00
Concedo	e37f27632f	clear cpu flag manually for templates, added truncation for embeddings	2025-04-02 00:18:30 +08:00
Concedo	2bdf1dacff	embeddings done	2025-03-25 22:41:46 +08:00
Concedo	3992fb79cc	wip adding embeddings support	2025-03-24 18:01:23 +08:00
Concedo	c1e58419c7	support for voice cloning is done (+2 squashed commit) Squashed commit: [e7301628] support for voice cloning is done [1653c576] wip adding voice cloning	2025-03-21 22:28:59 +08:00
Concedo	e84596ec1a	add config for default gen tokens and bos toggle	2025-03-15 19:53:06 +08:00
Concedo	eb1809c105	add more perf stats	2025-03-12 18:58:27 +08:00
Concedo	f2ac10c014	added nsigma to lite	2025-02-21 15:11:24 +08:00
EquinoxPsychosis	2740af3660	add top n sigma sampler from llama.cpp (#1384) * Add N Sigma Sampler * update nsigma sampler chain * xtc position fix * remove stray newline --------- Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>	2025-02-21 14:31:42 +08:00
Concedo	71016db617	remove tts audio caching	2025-02-12 11:37:43 +08:00
Concedo	70f1d8d746	vision can set max res (+1 squashed commits) Squashed commits: [938fc655] vision can set max res	2025-01-30 00:19:49 +08:00
Concedo	558bc5c901	tts can now set a length limit	2025-01-28 22:06:59 +08:00
Concedo	0e45d3bb7a	quiet flags now set at load time	2025-01-25 16:46:56 +08:00
Concedo	fa7e661133	various fixes	2025-01-18 23:52:39 +08:00
Concedo	e8570de0e6	improved tts default voices quality and sample rate	2025-01-17 18:45:16 +08:00
Concedo	8e3cad1aa2	added audio caching, as a hacky fix for ST TTS bug	2025-01-16 12:04:58 +08:00
Concedo	b3de1598e7	Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS tts is functional (+6 squashed commit) Squashed commit: [22396311] wip tts [3a883027] tts not yet working [0dcfab0e] fix silly bug [a378d9ef] some long overdue cleanup [fc5a6fb5] Wip tts [39f50497] wip TTS integration	2025-01-13 14:23:25 +08:00
Concedo	91b6e29af3	added multilingual support for whisper	2025-01-09 23:28:52 +08:00
Concedo	0cb599546e	increase max supported llava images to 8	2025-01-09 22:12:06 +08:00
Concedo	568e476997	added toggle for vae tiling, use custom memory buffer	2025-01-08 13:12:03 +08:00
Concedo	60cd68a39d	draft model sets gpu split instead of id, made mmq default for cli	2024-12-14 23:58:45 +08:00
Concedo	595cc6975f	added new flags --moeexperts --failsafe --draftgpulayers and --draftgpuid	2024-12-13 17:11:59 +08:00
Concedo	e9d2332dd8	improved tool calls and whisper	2024-12-06 14:34:31 +08:00
Concedo	32ac3153e4	default speculative set to 8. added more adapter fields	2024-11-30 16:18:27 +08:00
Concedo	e0c59486ee	default to 12 tokens drafted	2024-11-30 11:52:07 +08:00
Concedo	b21d0fe3ac	customizable speculative size	2024-11-30 11:28:19 +08:00
Concedo	f75bbb945f	speculative decoding initial impl completed (+6 squashed commit) Squashed commit: [0a6306ca0] draft wip dont use (will be squashed) [a758a1c9c] wip dont use (will be squashed) [e1994d3ce] wip dont use [f59690d68] wip [77228147d] wip on spec decoding. dont use yet [2445bca54] wip adding speculative decoding (+1 squashed commits) Squashed commits: [50e341bb7] wip adding speculative decoding	2024-11-30 10:41:10 +08:00
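
Note on commit 08adfb53c9 (Configurable VAE threshold limit): the message argues for triggering VAE tiling by image area rather than by the longest side, because the measured VAE memory use was a constant 6656 bytes per image pixel. The following is a minimal Python sketch of that reasoning only, not the code from the commit. The function names, the strict `>` comparisons, and the use of a square side as the threshold are assumptions made for illustration; the 6656 figure is the commit author's Vulkan measurement and may not hold on other backends.

```python
# Illustrative sketch only; names and comparison operators are hypothetical.

# Measured by the commit author with GGML_VULKAN_MEMORY_DEBUG (assumed Vulkan-specific).
VAE_BYTES_PER_PIXEL = 6656


def estimated_vae_bytes(width: int, height: int) -> int:
    """Estimate VAE memory from pixel count alone, per the commit's measurement."""
    return width * height * VAE_BYTES_PER_PIXEL


def should_tile_by_side(width: int, height: int, threshold: int = 768) -> bool:
    """Side-based check: tile when the longest side exceeds the threshold."""
    return max(width, height) > threshold


def should_tile_by_area(width: int, height: int, threshold: int = 768) -> bool:
    """Area-based check: tile when the area exceeds a threshold x threshold square."""
    return width * height > threshold * threshold


if __name__ == "__main__":
    # 64x9216 has the same area as 768x768, so the memory estimate is identical,
    # yet only the side-based check would request tiling for it.
    for w, h in [(768, 768), (64, 9216), (1024, 1024)]:
        print(w, h, estimated_vae_bytes(w, h),
              should_tile_by_side(w, h), should_tile_by_area(w, h))
```

Under these assumptions the two checks disagree only for elongated images such as 64x9216, which is the extreme case the commit message mentions.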


155 commits