koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-12 14:11:27 +00:00

Author	SHA1	Message	Date
Concedo	3bda0bf102	passthrough mode without any gens	2026-03-22 23:09:08 +08:00
Concedo	f846c83a7a	pre-seed the tts so it can be shown	2026-03-22 10:36:42 +08:00
Concedo	fdfb713d91	added `--sdmaingpu` allowing image models to be independently placed on any gpu	2026-03-21 17:34:12 +08:00
Concedo	a3d3800f3e	added passthrough mode for esrgan upscale, triggered by img2img denoise 0.0 with 1 step	2026-03-21 16:19:10 +08:00
Wagner Bruna	51187d5362	sd: support changing preloaded LoRA multipliers (#2041) * sd: remove C++ support for enforcing fixed LoRA multipliers The logic at the Python level is enough. * sd: support changing preloaded LoRA multipliers We keep the same rules as before: - Any LoRA with multiplier 0 can be changed - If all LoRAs have multiplier != 0, they are fixed and optimized but tweak the corner case of LoRAs specified more than once to allow adjusting the multiplier if the same LoRA is also specified with a zero multiplier, as if they were two different LoRAs. So the following keeps working as before: - --sdlora /loras/lcm.gguf --sdloramult 1 : fixed as 1 - --sdlora /loras/lcm.gguf --sdloramult 0 : dynamic, default 0 - --sdlora /loras/ : dynamic, default 0 - --sdlora /loras/lcm.gguf /loras/lcm.gguf --sdloramult 1 1 : fixed as 2 But now we have: - --sdlora /loras/lcm.gguf /loras/lcm.gguf --sdloramult 1 0 : dynamic, default 1 - --sdlora /loras/lcm.gguf /loras/ --sdloramult 1 : dynamic, default 1	2026-03-17 10:09:55 +08:00
Wagner Bruna	6e7b9a1549	sd: sync to master-529-630ee03 (#2040)	2026-03-17 00:23:28 +08:00
Wagner Bruna	feea014774	sd: support for dynamic LoRA loading from a directory (#2036) * backend support for controlling LoRA cache and fixed multipliers The generation LoRA multipliers are now added to the initial multipliers, so e.g. a merged LCM model will behave the same as a normal model with a preloaded LCM LoRA. * frontend support	2026-03-16 20:39:21 +08:00
Concedo	b88fc44d0e	add some debug prints	2026-03-16 16:27:49 +08:00
Concedo	2093ca4c73	ace step optimizations	2026-03-15 20:58:45 +08:00
Wagner Bruna	b437d18319	add support for cache modes to accelerate image generation (#2021) * sd: sync to master-525-d6dd6d7 * sd: add support for cache modes for inference acceleration * keep gendefaults as a JSON object inside the config file * covered more invalid cases on gendefaults parsing	2026-03-15 15:27:14 +08:00
Concedo	b1c500ae2b	Merge commit '2948e6049a' into concedo_experimental # Conflicts: # .github/workflows/build.yml # CONTRIBUTING.md # docs/backend/VirtGPU/development.md # docs/ops.md # docs/ops/WebGPU.csv # embd_res/templates/GigaChat3-10B-A1.8B.jinja # embd_res/templates/GigaChat3.1-10B-A1.8B.jinja # ggml/src/ggml-hip/CMakeLists.txt # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # scripts/sync_vendor.py # tests/CMakeLists.txt # tests/test-backend-ops.cpp # tests/test-chat.cpp # tests/test-grammar-integration.cpp # tests/test-quantize-fns.cpp	2026-03-15 11:21:24 +08:00
Concedo	22c78f6c82	fix q3tts compile, update docs and lite	2026-03-14 23:33:18 +08:00
Concedo	1d067933f0	claude fixes for ace step, idk man who am i to argue with an agi	2026-03-14 12:27:26 +08:00
Concedo	349fc744e9	cleanup, fixed a regression in music gen with codes due to instruct prompt change	2026-03-14 11:32:47 +08:00
Concedo	4189508ef3	qwen3tts support 1.7b model	2026-03-13 21:15:24 +08:00
Concedo	a13641c00c	tts loader fixes	2026-03-13 18:33:10 +08:00
Concedo	0a38237ff5	original qwen3tts files	2026-03-13 15:24:18 +08:00
Concedo	4427bab37e	cover mode is now working	2026-03-13 14:55:39 +08:00
Concedo	84734eb409	better audio runtime reload	2026-03-13 14:02:56 +08:00
Concedo	8f23b8d81e	wip on ref audio, but it compiles	2026-03-12 23:46:10 +08:00
Concedo	d5a4c17e14	mp3 not default	2026-03-12 21:42:59 +08:00
Concedo	3fd9648726	added mp3 support	2026-03-12 21:00:50 +08:00
Concedo	3092694d2e	better resampler	2026-03-12 16:49:53 +08:00
Concedo	318a5486ce	duration	2026-03-12 15:33:51 +08:00
Concedo	3cc6e2ea17	make stereo default	2026-03-12 00:10:25 +08:00
Concedo	211d4fe632	lots of tweaks for ace step	2026-03-11 23:57:52 +08:00
Concedo	ecc4865244	improves code output quality	2026-03-10 23:07:52 +08:00
Concedo	c8800ed16c	gcc path fix	2026-03-10 21:40:32 +08:00
Wagner Bruna	3f42ed1af7	support for customizing LoRA multipliers through the sdapi (#1982) * fix corner case in sd_oai_transform_params Also fix typo in the function name. * support for customizing loaded LoRA multipliers The `sdloramult` flag now accepts a list of multipliers, one for each LoRA. If all multipliers are non-zero, LoRAs load as before, with no extra VRAM usage or performance impact. If any LoRA has a multiplier of 0, we switch to `at_runtime` mode, and these LoRAs will be available to multiplier changes via the `lora` sdapi field and show up in the `sdapi/v1/loras` endpoint. All LoRAs are still preloaded on startup, and cached to avoid file reloads. If the list of multipliers is shorter than the list of LoRAs, the multiplier list is extended with the first multiplier (1.0 by default), to keep it compatible with the previous behavior. * support for `<lora:name:multiplier>` prompt syntax and metadata * add a few tests for sanitize_lora_multipliers	2026-03-10 21:29:39 +08:00
Concedo	ee96e71bae	don't resample audio	2026-03-09 22:53:55 +08:00
Concedo	45c74da08b	adjust ace step, still wip on caption rework	2026-03-09 00:11:48 +08:00
Wagner Bruna	9158bd8b4d	sd: sync to master-520-d950627 (#2006) * sd: sync to master-509-4cdfff5 * sd: Anima support * sd: sync to master-514-5792c66 * sd: additional workaround for Anima .safetensors model * sd: sync to master-517-ba35dd7 * sd: sync to master-520-d950627	2026-03-08 01:23:03 +08:00
Concedo	ebe44e7819	modify q3tts loader	2026-03-08 00:53:33 +08:00
JustCommitRandomness	2fbc3b2ae5	Adjust int types in format strings (#2009) * tweak format string types This may not be all of them, but it's the ones which warn on OpenBSD * complete the changes needed to fix the format string specifiers * avoid using inttypes, directly cast to size_t (u64 usually) instead --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2026-03-06 19:06:18 +08:00
JustCommitRandomness	389773070f	OpenBSD also needs alloca.h (#2012)	2026-03-05 12:32:31 +08:00
Concedo	8658af1018	qwen3tts default to cpu unless gpu selected	2026-03-05 11:11:46 +08:00
Concedo	4f1b22c415	kv snapshots save and load last logits for correctness. added some text for musicui, updated docs	2026-03-04 21:57:28 +08:00
Concedo	707f7b37bf	optimize pp	2026-03-03 21:02:51 +08:00
Concedo	ae67caa2f7	ace qwen rep pen for codes	2026-03-02 21:18:06 +08:00
Concedo	de9840afac	qwen image max ref image size fix from 512x512 to 1024x1024	2026-03-02 21:08:52 +08:00
Concedo	b632d2ce1c	print timestamp when image generated	2026-03-02 18:38:21 +08:00
Concedo	42134db6b4	finally fixed smartcache for qwen	2026-03-02 00:47:38 +08:00
Concedo	6c5a7a27af	clamp music duration	2026-03-01 01:15:26 +08:00
Wagner Bruna	5c40f07d4a	sd: sync to 0752cc9 (master-507-b314d80 +1) (#1999) * sd: sync to 0752cc9 (master-507-b314d80 +1) * sd: add flow-shift support to gendefaults	2026-02-28 12:22:32 +08:00
Concedo	d643d945f5	clamp music inference steps to 100 max	2026-02-28 12:12:50 +08:00
Concedo	14d82bb38e	allow music llm and diffusion gen models to be loaded independently	2026-02-27 21:56:48 +08:00
Concedo	19eb78844c	audio codes working	2026-02-27 21:23:00 +08:00
Concedo	ba42f22fc8	stereo is working	2026-02-27 20:36:44 +08:00
Concedo	5a57ed8ca4	revert to 8 step	2026-02-26 22:07:01 +08:00
Concedo	173702d1a4	music lowvram indicator	2026-02-26 21:30:47 +08:00

625 commits
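
The LoRA multiplier rules described in commits 3f42ed1af7 and 51187d5362 can be sketched roughly as follows. This is a minimal illustration of the rules as worded in those commit messages, not the project's actual code: the function names `pad_multipliers` and `summarize_loras` and the returned dictionary layout are hypothetical, directory arguments such as `/loras/` are not handled, and ignoring surplus multipliers is an assumption.

```python
def pad_multipliers(lora_paths, multipliers, default=1.0):
    """Extend a short multiplier list with its first entry.

    Hypothetical helper: mirrors the rule that a list shorter than the
    LoRA list is extended with the first multiplier (1.0 by default).
    """
    mults = [float(m) for m in multipliers]
    fill = mults[0] if mults else default
    while len(mults) < len(lora_paths):
        mults.append(fill)
    return mults


def summarize_loras(lora_paths, multipliers):
    """Group entries by path and report each LoRA's default and mode.

    Assumed reading of the commit text: repeated entries for the same
    file add up, and a LoRA is adjustable ("dynamic") when at least one
    of its entries has a zero multiplier; otherwise it is fixed.
    """
    summary = {}
    # zip() silently ignores surplus multipliers (an assumption).
    for path, mult in zip(lora_paths, pad_multipliers(lora_paths, multipliers)):
        entry = summary.setdefault(path, {"default": 0.0, "dynamic": False})
        entry["default"] += mult
        if mult == 0:
            entry["dynamic"] = True
    return summary


if __name__ == "__main__":
    lcm = "/loras/lcm.gguf"
    print(summarize_loras([lcm, lcm], [1, 1]))  # both entries non-zero
    print(summarize_loras([lcm, lcm], [1, 0]))  # one zero entry
```

Under these assumptions, the two calls at the bottom correspond to the "fixed as 2" and "dynamic, default 1" cases listed in commit 51187d5362.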