Commit graph

1324 commits

Author SHA1 Message Date
Wagner Bruna
796f7bdeff
sd: fix LoRA multiplier logic to switch to at_runtime mode (#2029)
`0. in inputs.lora_multipliers` didn't work because the C array has
variable length.

Also fixed a few corner cases related to the default multipliers
(mainly to ensure robustness against future changes, since in most
cases the multiplier list is already sanitized by a previous
function).
2026-03-12 15:36:51 +08:00
Concedo
3cc6e2ea17 make stereo default 2026-03-12 00:10:25 +08:00
Concedo
211d4fe632 lots of tweaks for ace step 2026-03-11 23:57:52 +08:00
Concedo
8095bf9807 include overhead from music models 2026-03-10 22:52:20 +08:00
Concedo
b06dd2606e ruff: linting 2026-03-10 21:32:36 +08:00
Wagner Bruna
3f42ed1af7
support for customizing LoRA multipliers through the sdapi (#1982)
* fix corner case in sd_oai_transform_params

Also fix typo in the function name.

* support for customizing loaded LoRA multipliers

The `sdloramult` flag now accepts a list of multipliers, one for each
LoRA. If all multipliers are non-zero, LoRAs load as before, with no extra
VRAM usage or performance impact.

If any LoRA has a multiplier of 0, we switch to `at_runtime` mode, and these
LoRAs will be available to multiplier changes via the `lora` sdapi field and
show up in the `sdapi/v1/loras` endpoint. All LoRAs are still preloaded on
startup, and cached to avoid file reloads.

If the list of multipliers is shorter than the list of LoRAs, the multiplier
list is extended with the first multiplier (1.0 by default), to keep it
compatible with the previous behavior.

* support for `<lora:name:multiplier>` prompt syntax and metadata

* add a few tests for sanitize_lora_multipliers
2026-03-10 21:29:39 +08:00
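
The multiplier rules described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `pad_multipliers`, `needs_runtime_mode`, `extract_lora_tags` and the `LORA_TAG` pattern are hypothetical names, and truncating an over-long multiplier list is an assumption the commit message does not state.

```python
import re
from typing import List, Tuple

# Hypothetical pattern for the `<lora:name:multiplier>` prompt syntax.
LORA_TAG = re.compile(r"<lora:([^:>]+):([-+]?\d*\.?\d+)>")


def pad_multipliers(multipliers: List[float], lora_count: int,
                    default: float = 1.0) -> List[float]:
    """Extend a short multiplier list with its first entry (1.0 if empty)."""
    first = multipliers[0] if multipliers else default
    # Assumption: extra multipliers beyond the number of LoRAs are dropped.
    padded = list(multipliers[:lora_count])
    padded.extend([first] * (lora_count - len(padded)))
    return padded


def needs_runtime_mode(multipliers: List[float]) -> bool:
    """A multiplier of 0 on any LoRA triggers the `at_runtime` mode."""
    return any(m == 0 for m in multipliers)


def extract_lora_tags(prompt: str) -> Tuple[str, List[Tuple[str, float]]]:
    """Return the prompt without LoRA tags, plus (name, multiplier) pairs."""
    tags = [(name, float(mult)) for name, mult in LORA_TAG.findall(prompt)]
    return LORA_TAG.sub("", prompt), tags
```

For example, `pad_multipliers([0.5], 3)` gives three entries of 0.5, which matches the "extended with the first multiplier" rule, and a list such as `[1.0, 0.0]` would select `at_runtime` mode.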
Concedo
eafb5ff4c5 autofit improvement e.g. for strix (+1 squashed commits)
Squashed commits:

[6f6fd59c3] autofit improvement e.g. for strix
2026-03-10 21:20:02 +08:00
Concedo
270d4ad2c1 fixed a typo 2026-03-08 12:56:08 +08:00
Concedo
73fc5c4767 handle jinja exceptions 2026-03-08 12:12:02 +08:00
Concedo
41df8b09e5 jinjatools now works mostly well 2026-03-08 11:55:22 +08:00
Concedo
2c38638b3d Merge commit '2afcdb9777' into concedo_experimental
# Conflicts:
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
2026-03-06 21:13:15 +08:00
Gustavo Rocha Dias
cbecc34667
Fix OAI-compatible token usage and unique request IDs (#2015)
* fix: token usage fix for mistral-vibe

* fix: generate unique request IDs for OAI-compatible responses

* fix: prompt_tokens reporting KV cache size instead of actual count during streaming

* fixes for PR #2015
For (1), this is not a good idea. If it returned 0 (e.g. during an error), this value may not be updated and will return the value of a previous or different request. It's better to return 0 in those cases.
For (2), this is a good idea but we don't need that level of randomness. I'll probably swap it with a 6 digit random number instead.
For (3), the official OpenAI spec gates it behind stream_options.include_usage = true, so I'll do that too.

* missed 1 item

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2026-03-06 20:57:22 +08:00
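
Points (2) and (3) of the review above could look something like the sketch below. It is only an illustration under stated assumptions: the helper names and the `chatcmpl-` prefix are hypothetical, and the request body is assumed to be an already-parsed JSON dict. Only the `stream_options.include_usage` field and the 6-digit random number come from the commit message.

```python
import random


def make_request_id(prefix: str = "chatcmpl-") -> str:
    """Build a response ID from a prefix plus a 6-digit random number.

    The prefix is an assumption; the commit only mentions the 6 digits.
    """
    return f"{prefix}{random.randint(100000, 999999)}"


def stream_wants_usage(request_body: dict) -> bool:
    """Report usage while streaming only if the client opted in.

    Mirrors the gate named in the commit: stream_options.include_usage = true.
    """
    options = request_body.get("stream_options") or {}
    return bool(request_body.get("stream")) and bool(options.get("include_usage"))
```

A streaming request without `stream_options` would then get no usage data, while `{"stream": True, "stream_options": {"include_usage": True}}` would.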
Concedo
8658af1018 qwen3tts default to cpu unless gpu selected 2026-03-05 11:11:46 +08:00
Concedo
5d35193749 fixed a sse stream issue 2026-03-03 21:30:28 +08:00
Concedo
7df210833e missed one case for autofit 2026-03-03 21:05:59 +08:00
Concedo
d7fb3df10a support 1 level deep admindir 2026-03-02 16:23:34 +08:00
Concedo
c9e651f7e5 updated lite, fix some cuda spams, fix qwen3tts voice loading 2026-03-01 00:41:56 +08:00
Wagner Bruna
5c40f07d4a
sd: sync to 0752cc9 (master-507-b314d80 +1) (#1999)
* sd: sync to 0752cc9 (master-507-b314d80 +1)

* sd: add flow-shift support to gendefaults
2026-02-28 12:22:32 +08:00
Concedo
14d82bb38e allow music llm and diffusion gen models to be loaded independently 2026-02-27 21:56:48 +08:00
Concedo
ba42f22fc8 stereo is working 2026-02-27 20:36:44 +08:00
Wagner Bruna
d400b37215
config file saving enhancements (#1994)
* process --exportconfig and --exporttemplate after --config

This allows using `--config oldfile.kcpps --exportconfig newfile.kcpps`
to update old config items, copy a config file with changed parameters,
download and save a remote config, etc.

* filter out command flags from the saved config files

Also indent files saved from the command line.
2026-02-26 14:55:01 +08:00
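
The filtering step in this commit might be sketched as below. This is a hedged illustration only: `COMMAND_FLAGS`, `config_to_save` and `render_config` are hypothetical names, the exact set of filtered flags is an assumption built from the three flags named in the message, and the saved file is assumed to be JSON.

```python
import json

# Assumption: flags that trigger an action rather than describe a setting.
COMMAND_FLAGS = {"config", "exportconfig", "exporttemplate"}


def config_to_save(args: dict) -> dict:
    """Drop command-style flags so they are not written to the saved config."""
    return {key: value for key, value in args.items() if key not in COMMAND_FLAGS}


def render_config(args: dict) -> str:
    """Serialize the filtered config with indentation for readability."""
    return json.dumps(config_to_save(args), indent=2)
```

With this shape, loading an old file and immediately exporting it would write back only the settings, not the `--config` or `--exportconfig` arguments used to do so.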
Concedo
5c5fe55f7d bump kv overrides max (+1 squashed commits)
Squashed commits:

[9bc8212a0] bump kv overrides max
2026-02-26 00:24:53 +08:00
Concedo
8a3ccfcba5 some fixes but some issues 2026-02-25 23:41:32 +08:00
Concedo
11a85d62fc lowvram for music lm 2026-02-24 22:21:17 +08:00
Concedo
488c431331 not yet working 2026-02-24 17:47:50 +08:00
Concedo
2e713cfff5 fixed compile issue, trying out 8bit pcm 2026-02-23 21:19:03 +08:00
Concedo
c2b0cb26a8 ace step codes api 2026-02-23 14:04:45 +08:00
Concedo
4be93db21c ace step codes generation now working 2026-02-23 00:27:26 +08:00
Concedo
13db5aee9e stub files for loading ace step 2026-02-22 23:15:08 +08:00
Concedo
73f3ffaeb7 fix followup tool call check with assistant prefills 2026-02-22 10:33:00 +08:00
Concedo
78b4b87e54 fixed compile issue for tts on ci (+1 squashed commits)
Squashed commits:

[d6f778499] fixed compile issue for tts on ci
2026-02-22 02:28:11 +08:00
Concedo
5536fb29f2 add some default voices for qwen3tts 2026-02-21 23:45:15 +08:00
Concedo
2db018a1d7 qwen3tts support reference audio 2026-02-21 17:30:21 +08:00
Concedo
72219fdbf5 basic qwen3 tts working 2026-02-21 12:03:53 +08:00
Concedo
ad0618e351 bump defaults, updated lite, fixed glm4.7 autoguess template 2026-02-21 08:51:53 +08:00
Concedo
4115f1c54d fixed tts for outetts 2026-02-20 14:27:36 +08:00
Concedo
bf3f2e1ba8 support loading multiple sd loras (up to 4 at once) 2026-02-19 13:57:58 +08:00
Concedo
a089284d13 fixed autofit breaking file association auto backend select issues 2026-02-18 23:35:01 +08:00
Concedo
05d6188408 try disable dpi awareness 2026-02-18 20:59:31 +08:00
Concedo
a380d23ff1 fix typo 2026-02-18 20:15:17 +08:00
Concedo
a82a429fba possible fix for broken pipe due to timeouts - send some data first 2026-02-17 20:29:24 +08:00
Concedo
faf322c83f minor fix 2026-02-17 19:39:05 +08:00
Concedo
dbc3db0d99 updated sdui 2026-02-14 23:06:17 +08:00
Concedo
cb5755bc96 reworked soft limit default restrictions for sd image gen 2026-02-12 17:53:04 +08:00
Concedo
d432844cb9 added ollama show endpoint (+1 squashed commits)
Squashed commits:

[65f7bb220] added ollama show endpoint
2026-02-12 17:36:42 +08:00
Concedo
7a2fb8ec7c Revert "try improve mcp"
This reverts commit 10bf868088.
2026-02-10 21:41:59 +08:00
Concedo
10bf868088 try improve mcp 2026-02-10 15:33:39 +08:00
Concedo
9a33257742 prevent download dir being changed by config 2026-02-09 16:43:04 +08:00
Concedo
098af866dd fractional scale adjust 2026-02-08 15:32:02 +08:00
Concedo
6ded3f04e4 patch for output filenames 2026-02-08 13:05:52 +08:00