Commit graph

1372 commits

Author SHA1 Message Date
Concedo
674b7f5eee indicate support for claude messages api 2026-03-29 00:57:58 +08:00
Concedo
e3b7905e1c added anthropic messages api support 2026-03-29 00:55:32 +08:00
Concedo
5ad9e3ee31 crude openai responses streaming 2026-03-29 00:16:30 +08:00
Concedo
1e787cd03a improve responses api 2026-03-28 18:42:15 +08:00
Wagner Bruna
e3c6227d46
sd: report back image generation parameters and metadata (#2062)
* sd: refactor image generation result handling

* sd: report back image generation metadata
2026-03-28 00:49:03 +08:00
Concedo
0c2b679ea3 support bf16 quantkv cache type 2026-03-28 00:01:17 +08:00
Concedo
326542f480 rudimentary responses api, not usable yet 2026-03-27 23:38:08 +08:00
scottf007
f0818e1eae
Add socket timeout to is_port_in_use() to fix ~280s startup delay on WSL2 (#2077)
On WSL2 with networkingMode=mirrored, connect_ex() to non-listening ports
gets black-holed through the Windows host networking stack instead of
returning ECONNREFUSED. Without a timeout, TCP SYN retransmits with
exponential backoff (1+2+4+8+16+32+64 ≈ 127s per port), causing Router
Mode's port scan of 15001-15010 to stall for ~280 seconds on startup.

Adding a 1-second timeout makes connect_ex() fail fast, reducing startup
from ~303s to ~23s on affected systems.

Tested on WSL2 Ubuntu 24.04 with mirrored networking, KoboldCpp v1.110,
RTX 3090 Ti, Qwen3.5-27B Q4_K_M.
2026-03-27 22:50:59 +08:00
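Note on the commit above: a minimal sketch of a port probe with a socket timeout, assuming a plain TCP check against localhost. The signature, the `host` default, and the `worst_case_syn_wait` helper are illustrative assumptions, not the actual KoboldCpp code. Only the use of `connect_ex()` and the 1-second timeout come from the commit message.

```python
import socket

def is_port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    The timeout bounds how long connect_ex() may block when the SYN is
    silently dropped instead of being answered with ECONNREFUSED.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            # connect_ex() returns 0 on success and an errno value otherwise.
            return s.connect_ex((host, port)) == 0
        except OSError:
            # Treat any other socket-level failure as "not in use".
            return False

def worst_case_syn_wait(retransmits=7):
    """Sum of the doubling backoff intervals quoted above (1+2+4+...)."""
    return sum(2 ** i for i in range(retransmits))
```

With the timeout in place, each probe of a black-holed port costs at most about `timeout` seconds rather than the full retransmit sequence.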
Concedo
a03998bed6 added jinja kwargs support 2026-03-27 00:28:59 +08:00
Concedo
c91f350ed5 increase max images, take images from the end instead of beginning if too many images 2026-03-26 23:03:52 +08:00
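Note on the commit above: keeping the most recent images when a request carries too many can be sketched with a negative slice. The function name and parameters are hypothetical, and the real limit and data structures are not shown in this log.

```python
def keep_latest_images(images, max_images):
    """Return at most max_images items, taken from the end of the list."""
    if max_images <= 0:
        # Guard: images[-0:] would return the whole list, not an empty one.
        return []
    return list(images[-max_images:])
```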
Concedo
39938e19d3 allow router mode to auto-wake other endpoints if put to sleep by auto unload 2026-03-25 23:17:20 +08:00
Concedo
24ab1c1451 upgrade musicui to do tts, show musicui for tts models (+1 squashed commits)
Squashed commits:

[975630b15] upgrade musicui to do tts
2026-03-25 00:24:44 +08:00
Concedo
8437c346a7 fixed tts instruction regex, encapsulate thinking by default 2026-03-24 13:53:46 +08:00
Concedo
9e9028b1a9 fixed cpu mis-selection 2026-03-23 21:30:57 +08:00
Concedo
0d50cafd8b added CustomVoice support 2026-03-23 18:50:08 +08:00
Concedo
0aa6f21c88 jinja prefill fixed 2026-03-22 14:55:44 +08:00
Concedo
79e39e1989 fixed a help menu bug, updated colab (+1 squashed commits)
Squashed commits:

[618478e00] fixed a help menu bug, updated colab
2026-03-22 01:00:30 +08:00
Concedo
89e2397014 updated lite, up ver (+1 squashed commits)
Squashed commits:

[f1f899070] up version
2026-03-21 17:42:58 +08:00
Concedo
fdfb713d91 added --sdmaingpu allowing image models to be independently placed on any gpu 2026-03-21 17:34:12 +08:00
Concedo
a3d3800f3e added passthrough mode for esrgan upscale, triggered by img2img denoise 0.0 with 1 step 2026-03-21 16:19:10 +08:00
Concedo
58a585d0e7 popular templates section in help menu 2026-03-21 15:37:07 +08:00
Concedo
c4b1a17e1a tools debug 2026-03-19 23:13:02 +08:00
Concedo
2f63f94fd8 fix router nocertify mode 2026-03-19 12:45:19 +08:00
Concedo
8cf9ba34e9 fixed SSL in routermode 2026-03-19 12:43:11 +08:00
Concedo
15e86010d8 autofit will clear moecpu and overridetensors 2026-03-18 21:20:57 +08:00
Concedo
d85272a958 fixed wrong encoding (+1 squashed commits)
Squashed commits:

[a87d059a8] fixed wrong encoding
2026-03-17 15:54:54 +08:00
Concedo
e09ddc8fff mcp fix (+1 squashed commits)
Squashed commits:

[c5a959a07] mcp fix
2026-03-17 15:45:05 +08:00
Concedo
837fe9d832 mcp stdio fixes 2026-03-17 15:34:05 +08:00
Concedo
39f9007d12 handle notifications in mcp 2026-03-17 15:13:42 +08:00
Concedo
6d3f01d139 compact css, fix .py variable name error 2026-03-17 11:11:46 +08:00
henk717
927d3c68bb
502 Loading page (#2042)
* Proper Loading page

* Loading page wording

* Different wording
2026-03-17 10:59:44 +08:00
Wagner Bruna
51187d5362
sd: support changing preloaded LoRA multipliers (#2041)
* sd: remove C++ support for enforcing fixed LoRA multipliers

The logic at the Python level is enough.

* sd: support changing preloaded LoRA multipliers

We keep the same rules as before:
- Any LoRA with multiplier 0 can be changed
- If all LoRAs have multiplier != 0, they are fixed and optimized

but tweak the corner case of LoRAs specified more than once to
allow adjusting the multiplier if the same LoRA is also specified
with a zero multiplier, as if they were two different LoRAs.

So the following keeps working as before:
- --sdlora /loras/lcm.gguf --sdloramult 1 : fixed as 1
- --sdlora /loras/lcm.gguf --sdloramult 0 : dynamic, default 0
- --sdlora /loras/ : dynamic, default 0
- --sdlora /loras/lcm.gguf /loras/lcm.gguf --sdloramult 1 1 : fixed as 2

But now we have:
- --sdlora /loras/lcm.gguf /loras/lcm.gguf --sdloramult 1 0 : dynamic, default 1
- --sdlora /loras/lcm.gguf /loras/ --sdloramult 1 : dynamic, default 1
2026-03-17 10:09:55 +08:00
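Note on the commit above: the multiplier rules can be sketched as a per-LoRA aggregation. This is an illustration under stated assumptions, not the project's implementation: `resolve_lora_multipliers` is a hypothetical name, directory arguments are assumed to be already expanded into `(path, 0.0)` entries, and each LoRA is assumed to be resolved independently of the others.

```python
def resolve_lora_multipliers(entries):
    """Map each LoRA path to (mode, default_multiplier).

    entries: list of (path, multiplier) pairs, possibly with repeats.
    A path is "dynamic" if any of its entries has multiplier 0,
    otherwise "fixed". Repeated entries have their multipliers summed.
    """
    totals = {}
    dynamic = set()
    for path, mult in entries:
        totals[path] = totals.get(path, 0.0) + mult
        if mult == 0:
            dynamic.add(path)
    return {
        path: ("dynamic" if path in dynamic else "fixed", total)
        for path, total in totals.items()
    }
```

For example, `resolve_lora_multipliers([("lcm.gguf", 1.0), ("lcm.gguf", 0.0)])` marks `lcm.gguf` as dynamic with default 1, matching the new corner case listed in the commit message.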
Wagner Bruna
0c66ed863d
sd: additional validation for the LoRA list (#2043)
* sd: additional validation for the LoRA list

* sd: sanitize LoRA list before downloading
2026-03-17 10:09:10 +08:00
Concedo
ea15dfab83 added auto unload for admin mode 2026-03-16 23:56:34 +08:00
Concedo
b656b7c929 router works with singleinstance 2026-03-16 22:53:24 +08:00
Concedo
6c8c55afb5 PR cleanup 2026-03-16 21:04:16 +08:00
Wagner Bruna
feea014774
sd: support for dynamic LoRA loading from a directory (#2036)
* backend support for controlling LoRA cache and fixed multipliers

The generation LoRA multipliers are now added to the initial
multipliers, so e.g. a merged LCM model will behave the same as
a normal model with a preloaded LCM LoRA.

* frontend support
2026-03-16 20:39:21 +08:00
Concedo
310bd97972 router mode is fully functional 2026-03-15 23:07:47 +08:00
Concedo
5a65005d27 allow swap back to initial model 2026-03-15 21:35:23 +08:00
Concedo
ccd4745e0c ollama streaming emulation 2026-03-15 18:25:37 +08:00
Concedo
2e725e4f10 fixed port bug 2026-03-15 17:00:50 +08:00
Wagner Bruna
b437d18319
add support for cache modes to accelerate image generation (#2021)
* sd: sync to master-525-d6dd6d7

* sd: add support for cache modes for inference acceleration

* keep gendefaults as a JSON object inside the config file

* covered more invalid cases on gendefaults parsing
2026-03-15 15:27:14 +08:00
Concedo
3b9385a627 updated colab, wip model router 2026-03-15 00:38:29 +08:00
Concedo
8b9594b6ea wip router mode 2026-03-14 17:07:05 +08:00
Concedo
6143a75426 improve autofit padding heuristics 2026-03-14 00:36:52 +08:00
Concedo
8f23b8d81e wip on ref audio, but it compiles 2026-03-12 23:46:10 +08:00
Concedo
d5a4c17e14 mp3 not default 2026-03-12 21:42:59 +08:00
Concedo
3fd9648726 added mp3 support 2026-03-12 21:00:50 +08:00
Wagner Bruna
796f7bdeff
sd: fix LoRA multiplier logic to switch to at_runtime mode (#2029)
`0. in inputs.lora_multipliers` didn't work because the C array has
variable length.

Also fixed a few corner cases related to the default multipliers
(mainly to ensure robustness against future changes, since in most
cases the multiplier list is already sanitized by a previous
function).
2026-03-12 15:36:51 +08:00
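Note on the commit above: a C array exposed through `ctypes` as a pointer carries no length, so a membership test needs an explicit element count. The sketch below assumes a hypothetical struct layout (`LoraInputs`, `lora_count`); only the field name `lora_multipliers` appears in the commit message.

```python
import ctypes

class LoraInputs(ctypes.Structure):
    # Hypothetical layout: a float pointer plus an explicit element count.
    _fields_ = [
        ("lora_multipliers", ctypes.POINTER(ctypes.c_float)),
        ("lora_count", ctypes.c_int),
    ]

def has_zero_multiplier(inputs):
    """Check for a 0.0 multiplier by slicing the pointer to a known length."""
    # Slicing with an explicit stop copies exactly lora_count floats into a list.
    values = inputs.lora_multipliers[0:inputs.lora_count]
    return any(v == 0.0 for v in values)

def make_inputs(multipliers):
    """Build a LoraInputs from a Python list (helper for illustration)."""
    arr = (ctypes.c_float * len(multipliers))(*multipliers)
    inputs = LoraInputs(ctypes.cast(arr, ctypes.POINTER(ctypes.c_float)), len(multipliers))
    inputs._keepalive = arr  # keep the backing array alive alongside the struct
    return inputs
```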
Concedo
3cc6e2ea17 make stereo default 2026-03-12 00:10:25 +08:00