Commit graph

1318 commits

Author SHA1 Message Date
Concedo
eafb5ff4c5 autofit improvement e.g. for strix (+1 squashed commits)
Squashed commits:

[6f6fd59c3] autofit improvement e.g. for strix
2026-03-10 21:20:02 +08:00
Concedo
270d4ad2c1 fixed a typo 2026-03-08 12:56:08 +08:00
Concedo
73fc5c4767 handle jinja exceptions 2026-03-08 12:12:02 +08:00
Concedo
41df8b09e5 jinjatools now works mostly well 2026-03-08 11:55:22 +08:00
Concedo
2c38638b3d Merge commit '2afcdb9777' into concedo_experimental
# Conflicts:
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
2026-03-06 21:13:15 +08:00
Gustavo Rocha Dias
cbecc34667
Fix OAI-compatible token usage and unique request IDs (#2015)
* fix: token usage fix for mistral-vibe

* fix: generate unique request IDs for OAI-compatible responses

* fix: prompt_tokens reporting KV cache size instead of actual count during streaming

* fixes for PR #2015
For (1), this is not a good idea. If it returned 0 (e.g. during an error), this value may not be updated and will return the value of a previous or different request. It's better to return 0 in those cases.
For (2), this is a good idea but we don't need that level of randomness. I'll probably swap it with a 6-digit random number instead.
For (3), the official OpenAI spec gates it behind stream_options.include_usage = true, so I'll do that too.

* missed 1 item

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2026-03-06 20:57:22 +08:00
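
The three review points in the commit above can be illustrated with a minimal Python sketch. It is not the code from PR #2015: the helper names (`make_request_id`, `wants_usage`, `usage_block`) and the `chatcmpl` prefix are assumptions made for illustration. Only the 6-digit random number, the fallback to 0, and the `stream_options.include_usage` gate come from the commit message.

```python
import random


def make_request_id(prefix: str = "chatcmpl") -> str:
    """Hypothetical helper: a prefix plus a 6-digit random number (point 2)."""
    return f"{prefix}-{random.randint(100000, 999999)}"


def wants_usage(request_body: dict) -> bool:
    """Report usage while streaming only if the client opted in (point 3).

    Mirrors the stream_options.include_usage = true gate named in the commit.
    """
    opts = request_body.get("stream_options")
    if not isinstance(opts, dict):
        return False
    return opts.get("include_usage") is True


def usage_block(prompt_tokens: int = 0, completion_tokens: int = 0) -> dict:
    """Build a usage object; counts default to 0 instead of reusing a
    value left over from a previous request (point 1)."""
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


if __name__ == "__main__":
    body = {"stream": True, "stream_options": {"include_usage": True}}
    print(make_request_id())
    if wants_usage(body):
        print(usage_block(prompt_tokens=12, completion_tokens=30))
```

A 6-digit number is short and can collide across many requests, which matches the commit's remark that a high level of randomness is not needed here.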
Concedo
8658af1018 qwen3tts default to cpu unless gpu selected 2026-03-05 11:11:46 +08:00
Concedo
5d35193749 fixed a sse stream issue 2026-03-03 21:30:28 +08:00
Concedo
7df210833e missed one case for autofit 2026-03-03 21:05:59 +08:00
Concedo
d7fb3df10a support 1 level deep admindir 2026-03-02 16:23:34 +08:00
Concedo
c9e651f7e5 updated lite, fix some cuda spams, fix qwen3tts voice loading 2026-03-01 00:41:56 +08:00
Wagner Bruna
5c40f07d4a
sd: sync to 0752cc9 (master-507-b314d80 +1) (#1999)
* sd: sync to 0752cc9 (master-507-b314d80 +1)

* sd: add flow-shift support to gendefaults
2026-02-28 12:22:32 +08:00
Concedo
14d82bb38e allow music llm and diffusion gen models to be loaded independently 2026-02-27 21:56:48 +08:00
Concedo
ba42f22fc8 stereo is working 2026-02-27 20:36:44 +08:00
Wagner Bruna
d400b37215
config file saving enhancements (#1994)
* process --exportconfig and --exporttemplate after --config

This allows using `--config oldfile.kcpps --exportconfig newfile.kcpps`
to update old config items, copy a config file with changed parameters,
download and save a remote config, etc.

* filter out command flags from the saved config files

Also indent files saved by command-line.
2026-02-26 14:55:01 +08:00
Concedo
5c5fe55f7d bump kv overrides max (+1 squashed commits)
Squashed commits:

[9bc8212a0] bump kv overrides max
2026-02-26 00:24:53 +08:00
Concedo
8a3ccfcba5 some fixes but some issues 2026-02-25 23:41:32 +08:00
Concedo
11a85d62fc lowvram for music lm 2026-02-24 22:21:17 +08:00
Concedo
488c431331 not yet working 2026-02-24 17:47:50 +08:00
Concedo
2e713cfff5 fixed compile issue, trying out 8bit pcm 2026-02-23 21:19:03 +08:00
Concedo
c2b0cb26a8 ace step codes api 2026-02-23 14:04:45 +08:00
Concedo
4be93db21c ace step codes generation now working 2026-02-23 00:27:26 +08:00
Concedo
13db5aee9e stub files for loading ace step 2026-02-22 23:15:08 +08:00
Concedo
73f3ffaeb7 fix followup tool call check with assistant prefills 2026-02-22 10:33:00 +08:00
Concedo
78b4b87e54 fixed compile issue for tts on ci (+1 squashed commits)
Squashed commits:

[d6f778499] fixed compile issue for tts on ci
2026-02-22 02:28:11 +08:00
Concedo
5536fb29f2 add some default voices for qwen3tts 2026-02-21 23:45:15 +08:00
Concedo
2db018a1d7 qwen3tts support reference audio 2026-02-21 17:30:21 +08:00
Concedo
72219fdbf5 basic qwen3 tts working 2026-02-21 12:03:53 +08:00
Concedo
ad0618e351 bump defaults, updated lite, fixed glm4.7 autoguess template 2026-02-21 08:51:53 +08:00
Concedo
4115f1c54d fixed tts for outetts 2026-02-20 14:27:36 +08:00
Concedo
bf3f2e1ba8 support loading multiple sd loras (up to 4 at once) 2026-02-19 13:57:58 +08:00
Concedo
a089284d13 fixed autofit breaking file association auto backend select issues 2026-02-18 23:35:01 +08:00
Concedo
05d6188408 try disable dpi awareness 2026-02-18 20:59:31 +08:00
Concedo
a380d23ff1 fix typo 2026-02-18 20:15:17 +08:00
Concedo
a82a429fba possible fix for broken pipe due to timeouts - send some data first 2026-02-17 20:29:24 +08:00
Concedo
faf322c83f minor fix 2026-02-17 19:39:05 +08:00
Concedo
dbc3db0d99 updated sdui 2026-02-14 23:06:17 +08:00
Concedo
cb5755bc96 reworked soft limit default restrictions for sd image gen 2026-02-12 17:53:04 +08:00
Concedo
d432844cb9 added ollama show endpoint (+1 squashed commits)
Squashed commits:

[65f7bb220] added ollama show endpoint
2026-02-12 17:36:42 +08:00
Concedo
7a2fb8ec7c Revert "try improve mcp"
This reverts commit 10bf868088.
2026-02-10 21:41:59 +08:00
Concedo
10bf868088 try improve mcp 2026-02-10 15:33:39 +08:00
Concedo
9a33257742 prevent download dir being changed by config 2026-02-09 16:43:04 +08:00
Concedo
098af866dd fractional scale adjust 2026-02-08 15:32:02 +08:00
Concedo
6ded3f04e4 patch for output filenames 2026-02-08 13:05:52 +08:00
Concedo
5962330dde -1 layer triggers autofit if tensor split and override tensors is not set 2026-02-07 22:31:01 +08:00
Concedo
404b5fe659 wip reworking auto mode 2026-02-07 19:33:50 +08:00
Concedo
5cf21443bc added autofit padding. autofit is now in the quick menu 2026-02-07 18:29:30 +08:00
Concedo
78bba94f72 autofit hides gpu layer inputs entirely 2026-02-07 17:17:19 +08:00
Concedo
bab1c4ca50 proper handling of common think tags in lcpp ui jinja mode 2026-02-07 17:05:21 +08:00
Concedo
a0a78dacc4 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	pyproject.toml
#	requirements/requirements-convert_legacy_llama.txt
#	src/CMakeLists.txt
#	src/llama-vocab.cpp
#	tests/test-backend-ops.cpp
2026-02-07 15:54:02 +08:00