Commit graph

1059 commits

Author SHA1 Message Date
Concedo
6039791adf minor bugfixes 2025-06-21 18:41:28 +08:00
Concedo
65ff041827 added more perf stats 2025-06-21 12:12:28 +08:00
Wagner Bruna
08adfb53c9
Configurable VAE threshold limit (#1601)
* add backend support for changing the VAE tiling threshold

* trigger VAE tiling by image area instead of dimensions

I've tested with GGML_VULKAN_MEMORY_DEBUG: all resolutions with
the same area as 768x768 (even extremes like 64x9216), and many
below that, consistently allocate 6656 bytes per image pixel.
Since tiling is primarily useful for avoiding excessive memory
usage, it seems reasonable to trigger VAE tiling based on area
rather than on the maximum image side.

However, as there is currently no user interface option to change
it back to a lower value, it's best to maintain the default
behavior for now.

* replace the notile option with a configurable threshold

This allows selecting a lower threshold value, reducing the
peak memory usage.

The legacy sdnotile parameter is automatically converted to the
new parameter if it is the only one supplied.

* simplify tiling checks, 768 default visible in launcher

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-06-21 10:14:57 +08:00
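The area-based tiling trigger described in the PR above can be sketched as follows; the function name is illustrative and the 768 default side comes from the commit notes, but this is a hypothetical reconstruction, not the actual koboldcpp code:

```python
def should_tile_vae(width: int, height: int, threshold_side: int = 768) -> bool:
    """Trigger VAE tiling when the image area exceeds that of a
    threshold_side x threshold_side square, rather than comparing
    either dimension against a maximum side length."""
    return width * height > threshold_side * threshold_side
```

Under this scheme an extreme 64x9216 image (the same area as 768x768) does not trigger tiling, while 1024x1024 does, matching the observation that memory use scales with pixel count rather than with either dimension alone.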
Concedo
2ba7803b95 replace_instruct_placeholders is now default 2025-06-20 22:11:58 +08:00
Concedo
4e40f2aaf4 added photomaker face cloning 2025-06-20 21:33:36 +08:00
Concedo
21881a861d rename restrict square to sdclampedsoft 2025-06-20 15:39:55 +08:00
Concedo
924dfa7cd3 bump version 2025-06-18 21:37:24 +08:00
Concedo
e0a7694328 try to set cuda pcie order first thing 2025-06-18 20:25:38 +08:00
Concedo
a8d33ebb0d increase genamt hardlimit from 0.1 to 0.2 ratio 2025-06-18 19:34:12 +08:00
Concedo
40443a98f5 show available RAM, fixed SD vae tiling noise 2025-06-18 18:44:50 +08:00
Concedo
7966bdd1ad allow embeddings model to use gpu 2025-06-18 00:46:30 +08:00
Concedo
ab29be54c4 comfyui compat - serve temporary upload endpoint for img2img 2025-06-16 23:18:47 +08:00
Concedo
861a2f5275 terminal title 2025-06-15 21:51:44 +08:00
Concedo
238be98efa Allow override config for gguf files when reloading in admin mode, updated lite, fixed typo (+1 squashed commits)
Squashed commits:

[fe14845cc] Allow override config for gguf files when reloading in admin mode, updated lite (+2 squashed commit)

Squashed commit:

[9ded66aa5] Allow override config for gguf files when reloading in admin mode

[9597f6a34] update lite
2025-06-14 12:00:20 +08:00
Wagner Bruna
f6d2d1ce5c
configurable resolution limit (#1586)
* refactor image gen configuration screen

* make image size limit configurable

* fix resolution limits and keep dimensions closer to the original ratio

* use 0.0 for the configured default image size limit

This prevents the current default value from being saved into the
config files, in case we later decide to adopt a different value.

* export image model version when loading

* restore model-specific default image size limit

* change the image area restriction to be specified by a square side

* move image resolution limits down to the C++ level

* Revert "export image model version when loading"

This reverts commit fa65b23de3.

* Linting Fixes:
PY:
- Inconsistent var name sd_restrict_square -> sd_restrict_square_var
- GUI swap back to using absolute row numbers for now.
- fstring fix
- size_limit -> side_limit inconsistency
C++:
- roundup_64 standalone function
- refactor sd_fix_resolution variable names for clarity
- always apply the "anti-crashing" hard total-megapixel limit after the soft total-megapixel limit, instead of conditionally only when sd_restrict_square is unset

* allow unsafe resolutions if debugmode is on

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-06-13 20:05:20 +08:00
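The square-side resolution limit described in this PR can be approximated with a short sketch. `roundup_64` is named in the linting notes above, but the clamping logic and default value here are a hypothetical reconstruction under the assumption that dimensions are scaled down to fit the area of a side-limit square while preserving aspect ratio:

```python
import math

def roundup_64(x: int) -> int:
    # Round up to the next multiple of 64, since diffusion backends
    # commonly require dimensions divisible by 64.
    return ((x + 63) // 64) * 64

def clamp_resolution(width: int, height: int, side_limit: int = 768) -> tuple:
    """Scale (width, height) down so the total area fits within a
    side_limit x side_limit square, keeping the aspect ratio, then
    round each dimension up to a multiple of 64. Sketch only."""
    max_area = side_limit * side_limit
    area = width * height
    if area <= max_area:
        return roundup_64(width), roundup_64(height)
    scale = math.sqrt(max_area / area)
    return roundup_64(int(width * scale)), roundup_64(int(height * scale))
```

For example, 1536x1536 would be clamped to 768x768, while anything already within the area budget only gets its dimensions rounded up to the 64-pixel grid.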
Concedo
1cbe716e45 allow setting maingpu 2025-06-12 17:53:43 +08:00
henk717
f151648f03 Pyinstaller launcher and dependency updates
This PR adds a new launcher executable to the unpack feature, eliminating the need to ship Python and its dependencies in the unpacked version. It also makes a few dependency changes to help future-proof the build.
2025-06-10 23:08:02 +08:00
Concedo
8386546e08 Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
try change order (+3 squashed commit)

Squashed commit:

[457f02507] try newer jimver

[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.

[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commit)

Squashed commit:

[133e81633] try without pwsh

[4d99cefba] try without pwsh

[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Concedo
7d8aa31f1f fixed embeddings, added new parameter to limit max embeddings context 2025-06-10 01:11:55 +08:00
Concedo
8780b33c64 consolidate imports 2025-06-09 17:48:54 +08:00
Concedo
82d7c53b85 embeddings handle base64 2025-06-09 00:26:40 +08:00
Concedo
7f57846c2f update bundled vcrts 2025-06-08 19:39:42 +08:00
Concedo
dcf88d6e78 Revert "make tts use gpu by default. use --ttscpu to disable"
This reverts commit 669f80265b.
2025-06-08 17:08:04 +08:00
Concedo
669f80265b make tts use gpu by default. use --ttscpu to disable 2025-06-08 17:06:19 +08:00
Concedo
cfcdfd69bd allow embeddings models to use mmap 2025-06-07 10:14:00 +08:00
Concedo
6effb65cfe change singleinstance order 2025-06-06 21:20:30 +08:00
Concedo
740f91e3fd lower aria interval 2025-06-06 17:43:38 +08:00
Concedo
9cf32e5fee step limits over adapter for sd 2025-06-06 14:12:43 +08:00
Concedo
f6bbc350f2 various qol fixes 2025-06-05 10:26:02 +08:00
Concedo
736030bb9f save and load state upgraded to 3 available states 2025-06-04 22:09:40 +08:00
Concedo
06d2bc3404 ollama compat fixes 2025-06-04 19:22:29 +08:00
Concedo
53f1511396 use a static buffer for kv reloads instead. also, added into lite ui 2025-06-03 22:32:46 +08:00
Concedo
4b57108508 Save KV State and Load KV State to memory added. GUI not yet updated 2025-06-03 17:46:29 +08:00
Concedo
6ce85c54d6 not working correctly 2025-06-02 22:12:10 +08:00
Concedo
8e1ebc55b5 dropped support for lora base as upstream no longer uses it. If provided it will be silently ignored 2025-06-02 12:49:53 +08:00
Concedo
51dc1cf920 added scale for text lora 2025-06-02 00:13:42 +08:00
Concedo
74ef097c4a added ability to set koboldcpp as default handler for gguf and kcpps 2025-06-01 22:36:41 +08:00
Concedo
f3bb947a13 cuda use wmma flash attention for turing (+1 squashed commits)
Squashed commits:

[3c5112398] 117 (+10 squashed commit)

Squashed commit:

[4f01bb2d4] 117 graphs 80v

[7549034ea] 117 graphs

[dabf9cb99] checking if cuda 11.5.2 works

[ba7ccdb7a] another try cu11.7 only

[752cf2ae5] increase aria2c download log rate

[dc4f198fd] test send turing to wmma flash attention

[496a22e83] temp build test cu11.7.0

[ca759c424] temp build test cu11.7

[c46ada17c] test build: enable virtual80 for oldcpu

[3ccfd939a] test build: with cuda graphs for all
2025-06-01 11:41:45 +08:00
Concedo
b08dca65ed Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/CMakeLists.txt
#	common/arg.cpp
#	common/chat.cpp
#	examples/parallel/README.md
#	examples/parallel/parallel.cpp
#	ggml/cmake/common.cmake
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	models/ggml-vocab-bert-bge.gguf.inp
#	models/ggml-vocab-bert-bge.gguf.out
#	models/ggml-vocab-command-r.gguf.inp
#	models/ggml-vocab-command-r.gguf.out
#	models/ggml-vocab-deepseek-coder.gguf.inp
#	models/ggml-vocab-deepseek-coder.gguf.out
#	models/ggml-vocab-deepseek-llm.gguf.inp
#	models/ggml-vocab-deepseek-llm.gguf.out
#	models/ggml-vocab-falcon.gguf.inp
#	models/ggml-vocab-falcon.gguf.out
#	models/ggml-vocab-gpt-2.gguf.inp
#	models/ggml-vocab-gpt-2.gguf.out
#	models/ggml-vocab-llama-bpe.gguf.inp
#	models/ggml-vocab-llama-bpe.gguf.out
#	models/ggml-vocab-llama-spm.gguf.inp
#	models/ggml-vocab-llama-spm.gguf.out
#	models/ggml-vocab-mpt.gguf.inp
#	models/ggml-vocab-mpt.gguf.out
#	models/ggml-vocab-phi-3.gguf.inp
#	models/ggml-vocab-phi-3.gguf.out
#	models/ggml-vocab-qwen2.gguf.inp
#	models/ggml-vocab-qwen2.gguf.out
#	models/ggml-vocab-refact.gguf.inp
#	models/ggml-vocab-refact.gguf.out
#	models/ggml-vocab-starcoder.gguf.inp
#	models/ggml-vocab-starcoder.gguf.out
#	requirements/requirements-gguf_editor_gui.txt
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-json-schema-to-grammar.cpp
#	tools/mtmd/CMakeLists.txt
#	tools/run/run.cpp
#	tools/server/CMakeLists.txt
2025-05-31 13:04:21 +08:00
Concedo
c923e9fe46 added option to unload model from admin control 2025-05-31 11:51:09 +08:00
Concedo
08e0745e7e added singleinstance flag and local shutdown api 2025-05-31 11:37:32 +08:00
Concedo
6529326c59 allow temperatures up to 1.0 when function calling 2025-05-30 15:59:18 +08:00
Concedo
c881bb7348 match a few common oai voices 2025-05-29 23:29:17 +08:00
Concedo
26bf5b446d fixed thread count <=0 , fixed clip skip <= 0 2025-05-28 00:38:15 +08:00
Concedo
f97bbdde00 fix to allow all EOGs to trigger a stop, occam's glm4 fix, 2025-05-24 22:55:11 +08:00
Concedo
ec04115ae9 swa options now available 2025-05-24 11:50:37 +08:00
Concedo
748dfcc2e4 massively improved tool calling 2025-05-24 02:26:11 +08:00
Concedo
c4df151298 experimental swa flag 2025-05-23 21:33:26 +08:00
Concedo
499283c63a rename define to match upstream 2025-05-23 17:10:12 +08:00
Concedo
e68a5f448c add ddim sampler 2025-05-22 21:28:01 +08:00