Commit graph

1042 commits

Author SHA1 Message Date
Concedo
4b348d0b7e add 2 more save slots 2025-07-22 21:07:19 +08:00
Concedo
75154a3d91 add ping endpoint 2025-07-22 18:55:35 +08:00
Concedo
30675b0798 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	CODEOWNERS
#	docs/build.md
#	scripts/sync-ggml.last
#	tests/test-backend-ops.cpp
#	tools/imatrix/README.md
#	tools/imatrix/imatrix.cpp
2025-07-20 22:47:31 +08:00
Concedo
b028dd4e84 minor fixes 2025-07-18 13:22:59 +08:00
Concedo
1ca666f9c1 allow handling multipart files up to 999 2025-07-18 01:18:28 +08:00
Concedo
afca31bfbe handle clean_env for remotetunnel 2025-07-17 18:21:22 +08:00
Concedo
d4a394ff73 label attached media with ids 2025-07-17 10:04:46 +08:00
Concedo
d3d5e36af6 backwards compat for older flags in config load 2025-07-15 22:05:57 +08:00
Concedo
51cac6f30c add a title to load config 2025-07-15 18:22:44 +08:00
Concedo
8396add5be removed hunyuan autoguess template, fixed multi file loading up to 999 parts 2025-07-15 17:49:49 +08:00
Concedo
b7f8d0fe2b handle inconsistent final message content being sent with finish_reason 2025-07-14 22:17:18 +08:00
Concedo
0e8f96414a add in backwards compatibility for older clients with incorrect json_schema passing 2025-07-14 17:41:08 +08:00
Concedo
e7eb6d3200 increase default ctx size to 8k, rename usecublas to usecuda 2025-07-13 18:27:42 +08:00
Concedo
811463a704 split audio and vision detection separately 2025-07-13 17:47:15 +08:00
Concedo
0938af7c83 fixed noscript image gen 2025-07-13 11:37:52 +08:00
Concedo
6f4f1b7389 allowing resuming incomplete aria2 downloads 2025-07-13 11:14:39 +08:00
Concedo
e9473305d0 wip2 (+1 squashed commits)
Squashed commits:

[4628777b6] wip
2025-07-12 18:54:40 +08:00
Concedo
57ce374240 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/ISSUE_TEMPLATE/010-bug-compilation.yml
#	.github/ISSUE_TEMPLATE/011-bug-results.yml
#	.github/labeler.yml
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	.gitmodules
#	CMakeLists.txt
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/softmax_4_f16.cl
#	ggml/src/ggml-opencl/kernels/softmax_4_f32.cl
#	ggml/src/ggml-opencl/kernels/softmax_f16.cl
#	ggml/src/ggml-opencl/kernels/softmax_f32.cl
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/element_wise.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.last
#	scripts/sync-ggml.sh
#	tests/test-backend-ops.cpp
#	tests/test-c.c
2025-07-05 12:16:28 +08:00
Concedo
ae0c6b02f8 print system info 2025-07-05 11:58:00 +08:00
Wagner Bruna
d74c16e6e0 enable flash attention for image generation (#1633) 2025-07-05 11:20:51 +08:00
Wagner Bruna
bc3e4c1197 remove obsolete warning about flash attention on Vulkan (#1634) 2025-07-03 16:57:03 +08:00
Concedo
f407aa3b8a emulated oai image generation 2025-07-02 16:01:56 +08:00
Concedo
cdda9d16e0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/tools.sh
#	build-xcframework.sh
#	ci/run.sh
#	examples/Miku.sh
#	examples/chat-13B.sh
#	examples/chat-persistent.sh
#	examples/chat-vicuna.sh
#	examples/chat.sh
#	examples/jeopardy/jeopardy.sh
#	examples/reason-act.sh
#	examples/server-llama2-13B.sh
#	examples/sycl/build.sh
#	examples/sycl/run-llama2.sh
#	examples/sycl/run-llama3.sh
#	examples/ts-type-to-grammar.sh
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/element_wise.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	scripts/apple/validate-apps.sh
#	scripts/apple/validate-ios.sh
#	scripts/apple/validate-macos.sh
#	scripts/apple/validate-tvos.sh
#	scripts/apple/validate-visionos.sh
#	scripts/check-requirements.sh
#	scripts/ci-run.sh
#	scripts/compare-commits.sh
#	scripts/debug-test.sh
#	scripts/gen-authors.sh
#	scripts/get-hellaswag.sh
#	scripts/get-pg.sh
#	scripts/get-wikitext-103.sh
#	scripts/get-wikitext-2.sh
#	scripts/get-winogrande.sh
#	scripts/hf.sh
#	scripts/qnt-all.sh
#	scripts/run-all-perf.sh
#	scripts/run-all-ppl.sh
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.sh
#	scripts/tool_bench.sh
#	tests/test-backend-ops.cpp
#	tests/test-lora-conversion-inference.sh
#	tests/test-tokenizer-0.sh
#	tools/server/README.md
2025-06-30 20:38:44 +08:00
Concedo
b6edb79648 filter out empty entries 2025-06-28 20:22:34 +08:00
Concedo
a88c56e70c Merge remote-tracking branch 'origin/upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	examples/eval-callback/eval-callback.cpp
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
#	tests/test-backend-ops.cpp
2025-06-28 17:47:53 +08:00
Concedo
4ec0e0fd21 now accept multiple images for reference images 2025-06-28 17:30:28 +08:00
Concedo
0bd648ffa4 photomaker renamed to extra image to handle future extension 2025-06-28 10:26:06 +08:00
tsite
df47b51bd1 support python 3.13 (#1621) 2025-06-27 00:18:30 +08:00
Concedo
ace537d44e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	CMakeLists.txt
#	examples/simple-chat/simple-chat.cpp
#	src/llama-quant.cpp
#	tools/run/run.cpp
#	tools/server/README.md
2025-06-24 23:06:16 +08:00
Concedo
2d822d3059 fixed a typo 2025-06-22 23:28:29 +08:00
Concedo
abc1d8ac25 better way of checking for avx2 support 2025-06-22 22:56:50 +08:00
Concedo
52dcfe42d6 try auto selecting correct backend while checking intrinsics 2025-06-22 18:16:02 +08:00
Concedo
72d467c6d5 vision is now working in ollama owui 2025-06-21 23:43:43 +08:00
Concedo
6039791adf minor bugfixes 2025-06-21 18:41:28 +08:00
Concedo
65ff041827 added more perf stats 2025-06-21 12:12:28 +08:00
Wagner Bruna
08adfb53c9 Configurable VAE threshold limit (#1601)
* add backend support for changing the VAE tiling threshold

* trigger VAE tiling by image area instead of dimensions

I've tested with GGML_VULKAN_MEMORY_DEBUG all resolutions with
the same 768x768 area (even extremes like 64x9216), and many
below that: all consistently allocate 6656 bytes per image pixel.
As tiling is primarily useful to avoid excessive memory usage, it
seems reasonable to enable VAE tiling based on area rather than
maximum image side.

However, as there is currently no user interface option to change
it back to a lower value, it's best to maintain the default
behavior for now.

* replace the notile option with a configurable threshold

This allows selecting a lower threshold value, reducing the
peak memory usage.

The legacy sdnotile parameter gets automatically converted to
the new parameter, if it's the only one supplied.

* simplify tiling checks, 768 default visible in launcher

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-06-21 10:14:57 +08:00
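
Note on 08adfb53c9: the area-based trigger described in that commit message can be sketched roughly as follows. This is only an illustration, not the project's code. The function names, the strict ">" comparison, and the use of the 768 default as a square side are assumptions; the 6656 bytes-per-pixel figure is the one reported in the commit message.

```python
# Illustrative sketch only; names and the exact comparison are assumptions.
BYTES_PER_PIXEL = 6656  # allocation per image pixel reported in the commit message


def should_tile_by_side(width: int, height: int, threshold_side: int = 768) -> bool:
    """Older-style rule (assumed): tile when the longest side exceeds the threshold."""
    return max(width, height) > threshold_side


def should_tile_by_area(width: int, height: int, threshold_side: int = 768) -> bool:
    """Area-based rule: tile when the image covers more pixels than a
    threshold_side x threshold_side square."""
    return width * height > threshold_side * threshold_side


def estimated_vae_bytes(width: int, height: int) -> int:
    """Rough memory estimate, assuming allocation scales linearly with pixel count."""
    return width * height * BYTES_PER_PIXEL
```

Under these assumptions, a 64x9216 image has the same area as 768x768, so the area rule treats both the same, while a rule based on the longest side would tile only the 64x9216 one.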
Concedo
2ba7803b95 replace_instruct_placeholders is now default 2025-06-20 22:11:58 +08:00
Concedo
4e40f2aaf4 added photomaker face cloning 2025-06-20 21:33:36 +08:00
Concedo
21881a861d rename restrict square to sdclampedsoft 2025-06-20 15:39:55 +08:00
Concedo
924dfa7cd3 bump version 2025-06-18 21:37:24 +08:00
Concedo
e0a7694328 try to set cuda pcie order first thing 2025-06-18 20:25:38 +08:00
Concedo
a8d33ebb0d increase genamt hardlimit from 0.1 to 0.2 ratio 2025-06-18 19:34:12 +08:00
Concedo
40443a98f5 show available RAM, fixed SD vae tiling noise 2025-06-18 18:44:50 +08:00
Concedo
7966bdd1ad allow embeddings model to use gpu 2025-06-18 00:46:30 +08:00
Concedo
ab29be54c4 comfyui compat - serve temporary upload endpoint for img2img 2025-06-16 23:18:47 +08:00
Concedo
861a2f5275 terminal title 2025-06-15 21:51:44 +08:00
Concedo
238be98efa Allow override config for gguf files when reloading in admin mode, updated lite, fixed typo (+1 squashed commits)
Squashed commits:

[fe14845cc] Allow override config for gguf files when reloading in admin mode, updated lite (+2 squashed commit)

Squashed commit:

[9ded66aa5] Allow override config for gguf files when reloading in admin mode

[9597f6a34] update lite
2025-06-14 12:00:20 +08:00
Wagner Bruna
f6d2d1ce5c configurable resolution limit (#1586)
* refactor image gen configuration screen

* make image size limit configurable

* fix resolution limits and keep dimensions closer to the original ratio

* use 0.0 for the configured default image size limit

This prevents the current default value from being saved into the
config files, in case we later decide to adopt a different value.

* export image model version when loading

* restore model-specific default image size limit

* change the image area restriction to be specified by a square side

* move image resolution limits down to the C++ level

* Revert "export image model version when loading"

This reverts commit fa65b23de3.

* Linting Fixes:
PY:
- Inconsistent var name sd_restrict_square -> sd_restrict_square_var
- GUI swap back to using absolute row numbers for now.
- fstring fix
- size_limit -> side_limit inconsistency
C++:
- roundup_64 standalone function
- refactor sd_fix_resolution variable names for clarity
- move "anti crashing" hard total megapixel limit always to be applied after soft total megapixel limit instead of conditionally only when sd_restrict_square is unset

* allow unsafe resolutions if debugmode is on

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-06-13 20:05:20 +08:00
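
Note on f6d2d1ce5c: the commit message mentions a roundup_64 helper and an area restriction specified by a square side. A minimal sketch of how such helpers might look is given below. The real implementation lives at the C++ level and is not shown in this log, so everything here is an assumption: the meaning of roundup_64 is inferred from its name, and fit_to_square_side is a hypothetical function, not one from the repository.

```python
import math


def roundup_64(n: int) -> int:
    """Round n up to the next multiple of 64 (meaning assumed from the helper's name)."""
    return ((n + 63) // 64) * 64


def fit_to_square_side(width: int, height: int, side_limit: int) -> tuple[int, int]:
    """Hypothetical helper: shrink (width, height) so its area fits within
    side_limit * side_limit while staying close to the original aspect ratio."""
    max_area = side_limit * side_limit
    area = width * height
    if area <= max_area:
        return width, height
    scale = math.sqrt(max_area / area)
    # Floor each side to a multiple of 64 so the result never exceeds the
    # area limit, with a floor of 64 so neither side collapses to zero.
    new_w = max(64, int(width * scale) // 64 * 64)
    new_h = max(64, int(height * scale) // 64 * 64)
    return new_w, new_h
```

Flooring is used in the fitting step because rounding up could push the result back over the limit; where the actual code rounds up or down is not stated in the commit message.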
Concedo
1cbe716e45 allow setting maingpu 2025-06-12 17:53:43 +08:00
henk717
f151648f03 Pyinstaller launcher and dependency updates
This PR adds a new launcher executable to the unpack feature, eliminating the need to have Python and its dependencies in the unpacked version. It also makes a few dependency changes to help future-proof the build.
2025-06-10 23:08:02 +08:00