Commit graph

1171 commits

Author SHA1 Message Date
Concedo
bf5efcf86d Merge commit 'd82b7a7c1d' into concedo_experimental
# Conflicts:
#	ci/run.sh
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cuda/common.cuh
#	tests/CMakeLists.txt
2025-11-30 15:43:11 +08:00
Concedo
2985575be4 allow assistant prefills, fixed showgui issue 2025-11-30 12:52:28 +08:00
Concedo
925e7f8f6d added a secondary terminal mirror for linux 2025-11-29 21:53:51 +08:00
Concedo
9999b8950d cleaner resizing 2025-11-29 18:01:49 +08:00
Concedo
eda4a312cb Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/vulkan.Dockerfile
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-sycl/common.hpp
#	tests/test-backend-ops.cpp
#	tools/server/README.md
2025-11-28 13:22:02 +08:00
Concedo
e570478275 limit cuda arches + scale tweaks 2025-11-28 13:05:11 +08:00
Concedo
7527f1eff0 handle media for jinja path (+1 squashed commits)
Squashed commits:

[29d47d6b7] handle media for jinja path
2025-11-27 11:40:08 +08:00
Concedo
d68f4a5ae5 disable clip fa for now 2025-11-27 10:20:38 +08:00
Concedo
2b00292bfe display path on 404 2025-11-27 10:07:08 +08:00
Concedo
c12f9e3b7c bump version 2025-11-27 01:04:09 +08:00
Wagner Bruna
998dfcd1be sd: add an API endpoint to list the available schedulers (#1856) 2025-11-26 22:49:36 +08:00
Concedo
d9b9c54393 added another alias to jinjatools 2025-11-26 18:52:08 +08:00
Concedo
9b6320cd71 adjust launcher scaling behavior 2025-11-25 21:32:03 +08:00
Concedo
9a7f749f7c minor tweak for sd 2025-11-24 22:31:03 +08:00
Wagner Bruna
3a7dd1a97f sd: sync to master-358-347710f
Also adapt Koboldcpp LoRA loading function, and add
backend support for lora_apply_mode.
2025-11-23 19:28:54 -03:00
Concedo
1cc4403cba updated llama.cpp web ui (+2 squashed commit)
Squashed commit:

[9b22ac6e4] more fixes for lcpp web ui, will be squashed

[522b59b4c] henky tries using svelte or something
2025-11-24 00:43:27 +08:00
Rose
eeb7363985 improvements to tool calling logic (merged changes from old PR branch) (#1855)
* improvements to tool calling logic (merged changes from old PR branch)

* added some tweaks for improved tool calls to reuse old ctx, but needs testing. refer to PR.

* fixes to some stuff that concedo's modifications broke

* fixed error in reasoning

* extremely hacky way to cache tool list please fix

* oops forgot to add this

* slightly less hacky way to preserve the tool list in context

* prevented unintended toolcalls from happening when LLM states something irrelevant to toolcall decision

* fixed something that broke koboldlite

* fixed bug added by concedo that broke jinja tools

* experimental further compression of tools array, needs testing

* reverted experimental further compression of tools array

* final cleanup

* add newline after memory insert

* changed tool reasoning to always be in json format to enforce including final decision

* used new json format to skip extra llm call when not necessary

* more catching of possible bad llm output

* further cleanup

* got it down to just one llm call!

* better json format

* even better json format

* further refinement to json format

* further refinement to json format

* fixed broken tool calling

* single-call enforced json method now seems to work well. removed fallbacks as they are no longer required.

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-11-23 22:41:31 +08:00
Concedo
b281d2554a add a resume after horde worker pauses 2025-11-18 22:48:18 +08:00
Concedo
8631bbcee3 linting 2025-11-18 18:56:31 +08:00
LostRuins Concedo
281542aa0d add smoothing curve, not tested 2025-11-17 23:07:35 +08:00
LostRuins Concedo
ea22e04320 filename insensitive search for adapters 2025-11-15 09:48:22 +08:00
LostRuins Concedo
357bef3082 add toggle for jinja tools 2025-11-12 17:29:42 +08:00
LostRuins Concedo
95291a93df rosie fixes: add format normalization for tools and tool call streaming fixes (#1842) 2025-11-11 23:06:27 +08:00
LostRuins Concedo
cdc18f0945 linting (+1 squashed commits)
Squashed commits:

[994427d3c] linting
2025-11-10 20:54:44 +08:00
Wagner Bruna
2ae6bff5bd split memory detection functions and add debug command (#1832) 2025-11-10 18:07:15 +08:00
LostRuins Concedo
60a74bdd89 make tool calling work with jinja. but still need to fix qwen omni first (+1 squashed commits)
Squashed commits:

[e394da61e] make tool calling work with jinja. but still need to fix qwen omni first
2025-11-09 16:56:14 +08:00
LostRuins Concedo
055fdcef63 update model path
jinja tojson
2025-11-08 21:51:50 +08:00
LostRuins Concedo
af94884971 update props 2025-11-08 10:15:13 +08:00
LostRuins Concedo
92b5afc019 flag to show if jinja is enabled 2025-11-08 00:49:50 +08:00
LostRuins Concedo
462a34ed5b jinja is now working 2025-11-07 23:46:22 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00
LostRuins Concedo
978d755ddc escape clause for tool calling 2025-11-05 22:02:24 +08:00
LostRuins Concedo
3e4a33499f updated lite 2025-11-05 20:52:47 +08:00
LostRuins Concedo
6ddacb62a0 serve gzipped versions of files. added a modded lcpp gui with modified path handling and proper stream termination, see https://github.com/ggml-org/llama.cpp/pull/14839#issuecomment-3490987929 2025-11-05 20:40:30 +08:00
Concedo
333e2bb30b fix for qwen image crashing due to ref images being too big, trial and error shows it happens after 512x512 2025-11-02 01:31:01 +08:00
xzuyn
988baa544e add JobRate and JobCost to worker log (#1820)
- adds average jobs per hour
- adds average kudos earned per job
- change EarnRate to show 2 decimal places
2025-11-01 10:01:13 +08:00
Concedo
d229774e11 added compatibility endpoint for VITS api 2025-10-26 17:35:10 +08:00
Concedo
b730c99ecb fixed a typo 2025-10-26 10:06:59 +08:00
Concedo
57e1d9c822 rename blasbatchsize to batchsize 2025-10-24 18:16:54 +08:00
Concedo
68c9d955d2 support multiple override kv 2025-10-24 17:28:54 +08:00
Concedo
7446e03851 send logprobs in streaming for oai 2025-10-21 18:23:56 +08:00
Concedo
7d20e6bdb3 updated layer count to be more accurate +1 instead of +3 2025-10-18 15:29:07 +08:00
Concedo
f6916ba864 updated sdui 2025-10-17 13:56:45 +08:00
Concedo
45a02ae534 rename blas to just batching 2025-10-16 16:27:51 +08:00
Concedo
4eaf05dfeb handle oai without v1 prefix 2025-10-16 02:16:49 +08:00
Concedo
dfeccea3a1 added shitty fractional scaling support for GNOME. but really just use KDE 2025-10-15 22:28:04 +08:00
Concedo
8b787866c6 fixed a typo 2025-10-13 11:14:38 +08:00
Concedo
1a360b8458 sdcpp: optimize the handling of the FeedForward precision fix (+1 squashed commits)
Squashed commits:

[621ff6392] sdcpp: optimize the handling of the FeedForward precision fix (+1 squashed commits)

Squashed commits:

[05b16906c] sdcpp: optimize the handling of the FeedForward precision fix
2025-10-12 17:49:38 +08:00
Concedo
a0ed446e61 handle numbers outside int32 range with wrapping 2025-10-12 12:46:45 +08:00
Wagner Bruna
9f9494cf3f sd: add 'default' to the list of supported samplers (#1788) 2025-10-12 12:35:56 +08:00