Concedo
bf5efcf86d
Merge commit 'd82b7a7c1d' into concedo_experimental
...
# Conflicts:
# ci/run.sh
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cuda/common.cuh
# tests/CMakeLists.txt
2025-11-30 15:43:11 +08:00
Concedo
2985575be4
allow assistant prefills, fixed showgui issue
2025-11-30 12:52:28 +08:00
Concedo
925e7f8f6d
added a secondary terminal mirror for linux
2025-11-29 21:53:51 +08:00
Concedo
9999b8950d
cleaner resizing
2025-11-29 18:01:49 +08:00
Concedo
eda4a312cb
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/vulkan.Dockerfile
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-sycl/common.hpp
# tests/test-backend-ops.cpp
# tools/server/README.md
2025-11-28 13:22:02 +08:00
Concedo
e570478275
limit cuda arches + scale tweaks
2025-11-28 13:05:11 +08:00
Concedo
7527f1eff0
handle media for jinja path (+1 squashed commits)
...
Squashed commits:
[29d47d6b7] handle media for jinja path
2025-11-27 11:40:08 +08:00
Concedo
d68f4a5ae5
disable clip fa for now
2025-11-27 10:20:38 +08:00
Concedo
2b00292bfe
display path on 404
2025-11-27 10:07:08 +08:00
Concedo
c12f9e3b7c
bump version
2025-11-27 01:04:09 +08:00
Wagner Bruna
998dfcd1be
sd: add an API endpoint to list the available schedulers (#1856)
2025-11-26 22:49:36 +08:00
Concedo
d9b9c54393
added another alias to jinjatools
2025-11-26 18:52:08 +08:00
Concedo
9b6320cd71
adjust launcher scaling behavior
2025-11-25 21:32:03 +08:00
Concedo
9a7f749f7c
minor tweak for sd
2025-11-24 22:31:03 +08:00
Wagner Bruna
3a7dd1a97f
sd: sync to master-358-347710f
...
Also adapt Koboldcpp LoRA loading function, and add
backend support for lora_apply_mode.
2025-11-23 19:28:54 -03:00
Concedo
1cc4403cba
updated llama.cpp web ui (+2 squashed commit)
...
Squashed commit:
[9b22ac6e4] more fixes for lcpp web ui, will be squashed
[522b59b4c] henky tries using svelte or something
2025-11-24 00:43:27 +08:00
Rose
eeb7363985
improvements to tool calling logic (merged changes from old PR branch) (#1855)
...
* improvements to tool calling logic (merged changes from old PR branch)
* added some tweaks for improved tool calls to reuse old ctx, but needs testing. refer to PR.
* fixes to some stuff that concedo's modifications broke
* fixed error in reasoning
* extremely hacky way to cache tool list please fix
* oops forgot to add this
* slightly less hacky way to preserve the tool list in context
* prevented unintended toolcalls from happening when LLM states something irrelevant to toolcall decision
* fixed something that broke koboldlite
* fixed bug added by concedo that broke jinja tools
* experimental further compression of tools array, needs testing
* reverted experimental further compression of tools array
* final cleanup
* add newline after memory insert
* changed tool reasoning to always be in json format to enforce including final decision
* used new json format to skip extra llm call when not necessary
* more catching of possible bad llm output
* further cleanup
* got it down to just one llm call!
* better json format
* even better json format
* further refinement to json format
* further refinement to json format
* fixed broken tool calling
* single-call enforced json method now seems to work well. removed fallbacks as they are no longer required.
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-11-23 22:41:31 +08:00
Concedo
b281d2554a
add a resume after horde worker pauses
2025-11-18 22:48:18 +08:00
Concedo
8631bbcee3
linting
2025-11-18 18:56:31 +08:00
LostRuins Concedo
281542aa0d
add smoothing curve, not tested
2025-11-17 23:07:35 +08:00
LostRuins Concedo
ea22e04320
filename insensitive search for adapters
2025-11-15 09:48:22 +08:00
LostRuins Concedo
357bef3082
add toggle for jinja tools
2025-11-12 17:29:42 +08:00
LostRuins Concedo
95291a93df
rosie fixes: add format normalization for tools and tool call streaming fixes (#1842)
2025-11-11 23:06:27 +08:00
LostRuins Concedo
cdc18f0945
linting (+1 squashed commits)
...
Squashed commits:
[994427d3c] linting
2025-11-10 20:54:44 +08:00
Wagner Bruna
2ae6bff5bd
split memory detection functions and add debug command (#1832)
2025-11-10 18:07:15 +08:00
LostRuins Concedo
60a74bdd89
make tool calling work with jinja. but still need to fix qwen omni first (+1 squashed commits)
...
Squashed commits:
[e394da61e] make tool calling work with jinja. but still need to fix qwen omni first
2025-11-09 16:56:14 +08:00
LostRuins Concedo
055fdcef63
update model path
...
jinja tojson
2025-11-08 21:51:50 +08:00
LostRuins Concedo
af94884971
update props
2025-11-08 10:15:13 +08:00
LostRuins Concedo
92b5afc019
flag to show if jinja is enabled
2025-11-08 00:49:50 +08:00
LostRuins Concedo
462a34ed5b
jinja is now working
2025-11-07 23:46:22 +08:00
LostRuins Concedo
cfb22b5c9d
rename a missed BLAS -> batch
2025-11-06 16:11:26 +08:00
LostRuins Concedo
978d755ddc
escape clause for tool calling
2025-11-05 22:02:24 +08:00
LostRuins Concedo
3e4a33499f
updated lite
2025-11-05 20:52:47 +08:00
LostRuins Concedo
6ddacb62a0
serve gzipped versions of files. added a modded lcpp gui with modified path handling and proper stream termination, see https://github.com/ggml-org/llama.cpp/pull/14839#issuecomment-3490987929
2025-11-05 20:40:30 +08:00
Concedo
333e2bb30b
fix for qwen image crashing due to ref images being too big, trial and error shows it happens after 512x512
2025-11-02 01:31:01 +08:00
xzuyn
988baa544e
add JobRate and JobCost to worker log (#1820)
...
- adds average jobs per hour
- adds average kudos earned per job
- change EarnRate to show 2 decimal places
2025-11-01 10:01:13 +08:00
Concedo
d229774e11
added compatibility endpoint for VITS api
2025-10-26 17:35:10 +08:00
Concedo
b730c99ecb
fixed a typo
2025-10-26 10:06:59 +08:00
Concedo
57e1d9c822
rename blasbatchsize to batchsize
2025-10-24 18:16:54 +08:00
Concedo
68c9d955d2
support multiple override kv
2025-10-24 17:28:54 +08:00
Concedo
7446e03851
send logprobs in streaming for oai
2025-10-21 18:23:56 +08:00
Concedo
7d20e6bdb3
updated layer count to be more accurate: +1 instead of +3
2025-10-18 15:29:07 +08:00
Concedo
f6916ba864
updated sdui
2025-10-17 13:56:45 +08:00
Concedo
45a02ae534
rename blas to just batching
2025-10-16 16:27:51 +08:00
Concedo
4eaf05dfeb
handle oai without v1 prefix
2025-10-16 02:16:49 +08:00
Concedo
dfeccea3a1
added shitty fractional scaling support for GNOME. but really just use KDE
2025-10-15 22:28:04 +08:00
Concedo
8b787866c6
fixed a typo
2025-10-13 11:14:38 +08:00
Concedo
1a360b8458
sdcpp: optimize the handling of the FeedForward precision fix (+1 squashed commits)
...
Squashed commits:
[621ff6392] sdcpp: optimize the handling of the FeedForward precision fix (+1 squashed commits)
Squashed commits:
[05b16906c] sdcpp: optimize the handling of the FeedForward precision fix
2025-10-12 17:49:38 +08:00
Concedo
a0ed446e61
handle numbers outside int32 range with wrapping
2025-10-12 12:46:45 +08:00
Wagner Bruna
9f9494cf3f
sd: add 'default' to the list of supported samplers (#1788)
2025-10-12 12:35:56 +08:00