Commit graph

1190 commits

Author SHA1 Message Date
Concedo
fedd529fdc autofit counts overheads 2025-12-21 14:31:08 +08:00
Concedo
9458e08346 fixed https://github.com/LostRuins/koboldcpp/issues/1892 2025-12-19 22:52:39 +08:00
Concedo
30fecac3a3 small tweak 2025-12-18 15:41:22 +08:00
Concedo
1e083d9c8b integrate autofit for upstream, removed forceversion 2025-12-17 18:42:47 +08:00
Concedo
9bc724f86c rearrange some elements in launcher 2025-12-17 17:00:26 +08:00
Concedo
cacfa37611 wip 2025-12-17 16:04:45 +08:00
Concedo
bca0258c2a bump default gen amount by 128 to 896 2025-12-14 22:17:31 +08:00
Concedo
e46a6a2796 better int parser 2025-12-13 09:28:10 +08:00
Concedo
ab9bc6f2ae zimage cfg clamp is opt out with remove_limits 2025-12-13 09:20:00 +08:00
Concedo
b714fe19e2 allow easy clamping of max cfg and steps 2025-12-12 15:22:37 +08:00
Concedo
d07d2c1b39 stub loras endpoint for comfy 2025-12-11 22:48:38 +08:00
Concedo
fd0d0cab03 move pipeline parallelism to a --pipelineparallel launch flag 2025-12-11 21:03:41 +08:00
Concedo
b7428048fc try reduce pipeline parallelism in order to reduce compute buffer sizes 2025-12-11 14:30:38 +08:00
Concedo
8a18e094f5 added smartcaching implementation inspired from Pento95 (+2 squashed commit)
Squashed commit:

[fcc498688] wip basic smart caching test

[b6e8b2577] wip basic smart caching test
2025-12-10 18:00:03 +08:00
Concedo
242ae8b8f3 http get cleanup 2025-12-08 19:51:55 +08:00
Concedo
8c17541cc0 modify llama.cpp branding on lcpp ui (+1 squashed commits)
Squashed commits:

[067343edf] modify llama.cpp branding on lcpp ui
2025-12-07 12:53:33 +08:00
Concedo
12d11cee5c added url to docs 2025-12-06 09:21:56 +08:00
Concedo
3550265249 add indent to kcpps files 2025-12-04 21:26:44 +08:00
Concedo
177e0d7515 strip_oaicontent_of_media placeholder (+2 squashed commit)
Squashed commit:

[7ccd52ef4] placeholder

[71fd2d7bb] strip_oaicontent_of_media
2025-12-01 01:29:57 +08:00
Concedo
bf5efcf86d Merge commit 'd82b7a7c1d' into concedo_experimental
# Conflicts:
#	ci/run.sh
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cuda/common.cuh
#	tests/CMakeLists.txt
2025-11-30 15:43:11 +08:00
Concedo
2985575be4 allow assistant prefills, fixed showgui issue 2025-11-30 12:52:28 +08:00
Concedo
925e7f8f6d added a secondary terminal mirror for linux 2025-11-29 21:53:51 +08:00
Concedo
9999b8950d cleaner resizing 2025-11-29 18:01:49 +08:00
Concedo
eda4a312cb Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/vulkan.Dockerfile
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-sycl/common.hpp
#	tests/test-backend-ops.cpp
#	tools/server/README.md
2025-11-28 13:22:02 +08:00
Concedo
e570478275 limit cuda arches + scale tweaks 2025-11-28 13:05:11 +08:00
Concedo
7527f1eff0 handle media for jinja path (+1 squashed commits)
Squashed commits:

[29d47d6b7] handle media for jinja path
2025-11-27 11:40:08 +08:00
Concedo
d68f4a5ae5 disable clip fa for now 2025-11-27 10:20:38 +08:00
Concedo
2b00292bfe display path on 404 2025-11-27 10:07:08 +08:00
Concedo
c12f9e3b7c bump version 2025-11-27 01:04:09 +08:00
Wagner Bruna
998dfcd1be sd: add an API endpoint to list the available schedulers (#1856) 2025-11-26 22:49:36 +08:00
Concedo
d9b9c54393 added another alias to jinjatools 2025-11-26 18:52:08 +08:00
Concedo
9b6320cd71 adjust launcher scaling behavior 2025-11-25 21:32:03 +08:00
Concedo
9a7f749f7c minor tweak for sd 2025-11-24 22:31:03 +08:00
Wagner Bruna
3a7dd1a97f sd: sync to master-358-347710f
Also adapt Koboldcpp LoRA loading function, and add
backend support for lora_apply_mode.
2025-11-23 19:28:54 -03:00
Concedo
1cc4403cba updated llama.cpp web ui (+2 squashed commit)
Squashed commit:

[9b22ac6e4] more fixes for lcpp web ui,. will be squashed

[522b59b4c] henky tries using svelte or something
2025-11-24 00:43:27 +08:00
Rose
eeb7363985 improvements to tool calling logic (merged changes from old PR branch) (#1855)
* improvements to tool calling logic (merged changes from old PR branch)

* added some tweaks for improved tool calls to reuse old ctx, but needs testing. refer to PR.

* fixes to some stuff that concedo's modifications broke

* fixed error in reasoning

* extremely hacky way to cache tool list please fix

* oops forgot to add this

* slightly less hacky way to preserve the tool list in context

* prevented unintended toolcalls from happening when LLM states something irrelevant to toolcall decision

* fixed something that broke koboldlite

* fixed bug added by concedo that broke jinja tools

* experimental further compression of tools array, needs testing

* reverted experimental further compression of tools array

* final cleanup

* add newline after memory insert

* changed tool reasoning to always be in json format to enforce including final decision

* used new json format to skip extra llm call when not necessary

* more catching of possible bad llm output

* further cleanup

* got it down to just one llm call!

* better json format

* even better json format

* further refinement to json format

* further refinement to json format

* fixed broken tool calling

* single-call enforced json method now seems to work well. removed fallbacks as they are no longer required.

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-11-23 22:41:31 +08:00
Concedo
b281d2554a add a resume after horde worker pauses 2025-11-18 22:48:18 +08:00
Concedo
8631bbcee3 linting 2025-11-18 18:56:31 +08:00
LostRuins Concedo
281542aa0d add smoothing curve, not tested 2025-11-17 23:07:35 +08:00
LostRuins Concedo
ea22e04320 filename insensitive search for adapters 2025-11-15 09:48:22 +08:00
LostRuins Concedo
357bef3082 add toggle for jinja tools 2025-11-12 17:29:42 +08:00
LostRuins Concedo
95291a93df rosie fixes: add format normalization for tools and tool call streaming fixes (#1842) 2025-11-11 23:06:27 +08:00
LostRuins Concedo
cdc18f0945 linting (+1 squashed commits)
Squashed commits:

[994427d3c] linting
2025-11-10 20:54:44 +08:00
Wagner Bruna
2ae6bff5bd split memory detection functions and add debug command (#1832) 2025-11-10 18:07:15 +08:00
LostRuins Concedo
60a74bdd89 make tool calling work with jinja. but still need to fix qwen omni first (+1 squashed commits)
Squashed commits:

[e394da61e] make tool calling work with jinja. but still need to fix qwen omni first
2025-11-09 16:56:14 +08:00
LostRuins Concedo
055fdcef63 update model path
jinja tojson
2025-11-08 21:51:50 +08:00
LostRuins Concedo
af94884971 update props 2025-11-08 10:15:13 +08:00
LostRuins Concedo
92b5afc019 flag to show if jinja is enabled 2025-11-08 00:49:50 +08:00
LostRuins Concedo
462a34ed5b jinja is now working 2025-11-07 23:46:22 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00