Concedo
fedd529fdc
autofit counts overheads
2025-12-21 14:31:08 +08:00
Concedo
9458e08346
fixed https://github.com/LostRuins/koboldcpp/issues/1892
2025-12-19 22:52:39 +08:00
Concedo
30fecac3a3
small tweak
2025-12-18 15:41:22 +08:00
Concedo
1e083d9c8b
integrate autofit for upstream, removed forceversion
2025-12-17 18:42:47 +08:00
Concedo
9bc724f86c
rearrange some elements in launcher
2025-12-17 17:00:26 +08:00
Concedo
cacfa37611
wip
2025-12-17 16:04:45 +08:00
Concedo
bca0258c2a
bump default gen amount by 128 to 896
2025-12-14 22:17:31 +08:00
Concedo
e46a6a2796
better int parser
2025-12-13 09:28:10 +08:00
Concedo
ab9bc6f2ae
zimage cfg clamp is opt out with remove_limits
2025-12-13 09:20:00 +08:00
Concedo
b714fe19e2
allow easy clamping of max cfg and steps
2025-12-12 15:22:37 +08:00
Concedo
d07d2c1b39
stub loras endpoint for comfy
2025-12-11 22:48:38 +08:00
Concedo
fd0d0cab03
move pipeline parallelism to a --pipelineparallel launch flag
2025-12-11 21:03:41 +08:00
Concedo
b7428048fc
try to reduce pipeline parallelism in order to reduce compute buffer sizes
2025-12-11 14:30:38 +08:00
Concedo
8a18e094f5
added smartcaching implementation inspired from Pento95 (+2 squashed commit)
...
Squashed commit:
[fcc498688] wip basic smart caching test
[b6e8b2577] wip basic smart caching test
2025-12-10 18:00:03 +08:00
Concedo
242ae8b8f3
http get cleanup
2025-12-08 19:51:55 +08:00
Concedo
8c17541cc0
modify llama.cpp branding on lcpp ui (+1 squashed commits)
...
Squashed commits:
[067343edf] modify llama.cpp branding on lcpp ui
2025-12-07 12:53:33 +08:00
Concedo
12d11cee5c
added url to docs
2025-12-06 09:21:56 +08:00
Concedo
3550265249
add indent to kcpps files
2025-12-04 21:26:44 +08:00
Concedo
177e0d7515
strip_oaicontent_of_media placeholder (+2 squashed commit)
...
Squashed commit:
[7ccd52ef4] placeholder
[71fd2d7bb] strip_oaicontent_of_media
2025-12-01 01:29:57 +08:00
Concedo
bf5efcf86d
Merge commit 'd82b7a7c1d' into concedo_experimental
...
# Conflicts:
# ci/run.sh
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cuda/common.cuh
# tests/CMakeLists.txt
2025-11-30 15:43:11 +08:00
Concedo
2985575be4
allow assistant prefills, fixed showgui issue
2025-11-30 12:52:28 +08:00
Concedo
925e7f8f6d
added a secondary terminal mirror for linux
2025-11-29 21:53:51 +08:00
Concedo
9999b8950d
cleaner resizing
2025-11-29 18:01:49 +08:00
Concedo
eda4a312cb
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/vulkan.Dockerfile
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-sycl/common.hpp
# tests/test-backend-ops.cpp
# tools/server/README.md
2025-11-28 13:22:02 +08:00
Concedo
e570478275
limit cuda arches + scale tweaks
2025-11-28 13:05:11 +08:00
Concedo
7527f1eff0
handle media for jinja path (+1 squashed commits)
...
Squashed commits:
[29d47d6b7] handle media for jinja path
2025-11-27 11:40:08 +08:00
Concedo
d68f4a5ae5
disable clip fa for now
2025-11-27 10:20:38 +08:00
Concedo
2b00292bfe
display path on 404
2025-11-27 10:07:08 +08:00
Concedo
c12f9e3b7c
bump version
2025-11-27 01:04:09 +08:00
Wagner Bruna
998dfcd1be
sd: add an API endpoint to list the available schedulers (#1856)
2025-11-26 22:49:36 +08:00
Concedo
d9b9c54393
added another alias to jinjatools
2025-11-26 18:52:08 +08:00
Concedo
9b6320cd71
adjust launcher scaling behavior
2025-11-25 21:32:03 +08:00
Concedo
9a7f749f7c
minor tweak for sd
2025-11-24 22:31:03 +08:00
Wagner Bruna
3a7dd1a97f
sd: sync to master-358-347710f
...
Also adapt Koboldcpp LoRA loading function, and add
backend support for lora_apply_mode.
2025-11-23 19:28:54 -03:00
Concedo
1cc4403cba
updated llama.cpp web ui (+2 squashed commit)
...
Squashed commit:
[9b22ac6e4] more fixes for lcpp web ui, will be squashed
[522b59b4c] henky tries using svelte or something
2025-11-24 00:43:27 +08:00
Rose
eeb7363985
improvements to tool calling logic (merged changes from old PR branch) (#1855)
...
* improvements to tool calling logic (merged changes from old PR branch)
* added some tweaks for improved tool calls to reuse old ctx, but needs testing. refer to PR.
* fixes to some stuff that concedo's modifications broke
* fixed error in reasoning
* extremely hacky way to cache tool list please fix
* oops forgot to add this
* slightly less hacky way to preserve the tool list in context
* prevented unintended toolcalls from happening when LLM states something irrelevant to toolcall decision
* fixed something that broke koboldlite
* fixed bug added by concedo that broke jinja tools
* experimental further compression of tools array, needs testing
* reverted experimental further compression of tools array
* final cleanup
* add newline after memory insert
* changed tool reasoning to always be in json format to enforce including final decision
* used new json format to skip extra llm call when not necessary
* more catching of possible bad llm output
* further cleanup
* got it down to just one llm call!
* better json format
* even better json format
* further refinement to json format
* further refinement to json format
* fixed broken tool calling
* single-call enforced json method now seems to work well. removed fallbacks as they are no longer required.
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-11-23 22:41:31 +08:00
Concedo
b281d2554a
add a resume after horde worker pauses
2025-11-18 22:48:18 +08:00
Concedo
8631bbcee3
linting
2025-11-18 18:56:31 +08:00
LostRuins Concedo
281542aa0d
add smoothing curve, not tested
2025-11-17 23:07:35 +08:00
LostRuins Concedo
ea22e04320
filename insensitive search for adapters
2025-11-15 09:48:22 +08:00
LostRuins Concedo
357bef3082
add toggle for jinja tools
2025-11-12 17:29:42 +08:00
LostRuins Concedo
95291a93df
rosie fixes: add format normalization for tools and tool call streaming fixes (#1842)
2025-11-11 23:06:27 +08:00
LostRuins Concedo
cdc18f0945
linting (+1 squashed commits)
...
Squashed commits:
[994427d3c] linting
2025-11-10 20:54:44 +08:00
Wagner Bruna
2ae6bff5bd
split memory detection functions and add debug command (#1832)
2025-11-10 18:07:15 +08:00
LostRuins Concedo
60a74bdd89
make tool calling work with jinja. but still need to fix qwen omni first (+1 squashed commits)
...
Squashed commits:
[e394da61e] make tool calling work with jinja. but still need to fix qwen omni first
2025-11-09 16:56:14 +08:00
LostRuins Concedo
055fdcef63
update model path
...
jinja tojson
2025-11-08 21:51:50 +08:00
LostRuins Concedo
af94884971
update props
2025-11-08 10:15:13 +08:00
LostRuins Concedo
92b5afc019
flag to show if jinja is enabled
2025-11-08 00:49:50 +08:00
LostRuins Concedo
462a34ed5b
jinja is now working
2025-11-07 23:46:22 +08:00
LostRuins Concedo
cfb22b5c9d
rename a missed BLAS -> batch
2025-11-06 16:11:26 +08:00