Commit graph

1250 commits

Author SHA1 Message Date
Concedo
7f485e5287 remove CLBlast, part 1 2026-01-23 13:50:12 +08:00
Concedo
28091dec43 pipeline parallel default enable 2026-01-21 17:57:41 +08:00
Concedo
cdd6578a9a esrgan added 2026-01-20 22:10:37 +08:00
Concedo
c9c15749e0 wip on adding esrgan upscaling 2026-01-20 00:35:35 +08:00
Concedo
d827494f17 fix text for vae (+1 squashed commits)
Squashed commits:

[793caed19] fix text
2026-01-19 01:50:07 +08:00
Concedo
7f618454ff Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/labeler.yml
#	CODEOWNERS
#	docs/backend/OPENCL.md
#	docs/ops.md
#	docs/ops/CANN.csv
#	docs/ops/WebGPU.csv
#	ggml/src/ggml-blas/CMakeLists.txt
#	ggml/src/ggml-opencl/kernels/mul_mv_q6_k.cl
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/cpy.tmpl.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/set_rows.wgsl
#	tests/test-backend-ops.cpp
2026-01-18 23:24:29 +08:00
Llama
95ebfdcde8
Add token ids to logprob data returned by the API (#1928)
Previously, logprobs only contained the token string
and byte data, as well as the log probability itself.
For workflows that require the token id, translating
from the token bytes to the token id is potentially
costly and unreliable. It is simple and inexpensive
to expose the numeric token ids directly instead.
2026-01-18 16:30:46 +08:00
Concedo
7b4517c2fe embeddings memory usage regression fix 2026-01-18 16:26:52 +08:00
Concedo
3816391a74 increase logprobs returned to 10 2026-01-18 11:13:42 +08:00
Concedo
22ddad81b9 device override set in gui 2026-01-18 10:54:20 +08:00
Concedo
89a205ecc7 bump version 2026-01-17 19:09:14 +08:00
Concedo
62bea5ef4f allow overriding the devices directly 2026-01-17 19:08:06 +08:00
Concedo
8855a7f52b Merge commit 'c945aaaef2' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	README.md
#	common/CMakeLists.txt
#	common/chat.cpp
#	docs/function-calling.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/peg-parser/tests.h
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tests/testing.h
#	tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
a5204d2363 fixed mcp command location 2026-01-17 00:09:46 +08:00
Concedo
c332bb614c better mcp error messages 2026-01-16 17:55:34 +08:00
Concedo
612c19afe7 interrogate max length increased 2026-01-13 11:06:05 +08:00
Concedo
3752040165 default to continue assistant turns 2026-01-12 23:12:27 +08:00
Concedo
fc51d8b216 fix prop type for tools 2026-01-12 18:28:00 +08:00
Concedo
4bf6d9eb9a trying with fa on by default 2026-01-07 11:38:45 +08:00
Concedo
3108fe740c Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	examples/model-conversion/logits.cpp
#	examples/retrieval/retrieval.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/unary_op.wgsl
2026-01-06 20:49:01 +08:00
Concedo
bd51d775be Merge branch 'concedo' into concedo_experimental 2026-01-05 21:04:42 +08:00
Concedo
1fc405b8b6 1.105.4 2026-01-05 21:01:42 +08:00
Concedo
c9308570b2 added mcp to list of capabilities, allow it to run standalone 2026-01-05 20:32:25 +08:00
Concedo
301a04adfc Merge branch 'concedo' into concedo_experimental 2026-01-05 15:24:43 +08:00
Concedo
9a4eeafbfc hotfix 1.105.3 2026-01-05 15:24:21 +08:00
Concedo
4d3866a016 mcp proxy is done 2026-01-05 12:24:43 +08:00
Concedo
91089ad1bd wip on mcp 2026-01-04 22:52:47 +08:00
Concedo
01c70a7d3d allow transcribe to be used with the LLM instead if no whisper model exists 2026-01-04 11:06:05 +08:00
Concedo
e4abf643fa Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-hexagon/htp/act-ops.c
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	src/CMakeLists.txt
#	src/llama-vocab.cpp
2026-01-03 15:37:30 +08:00
Concedo
77082dddfb mcp image handling 2026-01-03 00:03:05 +08:00
Concedo
d8942cde14 smartcache allow custom number of slots 2026-01-02 17:19:40 +08:00
Concedo
0a23388e7d added images in tool call queries 2026-01-02 10:48:34 +08:00
Concedo
442fa7cd7c support for circular textures in sdcpp 2026-01-01 16:34:09 +08:00
Concedo
03df0c40f3 if gendefaults is set, horde has debug flag 2026-01-01 00:54:57 +08:00
Concedo
329c0e7e32 mini qol to prevent fake tool calls 2025-12-29 17:54:27 +08:00
Concedo
58d8635827 fixed autofit 2025-12-28 23:15:06 +08:00
Concedo
07fb18a04b handle case differences 2025-12-28 21:41:56 +08:00
Concedo
21d801f6d5 init total weight for adaptive p 2025-12-28 15:33:06 +08:00
Concedo
ec95655f3c fixed default handling for special keys 2025-12-28 13:56:05 +08:00
Concedo
27261bfc26 adaptive decay as an overridable param (+1 squashed commits)
Squashed commits:

[d94df7843] adaptive decay as an overridable param
2025-12-28 13:34:20 +08:00
Concedo
1051313cb2 added deprecated item sdgendefaults (+1 squashed commits)
Squashed commits:

[efc14a5d9] fixed sd error
2025-12-27 22:47:43 +08:00
Concedo
f5282e114d allow ANY api field to have specified defaults, and to be overwritten by value specified at load time 2025-12-27 18:57:04 +08:00
Concedo
6548645aaa rename power law sampler to adaptive p 2025-12-27 17:50:58 +08:00
Concedo
91d8863f18 power law sampler added 2025-12-27 09:46:06 +08:00
Concedo
399fc9c57e rename tokens tab to context, move fa to hardware 2025-12-26 00:06:07 +08:00
Concedo
cf4201e213 wip power law sampling 2025-12-25 22:01:16 +08:00
Concedo
afe41b6eea Merge branch 'concedo_experimental' of https://github.com/LostRuins/koboldcpp into concedo_experimental 2025-12-24 23:42:52 +08:00
Concedo
d1983959d2 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/release.yml
#	AGENTS.md
#	common/CMakeLists.txt
#	docs/development/parsing.md
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-vulkan/ggml-vulkan.cpp
#	tests/test-arg-parser.cpp
#	tests/test-backend-ops.cpp
#	tests/test-grammar-llguidance.cpp
#	tests/test-tokenizer-0.cpp
#	tests/test-tokenizer-1-bpe.cpp
#	tests/test-tokenizer-1-spm.cpp
#	tools/batched-bench/batched-bench.cpp
#	tools/cli/cli.cpp
#	tools/llama-bench/llama-bench.cpp
#	tools/server/README.md
2025-12-24 23:42:28 +08:00
Wagner Bruna
f30da43b7f
sd: get the available schedulers directly from sd.cpp (#1900)
Avoids a hardcoded list on the Python side.
2025-12-24 21:55:24 +08:00
Concedo
26d89bf589 support for downloading AVI from sdui 2025-12-24 18:40:10 +08:00