Concedo
4090400dff
improved gemma toolcall handling
2026-04-25 09:51:29 +08:00
Concedo
cfb14bd844
fixed more args
2026-04-23 11:11:24 +08:00
Concedo
68e238857f
fixed args
2026-04-23 11:00:42 +08:00
Concedo
c818716f57
router mode fixed for parallel requests
2026-04-21 22:33:46 +08:00
Concedo
96ec87127a
updated colab, handle connection dropping during prompt processing
2026-04-21 21:46:13 +08:00
Concedo
1feba4e4ea
fixed koboldcpp.sh, fixed vision max/min when one param is missing, fixed processing count wrong, updated lite
2026-04-21 18:36:47 +08:00
Concedo
c17ba99812
change time.sleep to asyncio
2026-04-20 23:25:35 +08:00
Concedo
fe4c1b80a1
fix unwanted error print
2026-04-20 13:48:57 +08:00
Concedo
a8290a072f
more robust json field handling
2026-04-19 23:27:19 +08:00
Concedo
707bb67b30
minimal uses 10% of budget
2026-04-19 20:19:45 +08:00
Concedo
71b4107bb6
fixed terminal logs
2026-04-19 11:31:12 +08:00
Concedo
8886e48a4a
cache sd info
2026-04-19 02:19:11 +08:00
Wagner Bruna
1be08b9d15
sd: report all sampler aliases and centralize name mapping (#2149)
...
* debug: allow loading backend libraries without normal arg parsing
This is just to be able to test backend functions directly, with e.g.:
>>> import koboldcpp
>>> koboldcpp.init_libraries()
>>> koboldcpp.sd_get_info()
* sd: report all sampler aliases and centralize name mapping
2026-04-19 01:51:42 +08:00
Concedo
e5eab545f3
handle override jinja template
2026-04-19 00:30:28 +08:00
Concedo
17c754a5fc
improved reasoning budget
2026-04-18 17:19:09 +08:00
Concedo
0b37cb9a57
added preliminary support for reasoning budget
2026-04-18 11:56:33 +08:00
Concedo
9a38091207
support q5_1 kv
2026-04-17 17:06:15 +08:00
Concedo
e074939c17
compact context GUI page (+1 squashed commits)
...
Squashed commits:
[136f073ce] compact context GUI page
2026-04-17 14:40:53 +08:00
Concedo
aed18cc901
swa padding default to 0
2026-04-17 10:54:14 +08:00
Concedo
ae292c496e
handle SWA conflicting with rewind, increased default SWA padding.
2026-04-16 17:00:26 +08:00
Concedo
0251c6dbde
added swa padding controls
2026-04-16 16:21:48 +08:00
Concedo
a9e817fb4c
smartcache off when fastforward off
2026-04-16 15:29:23 +08:00
Concedo
535df844dd
touchup for min/max tokens ui
2026-04-16 14:56:22 +08:00
Llama
c592bd01da
Pass img_min_params and img_max_params to ctx_clip_params (#2133)
...
* Pass img_min_params and img_max_params to ctx_clip_params
These values determine the minimum and maximum size (in
tokens) of vision embeddings. The default value of -1
uses a model-dependent default size, for example for
Gemma 4 the default is a 280 token embedding. For higher
quality results (at the cost of using more memory and
slower speed) you can increase the size of the embedding
to 1120 tokens.
* Change dict to mydict to match change to method
2026-04-16 12:27:06 +08:00
Concedo
a9f9e9a38b
rename the filepaths for clarity (+1 squashed commits)
...
Squashed commits:
[fa8fc6914] rename the filepaths for clarity
2026-04-16 12:17:23 +08:00
Concedo
45737effd3
refactor for clarity
2026-04-16 10:53:35 +08:00
Rose
2f67e9f096
new baseconfig setting that works in router mode (#2130)
...
* new baseconfig setting that works in router mode
* re-added fix that prevents unnecessary model reload
* fixed the fix
* swapped order of baseconfig <-> override
* fix indent
* simplify baseconfig, if specified AND restart_override_config_target is NOT, it simply replaces the field (+1 squashed commits)
Squashed commits:
[95e816b16] simplify baseconfig, if specified AND restart_override_config_target is NOT, it simply replaces the field
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2026-04-15 22:50:47 +08:00
Concedo
c6b59fc2c7
autoswap some edge conditions
2026-04-14 23:02:29 +08:00
Concedo
3f810dc8c7
fixed preload story import for large stories
2026-04-13 23:27:55 +08:00
Concedo
c984147c84
fix quotes
2026-04-13 22:50:08 +08:00
Concedo
5a3369fd2a
support for gpt oss jinja
2026-04-12 16:13:51 +08:00
Concedo
4084917cab
fixed token counting limit (+1 squashed commits)
...
Squashed commits:
[314528eb2] fixed token counting limit, set to max supported ctx of 256k
2026-04-12 15:36:03 +08:00
Concedo
f07dcbf7af
allow tokencount to handle messages
2026-04-12 11:46:37 +08:00
Concedo
6556161804
jinja tool streaming is now finally working
2026-04-12 02:05:39 +08:00
Concedo
c4abba8868
almost working
2026-04-12 01:44:41 +08:00
Concedo
3175da0873
cleanup - do not use tool calls from kai api, only
2026-04-11 12:19:48 +08:00
Wagner Bruna
f4fbd94129
sdapi: add job_timestamp field to info result (#2110)
2026-04-11 09:28:53 +08:00
Concedo
0f278d93b3
better image handling in jinja
2026-04-10 22:19:01 +08:00
Concedo
b962335c99
fix rosie toolcall issue
2026-04-10 21:23:33 +08:00
Concedo
bcf499e5bf
fix gemma tool calling
2026-04-10 20:51:40 +08:00
Concedo
ffdc3ba49e
tool coercion fixes
2026-04-10 18:25:07 +08:00
Concedo
618db91e3d
should pass tc08 now
2026-04-09 23:15:34 +08:00
Concedo
cfcbfd571a
fix think leaking in sync mode
2026-04-09 21:29:56 +08:00
Concedo
c82c0b463a
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/labeler.yml
# .github/workflows/release.yml
# examples/debug/debug.cpp
# ggml/src/ggml-cuda/common.cuh
# ggml/src/ggml-cuda/mmq.cuh
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# src/llama-vocab.cpp
# tests/test-backend-ops.cpp
# tests/test-chat.cpp
# tests/test-json-schema-to-grammar.cpp
# tools/mtmd/CMakeLists.txt
2026-04-09 17:45:04 +08:00
Concedo
f6199d42e1
tool response type coercion
2026-04-09 12:59:57 +08:00
Concedo
77d0ddb486
even better tool calls
2026-04-08 23:40:42 +08:00
Concedo
d9ed4b444b
multiuser default 10
2026-04-07 23:42:29 +08:00
Concedo
5e16453f0c
fixed a bug in chat completions think handling
2026-04-07 00:16:34 +08:00
Concedo
a395af65db
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build-riscv.yml
# .github/workflows/build.yml
# ggml/src/ggml-hexagon/htp/argsort-ops.c
# ggml/src/ggml-sycl/fattn-tile.hpp
# tools/mtmd/CMakeLists.txt
2026-04-06 20:56:02 +08:00
Concedo
a309086735
Revert "increase debug mode truncation limit"
...
This reverts commit 59f863746d.
2026-04-06 18:51:12 +08:00