1272 commits

Author SHA1 Message Date
Concedo
5cf21443bc added autofit padding. autofit is now in the quick menu 2026-02-07 18:29:30 +08:00
Concedo
78bba94f72 autofit hides gpu layer inputs entirely 2026-02-07 17:17:19 +08:00
Concedo
bab1c4ca50 proper handling of common think tags in lcpp ui jinja mode 2026-02-07 17:05:21 +08:00
Concedo
a0a78dacc4 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	pyproject.toml
#	requirements/requirements-convert_legacy_llama.txt
#	src/CMakeLists.txt
#	src/llama-vocab.cpp
#	tests/test-backend-ops.cpp
2026-02-07 15:54:02 +08:00
Concedo
3d99b87506 add downloaddir 2026-02-06 14:34:02 +08:00
Reithan
de3ed7d7d6 add missing resolve_refs call to enable subschema use (#1959) 2026-02-05 22:12:59 +08:00
Concedo
ceb548f407 update text (+1 squashed commits)
Squashed commits:

[2a1532783] update text
2026-02-05 22:11:10 +08:00
Concedo
0e907e23fb Revamped help menu 2026-02-05 17:34:39 +08:00
Concedo
30c74d5cce fixed mcp bug 2026-02-04 20:46:55 +08:00
Concedo
4b073f3aa0 fix sse parsing in mcp 2026-02-04 20:38:33 +08:00
Concedo
349c461453 add stop reason for error 2026-02-04 20:23:18 +08:00
Wagner Bruna
d9ac52a01a sd: sync to master-492-f957fa3 (#1957)
* sd: sync to master-492-f957fa3

* add Res Multistep and Res 2s samplers

* make sdflashattention control flash_attn too
2026-02-04 16:12:39 +08:00
Concedo
dfa725c58d make the dpi fix more universal. not a perfect solution 2026-02-03 19:49:38 +08:00
Concedo
b13bf44285 kde fractional scaling fix, tooltip fix (+1 squashed commits)
Squashed commits:

[1cf02dcce] kde fractional scaling fix
2026-02-01 21:55:44 +08:00
Concedo
9ef5d34740 fix mcp cert issues 2026-02-01 16:48:37 +08:00
Concedo
ffdc1b0f9f flux2 image editing 2026-01-31 16:36:45 +08:00
Concedo
66e1913da6 fix blocked UA for mcp 2026-01-29 23:53:50 +08:00
Concedo
5c29510330 mcp try handle vscode 2026-01-29 23:39:36 +08:00
Concedo
cd6e087eeb include kde in the fractional scaling fix 2026-01-29 22:06:21 +08:00
Rose
deee1c2cfc fixed mcp stdio server tool listing (#1950) 2026-01-29 21:35:35 +08:00
Concedo
ef7fe1b5d4 make flash attention default in cli. added --noflashattention 2026-01-28 23:28:48 +08:00
Concedo
fec0d2bb4a pipeline parallel is default in cli now 2026-01-25 18:16:58 +08:00
Concedo
7f485e5287 remove CLBlast, part 1 2026-01-23 13:50:12 +08:00
Concedo
28091dec43 pipeline parallel default enable 2026-01-21 17:57:41 +08:00
Concedo
cdd6578a9a esrgan added 2026-01-20 22:10:37 +08:00
Concedo
c9c15749e0 wip on adding esrgan upscaling 2026-01-20 00:35:35 +08:00
Concedo
d827494f17 fix text for vae (+1 squashed commits)
Squashed commits:

[793caed19] fix text
2026-01-19 01:50:07 +08:00
Concedo
7f618454ff Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/labeler.yml
#	CODEOWNERS
#	docs/backend/OPENCL.md
#	docs/ops.md
#	docs/ops/CANN.csv
#	docs/ops/WebGPU.csv
#	ggml/src/ggml-blas/CMakeLists.txt
#	ggml/src/ggml-opencl/kernels/mul_mv_q6_k.cl
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/cpy.tmpl.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/set_rows.wgsl
#	tests/test-backend-ops.cpp
2026-01-18 23:24:29 +08:00
Llama
95ebfdcde8 Add token ids to logprob data returned by the API (#1928)
Previously, logprobs only contained the token string
and byte data, as well as the log probability itself.
For workflows that require the token id, translating
from the token bytes to the token id is potentially
costly and unreliable. It is simple and inexpensive
to expose the numeric token ids directly instead.
2026-01-18 16:30:46 +08:00
Concedo
7b4517c2fe embeddings memory usage regression fix 2026-01-18 16:26:52 +08:00
Concedo
3816391a74 increase logprobs returned to 10 2026-01-18 11:13:42 +08:00
Concedo
22ddad81b9 device override set in gui 2026-01-18 10:54:20 +08:00
Concedo
89a205ecc7 bump version 2026-01-17 19:09:14 +08:00
Concedo
62bea5ef4f allow overriding the devices directly 2026-01-17 19:08:06 +08:00
Concedo
8855a7f52b Merge commit 'c945aaaef2' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	README.md
#	common/CMakeLists.txt
#	common/chat.cpp
#	docs/function-calling.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/peg-parser/tests.h
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tests/testing.h
#	tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
a5204d2363 fixed mcp command location 2026-01-17 00:09:46 +08:00
Concedo
c332bb614c better mcp error messages 2026-01-16 17:55:34 +08:00
Concedo
612c19afe7 interrogate max length increased 2026-01-13 11:06:05 +08:00
Concedo
3752040165 default to continue assistant turns 2026-01-12 23:12:27 +08:00
Concedo
fc51d8b216 fix prop type for tools 2026-01-12 18:28:00 +08:00
Concedo
4bf6d9eb9a trying with fa on by default 2026-01-07 11:38:45 +08:00
Concedo
3108fe740c Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	examples/model-conversion/logits.cpp
#	examples/retrieval/retrieval.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/unary_op.wgsl
2026-01-06 20:49:01 +08:00
Concedo
bd51d775be Merge branch 'concedo' into concedo_experimental 2026-01-05 21:04:42 +08:00
Concedo
1fc405b8b6 1.105.4 2026-01-05 21:01:42 +08:00
Concedo
c9308570b2 added mcp to list of capabilities, allow it to run standalone 2026-01-05 20:32:25 +08:00
Concedo
301a04adfc Merge branch 'concedo' into concedo_experimental 2026-01-05 15:24:43 +08:00
Concedo
9a4eeafbfc hotfix 1.105.3 2026-01-05 15:24:21 +08:00
Concedo
4d3866a016 mcp proxy is done 2026-01-05 12:24:43 +08:00
Concedo
91089ad1bd wip on mcp 2026-01-04 22:52:47 +08:00
Concedo
01c70a7d3d allow transcribe to be used with the LLM instead if no whisper model exists 2026-01-04 11:06:05 +08:00