Commit graph

184 commits

Author SHA1 Message Date
Concedo
42ad89cd86 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.devops/cpu.Dockerfile
#	.devops/llama-cli-cann.Dockerfile
#	.devops/nix/package.nix
#	.github/workflows/build-android.yml
#	.github/workflows/build-cann.yml
#	.github/workflows/build-msys.yml
#	.github/workflows/docker.yml
#	.github/workflows/editorconfig.yml
#	.github/workflows/gguf-publish.yml
#	.github/workflows/python-lint.yml
#	.github/workflows/release.yml
#	CMakeLists.txt
#	docs/backend/CANN.md
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c
#	ggml/src/ggml-hexagon/htp/htp-ctx.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	scripts/sync_vendor.py
#	tests/test-chat-auto-parser.cpp
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
#	tests/test-reasoning-budget.cpp
#	tools/cli/cli.cpp
#	tools/server/CMakeLists.txt
#	tools/server/README.md
2026-03-30 20:45:38 +08:00
Aleksander Grygier
51a84efc53
webui: Conversation forking + branching improvements (#21021)
* refactor: Make `DialogConfirmation` extensible with children slot

* feat: Add conversation forking logic

* feat: Conversation forking UI

* feat: Update delete/edit dialogs and logic for forks

* refactor: Improve Chat Sidebar UX and add MCP Servers entry

* refactor: Cleanup

* feat: Update message in place when editing leaf nodes

* chore: Cleanup

* chore: Cleanup

* chore: Cleanup

* chore: Cleanup

* chore: Cleanup

* chore: Cleanup

* refactor: Post-review improvements

* chore: update webui build output

* test: Update Storybook test

* chore: update webui build output

* chore: update webui build output
2026-03-28 13:38:15 +01:00
Concedo
3ec6381123 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build-self-hosted.yml
#	.github/workflows/build.yml
#	.github/workflows/copilot-setup-steps.yml
#	.github/workflows/gguf-publish.yml
#	ci/run.sh
#	docs/backend/OPENVINO.md
#	examples/llama.android/lib/src/main/cpp/ai_chat.cpp
#	ggml/src/ggml-sycl/add-id.cpp
#	requirements/requirements-pydantic.txt
#	tests/test-gguf.cpp
#	tests/test-jinja.cpp
#	tests/test-llama-archs.cpp
#	tools/gguf-split/README.md
#	tools/llama-bench/llama-bench.cpp
2026-03-28 01:18:20 +08:00
Aleksander Grygier
e6f6770515
webui: Improve Chat Messages initial scroll + auto-scroll logic + add lazy loading with transitions to content blocks (#20999)
* refactor: Always use agentic content renderer for Assistant Message

* feat: Improve initial scroll + auto-scroll logic + implement fade in action for content blocks

* chore: update webui build output
2026-03-27 17:01:36 +01:00
Pascal
d0fa2c9fbb
Send reasoning content back to the model across turns via the reasoning_content API field (#21036)
* webui: send reasoning_content back to model in context

Preserve assistant reasoning across turns by extracting it from
internal tags and sending it as a separate reasoning_content field
in the API payload. The server and Jinja templates handle native
formatting (e.g. <think> tags for Qwen, GLM, DeepSeek...).

Adds "Exclude reasoning from context" toggle in Settings > Developer
(off by default, so reasoning is preserved). Includes unit tests.

* webui: add syncable parameter for excludeReasoningFromContext

* chore: update webui build output
2026-03-27 08:17:35 +01:00
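Note on d0fa2c9fbb: the message above describes extracting reasoning from internal tags and sending it in a separate reasoning_content field, unless "Exclude reasoning from context" is enabled. A minimal sketch of that idea follows. It is not the webui's actual code. The marker strings, the `StoredMessage` and `ApiMessage` types, and the `toApiMessage` helper are all hypothetical; only the `reasoning_content` field name and the `excludeReasoningFromContext` setting come from the commit message.

```typescript
// Hypothetical internal markers; the real tag format is not shown in this log.
const REASONING_OPEN = "<reasoning>";
const REASONING_CLOSE = "</reasoning>";

interface StoredMessage {
  role: "user" | "assistant";
  content: string; // assistant content may embed reasoning between the markers
}

interface ApiMessage {
  role: "user" | "assistant";
  content: string;
  reasoning_content?: string; // field name taken from the commit message
}

// Split stored assistant text into visible content and reasoning.
// With excludeReasoningFromContext = false (the default described in the
// commit), the reasoning is sent back as a separate field.
function toApiMessage(
  msg: StoredMessage,
  excludeReasoningFromContext = false
): ApiMessage {
  if (msg.role !== "assistant") {
    return { role: msg.role, content: msg.content };
  }
  const start = msg.content.indexOf(REASONING_OPEN);
  const end = msg.content.indexOf(REASONING_CLOSE);
  if (start === -1 || end === -1 || end < start) {
    // No well-formed reasoning span: pass the message through unchanged.
    return { role: msg.role, content: msg.content };
  }
  const reasoning = msg.content
    .slice(start + REASONING_OPEN.length, end)
    .trim();
  const visible = (
    msg.content.slice(0, start) +
    msg.content.slice(end + REASONING_CLOSE.length)
  ).trim();
  const out: ApiMessage = { role: msg.role, content: visible };
  if (!excludeReasoningFromContext && reasoning.length > 0) {
    out.reasoning_content = reasoning;
  }
  return out;
}
```

In this sketch the server side is left untouched: per the commit message, the server and Jinja templates are responsible for turning reasoning_content into the model's native format.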
Concedo
c00fe0af5a Merge commit '9f102a1407' into concedo_experimental
# Conflicts:
#	.devops/intel.Dockerfile
#	.github/ISSUE_TEMPLATE/010-bug-compilation.yml
#	.github/ISSUE_TEMPLATE/011-bug-results.yml
#	.github/pull_request_template.md
#	CODEOWNERS
#	README.md
#	common/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/binary-ops.c
#	ggml/src/ggml-hexagon/htp/hex-dma.c
#	ggml/src/ggml-hexagon/htp/hex-dma.h
#	ggml/src/ggml-hexagon/htp/hex-dump.h
#	ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c
#	ggml/src/ggml-hexagon/htp/hvx-utils.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hexagon/htp/ssm-conv.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	scripts/snapdragon/adb/run-bench.sh
#	scripts/sync_vendor.py
#	tests/test-backend-ops.cpp
#	tools/llama-bench/llama-bench.cpp
2026-03-25 23:45:41 +08:00
Concedo
8a6c41dc5c Merge commit '841bc203e2' into concedo_experimental
# Conflicts:
#	.github/workflows/ai-issues.yml
#	embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	tests/test-chat-auto-parser.cpp
#	tests/test-jinja.cpp
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/server/README.md
2026-03-25 22:49:53 +08:00
Aleksander Grygier
69e0ecef06
webui: Fix editing assistant message without branching (#20944)
* fix: Editing assistant response without branching

* chore: update webui build output
2026-03-25 12:47:33 +02:00
Pascal
062cca58fc
Add SLEEPING status to the WebUI model selector (#20949)
* webui: handle sleeping model status, fix favourite -> favorite

* Update tools/server/webui/src/lib/components/app/models/ModelsSelectorOption.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/models/ModelsSelectorOption.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: fix optional event parameter in sleeping model onclick

* typo

* webui: restore orange sleeping indicator dot with hover unload

* chore: update webui build output

* webui: move stopPropagation into ActionIcon onclick, remove svelte-ignore

* chore: update webui build output

* webui: fix favourite -> favorite (UK -> US spelling) everywhere

Address review feedback from WhyNotHugo

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2026-03-25 11:02:32 +01:00
BlueMöhre
a94fdb090a
WebUI: fix edit msg form textarea height (#20830)
* autoresize textarea on mount

* allow textarea to grow to same height as rendered messages

* add UI build file
2026-03-24 13:17:45 +01:00
Aleksander Grygier
11fb11b901
webui: Improve chat form positioning (#20901)
2026-03-23 14:30:55 +01:00
Pascal
c44a932cf4
webui: fix --webui-config-file settings not applied on load (#20823)
* webui: fix --webui-config-file settings not applied on load

* chore: update webui build output
2026-03-23 11:25:35 +01:00
Concedo
ef854f002e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/python-type-check.yml
#	AGENTS.md
#	CONTRIBUTING.md
#	examples/model-conversion/scripts/embedding/run-original-model.py
#	examples/model-conversion/scripts/utils/compare_tokens.py
#	examples/pydantic_models_to_grammar.py
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	pyrightconfig.json
#	scripts/compare-llama-bench.py
#	scripts/jinja/jinja-tester.py
#	scripts/server-bench.py
#	tests/test-grammar-integration.cpp
#	tests/test-grammar-parser.cpp
#	tests/test-llama-grammar.cpp
#	tests/test-tokenizer-random.py
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/llama-bench/llama-bench.cpp
#	tools/server/README.md
2026-03-22 23:39:13 +08:00
ddh0
3306dbaef7
misc : prefer ggml-org models in docs and examples (#20827)
* misc : prefer ggml-org models in docs and examples

Prefer referring to known-good quantizations under ggml-org rather than
3rd-party uploaders.

* remove accidentally committed file
2026-03-21 22:00:26 +01:00
Concedo
6054bacadd Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/ai-issues.yml
#	CONTRIBUTING.md
#	docs/autoparser.md
#	docs/ops.md
#	docs/ops/Metal.csv
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/hex-dma.h
#	ggml/src/ggml-hexagon/htp/hex-utils.h
#	ggml/src/ggml-hexagon/htp/htp-ctx.h
#	ggml/src/ggml-hexagon/htp/htp-msg.h
#	ggml/src/ggml-hexagon/htp/htp_iface.idl
#	ggml/src/ggml-hexagon/htp/hvx-base.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hip/CMakeLists.txt
#	models/templates/Apriel-1.6-15b-Thinker-fixed.jinja
#	models/templates/deepseek-ai-DeepSeek-R1-Distill-Qwen-32B.jinja
#	models/templates/deepseek-ai-DeepSeek-V3.1.jinja
#	models/templates/llama-cpp-deepseek-r1.jinja
#	models/templates/meetkai-functionary-medium-v3.1.jinja
#	scripts/fetch_server_test_models.py
#	scripts/snapdragon/adb/run-cli.sh
#	scripts/snapdragon/adb/run-completion.sh
#	scripts/snapdragon/adb/run-mtmd.sh
#	scripts/snapdragon/adb/run-tool.sh
#	tests/test-chat-auto-parser.cpp
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat.cpp
#	tools/cli/cli.cpp
#	tools/server/README.md
2026-03-21 12:06:01 +08:00
Concedo
98f099aecc Merge commit 'c1258830b2' into concedo_experimental
# Conflicts:
#	docs/docker.md
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/get_rows.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/row_norm.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/unary.wgsl
2026-03-21 12:00:52 +08:00
Piotr Wilkin (ilintar)
5e54d51b19
common/parser: add proper reasoning tag prefill reading (#20424)
* Implement proper prefill extraction

* Refactor cli parameters, update docs, move reasoning budget sampler part to common/reasoning-budget.cpp

* Update tools/server/server-task.cpp

* refactor: move grammars to variant, remove grammar_external, handle exception internally

* Make code less C++y

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-03-19 16:58:21 +01:00
Pascal
4065c1a3a6
Server becomes the source of truth for sampling parameter defaults (#20558)
* webui: make server the source of truth for sampling defaults

* webui: fix Custom badge for sampling parameters

* webui: log user overrides after server sync

* chore: update webui build output

* fix: Default values for sampling settings config object

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2026-03-19 13:20:39 +01:00
Pascal
cd708db0cc
WebUI: Persist the on/off state of the MCP servers for new conversations (#20750)
* webui: add persistent storage for MCP server on/off state in new chats

* webui: simplify MCP enabled checks, remove dead server.enabled fallback

* chore: update webui build output

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2026-03-19 12:54:06 +01:00
Aleksander Grygier
512bba6ee0
webui: Improve model parsing logic + add unit tests (#20749)
* add tests for model id parser

* add test case having activated params

* add structured tests for model id parser

* add ToDo

* feat: Improve model parsing logic + tests

* chore: update webui build output

---------

Co-authored-by: bluemoehre <bluemoehre@gmx.de>
2026-03-19 12:25:50 +01:00
Concedo
48f914e374 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	ci/run.sh
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cpu/arch/riscv/repack.cpp
#	ggml/src/ggml-cpu/arch/x86/repack.cpp
#	ggml/src/ggml-cpu/repack.cpp
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/htp-msg.h
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/hvx-base.h
#	ggml/src/ggml-hexagon/htp/hvx-exp.h
#	ggml/src/ggml-hexagon/htp/hvx-sigmoid.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	scripts/sync-ggml.last
#	tests/test-backend-sampler.cpp
#	tests/test-chat.cpp
#	tests/test-jinja.cpp
#	tools/cli/cli.cpp
2026-03-19 02:23:06 +08:00
Julien Chaumond
48e61238e1
webui: improve tooltip wording for attachment requirements (#20688)
* webui: improve tooltip wording for attachment requirements

Co-Authored-By: Claude <Agents+claude@huggingface.co>

* chore: update webui build output

* chore: update webui build output

---------

Co-authored-by: Claude <Agents+claude@huggingface.co>
2026-03-18 14:01:02 +01:00
Aleksander Grygier
7ab321d40d
webui: Fix duplicated messages on q param (#20715)
* fix: Remove duplicate message sending on `?q` param

* chore: update webui build output
2026-03-18 10:32:43 +01:00
Concedo
f31b040941 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/labeler.yml
#	.github/workflows/build-self-hosted.yml
#	benches/nemotron/nemotron-dgx-spark.md
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
#	ggml/src/ggml-sycl/backend.hpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/element_wise.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	scripts/sync-ggml.last
#	tests/test-jinja.cpp
#	tests/test-llama-archs.cpp
2026-03-17 14:05:23 +08:00
Concedo
9084527b36 Merge commit '67a2209fab' into concedo_experimental
# Conflicts:
#	.github/workflows/build-cache.yml
#	.github/workflows/build-cross.yml
#	.github/workflows/build-self-hosted.yml
#	.github/workflows/build.yml
#	.github/workflows/python-lint.yml
#	.github/workflows/release.yml
#	.github/workflows/server-self-hosted.yml
#	.github/workflows/server-webui.yml
#	.github/workflows/server.yml
#	CODEOWNERS
#	ggml/src/ggml-sycl/gated_delta_net.cpp
#	scripts/sync_vendor.py
#	tools/cli/cli.cpp
2026-03-17 11:11:25 +08:00
Pascal
dddca026bf
webui: add model information dialog to router mode (#20600)
* webui: add model information dialog to router mode

* webui: add "Available models" section header in model list

* webui: remove nested scrollbar from chat template in model info dialog

* chore: update webui build output

* feat: UI improvements

* refactor: Cleaner rendering + UI docs

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2026-03-16 15:38:11 +01:00
Aleksander Grygier
67a2209fab
webui: Add MCP CORS Proxy detection logic & UI (#20167)
* refactor: MCP store cleanup

* feat: Add MCP proxy availability detection

* fix: Sidebar icon

* chore: update webui build output

* chore: Formatting

* chore: update webui build output

* chore: Update package lock

* chore: update webui build output

* chore: update webui build output

* chore: update webui build output
2026-03-16 13:05:36 +01:00
Pascal
d65c4f2dc9
Fix model selector locked to first loaded model with multiple models (#20580)
* webui: fix model selector being locked to first loaded model

When multiple models are loaded, the auto-select effect would re-fire
on every loadedModelIds change, overriding the user's manual model
selection. Guard with selectedModelId so auto-select only kicks in
when no model is chosen yet.

* chore: update webui build output
2026-03-16 12:04:06 +01:00
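Note on d65c4f2dc9: the fix described above guards the auto-select effect with selectedModelId so that it only runs when no model has been chosen yet. A minimal sketch of that guard as a pure function follows. It is an illustration under assumptions, not the webui's actual effect code. The `autoSelectModel` name and its signature are hypothetical; `selectedModelId` and `loadedModelIds` are the names used in the commit message.

```typescript
// Returns the model id that should be selected after loadedModelIds changes.
// Hypothetical helper: the real webui runs this logic inside a reactive effect.
function autoSelectModel(
  selectedModelId: string | null,
  loadedModelIds: string[]
): string | null {
  // Guard: a selection already exists, so never override the user's choice.
  if (selectedModelId !== null) {
    return selectedModelId;
  }
  // No selection yet: fall back to the first loaded model, if there is one.
  return loadedModelIds[0] ?? null;
}
```

Without the guard, every change to the loaded list would reset the selection to the first loaded model, which matches the bug the commit message describes.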
Woof Dog
d8c331c0af
webui: use date in more human readable exported filename (#19939)
* webui: use date in exported filename

Move conversation naming and export to utils

update index.html.gz

* webui: move literals to message export constants file

* webui: move export naming and download back to the conversation store

* chore: update webui build output

* webui: add comments to some constants

* chore: update webui build output
2026-03-16 11:18:13 +01:00
Concedo
f3d2f58fa8 note: smartcache is broken for rnn currently
2026-03-15 11:31:47 +08:00
Concedo
b1c500ae2b Merge commit '2948e6049a' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CONTRIBUTING.md
#	docs/backend/VirtGPU/development.md
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	embd_res/templates/GigaChat3-10B-A1.8B.jinja
#	embd_res/templates/GigaChat3.1-10B-A1.8B.jinja
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-quantize-fns.cpp
2026-03-15 11:21:24 +08:00
Concedo
67c9798d0b Merge commit '3ca19b0e9f' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	common/CMakeLists.txt
#	common/chat-peg-parser.cpp
#	docs/backend/SYCL.md
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/convert.hpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	ggml/src/ggml-sycl/rope.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
#	scripts/compare-llama-bench.py
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
Chedrian07
710878a7dd
webui: restore code preview iframe origin isolation (#20477)
2026-03-14 11:28:28 +01:00
Pascal
de190154c8
New conversations now auto-select the first loaded model (#20403)
* webui: auto-select first loaded model for new conversations in router mode

* chore: update webui build output
2026-03-12 09:07:05 +01:00
Pascal
00de615345
Fix agentic mcp image single model (#20339)
* webui: fix MCP image attachments dropped during the agentic loop in single-model mode

* chore: update webui build output
2026-03-11 05:31:33 +01:00
Concedo
6adcd0b5db Merge commit '34df42f7be' into concedo_experimental
# Conflicts:
#	README.md
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/act-ops.c
#	ggml/src/ggml-hexagon/htp/binary-ops.c
#	ggml/src/ggml-hexagon/htp/cpy-ops.c
#	ggml/src/ggml-hexagon/htp/get-rows-ops.c
#	ggml/src/ggml-hexagon/htp/htp-msg.h
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/hvx-arith.h
#	ggml/src/ggml-hexagon/htp/hvx-base.h
#	ggml/src/ggml-hexagon/htp/hvx-inverse.h
#	ggml/src/ggml-hexagon/htp/hvx-utils.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hexagon/htp/rope-ops.c
#	ggml/src/ggml-hexagon/htp/set-rows-ops.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	tests/test-backend-ops.cpp
#	tools/cli/cli.cpp
#	tools/server/webui/src/lib/components/app/chat/ChatScreen/ChatScreen.svelte
2026-03-10 22:20:04 +08:00
Concedo
746664fde6 Merge commit '2cd20b72ed' into concedo_experimental
# Conflicts:
#	CONTRIBUTING.md
#	docs/backend/CANN.md
#	docs/backend/SYCL.md
#	docs/backend/snapdragon/README.md
#	docs/backend/snapdragon/windows.md
#	docs/build.md
#	docs/multimodal/MobileVLM.md
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	examples/debug/README.md
#	examples/llama.vim
#	examples/model-conversion/README.md
#	examples/sycl/README.md
#	ggml/src/ggml-cpu/amx/mmq.cpp
#	ggml/src/ggml-cpu/arch/x86/repack.cpp
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp-drv.cpp
#	ggml/src/ggml-hexagon/htp/flash-attn-ops.c
#	ggml/src/ggml-hexagon/htp/hvx-base.h
#	ggml/src/ggml-hexagon/htp/hvx-copy.h
#	ggml/src/ggml-hexagon/htp/hvx-inverse.h
#	ggml/src/ggml-hexagon/htp/hvx-reduce.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-hexagon/htp/rope-ops.c
#	ggml/src/ggml-hexagon/htp/worker-pool.c
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cpy.cl
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/quants.hpp
#	ggml/src/ggml-sycl/softmax.cpp
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	scripts/pr2wt.sh
#	scripts/server-bench.py
#	scripts/snapdragon/windows/run-cli.ps1
#	tests/test-alloc.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tools/cli/cli.cpp
#	tools/completion/README.md
#	tools/cvector-generator/cvector-generator.cpp
#	tools/imatrix/README.md
#	tools/perplexity/README.md
#	tools/server/public_simplechat/readme.md
#	tools/server/tests/README.md
2026-03-10 22:11:08 +08:00
Aleksander Grygier
f6235a41ef
webui: Agentic Loop + MCP Client with support for Tools, Resources and Prompts (#18655)
2026-03-06 10:00:39 +01:00
Aleksander Grygier
5e335ba113
webui: Improvements for Models Selector UI (#20066)
2026-03-05 08:52:22 +01:00
Concedo
44182ebefe Merge commit '8c2c0108dd' into concedo_experimental
# Conflicts:
#	examples/model-conversion/Makefile
#	examples/model-conversion/scripts/utils/inspect-org-model.py
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/act-ops.c
#	ggml/src/ggml-hexagon/htp/get-rows-ops.c
#	ggml/src/ggml-hexagon/htp/hex-dma.h
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-hexagon/htp/rope-ops.c
#	ggml/src/ggml-hexagon/htp/set-rows-ops.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	scripts/snapdragon/adb/run-cli.sh
#	scripts/snapdragon/adb/run-completion.sh
#	scripts/snapdragon/adb/run-mtmd.sh
#	scripts/snapdragon/windows/run-cli.ps1
#	scripts/sync_vendor.py
#	tests/test-backend-sampler.cpp
2026-02-26 16:30:37 +08:00
Concedo
7e53bfd28d Merge commit '2b6dfe824d' into concedo_experimental
# Conflicts:
#	.github/workflows/release.yml
#	examples/save-load-state/save-load-state.cpp
#	src/llama-context.cpp
#	tools/cli/cli.cpp
2026-02-26 15:07:23 +08:00
Aleksander Grygier
5eb0ea32f0
feat: Add code blocks full height setting to parameter sync service (#19835)
2026-02-23 22:30:13 +01:00
Aleksander Grygier
9051663d5d
webui: Add setting to have full height Code Blocks in Chat Messages (#19829)
2026-02-23 14:16:50 +01:00
Kilian Krampf
cacc371f99
Fix wrong cli-argument in documentation (#19804)
2026-02-22 16:26:33 +01:00
Concedo
d06700687f Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/rocm.Dockerfile
#	.github/workflows/release.yml
#	CMakeLists.txt
#	ggml/src/ggml-cuda/common.cuh
#	scripts/sync_vendor.py
#	tests/test-chat.cpp
2026-02-22 09:33:13 +08:00
crsawyer
07968d53e4
fix: UI single model selection in router mode (#19767)
2026-02-21 09:28:39 +01:00
Concedo
e626de2430 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	embd_res/templates/stepfun-ai-Step-3.5-Flash.jinja
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/unary.wgsl
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tools/mtmd/CMakeLists.txt
2026-02-20 15:16:26 +08:00
Concedo
07c45ced56 Merge commit 'c78e682245' into concedo_experimental
# Conflicts:
#	src/models/qwen35.cpp
#	src/models/qwen35moe.cpp
2026-02-20 14:41:32 +08:00
Concedo
9eb9e4eb83 Merge commit '8a70973557' into concedo_experimental
# Conflicts:
#	docs/backend/CANN.md
#	docs/backend/SYCL.md
#	examples/model-conversion/scripts/utils/tensor-info.py
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/expm1.cl
#	ggml/src/ggml-opencl/kernels/mean.cl
#	ggml/src/ggml-opencl/kernels/softplus.cl
#	ggml/src/ggml-opencl/kernels/sum_rows.cl
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/common_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/embed_wgsl.py
#	ggml/src/ggml-webgpu/wgsl-shaders/get_rows.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_subgroup_matrix.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/scale.wgsl
#	tools/server/webui/src/lib/components/app/chat/ChatScreen/ChatScreen.svelte
2026-02-20 14:36:49 +08:00
crsawyer
10b26ee23a
WebUI hide models in router mode (#19374)
2026-02-19 22:53:42 +01:00