koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-17 12:39:09 +00:00

Author	SHA1	Message	Date
Concedo	7c70187e26	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/ISSUE_TEMPLATE/010-bug-compilation.yml # .github/ISSUE_TEMPLATE/011-bug-results.yml # .github/ISSUE_TEMPLATE/019-bug-misc.yml # .github/ISSUE_TEMPLATE/020-enhancement.yml # .github/ISSUE_TEMPLATE/030-research.yml # .github/ISSUE_TEMPLATE/040-refactor.yml # ggml/CMakeLists.txt # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-hexagon/CMakeLists.txt # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hexagon/htp/CMakeLists.txt # ggml/src/ggml-hexagon/htp/cmake-toolchain.cmake # ggml/src/ggml-hexagon/htp/flash-attn-ops.c # ggml/src/ggml-hexagon/htp/hex-utils.h # ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c # ggml/src/ggml-hexagon/htp/hmx-ops.h # ggml/src/ggml-hexagon/htp/hmx-utils.h # ggml/src/ggml-hexagon/htp/hvx-base.h # ggml/src/ggml-hexagon/htp/hvx-copy.h # ggml/src/ggml-hexagon/htp/hvx-exp.h # ggml/src/ggml-hexagon/htp/unary-ops.c # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/cvt.cl # ggml/src/ggml-rpc/ggml-rpc.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-virtgpu/ggml-backend.cpp # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl # ggml/src/ggml-zdnn/ggml-zdnn.cpp # ggml/src/ggml-zendnn/ggml-zendnn.cpp # scripts/sync-ggml.last # tests/test-backend-ops.cpp	2026-05-02 18:07:50 +08:00
Aleksander Grygier	ab6120cde5	webui: Spring Cleaning Refactor v1 (#22505 ) * wip: server_tools * feat: Integrate with `/tools` endpoint * feat: Builtin + MCP + JSON Schema Tools WIP * refactor * displayName -> display_name * snake_case everywhere * rm redundant field * feat: Improvements * chore: update webui build output * refactor: Updates after server updates * chore: update webui build output * change arg to --tools all * feat: UI improvements * chore: update webui build output * add readme mention * llama-gen-docs * chore: update webui build output * chore: update webui build output * chore: update webui build output * feat: Reorganize settings sections * feat: Separate dialogs for MCP Servers Settings and Import/Export * feat: WIP * feat: WIP * feat: WIP * feat: WIP * feat: WIP * feat: WIP * WIP on allozaur/20677-webui-server-tools * feat: UI improvements * chore: Update package lock * chore: Run `npm audit fix` * feat: UI WIP * feat: UI * refactor: Desktop Icon Strip DRY * feat: Cleaner rendering and transition for ChatScreen * feat: UI improvements * feat: UI improvement * feat: Remove MCP Server "enable" switch from Tools submenu * chore: Run `npm audit fix` * feat: WIP * feat: Logic improvements * refactor: Cleanup * refactor: DRY * test: Fix Chat Sidebar UI Tests * chore: Update package lock * refactor: Cleanup * feat: Chat Message Action Card with Continue and Permission flow implementations * feat: Add agentic steering messages, draft messages and improve chat UX * fix: Search results UI * test: Fix unit test * feat: UI/UX improvements * refactor: Simplify `useToolsPanel` access in components * feat: Implement Processing Info Context API * feat: Implement 'Go back to chat' functionality for settings * feat: Enhance MCP Server management in Chat Form Attachments * style: Minor UI and branding adjustments * chore: Update webui static build output * chore: Formatting, linting & type checks * feat: Draft messages logic * feat: UI improvements * feat: 
Steering Messages improvements * refactor: Cleanup * refactor: Cleanup * feat: Improve UI * refactor: Settings navigation hook * refactor: DRY code * refactor: DRY ChatMessageUser UI components * refactor: Desktop Icon Strip DRY * refactor: Tools & permissions * fix: Navigation condition * refactor: Cleanup * refactor: Cleanup * refactor: Cleanup * fix: preserve reasoning_content in agentic flow * refactor: Storybook cleanup * refactor: isInViewport util function * refactor: Rename globally `onClick` to `onclick` * chore: `npm audit fix` * refactor: Action Icon usage * refactor: Naming * refactor: JS in `class` directive * refactor: Chat components cleanup WIP * refactor: Components structure * refactor: Cleanup WIP * feat: New ChatAttachmentsPreview component * feat: UI improvements * feat: UI improvements * refactor: Cleanup * refactor: ChatAttachmentsPreview UI/UX * refactor: Remove dead code * refactor: Cleanup * fix: Model Name aliases displaying * feat: Shortcut improvements * refactor: Chat Message * feat: Move Import/Export to settings * refactor: Cleanup * refactor: Cleanup * refactor: Cleanup * refactor: Cleanup --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2026-05-01 18:36:29 +02:00
Concedo	37073bc13d	Merge branch 'upstream' into concedo_experimental # Conflicts: # ggml/CMakeLists.txt # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-cuda/mmq.cuh # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # scripts/sync-ggml.last # tests/test-backend-ops.cpp # tests/test-log.cpp	2026-04-30 17:37:52 +08:00
Concedo	45f8ff49bb	Merge commit '`52e5f0a5c1`' into concedo_experimental # Conflicts: # examples/gen-docs/gen-docs.cpp # examples/lookup/lookup-create.cpp # examples/lookup/lookup-stats.cpp # examples/lookup/lookup.cpp # examples/speculative-simple/speculative-simple.cpp # examples/speculative/speculative.cpp # ggml/src/CMakeLists.txt # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/aclnn_ops.h # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-rpc/ggml-rpc.cpp # ggml/src/ggml-vulkan/ggml-vulkan.cpp # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/binary.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/get_rows.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/rms_norm_mul.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/ssm_scan.wgsl # tests/test-arg-parser.cpp # tests/test-backend-ops.cpp # tests/test-chat.cpp # tests/test-reasoning-budget.cpp # tools/llama-bench/llama-bench.cpp # tools/rpc/rpc-server.cpp # tools/server/webui/src/lib/components/app/chat/ChatScreen/ChatScreen.svelte # tools/server/webui/src/lib/components/app/chat/ChatSidebar/ChatSidebar.svelte # tools/server/webui/src/routes/(chat)/+page.svelte	2026-04-29 22:27:36 +08:00
Pascal	59237bfbbc	webui: fix slow mic stop and WAV encode (#22480 ) * webui: instant mic stop, race-free recorder restart * webui: faster WAV PCM encode via hoisted channels and Int16Array * chore: update webui build output * webui: drop setTimeout(0) hack and harden cancelRecording * chore: update webui build output	2026-04-29 12:58:35 +02:00
Aleksander Grygier	f42e29fdf1	webui: Server tools (#21237 ) * wip: server_tools * feat: Integrate with `/tools` endpoint * feat: Builtin + MCP + JSON Schema Tools WIP * refactor * displayName -> display_name * snake_case everywhere * rm redundant field * feat: Improvements * chore: update webui build output * refactor: Updates after server updates * chore: update webui build output * change arg to --tools all * feat: UI improvements * chore: update webui build output * add readme mention * llama-gen-docs * chore: update webui build output * chore: update webui build output * chore: update webui build output * feat: Reorganize settings sections * feat: Separate dialogs for MCP Servers Settings and Import/Export * feat: WIP * feat: WIP * feat: WIP * feat: WIP * feat: WIP * feat: WIP * WIP on allozaur/20677-webui-server-tools * feat: UI improvements * chore: Update package lock * chore: Run `npm audit fix` * feat: UI WIP * feat: UI * refactor: Desktop Icon Strip DRY * feat: Cleaner rendering and transition for ChatScreen * feat: UI improvements * feat: UI improvement * feat: Remove MCP Server "enable" switch from Tools submenu * chore: Run `npm audit fix` * feat: WIP * feat: Logic improvements * refactor: Cleanup * refactor: DRY * test: Fix Chat Sidebar UI Tests * chore: Update package lock * refactor: Cleanup * feat: Chat Message Action Card with Continue and Permission flow implementations * feat: Add agentic steering messages, draft messages and improve chat UX * fix: Search results UI * test: Fix unit test * feat: UI/UX improvements * refactor: Simplify `useToolsPanel` access in components * feat: Implement Processing Info Context API * feat: Implement 'Go back to chat' functionality for settings * feat: Enhance MCP Server management in Chat Form Attachments * style: Minor UI and branding adjustments * chore: Update webui static build output * chore: Formatting, linting & type checks * feat: Draft messages logic * feat: UI improvements * feat: Steering Messages 
improvements * refactor: Cleanup * refactor: Cleanup * feat: Improve UI * refactor: Settings navigation hook * refactor: DRY code * refactor: DRY ChatMessageUser UI components * refactor: Desktop Icon Strip DRY * refactor: Tools & permissions * fix: Navigation condition * refactor: Cleanup * refactor: Cleanup * refactor: Cleanup * fix: preserve reasoning_content in agentic flow --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2026-04-28 14:35:49 +03:00
Concedo	9c0b9b0bb1	Merge branch 'upstream' into concedo_experimental # Conflicts: # docs/development/HOWTO-add-model.md # docs/multimodal.md # ggml/src/ggml-sycl/convert.cpp # ggml/src/ggml-sycl/dequantize.hpp # ggml/src/ggml-sycl/element_wise.cpp # ggml/src/ggml-sycl/gated_delta_net.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-sycl/upscale.cpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # tests/test-backend-ops.cpp # tests/test-llama-archs.cpp # tools/mtmd/CMakeLists.txt	2026-04-14 20:06:04 +08:00
Rohan Jain	974c8c94cc	webui: add setting for first-line chat titles (#21797 ) * webui: add setting for first-line chat titles Add an opt-in setting (`titleGenerationUseFirstLine`) to use the first non-empty line of a prompt as the generated conversation title. Previously, the complete multi-line prompt was being used, which created long titles for complex queries. Coupled with "Ask for confirmation before changing conversation title", the dialog would overflow. * Update tools/server/webui/src/lib/utils/text.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/utils/text.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: Run build to update the bundle As requested in: https://github.com/ggml-org/llama.cpp/pull/21797#pullrequestreview-4094935065 * webui: Fix missing import for NEWLINE_SEPARATOR --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-04-13 09:30:46 +02:00
Aleksander Grygier	227ed28e12	webui: MCP Diagnostics improvements (#21803 ) * Add MCP Connection diagnostics and CORS hint to web-ui * tidy up test * webui: Refactor and improve MCP diagnostic logging --------- Co-authored-by: evalstate <1936278+evalstate@users.noreply.github.com>	2026-04-13 07:58:38 +02:00
Aleksander Grygier	9e209c5aee	fix: Proper messages rendering for "Show raw output" (#21672 )	2026-04-12 13:08:11 +02:00
Concedo	4c860ae4ae	Merge branch 'upstream' into concedo_experimental # Conflicts: # common/download.cpp # docs/backend/OPENVINO.md # docs/backend/snapdragon/CMakeUserPresets.json # docs/backend/snapdragon/README.md # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hexagon/htp/act-ops.c # ggml/src/ggml-hexagon/htp/argsort-ops.c # ggml/src/ggml-hexagon/htp/binary-ops.c # ggml/src/ggml-hexagon/htp/cpy-ops.c # ggml/src/ggml-hexagon/htp/cumsum-ops.c # ggml/src/ggml-hexagon/htp/flash-attn-ops.c # ggml/src/ggml-hexagon/htp/get-rows-ops.c # ggml/src/ggml-hexagon/htp/hex-utils.h # ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c # ggml/src/ggml-hexagon/htp/hmx-ops.h # ggml/src/ggml-hexagon/htp/htp-ctx.h # ggml/src/ggml-hexagon/htp/htp-ops.h # ggml/src/ggml-hexagon/htp/htp_iface.idl # ggml/src/ggml-hexagon/htp/main.c # ggml/src/ggml-hexagon/htp/matmul-ops.c # ggml/src/ggml-hexagon/htp/repeat-ops.c # ggml/src/ggml-hexagon/htp/rope-ops.c # ggml/src/ggml-hexagon/htp/set-rows-ops.c # ggml/src/ggml-hexagon/htp/softmax-ops.c # ggml/src/ggml-hexagon/htp/ssm-conv.c # ggml/src/ggml-hexagon/htp/sum-rows-ops.c # ggml/src/ggml-hexagon/htp/unary-ops.c # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/common_decls.tmpl # ggml/src/ggml-webgpu/wgsl-shaders/flash_attn.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/get_rows.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/unary.wgsl # models/templates/google-gemma-4-31B-it-interleaved.jinja # models/templates/google-gemma-4-31B-it.jinja # scripts/snapdragon/adb/run-bench.sh # scripts/snapdragon/adb/run-cli.sh # scripts/snapdragon/adb/run-completion.sh # scripts/snapdragon/adb/run-tool.sh # scripts/snapdragon/windows/run-bench.ps1 # scripts/snapdragon/windows/run-cli.ps1 # scripts/snapdragon/windows/run-mtmd.ps1 # 
scripts/snapdragon/windows/run-tool.ps1 # tests/test-backend-ops.cpp # tests/test-chat.cpp # tools/llama-bench/llama-bench.cpp	2026-04-11 11:19:32 +08:00
Concedo	8b90bfe094	Merge commit '`4ef9301e4d`' into concedo_experimental # Conflicts: # .github/labeler.yml # docs/multimodal.md # embd_res/ggml-vocab-gemma-4.gguf # embd_res/ggml-vocab-gemma-4.gguf.inp # embd_res/ggml-vocab-gemma-4.gguf.out # ggml/src/ggml-sycl/fattn-tile.cpp # ggml/src/ggml-sycl/fattn-tile.hpp # ggml/src/ggml-sycl/fattn-vec.hpp # ggml/src/ggml-sycl/fattn.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-f16-f16.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-f16-q4_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-f16-q4_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-f16-q5_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-f16-q5_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-f16-q8_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_0-f16.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_0-q4_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_0-q4_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_0-q5_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_0-q5_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_0-q8_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_1-f16.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_1-q4_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_1-q4_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_1-q5_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_1-q5_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q4_1-q8_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_0-f16.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_0-q4_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_0-q4_1.cpp # 
ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_0-q5_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_0-q5_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_0-q8_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_1-f16.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_1-q4_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_1-q4_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_1-q5_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_1-q5_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q5_1-q8_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q8_0-f16.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q8_0-q4_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q8_0-q4_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q8_0-q5_0.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q8_0-q5_1.cpp # ggml/src/ggml-sycl/template-instances/fattn-vec-instance-q8_0-q8_0.cpp # tests/CMakeLists.txt # tests/test-jinja.cpp # tools/mtmd/CMakeLists.txt	2026-04-11 09:38:50 +08:00
Aleksander Grygier	f989a6e39e	webui: Static build output improvements (#21667 ) * refactor: Build improvements * chore: Formatting + package lock update	2026-04-10 11:49:47 +02:00
JvM	4ef9301e4d	webui: add "Send message on Enter" setting (#21577 ) * webui: make Enter to send chat a setting * Shorten description * Use isMobile hook from $lib/hooks * Rebuild static output	2026-04-09 12:26:27 +02:00
Concedo	c82c0b463a	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/labeler.yml # .github/workflows/release.yml # examples/debug/debug.cpp # ggml/src/ggml-cuda/common.cuh # ggml/src/ggml-cuda/mmq.cuh # ggml/src/ggml-webgpu/ggml-webgpu.cpp # src/llama-vocab.cpp # tests/test-backend-ops.cpp # tests/test-chat.cpp # tests/test-json-schema-to-grammar.cpp # tools/mtmd/CMakeLists.txt	2026-04-09 17:45:04 +08:00
Concedo	5529748a01	Merge commit '`de1aa6fa73`' into concedo_experimental # Conflicts: # docs/build.md # docs/ops.md # docs/ops/WebGPU.csv # ggml/src/ggml-sycl/dequantize.hpp # ggml/src/ggml-sycl/dmmv.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-sycl/mmvq.cpp # ggml/src/ggml-sycl/quants.hpp # ggml/src/ggml-sycl/vecdotq.hpp # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl # tests/test-backend-ops.cpp # tests/test-quantize-fns.cpp	2026-04-09 17:16:33 +08:00
Aleksander Grygier	9949ad08f6	fix: Model Selector choice sync (#21628 )	2026-04-09 09:46:27 +02:00
Aleksander Grygier	75511a8d7e	webui: Add option to pre-encode conversation for faster next turns (#21034 )	2026-04-09 09:10:18 +02:00
Georgi Gerganov	4a05e0c566	webui : send both backend_sampling == false/true (#18781 ) * webui : send both backend_sampling == false/true * feat: Parameter sync --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-04-08 16:35:52 +02:00
Hamish M. Blair	97508acb17	webui: fix syntax highlighting lost after streaming for non-common languages (#21206 ) * webui: fix syntax highlighting lost for non-common languages after streaming rehype-highlight uses lowlight internally, which only bundles 37 "common" languages. The streaming code path uses highlight.js directly (192 languages), so languages like Haskell highlight correctly while streaming but lose all color once the code block closes. Pass the full lowlight language set to rehype-highlight so both paths support the same languages. * webui: rebuild static files after rebase	2026-04-08 08:58:08 +02:00
Aldehir Rojas	482192f12d	webui : store reasoning_content so it is sent back in subsequent requests (#21249 )	2026-04-07 13:32:44 +02:00
Aleksander Grygier	ecce0087da	fix: Detect streaming state in reasoning content blocks (#21549 )	2026-04-07 12:04:41 +02:00
Kabir08	d1f82e382d	Fix rtl text rendering (#21382 ) * Fix Arabic RTL text rendering in web UI - Add dir='auto' attributes to markdown containers and blocks - Implement post-processing to add dir='auto' to all text elements - Replace directional CSS properties with logical properties for proper RTL list alignment - Ensure bidirectional text support for mixed Arabic/English content * Clean up commented duplicate function Remove the commented-out duplicate transformMdastNode function that was left over from refactoring. * Fix Arabic RTL text rendering in web UI - Add dir='auto' attributes to markdown containers and blocks - Implement post-processing to add dir='auto' to all text elements - Replace directional CSS properties with logical properties for proper RTL list alignment - Minor code formatting improvements This ensures bidirectional text support for mixed Arabic/English content in the llama.cpp web UI. * Implement rehype plugin for comprehensive RTL text support - Add rehypeRtlSupport plugin that applies dir='auto' to all elements with children - Replace DOMParser-based approach with efficient HAST tree processing - Remove hardcoded element lists for better maintainability - Ensure proper bidirectional text rendering for mixed RTL/LTR content * Fix RTL text rendering with rehype plugin and cleanup * fix: prettier formatting	2026-04-07 11:37:20 +02:00
Concedo	31aa072da1	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .github/workflows/release.yml # .gitignore # examples/batched/batched.cpp # examples/debug/debug.cpp # examples/eval-callback/eval-callback.cpp # examples/idle/idle.cpp # examples/lookahead/lookahead.cpp # examples/lookup/lookup-create.cpp # examples/lookup/lookup-stats.cpp # examples/lookup/lookup.cpp # examples/parallel/parallel.cpp # examples/passkey/passkey.cpp # examples/retrieval/retrieval.cpp # examples/save-load-state/save-load-state.cpp # examples/speculative-simple/speculative-simple.cpp # examples/speculative/speculative.cpp # examples/training/finetune.cpp # ggml/CMakeLists.txt # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/common.h # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-sycl/fattn-tile.hpp # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/cpy.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/embed_wgsl.py # ggml/src/ggml-webgpu/wgsl-shaders/rope.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/soft_max.wgsl # scripts/sync-ggml.last # tests/export-graph-ops.cpp # tests/test-chat.cpp # tests/test-state-restore-fragmented.cpp # tests/test-thread-safety.cpp # tools/batched-bench/batched-bench.cpp # tools/cli/cli.cpp # tools/cvector-generator/cvector-generator.cpp # tools/export-lora/export-lora.cpp # tools/imatrix/imatrix.cpp # tools/perplexity/perplexity.cpp # tools/results/results.cpp # tools/server/CMakeLists.txt	2026-04-01 10:54:13 +08:00
Aleksander Grygier	0fcb3760b2	fix: Use lower-case proxy headers naming (#21235 )	2026-03-31 17:47:46 +02:00
Xuan-Son Nguyen	4a00bbfed6	server: (webui) no more gzip compression (#21073 ) * webui: no more gzip * try changing a small line * Revert "try changing a small line" This reverts commit 0d7a3531593d87b724d404c8727a96becab3ab07. * fix lint * fix test * rebuild * split into html/css/js * lint * chore: update webui build output * chore: Update git hooks script * server: update webui build output * chore: Update pre-commit hook * refactor: Cleanup --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-03-31 15:44:26 +02:00
SATISH K C	fcc2d598c8	fix: include API key in CORS proxy requests for MCP connections (#21193 ) * fix: include API key in CORS proxy requests for MCP connections When llama-server is started with --api-key-file and --webui-mcp-proxy, the /cors-proxy endpoint requires authentication. The WebUI was not including the Authorization header in proxy requests, causing MCP connections to fail with 401. Inject getAuthHeaders() into requestInit when useProxy is true so the proxy request carries the Bearer token alongside the forwarded target headers. Fixes #21167 * fix: simplify headers assignment based on reviewer suggestion Apply buildProxiedHeaders only when useProxy is true, pass headers directly to the transport otherwise.	2026-03-31 10:52:34 +02:00
Piotr Wilkin (ilintar)	4453e77561	server/webui: cleanup dual representation approach, simplify to openai-compat (#21090 ) * server/webui: cleanup dual representation approach, simplify to openai-compat * feat: Fix regression for Agentic Loop UI * chore: update webui build output * refactor: Post-review code improvements * chore: update webui build output * refactor: Cleanup * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-03-31 10:42:06 +02:00
Concedo	a3a5897d93	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/intel.Dockerfile # .github/workflows/python-type-check.yml # embd_res/templates/Qwen3.5-4B.jinja # examples/model-conversion/scripts/causal/compare-logits.py # examples/model-conversion/scripts/utils/check-nmse.py # examples/model-conversion/scripts/utils/compare_tokens.py # examples/model-conversion/scripts/utils/semantic_check.py # examples/sycl/build.sh # examples/sycl/run-llama2.sh # ggml/src/ggml-hexagon/htp/flash-attn-ops.c # ggml/src/ggml-hexagon/htp/hex-dma.h # ggml/src/ggml-hexagon/htp/rope-ops.c # scripts/gen-unicode-data.py # tests/test-chat.cpp	2026-03-30 21:41:19 +08:00
Concedo	42ad89cd86	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/cann.Dockerfile # .devops/cpu.Dockerfile # .devops/llama-cli-cann.Dockerfile # .devops/nix/package.nix # .github/workflows/build-android.yml # .github/workflows/build-cann.yml # .github/workflows/build-msys.yml # .github/workflows/docker.yml # .github/workflows/editorconfig.yml # .github/workflows/gguf-publish.yml # .github/workflows/python-lint.yml # .github/workflows/release.yml # CMakeLists.txt # docs/backend/CANN.md # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c # ggml/src/ggml-hexagon/htp/htp-ctx.h # ggml/src/ggml-hexagon/htp/main.c # ggml/src/ggml-hexagon/htp/matmul-ops.c # ggml/src/ggml-rpc/ggml-rpc.cpp # scripts/sync_vendor.py # tests/test-chat-auto-parser.cpp # tests/test-chat.cpp # tests/test-json-schema-to-grammar.cpp # tests/test-reasoning-budget.cpp # tools/cli/cli.cpp # tools/server/CMakeLists.txt # tools/server/README.md	2026-03-30 20:45:38 +08:00
Aleksander Grygier	389c7d4955	webui: Fix branching logic on edit message (#21175 ) * fix: Branching logic + small refactor * chore: update webui build output	2026-03-30 14:40:50 +02:00
Xuan-Son Nguyen	abf9a62161	server: wrap headers for mcp proxy (#21072 ) * server: wrap headers for mcp proxy * Update tools/server/server-cors-proxy.h Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * fix build * chore: update webui build output * chore: update webui build output --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-03-30 08:59:16 +02:00
BlueMöhre	968189729f	WebUI: Replace illegal nested button elements (#21026 ) * remove/replace nested button elements * map rest props to outer element * solve TODO * chore: update webui build output	2026-03-28 17:57:59 +01:00
Aleksander Grygier	51a84efc53	webui: Conversation forking + branching improvements (#21021 ) * refactor: Make `DialogConfirmation` extensible with children slot * feat: Add conversation forking logic * feat: Conversation forking UI * feat: Update delete/edit dialogs and logic for forks * refactor: Improve Chat Sidebar UX and add MCP Servers entry * refactor: Cleanup * feat: Update message in place when editing leaf nodes * chore: Cleanup * chore: Cleanup * chore: Cleanup * chore: Cleanup * chore: Cleanup * chore: Cleanup * refactor: Post-review improvements * chore: update webui build output * test: Update Storybook test * chore: update webui build output * chore: update webui build output	2026-03-28 13:38:15 +01:00
Concedo	3ec6381123	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build-self-hosted.yml # .github/workflows/build.yml # .github/workflows/copilot-setup-steps.yml # .github/workflows/gguf-publish.yml # ci/run.sh # docs/backend/OPENVINO.md # examples/llama.android/lib/src/main/cpp/ai_chat.cpp # ggml/src/ggml-sycl/add-id.cpp # requirements/requirements-pydantic.txt # tests/test-gguf.cpp # tests/test-jinja.cpp # tests/test-llama-archs.cpp # tools/gguf-split/README.md # tools/llama-bench/llama-bench.cpp	2026-03-28 01:18:20 +08:00
Aleksander Grygier	e6f6770515	webui: Improve Chat Messages initial scroll + auto-scroll logic + add lazy loading with transitions to content blocks (#20999 ) * refactor: Always use agentic content renderer for Assistant Message * feat: Improve initial scroll + auto-scroll logic + implement fade in action for content blocks * chore: update webui build output	2026-03-27 17:01:36 +01:00
Pascal	d0fa2c9fbb	Send reasoning content back to the model across turns via the reasoning_content API field (#21036 ) * webui: send reasoning_content back to model in context Preserve assistant reasoning across turns by extracting it from internal tags and sending it as a separate reasoning_content field in the API payload. The server and Jinja templates handle native formatting (e.g. <think> tags for Qwen, GLM, DeepSeek...). Adds "Exclude reasoning from context" toggle in Settings > Developer (off by default, so reasoning is preserved). Includes unit tests. * webui: add syncable parameter for excludeReasoningFromContext * chore: update webui build output	2026-03-27 08:17:35 +01:00
Concedo	c00fe0af5a	Merge commit '`9f102a1407`' into concedo_experimental # Conflicts: # .devops/intel.Dockerfile # .github/ISSUE_TEMPLATE/010-bug-compilation.yml # .github/ISSUE_TEMPLATE/011-bug-results.yml # .github/pull_request_template.md # CODEOWNERS # README.md # common/CMakeLists.txt # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hexagon/htp/binary-ops.c # ggml/src/ggml-hexagon/htp/hex-dma.c # ggml/src/ggml-hexagon/htp/hex-dma.h # ggml/src/ggml-hexagon/htp/hex-dump.h # ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c # ggml/src/ggml-hexagon/htp/hvx-utils.h # ggml/src/ggml-hexagon/htp/main.c # ggml/src/ggml-hexagon/htp/ssm-conv.c # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/cvt.cl # ggml/src/ggml-rpc/ggml-rpc.cpp # scripts/snapdragon/adb/run-bench.sh # scripts/sync_vendor.py # tests/test-backend-ops.cpp # tools/llama-bench/llama-bench.cpp	2026-03-25 23:45:41 +08:00
Concedo	8a6c41dc5c	Merge commit '`841bc203e2`' into concedo_experimental # Conflicts: # .github/workflows/ai-issues.yml # embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/aclnn_ops.h # ggml/src/ggml-cann/common.h # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cuda/CMakeLists.txt # ggml/src/ggml-hip/CMakeLists.txt # ggml/src/ggml-musa/CMakeLists.txt # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/cvt.cl # ggml/src/ggml-openvino/ggml-openvino.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # tests/test-chat-auto-parser.cpp # tests/test-jinja.cpp # tools/cli/README.md # tools/completion/README.md # tools/server/README.md	2026-03-25 22:49:53 +08:00
Aleksander Grygier	69e0ecef06	webui: Fix editing assistant message without branching (#20944 ) * fix: Editing assistant response without branching * chore: update webui build output	2026-03-25 12:47:33 +02:00
Pascal	062cca58fc	Add SLEEPING status to the WebUI model selector (#20949 ) * webui: handle sleeping model status, fix favourite -> favorite * Update tools/server/webui/src/lib/components/app/models/ModelsSelectorOption.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/models/ModelsSelectorOption.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: fix optional event parameter in sleeping model onclick * typo * webui: restore orange sleeping indicator dot with hover unload * chore: update webui build output * webui: move stopPropagation into ActionIcon onclick, remove svelte-ignore * chore: update webui build output * webui: fix favourite -> favorite (UK -> US spelling) everywhere Address review feedback from WhyNotHugo * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-03-25 11:02:32 +01:00
BlueMöhre	a94fdb090a	WebUI: fix edit msg form textarea height (#20830 ) * autoresize textarea on mount * allow textarea to grow to same height as rendered messages * add UI build file	2026-03-24 13:17:45 +01:00
Aleksander Grygier	11fb11b901	webui: Improve chat form positioning (#20901 )	2026-03-23 14:30:55 +01:00
Pascal	c44a932cf4	webui: fix --webui-config-file settings not applied on load (#20823 ) * webui: fix --webui-config-file settings not applied on load * chore: update webui build output	2026-03-23 11:25:35 +01:00
Concedo	ef854f002e	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/python-type-check.yml # AGENTS.md # CONTRIBUTING.md # examples/model-conversion/scripts/embedding/run-original-model.py # examples/model-conversion/scripts/utils/compare_tokens.py # examples/pydantic_models_to_grammar.py # ggml/src/ggml-rpc/ggml-rpc.cpp # pyrightconfig.json # scripts/compare-llama-bench.py # scripts/jinja/jinja-tester.py # scripts/server-bench.py # tests/test-grammar-integration.cpp # tests/test-grammar-parser.cpp # tests/test-llama-grammar.cpp # tests/test-tokenizer-random.py # tools/cli/README.md # tools/completion/README.md # tools/llama-bench/llama-bench.cpp # tools/server/README.md	2026-03-22 23:39:13 +08:00
ddh0	3306dbaef7	misc : prefer ggml-org models in docs and examples (#20827 ) * misc : prefer ggml-org models in docs and examples Prefer referring to known-good quantizations under ggml-org rather than 3rd-party uploaders. * remove accidentally committed file	2026-03-21 22:00:26 +01:00
Concedo	6054bacadd	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/ai-issues.yml # CONTRIBUTING.md # docs/autoparser.md # docs/ops.md # docs/ops/Metal.csv # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hexagon/htp/CMakeLists.txt # ggml/src/ggml-hexagon/htp/hex-dma.h # ggml/src/ggml-hexagon/htp/hex-utils.h # ggml/src/ggml-hexagon/htp/htp-ctx.h # ggml/src/ggml-hexagon/htp/htp-msg.h # ggml/src/ggml-hexagon/htp/htp_iface.idl # ggml/src/ggml-hexagon/htp/hvx-base.h # ggml/src/ggml-hexagon/htp/main.c # ggml/src/ggml-hip/CMakeLists.txt # models/templates/Apriel-1.6-15b-Thinker-fixed.jinja # models/templates/deepseek-ai-DeepSeek-R1-Distill-Qwen-32B.jinja # models/templates/deepseek-ai-DeepSeek-V3.1.jinja # models/templates/llama-cpp-deepseek-r1.jinja # models/templates/meetkai-functionary-medium-v3.1.jinja # scripts/fetch_server_test_models.py # scripts/snapdragon/adb/run-cli.sh # scripts/snapdragon/adb/run-completion.sh # scripts/snapdragon/adb/run-mtmd.sh # scripts/snapdragon/adb/run-tool.sh # tests/test-chat-auto-parser.cpp # tests/test-chat-peg-parser.cpp # tests/test-chat.cpp # tools/cli/cli.cpp # tools/server/README.md	2026-03-21 12:06:01 +08:00
Concedo	98f099aecc	Merge commit '`c1258830b2`' into concedo_experimental # Conflicts: # docs/docker.md # docs/ops.md # docs/ops/WebGPU.csv # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/get_rows.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/row_norm.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/unary.wgsl	2026-03-21 12:00:52 +08:00
Piotr Wilkin (ilintar)	5e54d51b19	common/parser: add proper reasoning tag prefill reading (#20424 ) * Implement proper prefill extraction * Refactor cli parameters, update docs, move reasoning budget sampler part to common/reasoning-budget.cpp * Update tools/server/server-task.cpp * refactor: move grammars to variant, remove grammar_external, handle exception internally * Make code less C++y Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2026-03-19 16:58:21 +01:00
Pascal	4065c1a3a6	Server becomes the source of truth for sampling parameter defaults (#20558 ) * webui: make server the source of truth for sampling defaults * webui: fix Custom badge for sampling parameters * webui: log user overrides after server sync * chore: update webui build output * fix: Default values for sampling settings config object * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2026-03-19 13:20:39 +01:00

216 commits