koboldcpp/tools/server
Concedo cc82c3164e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/intel.Dockerfile
#	.github/workflows/build-cross.yml
#	.github/workflows/build-sycl.yml
#	.github/workflows/build.yml
#	.github/workflows/editorconfig.yml
#	.github/workflows/release.yml
#	cmake/riscv64-spacemit-linux-gnu-gcc.cmake
#	docs/backend/OPENVINO.md
#	docs/backend/SYCL.md
#	docs/build-riscv64-spacemit.md
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	embd_res/ggml-vocab-qwen35.gguf
#	embd_res/ggml-vocab-qwen35.gguf.inp
#	embd_res/ggml-vocab-qwen35.gguf.out
#	examples/model-conversion/Makefile
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/hmx-flash-attn-ops.c
#	ggml/src/ggml-hexagon/htp/hmx-matmul-ops.c
#	ggml/src/ggml-hexagon/htp/hmx-utils.h
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/hvx-utils.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-sycl/CMakeLists.txt
#	ggml/src/ggml-sycl/common.cpp
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/common_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/flash_attn_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/flash_attn_vec_reduce.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/flash_attn_vec_split.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/get_rows.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec_acc.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/unary.wgsl
#	ggml/src/ggml-zendnn/CMakeLists.txt
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	scripts/snapdragon/adb/run-completion.sh
#	tests/CMakeLists.txt
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/mtmd/clip-impl.h
#	tools/mtmd/clip.cpp
#	tools/mtmd/clip.h
#	tools/server/README.md
2026-05-14 19:04:04 +08:00
..
bench Merge branch 'upstream' into concedo_experimental 2026-03-22 23:39:13 +08:00
public fix: Autoscroll detection (#23026) 2026-05-14 08:09:29 +02:00
tests Merge branch 'upstream' into concedo_experimental 2026-05-14 19:04:04 +08:00
webui Merge branch 'upstream' into concedo_experimental 2026-05-14 19:04:04 +08:00
chat-llama2.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
chat.mjs llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
chat.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
README-dev.md server: (webui) no more gzip compression (#21073) 2026-03-31 15:44:26 +02:00
server-chat.cpp server: (router) Forward form-data to model server (Fixes #22044) (#22118) 2026-04-27 23:55:00 +02:00
server-chat.h server: (router) Forward form-data to model server (Fixes #22044) (#22118) 2026-04-27 23:55:00 +02:00
server-common.cpp server, webui: accept continue_final_message flag for vLLM API compat (#23012) 2026-05-13 20:47:58 +02:00
server-common.h logs : reduce (#23021) 2026-05-14 13:05:52 +03:00
server-context.cpp logs : reduce (#23021) 2026-05-14 13:05:52 +03:00
server-context.h server: (router) expose child model info from router's /v1/models (#22683) 2026-05-08 14:42:15 +02:00
server-cors-proxy.h server: (router) Forward form-data to model server (Fixes #22044) (#22118) 2026-04-27 23:55:00 +02:00
server-http.cpp logs : reduce (#23021) 2026-05-14 13:05:52 +03:00
server-http.h server: support Vertex AI compatible API (#22545) 2026-05-08 15:23:04 +02:00
server-models.cpp mtmd, server, common: expose modalities to /v1/models (#22952) 2026-05-12 19:08:07 +02:00
server-models.h mtmd, server, common: expose modalities to /v1/models (#22952) 2026-05-12 19:08:07 +02:00
server-queue.cpp server : print warning when HTTP timeout exceeded (#22907) 2026-05-10 22:00:18 +03:00
server-queue.h server: allow router to report child instances sleep status (#20849) 2026-03-22 18:33:52 +01:00
server-task.cpp logs : reduce (#23021) 2026-05-14 13:05:52 +03:00
server-task.h spec : parallel drafting support (#22838) 2026-05-11 19:09:43 +03:00
server-tools.cpp server : validate --tools CLI argument against known tool names (#22538) 2026-05-05 06:35:27 +03:00
server-tools.h server: add built-in tools backend support (#20898) 2026-03-27 10:07:11 +01:00
server.cpp logs : reduce (#23021) 2026-05-14 13:05:52 +03:00