koboldcpp/tools/server
Concedo 7d987af23a Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.devops/cpu.Dockerfile
#	.devops/cuda.Dockerfile
#	.devops/intel.Dockerfile
#	.devops/llama-cli-cann.Dockerfile
#	.devops/musa.Dockerfile
#	.devops/openvino.Dockerfile
#	.devops/rocm.Dockerfile
#	.devops/s390x.Dockerfile
#	.devops/vulkan.Dockerfile
#	.github/ISSUE_TEMPLATE/011-bug-results.yml
#	.github/ISSUE_TEMPLATE/019-bug-misc.yml
#	.github/workflows/build-and-test-snapdragon.yml
#	.github/workflows/docker.yml
#	.github/workflows/server-self-hosted.yml
#	.github/workflows/ui-ci.yml
#	.pi/gg/SYSTEM.md
#	README.md
#	common/arg.cpp
#	docs/backend/SYCL.md
#	docs/backend/snapdragon/CMakeUserPresets.json
#	docs/backend/snapdragon/README.md
#	docs/speculative.md
#	examples/save-load-state/save-load-state.cpp
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/htp-ctx.h
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/main.c
#	ggml/src/ggml-hexagon/htp/rope-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/gated_delta_net.wgsl
#	tools/cli/README.md
#	tools/server/README.md
2026-05-20 18:48:34 +08:00
..
bench Merge branch 'upstream' into concedo_experimental 2026-03-22 23:39:13 +08:00
tests Merge branch 'upstream' into concedo_experimental 2026-05-18 21:27:23 +08:00
chat-llama2.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
chat.mjs llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00
chat.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
README-dev.md ui: Restructure repo to use tools/ui folder and ui / UI / llama-ui / LLAMA_UI naming (#23064) 2026-05-16 02:02:40 +02:00
server-chat.cpp Support for Codex CLI by skipping unsupported Responses tools (#23041) 2026-05-15 09:03:24 +02:00
server-chat.h server: (router) Forward form-data to model server (Fixes #22044) (#22118) 2026-04-27 23:55:00 +02:00
server-common.cpp common : delegate assistant continuation to underlying template handlers (#23089) 2026-05-17 13:36:05 +02:00
server-common.h logs : reduce (#23021) 2026-05-14 13:05:52 +03:00
server-context.cpp server-context: guarantee there is at least 1 token to decode (#23280) 2026-05-19 09:49:01 +03:00
server-context.h ui: Restructure repo to use tools/ui folder and ui / UI / llama-ui / LLAMA_UI naming (#23064) 2026-05-16 02:02:40 +02:00
server-cors-proxy.h server: (router) Forward form-data to model server (Fixes #22044) (#22118) 2026-04-27 23:55:00 +02:00
server-http.cpp cmake : fix LLAMA_BUILD_UI logic (#23190) 2026-05-17 14:42:26 -04:00
server-http.h server: support Vertex AI compatible API (#22545) 2026-05-08 15:23:04 +02:00
server-models.cpp ui: Refactor models store, MCP service, and gate logs behind VITE_DEBUG (#23236) 2026-05-18 16:09:40 +02:00
server-models.h ui: Restructure repo to use tools/ui folder and ui / UI / llama-ui / LLAMA_UI naming (#23064) 2026-05-16 02:02:40 +02:00
server-queue.cpp server : print warning when HTTP timeout exceeded (#22907) 2026-05-10 22:00:18 +03:00
server-queue.h server: allow router to report child instances sleep status (#20849) 2026-03-22 18:33:52 +01:00
server-task.cpp common : delegate assistant continuation to underlying template handlers (#23089) 2026-05-17 13:36:05 +02:00
server-task.h common : delegate assistant continuation to underlying template handlers (#23089) 2026-05-17 13:36:05 +02:00
server-tools.cpp server : validate --tools CLI argument against known tool names (#22538) 2026-05-05 06:35:27 +03:00
server-tools.h server: add built-in tools backend support (#20898) 2026-03-27 10:07:11 +01:00
server.cpp server: skip device enumeration in router mode to avoid creating CUDA primary context (#23137) 2026-05-16 21:21:06 +02:00