koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-22 19:47:49 +00:00

Author	SHA1	Message	Date
Concedo	7e08e8d8b4	add some rpc dependencies (+1 squashed commits) Squashed commits: [b092a94e5] add some rpc dependencies	2026-05-18 22:17:30 +08:00
Wagner Bruna	90326f8585	sd: sync to master-612-d7ecbe1 (#2213)	2026-05-18 21:19:12 +08:00
Concedo	1e828ccabf	Merge branch 'upstream' into concedo_experimental # Conflicts: # common/common.cpp # ggml/CMakeLists.txt # scripts/sync-ggml.last # scripts/sync_vendor.py # src/llama-context.cpp # tests/CMakeLists.txt # tests/test-backend-ops.cpp # tools/cli/README.md # tools/completion/README.md # tools/server/README.md	2026-05-17 11:26:18 +08:00
Concedo	79666e5764	revert sdcpp build steps to use makefile and cmake without external txt files	2026-05-16 00:53:56 +08:00
Concedo	f8391d527a	fix broken makefile	2026-05-15 23:02:38 +08:00
Wagner Bruna	bfe9548fd5	sd: sync to master-596-90e87bc (#2204) * sd: reuse source lists between make and cmake * sd: sync to master-596-90e87bc * Update source file path for sdtype_adapter.cpp --------- Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>	2026-05-14 23:14:33 +08:00
Concedo	4cfa1ad1c4	rpc server test build	2026-05-12 23:32:42 +08:00
Wagner Bruna	243b03586b	sd: build each source file separately (#2188) * sd: build source files separately * sd: decouple stable-diffusion.cpp and sdtype_adapter.cpp * sd: remove include util.h from sdtype_adapter.cpp * sd: update source file lists and review dependencies	2026-05-07 22:50:10 +08:00
henk717	bcf9c81e0d	Linux CUDA13 Action (#2186) * Linux CU13 CI * Bump max CUDA arch * CUDA13 Linux * Upload the correct build to rolling (CUDA13) * Downgrade cuda to get better compatibility Runpod can't handle 13.1, and if they can't handle it neither can the people with a secondary GPU of an older generation. * Add support for compute capability 89 in NVCCFLAGS	2026-05-06 18:06:39 +08:00
Concedo	950676fdb7	split utils.cpp into 2 files to support sd.cpp	2026-05-04 15:04:12 +08:00
Wagner Bruna	276c651a12	sd: sync to master-593-3d6064b (#2175) * sd: sync to master-593-3d6064b * sd: use the same sdtype_adapter object for all builds Since master-592-b8079e2, no sd.cpp source depends on the ggml backend build anymore. * sd: fix main_gpu selection * sd: report backend devices to the Python layer	2026-05-04 14:05:34 +08:00
Wagner Bruna	e2bdd6d7aa	sd: sync to master-591-331cfa5 (#2155) * sd: sync to master-585-44cca3d * sd: sync to master-587-b8bdffc * sd: sync to master-591-331cfa5	2026-05-01 16:33:28 +08:00
Wagner Bruna	bad9b61064	sd: sync to master-582-7023fc4 (#2150) * sd: remove sampler alias handling from the C++ layer It's already handled at the Python layer. * sd: sync to master-580-7d33d4b * sd: sync to master-582-7023fc4	2026-04-21 23:01:33 +08:00
Concedo	9a38091207	support q5_1 kv	2026-04-17 17:06:15 +08:00
Concedo	a165a73120	Merge commit '`d6f3030047`' into concedo_experimental # Conflicts: # examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py # examples/model-conversion/scripts/utils/semantic_check.py # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cpu/amx/amx.cpp # ggml/src/ggml-cuda/CMakeLists.txt # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hip/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-openvino/ggml-openvino.cpp # ggml/src/ggml-rpc/ggml-rpc.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp # ggml/src/ggml-virtgpu/ggml-backend.cpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-zdnn/ggml-zdnn.cpp # ggml/src/ggml-zendnn/ggml-zendnn.cpp # pyproject.toml # requirements/requirements-convert_legacy_llama.txt # requirements/requirements-tool_bench.txt # src/llama-model.cpp # src/llama.cpp # tests/test-llama-archs.cpp # tests/test-tokenizer-0.py # tests/test-tokenizer-random.py # tools/llama-bench/llama-bench.cpp # tools/perplexity/perplexity.cpp	2026-04-11 11:10:55 +08:00
Wagner Bruna	f371bb14d4	sd: sync to master-560-e8323ca (#2082) * sd: sync to master-540-f16a110 * tae post-merge fixes * build fixes * restore image mask for non-inpainting models * sd: sync to master-551-99c1de3 * avoid nlohmann/json.hpp include diffs * Euler A now works on Flux * sd: sync to master-555-7397dda avi_writer.h got removed upstream, but I've simply kept the local copy for now. * sd: sync to master-558-8afbeb6 * sd: sync to master-560-e8323ca	2026-04-09 14:44:59 +08:00
Concedo	8a6c41dc5c	Merge commit '`841bc203e2`' into concedo_experimental # Conflicts: # .github/workflows/ai-issues.yml # embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/aclnn_ops.h # ggml/src/ggml-cann/common.h # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cuda/CMakeLists.txt # ggml/src/ggml-hip/CMakeLists.txt # ggml/src/ggml-musa/CMakeLists.txt # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/cvt.cl # ggml/src/ggml-openvino/ggml-openvino.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # tests/test-chat-auto-parser.cpp # tests/test-jinja.cpp # tools/cli/README.md # tools/completion/README.md # tools/server/README.md	2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias	8e045b33a1	fix - w64devkit vulkan build (#2048)	2026-03-20 16:37:22 +08:00
Concedo	67c9798d0b	Merge commit '`3ca19b0e9f`' into concedo_experimental # Conflicts: # .github/workflows/build.yml # common/CMakeLists.txt # common/chat-peg-parser.cpp # docs/backend/SYCL.md # docs/ops.md # docs/ops/SYCL.csv # ggml/src/ggml-sycl/common.hpp # ggml/src/ggml-sycl/convert.hpp # ggml/src/ggml-sycl/element_wise.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-sycl/norm.cpp # ggml/src/ggml-sycl/rope.cpp # ggml/src/ggml-sycl/rope.hpp # ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp # ggml/src/ggml-webgpu/ggml-webgpu.cpp # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl # ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl # scripts/compare-llama-bench.py # scripts/sync_vendor.py # tests/CMakeLists.txt # tools/cli/cli.cpp	2026-03-15 11:11:31 +08:00
JustCommitRandomness	9ddd74111f	OpenBSD changes for vulkan backend (#2026) * OpenBSD also needs alloca.h * Changes to compile vulkan backend with OpenBSD * Update README.md tweak details for OpenBSD vulkan backend * Update README.md	2026-03-08 20:41:36 +08:00
Concedo	adebf63877	ace converter	2026-02-26 19:53:02 +08:00
Concedo	0fd7d2c0e5	ace step diffusion loading	2026-02-24 15:24:15 +08:00
Concedo	13db5aee9e	stub files for loading ace step	2026-02-22 23:15:08 +08:00
Concedo	5cd6e50eab	initial files for ace step	2026-02-22 13:22:24 +08:00
Concedo	72219fdbf5	basic qwen3 tts working	2026-02-21 12:03:53 +08:00
Concedo	1af7095cb5	add qwen3 tts repo files	2026-02-21 10:54:55 +08:00
Wagner Bruna	ae5183be10	sd: sync to master-504-636d3cb (#1969) * sd: sync to master-504-636d3cb * sd: fix and simplify limit calculation - restore the "arbitrarily high" 8192 limit, since it's used to turn off the img_hard_limit (and if each side was always limited by 2048, we wouldn't need hard_megapixel_res_limit) - avoid changing the config cfg_square_limit during a generation - apply the hard_megapixel_res_limit only in the configuration-changed path, since the default path uses constants - clean up comments The calculation itself remains the same: - default area limit: 832² for SD1.5/SD2, 1024² otherwise - configured limit always between 64 and 2048	2026-02-14 08:12:08 +08:00
Concedo	bff3fd3e34	Merge branch 'upstream' into concedo_experimental # Conflicts: # common/common.cpp # docs/backend/snapdragon/README.md # ggml/src/ggml-hexagon/htp/htp-ops.h # ggml/src/ggml-hexagon/htp/matmul-ops.c # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # scripts/pr2wt.sh # tests/test-backend-ops.cpp # tools/server/README.md	2026-02-13 14:00:45 +08:00
Concedo	423a4bd3c0	Merge branch 'upstream' into concedo_experimental # Conflicts: # src/CMakeLists.txt # tests/test-backend-ops.cpp	2026-02-06 14:43:02 +08:00
Concedo	ddce19db72	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/nix/package-gguf-py.nix # .devops/nix/scope.nix # common/CMakeLists.txt # docs/backend/SYCL.md # examples/lookahead/lookahead.cpp # examples/lookup/lookup.cpp # examples/sycl/run-llama2.sh # examples/sycl/win-run-llama2.bat # examples/sycl/win-test.bat # ggml/src/ggml-hexagon/CMakeLists.txt # ggml/src/ggml-hexagon/htp/flash-attn-ops.c # ggml/src/ggml-hexagon/htp/hvx-dump.h # ggml/src/ggml-hexagon/htp/hvx-reduce.h # ggml/src/ggml-hexagon/htp/matmul-ops.c # ggml/src/ggml-hexagon/htp/softmax-ops.c # ggml/src/ggml-hexagon/htp/unary-ops.c # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/cvt.cl # scripts/sync-ggml.last	2026-02-01 22:35:25 +08:00
Concedo	7e755014b2	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/winget.yml # CODEOWNERS # common/CMakeLists.txt # common/arg.cpp # docs/ops/SYCL.csv # examples/lookup/lookup-create.cpp # examples/lookup/lookup-stats.cpp # examples/lookup/lookup.cpp # examples/speculative-simple/speculative-simple.cpp # examples/speculative/speculative.cpp # ggml/src/ggml-hip/CMakeLists.txt # ggml/src/ggml-sycl/dpct/helper.hpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-sycl/norm.cpp # ggml/src/ggml-zendnn/ggml-zendnn.cpp # tests/test-chat-template.cpp	2026-01-29 23:05:05 +08:00
Concedo	e8e7c357c9	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build-cache.yml # .github/workflows/build-cmake-pkg.yml # .github/workflows/build-linux-cross.yml # .github/workflows/build.yml # .github/workflows/check-vendor.yml # .github/workflows/close-issue.yml # .github/workflows/copilot-setup-steps.yml # .github/workflows/docker.yml # .github/workflows/editorconfig.yml # .github/workflows/gguf-publish.yml # .github/workflows/labeler.yml # .github/workflows/pre-tokenizer-hashes.yml # .github/workflows/python-check-requirements.yml # .github/workflows/python-lint.yml # .github/workflows/python-type-check.yml # .github/workflows/release.yml # .github/workflows/server-webui.yml # .github/workflows/server.yml # .github/workflows/update-ops-docs.yml # .github/workflows/winget.yml # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-zdnn/ggml-zdnn.cpp # requirements/requirements-tool_bench.txt # src/CMakeLists.txt # src/llama-quant.cpp # tests/test-backend-ops.cpp # tests/test-chat.cpp # tools/cli/cli.cpp # tools/server/README.md	2026-01-23 14:27:04 +08:00
Concedo	5c6cc02985	remove clblast, part 2	2026-01-23 14:09:46 +08:00
Concedo	7f485e5287	remove CLBlast, part 1	2026-01-23 13:50:12 +08:00
Concedo	8855a7f52b	Merge commit '`c945aaaef2`' into concedo_experimental # Conflicts: # .devops/cann.Dockerfile # .github/workflows/build.yml # .github/workflows/release.yml # README.md # common/CMakeLists.txt # common/chat.cpp # docs/function-calling.md # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/aclnn_ops.h # ggml/src/ggml-cann/common.h # ggml/src/ggml-cann/ggml-cann.cpp # models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja # scripts/sync_vendor.py # tests/CMakeLists.txt # tests/peg-parser/tests.h # tests/test-chat-peg-parser.cpp # tests/test-chat-template.cpp # tests/test-chat.cpp # tests/testing.h # tools/llama-bench/llama-bench.cpp	2026-01-17 10:24:03 +08:00
Concedo	d15bd212c5	cleanup	2026-01-17 00:57:33 +08:00
Concedo	cde4791e36	fix tools building	2025-12-19 12:08:29 +08:00
Concedo	a01b49098c	fix tool builds	2025-12-18 23:26:31 +08:00
Concedo	1daeed5d4d	Merge commit '`9963b81f63`' into concedo_experimental # Conflicts: # .github/workflows/server.yml # SECURITY.md # docs/backend/SYCL.md # examples/model-conversion/README.md # examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh # ggml/src/ggml-hexagon/ggml-hexagon.cpp # ggml/src/ggml-hexagon/htp/matmul-ops.c # tests/CMakeLists.txt # tests/test-chat.cpp # tests/test-json-schema-to-grammar.cpp	2025-12-17 20:30:34 +08:00
Concedo	cacfa37611	wip	2025-12-17 16:04:45 +08:00
Wagner Bruna	78bbe89956	sd: sync to master-417-43a70e8 (#1889) * sd: sync to master-417-43a70e8 * fix sdmain build * switch to upstream apply_loras() * refactor u8 path conversions and add it to the gguf reader	2025-12-16 16:16:48 +08:00
Concedo	010995c967	Merge commit '`4df6e859e9`' into concedo_experimental # Conflicts: # .github/workflows/build.yml # README.md # ci/run.sh # examples/gen-docs/gen-docs.cpp # scripts/snapdragon/adb/run-cli.sh # tests/test-lora-conversion-inference.sh # tools/CMakeLists.txt # tools/completion/CMakeLists.txt # tools/completion/README.md # tools/server/CMakeLists.txt	2025-12-12 17:23:25 +08:00
Concedo	cd73613136	moved volta onto tile kernels, so building for cc7.0 can be avoided this shouldn't do anything (+2 squashed commit) Squashed commit: [1cdcb302a] another attempt to tip the scales, part 2 [8f647b709] another attempt to tip the scales (volta)	2025-12-08 19:51:54 +08:00
Concedo	d27949f22a	Revert "try remove volta as a dedicated target b (+1 squashed commits)" This reverts commit `ddba580f00`.	2025-12-06 21:31:44 +08:00
Concedo	ddba580f00	try remove volta as a dedicated target b (+1 squashed commits) Squashed commits: [2df689a03] try remove volta as a dedicated target	2025-12-06 21:31:06 +08:00
Concedo	e570478275	limit cuda arches + scale tweaks	2025-11-28 13:05:11 +08:00
Wagner Bruna	3318b73c94	sd: sync to master-355-694f0d9	2025-11-23 19:28:34 -03:00
LostRuins Concedo	5751c30790	add vulkan for whisper	2025-11-13 15:37:58 +08:00
LostRuins Concedo	d6a2ad8455	still not really working right	2025-11-09 01:57:48 +08:00
LostRuins Concedo	cfb22b5c9d	rename a missed BLAS -> batch	2025-11-06 16:11:26 +08:00

712 commits