Commit graph

709 commits

Author | SHA1 | Message | Date
Concedo
79666e5764 revert sdcpp build steps to use makefile and cmake without external txt files 2026-05-16 00:53:56 +08:00
Concedo
f8391d527a fix broken makefile 2026-05-15 23:02:38 +08:00
Wagner Bruna
bfe9548fd5
sd: sync to master-596-90e87bc (#2204)
* sd: reuse source lists between make and cmake

* sd: sync to master-596-90e87bc

* Update source file path for sdtype_adapter.cpp

---------

Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2026-05-14 23:14:33 +08:00
Concedo
4cfa1ad1c4 rpc server test build 2026-05-12 23:32:42 +08:00
Wagner Bruna
243b03586b
sd: build each source file separately (#2188)
* sd: build source files separately

* sd: decouple stable-diffusion.cpp and sdtype_adapter.cpp

* sd: remove include util.h from sdtype_adapter.cpp

* sd: update source file lists and review dependencies
2026-05-07 22:50:10 +08:00
henk717
bcf9c81e0d
Linux CUDA13 Action (#2186)
* Linux CU13 CI

* Bump max CUDA arch

* CUDA13 Linux

* Upload the correct build to rolling (CUDA13)

* Downgrade cuda to get better compatibility

Runpod can't handle 13.1, and if they can't handle it neither can the people with a secondary GPU of an older generation.

* Add support for compute capability 89 in NVCCFLAGS
2026-05-06 18:06:39 +08:00
Concedo
950676fdb7 split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
Wagner Bruna
276c651a12
sd: sync to master-593-3d6064b (#2175)
* sd: sync to master-593-3d6064b

* sd: use the same sdtype_adapter object for all builds

Since master-592-b8079e2, no sd.cpp source depends on the ggml
backend build anymore.

* sd: fix main_gpu selection

* sd: report backend devices to the Python layer
2026-05-04 14:05:34 +08:00
Wagner Bruna
e2bdd6d7aa
sd: sync to master-591-331cfa5 (#2155)
* sd: sync to master-585-44cca3d

* sd: sync to master-587-b8bdffc

* sd: sync to master-591-331cfa5
2026-05-01 16:33:28 +08:00
Wagner Bruna
bad9b61064
sd: sync to master-582-7023fc4 (#2150)
* sd: remove sampler alias handling from the C++ layer

It's already handled at the Python layer.

* sd: sync to master-580-7d33d4b

* sd: sync to master-582-7023fc4
2026-04-21 23:01:33 +08:00
Concedo
9a38091207 support q5_1 kv 2026-04-17 17:06:15 +08:00
Concedo
a165a73120 Merge commit 'd6f3030047' into concedo_experimental
# Conflicts:
#	examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py
#	examples/model-conversion/scripts/utils/semantic_check.py
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cpu/amx/amx.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp
#	ggml/src/ggml-virtgpu/ggml-backend.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	pyproject.toml
#	requirements/requirements-convert_legacy_llama.txt
#	requirements/requirements-tool_bench.txt
#	src/llama-model.cpp
#	src/llama.cpp
#	tests/test-llama-archs.cpp
#	tests/test-tokenizer-0.py
#	tests/test-tokenizer-random.py
#	tools/llama-bench/llama-bench.cpp
#	tools/perplexity/perplexity.cpp
2026-04-11 11:10:55 +08:00
Wagner Bruna
f371bb14d4
sd: sync to master-560-e8323ca (#2082)
* sd: sync to master-540-f16a110

* tae post-merge fixes

* build fixes

* restore image mask for non-inpainting models

* sd: sync to master-551-99c1de3

* avoid nlohmann/json.hpp include diffs

* Euler A now works on Flux

* sd: sync to master-555-7397dda

avi_writer.h got removed upstream, but I've simply kept the local
copy for now.

* sd: sync to master-558-8afbeb6

* sd: sync to master-560-e8323ca
2026-04-09 14:44:59 +08:00
Concedo
8a6c41dc5c Merge commit '841bc203e2' into concedo_experimental
# Conflicts:
#	.github/workflows/ai-issues.yml
#	embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	tests/test-chat-auto-parser.cpp
#	tests/test-jinja.cpp
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/server/README.md
2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias
8e045b33a1
fix - w64devkit vulkan build (#2048) 2026-03-20 16:37:22 +08:00
Concedo
67c9798d0b Merge commit '3ca19b0e9f' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	common/CMakeLists.txt
#	common/chat-peg-parser.cpp
#	docs/backend/SYCL.md
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/convert.hpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	ggml/src/ggml-sycl/rope.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
#	scripts/compare-llama-bench.py
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
JustCommitRandomness
9ddd74111f
OpenBSD changes for vulkan backend (#2026)
* OpenBSD also needs alloca.h

* Changes to compile vulkan backend with OpenBSD

* Update README.md

tweak details for OpenBSD vulkan backend

* Update README.md
2026-03-08 20:41:36 +08:00
Concedo
adebf63877 ace converter 2026-02-26 19:53:02 +08:00
Concedo
0fd7d2c0e5 ace step diffusion loading 2026-02-24 15:24:15 +08:00
Concedo
13db5aee9e stub files for loading ace step 2026-02-22 23:15:08 +08:00
Concedo
5cd6e50eab initial files for ace step 2026-02-22 13:22:24 +08:00
Concedo
72219fdbf5 basic qwen3 tts working 2026-02-21 12:03:53 +08:00
Concedo
1af7095cb5 add qwen3 tts repo files 2026-02-21 10:54:55 +08:00
Wagner Bruna
ae5183be10
sd: sync to master-504-636d3cb (#1969)
* sd: sync to master-504-636d3cb

* sd: fix and simplify limit calculation

- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side was always limited by 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments

The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
2026-02-14 08:12:08 +08:00
Concedo
bff3fd3e34 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/common.cpp
#	docs/backend/snapdragon/README.md
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	scripts/pr2wt.sh
#	tests/test-backend-ops.cpp
#	tools/server/README.md
2026-02-13 14:00:45 +08:00
Concedo
423a4bd3c0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
2026-02-06 14:43:02 +08:00
Concedo
ddce19db72 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package-gguf-py.nix
#	.devops/nix/scope.nix
#	common/CMakeLists.txt
#	docs/backend/SYCL.md
#	examples/lookahead/lookahead.cpp
#	examples/lookup/lookup.cpp
#	examples/sycl/run-llama2.sh
#	examples/sycl/win-run-llama2.bat
#	examples/sycl/win-test.bat
#	ggml/src/ggml-hexagon/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/flash-attn-ops.c
#	ggml/src/ggml-hexagon/htp/hvx-dump.h
#	ggml/src/ggml-hexagon/htp/hvx-reduce.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	scripts/sync-ggml.last
2026-02-01 22:35:25 +08:00
Concedo
7e755014b2 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/winget.yml
#	CODEOWNERS
#	common/CMakeLists.txt
#	common/arg.cpp
#	docs/ops/SYCL.csv
#	examples/lookup/lookup-create.cpp
#	examples/lookup/lookup-stats.cpp
#	examples/lookup/lookup.cpp
#	examples/speculative-simple/speculative-simple.cpp
#	examples/speculative/speculative.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-sycl/dpct/helper.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	tests/test-chat-template.cpp
2026-01-29 23:05:05 +08:00
Concedo
e8e7c357c9 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build-cache.yml
#	.github/workflows/build-cmake-pkg.yml
#	.github/workflows/build-linux-cross.yml
#	.github/workflows/build.yml
#	.github/workflows/check-vendor.yml
#	.github/workflows/close-issue.yml
#	.github/workflows/copilot-setup-steps.yml
#	.github/workflows/docker.yml
#	.github/workflows/editorconfig.yml
#	.github/workflows/gguf-publish.yml
#	.github/workflows/labeler.yml
#	.github/workflows/pre-tokenizer-hashes.yml
#	.github/workflows/python-check-requirements.yml
#	.github/workflows/python-lint.yml
#	.github/workflows/python-type-check.yml
#	.github/workflows/release.yml
#	.github/workflows/server-webui.yml
#	.github/workflows/server.yml
#	.github/workflows/update-ops-docs.yml
#	.github/workflows/winget.yml
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	requirements/requirements-tool_bench.txt
#	src/CMakeLists.txt
#	src/llama-quant.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tools/cli/cli.cpp
#	tools/server/README.md
2026-01-23 14:27:04 +08:00
Concedo
5c6cc02985 remove clblast, part 2 2026-01-23 14:09:46 +08:00
Concedo
7f485e5287 remove CLBlast, part 1 2026-01-23 13:50:12 +08:00
Concedo
8855a7f52b Merge commit 'c945aaaef2' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	README.md
#	common/CMakeLists.txt
#	common/chat.cpp
#	docs/function-calling.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/peg-parser/tests.h
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tests/testing.h
#	tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
d15bd212c5 cleanup 2026-01-17 00:57:33 +08:00
Concedo
cde4791e36 fix tools building 2025-12-19 12:08:29 +08:00
Concedo
a01b49098c fix tool builds 2025-12-18 23:26:31 +08:00
Concedo
1daeed5d4d Merge commit '9963b81f63' into concedo_experimental
# Conflicts:
#	.github/workflows/server.yml
#	SECURITY.md
#	docs/backend/SYCL.md
#	examples/model-conversion/README.md
#	examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
cacfa37611 wip 2025-12-17 16:04:45 +08:00
Wagner Bruna
78bbe89956
sd: sync to master-417-43a70e8 (#1889)
* sd: sync to master-417-43a70e8

* fix sdmain build

* switch to upstream apply_loras()

* refactor u8 path conversions and add it to the gguf reader
2025-12-16 16:16:48 +08:00
Concedo
010995c967 Merge commit '4df6e859e9' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	ci/run.sh
#	examples/gen-docs/gen-docs.cpp
#	scripts/snapdragon/adb/run-cli.sh
#	tests/test-lora-conversion-inference.sh
#	tools/CMakeLists.txt
#	tools/completion/CMakeLists.txt
#	tools/completion/README.md
#	tools/server/CMakeLists.txt
2025-12-12 17:23:25 +08:00
Concedo
cd73613136 moved volta onto tile kernels, so building for cc7.0 can be avoided
this shouldn't do anything (+2 squashed commit)

Squashed commit:

[1cdcb302a] another attempt to tip the scales, part 2

[8f647b709] another attempt to tip the scales (volta)
2025-12-08 19:51:54 +08:00
Concedo
d27949f22a Revert "try remove volta as a dedicated target b (+1 squashed commits)"
This reverts commit ddba580f00.
2025-12-06 21:31:44 +08:00
Concedo
ddba580f00 try remove volta as a dedicated target b (+1 squashed commits)
Squashed commits:

[2df689a03] try remove volta as a dedicated target
2025-12-06 21:31:06 +08:00
Concedo
e570478275 limit cuda arches + scale tweaks 2025-11-28 13:05:11 +08:00
Wagner Bruna
3318b73c94 sd: sync to master-355-694f0d9 2025-11-23 19:28:34 -03:00
LostRuins Concedo
5751c30790 add vulkan for whisper 2025-11-13 15:37:58 +08:00
LostRuins Concedo
d6a2ad8455 still not really working right 2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00
Concedo
b5d3dcb6c0 add workflow for older pc 2025-10-29 17:35:04 +08:00
Wagner Bruna
d7da1eb35c
invert KCPP_BAKE_SD_VOCAB logic, move define to sdtype_adapter.cpp (#1803)
Using KCPP_BAKE_SD_VOCAB to turn off the change to not embed the
vocabulary files makes testing new upstream merges harder, because
we then need to set that macro on the sd.cpp original build.

So, revert the tests, making the define turn the change on. Also,
since model.cpp is always built by Koboldcpp as part of the
sdtype_adapter.cpp, it's enough to set the macro on that file.
2025-10-20 10:07:37 +08:00
Concedo
59aa1529dc add embeddings vulkan to makefile 2025-10-13 11:05:45 +08:00