Commit graph

709 commits

Author | SHA1 | Message | Date
Concedo
79666e5764 revert sdcpp build steps to use makefile and cmake without external txt files 2026-05-16 00:53:56 +08:00
Concedo
f8391d527a fix broken makefile 2026-05-15 23:02:38 +08:00
Wagner Bruna
bfe9548fd5
sd: sync to master-596-90e87bc (#2204)
* sd: reuse source lists between make and cmake

* sd: sync to master-596-90e87bc

* Update source file path for sdtype_adapter.cpp

---------

Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2026-05-14 23:14:33 +08:00
Concedo
4cfa1ad1c4 rpc server test build 2026-05-12 23:32:42 +08:00
Wagner Bruna
243b03586b
sd: build each source file separately (#2188)
* sd: build source files separately

* sd: decouple stable-diffusion.cpp and sdtype_adapter.cpp

* sd: remove include util.h from sdtype_adapter.cpp

* sd: update source file lists and review dependencies
2026-05-07 22:50:10 +08:00
henk717
bcf9c81e0d
Linux CUDA13 Action (#2186)
* Linux CU13 CI

* Bump max CUDA arch

* CUDA13 Linux

* Upload the correct build to rolling (CUDA13)

* Downgrade cuda to get better compatibility

Runpod can't handle 13.1, and if they can't handle it neither can the people with a secondary GPU of an older generation.

* Add support for compute capability 89 in NVCCFLAGS
2026-05-06 18:06:39 +08:00
Concedo
950676fdb7 split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
Wagner Bruna
276c651a12
sd: sync to master-593-3d6064b (#2175)
* sd: sync to master-593-3d6064b

* sd: use the same sdtype_adapter object for all builds

Since master-592-b8079e2, no sd.cpp source depends on the ggml
backend build anymore.

* sd: fix main_gpu selection

* sd: report backend devices to the Python layer
2026-05-04 14:05:34 +08:00
Wagner Bruna
e2bdd6d7aa
sd: sync to master-591-331cfa5 (#2155)
* sd: sync to master-585-44cca3d

* sd: sync to master-587-b8bdffc

* sd: sync to master-591-331cfa5
2026-05-01 16:33:28 +08:00
Wagner Bruna
bad9b61064
sd: sync to master-582-7023fc4 (#2150)
* sd: remove sampler alias handling from the C++ layer

It's already handled at the Python layer.

* sd: sync to master-580-7d33d4b

* sd: sync to master-582-7023fc4
2026-04-21 23:01:33 +08:00
Concedo
9a38091207 support q5_1 kv 2026-04-17 17:06:15 +08:00
Concedo
a165a73120 Merge commit 'd6f3030047' into concedo_experimental
# Conflicts:
#	examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py
#	examples/model-conversion/scripts/utils/semantic_check.py
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cpu/amx/amx.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp
#	ggml/src/ggml-virtgpu/ggml-backend.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	pyproject.toml
#	requirements/requirements-convert_legacy_llama.txt
#	requirements/requirements-tool_bench.txt
#	src/llama-model.cpp
#	src/llama.cpp
#	tests/test-llama-archs.cpp
#	tests/test-tokenizer-0.py
#	tests/test-tokenizer-random.py
#	tools/llama-bench/llama-bench.cpp
#	tools/perplexity/perplexity.cpp
2026-04-11 11:10:55 +08:00
Wagner Bruna
f371bb14d4
sd: sync to master-560-e8323ca (#2082)
* sd: sync to master-540-f16a110

* tae post-merge fixes

* build fixes

* restore image mask for non-inpainting models

* sd: sync to master-551-99c1de3

* avoid nlohmann/json.hpp include diffs

* Euler A now works on Flux

* sd: sync to master-555-7397dda

avi_writer.h got removed upstream, but I've simply kept the local
copy for now.

* sd: sync to master-558-8afbeb6

* sd: sync to master-560-e8323ca
2026-04-09 14:44:59 +08:00
Concedo
8a6c41dc5c Merge commit '841bc203e2' into concedo_experimental
# Conflicts:
#	.github/workflows/ai-issues.yml
#	embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	tests/test-chat-auto-parser.cpp
#	tests/test-jinja.cpp
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/server/README.md
2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias
8e045b33a1
fix - w64devkit vulkan build (#2048) 2026-03-20 16:37:22 +08:00
Concedo
67c9798d0b Merge commit '3ca19b0e9f' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	common/CMakeLists.txt
#	common/chat-peg-parser.cpp
#	docs/backend/SYCL.md
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/convert.hpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	ggml/src/ggml-sycl/rope.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
#	scripts/compare-llama-bench.py
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
JustCommitRandomness
9ddd74111f
OpenBSD changes for vulkan backend (#2026)
* OpenBSD also needs alloca.h

* Changes to compile vulkan backend with OpenBSD

* Update README.md

tweak details for OpenBSD vulkan backend

* Update README.md
2026-03-08 20:41:36 +08:00
Concedo
adebf63877 ace converter 2026-02-26 19:53:02 +08:00
Concedo
0fd7d2c0e5 ace step diffusion loading 2026-02-24 15:24:15 +08:00
Concedo
13db5aee9e stub files for loading ace step 2026-02-22 23:15:08 +08:00
Concedo
5cd6e50eab initial files for ace step 2026-02-22 13:22:24 +08:00
Concedo
72219fdbf5 basic qwen3 tts working 2026-02-21 12:03:53 +08:00
Concedo
1af7095cb5 add qwen3 tts repo files 2026-02-21 10:54:55 +08:00
Wagner Bruna
ae5183be10
sd: sync to master-504-636d3cb (#1969)
* sd: sync to master-504-636d3cb

* sd: fix and simplify limit calculation

- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side was always limited by 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments

The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
2026-02-14 08:12:08 +08:00
Concedo
bff3fd3e34 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/common.cpp
#	docs/backend/snapdragon/README.md
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	scripts/pr2wt.sh
#	tests/test-backend-ops.cpp
#	tools/server/README.md
2026-02-13 14:00:45 +08:00
Concedo
423a4bd3c0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
2026-02-06 14:43:02 +08:00
Concedo
ddce19db72 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package-gguf-py.nix
#	.devops/nix/scope.nix
#	common/CMakeLists.txt
#	docs/backend/SYCL.md
#	examples/lookahead/lookahead.cpp
#	examples/lookup/lookup.cpp
#	examples/sycl/run-llama2.sh
#	examples/sycl/win-run-llama2.bat
#	examples/sycl/win-test.bat
#	ggml/src/ggml-hexagon/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/flash-attn-ops.c
#	ggml/src/ggml-hexagon/htp/hvx-dump.h
#	ggml/src/ggml-hexagon/htp/hvx-reduce.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	scripts/sync-ggml.last
2026-02-01 22:35:25 +08:00
Concedo
7e755014b2 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/winget.yml
#	CODEOWNERS
#	common/CMakeLists.txt
#	common/arg.cpp
#	docs/ops/SYCL.csv
#	examples/lookup/lookup-create.cpp
#	examples/lookup/lookup-stats.cpp
#	examples/lookup/lookup.cpp
#	examples/speculative-simple/speculative-simple.cpp
#	examples/speculative/speculative.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-sycl/dpct/helper.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	tests/test-chat-template.cpp
2026-01-29 23:05:05 +08:00
Concedo
e8e7c357c9 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build-cache.yml
#	.github/workflows/build-cmake-pkg.yml
#	.github/workflows/build-linux-cross.yml
#	.github/workflows/build.yml
#	.github/workflows/check-vendor.yml
#	.github/workflows/close-issue.yml
#	.github/workflows/copilot-setup-steps.yml
#	.github/workflows/docker.yml
#	.github/workflows/editorconfig.yml
#	.github/workflows/gguf-publish.yml
#	.github/workflows/labeler.yml
#	.github/workflows/pre-tokenizer-hashes.yml
#	.github/workflows/python-check-requirements.yml
#	.github/workflows/python-lint.yml
#	.github/workflows/python-type-check.yml
#	.github/workflows/release.yml
#	.github/workflows/server-webui.yml
#	.github/workflows/server.yml
#	.github/workflows/update-ops-docs.yml
#	.github/workflows/winget.yml
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	requirements/requirements-tool_bench.txt
#	src/CMakeLists.txt
#	src/llama-quant.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tools/cli/cli.cpp
#	tools/server/README.md
2026-01-23 14:27:04 +08:00
Concedo
5c6cc02985 remove clblast, part 2 2026-01-23 14:09:46 +08:00
Concedo
7f485e5287 remove CLBlast, part 1 2026-01-23 13:50:12 +08:00
Concedo
8855a7f52b Merge commit 'c945aaaef2' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	README.md
#	common/CMakeLists.txt
#	common/chat.cpp
#	docs/function-calling.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/peg-parser/tests.h
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tests/testing.h
#	tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
d15bd212c5 cleanup 2026-01-17 00:57:33 +08:00
Concedo
cde4791e36 fix tools building 2025-12-19 12:08:29 +08:00
Concedo
a01b49098c fix tool builds 2025-12-18 23:26:31 +08:00
Concedo
1daeed5d4d Merge commit '9963b81f63' into concedo_experimental
# Conflicts:
#	.github/workflows/server.yml
#	SECURITY.md
#	docs/backend/SYCL.md
#	examples/model-conversion/README.md
#	examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
cacfa37611 wip 2025-12-17 16:04:45 +08:00
Wagner Bruna
78bbe89956
sd: sync to master-417-43a70e8 (#1889)
* sd: sync to master-417-43a70e8

* fix sdmain build

* switch to upstream apply_loras()

* refactor u8 path conversions and add it to the gguf reader
2025-12-16 16:16:48 +08:00
Concedo
010995c967 Merge commit '4df6e859e9' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	ci/run.sh
#	examples/gen-docs/gen-docs.cpp
#	scripts/snapdragon/adb/run-cli.sh
#	tests/test-lora-conversion-inference.sh
#	tools/CMakeLists.txt
#	tools/completion/CMakeLists.txt
#	tools/completion/README.md
#	tools/server/CMakeLists.txt
2025-12-12 17:23:25 +08:00
Concedo
cd73613136 moved volta onto tile kernels, so building for cc7.0 can be avoided
this shouldn't do anything (+2 squashed commit)

Squashed commit:

[1cdcb302a] another attempt to tip the scales, part 2

[8f647b709] another attempt to tip the scales (volta)
2025-12-08 19:51:54 +08:00
Concedo
d27949f22a Revert "try remove volta as a dedicated target b (+1 squashed commits)"
This reverts commit ddba580f00.
2025-12-06 21:31:44 +08:00
Concedo
ddba580f00 try remove volta as a dedicated target b (+1 squashed commits)
Squashed commits:

[2df689a03] try remove volta as a dedicated target
2025-12-06 21:31:06 +08:00
Concedo
e570478275 limit cuda arches + scale tweaks 2025-11-28 13:05:11 +08:00
Wagner Bruna
3318b73c94 sd: sync to master-355-694f0d9 2025-11-23 19:28:34 -03:00
LostRuins Concedo
5751c30790 add vulkan for whisper 2025-11-13 15:37:58 +08:00
LostRuins Concedo
d6a2ad8455 still not really working right 2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00
Concedo
b5d3dcb6c0 add workflow for older pc 2025-10-29 17:35:04 +08:00
Wagner Bruna
d7da1eb35c
invert KCPP_BAKE_SD_VOCAB logic, move define to sdtype_adapter.cpp (#1803)
Using KCPP_BAKE_SD_VOCAB to turn off the change to not embed the
vocabulary files makes testing new upstream merges harder, because
we then need to set that macro on the sd.cpp original build.

So, revert the tests, making the define turn the change on. Also,
since model.cpp is always built by Koboldcpp as part of the
sdtype_adapter.cpp, it's enough to set the macro on that file.
2025-10-20 10:07:37 +08:00
Concedo
59aa1529dc add embeddings vulkan to makefile 2025-10-13 11:05:45 +08:00