Commit graph

712 commits

Author SHA1 Message Date
Concedo
7e08e8d8b4 add some rpc dependencies (+1 squashed commits)
Squashed commits:

[b092a94e5] add some rpc dependencies
2026-05-18 22:17:30 +08:00
Wagner Bruna
90326f8585
sd: sync to master-612-d7ecbe1 (#2213) 2026-05-18 21:19:12 +08:00
Concedo
1e828ccabf Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/common.cpp
#	ggml/CMakeLists.txt
#	scripts/sync-ggml.last
#	scripts/sync_vendor.py
#	src/llama-context.cpp
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/server/README.md
2026-05-17 11:26:18 +08:00
Concedo
79666e5764 revert sdcpp build steps to use makefile and cmake without external txt files 2026-05-16 00:53:56 +08:00
Concedo
f8391d527a fix broken makefile 2026-05-15 23:02:38 +08:00
Wagner Bruna
bfe9548fd5
sd: sync to master-596-90e87bc (#2204)
* sd: reuse source lists between make and cmake

* sd: sync to master-596-90e87bc

* Update source file path for sdtype_adapter.cpp

---------

Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2026-05-14 23:14:33 +08:00
Concedo
4cfa1ad1c4 rpc server test build 2026-05-12 23:32:42 +08:00
Wagner Bruna
243b03586b
sd: build each source file separately (#2188)
* sd: build source files separately

* sd: decouple stable-diffusion.cpp and sdtype_adapter.cpp

* sd: remove include util.h from sdtype_adapter.cpp

* sd: update source file lists and review dependencies
2026-05-07 22:50:10 +08:00
henk717
bcf9c81e0d
Linux CUDA13 Action (#2186)
* Linux CU13 CI

* Bump max CUDA arch

* CUDA13 Linux

* Upload the correct build to rolling (CUDA13)

* Downgrade cuda to get better compatibility

Runpod can't handle 13.1, and if they can't handle it neither can the people with a secondary GPU of an older generation.

* Add support for compute capability 89 in NVCCFLAGS
2026-05-06 18:06:39 +08:00
Concedo
950676fdb7 split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
Wagner Bruna
276c651a12
sd: sync to master-593-3d6064b (#2175)
* sd: sync to master-593-3d6064b

* sd: use the same sdtype_adapter object for all builds

Since master-592-b8079e2, no sd.cpp source depends on the ggml
backend build anymore.

* sd: fix main_gpu selection

* sd: report backend devices to the Python layer
2026-05-04 14:05:34 +08:00
Wagner Bruna
e2bdd6d7aa
sd: sync to master-591-331cfa5 (#2155)
* sd: sync to master-585-44cca3d

* sd: sync to master-587-b8bdffc

* sd: sync to master-591-331cfa5
2026-05-01 16:33:28 +08:00
Wagner Bruna
bad9b61064
sd: sync to master-582-7023fc4 (#2150)
* sd: remove sampler alias handling from the C++ layer

It's already handled at the Python layer.

* sd: sync to master-580-7d33d4b

* sd: sync to master-582-7023fc4
2026-04-21 23:01:33 +08:00
Concedo
9a38091207 support q5_1 kv 2026-04-17 17:06:15 +08:00
Concedo
a165a73120 Merge commit 'd6f3030047' into concedo_experimental
# Conflicts:
#	examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py
#	examples/model-conversion/scripts/utils/semantic_check.py
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cpu/amx/amx.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp
#	ggml/src/ggml-virtgpu/ggml-backend.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	pyproject.toml
#	requirements/requirements-convert_legacy_llama.txt
#	requirements/requirements-tool_bench.txt
#	src/llama-model.cpp
#	src/llama.cpp
#	tests/test-llama-archs.cpp
#	tests/test-tokenizer-0.py
#	tests/test-tokenizer-random.py
#	tools/llama-bench/llama-bench.cpp
#	tools/perplexity/perplexity.cpp
2026-04-11 11:10:55 +08:00
Wagner Bruna
f371bb14d4
sd: sync to master-560-e8323ca (#2082)
* sd: sync to master-540-f16a110

* tae post-merge fixes

* build fixes

* restore image mask for non-inpainting models

* sd: sync to master-551-99c1de3

* avoid nlohmann/json.hpp include diffs

* Euler A now works on Flux

* sd: sync to master-555-7397dda

avi_writer.h got removed upstream, but I've simply kept the local
copy for now.

* sd: sync to master-558-8afbeb6

* sd: sync to master-560-e8323ca
2026-04-09 14:44:59 +08:00
Concedo
8a6c41dc5c Merge commit '841bc203e2' into concedo_experimental
# Conflicts:
#	.github/workflows/ai-issues.yml
#	embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	tests/test-chat-auto-parser.cpp
#	tests/test-jinja.cpp
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/server/README.md
2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias
8e045b33a1
fix - w64devkit vulkan build (#2048) 2026-03-20 16:37:22 +08:00
Concedo
67c9798d0b Merge commit '3ca19b0e9f' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	common/CMakeLists.txt
#	common/chat-peg-parser.cpp
#	docs/backend/SYCL.md
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/convert.hpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	ggml/src/ggml-sycl/rope.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
#	scripts/compare-llama-bench.py
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
JustCommitRandomness
9ddd74111f
OpenBSD changes for vulkan backend (#2026)
* OpenBSD also needs alloca.h

* Changes to compile vulkan backend with OpenBSD

* Update README.md

tweak details for OpenBSD vulkan backend

* Update README.md
2026-03-08 20:41:36 +08:00
Concedo
adebf63877 ace converter 2026-02-26 19:53:02 +08:00
Concedo
0fd7d2c0e5 ace step diffusion loading 2026-02-24 15:24:15 +08:00
Concedo
13db5aee9e stub files for loading ace step 2026-02-22 23:15:08 +08:00
Concedo
5cd6e50eab initial files for ace step 2026-02-22 13:22:24 +08:00
Concedo
72219fdbf5 basic qwen3 tts working 2026-02-21 12:03:53 +08:00
Concedo
1af7095cb5 add qwen3 tts repo files 2026-02-21 10:54:55 +08:00
Wagner Bruna
ae5183be10
sd: sync to master-504-636d3cb (#1969)
* sd: sync to master-504-636d3cb

* sd: fix and simplify limit calculation

- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side was always limited by 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments

The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
2026-02-14 08:12:08 +08:00
Concedo
bff3fd3e34 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/common.cpp
#	docs/backend/snapdragon/README.md
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	scripts/pr2wt.sh
#	tests/test-backend-ops.cpp
#	tools/server/README.md
2026-02-13 14:00:45 +08:00
Concedo
423a4bd3c0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
2026-02-06 14:43:02 +08:00
Concedo
ddce19db72 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package-gguf-py.nix
#	.devops/nix/scope.nix
#	common/CMakeLists.txt
#	docs/backend/SYCL.md
#	examples/lookahead/lookahead.cpp
#	examples/lookup/lookup.cpp
#	examples/sycl/run-llama2.sh
#	examples/sycl/win-run-llama2.bat
#	examples/sycl/win-test.bat
#	ggml/src/ggml-hexagon/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/flash-attn-ops.c
#	ggml/src/ggml-hexagon/htp/hvx-dump.h
#	ggml/src/ggml-hexagon/htp/hvx-reduce.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	scripts/sync-ggml.last
2026-02-01 22:35:25 +08:00
Concedo
7e755014b2 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/winget.yml
#	CODEOWNERS
#	common/CMakeLists.txt
#	common/arg.cpp
#	docs/ops/SYCL.csv
#	examples/lookup/lookup-create.cpp
#	examples/lookup/lookup-stats.cpp
#	examples/lookup/lookup.cpp
#	examples/speculative-simple/speculative-simple.cpp
#	examples/speculative/speculative.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-sycl/dpct/helper.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	tests/test-chat-template.cpp
2026-01-29 23:05:05 +08:00
Concedo
e8e7c357c9 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build-cache.yml
#	.github/workflows/build-cmake-pkg.yml
#	.github/workflows/build-linux-cross.yml
#	.github/workflows/build.yml
#	.github/workflows/check-vendor.yml
#	.github/workflows/close-issue.yml
#	.github/workflows/copilot-setup-steps.yml
#	.github/workflows/docker.yml
#	.github/workflows/editorconfig.yml
#	.github/workflows/gguf-publish.yml
#	.github/workflows/labeler.yml
#	.github/workflows/pre-tokenizer-hashes.yml
#	.github/workflows/python-check-requirements.yml
#	.github/workflows/python-lint.yml
#	.github/workflows/python-type-check.yml
#	.github/workflows/release.yml
#	.github/workflows/server-webui.yml
#	.github/workflows/server.yml
#	.github/workflows/update-ops-docs.yml
#	.github/workflows/winget.yml
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	requirements/requirements-tool_bench.txt
#	src/CMakeLists.txt
#	src/llama-quant.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tools/cli/cli.cpp
#	tools/server/README.md
2026-01-23 14:27:04 +08:00
Concedo
5c6cc02985 remove clblast, part 2 2026-01-23 14:09:46 +08:00
Concedo
7f485e5287 remove CLBlast, part 1 2026-01-23 13:50:12 +08:00
Concedo
8855a7f52b Merge commit 'c945aaaef2' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	README.md
#	common/CMakeLists.txt
#	common/chat.cpp
#	docs/function-calling.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/peg-parser/tests.h
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tests/testing.h
#	tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
d15bd212c5 cleanup 2026-01-17 00:57:33 +08:00
Concedo
cde4791e36 fix tools building 2025-12-19 12:08:29 +08:00
Concedo
a01b49098c fix tool builds 2025-12-18 23:26:31 +08:00
Concedo
1daeed5d4d Merge commit '9963b81f63' into concedo_experimental
# Conflicts:
#	.github/workflows/server.yml
#	SECURITY.md
#	docs/backend/SYCL.md
#	examples/model-conversion/README.md
#	examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
cacfa37611 wip 2025-12-17 16:04:45 +08:00
Wagner Bruna
78bbe89956
sd: sync to master-417-43a70e8 (#1889)
* sd: sync to master-417-43a70e8

* fix sdmain build

* switch to upstream apply_loras()

* refactor u8 path conversions and add it to the gguf reader
2025-12-16 16:16:48 +08:00
Concedo
010995c967 Merge commit '4df6e859e9' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	ci/run.sh
#	examples/gen-docs/gen-docs.cpp
#	scripts/snapdragon/adb/run-cli.sh
#	tests/test-lora-conversion-inference.sh
#	tools/CMakeLists.txt
#	tools/completion/CMakeLists.txt
#	tools/completion/README.md
#	tools/server/CMakeLists.txt
2025-12-12 17:23:25 +08:00
Concedo
cd73613136 moved volta onto tile kernels, so building for cc7.0 can be avoided
this shouldn't do anything (+2 squashed commit)

Squashed commit:

[1cdcb302a] another attempt to tip the scales, part 2

[8f647b709] another attempt to tip the scales (volta)
2025-12-08 19:51:54 +08:00
Concedo
d27949f22a Revert "try remove volta as a dedicated target b (+1 squashed commits)"
This reverts commit ddba580f00.
2025-12-06 21:31:44 +08:00
Concedo
ddba580f00 try remove volta as a dedicated target b (+1 squashed commits)
Squashed commits:

[2df689a03] try remove volta as a dedicated target
2025-12-06 21:31:06 +08:00
Concedo
e570478275 limit cuda arches + scale tweaks 2025-11-28 13:05:11 +08:00
Wagner Bruna
3318b73c94 sd: sync to master-355-694f0d9 2025-11-23 19:28:34 -03:00
LostRuins Concedo
5751c30790 add vulkan for whisper 2025-11-13 15:37:58 +08:00
LostRuins Concedo
d6a2ad8455 still not really working right 2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00