Commit graph

700 commits

Author SHA1 Message Date
Wagner Bruna
bad9b61064
sd: sync to master-582-7023fc4 (#2150)
* sd: remove sampler alias handling from the C++ layer

It's already handled at the Python layer.

* sd: sync to master-580-7d33d4b

* sd: sync to master-582-7023fc4
2026-04-21 23:01:33 +08:00
Concedo
9a38091207 support q5_1 kv 2026-04-17 17:06:15 +08:00
Concedo
a165a73120 Merge commit 'd6f3030047' into concedo_experimental
# Conflicts:
#	examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py
#	examples/model-conversion/scripts/utils/semantic_check.py
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cpu/amx/amx.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp
#	ggml/src/ggml-virtgpu/ggml-backend.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	pyproject.toml
#	requirements/requirements-convert_legacy_llama.txt
#	requirements/requirements-tool_bench.txt
#	src/llama-model.cpp
#	src/llama.cpp
#	tests/test-llama-archs.cpp
#	tests/test-tokenizer-0.py
#	tests/test-tokenizer-random.py
#	tools/llama-bench/llama-bench.cpp
#	tools/perplexity/perplexity.cpp
2026-04-11 11:10:55 +08:00
Wagner Bruna
f371bb14d4
sd: sync to master-560-e8323ca (#2082)
* sd: sync to master-540-f16a110

* tae post-merge fixes

* build fixes

* restore image mask for non-inpainting models

* sd: sync to master-551-99c1de3

* avoid nlohmann/json.hpp include diffs

* Euler A now works on Flux

* sd: sync to master-555-7397dda

avi_writer.h got removed upstream, but I've simply kept the local
copy for now.

* sd: sync to master-558-8afbeb6

* sd: sync to master-560-e8323ca
2026-04-09 14:44:59 +08:00
Concedo
8a6c41dc5c Merge commit '841bc203e2' into concedo_experimental
# Conflicts:
#	.github/workflows/ai-issues.yml
#	embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	ggml/src/ggml-openvino/ggml-openvino.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	tests/test-chat-auto-parser.cpp
#	tests/test-jinja.cpp
#	tools/cli/README.md
#	tools/completion/README.md
#	tools/server/README.md
2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias
8e045b33a1
fix - w64devkit vulkan build (#2048) 2026-03-20 16:37:22 +08:00
Concedo
67c9798d0b Merge commit '3ca19b0e9f' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	common/CMakeLists.txt
#	common/chat-peg-parser.cpp
#	docs/backend/SYCL.md
#	docs/ops.md
#	docs/ops/SYCL.csv
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/convert.hpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	ggml/src/ggml-sycl/rope.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
#	ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
#	scripts/compare-llama-bench.py
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
JustCommitRandomness
9ddd74111f
OpenBSD changes for vulkan backend (#2026)
* OpenBSD also needs alloca.h

* Changes to compile vulkan backend with OpenBSD

* Update README.md

tweak details for OpenBSD vulkan backend

* Update README.md
2026-03-08 20:41:36 +08:00
Concedo
adebf63877 ace converter 2026-02-26 19:53:02 +08:00
Concedo
0fd7d2c0e5 ace step diffusion loading 2026-02-24 15:24:15 +08:00
Concedo
13db5aee9e stub files for loading ace step 2026-02-22 23:15:08 +08:00
Concedo
5cd6e50eab initial files for ace step 2026-02-22 13:22:24 +08:00
Concedo
72219fdbf5 basic qwen3 tts working 2026-02-21 12:03:53 +08:00
Concedo
1af7095cb5 add qwen3 tts repo files 2026-02-21 10:54:55 +08:00
Wagner Bruna
ae5183be10
sd: sync to master-504-636d3cb (#1969)
* sd: sync to master-504-636d3cb

* sd: fix and simplify limit calculation

- restore the "arbitrarily high" 8192 limit, since it's used to turn
  off the img_hard_limit (and if each side was always limited by 2048,
  we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
  path, since the default path uses constants
- clean up comments

The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
2026-02-14 08:12:08 +08:00
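The limit rules summarized in commit ae5183be10 above can be illustrated with a minimal sketch. It is not the repository's code: the function and constant names are hypothetical, and it assumes the "configured limit" is a per-side value in pixels, which the message suggests but does not state outright. It covers only the two rules listed under "The calculation itself remains the same" and leaves out the 8192 and hard_megapixel_res_limit handling.

```python
# Hypothetical sketch of the limits named in the commit message.
# All names here are illustrative and do not come from the repository.

SIDE_MIN = 64    # lowest value a configured limit may take
SIDE_MAX = 2048  # highest value a configured limit may take


def default_area_limit(is_sd1_or_sd2: bool) -> int:
    """Default pixel-area limit: 832^2 for SD1.5/SD2, 1024^2 otherwise."""
    side = 832 if is_sd1_or_sd2 else 1024
    return side * side


def clamp_configured_limit(requested: int) -> int:
    """Keep a user-configured limit between 64 and 2048 (assumed per side)."""
    return max(SIDE_MIN, min(SIDE_MAX, requested))
```

Under these assumptions, a configured value outside the range is pulled to the nearest bound, while the default path depends only on the model family.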
Concedo
bff3fd3e34 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/common.cpp
#	docs/backend/snapdragon/README.md
#	ggml/src/ggml-hexagon/htp/htp-ops.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	scripts/pr2wt.sh
#	tests/test-backend-ops.cpp
#	tools/server/README.md
2026-02-13 14:00:45 +08:00
Concedo
423a4bd3c0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
2026-02-06 14:43:02 +08:00
Concedo
ddce19db72 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package-gguf-py.nix
#	.devops/nix/scope.nix
#	common/CMakeLists.txt
#	docs/backend/SYCL.md
#	examples/lookahead/lookahead.cpp
#	examples/lookup/lookup.cpp
#	examples/sycl/run-llama2.sh
#	examples/sycl/win-run-llama2.bat
#	examples/sycl/win-test.bat
#	ggml/src/ggml-hexagon/CMakeLists.txt
#	ggml/src/ggml-hexagon/htp/flash-attn-ops.c
#	ggml/src/ggml-hexagon/htp/hvx-dump.h
#	ggml/src/ggml-hexagon/htp/hvx-reduce.h
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	ggml/src/ggml-hexagon/htp/softmax-ops.c
#	ggml/src/ggml-hexagon/htp/unary-ops.c
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/cvt.cl
#	scripts/sync-ggml.last
2026-02-01 22:35:25 +08:00
Concedo
7e755014b2 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/winget.yml
#	CODEOWNERS
#	common/CMakeLists.txt
#	common/arg.cpp
#	docs/ops/SYCL.csv
#	examples/lookup/lookup-create.cpp
#	examples/lookup/lookup-stats.cpp
#	examples/lookup/lookup.cpp
#	examples/speculative-simple/speculative-simple.cpp
#	examples/speculative/speculative.cpp
#	ggml/src/ggml-hip/CMakeLists.txt
#	ggml/src/ggml-sycl/dpct/helper.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/norm.cpp
#	ggml/src/ggml-zendnn/ggml-zendnn.cpp
#	tests/test-chat-template.cpp
2026-01-29 23:05:05 +08:00
Concedo
e8e7c357c9 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build-cache.yml
#	.github/workflows/build-cmake-pkg.yml
#	.github/workflows/build-linux-cross.yml
#	.github/workflows/build.yml
#	.github/workflows/check-vendor.yml
#	.github/workflows/close-issue.yml
#	.github/workflows/copilot-setup-steps.yml
#	.github/workflows/docker.yml
#	.github/workflows/editorconfig.yml
#	.github/workflows/gguf-publish.yml
#	.github/workflows/labeler.yml
#	.github/workflows/pre-tokenizer-hashes.yml
#	.github/workflows/python-check-requirements.yml
#	.github/workflows/python-lint.yml
#	.github/workflows/python-type-check.yml
#	.github/workflows/release.yml
#	.github/workflows/server-webui.yml
#	.github/workflows/server.yml
#	.github/workflows/update-ops-docs.yml
#	.github/workflows/winget.yml
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-zdnn/ggml-zdnn.cpp
#	requirements/requirements-tool_bench.txt
#	src/CMakeLists.txt
#	src/llama-quant.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat.cpp
#	tools/cli/cli.cpp
#	tools/server/README.md
2026-01-23 14:27:04 +08:00
Concedo
5c6cc02985 remove clblast, part 2 2026-01-23 14:09:46 +08:00
Concedo
7f485e5287 remove CLBlast, part 1 2026-01-23 13:50:12 +08:00
Concedo
8855a7f52b Merge commit 'c945aaaef2' into concedo_experimental
# Conflicts:
#	.devops/cann.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	README.md
#	common/CMakeLists.txt
#	common/chat.cpp
#	docs/function-calling.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/aclnn_ops.h
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
#	scripts/sync_vendor.py
#	tests/CMakeLists.txt
#	tests/peg-parser/tests.h
#	tests/test-chat-peg-parser.cpp
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tests/testing.h
#	tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
d15bd212c5 cleanup 2026-01-17 00:57:33 +08:00
Concedo
cde4791e36 fix tools building 2025-12-19 12:08:29 +08:00
Concedo
a01b49098c fix tool builds 2025-12-18 23:26:31 +08:00
Concedo
1daeed5d4d Merge commit '9963b81f63' into concedo_experimental
# Conflicts:
#	.github/workflows/server.yml
#	SECURITY.md
#	docs/backend/SYCL.md
#	examples/model-conversion/README.md
#	examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
cacfa37611 wip 2025-12-17 16:04:45 +08:00
Wagner Bruna
78bbe89956
sd: sync to master-417-43a70e8 (#1889)
* sd: sync to master-417-43a70e8

* fix sdmain build

* switch to upstream apply_loras()

* refactor u8 path conversions and add it to the gguf reader
2025-12-16 16:16:48 +08:00
Concedo
010995c967 Merge commit '4df6e859e9' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	ci/run.sh
#	examples/gen-docs/gen-docs.cpp
#	scripts/snapdragon/adb/run-cli.sh
#	tests/test-lora-conversion-inference.sh
#	tools/CMakeLists.txt
#	tools/completion/CMakeLists.txt
#	tools/completion/README.md
#	tools/server/CMakeLists.txt
2025-12-12 17:23:25 +08:00
Concedo
cd73613136 moved volta onto tile kernels, so building for cc7.0 can be avoided
this shouldn't do anything (+2 squashed commit)

Squashed commit:

[1cdcb302a] another attempt to tip the scales, part 2

[8f647b709] another attempt to tip the scales (volta)
2025-12-08 19:51:54 +08:00
Concedo
d27949f22a Revert "try remove volta as a dedicated target b (+1 squashed commits)"
This reverts commit ddba580f00.
2025-12-06 21:31:44 +08:00
Concedo
ddba580f00 try remove volta as a dedicated target b (+1 squashed commits)
Squashed commits:

[2df689a03] try remove volta as a dedicated target
2025-12-06 21:31:06 +08:00
Concedo
e570478275 limit cuda arches + scale tweaks 2025-11-28 13:05:11 +08:00
Wagner Bruna
3318b73c94 sd: sync to master-355-694f0d9 2025-11-23 19:28:34 -03:00
LostRuins Concedo
5751c30790 add vulkan for whisper 2025-11-13 15:37:58 +08:00
LostRuins Concedo
d6a2ad8455 still not really working right 2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00
Concedo
b5d3dcb6c0 add workflow for older pc 2025-10-29 17:35:04 +08:00
Wagner Bruna
d7da1eb35c
invert KCPP_BAKE_SD_VOCAB logic, move define to sdtype_adapter.cpp (#1803)
Using KCPP_BAKE_SD_VOCAB to turn off the change to not embed the
vocabulary files makes testing new upstream merges harder, because
we then need to set that macro on the sd.cpp original build.

So, revert the tests, making the define turn the change on. Also,
since model.cpp is always built by Koboldcpp as part of the
sdtype_adapter.cpp, it's enough to set the macro on that file.
2025-10-20 10:07:37 +08:00
Concedo
59aa1529dc add embeddings vulkan to makefile 2025-10-13 11:05:45 +08:00
Concedo
e0ba01c65e fix cuda builds 2025-10-12 20:09:16 +08:00
Concedo
f282362414 added qwen image support (+1 squashed commits)
Squashed commits:

[92df28061] added qwen image support (+1 squashed commits)

Squashed commits:

[1485c71ed] wip adding qwen image
2025-10-03 18:58:48 +08:00
Concedo
4f8f0e5949 move embeds into their own dir, detach sd vocab into separate files 2025-10-03 14:21:09 +08:00
Concedo
c00ae93421 makefile fix vulkan noext compile (+1 squashed commits)
Squashed commits:

[eae88fd49] makefile fix vulkan noext compile
2025-10-02 23:19:45 +08:00
Concedo
1a4f54dd11 update for cu13 builds (no ci will be provided) 2025-09-26 16:01:43 +08:00
Concedo
326f6f3fad not sure if working on metal 2025-09-21 11:35:02 +08:00
tsite
04498a345a
update makefile to clone llguidance if the directory does not exist (#1743)
also remove llguidance when running 'make clean'
2025-09-21 08:40:55 +08:00
Concedo
fddd046f9d metal common 2025-09-15 01:58:32 +08:00
Concedo
a5580a32fb fix cuda and macos compile issues 2025-09-12 20:53:42 +08:00