Wagner Bruna
bad9b61064
sd: sync to master-582-7023fc4 (#2150)
...
* sd: remove sampler alias handling from the C++ layer
It's already handled at the Python layer.
* sd: sync to master-580-7d33d4b
* sd: sync to master-582-7023fc4
2026-04-21 23:01:33 +08:00
Concedo
9a38091207
support q5_1 kv
2026-04-17 17:06:15 +08:00
Concedo
a165a73120
Merge commit 'd6f3030047' into concedo_experimental
...
# Conflicts:
# examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py
# examples/model-conversion/scripts/utils/semantic_check.py
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/amx/amx.cpp
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-hexagon/ggml-hexagon.cpp
# ggml/src/ggml-hip/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-openvino/ggml-openvino.cpp
# ggml/src/ggml-rpc/ggml-rpc.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp
# ggml/src/ggml-virtgpu/ggml-backend.cpp
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# ggml/src/ggml-zdnn/ggml-zdnn.cpp
# ggml/src/ggml-zendnn/ggml-zendnn.cpp
# pyproject.toml
# requirements/requirements-convert_legacy_llama.txt
# requirements/requirements-tool_bench.txt
# src/llama-model.cpp
# src/llama.cpp
# tests/test-llama-archs.cpp
# tests/test-tokenizer-0.py
# tests/test-tokenizer-random.py
# tools/llama-bench/llama-bench.cpp
# tools/perplexity/perplexity.cpp
2026-04-11 11:10:55 +08:00
Wagner Bruna
f371bb14d4
sd: sync to master-560-e8323ca (#2082)
...
* sd: sync to master-540-f16a110
* tae post-merge fixes
* build fixes
* restore image mask for non-inpainting models
* sd: sync to master-551-99c1de3
* avoid nlohmann/json.hpp include diffs
* Euler A now works on Flux
* sd: sync to master-555-7397dda
avi_writer.h got removed upstream, but I've simply kept the local
copy for now.
* sd: sync to master-558-8afbeb6
* sd: sync to master-560-e8323ca
2026-04-09 14:44:59 +08:00
Concedo
8a6c41dc5c
Merge commit '841bc203e2' into concedo_experimental
...
# Conflicts:
# .github/workflows/ai-issues.yml
# embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/aclnn_ops.h
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-hip/CMakeLists.txt
# ggml/src/ggml-musa/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/cvt.cl
# ggml/src/ggml-openvino/ggml-openvino.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# tests/test-chat-auto-parser.cpp
# tests/test-jinja.cpp
# tools/cli/README.md
# tools/completion/README.md
# tools/server/README.md
2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias
8e045b33a1
fix - w64devkit vulkan build (#2048)
2026-03-20 16:37:22 +08:00
Concedo
67c9798d0b
Merge commit '3ca19b0e9f' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# common/CMakeLists.txt
# common/chat-peg-parser.cpp
# docs/backend/SYCL.md
# docs/ops.md
# docs/ops/SYCL.csv
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/convert.hpp
# ggml/src/ggml-sycl/element_wise.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/norm.cpp
# ggml/src/ggml-sycl/rope.cpp
# ggml/src/ggml-sycl/rope.hpp
# ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
# ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
# ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
# scripts/compare-llama-bench.py
# scripts/sync_vendor.py
# tests/CMakeLists.txt
# tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
JustCommitRandomness
9ddd74111f
OpenBSD changes for vulkan backend (#2026)
...
* OpenBSD also needs alloca.h
* Changes to compile vulkan backend with OpenBSD
* Update README.md
tweak details for OpenBSD vulkan backend
* Update README.md
2026-03-08 20:41:36 +08:00
Concedo
adebf63877
ace converter
2026-02-26 19:53:02 +08:00
Concedo
0fd7d2c0e5
ace step diffusion loading
2026-02-24 15:24:15 +08:00
Concedo
13db5aee9e
stub files for loading ace step
2026-02-22 23:15:08 +08:00
Concedo
5cd6e50eab
initial files for ace step
2026-02-22 13:22:24 +08:00
Concedo
72219fdbf5
basic qwen3 tts working
2026-02-21 12:03:53 +08:00
Concedo
1af7095cb5
add qwen3 tts repo files
2026-02-21 10:54:55 +08:00
Wagner Bruna
ae5183be10
sd: sync to master-504-636d3cb (#1969)
...
* sd: sync to master-504-636d3cb
* sd: fix and simplify limit calculation
- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side was always limited by 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments
The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
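The limit rules listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this commit: the function name `area_limit`, the constants, and the idea that the configured value is a side length that gets squared are all hypothetical, and the 8192 and `hard_megapixel_res_limit` handling is left out.

```python
# Hypothetical sketch; names are illustrative and not taken from the repository.
DEFAULT_SIDE_SD1_SD2 = 832   # default side for SD1.5 / SD2
DEFAULT_SIDE_OTHER = 1024    # default side for every other model
CONFIG_SIDE_MIN = 64         # lower bound for a configured limit
CONFIG_SIDE_MAX = 2048       # upper bound for a configured limit


def area_limit(is_sd1_or_sd2, configured_side=None):
    """Return the maximum image area in pixels.

    Assumption: a configured limit is a side length, clamped to
    [64, 2048] and then squared; with no configured limit the
    per-model default side is squared instead.
    """
    if configured_side is None:
        side = DEFAULT_SIDE_SD1_SD2 if is_sd1_or_sd2 else DEFAULT_SIDE_OTHER
    else:
        side = max(CONFIG_SIDE_MIN, min(CONFIG_SIDE_MAX, configured_side))
    return side * side
```

For example, `area_limit(True)` gives 832², while `area_limit(False, 4096)` is clamped to 2048².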
2026-02-14 08:12:08 +08:00
Concedo
bff3fd3e34
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/common.cpp
# docs/backend/snapdragon/README.md
# ggml/src/ggml-hexagon/htp/htp-ops.h
# ggml/src/ggml-hexagon/htp/matmul-ops.c
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# scripts/pr2wt.sh
# tests/test-backend-ops.cpp
# tools/server/README.md
2026-02-13 14:00:45 +08:00
Concedo
423a4bd3c0
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# src/CMakeLists.txt
# tests/test-backend-ops.cpp
2026-02-06 14:43:02 +08:00
Concedo
ddce19db72
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/nix/package-gguf-py.nix
# .devops/nix/scope.nix
# common/CMakeLists.txt
# docs/backend/SYCL.md
# examples/lookahead/lookahead.cpp
# examples/lookup/lookup.cpp
# examples/sycl/run-llama2.sh
# examples/sycl/win-run-llama2.bat
# examples/sycl/win-test.bat
# ggml/src/ggml-hexagon/CMakeLists.txt
# ggml/src/ggml-hexagon/htp/flash-attn-ops.c
# ggml/src/ggml-hexagon/htp/hvx-dump.h
# ggml/src/ggml-hexagon/htp/hvx-reduce.h
# ggml/src/ggml-hexagon/htp/matmul-ops.c
# ggml/src/ggml-hexagon/htp/softmax-ops.c
# ggml/src/ggml-hexagon/htp/unary-ops.c
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/cvt.cl
# scripts/sync-ggml.last
2026-02-01 22:35:25 +08:00
Concedo
7e755014b2
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/winget.yml
# CODEOWNERS
# common/CMakeLists.txt
# common/arg.cpp
# docs/ops/SYCL.csv
# examples/lookup/lookup-create.cpp
# examples/lookup/lookup-stats.cpp
# examples/lookup/lookup.cpp
# examples/speculative-simple/speculative-simple.cpp
# examples/speculative/speculative.cpp
# ggml/src/ggml-hip/CMakeLists.txt
# ggml/src/ggml-sycl/dpct/helper.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/norm.cpp
# ggml/src/ggml-zendnn/ggml-zendnn.cpp
# tests/test-chat-template.cpp
2026-01-29 23:05:05 +08:00
Concedo
e8e7c357c9
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build-cache.yml
# .github/workflows/build-cmake-pkg.yml
# .github/workflows/build-linux-cross.yml
# .github/workflows/build.yml
# .github/workflows/check-vendor.yml
# .github/workflows/close-issue.yml
# .github/workflows/copilot-setup-steps.yml
# .github/workflows/docker.yml
# .github/workflows/editorconfig.yml
# .github/workflows/gguf-publish.yml
# .github/workflows/labeler.yml
# .github/workflows/pre-tokenizer-hashes.yml
# .github/workflows/python-check-requirements.yml
# .github/workflows/python-lint.yml
# .github/workflows/python-type-check.yml
# .github/workflows/release.yml
# .github/workflows/server-webui.yml
# .github/workflows/server.yml
# .github/workflows/update-ops-docs.yml
# .github/workflows/winget.yml
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-zdnn/ggml-zdnn.cpp
# requirements/requirements-tool_bench.txt
# src/CMakeLists.txt
# src/llama-quant.cpp
# tests/test-backend-ops.cpp
# tests/test-chat.cpp
# tools/cli/cli.cpp
# tools/server/README.md
2026-01-23 14:27:04 +08:00
Concedo
5c6cc02985
remove clblast, part 2
2026-01-23 14:09:46 +08:00
Concedo
7f485e5287
remove CLBlast, part 1
2026-01-23 13:50:12 +08:00
Concedo
8855a7f52b
Merge commit 'c945aaaef2' into concedo_experimental
...
# Conflicts:
# .devops/cann.Dockerfile
# .github/workflows/build.yml
# .github/workflows/release.yml
# README.md
# common/CMakeLists.txt
# common/chat.cpp
# docs/function-calling.md
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/aclnn_ops.h
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
# scripts/sync_vendor.py
# tests/CMakeLists.txt
# tests/peg-parser/tests.h
# tests/test-chat-peg-parser.cpp
# tests/test-chat-template.cpp
# tests/test-chat.cpp
# tests/testing.h
# tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
d15bd212c5
cleanup
2026-01-17 00:57:33 +08:00
Concedo
cde4791e36
fix tools building
2025-12-19 12:08:29 +08:00
Concedo
a01b49098c
fix tool builds
2025-12-18 23:26:31 +08:00
Concedo
1daeed5d4d
Merge commit '9963b81f63' into concedo_experimental
...
# Conflicts:
# .github/workflows/server.yml
# SECURITY.md
# docs/backend/SYCL.md
# examples/model-conversion/README.md
# examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
# ggml/src/ggml-hexagon/ggml-hexagon.cpp
# ggml/src/ggml-hexagon/htp/matmul-ops.c
# tests/CMakeLists.txt
# tests/test-chat.cpp
# tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
cacfa37611
wip
2025-12-17 16:04:45 +08:00
Wagner Bruna
78bbe89956
sd: sync to master-417-43a70e8 (#1889)
...
* sd: sync to master-417-43a70e8
* fix sdmain build
* switch to upstream apply_loras()
* refactor u8 path conversions and add it to the gguf reader
2025-12-16 16:16:48 +08:00
Concedo
010995c967
Merge commit '4df6e859e9' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# README.md
# ci/run.sh
# examples/gen-docs/gen-docs.cpp
# scripts/snapdragon/adb/run-cli.sh
# tests/test-lora-conversion-inference.sh
# tools/CMakeLists.txt
# tools/completion/CMakeLists.txt
# tools/completion/README.md
# tools/server/CMakeLists.txt
2025-12-12 17:23:25 +08:00
Concedo
cd73613136
moved volta onto tile kernels, so building for cc7.0 can be avoided
...
this shouldn't do anything (+2 squashed commit)
Squashed commit:
[1cdcb302a] another attempt to tip the scales, part 2
[8f647b709] another attempt to tip the scales (volta)
2025-12-08 19:51:54 +08:00
Concedo
d27949f22a
Revert "try remove volta as a dedicated target b (+1 squashed commits)"
...
This reverts commit ddba580f00.
2025-12-06 21:31:44 +08:00
Concedo
ddba580f00
try remove volta as a dedicated target b (+1 squashed commits)
...
Squashed commits:
[2df689a03] try remove volta as a dedicated target
2025-12-06 21:31:06 +08:00
Concedo
e570478275
limit cuda arches + scale tweaks
2025-11-28 13:05:11 +08:00
Wagner Bruna
3318b73c94
sd: sync to master-355-694f0d9
2025-11-23 19:28:34 -03:00
LostRuins Concedo
5751c30790
add vulkan for whisper
2025-11-13 15:37:58 +08:00
LostRuins Concedo
d6a2ad8455
still not really working right
2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d
rename a missed BLAS -> batch
2025-11-06 16:11:26 +08:00
Concedo
b5d3dcb6c0
add workflow for older pc
2025-10-29 17:35:04 +08:00
Wagner Bruna
d7da1eb35c
invert KCPP_BAKE_SD_VOCAB logic, move define to sdtype_adapter.cpp (#1803)
...
Using KCPP_BAKE_SD_VOCAB to turn off the change to not embed the
vocabulary files makes testing new upstream merges harder, because
we then need to set that macro on the sd.cpp original build.
So, revert the tests, making the define turn the change on. Also,
since model.cpp is always built by Koboldcpp as part of the
sdtype_adapter.cpp, it's enough to set the macro on that file.
2025-10-20 10:07:37 +08:00
Concedo
59aa1529dc
add embeddings vulkan to makefile
2025-10-13 11:05:45 +08:00
Concedo
e0ba01c65e
fix cuda builds
2025-10-12 20:09:16 +08:00
Concedo
f282362414
added qwen image support (+1 squashed commits)
...
Squashed commits:
[92df28061] added qwen image support (+1 squashed commits)
Squashed commits:
[1485c71ed] wip adding qwen image
2025-10-03 18:58:48 +08:00
Concedo
4f8f0e5949
move embeds into their own dir, detach sd vocab into separate files
2025-10-03 14:21:09 +08:00
Concedo
c00ae93421
makefile fix vulkan noext compile (+1 squashed commits)
...
Squashed commits:
[eae88fd49] makefile fix vulkan noext compile
2025-10-02 23:19:45 +08:00
Concedo
1a4f54dd11
update for cu13 builds (no ci will be provided)
2025-09-26 16:01:43 +08:00
Concedo
326f6f3fad
not sure if working on metal
2025-09-21 11:35:02 +08:00
tsite
04498a345a
update makefile to clone llguidance if the directory does not exist (#1743)
...
also remove llguidance when running 'make clean'
2025-09-21 08:40:55 +08:00
Concedo
fddd046f9d
metal common
2025-09-15 01:58:32 +08:00
Concedo
a5580a32fb
fix cuda and macos compile issues
2025-09-12 20:53:42 +08:00