Concedo
7e08e8d8b4
add some rpc dependencies (+1 squashed commits)
...
Squashed commits:
[b092a94e5] add some rpc dependencies
2026-05-18 22:17:30 +08:00
Wagner Bruna
90326f8585
sd: sync to master-612-d7ecbe1 (#2213)
2026-05-18 21:19:12 +08:00
Concedo
1e828ccabf
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/common.cpp
# ggml/CMakeLists.txt
# scripts/sync-ggml.last
# scripts/sync_vendor.py
# src/llama-context.cpp
# tests/CMakeLists.txt
# tests/test-backend-ops.cpp
# tools/cli/README.md
# tools/completion/README.md
# tools/server/README.md
2026-05-17 11:26:18 +08:00
Concedo
79666e5764
revert sdcpp build steps to use makefile and cmake without external txt files
2026-05-16 00:53:56 +08:00
Concedo
f8391d527a
fix broken makefile
2026-05-15 23:02:38 +08:00
Wagner Bruna
bfe9548fd5
sd: sync to master-596-90e87bc (#2204)
...
* sd: reuse source lists between make and cmake
* sd: sync to master-596-90e87bc
* Update source file path for sdtype_adapter.cpp
---------
Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2026-05-14 23:14:33 +08:00
Concedo
4cfa1ad1c4
rpc server test build
2026-05-12 23:32:42 +08:00
Wagner Bruna
243b03586b
sd: build each source file separately (#2188)
...
* sd: build source files separately
* sd: decouple stable-diffusion.cpp and sdtype_adapter.cpp
* sd: remove include util.h from sdtype_adapter.cpp
* sd: update source file lists and review dependencies
2026-05-07 22:50:10 +08:00
henk717
bcf9c81e0d
Linux CUDA13 Action (#2186)
...
* Linux CU13 CI
* Bump max CUDA arch
* CUDA13 Linux
* Upload the correct build to rolling (CUDA13)
* Downgrade cuda to get better compatibility
Runpod can't handle 13.1, and if they can't handle it neither can the people with a secondary GPU of an older generation.
* Add support for compute capability 89 in NVCCFLAGS
2026-05-06 18:06:39 +08:00
Concedo
950676fdb7
split utils.cpp into 2 files to support sd.cpp
2026-05-04 15:04:12 +08:00
Wagner Bruna
276c651a12
sd: sync to master-593-3d6064b (#2175)
...
* sd: sync to master-593-3d6064b
* sd: use the same sdtype_adapter object for all builds
Since master-592-b8079e2, no sd.cpp source depends on the ggml
backend build anymore.
* sd: fix main_gpu selection
* sd: report backend devices to the Python layer
2026-05-04 14:05:34 +08:00
Wagner Bruna
e2bdd6d7aa
sd: sync to master-591-331cfa5 (#2155)
...
* sd: sync to master-585-44cca3d
* sd: sync to master-587-b8bdffc
* sd: sync to master-591-331cfa5
2026-05-01 16:33:28 +08:00
Wagner Bruna
bad9b61064
sd: sync to master-582-7023fc4 (#2150)
...
* sd: remove sampler alias handling from the C++ layer
It's already handled at the Python layer.
* sd: sync to master-580-7d33d4b
* sd: sync to master-582-7023fc4
2026-04-21 23:01:33 +08:00
Concedo
9a38091207
support q5_1 kv
2026-04-17 17:06:15 +08:00
Concedo
a165a73120
Merge commit 'd6f3030047' into concedo_experimental
...
# Conflicts:
# examples/model-conversion/scripts/causal/run-casual-gen-embeddings-org.py
# examples/model-conversion/scripts/utils/semantic_check.py
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/amx/amx.cpp
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-hexagon/ggml-hexagon.cpp
# ggml/src/ggml-hip/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-openvino/ggml-openvino.cpp
# ggml/src/ggml-rpc/ggml-rpc.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-virtgpu/ggml-backend-buffer.cpp
# ggml/src/ggml-virtgpu/ggml-backend.cpp
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# ggml/src/ggml-zdnn/ggml-zdnn.cpp
# ggml/src/ggml-zendnn/ggml-zendnn.cpp
# pyproject.toml
# requirements/requirements-convert_legacy_llama.txt
# requirements/requirements-tool_bench.txt
# src/llama-model.cpp
# src/llama.cpp
# tests/test-llama-archs.cpp
# tests/test-tokenizer-0.py
# tests/test-tokenizer-random.py
# tools/llama-bench/llama-bench.cpp
# tools/perplexity/perplexity.cpp
2026-04-11 11:10:55 +08:00
Wagner Bruna
f371bb14d4
sd: sync to master-560-e8323ca (#2082)
...
* sd: sync to master-540-f16a110
* tae post-merge fixes
* build fixes
* restore image mask for non-inpainting models
* sd: sync to master-551-99c1de3
* avoid nlohmann/json.hpp include diffs
* Euler A now works on Flux
* sd: sync to master-555-7397dda
avi_writer.h got removed upstream, but I've simply kept the local
copy for now.
* sd: sync to master-558-8afbeb6
* sd: sync to master-560-e8323ca
2026-04-09 14:44:59 +08:00
Concedo
8a6c41dc5c
Merge commit '841bc203e2' into concedo_experimental
...
# Conflicts:
# .github/workflows/ai-issues.yml
# embd_res/templates/HuggingFaceTB-SmolLM3-3B.jinja
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/aclnn_ops.h
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-hip/CMakeLists.txt
# ggml/src/ggml-musa/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/cvt.cl
# ggml/src/ggml-openvino/ggml-openvino.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# tests/test-chat-auto-parser.cpp
# tests/test-jinja.cpp
# tools/cli/README.md
# tools/completion/README.md
# tools/server/README.md
2026-03-25 22:49:53 +08:00
Gustavo Rocha Dias
8e045b33a1
fix - w64devkit vulkan build (#2048)
2026-03-20 16:37:22 +08:00
Concedo
67c9798d0b
Merge commit '3ca19b0e9f' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# common/CMakeLists.txt
# common/chat-peg-parser.cpp
# docs/backend/SYCL.md
# docs/ops.md
# docs/ops/SYCL.csv
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/convert.hpp
# ggml/src/ggml-sycl/element_wise.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/norm.cpp
# ggml/src/ggml-sycl/rope.cpp
# ggml/src/ggml-sycl/rope.hpp
# ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_decls.tmpl
# ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_reg_tile.wgsl
# ggml/src/ggml-webgpu/wgsl-shaders/mul_mat_vec.wgsl
# scripts/compare-llama-bench.py
# scripts/sync_vendor.py
# tests/CMakeLists.txt
# tools/cli/cli.cpp
2026-03-15 11:11:31 +08:00
JustCommitRandomness
9ddd74111f
OpenBSD changes for vulkan backend (#2026)
...
* OpenBSD also needs alloca.h
* Changes to compile vulkan backend with OpenBSD
* Update README.md
tweak details for OpenBSD vulkan backend
* Update README.md
2026-03-08 20:41:36 +08:00
Concedo
adebf63877
ace converter
2026-02-26 19:53:02 +08:00
Concedo
0fd7d2c0e5
ace step diffusion loading
2026-02-24 15:24:15 +08:00
Concedo
13db5aee9e
stub files for loading ace step
2026-02-22 23:15:08 +08:00
Concedo
5cd6e50eab
initial files for ace step
2026-02-22 13:22:24 +08:00
Concedo
72219fdbf5
basic qwen3 tts working
2026-02-21 12:03:53 +08:00
Concedo
1af7095cb5
add qwen3 tts repo files
2026-02-21 10:54:55 +08:00
Wagner Bruna
ae5183be10
sd: sync to master-504-636d3cb (#1969)
...
* sd: sync to master-504-636d3cb
* sd: fix and simplify limit calculation
- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side was always limited by 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments
The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
2026-02-14 08:12:08 +08:00
Concedo
bff3fd3e34
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/common.cpp
# docs/backend/snapdragon/README.md
# ggml/src/ggml-hexagon/htp/htp-ops.h
# ggml/src/ggml-hexagon/htp/matmul-ops.c
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# scripts/pr2wt.sh
# tests/test-backend-ops.cpp
# tools/server/README.md
2026-02-13 14:00:45 +08:00
Concedo
423a4bd3c0
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# src/CMakeLists.txt
# tests/test-backend-ops.cpp
2026-02-06 14:43:02 +08:00
Concedo
ddce19db72
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/nix/package-gguf-py.nix
# .devops/nix/scope.nix
# common/CMakeLists.txt
# docs/backend/SYCL.md
# examples/lookahead/lookahead.cpp
# examples/lookup/lookup.cpp
# examples/sycl/run-llama2.sh
# examples/sycl/win-run-llama2.bat
# examples/sycl/win-test.bat
# ggml/src/ggml-hexagon/CMakeLists.txt
# ggml/src/ggml-hexagon/htp/flash-attn-ops.c
# ggml/src/ggml-hexagon/htp/hvx-dump.h
# ggml/src/ggml-hexagon/htp/hvx-reduce.h
# ggml/src/ggml-hexagon/htp/matmul-ops.c
# ggml/src/ggml-hexagon/htp/softmax-ops.c
# ggml/src/ggml-hexagon/htp/unary-ops.c
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/cvt.cl
# scripts/sync-ggml.last
2026-02-01 22:35:25 +08:00
Concedo
7e755014b2
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/winget.yml
# CODEOWNERS
# common/CMakeLists.txt
# common/arg.cpp
# docs/ops/SYCL.csv
# examples/lookup/lookup-create.cpp
# examples/lookup/lookup-stats.cpp
# examples/lookup/lookup.cpp
# examples/speculative-simple/speculative-simple.cpp
# examples/speculative/speculative.cpp
# ggml/src/ggml-hip/CMakeLists.txt
# ggml/src/ggml-sycl/dpct/helper.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/norm.cpp
# ggml/src/ggml-zendnn/ggml-zendnn.cpp
# tests/test-chat-template.cpp
2026-01-29 23:05:05 +08:00
Concedo
e8e7c357c9
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build-cache.yml
# .github/workflows/build-cmake-pkg.yml
# .github/workflows/build-linux-cross.yml
# .github/workflows/build.yml
# .github/workflows/check-vendor.yml
# .github/workflows/close-issue.yml
# .github/workflows/copilot-setup-steps.yml
# .github/workflows/docker.yml
# .github/workflows/editorconfig.yml
# .github/workflows/gguf-publish.yml
# .github/workflows/labeler.yml
# .github/workflows/pre-tokenizer-hashes.yml
# .github/workflows/python-check-requirements.yml
# .github/workflows/python-lint.yml
# .github/workflows/python-type-check.yml
# .github/workflows/release.yml
# .github/workflows/server-webui.yml
# .github/workflows/server.yml
# .github/workflows/update-ops-docs.yml
# .github/workflows/winget.yml
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-zdnn/ggml-zdnn.cpp
# requirements/requirements-tool_bench.txt
# src/CMakeLists.txt
# src/llama-quant.cpp
# tests/test-backend-ops.cpp
# tests/test-chat.cpp
# tools/cli/cli.cpp
# tools/server/README.md
2026-01-23 14:27:04 +08:00
Concedo
5c6cc02985
remove clblast, part 2
2026-01-23 14:09:46 +08:00
Concedo
7f485e5287
remove CLBlast, part 1
2026-01-23 13:50:12 +08:00
Concedo
8855a7f52b
Merge commit 'c945aaaef2' into concedo_experimental
...
# Conflicts:
# .devops/cann.Dockerfile
# .github/workflows/build.yml
# .github/workflows/release.yml
# README.md
# common/CMakeLists.txt
# common/chat.cpp
# docs/function-calling.md
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/aclnn_ops.h
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# models/templates/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja
# scripts/sync_vendor.py
# tests/CMakeLists.txt
# tests/peg-parser/tests.h
# tests/test-chat-peg-parser.cpp
# tests/test-chat-template.cpp
# tests/test-chat.cpp
# tests/testing.h
# tools/llama-bench/llama-bench.cpp
2026-01-17 10:24:03 +08:00
Concedo
d15bd212c5
cleanup
2026-01-17 00:57:33 +08:00
Concedo
cde4791e36
fix tools building
2025-12-19 12:08:29 +08:00
Concedo
a01b49098c
fix tool builds
2025-12-18 23:26:31 +08:00
Concedo
1daeed5d4d
Merge commit '9963b81f63' into concedo_experimental
...
# Conflicts:
# .github/workflows/server.yml
# SECURITY.md
# docs/backend/SYCL.md
# examples/model-conversion/README.md
# examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
# ggml/src/ggml-hexagon/ggml-hexagon.cpp
# ggml/src/ggml-hexagon/htp/matmul-ops.c
# tests/CMakeLists.txt
# tests/test-chat.cpp
# tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
cacfa37611
wip
2025-12-17 16:04:45 +08:00
Wagner Bruna
78bbe89956
sd: sync to master-417-43a70e8 (#1889)
...
* sd: sync to master-417-43a70e8
* fix sdmain build
* switch to upstream apply_loras()
* refactor u8 path conversions and add it to the gguf reader
2025-12-16 16:16:48 +08:00
Concedo
010995c967
Merge commit '4df6e859e9' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# README.md
# ci/run.sh
# examples/gen-docs/gen-docs.cpp
# scripts/snapdragon/adb/run-cli.sh
# tests/test-lora-conversion-inference.sh
# tools/CMakeLists.txt
# tools/completion/CMakeLists.txt
# tools/completion/README.md
# tools/server/CMakeLists.txt
2025-12-12 17:23:25 +08:00
Concedo
cd73613136
moved volta onto tile kernels, so building for cc7.0 can be avoided
...
this shouldn't do anything (+2 squashed commit)
Squashed commit:
[1cdcb302a] another attempt to tip the scales, part 2
[8f647b709] another attempt to tip the scales (volta)
2025-12-08 19:51:54 +08:00
Concedo
d27949f22a
Revert "try remove volta as a dedicated target b (+1 squashed commits)"
...
This reverts commit ddba580f00.
2025-12-06 21:31:44 +08:00
Concedo
ddba580f00
try remove volta as a dedicated target b (+1 squashed commits)
...
Squashed commits:
[2df689a03] try remove volta as a dedicated target
2025-12-06 21:31:06 +08:00
Concedo
e570478275
limit cuda arches + scale tweaks
2025-11-28 13:05:11 +08:00
Wagner Bruna
3318b73c94
sd: sync to master-355-694f0d9
2025-11-23 19:28:34 -03:00
LostRuins Concedo
5751c30790
add vulkan for whisper
2025-11-13 15:37:58 +08:00
LostRuins Concedo
d6a2ad8455
still not really working right
2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d
rename a missed BLAS -> batch
2025-11-06 16:11:26 +08:00