koboldcpp/otherarch
Wagner Bruna 592d12d0a3
sd: support for CLIP and VAE on different devices (#2184)
* sd: generalize internal interfaces to place generation on CPU

* sd: backend support for multi-device selection

* sd: frontend support for multi-device selection

* add deprecated flags to avoid breaking old cli args

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2026-05-19 21:51:23 +08:00
acestep try fix recent segfault on SIGINT https://github.com/LostRuins/koboldcpp/issues/2215 2026-05-18 22:37:14 +08:00
qwen3tts use original precision for q3tts 2026-04-30 17:28:11 +08:00
sdcpp sd: support for CLIP and VAE on different devices (#2184) 2026-05-19 21:51:23 +08:00
tools
ttscpp hack to allow kokoro to remain functional even with much higher GGML_SCHED_MAX_SPLIT_INPUTS 2026-04-19 20:40:07 +08:00
whispercpp refactor: handle GGML_VK_VISIBLE_DEVICES at the Python level (#2179) 2026-05-02 23:10:29 +08:00
embeddings_adapter.cpp refactor: handle GGML_VK_VISIBLE_DEVICES at the Python level (#2179) 2026-05-02 23:10:29 +08:00
ggml_v1.c
ggml_v1.h
ggml_v2-cuda-legacy.cu
ggml_v2-cuda-legacy.h
ggml_v2-cuda.cu
ggml_v2-cuda.h
ggml_v2.c
ggml_v2.h
ggml_v3-cuda.cu
ggml_v3-cuda.h
ggml_v3.c
ggml_v3.h
gpt2_v1.cpp
gpt2_v2.cpp
gpt2_v3.cpp
gptj_v1.cpp
gptj_v2.cpp
gptj_v3.cpp
llama-util.h
llama_v2-util.h
llama_v2.cpp
llama_v2.h
llama_v3.cpp
llama_v3.h
llmutils.cpp split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
llmutils.h split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
mpt_v3.cpp
neox_v2.cpp
neox_v3.cpp
otherarch.h added preliminary support for reasoning budget 2026-04-18 11:56:33 +08:00
rwkv_v2.cpp
rwkv_v2.h
rwkv_v3.cpp
rwkv_v3.h
rwkv_vocab.cpp
tts_adapter.cpp split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
utils.cpp split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00
utils.h split utils.cpp into 2 files to support sd.cpp 2026-05-04 15:04:12 +08:00