Concedo
724763fdec
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/vulkan.Dockerfile
# .github/workflows/build.yml
# .github/workflows/server.yml
# common/common.cpp
# examples/batched/README.md
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cpu/arch-fallback.h
# ggml/src/ggml-opencl/ggml-opencl.cpp
# scripts/sync-ggml.last
# src/CMakeLists.txt
# tests/test-backend-ops.cpp
# tools/server/CMakeLists.txt
2025-11-25 16:38:07 +08:00
william pan
4902eebe33
models : Added support for RND1 Diffusion Language Model (#17433)
...
* Converted RND1 model to GGUF weights
* RND1 llama.cpp support v1
* RND1 llama.cpp support v2 non causal bug
* RND1 llama.cpp support v3 documentation
* RND1 llama.cpp support v4 clean code
* linting issues
* RND1 pr fixes v1
* RND1 pr fixes v2
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Diffusion documentation edits
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-24 14:16:56 +08:00
Concedo
3e72aaff5b
Merge commit '8f8f2274ee' into concedo_experimental
...
# Conflicts:
# .devops/rocm.Dockerfile
# .github/workflows/build.yml
# .github/workflows/release.yml
# CMakeLists.txt
# examples/simple/simple.cpp
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-opencl/kernels/tsembd.cl
# ggml/src/ggml-sycl/binbcast.cpp
# ggml/src/ggml-sycl/binbcast.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/tsembd.cpp
# ggml/src/ggml-zdnn/ggml-zdnn.cpp
# src/llama-model.cpp
# tools/batched-bench/CMakeLists.txt
# tools/cvector-generator/CMakeLists.txt
# tools/export-lora/CMakeLists.txt
# tools/gguf-split/CMakeLists.txt
# tools/imatrix/CMakeLists.txt
# tools/llama-bench/CMakeLists.txt
# tools/llama-bench/llama-bench.cpp
# tools/main/CMakeLists.txt
# tools/main/README.md
# tools/mtmd/CMakeLists.txt
# tools/perplexity/CMakeLists.txt
# tools/perplexity/perplexity.cpp
# tools/quantize/CMakeLists.txt
# tools/rpc/rpc-server.cpp
# tools/run/CMakeLists.txt
# tools/run/run.cpp
# tools/tokenize/CMakeLists.txt
# tools/tts/CMakeLists.txt
2025-09-21 08:58:23 +08:00
Aman Gupta
6d758839ff
Add LLaDA-7b-MoE diffusion model (#16003)
2025-09-16 10:38:28 +08:00
Concedo
7e35954695
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# docs/build.md
# docs/function-calling.md
# examples/eval-callback/eval-callback.cpp
# ggml/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cpu/kleidiai/kernels.cpp
# ggml/src/ggml-cpu/kleidiai/kernels.h
# ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
# scripts/compare-llama-bench.py
# scripts/server-bench.py
# scripts/tool_bench.py
# tests/test-chat.cpp
# tools/batched-bench/batched-bench.cpp
# tools/llama-bench/llama-bench.cpp
# tools/server/README.md
2025-08-31 23:33:36 +08:00
Johannes Gäßler
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
...
* llama: use max. GPU layers by default, auto -fa
* ggml-backend: abort instead of segfault
2025-08-30 16:32:10 +02:00
Concedo
52606e9b1d
tts cpp model is now loadable in kcpp
2025-08-17 15:47:22 +08:00
Aman Gupta
8a4a856277
Add LLaDA 8b Diffusion model (#14771)
...
* Add support for Llada-8b: diffusion model
* Add README
* Fix README and convert_hf_to_gguf
* convert_hf_to_gguf.py: address review comments
* Make everything in a single example
* Remove model-specific sampling
* Remove unused argmax
* Remove braced initializers, improve README.md a bit
* Add diffusion specific gguf params in set_vocab, remove setting rope_theta and rms_norm_eps
* Remove adding the mask token
* Move add_add_bos_token to set_vocab
* use add_bool in gguf_writer.py
2025-07-31 19:49:09 +08:00
Aman Gupta
ab14019821
Support diffusion models: Add Dream 7B (#14644)
...
* Support diffusion models: Add Dream 7B
* Move diffusion to examples
* Move stuff to examples. Add patch to not use kv-cache
* Address review comments
* Make sampling fast
* llama: remove diffusion functions
* Add basic timings + cleanup
* More cleanup
* Review comments: better formatting, use LOG instead of std::cerr, re-use batch, use ubatch instead of max_length
* fixup!
* Review: move everything to diffusion-cli for now
2025-07-16 20:03:51 +08:00