mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-31 21:39:42 +00:00

Concedo 724763fdec Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/vulkan.Dockerfile # .github/workflows/build.yml # .github/workflows/server.yml # common/common.cpp # examples/batched/README.md # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-cpu/arch-fallback.h # ggml/src/ggml-opencl/ggml-opencl.cpp # scripts/sync-ggml.last # src/CMakeLists.txt # tests/test-backend-ops.cpp # tools/server/CMakeLists.txt	2025-11-25 16:38:07 +08:00
diffusion-cli.cpp	Add LLaDA-7b-MoE diffusion model (#16003 )	2025-09-16 10:38:28 +08:00
README.md	models : Added support for RND1 Diffusion Language Model (#17433 )	2025-11-24 14:16:56 +08:00

README.md

Diffusion Text Generation

This directory contains implementations of Diffusion LLMs (DLLMs).

More Info:

Parameters

The diffusion CLI supports various parameters to control the generation process:

Core Diffusion Parameters

--diffusion-steps: Number of diffusion steps (default: 256)
--diffusion-algorithm: Algorithm for token selection
- 0: ORIGIN - Tokens are generated in a purely random order, as described in https://arxiv.org/abs/2107.03006.
- 1: ENTROPY_BASED - Entropy-based selection
- 2: MARGIN_BASED - Margin-based selection
- 3: RANDOM - Random selection
- 4: CONFIDENCE_BASED - Confidence-based selection (default)
- More documentation: https://github.com/DreamLM/Dream
--diffusion-visual: Enable live visualization during generation
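
To make the selection criteria above more concrete, here is a minimal Python sketch of how the confidence, margin, and entropy scores are commonly defined (for example in the Dream reference code linked above). It is an illustration only, not the code used by the CLI: the function names and the toy distributions are hypothetical. ORIGIN and RANDOM are left out because they do not rank positions by a score.

```python
import math

def confidence_score(probs):
    """Probability of the most likely token at this position."""
    return max(probs)

def margin_score(probs):
    """Gap between the two most likely tokens at this position."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2

def neg_entropy_score(probs):
    """Negative Shannon entropy; values closer to 0 mean a more peaked distribution."""
    return sum(p * math.log(p) for p in probs if p > 0.0)

def pick_positions(dists, score_fn, n):
    """Return (in ascending order) the n positions with the highest score."""
    ranked = sorted(range(len(dists)), key=lambda i: score_fn(dists[i]), reverse=True)
    return sorted(ranked[:n])

# Toy distributions over a 3-token vocabulary for three masked positions
# (hypothetical values, chosen only for illustration).
dists = [
    [0.90, 0.05, 0.05],  # peaked
    [0.50, 0.45, 0.05],  # two close candidates
    [0.34, 0.33, 0.33],  # nearly uniform
]

print(pick_positions(dists, confidence_score, 1))
print(pick_positions(dists, margin_score, 2))
print(pick_positions(dists, neg_entropy_score, 2))
```

In each case the positions the model is most certain about are unmasked first; the three criteria differ only in how certainty is measured.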

Scheduling Parameters

Choose one of the following scheduling methods:

Timestep-based scheduling:

--diffusion-eps: Epsilon value for timestep scheduling (e.g., 0.001)
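
As a rough illustration of what the epsilon value controls, the sketch below assumes a linear timestep schedule from 1.0 down to eps, as in the Dream reference sampler. Whether the CLI computes exactly this is an assumption, and the function names are hypothetical.

```python
def timestep_schedule(steps, eps):
    """Linearly spaced timesteps from 1.0 down to eps (steps + 1 values)."""
    return [1.0 + (eps - 1.0) * i / steps for i in range(steps + 1)]

def unmask_fractions(steps, eps):
    """Fraction of the still-masked tokens to reveal at each step."""
    ts = timestep_schedule(steps, eps)
    fractions = []
    for i in range(steps):
        t, s = ts[i], ts[i + 1]
        # The final step reveals everything that is still masked.
        fractions.append(1.0 - s / t if i < steps - 1 else 1.0)
    return fractions

fracs = unmask_fractions(steps=4, eps=0.001)
print([round(f, 3) for f in fracs])
```

Under this assumption, eps is the timestep at which the schedule ends. Each step reveals a growing share of the tokens that remain masked.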

Block-based scheduling:

--diffusion-block-length: Block size for block-based scheduling (e.g., 32)
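
The following sketch shows one way block-based scheduling can divide the work, modeled on the LLaDA reference sampler: the generated span is split into fixed-size blocks, the step budget is shared evenly across blocks, and each step unmasks a roughly equal number of tokens. The name gen_length, the function names, and the divisibility checks are assumptions for illustration; the CLI may handle these cases differently.

```python
def block_plan(gen_length, block_length, steps):
    """Split gen_length tokens into blocks and share steps evenly among them."""
    if gen_length % block_length != 0:
        raise ValueError("gen_length must be a multiple of block_length")
    num_blocks = gen_length // block_length
    if steps % num_blocks != 0:
        raise ValueError("steps must be a multiple of the number of blocks")
    return num_blocks, steps // num_blocks

def tokens_per_step(masked_in_block, steps_per_block):
    """Number of tokens to unmask at each step; earlier steps absorb the remainder."""
    base, remainder = divmod(masked_in_block, steps_per_block)
    return [base + (1 if i < remainder else 0) for i in range(steps_per_block)]

num_blocks, steps_per_block = block_plan(gen_length=128, block_length=32, steps=64)
print(num_blocks, steps_per_block)
print(tokens_per_step(32, steps_per_block))
```

With a block length of 32, a hypothetical 128-token generation is processed as four blocks, each fully unmasked before the next one begins.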

Sampling Parameters

--temp: Temperature for sampling (0.0 = greedy/deterministic, higher = more random)
--top-k: Top-k filtering for sampling
--top-p: Top-p (nucleus) filtering for sampling
--seed: Random seed for reproducibility
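
The sampling parameters above interact roughly as in the following self-contained sketch. It illustrates the general techniques (temperature scaling, top-k, and top-p filtering) and is not the sampler used by the CLI; the function name and the order in which the filters are applied are assumptions.

```python
import math
import random

def sample_token(logits, temp=1.0, top_k=0, top_p=1.0, rng=None):
    """Pick a token index from raw logits using temperature, top-k and top-p."""
    rng = rng or random.Random()
    if temp <= 0.0:
        # Temperature 0.0 means greedy decoding: always take the best token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature-scaled softmax (shifted by the max for numerical stability).
    scaled = [x / temp for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)

    if top_k > 0:
        # Keep only the k most likely tokens.
        ranked = ranked[:top_k]

    if top_p < 1.0:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for p, i in ranked:
            kept.append((p, i))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept

    ids = [i for _, i in ranked]
    probs = [p for p, _ in ranked]
    # random.Random.choices renormalizes the remaining weights.
    return rng.choices(ids, weights=probs, k=1)[0]

# A fixed seed makes the draw reproducible.
print(sample_token([1.0, 3.0, 2.0], temp=0.8, top_k=2, rng=random.Random(42)))
```

Passing the same seed yields the same draw, which corresponds to the role --seed plays for reproducibility.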

Model Parameters

-m: Path to the GGUF model file
-p: Input prompt text
-ub: Maximum sequence length (ubatch size)
-c: Context size
-b: Batch size

Examples

Dream architecture:

llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual

LLaDA architecture:

llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual

RND1 architecture:

llama-diffusion-cli -m RND1-Base-0910.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-algorithm 1 --diffusion-steps 256 --diffusion-visual --temp 0.5 --diffusion-eps 0.001
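
When these commands are generated from a script, a small helper can assemble the argument list and enforce the "choose one scheduling method" guidance from the Scheduling Parameters section. The helper below is a hypothetical convenience wrapper, not part of the project: it uses only flags documented above, and treating the two scheduling options as strictly exclusive is an assumption taken from that guidance.

```python
import shlex

def build_diffusion_command(model, prompt, steps=256, algorithm=None,
                            eps=None, block_length=None, ubatch=512, visual=False):
    """Build an argv list for llama-diffusion-cli from documented flags only."""
    if (eps is None) == (block_length is None):
        raise ValueError("set exactly one of eps or block_length")
    argv = ["llama-diffusion-cli", "-m", model, "-p", prompt,
            "-ub", str(ubatch), "--diffusion-steps", str(steps)]
    if algorithm is not None:
        argv += ["--diffusion-algorithm", str(algorithm)]
    if eps is not None:
        argv += ["--diffusion-eps", str(eps)]
    else:
        argv += ["--diffusion-block-length", str(block_length)]
    if visual:
        argv.append("--diffusion-visual")
    return argv

argv = build_diffusion_command("llada-8b.gguf",
                               "write code to train MNIST in pytorch",
                               block_length=32, visual=True)
# shlex.join quotes the prompt so the line can be pasted into a shell.
print(shlex.join(argv))
```

The returned list can also be passed directly to subprocess.run without going through a shell.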