mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-11 17:44:38 +00:00

History

Concedo 2cad736260 Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/nix/package.nix # .github/labeler.yml # .gitignore # CMakeLists.txt # Makefile # Package.swift # README.md # ci/run.sh # docs/build.md # examples/CMakeLists.txt # flake.lock # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # grammars/README.md # requirements/requirements-convert_hf_to_gguf.txt # requirements/requirements-convert_hf_to_gguf_update.txt # scripts/check-requirements.sh # scripts/compare-llama-bench.py # scripts/gen-unicode-data.py # scripts/sync-ggml-am.sh # scripts/sync-ggml.last # scripts/sync-ggml.sh # tests/test-backend-ops.cpp # tests/test-chat-template.cpp # tests/test-tokenizer-random.py		2024-07-11 16:36:16 +08:00
..
convert_train_checkpoint_to_gguf.py	py : type-check all Python scripts with Pyright (#8341 )	2024-07-07 15:04:39 -04:00
README.md	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809 )	2024-06-13 00:41:52 +01:00
train-text-from-scratch.cpp	ggml : refactor rope norm/neox (#7634 )	2024-06-05 11:29:20 +03:00

README.md

train-text-from-scratch

Basic usage instructions:

# get training data
wget https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/shakespeare.txt

# train
./bin/llama-train-text-from-scratch \
        --vocab-model ../models/ggml-vocab-llama.gguf \
        --ctx 64 --embd 256 --head 8 --layer 16 \
        --checkpoint-in  chk-shakespeare-256x16-LATEST.gguf \
        --checkpoint-out chk-shakespeare-256x16-ITERATION.gguf \
        --model-out ggml-shakespeare-256x16-f32-ITERATION.gguf \
        --train-data "shakespeare.txt" \
        -t 6 -b 16 --seed 1 --adam-iter 256 \
        --no-checkpointing

# predict
./bin/llama-cli -m ggml-shakespeare-256x16-f32.gguf

Output files will be saved every N iterations (config with --save-every N). The pattern "ITERATION" in the output filenames will be replaced with the iteration number and "LATEST" for the latest output.

To train GGUF models just pass them to --checkpoint-in FN.