mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-11 01:24:36 +00:00


export-lora

Apply LoRA adapters to a base model and export the resulting merged model.

usage: llama-export-lora [options]

options:
  -m,    --model                  model path from which to load base model (default '')
         --lora FNAME             path to LoRA adapter  (can be repeated to use multiple adapters)
         --lora-scaled FNAME S    path to LoRA adapter with user defined scaling S  (can be repeated to use multiple adapters)
  -t,    --threads N              number of threads to use during computation (default: 4)
  -o,    --output FNAME           output file (default: 'ggml-lora-merged-f16.gguf')

For example:

./bin/llama-export-lora \
    -m open-llama-3b-v2-q8_0.gguf \
    -o open-llama-3b-v2-q8_0-english2tokipona-chat.gguf \
    --lora lora-open-llama-3b-v2-q8_0-english2tokipona-chat-LATEST.bin

Multiple LoRA adapters can be applied in a single run by repeating the --lora FNAME or --lora-scaled FNAME S command-line parameters.
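
As a minimal sketch of combining several adapters, the snippet below builds a command that applies one adapter with --lora and a second one with --lora-scaled at a user-defined scale of 0.5. All file names (base-model.gguf, adapter-a.gguf, adapter-b.gguf, base-model-merged.gguf), the thread count, and the helper function export_lora_cmd are hypothetical and only serve as an illustration; only the flags come from the usage text above. The function prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical file names; substitute your own base model and adapters.
BASE_MODEL="base-model.gguf"
OUTPUT="base-model-merged.gguf"

# Illustrative helper (not part of llama.cpp): prints the merge command
# so it can be inspected before it is run.
export_lora_cmd() {
    # One adapter passed with --lora, a second one with a user-defined
    # scale of 0.5 via --lora-scaled; -t sets the number of threads.
    echo ./bin/llama-export-lora \
        -m "$BASE_MODEL" \
        -o "$OUTPUT" \
        -t 8 \
        --lora adapter-a.gguf \
        --lora-scaled adapter-b.gguf 0.5
}

export_lora_cmd
```

To perform the merge, run the printed command directly or remove the echo from the helper.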