mirror of https://github.com/kvcache-ai/ktransformers.git
[docs]: refine dpo tutorial (#1739)
parent 0bce173e3b
commit dee1e211d5
1 changed file with 4 additions and 4 deletions
@@ -61,7 +61,7 @@ pip install custom_flashinfer/
 
 ## Prepare Models
 
-We use `DeepSeek-V2-Lite-Chat` as an example here. You can replace it with other models such as Kimi K2.
+We use `deepseek-ai/DeepSeek-V2-Lite` as an example here. You can replace it with other models such as Kimi K2.
 
 ## How to start
 
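Context for the step above, outside the diff: swapping models is a one-line edit to the training YAML shown in the next hunk. A minimal sketch, where the commented alternatives (a hypothetical local checkout and a Kimi K2 Hub id) are illustrative assumptions, not lines from this commit:

```YAML
# Sketch only: model_name_or_path usually accepts a Hugging Face Hub id
# or a local directory; the commented lines are hypothetical alternatives.
model_name_or_path: deepseek-ai/DeepSeek-V2-Lite
# model_name_or_path: /data/models/DeepSeek-V2-Lite   # local checkout (hypothetical path)
# model_name_or_path: moonshotai/Kimi-K2-Instruct     # Kimi K2 swap (assumed Hub id)
```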
@@ -80,7 +80,7 @@ For example, we provide the YAML file as follows:
 
 ```YAML
 ### model
-model_name_or_path: DeepSeek-V2-Lite-Chat
+model_name_or_path: deepseek-ai/DeepSeek-V2-Lite
 trust_remote_code: true
 
 ### method
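A note on the context lines above, not part of the commit: DeepSeek-V2 checkpoints ship custom modeling code on the Hugging Face Hub, which is why `trust_remote_code: true` sits next to `model_name_or_path`.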
@@ -114,7 +114,7 @@ report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
 per_device_train_batch_size: 1
 gradient_accumulation_steps: 8
 learning_rate: 5.0e-6
-num_train_epochs: 0.1
+num_train_epochs: 3
 lr_scheduler_type: cosine
 warmup_ratio: 0.1
 bf16: true
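Reading these hyperparameters together, beyond what the diff itself states: gradients are accumulated over 8 steps, so the effective batch size is per_device_train_batch_size × gradient_accumulation_steps = 1 × 8 = 8 per device. The edit also replaces what appears to be a smoke-test value (0.1, i.e. a tenth of an epoch) with a full 3-epoch run.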
@@ -130,7 +130,7 @@ chunk_size: 8192
 For more details about --kt_optimize_rule, please refer to https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/KTransformers-Fine-Tuning_User-Guide.md
 
 (2) examples/inference/deepseek2_lora_dpo_kt.yaml
 Then, you can use the LoRA adapter saved in `saves/Kllama_deepseekV2_DPO` for inference, just as in SFT training. For example,
 
 ```YAML
-model_name_or_path: DeepSeek-V2-Lite-Chat
+model_name_or_path: deepseek-ai/DeepSeek-V2-Lite
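For orientation, a minimal sketch of what the rest of an inference config along these lines might contain. Only `model_name_or_path` comes from this diff; the other keys (notably `adapter_name_or_path`, in LLaMA-Factory style) are assumptions, not lines from examples/inference/deepseek2_lora_dpo_kt.yaml:

```YAML
# Sketch only; every key except model_name_or_path is an assumption.
model_name_or_path: deepseek-ai/DeepSeek-V2-Lite
adapter_name_or_path: saves/Kllama_deepseekV2_DPO   # LoRA adapter from the DPO run above
trust_remote_code: true
```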