Update doc

This commit is contained in:
parent 4dc5518e4d
commit 36fbeee341

11 changed files with 101 additions and 59 deletions
@@ -45,7 +45,7 @@ from https://github.com/kvcache-ai/ktransformers/issues/129#issue-2842799552

### Q: If I don't have enough VRAM, but I have multiple GPUs, how can I utilize them?

-Use the `--optimize_rule_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml` flag to load the rule yaml file optimized for two GPUs. You may also use it as an example to write your own rule yaml file for 4 or 8 GPUs.
+Use the `--optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml` flag to load the rule yaml file optimized for two GPUs. You may also use it as an example to write your own rule yaml file for 4 or 8 GPUs.

> Note: The ktransformers' multi-GPU strategy is pipeline parallelism, which does not speed up the model's inference; it only distributes the model's weights across the GPUs.
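
For reference, a minimal launch sketch using the renamed flag might look like the following, assuming the `local_chat.py` entry point used elsewhere in the ktransformers docs; the model and GGUF paths are placeholders, not values from this commit:

```bash
# Minimal sketch (not from this commit): load the two-GPU rule file at launch.
# --model_path and --gguf_path values are placeholders; adjust to your setup.
python ktransformers/local_chat.py \
  --model_path deepseek-ai/DeepSeek-V3 \
  --gguf_path /path/to/DeepSeek-V3-GGUF \
  --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
```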