mirror of https://github.com/kvcache-ai/ktransformers.git (synced 2026-04-28 03:39:48 +00:00)
Revise GPU/CPU memory footprint information
Updated memory footprint details for DeepSeek models.
parent 501b114863
commit 6721f8765d
1 changed file with 3 additions and 3 deletions
@@ -284,11 +284,11 @@ chunk_size: 8192

 ### GPU/CPU Memory Footprint

-- DeepSeek-V3 (671B; 61 layers with 58 MoE): ~**70 GB** total GPU memory (multi-GPU), ~**1.2–1.3 TB** host memory.
-- DeepSeek-V2-Lite (14B; 27 layers with 26 MoE): ~**5.5 GB** GPU memory, ~**150 GB** host memory.
+- DeepSeek-V3 (671B; 61 layers with 58 MoE): ~**70 GB** total GPU VRAM (multi-GPU), ~**1.2–1.3 TB** CPU RAM.
+- DeepSeek-V2-Lite (14B; 27 layers with 26 MoE): ~**5.5 GB** GPU VRAM, ~**30 GB** CPU RAM.

 ## Conclusion

 By integrating **KTransformers LoRA fine-tuning** into **LLaMA-Factory**, we provide a practical guide for efficient training and deployment of MoE LLMs. KT brings cutting-edge optimizations (DeepSeek/Qwen/Kimi support with AMX-accelerated kernels), and LoRA enables customization under very low GPU memory. LLaMA-Factory offers a friendly, unified interface.

-This integration (akin to Unsloth-style speedups) means even models with tens to hundreds of billions of parameters can be fine-tuned and deployed with low latency on commodity hardware. You get **memory savings, speed-ups, and usability** together. We encourage you to try LLaMA-Factory + KT for your next MoE project and follow this guide. Feedback is welcome!
+This integration (akin to Unsloth-style speedups) means even models with tens to hundreds of billions of parameters can be fine-tuned and deployed with low latency on commodity hardware. You get **memory savings, speed-ups, and usability** together. We encourage you to try LLaMA-Factory + KT for your next MoE project and follow this guide. Feedback is welcome!
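A rough sanity check on the revised CPU RAM figures, offered as a back-of-the-envelope sketch and not as the authors' accounting: if the weights are held at 2 bytes per parameter (a BF16 assumption that the diff itself does not state), the raw weight size can be estimated from the parameter counts quoted in the changed lines. The helper name `weight_size_gb` is hypothetical.

```python
def weight_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Raw weight size in decimal GB.

    Assumes every parameter is stored at a fixed width
    (2 bytes = BF16 by default; this is an assumption, not from the diff).
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the changed lines of the diff.
models = {
    "DeepSeek-V3": 671e9,
    "DeepSeek-V2-Lite": 14e9,
}

for name, params in models.items():
    print(f"{name}: ~{weight_size_gb(params):.0f} GB of raw weights")
```

Under that assumption the estimate is about 1342 GB for DeepSeek-V3 and about 28 GB for DeepSeek-V2-Lite, which is in the same range as the ~1.2–1.3 TB and ~30 GB figures in the added lines and well below the ~150 GB that the commit removes. Actual usage also depends on buffers, optimizer state, and which tensors stay on the GPU, none of which this sketch models.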