From b5024f62a48f84b9f8fc0b208c53c3565f70ebb4 Mon Sep 17 00:00:00 2001
From: chenxl
Date: Sat, 12 Jul 2025 12:51:00 +0800
Subject: [PATCH] Update Kimi-K2 Readme

---
 doc/en/Kimi-K2.md | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/doc/en/Kimi-K2.md b/doc/en/Kimi-K2.md
index 647fed1..298cb64 100644
--- a/doc/en/Kimi-K2.md
+++ b/doc/en/Kimi-K2.md
@@ -19,20 +19,13 @@ With a dual-socket CPU and sufficient system memory, enabling NUMA optimizations
 
 ### 1. Resource Requirements
 
-The model running with 384 Experts requires approximately 2 TB of memory and 14 GB of GPU memory.
+The model running with 384 Experts requires approximately 600 GB of memory and 14 GB of GPU memory.
 
 ### 2. Prepare Models
 
-You can convert the fp8 to bf16.
-
 ```bash
-# download fp8
-huggingface-cli download --resume-download xxx
-
-# convert fp8 to bf16
-git clone https://github.com/deepseek-ai/DeepSeek-V3.git
-cd inference
-python fp8_cast_bf16.py --input-fp8-hf-path --output-bf16-hf-path
+# download gguf
+huggingface-cli download --resume-download KVCache-ai/Kimi-K2-Instruct-GGUF
 ```
 
@@ -46,7 +39,7 @@ To install KTransformers, follow the official [Installation Guide](https://kvcac
 python ktransformers/server/main.py \
   --port 10002 \
   --model_path \
-  --gguf_path \
+  --gguf_path \
   --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml \
   --max_new_tokens 1024 \
   --cache_lens 32768 \
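
The patch replaces the fp8-to-bf16 conversion step with a direct download of pre-quantized GGUF weights and leaves the server launch flags otherwise unchanged. A minimal sketch of the resulting end-to-end workflow follows; the local directories `MODEL_DIR` and `GGUF_DIR` are hypothetical placeholders (the patch elides the actual paths), and each command is prefixed with `echo` so the sketch is a side-effect-free dry run.

```shell
# Placeholder local paths; substitute your own. Only the HF repo id
# (KVCache-ai/Kimi-K2-Instruct-GGUF) comes from the patch itself.
MODEL_DIR="${MODEL_DIR:-/models/Kimi-K2-Instruct}"
GGUF_DIR="${GGUF_DIR:-/models/Kimi-K2-Instruct-GGUF}"

# Step 1: fetch the pre-quantized GGUF weights directly;
# no fp8 -> bf16 conversion is needed in the updated workflow.
echo huggingface-cli download --resume-download \
  KVCache-ai/Kimi-K2-Instruct-GGUF --local-dir "$GGUF_DIR"

# Step 2: launch the KTransformers server against the downloaded weights,
# using the flags shown in the second hunk of the patch.
echo python ktransformers/server/main.py \
  --port 10002 \
  --model_path "$MODEL_DIR" \
  --gguf_path "$GGUF_DIR" \
  --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml \
  --max_new_tokens 1024 \
  --cache_lens 32768
```

Remove the leading `echo` from each command to actually run the download and start the server.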