Update prefix_cache.md

2025-09-07 04:59:55 +00:00 · 2025-06-30 15:04:37 +08:00 · 5a73aaf652
commit 5a73aaf652
parent a9a72e52c3
1 changed files with 6 additions and 2 deletions
--- a/doc/en/prefix_cache.md
+++ b/doc/en/prefix_cache.md
@@ -1,6 +1,6 @@
 ## Enabling Prefix Cache Mode in KTransformers

-To enable **Prefix Cache Mode** in KTransformers, you need to modify the configuration file and recompile the project.
+Balance serve now supports prefix cache reuse. To enable **Prefix Cache Mode** in KTransformers, modify the configuration file and recompile the project.

 ### Step 1: Modify the Configuration File

@@ -31,4 +31,8 @@ Then recompile the project:
 USE_BALANCE_SERVE=1  bash ./install.sh
 # For machines with two CPUs and 1 TB of RAM (dual NUMA):
 USE_BALANCE_SERVE=1 USE_NUMA=1 bash ./install.sh
-```
+```
+
+## Note
+Balance serve uses a three-tier (GPU-CPU-Disk) scheme to store and reuse KVCache. Balance serve does not yet support deleting KVCache itself. If the cache grows too large, you can free space by manually removing the KVCache files from disk.
+
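
The manual cleanup described in the note could look roughly like the sketch below. It is not part of KTransformers: the helper name `clear_kvcache_dir` and the placeholder path are hypothetical, and the real directory is whatever disk location your configuration file assigns to the KVCache. The sketch also assumes it is safest to stop the server before removing files, which the note does not state.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a KTransformers command): empties a KVCache
# directory while keeping the directory itself in place.
clear_kvcache_dir() {
    local dir="$1"
    # Refuse an empty argument or a non-directory to avoid deleting the wrong files.
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "clear_kvcache_dir: not a directory: '$dir'" >&2
        return 1
    fi
    # Report the space currently used, then remove everything inside the directory.
    du -sh "$dir"
    find "$dir" -mindepth 1 -delete
}

# Example (placeholder path; substitute the KVCache path from your configuration):
# clear_kvcache_dir /path/to/kvcache
```

Keeping the directory and deleting only its contents means the configured path stays valid the next time the server starts.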