Mirror of https://github.com/kvcache-ai/ktransformers.git (synced 2025-09-09 05:54:06 +00:00)
Update prefix_cache.md
Parent: a9a72e52c3
Commit: 5a73aaf652
1 changed file with 6 additions and 2 deletions
@@ -1,6 +1,6 @@
 ## Enabling Prefix Cache Mode in KTransformers
 
-To enable **Prefix Cache Mode** in KTransformers, you need to modify the configuration file and recompile the project.
+Balance serve now supports prefix cache reuse! To enable **Prefix Cache Mode** in KTransformers, you need to modify the configuration file and recompile the project.
 
 ### Step 1: Modify the Configuration File
 
@@ -32,3 +32,7 @@ USE_BALANCE_SERVE=1 bash ./install.sh
 # For machines with two CPUs and 1 TB of RAM (dual NUMA):
 USE_BALANCE_SERVE=1 USE_NUMA=1 bash ./install.sh
 ```
+
+## Note
+Balance serve utilizes a 3-layer (GPU-CPU-Disk) scheme to store and reuse KVCache. Deleting KVCache is not supported yet. If too much KVCache accumulates, you can remove it by deleting the kvcache files.
+
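The note added in this commit describes a 3-layer (GPU-CPU-Disk) store for reusing KVCache. The sketch below is only a toy illustration of that idea, not KTransformers code: the class `TieredPrefixCache`, its methods, and the use of plain dictionaries for each tier are all hypothetical. It shows how a request's token prefix might be matched against the fastest tier first, falling back to slower tiers, with a hit copied into the fast tier for later reuse.

```python
from typing import Dict, Optional, Sequence, Tuple

Prefix = Tuple[int, ...]  # token IDs of a prompt prefix


class TieredPrefixCache:
    """Toy three-tier prefix cache. Plain dicts stand in for GPU, CPU, and disk storage."""

    TIER_ORDER = ("gpu", "cpu", "disk")  # fastest to slowest

    def __init__(self) -> None:
        self.tiers: Dict[str, Dict[Prefix, bytes]] = {name: {} for name in self.TIER_ORDER}

    def put(self, prefix: Sequence[int], kv: bytes, tier: str = "disk") -> None:
        """Store an opaque KV blob for a token prefix in the given tier."""
        self.tiers[tier][tuple(prefix)] = kv

    def lookup(self, tokens: Sequence[int]) -> Optional[Tuple[str, int, bytes]]:
        """Return (tier, matched_length, kv) for the longest cached prefix, or None."""
        tokens = tuple(tokens)
        # Try the longest prefix first so the most reusable entry wins.
        for length in range(len(tokens), 0, -1):
            key = tokens[:length]
            for name in self.TIER_ORDER:
                if key in self.tiers[name]:
                    kv = self.tiers[name][key]
                    # Copy the hit into the fastest tier so the next lookup finds it there.
                    self.tiers["gpu"][key] = kv
                    return name, length, kv
        return None
```

In this sketch nothing is ever evicted, which loosely mirrors the note's point that deletion is not supported: the only way to shrink the toy cache would be to clear the dictionaries, much as the note suggests removing the kvcache files by hand.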