mirror of https://github.com/LostRuins/koboldcpp.git (synced 2025-09-15 11:29:43 +00:00)
CANN: Improve loading efficiency after converting weights to NZ format. (#14985)
* CANN: Improve loading efficiency after converting weights to NZ format.
* CANN: fix typo
parent 66625a59a5
commit 11490b3672
3 changed files with 70 additions and 58 deletions
@@ -310,5 +310,7 @@ Specifies the memory pool management strategy:

Controls automatic cleanup of the memory pool. This option is only effective when using the prio or leg memory pool strategies.

## TODO

- Support more models and data types.

### GGML_CANN_WEIGHT_NZ

Converting the matmul weight format from ND to NZ can significantly improve performance on the 310I DUO NPU.
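As a rough illustration of how a switch such as `GGML_CANN_WEIGHT_NZ` might be read from the environment, here is a minimal sketch. The helper `env_flag_enabled` and the set of accepted "truthy" spellings are assumptions made for this example; they are not taken from the CANN backend, whose actual parsing may differ.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper (not the backend's actual code): returns true when the
// named environment variable is set to a common "truthy" spelling.
// The accepted spellings below are an assumption for illustration only.
inline bool env_flag_enabled(const char * name) {
    const char * raw = std::getenv(name);
    if (raw == nullptr) {
        return false;  // unset: treat the feature as disabled
    }
    std::string value(raw);
    // Lowercase the value so that "ON", "True", etc. compare equal.
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value == "1" || value == "on" || value == "yes" || value == "true";
}
```

A caller could then gate the ND-to-NZ conversion on something like `const bool weight_to_nz = env_flag_enabled("GGML_CANN_WEIGHT_NZ");`, evaluating it once at load time so that the environment is not queried for every tensor.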