CANN: Improve loading efficiency after converting weights to NZ format. (#14985)

* CANN: Improve loading efficiency after converting weights to NZ format.

* CANN: fix typo
hipudding 2025-07-31 19:47:20 +08:00 committed by GitHub
parent 66625a59a5
commit 11490b3672
GPG key ID: B5690EEEBB952194
3 changed files with 70 additions and 58 deletions

@@ -310,5 +310,7 @@ Specifies the memory pool management strategy:
Controls automatic cleanup of the memory pool. This option is only effective when using the prio or leg memory pool strategies.
## TODO
- Support more models and data types.
### GGML_CANN_WEIGHT_NZ
Converting the matmul weight format from ND to NZ can significantly improve performance on the 310I DUO NPU.
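
To make the ND-to-NZ idea concrete, here is a minimal sketch of the kind of rearrangement involved. It is not the code from this commit and not a CANN API: `nd_to_nz_sketch` is a hypothetical helper, and the layout it produces (fixed-size blocks, blocks ordered column-block first, row-major inside each block, with a default 16x16 block) is an assumption about a typical fractal NZ layout. On real hardware the conversion is performed by the Ascend libraries, and the block size may depend on the data type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not part of ggml or CANN).
// Rearranges a row-major (ND) rows x cols matrix into a blocked layout indexed as
// [col_block][row_block][row_in_block][col_in_block].
// Assumption: rows and cols are exact multiples of the block sizes (no padding).
std::vector<float> nd_to_nz_sketch(const std::vector<float>& nd,
                                   std::size_t rows, std::size_t cols,
                                   std::size_t block_rows = 16,
                                   std::size_t block_cols = 16) {
    const std::size_t row_blocks = rows / block_rows;
    const std::size_t col_blocks = cols / block_cols;
    std::vector<float> nz(nd.size());
    std::size_t out = 0;
    for (std::size_t c1 = 0; c1 < col_blocks; ++c1) {              // blocks: column-block first
        for (std::size_t r1 = 0; r1 < row_blocks; ++r1) {          // then row-block
            for (std::size_t r0 = 0; r0 < block_rows; ++r0) {      // inside a block: row-major
                for (std::size_t c0 = 0; c0 < block_cols; ++c0) {
                    const std::size_t r = r1 * block_rows + r0;
                    const std::size_t c = c1 * block_cols + c0;
                    nz[out++] = nd[r * cols + c];
                }
            }
        }
    }
    return nz;
}
```

For example, calling `nd_to_nz_sketch` on a 2x4 matrix holding the values 0 through 7 with 2x2 blocks groups the left 2x2 block first and the right 2x2 block second. Because the reordering happens once at weight load time rather than on every matmul, the cost of the conversion itself matters mainly for loading efficiency, which is what this commit targets.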