mirror of https://github.com/LostRuins/koboldcpp.git (synced 2026-05-18 23:49:46 +00:00)
hexagon: eliminate scalar VTCM loads via HVX splat helpers (#22993)
* hexagon: add hvx_vec_repl helpers and use those for splat-from-vtcm usecase
* hmx-mm: optimize per-group scale handling
* hmx-fa: optimize slope load from vtcm
* hmx-fa: use aligned access where possible in hmx-utils
* hexagon: add hvx_vec_repl_2x_f16 helper and consolidate repl helpers

---------

Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>
parent a9883db8ee
commit 856c3adac1
6 changed files with 107 additions and 38 deletions
@@ -70,5 +70,5 @@ adb $adbserial $adbhost shell " \
 ./$branch/bin/llama-completion --no-mmap -m $basedir/../gguf/$model \
 --poll 1000 -t 6 --cpu-mask 0xfc --cpu-strict 1 \
 --ctx-size 8192 --ubatch-size 256 -fa on \
--ngl 99 -no-cnv --device $device $cli_opts $@ \
+-ngl 99 --device $device $cli_opts $@ \
 "