koboldcpp/scripts
Max Krasnyansky 5d2b52d80d
hexagon: add support for basic and extended Op profiling (#22269)
* hexagon: restore HTP_OPMASK_QUEUE

* hexagon: honor OPMASK_SKIP_COMPUTE in hmx-matmul

* hex-prof: restore op profiling

* hex-prof: enable PMU

* hexagon: simplify and improve op-queuing with full profiling support

Add separate profile descriptors.

* hexagon: remove opsync and rename opmask into opstage

opsync is no longer needed since the profiler is fully async now.
opmask name was confusing and opstage is more accurate.

* hexagon: refactor opbatch queue handling

* hexagon: add iface hooks for enabling profiler from the host

Also move all the PMU setup stuff out of the hex-utils since it's not intended for normal use.

* hexagon: make profiler mode configurable

On older devices getting PMU counters is expensive so it's now optional.

* hexagon: add support for setting profiler pmu events from env

* hexagon: simplify profiler output (no need to print buffs, etc)

* hexagon: simplify pmu counter formatting

* hexagon: add a simple profile post-proc tool

* hex-prof: add support for reading logs from stdin

* hexagon: document GGML_HEXAGON_PROFILE

* hex-prof: update default width for dims field

* hex-prof: fix linter warnings and errors

* Update ggml/src/ggml-hexagon/htp/htp-ops.h

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/snapdragon/ggml-hexagon-profile.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Trivikram Reddy <tamarnat@qti.qualcomm.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-04-23 14:17:21 -07:00
apple scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
hip ggml-cuda: Add generic NVFP4 MMQ kernel (#21074) 2026-04-01 12:04:58 +02:00
jinja ci : switch from pyright to ty (#20826) 2026-03-21 08:54:34 +01:00
snapdragon hexagon: add support for basic and extended Op profiling (#22269) 2026-04-23 14:17:21 -07:00
bench-models.sh benches : update models + numbers (#19359) 2026-02-05 14:34:07 +02:00
build-info.sh llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00
check-requirements.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
compare-commits.sh scripts: add sqlite3 check for compare-commits.sh (#15633) 2025-08-28 19:23:22 +08:00
compare-llama-bench.py llama-bench: add -fitc and -fitt to arguments (#21304) 2026-04-06 22:26:02 +08:00
compare-logprobs.py scripts: update corpus of compare-logprobs (#19326) 2026-02-25 12:57:34 +01:00
create_ops_docs.py Docs: add instructions for adding backends (#14889) 2025-07-27 09:36:43 +08:00
debug-test.sh refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
fetch_server_test_models.py server: Add cached_tokens info to oaicompat responses (#19361) 2026-03-19 19:09:33 +01:00
gen-authors.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
gen-unicode-data.py ci : bump ty to 0.0.26 (#21156) 2026-03-30 09:29:15 +02:00
get-flags.mk build : pass all warning flags to nvcc via -Xcompiler (#5570) 2024-02-18 16:21:52 -05:00
get-hellaswag.sh scripts : update get-hellaswag.sh and get-winogrande.sh (#20542) 2026-03-14 11:21:50 +01:00
get-pg.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
get-wikitext-2.sh scripts : improve get-wikitext-2.sh (#19952) 2026-03-02 15:40:49 +01:00
get-winogrande.sh scripts : update get-hellaswag.sh and get-winogrande.sh (#20542) 2026-03-14 11:21:50 +01:00
get_chat_template.py scripts: corrected encoding when getting chat template (#11866) (#11907) 2025-02-18 10:30:16 +01:00
git-bisect-run.sh llama: end-to-end tests (#19802) 2026-03-08 12:30:21 +01:00
git-bisect.sh llama: end-to-end tests (#19802) 2026-03-08 12:30:21 +01:00
hf.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
install-oneapi.bat support SYCL backend windows build (#5208) 2024-01-31 08:08:07 +05:30
pr2wt.sh chore : correct typos [no ci] (#20041) 2026-03-05 08:50:21 +01:00
serve-static.js refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
server-bench.py ci : switch from pyright to ty (#20826) 2026-03-21 08:54:34 +01:00
server-test-function-call.py scripts: add function call test script (#21234) 2026-04-01 15:31:58 +02:00
server-test-model.py Autoparser - complete refactoring of parser architecture (#18675) 2026-03-06 21:01:00 +01:00
server-test-parallel-tc.py chat: fix parallel_tool_calls default setting based on model capabilities, add tests for parallel tool calls and structured outputs (#22217) 2026-04-22 18:10:56 +02:00
server-test-structured.py chat: fix parallel_tool_calls default setting based on model capabilities, add tests for parallel tool calls and structured outputs (#22217) 2026-04-22 18:10:56 +02:00
sync-ggml-am.sh scripts : update sync scripts 2025-08-18 22:06:44 +03:00
sync-ggml.last sync : ggml 2026-04-21 11:04:21 +03:00
sync-ggml.sh scripts : update sync scripts 2025-08-18 22:06:44 +03:00
sync_vendor.py vendor : update cpp-httplib to 0.43.1 (#22143) 2026-04-21 22:45:48 +08:00
tool_bench.py refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
tool_bench.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
verify-checksum-models.py convert.py : add python logging instead of print() (#6511) 2024-05-03 22:36:41 +03:00
xxd.cmake llama : move end-user examples to tools directory (#13249) 2025-05-02 20:27:13 +02:00