mirror of
https://github.com/LostRuins/koboldcpp.git
synced 2026-05-06 08:01:27 +00:00
* feat: Add a batched version of ssm_conv This was done using Claude Code. It found a number of optimizations around how the threads were organized, resulting in a huge performance boost! Branch: Mamba2SSD Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * feat: Optimized SSM_SCAN kernel for metal This used Claude Code and resulted in a modest performance improvement while maintaining correctness. Branch: Mamba2SSD Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * test: Add test-backend-ops perf tests for SSM_CONV Branch: SSMKernelImprovements Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * test: Real representitive tests for SSM_CONV Branch: SSMKernelImprovements Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * refactor: Use function constant for ssm_conv batch size Branch: SSMKernelImprovements Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * test: backend op tests for ssm_scan from granite4 1b-h Branch: SSMKernelImprovements Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * style: remove commented out templates Branch: SSMKernelImprovements Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * feat: float4 version of ssm_conv_batched Branch: SSMKernelImprovements Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * fix: Add missing ggml_metal_cv_free Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> |
||
|---|---|---|
| .. | ||
| peg-parser | ||
| .gitignore | ||
| CMakeLists.txt | ||
| get-model.cpp | ||
| get-model.h | ||
| run-json-schema-to-grammar.mjs | ||
| test-alloc.cpp | ||
| test-arg-parser.cpp | ||
| test-autorelease.cpp | ||
| test-backend-ops.cpp | ||
| test-barrier.cpp | ||
| test-c.c | ||
| test-chat-parser.cpp | ||
| test-chat-peg-parser.cpp | ||
| test-chat-template.cpp | ||
| test-chat.cpp | ||
| test-double-float.cpp | ||
| test-gbnf-validator.cpp | ||
| test-gguf.cpp | ||
| test-grammar-integration.cpp | ||
| test-grammar-llguidance.cpp | ||
| test-grammar-parser.cpp | ||
| test-json-partial.cpp | ||
| test-json-schema-to-grammar.cpp | ||
| test-llama-grammar.cpp | ||
| test-log.cpp | ||
| test-lora-conversion-inference.sh | ||
| test-model-load-cancel.cpp | ||
| test-mtmd-c-api.c | ||
| test-opt.cpp | ||
| test-peg-parser.cpp | ||
| test-quantize-fns.cpp | ||
| test-quantize-perf.cpp | ||
| test-quantize-stats.cpp | ||
| test-regex-partial.cpp | ||
| test-rope.cpp | ||
| test-sampling.cpp | ||
| test-thread-safety.cpp | ||
| test-tokenizer-0.cpp | ||
| test-tokenizer-0.py | ||
| test-tokenizer-0.sh | ||
| test-tokenizer-1-bpe.cpp | ||
| test-tokenizer-1-spm.cpp | ||
| test-tokenizer-random.py | ||
| test-tokenizers-repo.sh | ||