koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-09 02:50:39 +00:00

Georgi Gerganov 852ce5180a ggml : fix conv2d_dw SVE path (ggml/1380)	2025-11-05 10:41:51 +02:00

* Fix test-conv2d-dw failure on ARM SVE by using runtime vector length

  The ggml_compute_forward_conv_2d_dw_cwhn function was using a hardcoded GGML_F32_EPR (8) for SIMD vectorization, but on ARM SVE the actual vector length varies by hardware. This caused incorrect computation when processing CWHN layout tensors on ARM machines.

  Fix by using svcntw() to get the runtime SVE vector length instead of the compile-time constant.

  Co-authored-by: ggerganov <1991296+ggerganov@users.noreply.github.com>

* ci : reduce sam score threshold

* ci : update bbox checks for sam test

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ggerganov <1991296+ggerganov@users.noreply.github.com>

ggml-blas	sync : whisper.cpp (ggml/1359)	2025-09-29 17:43:58 +03:00
ggml-cann	CANN: Improve device ID handling and aclnnArange checks (#16752)	2025-10-28 10:54:53 +08:00
ggml-cpu	ggml : fix conv2d_dw SVE path (ggml/1380)	2025-11-05 10:41:51 +02:00
ggml-cuda	CUDA: avoid mul + bias fusion when doing fusion (#16935)	2025-11-04 10:53:48 +08:00
ggml-hexagon	refactor: replace sprintf with snprintf for safer string handling in dump functions (#16913)	2025-11-04 12:25:39 -08:00
ggml-hip	HIP: fix AMDGPU_TARGETS, update documentation (#16803)	2025-10-27 21:39:49 +01:00
ggml-metal	clip : use FA (#16837)	2025-11-02 21:21:48 +01:00
ggml-musa	CUDA: faster tile FA, add oob checks, more HSs (#16492)	2025-10-11 20:54:32 +02:00
ggml-opencl	opencl: support imrope (#16914)	2025-11-03 11:47:57 -08:00
ggml-rpc	rpc : report actual free memory (#16616)	2025-10-17 18:02:52 +03:00
ggml-sycl	SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster)Feature/sycl repeat back opt (#16869)	2025-11-03 09:35:33 +08:00
ggml-vulkan	vulkan: remove the need for the dryrun (#16826)	2025-11-04 13:28:17 -06:00
ggml-webgpu	model: add support for qwen3vl series (#16780)	2025-10-30 16:19:14 +01:00
ggml-zdnn	zdnn: refactor codebase + add docs (#16178)	2025-09-23 14:53:05 +08:00
CMakeLists.txt	ggml: add s390x cpu-feats (#16774)	2025-11-02 08:48:23 +08:00
ggml-alloc.c	ggml-alloc : make gallocr prefer chunks that allow memory reuse (#16788)	2025-10-26 23:19:03 +01:00
ggml-backend-impl.h	rpc : add support for multiple devices (#16276)	2025-10-04 12:49:16 +03:00
ggml-backend-reg.cpp	Add experimental ggml-hexagon backend for the Hexagon NPU (#16547)	2025-10-22 13:47:09 -07:00
ggml-backend.cpp	llama: print memory breakdown on exit (#15860)	2025-09-24 16:53:48 +02:00
ggml-common.h	llama : add gpt-oss (#15091)	2025-08-05 22:10:36 +03:00
ggml-impl.h	vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656)	2025-10-29 14:44:29 +01:00
ggml-opt.cpp	finetune: SGD optimizer, more CLI args (#13873)	2025-08-14 12:03:57 +02:00
ggml-quants.c	ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (#15928)	2025-09-23 10:25:20 +02:00
ggml-quants.h	llama : add gpt-oss (#15091)	2025-08-05 22:10:36 +03:00
ggml-threading.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.h	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
ggml.c	ggml: add ggml_can_fuse_subgraph (#16662)	2025-10-21 16:43:14 +08:00
ggml.cpp	ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)	2025-06-01 13:43:57 +03:00
gguf.cpp	gguf: gguf_writer refactor (#15691)	2025-09-05 11:34:28 +02:00