Concedo
05834eecb3
Merge commit '1ca3d1de15' into concedo_experimental
...
# Conflicts:
# tools/server/README.md
2026-02-26 19:55:06 +08:00
Georgi Gerganov
1ca3d1de15
gguf : avoid too many file size calls ( #19919 )
2026-02-26 12:46:32 +02:00
Concedo
749a606374
whisper broke
2026-02-26 16:45:04 +08:00
Aldehir Rojas
a96a1120b4
gguf : fix ftell/fseek for Windows ( #19870 )
2026-02-25 06:58:11 +02:00
Georgi Gerganov
418dea39ce
ggml/gguf : prevent integer overflows ( #19856 )
...
* gguf : prevent integer overflow for ggml_context mem size
* ggml : fix int overflows in ggml_new_object()
* gguf : prevent string exhaustion
* gguf : prevent array elements exhaustion
* ggml : fix negative tensor type oob
* py : assert that alignment is non-zero power of 2
* ggml : check int overflow in ggml_new_tensor_impl and ggml_new_object
* gguf-py : error on duplicate keys when reading
* py : restore tensor_fields
* enforce proper alignment in add_custom_alignment
* gguf : better name
* gguf : fix ctx size for no_alloc == true
* gguf : minor print fix
* ggml : print values when overflow
* ggml : remove deprecated ggml_type_sizef()
* ggml : relax ggml_type asserts to debug-only
* gguf : add mem_size overflow test
* gguf : add file size check for arrays
* ggml : relax asserts for ggml_get_type_traits()
* flake8 fix
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-02-24 20:17:11 +02:00
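Several bullets in 418dea39ce above concern integer overflow when sizes are computed from values read out of a file. The following is a minimal sketch of that general technique, not the code from the commit. The helper names checked_mul and checked_n_elements are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of ggml/gguf): multiply two sizes and
// report failure instead of silently wrapping around.
static bool checked_mul(std::size_t a, std::size_t b, std::size_t & out) {
    if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
        return false; // a * b would not fit in size_t
    }
    out = a * b;
    return true;
}

// Hypothetical helper: total element count over n_dims dimensions,
// rejecting negative dimensions and any product that would overflow.
static bool checked_n_elements(const std::int64_t * ne, int n_dims, std::size_t & out) {
    assert(ne != nullptr || n_dims == 0);
    std::size_t total = 1;
    for (int i = 0; i < n_dims; ++i) {
        if (ne[i] < 0) {
            return false;
        }
        const std::uint64_t dim = static_cast<std::uint64_t>(ne[i]);
        if (dim > std::numeric_limits<std::size_t>::max()) {
            return false; // dimension not representable as size_t on this platform
        }
        if (!checked_mul(total, static_cast<std::size_t>(dim), total)) {
            return false;
        }
    }
    out = total;
    return true;
}
```

The division-based test runs before the multiplication, so the check itself never overflows. On failure the output argument is left untouched.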
Concedo
f6ece6fd37
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/check-vendor.yml
# .github/workflows/close-issue.yml
# .github/workflows/editorconfig.yml
# .github/workflows/gguf-publish.yml
# .github/workflows/labeler.yml
# .github/workflows/pre-tokenizer-hashes.yml
# .github/workflows/python-check-requirements.yml
# .github/workflows/python-lint.yml
# .github/workflows/python-type-check.yml
# .github/workflows/server.yml
# .github/workflows/update-ops-docs.yml
# README.md
# docs/build.md
# examples/model-conversion/scripts/utils/perplexity-gen.sh
# examples/model-conversion/scripts/utils/perplexity-run-simple.sh
# examples/model-conversion/scripts/utils/perplexity-run.sh
# examples/model-conversion/scripts/utils/quantize.sh
# examples/model-conversion/scripts/utils/run-embedding-server.sh
# ggml/src/ggml-cpu/ggml-cpu.c
# ggml/src/ggml-hexagon/htp/flash-attn-ops.c
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/cvt.cl
# ggml/src/ggml-opencl/kernels/mul_mv_q6_k_f32.cl
# ggml/src/ggml-sycl/ggml-sycl.cpp
# scripts/compare-llama-bench.py
# tests/test-backend-ops.cpp
# tests/test-gguf.cpp
# tools/cli/README.md
# tools/completion/README.md
# tools/server/README.md
2026-01-27 23:06:13 +08:00
Johannes Gäßler
4e5b83b226
GGUF: check that tensor size is representable ( #19072 )
2026-01-24 21:57:51 +01:00
Concedo
4984c9bc16
Merge commit '12a4a47e6a' into concedo_experimental
...
# Conflicts:
# ci/run.sh
# examples/model-conversion/scripts/causal/run-converted-model-embeddings-logits.sh
# examples/model-conversion/scripts/causal/run-converted-model.sh
# examples/model-conversion/scripts/embedding/run-converted-model.sh
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-hexagon/ggml-hexagon.cpp
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# ggml/src/ggml-zdnn/ggml-zdnn.cpp
# ggml/src/ggml-zendnn/ggml-zendnn.cpp
# tests/CMakeLists.txt
# tests/test-chat-parser.cpp
# tests/test-chat-peg-parser.cpp
# tests/test-chat.cpp
# tools/cli/cli.cpp
2026-01-21 21:00:44 +08:00
Matthieu Coudron
37c35f0e1c
gguf: display strerrno when can't load a model ( #18884 )
...
I've had issues loading models with llama-server:
[44039] E gguf_init_from_file: failed to open GGUF file 'mistral-7b-v0.1.Q8_0.gguf'
and I was sure it could access the file. It seems that --models-dir and
--models-presets don't interact the way I thought they would, but I salvaged
this snippet, which helps with troubleshooting:
[44039] E gguf_init_from_file: failed to open GGUF file 'mistral-7b-v0.1.Q8_0.gguf' (errno No such file or directory)
2026-01-21 08:52:46 +02:00
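The message format quoted in 37c35f0e1c above can be reproduced with standard C library calls. This is a minimal sketch, not the gguf.cpp implementation. The helper name describe_open_failure is hypothetical, and the message layout is copied from the log line in the commit body.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper (not the gguf.cpp code): try to open `path` for reading.
// Returns an empty string on success, otherwise a message with the errno text.
static std::string describe_open_failure(const char * path) {
    assert(path != nullptr);
    errno = 0;
    std::FILE * file = std::fopen(path, "rb");
    if (file != nullptr) {
        std::fclose(file);
        return ""; // opened fine, nothing to report
    }
    const int err = errno; // save it before another call can overwrite it
    return std::string("failed to open GGUF file '") + path +
           "' (errno " + std::strerror(err) + ")";
}
```

Saving errno straight after the failing call matters because later library calls may change it. The text returned by std::strerror depends on the platform and locale.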
Concedo
03cec02a3d
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/release.yml
# .github/workflows/winget.yml
# CODEOWNERS
# README.md
# ci/run.sh
# docs/build.md
# docs/ops.md
# docs/ops/Vulkan.csv
# ggml/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# scripts/sync_vendor.py
# src/CMakeLists.txt
# tests/test-json-schema-to-grammar.cpp
# tests/test-quantize-stats.cpp
# tools/server/CMakeLists.txt
# tools/server/README.md
2025-12-03 18:56:31 +08:00
Herman Semenoff
37adc9c6ba
ggml, llama : use defaulted constructors/destructors ( #17649 )
2025-12-03 07:12:18 +01:00
Concedo
5de51b77c1
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/close-issue.yml
# docs/build-s390x.md
# examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp
# ggml/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
# ggml/src/ggml-cuda/fattn-tile-f16.cu
# ggml/src/ggml-cuda/fattn.cu
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# scripts/tool_bench.py
# tests/test-backend-ops.cpp
# tools/batched-bench/batched-bench.cpp
# tools/server/README.md
2025-09-11 22:28:19 +08:00
Erik Scholz
a81283820a
gguf: gguf_writer refactor ( #15691 )
...
* gguf: split gguf writer into base and buf impl
* gguf: templated gguf write out
* gguf: file based writer (avoid writing everything to memory first!)
* examples(llama2c): fix log not being the same level and compiler nits
2025-09-05 11:34:28 +02:00
Concedo
2562129271
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# ci/run.sh
# docs/backend/CANN.md
# examples/speculative/speculative.cpp
# ggml/CMakeLists.txt
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/flash_attn_f16.cl
# ggml/src/ggml-opencl/kernels/flash_attn_f32.cl
# ggml/src/ggml-opencl/kernels/flash_attn_f32_f16.cl
# ggml/src/ggml-webgpu/ggml-webgpu.cpp
# ggml/src/gguf.cpp
# src/llama-context.cpp
# tests/test-sampling.cpp
# tools/server/README.md
2025-09-03 17:16:42 +08:00
SnA1lGo
3de008208b
fix: resolve unsigned int initialization warning for n_dims/size in gguf.cpp ( #15754 )
2025-09-02 21:27:30 +02:00
Concedo
c52cbdce52
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/scale.cl
# ggml/src/ggml-sycl/backend.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# tests/test-backend-ops.cpp
2025-07-10 17:43:08 +08:00
Miaoqian Lin
26a48ad699
ggml : prevent integer overflow in gguf tensor size calculation ( #14595 )
2025-07-09 14:33:53 +02:00
Concedo
0ac20e30b5
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# docs/backend/SYCL.md
# docs/build.md
# ggml/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cpu/amx/mmq.cpp
# ggml/src/ggml-cpu/ggml-cpu.c
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/sycl_hw.cpp
# ggml/src/ggml-sycl/sycl_hw.hpp
# ggml/src/ggml-vulkan/CMakeLists.txt
# tests/test-backend-ops.cpp
2025-06-28 08:56:29 +08:00
Sigbjørn Skjæret
b193d53069
ggml : do not output unprintable characters on GGUF load failure ( #14381 )
2025-06-25 23:26:51 +02:00
Concedo
b42b618897
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# examples/parallel/parallel.cpp
# ggml/src/CMakeLists.txt
# ggml/src/ggml-blas/CMakeLists.txt
# ggml/src/ggml-sycl/CMakeLists.txt
# ggml/src/gguf.cpp
# scripts/sync-ggml.last
# tests/test-gguf.cpp
2025-06-02 23:26:43 +08:00
Johannes Gäßler
7675c555a1
gguf: fix failure on version == 0 ( #13956 )
2025-06-01 18:08:05 +02:00
Aaron Teo
e57bb87ced
ggml: check if non-native endian model is being loaded ( #13943 )
...
* gguf: prevent non-native endian models from being loaded
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* gguf: update error message
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* gguf: make the non-native endian check more verbose
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: move ggml_assert location
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: reword the endianness check error message
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-01 16:53:57 +02:00
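One way to detect a non-native endian file, as e57bb87ced above sets out to do, is to inspect the version field after reading it in host byte order. This is a minimal sketch under two assumptions that do not come from this log: the version is a small 32-bit integer, and a byte-swapped small value therefore has zero in its low 16 bits. The function name looks_byte_swapped is hypothetical.

```cpp
#include <cstdint>

// Hypothetical check (an assumption, not taken from the commit): a small
// version number such as 3 reads as 0x03000000 when the file's byte order
// differs from the host's, so its low 16 bits are all zero.
static bool looks_byte_swapped(std::uint32_t version) {
    return version != 0 && (version & 0x0000FFFFu) == 0;
}
```

A version of 0 is deliberately not treated as byte-swapped here, because zero looks identical in both byte orders and needs its own error path.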
Concedo
e5d26a2356
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/CMakeLists.txt
# docs/backend/SYCL.md
# ggml/CMakeLists.txt
# ggml/src/ggml-sycl/CMakeLists.txt
# ggml/src/ggml-sycl/binbcast.cpp
# ggml/src/ggml-sycl/convert.cpp
# ggml/src/ggml-sycl/dequantize.hpp
# ggml/src/ggml-sycl/dmmv.cpp
# ggml/src/ggml-sycl/gemm.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/mmvq.cpp
# ggml/src/ggml-sycl/vecdotq.hpp
# ggml/src/ggml-vulkan/CMakeLists.txt
# ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
# ggml/src/gguf.cpp
# scripts/compare-llama-bench.py
# tests/CMakeLists.txt
# tests/test-chat.cpp
# tools/llama-bench/llama-bench.cpp
# tools/server/README.md
2025-05-16 15:30:31 +08:00
Diego Devesa
c6a2c9e741
gguf : use ggml log system ( #13571 )
...
* gguf : use ggml log system
* llama : remove unnecessary new lines in exception messages
2025-05-15 19:13:11 +02:00
Concedo
8a4a9b8c19
Merge branch 'upstream' into concedo_experimental
2025-04-01 20:16:16 +08:00
R0CKSTAR
a6f32f0b34
Fix clang warning in gguf_check_reserved_keys ( #12686 )
...
* Fix clang warning in gguf_check_reserved_keys
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Fix typo
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-04-01 13:12:53 +02:00
Concedo
5329df2bdf
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/server.yml
# CMakeLists.txt
# cmake/build-info.cmake
# examples/run/CMakeLists.txt
# examples/run/run.cpp
# examples/simple-chat/simple-chat.cpp
# tests/CMakeLists.txt
# tests/test-backend-ops.cpp
# tests/test-sampling.cpp
2025-01-21 00:25:07 +08:00
Georgi Gerganov
4dd34ff831
cmake : add sanitizer flags for llama.cpp ( #11279 )
...
* cmake : add sanitizer flags for llama.cpp
ggml-ci
* tests : fix compile warnings
ggml-ci
* cmake : move sanitizer flags to llama_add_compile_flags
ggml-ci
* cmake : move llama.cpp compile flags to top level lists
ggml-ci
* cmake : apply only sanitizer flags at top level
ggml-ci
* tests : fix gguf context use in same_tensor_data
* gguf-test: tensor data comparison
* dummy : trigger ggml-ci
* unicode : silence gcc warnings
ggml-ci
* ci : use sanitizer builds only in Debug mode
ggml-ci
* cmake : add status messages [no ci]
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-01-18 16:18:15 +02:00
Concedo
b3de1598e7
Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS
...
tts is functional (+6 squashed commit)
Squashed commit:
[22396311] wip tts
[3a883027] tts not yet working
[0dcfab0e] fix silly bug
[a378d9ef] some long overdue cleanup
[fc5a6fb5] Wip tts
[39f50497] wip TTS integration
2025-01-13 14:23:25 +08:00
Concedo
e788b8289a
You'll never take us alive
...
We swore that death will do us part
They'll call our crimes a work of art
2025-01-09 11:27:06 +08:00
Johannes Gäßler
53ff6b9b9f
GGUF: C++ refactor, backend support, misc fixes ( #11030 )
...
* GGUF: C++ refactor, backend support, misc fixes
remove ggml_tensor.backend
update CODEOWNERS [no ci]
remove gguf_get_data from API
revise GGUF API data types
2025-01-07 18:01:58 +01:00