Concedo
b649f69e4f
gemma3n chat template
2025-06-29 15:24:55 +08:00
Concedo
d383c03554
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# tests/test-backend-ops.cpp
2025-06-29 15:10:26 +08:00
Concedo
2594be7d4e
fixed scaling behavior again
2025-06-29 11:36:38 +08:00
Concedo
2635e4b932
try fix segfault in sdcpp
2025-06-29 02:33:03 +08:00
Aman Gupta
27208bf657
CUDA: add bf16 and f32 support to cublas_mul_mat_batched (#14361)
* CUDA: add bf16 and f32 support to cublas_mul_mat_batched
* Review: add type traits and make function more generic
* Review: make check more explicit, add back comments, and fix formatting
* Review: fix formatting, remove useless type conversion, fix naming for bools
2025-06-29 01:30:53 +08:00
Jeff Bolz
63a7bb3c7e
vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (#14378)
2025-06-28 17:36:40 +02:00
Concedo
485148b293
fixed sdmain compiling
2025-06-28 23:19:13 +08:00
Jeff Bolz
00d5282c7f
vulkan: lock accesses of pinned_memory vector (#14333)
2025-06-28 17:17:09 +02:00
Concedo
2975ccdb6f
Merge branch 'upstream' into concedo_experimental
2025-06-28 23:11:29 +08:00
Concedo
6c92a9f0e1
fixed resizing
2025-06-28 23:10:04 +08:00
Concedo
a1175cf34f
merged leejet changes
2025-06-28 22:57:07 +08:00
Weizhao Ouyang
566c16fcce
model : add support for ERNIE 4.5 0.3B model (#14408)
Add Day-0 support for Baidu ERNIE 4.5 0.3B model.
Signed-off-by: Weizhao Ouyang <weizhao.ouyang@arm.com>
2025-06-28 16:08:21 +02:00
Concedo
b6edb79648
filter out empty entries
2025-06-28 20:22:34 +08:00
Concedo
794563b52c
update docs
2025-06-28 17:53:32 +08:00
Concedo
a88c56e70c
Merge remote-tracking branch 'origin/upstream' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/release.yml
# examples/eval-callback/eval-callback.cpp
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-vulkan/CMakeLists.txt
# ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
# tests/test-backend-ops.cpp
2025-06-28 17:47:53 +08:00
Xinpeng Dou
b25e92774e
fix async_mode bug (#14432)
2025-06-28 17:35:41 +08:00
Concedo
4ec0e0fd21
now accept multiple images for reference images
2025-06-28 17:30:28 +08:00
Sigbjørn Skjæret
6609507a91
ci : fix windows build and release (#14431)
2025-06-28 09:57:07 +02:00
Concedo
2e14338455
additional padding for the swa kv cache itself
2025-06-28 15:52:48 +08:00
Concedo
ff2cabc28f
fixed kontext and photomaker (+1 squashed commits)
Squashed commits:
[de0ac91dd] photomaker use 1 channel
2025-06-28 12:14:05 +08:00
Concedo
5a6cc38f35
fixed a typo
2025-06-28 11:47:07 +08:00
Concedo
ed289227e5
added support for flux kontext
2025-06-28 11:37:19 +08:00
Jeff Bolz
ceb1bf5a34
vulkan: Fix GGML_VULKAN_SHADER_DEBUG_INFO (#14427)
This setting needs to be passed through to vulkan-shaders-gen
2025-06-27 22:35:30 -05:00
Concedo
0bd648ffa4
photomaker renamed to extra image to handle future extension
2025-06-28 10:26:06 +08:00
Concedo
815d2056d9
gentoken reservations
2025-06-28 09:16:20 +08:00
Concedo
0ac20e30b5
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# docs/backend/SYCL.md
# docs/build.md
# ggml/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cpu/amx/mmq.cpp
# ggml/src/ggml-cpu/ggml-cpu.c
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/sycl_hw.cpp
# ggml/src/ggml-sycl/sycl_hw.hpp
# ggml/src/ggml-vulkan/CMakeLists.txt
# tests/test-backend-ops.cpp
2025-06-28 08:56:29 +08:00
Georgi Gerganov
72babea5de
graph : make llm_graph_context destructor virtual (#14410)
ggml-ci
2025-06-27 21:42:02 +03:00
Georgi Gerganov
43678060c1
recurrent : call balloc split_reset() in init_batch() (#14414)
ggml-ci
2025-06-27 17:55:45 +03:00
Radoslav Gerganov
8d94219a4a
ggml : add ggml_set_rows (#14274)
* ggml : add ggml_set_rows
Add ggml_set_rows(a, b, c) which copies rows from 'b' into 'a' using
indices from 'c'.
ref: #8366
* use I64 for indices
* ggml : add repeat impl for i64
* ggml : add ggml_is_contiguous_rows
* ggml : ggml_set_rows support broadcast
* ggml : ggml_set_rows support quantized dst
ggml-ci
* ggml : support GGML_TYPE_F32 ".from_float" trait
* ggml : ggml_set_rows update comment + better index name
* tests : add ggml_set_rows
* metal : add ggml_set_rows implementation
ggml-ci
* ggml : simplify forward_dup_f32
* ggml : fix supports_op
* tests : add comment to set_rows
* ggml : leave the repeat_i64 for a separate PR
ggml-ci
* ggml : set_rows use std::min instead of MIN
* ggml : better error message for set_rows unsupported type
* metal : perform op->type check only once
* tests : more consistent implementation + more tests
ggml-ci
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-27 16:41:40 +03:00
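The row-copy semantics described in the ggml_set_rows entry above (copy rows from 'b' into 'a' using indices from 'c') can be sketched in plain Python. This is an illustration only, not the ggml API: the helper name `set_rows` and the list-of-lists layout are assumptions, and the broadcast and quantized-destination cases mentioned in the commit are left out.

```python
def set_rows(a, b, c):
    """Copy row b[i] into a[c[i]] for every i (hypothetical helper, not the ggml API)."""
    if len(b) != len(c):
        raise ValueError("need exactly one index per source row")
    for src_row, dst_index in zip(b, c):
        if not 0 <= dst_index < len(a):
            raise IndexError(f"row index {dst_index} out of range")
        a[dst_index] = list(src_row)  # copy so 'a' does not alias 'b'
    return a


a = [[0.0, 0.0] for _ in range(4)]  # destination: 4 rows
b = [[1.0, 2.0], [3.0, 4.0]]        # source: 2 rows
c = [3, 1]                          # b[0] goes to a[3], b[1] goes to a[1]
set_rows(a, b, c)
```

Rows of 'a' that no index points at are left untouched.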
Concedo
39b0699c71
fixed savestates with drafting
2025-06-27 20:35:38 +08:00
Sigbjørn Skjæret
f667f1e624
convert : fix broken sentencepiece vocab (#14416)
2025-06-27 10:42:19 +02:00
Xuan-Son Nguyen
8846aace49
model : gemma3n text-only (#14400)
* gemma3n
* add llm_graph_input_one
2025-06-26 20:34:02 +03:00
bandoti
a01047b041
cmake: regen vulkan shaders when shaders-gen sources change (#14398)
* Add shaders-gen sources as target deps
2025-06-26 13:46:53 -03:00
tsite
df47b51bd1
support python 3.13 (#1621)
2025-06-27 00:18:30 +08:00
Sigbjørn Skjæret
b25346221d
llama : return mistral-v7-tekken as default template only (#14390)
2025-06-26 15:01:14 +02:00
Georgi Gerganov
e8215dbb96
metal : add special-case mat-vec mul for ne00 == 4 (#14385)
ggml-ci
2025-06-26 15:51:19 +03:00
Georgi Gerganov
5783ae4359
metal : batch rows copy in a single threadgroup (#14384)
* metal : batch rows copy in a single threadgroup
ggml-ci
* metal : handle some edge cases when threadgroup size is not a power of 2
ggml-ci
2025-06-26 15:50:15 +03:00
Concedo
60e9f285c3
extend log
2025-06-26 18:52:44 +08:00
Aaron Teo
bf5bcd0b85
docs: update s390x documentation + add faq (#14389)
* docs: update s390x documentation + add faq
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* docs: add s390x z17 build q&a
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-26 12:41:41 +02:00
R0CKSTAR
716301d1b0
musa: enable fp16 mma (all) and cublas on qy2 (#13842)
* musa: enable fp16 mma (all) and cublas on qy2
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Update ggml/src/ggml-cuda/ggml-cuda.cu
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Address review comments
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Address review comments
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: disable MUL_MAT_ID (q2_k × f32) due to precision issues
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-06-26 12:11:59 +08:00
Aaron Teo
60ef23d6c1
ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317)
* ggml-cpu: add nnpa compile flag
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 4a9f60c201573128f73a65999b3e5cc497fae5c1)
* ggml-cpu: add fp16->fp32 nnpa first
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 8d4a7987f9c1887f716be96250f2caeee0253929)
* ggml-cpu: add fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 0ff0d6516247a41d2ade42b42cf0d676a4dd1627)
* ggml-cpu: better variable names
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 2f58bbcbb89c183340e252362b2a40651f573f1f)
* docs: update s390x docs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 01b929491b50071a5d0572235dcf5a449da70aa7)
* ggml-cpu: add debugging prints to see if dlf16 is correct
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix print vs printf
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix float placeholder
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: ensure fp16 and fp32 load and stores are called
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fp16 load ensured to hit
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: remove sigint from fp16 store
for some reason, the function is not getting a hit when debugged with
gdb. we will need to investigate further
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: activate nnpa for ggml_cpu_fp16_to_fp32
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: nnpa activate ggml_cpu_fp16_to_fp32 for 8 elements
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: nnpa switch to vec_xst test
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: switch to vec_xst for 4 element loops also
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: rework noop
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: remove noop, general code cleanup
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: clarify variable naming
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: activate nnpa for ggml_cpu_fp32_to_fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add breakpoint for debugging
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: test fix for conversion failure
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: disable fp32->fp16 nnpa conversions for now
there are some conversion failures in nnpa that requires the eyes of an
ibm stsm. will create a separate pr to introduce the fp32->fp16 change.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: switch to elif macro
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: reattempt fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix typo
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: reattempt fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix compiler types
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: change to typedef vector types
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add 4 element loops for fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: clarified vector naming
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: bring back fp32->fp16 store nnpa
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: activate nnpa fp32->fp16 or fp16->fp32 compute
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add nnpa macro check in ggml-impl
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add missing __func__
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: diagnose why __NNPA__ macro is not being defined
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: import vecintrin.h to fix compiler errors
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: update macro tests
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: move s390x typedef to own header file
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml-cpu: move s390x typedef to own header file"
This reverts commit 157f856c34589566151630e294563a420702db39.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: switch to importing ggml-cpu-impl instead
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix macro declaration
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: test more macros
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add debug prints
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: bruteforce macro definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: move macro definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add ggml-impl.h to cmakelists
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: switch to private macros
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: move s390x typedef to own header file
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 157f856c34589566151630e294563a420702db39)
* ggml-cpu: move things around
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: bring back compile macros
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: switch to quotes for import
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add compiler error macro
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add s390x detection in ggml-src
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: bring back compile definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: undo cmakelists work
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml-cpu: move s390x typedef to own header file"
This reverts commit 18d79e1a30b39d9aaa0bd58400c5cf2c32135c9a.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: remove typedefs.h
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: remove typedef from cmakelists
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add ggml-impl.h future notes
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: add todo comment for future reference
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: clarify naming of dlf16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: remove unnecessary target compile definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: move nnpa fp16->fp32 and fp32->fp16 to simd-mappings
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* docs: update broken huggingface link for s390x
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix duplicate func names during compile
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml-cpu: fix duplicate func names during compile"
This reverts commit fbb733451f27677063b914d4f6c9a9841d45b38d.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu"
This reverts commit bd288e8fa52b5244f65cee21cb61062f1a9e0ca5.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: refactor fp16<->fp32 simd to ggml-cpu
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix missing simd-mappings.h import in quants.c
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix missing simd-mappings.h within repack
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix amx mmq missing simd-mappings.h
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: attempt at fixing loongarch failing build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: move nnpa together with other fp16<->fp32 simd
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: fix wrong refactor of ggml-base
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164176555
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: remove dependency on ggml-cpu from ggml-base
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: rename all fp16<->fp32 macros to prefix with ggml_cpu
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164449406
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: remove mistaken fallback macro
fallback logic was already implemented but i was too sleepy to realise
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: move ggml_table_f32_f16 to ggml-cpu
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164775006
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures"
This reverts commit 32a3533564bdb7902cefb9c89b1c9e956a81ce29.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml: move ggml_table_f32_f16 to ggml-cpu"
This reverts commit 9e40d984ad27d7b60392fb2b7548885201864fe4.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml: move ggml_table_f32_f16 to ggml-cpu
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164775006
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 9e40d984ad27d7b60392fb2b7548885201864fe4)
* ggml: move ggml_table_f32_f16 to ggml-cpu.c
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: extern c ggml_table_f32_f16 + chore docs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h
we rely on the variable declaration in ggml-cpu.c instead
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h"
This reverts commit f71b21d2f74f5e03ec0c2b4fefd3cbf395aecf16.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-cpu: bring back ggml_table_f32_f16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "ggml-cpu: bring back ggml_table_f32_f16"
This reverts commit 2dce119178bed5ef5c8398c4230ddd14fef80e49.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* fix ggml time initialization
* fix f32_f16 table init
* remove extra line
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Co-authored-by: slaren <slarengh@gmail.com>
2025-06-25 23:49:04 +02:00
Sigbjørn Skjæret
b193d53069
ggml : do not output unprintable characters on GGUF load failure (#14381)
2025-06-25 23:26:51 +02:00
Anton Mitkov
2bf9d539dd
sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (#13973)
2025-06-25 18:09:55 +02:00
Concedo
969ef81701
updated lite
2025-06-25 20:51:27 +08:00
Concedo
45f0cc7310
hack to fix compilation on avx2 intel macbook
2025-06-25 20:15:20 +08:00
Reithan
54dde5e565
Add memoized cache to llama_grammar_reject_candidates_for_stack (#1615)
* Add memoized cache to llama_grammar_reject_candidates_for_stack
* make size cutoff more aggressive and move to outer branch
* update comment
* add cache reset whenever grammar is reloaded
* remove explicit reference types for compiler transportability
2025-06-25 19:22:19 +08:00
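The general idea behind the memoized cache in the entry above (cache results, apply a size cutoff, reset when the grammar is reloaded) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from that commit: the class name `RejectCache`, the `max_candidates` parameter, and the choice to bypass the cache for large inputs are all hypothetical.

```python
class RejectCache:
    """Memoizes results of an expensive rejection check (illustrative only)."""

    def __init__(self, compute, max_candidates=8):
        self._compute = compute                # the expensive function being wrapped
        self._max_candidates = max_candidates  # hypothetical size cutoff
        self._cache = {}

    def reject(self, stack, candidates):
        # Assumption: large inputs bypass the cache so keys stay cheap to build.
        if len(candidates) > self._max_candidates:
            return self._compute(stack, candidates)
        key = (tuple(stack), tuple(candidates))
        if key not in self._cache:
            self._cache[key] = self._compute(stack, candidates)
        return self._cache[key]

    def reset(self):
        # Call whenever the grammar is reloaded; old results no longer apply.
        self._cache.clear()


calls = []

def slow_reject(stack, candidates):
    # Stand-in for the real check: keep candidates not already on the stack.
    calls.append(1)
    return [x for x in candidates if x not in stack]

cache = RejectCache(slow_reject)
first = cache.reject(["a"], ["a", "b"])
second = cache.reject(["a"], ["a", "b"])  # served from the cache
```

Resetting on reload matters because cached results are only valid for the grammar they were computed against.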
lhez
73e53dc834
opencl: ref count ggml_backend_opencl_context and refactor profiling (#14254)
* Move profiling info into `ggml_backend_opencl_context`
* Add `enqueue_ndrange_kernel` to launch kernel
2025-06-24 11:46:25 -07:00
Georgi Gerganov
62af464227
batch : fix check for empty sequences in memory (#14364)
* batch : fix check for empty sequences in memory
ggml-ci
* cont : reuse the var
ggml-ci
2025-06-24 18:26:30 +03:00
Concedo
b884a7f058
try switch back to size max for vulkan
2025-06-24 23:14:24 +08:00
Concedo
ace537d44e
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/release.yml
# CMakeLists.txt
# examples/simple-chat/simple-chat.cpp
# src/llama-quant.cpp
# tools/run/run.cpp
# tools/server/README.md
2025-06-24 23:06:16 +08:00