Commit graph

655 commits

Author SHA1 Message Date
Concedo
1a4f54dd11 update for cu13 builds (no ci will be provided) 2025-09-26 16:01:43 +08:00
Concedo
326f6f3fad not sure if working on metal 2025-09-21 11:35:02 +08:00
tsite
04498a345a
update makefile to clone llguidance if the directory does not exist (#1743)
also remove llguidance when running 'make clean'
2025-09-21 08:40:55 +08:00
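The clone-if-missing plus `make clean` behavior this commit describes can be sketched in shell as follows (a minimal sketch: the directory name is from the commit, but the function names are illustrative and `mkdir` stands in for the actual `git clone` so the sketch runs offline):

```shell
# Sketch of the pattern described in the commit: fetch a vendored dependency
# only when its directory is missing, and remove it again on "make clean".
cd "$(mktemp -d)"        # work in a scratch directory so the sketch is safe
DEP_DIR="llguidance"

fetch_dep() {
    # Clone only when the directory is absent; mkdir stands in for
    # "git clone <llguidance repo> $DEP_DIR" in this offline sketch.
    if [ ! -d "$DEP_DIR" ]; then
        mkdir -p "$DEP_DIR"
    fi
}

clean_dep() {
    # Mirrors the 'make clean' part of the commit: remove the cloned directory.
    rm -rf "$DEP_DIR"
}

fetch_dep
```

Running `fetch_dep` twice is harmless, which is the point of the existence check.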
Concedo
fddd046f9d metal common 2025-09-15 01:58:32 +08:00
Concedo
a5580a32fb fix cuda and macos compile issues 2025-09-12 20:53:42 +08:00
tsite
27c443f01e
add support for llguidance (#1728)
* add llguidance

remove tab indentation for makefile if statements - these are dangerous
fix broken tool compilation commands
add USE_LLGUIDANCE env var to enable llguidance for faster structured
output generation
add llguidance as an optional submodule

* rm submodule
2025-09-11 16:46:03 +08:00
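The opt-in `USE_LLGUIDANCE` env var mentioned in this commit can be sketched as a shell-level gate (the variable name is from the commit message; the emitted `-DUSE_LLGUIDANCE` compile flag and the function name are assumptions, not the repo's actual build logic):

```shell
# Sketch of an opt-in env-var feature gate like USE_LLGUIDANCE.
# Emits a compile flag only when the user explicitly sets USE_LLGUIDANCE=1.
llguidance_flags() {
    if [ "${USE_LLGUIDANCE:-0}" = "1" ]; then
        printf '%s' "-DUSE_LLGUIDANCE"
    fi
}
```

Defaulting the unset case to "off" keeps the optional dependency out of ordinary builds.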
Concedo
f7fa283bb6 indentation fix for makefile 2025-09-11 16:41:51 +08:00
Concedo
52ff99805c fixed windows 7 compat builds 2025-08-25 10:36:13 +08:00
Concedo
ed5e7a3062 fix for some old android devices 2025-08-24 01:34:54 +08:00
Concedo
8b8396c30c Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	README.md
#	docs/build-s390x.md
#	examples/llama.vim
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/common.h
#	scripts/compare-llama-bench.py
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tools/llama-bench/README.md
#	tools/llama-bench/llama-bench.cpp
#	tools/server/README.md
2025-08-23 11:35:28 +08:00
Concedo
b50f94ae27 this commit removes ggml_cuda_f16 targets. Merge commit '7a6e91ad26' into concedo_experimental
# Conflicts:
#	docs/build.md
#	docs/multimodal/MobileVLM.md
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
2025-08-21 19:25:29 +08:00
Daniel Bevenius
37f10f955f
make : remove make in favor of CMake (#15449)
This commit removes the content from the Makefile and updates the
current deprecation message to state that `make` has been replaced
by CMake.

The message when `make` is invoked will now be the following:
```console
$ make
Makefile:6: *** Build system changed:
 The Makefile build has been replaced by CMake.

 For build instructions see:
 https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md

.  Stop.
```

The motivation for this is that many, if not all, targets now fail to
build after changes to the system, and `make` has also been deprecated
for some time.
2025-08-20 13:31:16 +03:00
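The console output quoted in the commit message would come from a stub along these lines (a minimal sketch, not the repo's actual Makefile; the `Makefile:6:` prefix in the quoted output indicates the real `$(error ...)` sits at line 6 of the file):

```makefile
# Minimal sketch of a Makefile stub that aborts every invocation with a
# migration notice; make prints it as "Makefile:N: *** <message>.  Stop."
$(error Build system changed: the Makefile build has been replaced by CMake; see docs/build.md)
```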
Concedo
35707f4e97 split vulkan into two compilation units for faster build 2025-08-20 12:12:47 +08:00
Concedo
67ef5e6c02 phonemizer fixes, now kokoro works very well 2025-08-18 16:13:16 +08:00
Concedo
52606e9b1d tts cpp model is now loadable in kcpp 2025-08-17 15:47:22 +08:00
Concedo
9935ac093f standardize tts linting and formatting 2025-08-17 14:11:30 +08:00
Concedo
cfc1a0d4ef tts cpp cli builds and runs fine. 2025-08-17 13:53:27 +08:00
Concedo
bc04366a65 builds but crashes 2025-08-17 00:09:03 +08:00
Concedo
67e0072245 fixed clblast repacking 2025-08-09 01:08:02 +08:00
Concedo
d37529c0cd add sanitize flag 2025-08-04 22:19:23 +08:00
Concedo
4db8ba6228 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-sycl/gemm.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/set_rows.cpp
2025-07-14 23:16:44 +08:00
Concedo
dca49de059 fixed qwen2 audio issues, works fine now (+3 squashed commit)
Squashed commit:

[b3053a1ba] updated lite

[5071630d6] fixed mtmd issues, audio works

[06efa5af4] fix mtmd compile
2025-07-12 18:54:41 +08:00
Concedo
e9473305d0 wip2 (+1 squashed commits)
Squashed commits:

[4628777b6] wip
2025-07-12 18:54:40 +08:00
Concedo
f8a49aa8e6 fixed a typo 2025-07-08 11:41:09 +08:00
Concedo
18cd46a6db allow people to manually override gfx12 fa 2025-07-05 11:33:30 +08:00
Concedo
abc1d8ac25 better way of checking for avx2 support 2025-06-22 22:56:50 +08:00
Concedo
45f589b78d test gfx1200 again 2025-06-21 17:56:04 +08:00
Concedo
b59b5dbbd1 Merge commit '456af35eb7' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-sycl/getrows.cpp
#	src/CMakeLists.txt
#	tools/llama-bench/llama-bench.cpp
2025-06-20 23:41:27 +08:00
Concedo
33809c9e82 doing what i must because i can, after the mess that is https://github.com/ggml-org/llama.cpp/pull/13892
there is so much duplicate code in each cpu arch, i expect upstream will prune it eventually
arch detection has no fallback if none of the arches are found, by right we should set GGML_CPU_GENERIC
i should be relaxing, it's the weekend
2025-06-14 01:41:16 +08:00
Concedo
f50c793140 not working - refactoring 2025-06-14 00:03:21 +08:00
Concedo
7a688e07cd remove gfx12 until amd wakes up 2025-06-12 16:52:55 +08:00
Concedo
1970d8c9e8 uvos said it might work 2025-06-12 16:44:46 +08:00
Concedo
8386546e08 Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
try change order (+3 squashed commit)

Squashed commit:

[457f02507] try newer jimver

[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.

[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commit)

Squashed commit:

[133e81633] try without pwsh

[4d99cefba] try without pwsh

[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Concedo
28b35ca879 allow wmma flag for rocm 2025-06-10 01:23:48 +08:00
Concedo
7d8aa31f1f fixed embeddings, added new parameter to limit max embeddings context 2025-06-10 01:11:55 +08:00
xctan
f470bc36be
ggml-cpu : split arch-specific implementations (#13892)
* move ggml-cpu-aarch64 to repack

* split quantize_row_q8_0/1

* split helper functions

* split ggml_vec_dot_q4_0_q8_0

* split ggml_vec_dot_q4_1_q8_1

* split ggml_vec_dot_q5_0_q8_0

* split ggml_vec_dot_q5_1_q8_1

* split ggml_vec_dot_q8_0_q8_0

* split ggml_vec_dot_tq1_0_q8_K

* split ggml_vec_dot_tq2_0_q8_K

* split ggml_vec_dot_q2_K_q8_K

* split ggml_vec_dot_q3_K_q8_K

* split ggml_vec_dot_q4_K_q8_K

* split ggml_vec_dot_q5_K_q8_K

* split ggml_vec_dot_q6_K_q8_K

* split ggml_vec_dot_iq2_xxs_q8_K

* split ggml_vec_dot_iq2_xs_q8_K

* split ggml_vec_dot_iq2_s_q8_K

* split ggml_vec_dot_iq3_xxs_q8_K

* split ggml_vec_dot_iq3_s_q8_K

* split ggml_vec_dot_iq1_s_q8_K

* split ggml_vec_dot_iq1_m_q8_K

* split ggml_vec_dot_iq4_nl_q8_0

* split ggml_vec_dot_iq4_xs_q8_K

* fix typos

* fix missing prototypes

* rename ggml-cpu-quants.c

* rename ggml-cpu-traits

* rename arm folder

* move cpu-feats-x86.cpp

* rename ggml-cpu-hbm

* update arm detection macro in quants.c

* move iq quant tables

* split ggml_quantize_mat_q8_0/K

* split ggml_gemv_*

* split ggml_gemm_*

* rename namespace aarch64 to repack

* use weak aliases to replace test macros

* rename GGML_CPU_AARCH64 to GGML_CPU_REPACK

* rename more aarch64 to repack

* clean up rebase leftover

* fix compilation errors

* remove trailing spaces

* try to fix clang compilation errors

* try to fix clang compilation errors again

* try to fix clang compilation errors, 3rd attempt

* try to fix clang compilation errors, 4th attempt

* try to fix clang compilation errors, 5th attempt

* try to fix clang compilation errors, 6th attempt

* try to fix clang compilation errors, 7th attempt

* try to fix clang compilation errors, 8th attempt

* try to fix clang compilation errors, 9th attempt

* more cleanup

* fix compilation errors

* fix apple targets

* fix a typo in arm version of ggml_vec_dot_q4_K_q8_K

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-09 16:47:13 +02:00
Concedo
6c5c8be48d try to make rocm work for the github ci, requires disabling rocwmma 2025-06-08 21:52:29 +08:00
Concedo
a80dfa5c10 various minor fixes 2025-06-08 01:11:42 +08:00
Concedo
301450b1eb attempt to use system glslc first before using bundled glslc 2025-06-07 16:54:25 +08:00
Concedo
d18938fc70 fixed build 2025-06-06 18:05:44 +08:00
Concedo
eec5a8ad16 breaking change: due to cuda12 upgrade, release filenames will change. standardize them to windows naming for the future. (+1 squashed commits)
Squashed commits:

[75842919a] cuda12.4 test
2025-06-06 14:02:34 +08:00
Concedo
b08dca65ed Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/CMakeLists.txt
#	common/arg.cpp
#	common/chat.cpp
#	examples/parallel/README.md
#	examples/parallel/parallel.cpp
#	ggml/cmake/common.cmake
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	models/ggml-vocab-bert-bge.gguf.inp
#	models/ggml-vocab-bert-bge.gguf.out
#	models/ggml-vocab-command-r.gguf.inp
#	models/ggml-vocab-command-r.gguf.out
#	models/ggml-vocab-deepseek-coder.gguf.inp
#	models/ggml-vocab-deepseek-coder.gguf.out
#	models/ggml-vocab-deepseek-llm.gguf.inp
#	models/ggml-vocab-deepseek-llm.gguf.out
#	models/ggml-vocab-falcon.gguf.inp
#	models/ggml-vocab-falcon.gguf.out
#	models/ggml-vocab-gpt-2.gguf.inp
#	models/ggml-vocab-gpt-2.gguf.out
#	models/ggml-vocab-llama-bpe.gguf.inp
#	models/ggml-vocab-llama-bpe.gguf.out
#	models/ggml-vocab-llama-spm.gguf.inp
#	models/ggml-vocab-llama-spm.gguf.out
#	models/ggml-vocab-mpt.gguf.inp
#	models/ggml-vocab-mpt.gguf.out
#	models/ggml-vocab-phi-3.gguf.inp
#	models/ggml-vocab-phi-3.gguf.out
#	models/ggml-vocab-qwen2.gguf.inp
#	models/ggml-vocab-qwen2.gguf.out
#	models/ggml-vocab-refact.gguf.inp
#	models/ggml-vocab-refact.gguf.out
#	models/ggml-vocab-starcoder.gguf.inp
#	models/ggml-vocab-starcoder.gguf.out
#	requirements/requirements-gguf_editor_gui.txt
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-json-schema-to-grammar.cpp
#	tools/mtmd/CMakeLists.txt
#	tools/run/run.cpp
#	tools/server/CMakeLists.txt
2025-05-31 13:04:21 +08:00
Concedo
c987abf9f5 Merge commit '763d06edb7' into concedo_experimental
# Conflicts:
#	.github/workflows/build-linux-cross.yml
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cann/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	tools/mtmd/CMakeLists.txt
#	tools/mtmd/clip.cpp
#	tools/mtmd/mtmd.cpp
#	tools/server/CMakeLists.txt
2025-05-31 12:44:18 +08:00
henk717
b8883e254a
KoboldCpp.sh updates (#1562)
* YR makefile upstream

* Create make_portable_rocm_libs.sh

* update makefile, support llama portable, ditch all unnecessary changes

* Delete make_portable_rocm_libs.sh should not be needed

* koboldcpp.sh updates

* Small rocm fixes

* ROCm is now a cuda version not a command

* Don't commit temp file

* Don't commit temp file

* 1200 has errors, removing it for now

* Only rebuild rocm with rebuild

* Update kcpp-build-release-linux.yaml

* Fix rocm filename

* ROCm Linux CI

* We need more diskspace

* Workaround for lockfile getting stuck

Why do I have to do hacks like this....

* Update kcpp-build-release-linux-rocm.yaml

* Dont apt update rocm

You don't allow us to apt update? Better not break things github!

* Container maybe?

* Turns out we aren't root, so we use sudo

* Cleanup ROCm CI PR

* Build for Runpods GPU

* We also need rocblas

* More cleanup just in case

* Update kcpp-build-release-linux-rocm.yaml

---------

Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2025-05-26 15:24:49 +08:00
Concedo
60268de62c update targets for rocm 2025-05-25 18:41:15 +08:00
Concedo
499283c63a rename define to match upstream 2025-05-23 17:10:12 +08:00
Concedo
dec3cd92b0 fix cuda compile 2025-05-13 02:15:33 +08:00
Concedo
40eb3a54c4 rename some tooltip texts 2025-05-11 22:50:40 +08:00
Concedo
5cf5f35540 added vulkan build target for main.exe 2025-05-11 21:53:08 +08:00
Georgi Gerganov
4773d7a02f
examples : remove infill (#13283)
ggml-ci
2025-05-07 10:28:02 +03:00