Concedo
1a4f54dd11
update for cu13 builds (no ci will be provided)
2025-09-26 16:01:43 +08:00
Concedo
326f6f3fad
not sure if working on metal
2025-09-21 11:35:02 +08:00
tsite
04498a345a
update makefile to clone llguidance if the directory does not exist (#1743)
...
also remove llguidance when running 'make clean'
2025-09-21 08:40:55 +08:00
Concedo
fddd046f9d
metal common
2025-09-15 01:58:32 +08:00
Concedo
a5580a32fb
fix cuda and macos compile issues
2025-09-12 20:53:42 +08:00
tsite
27c443f01e
add support for llguidance (#1728)
...
* add llguidance
remove tab indentation for makefile if statements - these are dangerous
fix broken tool compilation commands
add USE_LLGUIDANCE env var to enable llguidance for faster structured
output generation
add llguidance as an optional submodule
* rm submodule
2025-09-11 16:46:03 +08:00
Concedo
f7fa283bb6
indentation fix for makefile
2025-09-11 16:41:51 +08:00
Concedo
52ff99805c
fixed windows 7 compat builds
2025-08-25 10:36:13 +08:00
Concedo
ed5e7a3062
fix for some old android devices
2025-08-24 01:34:54 +08:00
Concedo
8b8396c30c
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# docs/build-s390x.md
# examples/llama.vim
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/common.h
# scripts/compare-llama-bench.py
# src/CMakeLists.txt
# tests/test-backend-ops.cpp
# tools/llama-bench/README.md
# tools/llama-bench/llama-bench.cpp
# tools/server/README.md
2025-08-23 11:35:28 +08:00
Concedo
b50f94ae27
this commit removes ggml_cuda_f16 targets. Merge commit '7a6e91ad26' into concedo_experimental
...
# Conflicts:
# docs/build.md
# docs/multimodal/MobileVLM.md
# ggml/CMakeLists.txt
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-musa/CMakeLists.txt
2025-08-21 19:25:29 +08:00
Daniel Bevenius
37f10f955f
make : remove make in favor of CMake (#15449)
...
This commit removes the content from the Makefile and updates the
current deprecation message to state that `make` has been replaced
by CMake.
The message when `make` is invoked will now be the following:
```console
$ make
Makefile:6: *** Build system changed:
The Makefile build has been replaced by CMake.
For build instructions see:
https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
. Stop.
```
The motivation for this is that many, if not all, targets now fail to
build after changes to the system, and `make` has also been deprecated
for some time now.
2025-08-20 13:31:16 +03:00
Concedo
35707f4e97
split vulkan into two compilation units for faster build
2025-08-20 12:12:47 +08:00
Concedo
67ef5e6c02
phonemizer fixes, now kokoro works very well
2025-08-18 16:13:16 +08:00
Concedo
52606e9b1d
tts cpp model is now loadable in kcpp
2025-08-17 15:47:22 +08:00
Concedo
9935ac093f
standardize tts linting and formatting
2025-08-17 14:11:30 +08:00
Concedo
cfc1a0d4ef
tts cpp cli builds and runs fine.
2025-08-17 13:53:27 +08:00
Concedo
bc04366a65
builds but crashes
2025-08-17 00:09:03 +08:00
Concedo
67e0072245
fixed clblast repacking
2025-08-09 01:08:02 +08:00
Concedo
d37529c0cd
add sanitize flag
2025-08-04 22:19:23 +08:00
Concedo
4db8ba6228
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# ggml/src/ggml-sycl/gemm.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/set_rows.cpp
2025-07-14 23:16:44 +08:00
Concedo
dca49de059
fixed qwen2 audio issues, works fine now (+3 squashed commit)
...
Squashed commit:
[b3053a1ba] updated lite
[5071630d6] fixed mtmd issues, audio works
[06efa5af4] fix mtmd compile
2025-07-12 18:54:41 +08:00
Concedo
e9473305d0
wip2 (+1 squashed commits)
...
Squashed commits:
[4628777b6] wip
2025-07-12 18:54:40 +08:00
Concedo
f8a49aa8e6
fixed a typo
2025-07-08 11:41:09 +08:00
Concedo
18cd46a6db
allow people to manually override gfx12 fa
2025-07-05 11:33:30 +08:00
Concedo
abc1d8ac25
better way of checking for avx2 support
2025-06-22 22:56:50 +08:00
Concedo
45f589b78d
test gfx1200 again
2025-06-21 17:56:04 +08:00
Concedo
b59b5dbbd1
Merge commit '456af35eb7' into concedo_experimental
...
# Conflicts:
# ggml/src/ggml-sycl/getrows.cpp
# src/CMakeLists.txt
# tools/llama-bench/llama-bench.cpp
2025-06-20 23:41:27 +08:00
Concedo
33809c9e82
doing what i must because i can, after the mess that is https://github.com/ggml-org/llama.cpp/pull/13892
...
there is so much duplicate code in each cpu arch, i expect upstream will prune it eventually
arch detection has no fallback if all the arches are not found, by right we should set GGML_CPU_GENERIC
i should be relaxing, it's the weekend
2025-06-14 01:41:16 +08:00
Concedo
f50c793140
not working - refactoring
2025-06-14 00:03:21 +08:00
Concedo
7a688e07cd
remove gfx12 until amd wakes up
2025-06-12 16:52:55 +08:00
Concedo
1970d8c9e8
uvos said it might work
2025-06-12 16:44:46 +08:00
Concedo
8386546e08
Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
...
try change order (+3 squashed commit)
Squashed commit:
[457f02507] try newer jimver
[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.
[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commit)
Squashed commit:
[133e81633] try without pwsh
[4d99cefba] try without pwsh
[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Concedo
28b35ca879
allow wmma flag for rocm
2025-06-10 01:23:48 +08:00
Concedo
7d8aa31f1f
fixed embeddings, added new parameter to limit max embeddings context
2025-06-10 01:11:55 +08:00
xctan
f470bc36be
ggml-cpu : split arch-specific implementations (#13892)
...
* move ggml-cpu-aarch64 to repack
* split quantize_row_q8_0/1
* split helper functions
* split ggml_vec_dot_q4_0_q8_0
* split ggml_vec_dot_q4_1_q8_1
* split ggml_vec_dot_q5_0_q8_0
* split ggml_vec_dot_q5_1_q8_1
* split ggml_vec_dot_q8_0_q8_0
* split ggml_vec_dot_tq1_0_q8_K
* split ggml_vec_dot_tq2_0_q8_K
* split ggml_vec_dot_q2_K_q8_K
* split ggml_vec_dot_q3_K_q8_K
* split ggml_vec_dot_q4_K_q8_K
* split ggml_vec_dot_q5_K_q8_K
* split ggml_vec_dot_q6_K_q8_K
* split ggml_vec_dot_iq2_xxs_q8_K
* split ggml_vec_dot_iq2_xs_q8_K
* split ggml_vec_dot_iq2_s_q8_K
* split ggml_vec_dot_iq3_xxs_q8_K
* split ggml_vec_dot_iq3_s_q8_K
* split ggml_vec_dot_iq1_s_q8_K
* split ggml_vec_dot_iq1_m_q8_K
* split ggml_vec_dot_iq4_nl_q8_0
* split ggml_vec_dot_iq4_xs_q8_K
* fix typos
* fix missing prototypes
* rename ggml-cpu-quants.c
* rename ggml-cpu-traits
* rename arm folder
* move cpu-feats-x86.cpp
* rename ggml-cpu-hbm
* update arm detection macro in quants.c
* move iq quant tables
* split ggml_quantize_mat_q8_0/K
* split ggml_gemv_*
* split ggml_gemm_*
* rename namespace aarch64 to repack
* use weak aliases to replace test macros
* rename GGML_CPU_AARCH64 to GGML_CPU_REPACK
* rename more aarch64 to repack
* clean up rebase leftover
* fix compilation errors
* remove trailing spaces
* try to fix clang compilation errors
* try to fix clang compilation errors again
* try to fix clang compilation errors, 3rd attempt
* try to fix clang compilation errors, 4th attempt
* try to fix clang compilation errors, 5th attempt
* try to fix clang compilation errors, 6th attempt
* try to fix clang compilation errors, 7th attempt
* try to fix clang compilation errors, 8th attempt
* try to fix clang compilation errors, 9th attempt
* more cleanup
* fix compilation errors
* fix apple targets
* fix a typo in arm version of ggml_vec_dot_q4_K_q8_K
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-09 16:47:13 +02:00
Concedo
6c5c8be48d
try to make rocm work for the github ci, requires disabling rocwmma
2025-06-08 21:52:29 +08:00
Concedo
a80dfa5c10
various minor fixes
2025-06-08 01:11:42 +08:00
Concedo
301450b1eb
attempt to use system glslc first before using bundled glslc
2025-06-07 16:54:25 +08:00
Concedo
d18938fc70
fixed build
2025-06-06 18:05:44 +08:00
Concedo
eec5a8ad16
breaking change: due to cuda12 upgrade, release filenames will change. standardize them to windows naming for the future. (+1 squashed commits)
...
Squashed commits:
[75842919a] cuda12.4 test
2025-06-06 14:02:34 +08:00
Concedo
b08dca65ed
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/CMakeLists.txt
# common/arg.cpp
# common/chat.cpp
# examples/parallel/README.md
# examples/parallel/parallel.cpp
# ggml/cmake/common.cmake
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/rope.cpp
# models/ggml-vocab-bert-bge.gguf.inp
# models/ggml-vocab-bert-bge.gguf.out
# models/ggml-vocab-command-r.gguf.inp
# models/ggml-vocab-command-r.gguf.out
# models/ggml-vocab-deepseek-coder.gguf.inp
# models/ggml-vocab-deepseek-coder.gguf.out
# models/ggml-vocab-deepseek-llm.gguf.inp
# models/ggml-vocab-deepseek-llm.gguf.out
# models/ggml-vocab-falcon.gguf.inp
# models/ggml-vocab-falcon.gguf.out
# models/ggml-vocab-gpt-2.gguf.inp
# models/ggml-vocab-gpt-2.gguf.out
# models/ggml-vocab-llama-bpe.gguf.inp
# models/ggml-vocab-llama-bpe.gguf.out
# models/ggml-vocab-llama-spm.gguf.inp
# models/ggml-vocab-llama-spm.gguf.out
# models/ggml-vocab-mpt.gguf.inp
# models/ggml-vocab-mpt.gguf.out
# models/ggml-vocab-phi-3.gguf.inp
# models/ggml-vocab-phi-3.gguf.out
# models/ggml-vocab-qwen2.gguf.inp
# models/ggml-vocab-qwen2.gguf.out
# models/ggml-vocab-refact.gguf.inp
# models/ggml-vocab-refact.gguf.out
# models/ggml-vocab-starcoder.gguf.inp
# models/ggml-vocab-starcoder.gguf.out
# requirements/requirements-gguf_editor_gui.txt
# tests/CMakeLists.txt
# tests/test-chat.cpp
# tests/test-grammar-integration.cpp
# tests/test-json-schema-to-grammar.cpp
# tools/mtmd/CMakeLists.txt
# tools/run/run.cpp
# tools/server/CMakeLists.txt
2025-05-31 13:04:21 +08:00
Concedo
c987abf9f5
Merge commit '763d06edb7' into concedo_experimental
...
# Conflicts:
# .github/workflows/build-linux-cross.yml
# ggml/CMakeLists.txt
# ggml/src/ggml-cann/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-vulkan/CMakeLists.txt
# tools/mtmd/CMakeLists.txt
# tools/mtmd/clip.cpp
# tools/mtmd/mtmd.cpp
# tools/server/CMakeLists.txt
2025-05-31 12:44:18 +08:00
henk717
b8883e254a
KoboldCpp.sh updates (#1562)
...
* YR makefile upstream
* Create make_portable_rocm_libs.sh
* update makefile, support llama portable, ditch all unnecessary changes
* Delete make_portable_rocm_libs.sh should not be needed
* koboldcpp.sh updates
* Small rocm fixes
* ROCm is now a cuda version not a command
* Don't commit temp file
* Don't commit temp file
* 1200 has errors, removing it for now
* Only rebuild rocm with rebuild
* Update kcpp-build-release-linux.yaml
* Fix rocm filename
* ROCm Linux CI
* We need more diskspace
* Workaround for lockfile getting stuck
Why do I have to do hacks like this....
* Update kcpp-build-release-linux-rocm.yaml
* Dont apt update rocm
You don't allow us to apt update? Better not break things github!
* Container maybe?
* Turns out we aren't root, so we use sudo
* Cleanup ROCm CI PR
* Build for Runpods GPU
* We also need rocblas
* More cleanup just in case
* Update kcpp-build-release-linux-rocm.yaml
---------
Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2025-05-26 15:24:49 +08:00
Concedo
60268de62c
update targets for rocm
2025-05-25 18:41:15 +08:00
Concedo
499283c63a
rename define to match upstream
2025-05-23 17:10:12 +08:00
Concedo
dec3cd92b0
fix cuda compile
2025-05-13 02:15:33 +08:00
Concedo
40eb3a54c4
rename some tooltip texts
2025-05-11 22:50:40 +08:00
Concedo
5cf5f35540
added vulkan build target for main.exe
2025-05-11 21:53:08 +08:00
Georgi Gerganov
4773d7a02f
examples : remove infill (#13283)
...
ggml-ci
2025-05-07 10:28:02 +03:00