Commit graph

447 commits

Author SHA1 Message Date
Concedo
ce58d1253f fixed build and workflow 2025-06-21 00:56:27 +08:00
Concedo
5cdb2d3fc6 cleanup 2025-06-11 01:35:40 +08:00
Concedo
8386546e08 Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
try change order (+3 squashed commit)

Squashed commit:

[457f02507] try newer jimver

[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.

[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commit)

Squashed commit:

[133e81633] try without pwsh

[4d99cefba] try without pwsh

[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Concedo
28b35ca879 allow wmma flag for rocm 2025-06-10 01:23:48 +08:00
Concedo
deece4be69 missed a build target 2025-06-09 17:05:56 +08:00
Concedo
6c5c8be48d try to make rocm work for the github ci, requires disabling rocwmma 2025-06-08 21:52:29 +08:00
Concedo
7132d6b15c test rocm rolling (+1 squashed commits)
Squashed commits:

[43c8f7fc6] test rocm rolling (+4 squashed commit)

Squashed commit:

[16a60aa77] test clobber 4

[a6c866450] test clobber 3

[9322f17f6] test clobber 2

[b7a420cbe] testing clobber
2025-06-08 15:33:05 +08:00
Concedo
abc272d89f breaking change: standardize ci binary names 2025-06-07 00:40:46 +08:00
Concedo
6effb65cfe change singleinstance order 2025-06-06 21:20:30 +08:00
Concedo
8b141d8647 stick to cu12.1 for linux for now 2025-06-06 17:38:28 +08:00
Concedo
eec5a8ad16 breaking change: due to cuda12 upgrade, release filenames will change. standardize them to windows naming for the future. (+1 squashed commits)
Squashed commits:

[75842919a] cuda12.4 test
2025-06-06 14:02:34 +08:00
Concedo
50a27793d3 upgrade windows runners to windows 2022, cu11 still uses vs2019
this should finally work (+21 squashed commit)

Squashed commit:

[5edac5b59] Revert "quick dbg"

This reverts commit fd62a997cc6684bb89242d5e7b0ae2aed83fd27f.

[fd62a997c] quick dbg

[bcccae7e6] sanity check 2

[568e2eb08] sanity check

[2f30d573a] please work 2

[cf8765221] please work

[c535e60d9] try a small trick

[d4ba79b80] 2022 test

[3f146b000] t2

[4a3b9a9b4] revert and test

[4bdc9a149] reverted test2

[5081cb4a3] reverted test

[ea9a826f3] broken test

[3c11ae389] compare 2019

[8ecec4fec] not for cu12

[0be964f3a] added vs2019 for the other runners

[5d24641cb] debugging 4

[1dee79207] debugging 3

[ab172f133] more debugging 2

[b1a895e84] more debugging

[5d21d8bd0] vs2019 setup
2025-06-06 14:02:34 +08:00
Concedo
a341188f84 add install for vs2019 2025-06-05 10:32:57 +08:00
Concedo
a74d8669b3 try hardcoded path (+1 squashed commits)
Squashed commits:

[711b43d9d] let's see if VS2019 can work
2025-06-05 10:26:02 +08:00
Concedo
f3bb947a13 cuda use wmma flash attention for turing (+1 squashed commits)
Squashed commits:

[3c5112398] 117 (+10 squashed commit)

Squashed commit:

[4f01bb2d4] 117 graphs 80v

[7549034ea] 117 graphs

[dabf9cb99] checking if cuda 11.5.2 works

[ba7ccdb7a] another try cu11.7 only

[752cf2ae5] increase aria2c download log rate

[dc4f198fd] test send turing to wmma flash attention

[496a22e83] temp build test cu11.7.0

[ca759c424] temp build test cu11.7

[c46ada17c] test build: enable virtual80 for oldcpu

[3ccfd939a] test build: with cuda graphs for all
2025-06-01 11:41:45 +08:00
henk717
b8883e254a KoboldCpp.sh updates (#1562)
* YR makefile upstream

* Create make_portable_rocm_libs.sh

* update makefile, support llama portable, ditch all unnecessary changes

* Delete make_portable_rocm_libs.sh should not be needed

* koboldcpp.sh updates

* Small rocm fixes

* ROCm is now a cuda version not a command

* Don't commit temp file

* Don't commit temp file

* 1200 has errors, removing it for now

* Only rebuild rocm with rebuild

* Update kcpp-build-release-linux.yaml

* Fix rocm filename

* ROCm Linux CI

* We need more diskspace

* Workaround for lockfile getting stuck

Why do I have to do hacks like this....

* Update kcpp-build-release-linux-rocm.yaml

* Dont apt update rocm

You don't allow us to apt update? Better not break things github!

* Container maybe?

* Turns out we aren't root, so we use sudo

* Cleanup ROCm CI PR

* Build for Runpods GPU

* We also need rocblas

* More cleanup just in case

* Update kcpp-build-release-linux-rocm.yaml

---------

Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2025-05-26 15:24:49 +08:00
Concedo
0dca953d78 removed winget workflow 2025-05-24 16:40:39 +08:00
Concedo
55cc9acec5 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/release.yml
#	README.md
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	tools/mtmd/CMakeLists.txt
#	tools/mtmd/clip.cpp
#	tools/mtmd/clip.h
2025-05-24 12:10:36 +08:00
Diego Devesa
b775345d78 ci : enable winget package updates (#13734) 2025-05-23 23:14:00 +03:00
Diego Devesa
a70a8a69c2 ci : add winget package updater (#13732) 2025-05-23 22:09:38 +02:00
Diego Devesa
3079e9ac8e release : fix windows hip release (#13707)
* release : fix windows hip release

* make single hip release with multiple targets
2025-05-23 00:21:37 +02:00
Concedo
fdca5ba71e declutter 2025-05-22 22:58:47 +08:00
Concedo
8bd6f9f9ae added a simple cross platform launch script for unpacked dirs 2025-05-22 22:09:46 +08:00
Diego Devesa
d643bb2c79 releases : build CPU backend separately (windows) (#13642) 2025-05-21 22:09:57 +02:00
Concedo
d04b4eeb04 merge not working 2025-05-21 18:06:41 +08:00
R0CKSTAR
33983057d0 musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647)
* musa: fix build warning (unused parameter)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: upgrade MUSA SDK version to rc4.0.1

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: use mudnn::Unary::IDENTITY op to accelerate D2D memory copy

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Update ggml/src/ggml-cuda/cpy.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* musa: remove MUDNN_CHECK_GEN and use CUDA_CHECK_GEN instead in MUDNN_CHECK

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-05-21 09:58:49 +08:00
Alberto Cabrera Pérez
f71f40a284 ci : upgraded oneAPI version in SYCL workflows and dockerfile (#13532) 2025-05-19 11:46:09 +01:00
Concedo
59300dbdf5 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/actions/windows-setup-curl/action.yml
#	.github/workflows/build-linux-cross.yml
#	README.md
#	common/CMakeLists.txt
#	examples/parallel/README.md
#	examples/parallel/parallel.cpp
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	tools/server/README.md
2025-05-18 23:27:53 +08:00
Concedo
be3e93c76a bundle AGPL license and llama.cpp's MIT license into binaries. clarified some licensing terms, updated readme (+1 squashed commits)
Squashed commits:

[61c152daf] bundle AGPL license and llama.cpp's MIT license into binaries. clarified some licensing terms, updated readme
2025-05-18 02:21:27 +08:00
Diego Devesa
415e40a357 releases : use arm version of curl for arm releases (#13592) 2025-05-16 19:36:51 +02:00
Sigbjørn Skjæret
7c07ac244d ci : add ppc64el to build-linux-cross (#13575) 2025-05-16 14:54:23 +02:00
Thammachart Chinvarapon
b064a51a4e ci: free_disk_space flag enabled for intel variant (#13426)
before cleanup: 20G
after cleanup: 44G
after all built and pushed: 24G

https://github.com/Thammachart/llama.cpp/actions/runs/14945093573/job/41987371245
2025-05-10 16:34:48 +02:00
Jeff Bolz
dc1d2adfc0 vulkan: scalar flash attention implementation (#13324)
* vulkan: scalar flash attention implementation

* vulkan: always use fp32 for scalar flash attention

* vulkan: use vector loads in scalar flash attention shader

* vulkan: remove PV matrix, helps with register usage

* vulkan: reduce register usage in scalar FA, but perf may be slightly worse

* vulkan: load each Q value once. optimize O reduction. more tuning

* vulkan: support q4_0/q8_0 KV in scalar FA

* CI: increase timeout to accommodate newly-supported tests

* vulkan: for scalar FA, select between 1 and 8 rows

* vulkan: avoid using Float16 capability in scalar FA
2025-05-10 08:07:07 +02:00
Concedo
2f5f4ee65a Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	common/CMakeLists.txt
2025-05-09 14:18:20 +08:00
Diego Devesa
15e03282bb ci : limit write permission to only the release step + fixes (#13392)
* ci : limit write permission to only the release step

* fix win cuda file name

* fix license file copy on multi-config generators
2025-05-08 23:45:22 +02:00
Concedo
2439014a03 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	examples/embedding/embedding.cpp
#	tools/imatrix/imatrix.cpp
#	tools/perplexity/perplexity.cpp
2025-05-08 23:41:02 +08:00
Diego Devesa
70a6991edf ci : move release workflow to a separate file (#13362) 2025-05-08 13:15:28 +02:00
Diego Devesa
814f795e06 docker : disable arm64 and intel images (#13356) 2025-05-07 16:36:33 +02:00
Concedo
b951310ca5 tryout smaller binaries 2025-05-07 14:56:34 +08:00
Diego Devesa
9f2da5871f llama : build windows releases with dl backends (#13220) 2025-05-04 14:20:49 +02:00
Diego Devesa
1d36b3670b llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00
Concedo
bc452da452 improved comfyui compatibility, tweaked hf search 2025-05-02 16:18:31 +08:00
bandoti
d24d592808 ci: fix cross-compile sync issues (#12804) 2025-05-01 19:06:39 -03:00
bandoti
00137157fc Disable CI cross-compile builds (#13022) 2025-04-19 18:05:03 +02:00
Concedo
4b0f63ed62 cleanup 2025-04-18 22:57:10 +08:00
hipudding
54a7272043 CANN: Add x86 build ci (#12950)
* CANN: Add x86 build ci

* CANN: fix code format
2025-04-15 12:08:55 +01:00
Concedo
c94aec1930 update workflows, update gemma default adapter sysprompt 2025-04-12 18:38:23 +08:00
Concedo
b42fa821d8 try allow build from commit hash 2025-04-12 13:37:10 +08:00
Concedo
7a7bdeab6d json to gbnf endpoint added 2025-04-12 11:41:11 +08:00
R0CKSTAR
8ac9f5d765 ci : Replace freediskspace to free_disk_space in docker.yml (#12861)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-04-11 09:26:17 +02:00