Neo Zhang Jianyu
0bbd2262a3
restore the condition to build & update package when merge ( #10507 )
...
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-11-26 21:43:47 +08:00
Diego Devesa
7db3846a94
ci : publish the docker images created during scheduled runs ( #10515 )
2024-11-26 13:05:20 +01:00
Diego Devesa
c6807b3f28
ci : add ubuntu cuda build, build with one arch on windows ( #10456 )
2024-11-26 13:05:07 +01:00
Diego Devesa
50d5cecbda
ci : build docker images only once daily ( #10503 )
2024-11-25 22:05:39 +01:00
Johannes Gäßler
1f922254f0
Github: update issue templates [no ci] ( #10489 )
2024-11-25 19:18:37 +01:00
Neo Zhang Jianyu
5a8987793f
[SYCL] Fix building Win package for oneAPI 2025.0 update ( #10483 )
...
* fix build package for 2025.0
* debug
* debug
* fix
* rm debug
---------
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-11-25 17:31:10 +08:00
Concedo
afc575fbd8
cleanup, try to add version tagging
2024-11-23 12:59:06 +08:00
蕭澧邦
6dfcfef078
ci: Update oneAPI runtime dll packaging ( #10428 )
...
This is the minimum runtime dll dependencies for oneAPI 2025.0
2024-11-22 10:44:08 +01:00
Johannes Gäßler
599b3e0cd4
GitHub: ask for more info in issue templates ( #10426 )
...
* GitHub: ask for more info in issues [no ci]
* refactor issue templates to be component-specific
* more understandable issue description
* add dropdown for llama.cpp module
2024-11-22 08:32:40 +01:00
Concedo
dbbdb2eedc
try fix macos build again (+3 squashed commit)
...
Squashed commit:
[7d2a67132] fix ci builds
[f0a5f0a97] fixed a typo
[8736d9034] try fix ci builds (+1 squashed commits)
Squashed commits:
[c2ae5a542] Revert "updated ci"
This reverts commit d8ebdde6ee.
2024-11-21 23:15:51 +08:00
Concedo
d8ebdde6ee
updated ci
2024-11-21 18:23:31 +08:00
Concedo
f6e9d11636
try with 2 parallel jobs
2024-11-17 01:46:41 +08:00
R0CKSTAR
f0204a0ec7
ci: build test musa with cmake ( #10298 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-11-15 12:47:25 +01:00
Concedo
fedc3874bd
try fix build inconsistency
2024-11-15 14:12:53 +08:00
Concedo
d595a80abc
update prints
2024-11-15 14:10:02 +08:00
Romain Biessy
5a54af4d4f
sycl: Use syclcompat::dp4a ( #10267 )
...
* sycl: Use syclcompat::dp4a
* Using the syclcompat version allows the compiler to optimize the
operation with a native function
* Update news section
* Update CI Windows oneAPI version to 2025.0
* Reword doc
* Call syclcompat::dp4a inside dpct::dp4a
This reverts commit 90cb61d692d61360b46954a1c7f780bd2e569b73.
2024-11-15 11:09:12 +08:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries ( #10256 )
...
* ggml : build backends as libraries
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00
Georgi Gerganov
ec450d3bbf
metal : opt-in compile flag for BF16 ( #10218 )
...
* metal : opt-in compile flag for BF16
ggml-ci
* ci : use BF16
ggml-ci
* swift : switch back to v12
* metal : has_float -> use_float
ggml-ci
* metal : fix BF16 check in MSL
ggml-ci
2024-11-08 21:59:46 +02:00
Eve
3407364776
Q6_K AVX improvements ( #10118 )
...
* q6_k instruction reordering attempt
* better subtract method
* should be theoretically faster
small improvement with shuffle lut, likely because all loads are already done at that stage
* optimize bit fiddling
* handle -32 offset separately. bsums exists for a reason!
* use shift
* Update ggml-quants.c
* have to update ci macos version to 13 as 12 doesn't work now. 13 is still x86
2024-11-04 23:06:31 +01:00
Concedo
4ae06b4a64
print some env vars for win ci
2024-11-01 23:58:41 +08:00
R0CKSTAR
cf8e0a3bb9
musa: add docker image support ( #9685 )
...
* mtgpu: add docker image support
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* mtgpu: enable docker workflow
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-10-10 20:10:37 +02:00
Xuan Son Nguyen
f3fdcfaa79
ci : fine-grant permission ( #9710 )
2024-10-04 11:47:19 +02:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces ( #9707 )
...
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
serhii-nakon
6f1d9d71f4
Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS ( #9641 )
...
* Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS
* Set ROCM_DOCKER_ARCH as a string, due to it building incorrectly and causing an OOM exit code
2024-09-30 20:57:12 +02:00
compilade
511636df0c
ci : reduce severity of unused Pyright ignore comments ( #9697 )
2024-09-30 14:13:16 -04:00
Neo Zhang Jianyu
95bc82fbc0
[SYCL] add missed dll file in package ( #9577 )
...
* update oneapi to 2024.2
* use 2024.1
---------
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-09-26 17:38:31 +08:00
Xuan Son Nguyen
ea9c32be71
ci : fix docker build number and tag name ( #9638 )
...
* ci : fix docker build number and tag name
* fine-grant permissions
2024-09-25 17:26:01 +02:00
Huang Qi
e948a7da7a
CI: Provide prebuilt windows binary for hip ( #9467 )
2024-09-21 02:39:41 +02:00
Georgi Gerganov
6262d13e0b
common : reimplement logging ( #9418 )
...
https://github.com/ggerganov/llama.cpp/pull/9418
2024-09-15 20:46:12 +03:00
Mathijs Henquet
78203641fe
server : Add option to return token pieces in /tokenize endpoint ( #9108 )
...
* server : added with_pieces functionality to /tokenize endpoint
* server : Add tokenize with pieces tests to server.feature
* Handle case if tokenizer splits along utf8 continuation bytes
* Add example of token splitting
* Remove trailing ws
* Fix trailing ws
* Maybe fix ci
* maybe this fixes windows ci?
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2024-09-12 22:30:11 +02:00
Huang Qi
4dc4f5f14a
ci : update HIP SDK to 24.Q3 (ROCm 6.1) ( #9329 )
2024-09-12 14:28:43 +03:00
Trivikram Kamat
3c26a1644d
ci : bump actions/checkout to v4 ( #9377 )
2024-09-12 14:27:45 +03:00
slaren
6c89eb0b47
ci : disable rocm image creation ( #9340 )
2024-09-07 10:48:54 +03:00
awatuna
32b2ec88bc
Update build.yml ( #9184 )
...
build rpc-server for windows cuda
2024-09-06 00:34:36 +02:00
slaren
9fe94ccac9
docker : build images only once ( #9225 )
2024-08-28 17:28:00 +02:00
Georgi Gerganov
d5492f0525
ci : disable bench workflow ( #9010 )
2024-08-15 10:11:11 +03:00
Diogo Teles Sant'Anna
fc4ca27b25
ci : fix github workflow vulnerable to script injection ( #9008 )
...
Signed-off-by: Diogo Teles Sant'Anna <diogoteles@google.com>
2024-08-12 19:28:23 +03:00
Radoslav Gerganov
1f67436c5e
ci : enable RPC in all of the released builds ( #9006 )
...
ref: #8912
2024-08-12 19:17:03 +03:00
Georgi Gerganov
d3ae0ee8d7
py : fix requirements check '==' -> '~=' ( #8982 )
...
* py : fix requirements check '==' -> '~='
* cont : fix the fix
* ci : run on all requirements.txt
2024-08-12 11:02:01 +03:00
Concedo
03adb90dc6
prompt command done
2024-08-07 20:52:28 +08:00
Concedo
c7108742f4
fix typo
2024-08-06 17:24:58 +08:00
henk717
0d534d810f
Mac builds ( #1037 )
...
* OSX attempt 1
* OSX Pyinstaller
* Update kcpp-build-release-osx.yaml
* Update kcpp-build-release-osx.yaml
* Update kcpp-build-release-osx.yaml
* Add .metal file
* Update kcpp-build-release-osx.yaml
* Polish Mac
(cherry picked from commit 52cc0daa1b)
2024-08-06 17:11:19 +08:00
Johannes Gäßler
6eeaeba126
cmake: use 1 more thread for non-ggml in CI ( #8740 )
2024-07-28 22:32:44 +02:00
Concedo
a84f7c5d81
revert num old cpu for ci
2024-07-25 13:24:34 +08:00
Concedo
e28c42d7f7
adjusted layer estimation
2024-07-24 21:54:49 +08:00
Concedo
44ef87f14c
update lite, try fix ci
2024-07-24 16:31:34 +08:00
Johannes Gäßler
69c487f4ed
CUDA: MMQ code deduplication + iquant support ( #8495 )
...
* CUDA: MMQ code deduplication + iquant support
* 1 less parallel job for CI build
2024-07-20 22:25:26 +02:00
Concedo
8412946b9f
fix oldcpu build avx1
2024-07-15 23:42:22 +08:00
Concedo
21179d675b
try ci for avx1, up ver (+2 squashed commit)
...
Squashed commit:
[74150175] up version
[97b6163c] try ci for avx1 linux
2024-07-15 23:07:07 +08:00
Concedo
1a6855f597
Merge branch 'concedo_experimental' into concedo
2024-07-15 00:02:50 +08:00