Concedo
67ef5e6c02
phonemizer fixes, now kokoro works very well
2025-08-18 16:13:16 +08:00
Concedo
d876898476
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/cpu.Dockerfile
# .devops/cuda.Dockerfile
# .github/ISSUE_TEMPLATE/010-bug-compilation.yml
# .github/ISSUE_TEMPLATE/011-bug-results.yml
# .github/labeler.yml
# .github/workflows/build.yml
# .github/workflows/release.yml
# CODEOWNERS
# README.md
# docs/build-s390x.md
# docs/ops.md
# examples/eval-callback/eval-callback.cpp
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/transpose.cl
# tests/test-backend-ops.cpp
# tests/test-chat.cpp
# tests/test-opt.cpp
2025-08-16 12:39:25 +08:00
Aaron Teo
ff27f80a74
ggml: initial IBM zDNN backend ( #14975 )
...
* ggml-zdnn: initial backend impl
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: temp change z17 to arch15
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: fix build bugs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: tensor->extra logging check
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: add layout name mapping, ztensor information
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: separate logging into its own line
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: add shape comparison
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: add ggml_tensor shape log
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
ggml-zdnn: fix incorrect shape logging
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add output buffer check
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: run compute and store into tensor->extra
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add set_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add more loggers
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: update set_tensor logging to check only for matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: last working matmul version
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add comments to prevent accidentally deleting lines
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: support op out_prod
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: update op out_prod to use tensor->extra
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: rewrite the backend implementation
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: bugfix new impl
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix compiler warnings and bugfixes
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: test ztensor finding in init_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: implement at least 1 op to test
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: assign tensor->extra to buffer
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add check for view tensors to prevent init_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: rework init_tensor to create new buffers
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: switch to std vector instead of array
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: switch buffers back and set to arbitrary number
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: impl init_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: update supports_op matmul matrix
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix incorrect ztensor shape, reduce memory padding
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: code clean up
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: impl matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix compiler error missing type
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix missing data transform call
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add bias init_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: tighten memory usage, change string allocation
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add bias ztensor and data free
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add bias data transform
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add more debug info for extra buffer transform
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add logger to check if mat mul ops go through set_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: activate bias transform in matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: move weights transform into mulmat
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add more safeguards in matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix sequencing of transforms
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: bugfix transform ztensor vs origtensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: figure out why sigtrap is happening
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix sigsegv
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: move everything back to local declaration
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: move bias data to local also
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: bring back working matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: rewrite into mre
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix missing vector import
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix missing vector import in header
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: attempt to fix sigsegv
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix missing load tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix invalid ztensor buffer release
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add logging to debug free buffer
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: remove free_buffer debug info
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add parmblkformat detections
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add nnpa installed detection
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add zdnn_init call for static libs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add init_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: attempt at fixing invalid buffer
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: switch to using deque to fix pointer deref problem
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add weights logging to check
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: attempt to use unique ptr
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add tensor to pre_tfm_desc logging
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add inputs logging
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: disable op_none initialisation for testing
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix missing return from init_tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: load ztensors in cgraph exec
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: work on moving output ztensor as well
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: disable logging and breakpoints for full test
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: attempt at manually changing the layout
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: attempt at using default nhwc format instead
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: disable global load ztensor for now
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix erroneous output load tensor
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: add guards to prevent loading ztensor if transformed
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: code cleanup
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: bring load ztensor back to init routine
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: code clean up
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix ztensor deallocation abort
stabilise ggml <-> zdnn api
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: clean up matmul selection
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: clean up project structure
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: update documentation, prepare for upstream
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* chore: add codeowners
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: disable batched matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: attempt at fixing tensor views during matmul
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: deny all view tensors directly
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix pr comments
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* docs: update ops docs for zdnn
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: redo test-backend-ops for ops.md
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ggml-zdnn: fix typo in build-s390x.md
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* codeowners: remove taronaeo for now
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Revert "codeowners: remove taronaeo for now"
This reverts commit 411ea4ed78d08778967bd0bd33a6538cfcbe082f.
* ggml-zdnn: remove unused ggml_zdnn macro
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-08-15 21:11:22 +08:00
Sigbjørn Skjæret
d3248d9b65
ci : fix ios-xcode-build ( #15324 )
...
* fix ios-xcode-build
* use xcode-select with fixed version
* switch to macos-15 to get xcode 16.4
2025-08-15 14:02:39 +02:00
Diego Devesa
7aeee88cfe
ci : move ccache action to ggml-org fork ( #15328 )
2025-08-15 12:27:02 +02:00
Concedo
7ac0102ed3
hope i didnt break anything
2025-08-14 21:42:24 +08:00
uvos
29c8fbe4e0
HIP: bump requirement to rocm 6.1 ( #15296 )
2025-08-13 20:44:30 +02:00
Ali Tariq
648ebcdb73
ci : Added CI with RISC-V RVV1.0 Hardware ( #14439 )
...
* Changed the CI file to hw
* Changed the CI file to hw
* Added to sudoers for apt
* Removed the clone command and used checkout
* Added libcurl
* Added gcc-14
* Checking gcc --version
* added gcc-14 symlink
* added CC and C++ variables
* Added the gguf weight
* Changed the weights path
* Added system specification
* Removed white spaces
* ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow
Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions.
* removed trailing whitespaces
---------
Co-authored-by: Akif Ejaz <akifejaz40@gmail.com>
2025-08-13 13:14:44 +03:00
Sigbjørn Skjæret
07aa869a91
ci : add more python requirements to copilot-setup-steps ( #15289 )
...
* ci : add flake8 and pyright to copilot-setup-steps.yml
* add tools/server/tests/requirements.txt
2025-08-13 11:30:45 +02:00
Sigbjørn Skjæret
bc5182272c
ci : add copilot-setup-steps.yml ( #15214 )
2025-08-13 09:07:13 +02:00
Concedo
57db0ce9cd
allow uploading tagged pinned versions for rocm
2025-08-10 11:04:49 +08:00
Reese Levine
5fd160bbd9
ggml: Add basic SET_ROWS support in WebGPU ( #15137 )
...
* Begin work on set_rows
* Work on set rows
* Add error buffers for reporting unsupported SET_ROWS indices
* Remove extra comments
2025-08-06 15:14:40 -07:00
Reese Levine
9515c6131a
ggml: WebGPU disable SET_ROWS for now ( #15078 )
...
* Add parameter buffer pool, batching of submissions, refactor command building/submission
* Add header for linux builds
* Free staged parameter buffers at once
* Format with clang-format
* Fix thread-safe implementation
* Use device implicit synchronization
* Update workflow to use custom release
* Remove testing branch workflow
* Disable set_rows until it's implemented
* Fix potential issue around empty queue submission
* Try synchronous submission
* Try waiting on all futures explicitly
* Add debug
* Add more debug messages
* Work on getting ssh access for debugging
* Debug on failure
* Disable other tests
* Remove extra if
* Try more locking
* maybe passes?
* test
* Some cleanups
* Restore build file
* Remove extra testing branch ci
2025-08-05 16:26:38 -07:00
Reese Levine
587d0118f5
ggml: WebGPU backend host improvements and style fixing ( #14978 )
...
* Add parameter buffer pool, batching of submissions, refactor command building/submission
* Add header for linux builds
* Free staged parameter buffers at once
* Format with clang-format
* Fix thread-safe implementation
* Use device implicit synchronization
* Update workflow to use custom release
* Remove testing branch workflow
2025-08-04 08:52:43 -07:00
Sigbjørn Skjæret
2bf3fbf0b5
ci : check that pre-tokenizer hashes are up-to-date ( #15032 )
...
* torch is not required for convert_hf_to_gguf_update
* add --check-missing parameter
* check that pre-tokenizer hashes are up-to-date
2025-08-02 14:39:01 +02:00
kallewoof
b7b3e0d2a7
add adapter tests for autoguess ( #1654 )
2025-07-25 22:14:18 +08:00
kallewoof
ff8f156fa0
AutoGuess tests ( #1650 )
...
* whitespace
* AutoGuess remove dot suffix in names
* .gitignore update
* test: added autoguess test suite
* github workflow to run autoguess test when appropriate
* git clone unavailable tokenizer configs rather than committing to repo
* fix link to included tokenizer configs
* skip storing downloaded tokenizer configs
* typo
* minor fixes
* clean-up
* limit workflow to trigger from experimental branch
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-07-25 19:21:00 +08:00
R0CKSTAR
3f4fc97f1d
musa: upgrade musa sdk to rc4.2.0 ( #14498 )
...
* musa: apply mublas API changes
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: update musa version to 4.2.0
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: restore MUSA graph settings in CMakeLists.txt
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: disable mudnnMemcpyAsync by default
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: switch back to non-mudnn images
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* minor changes
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* musa: restore rc in docker image tag
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-07-24 20:05:37 +01:00
Sigbjørn Skjæret
221c0e0c58
ci : correct label refactor->refactoring ( #14832 )
2025-07-23 14:27:54 +02:00
Sigbjørn Skjæret
1ba45d4982
ci : disable failing vulkan crossbuilds ( #14723 )
2025-07-16 20:52:08 -03:00
Reese Levine
21c021745d
ggml: Add initial WebGPU backend ( #14521 )
...
* Minimal setup of webgpu backend with dawn. Just prints out the adapter and segfaults
* Initialize webgpu device
* Making progress on setting up the backend
* Finish more boilerplate/utility functions
* Organize file and work on alloc buffer
* Add webgpu_context to prepare for actually running some shaders
* Work on memset and add shader loading
* Work on memset polyfill
* Implement set_tensor as webgpu WriteBuffer, remove host_buffer stubs since webgpu doesn't support it
* Implement get_tensor and buffer_clear
* Finish rest of setup
* Start work on compute graph
* Basic mat mul working
* Work on emscripten build
* Basic WebGPU backend instructions
* Use EMSCRIPTEN flag
* Work on passing ci, implement 4d tensor multiplication
* Pass thread safety test
* Implement permuting for mul_mat and cpy
* minor cleanups
* Address feedback
* Remove division by type size in cpy op
* Fix formatting and add github action workflows for vulkan and metal (m-series) webgpu backends
* Fix name
* Fix macos dawn prefix path
2025-07-16 18:18:51 +03:00
Concedo
2a59adce0f
stay on macos 14
2025-07-16 15:47:33 +08:00
Concedo
aa3623dcce
remove unwanted workflow
2025-07-13 23:43:56 +08:00
Concedo
8cebec5128
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# CMakePresets.json
# README.md
# common/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# scripts/sync-ggml.last
# tests/test-backend-ops.cpp
# tools/run/CMakeLists.txt
2025-07-13 23:39:41 +08:00
Aman Gupta
11ee0fea2a
Docs: script to auto-generate ggml operations docs ( #14598 )
...
* Docs: script to auto-generate ggml operations docs
* Review: formatting changes + change github action
* Use built-in types instead of typing
* docs : add BLAS and Metal ops
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-07-10 23:29:01 +08:00
Jeff Bolz
53903ae6fa
vulkan: increase timeout for CI ( #14574 )
2025-07-08 09:38:31 +02:00
Georgi Gerganov
d4cdd9c1c3
ggml : remove kompute backend ( #14501 )
...
ggml-ci
2025-07-03 07:48:32 +03:00
Rotem Dan
f3ed38d793
Set RPATH to "@loader_path" / "$ORIGIN" to ensure executables and dynamic libraries search for dependencies in their origin directory. ( #14309 )
2025-07-02 18:37:16 +02:00
Sigbjørn Skjæret
611ba4b264
ci : add OpenCL to labeler workflow ( #14496 )
2025-07-02 09:02:51 +02:00
Eric Zhang
85841e121d
github : add OpenCL backend to issue templates ( #14492 )
2025-07-02 08:41:35 +03:00
Georgi Gerganov
de56944147
ci : disable fast-math for Metal GHA CI ( #14478 )
...
* ci : disable fast-math for Metal GHA CI
ggml-ci
* cont : remove -g flag
ggml-ci
2025-07-01 18:04:08 +03:00
Sigbjørn Skjæret
6609507a91
ci : fix windows build and release ( #14431 )
2025-06-28 09:57:07 +02:00
bandoti
ce82bd0117
ci: add workflow for relocatable cmake package ( #14346 )
2025-06-23 15:30:51 -03:00
Jeff Bolz
bf2a99e3cb
vulkan: update windows SDK in release.yml ( #14344 )
2025-06-23 15:44:48 +02:00
Jeff Bolz
3a9457df96
vulkan: update windows SDK in CI ( #14334 )
2025-06-23 10:19:24 +02:00
Concedo
abc1d8ac25
better way of checking for avx2 support
2025-06-22 22:56:50 +08:00
Concedo
52dcfe42d6
try auto selecting correct backend while checking intrinsics
2025-06-22 18:16:02 +08:00
Concedo
ce58d1253f
fixed build and workflow
2025-06-21 00:56:27 +08:00
Diego Devesa
6adc3c3ebc
llama : add thread safety test ( #14035 )
...
* llama : add thread safety test
* llamafile : remove global state
* llama : better LLAMA_SPLIT_MODE_NONE logic
when main_gpu < 0 GPU devices are not used
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-16 08:11:43 -07:00
bandoti
0dbcabde8c
cmake: clean up external project logic for vulkan-shaders-gen ( #14179 )
...
* Remove install step for vulkan-shaders-gen
* Add install step to normalize msvc with make
* Regenerate modified shaders at build-time
2025-06-16 10:32:13 -03:00
Concedo
5cdb2d3fc6
cleanup
2025-06-11 01:35:40 +08:00
Jeff Bolz
652b70e667
vulkan: force device 0 in CI ( #14106 )
2025-06-10 10:53:47 -05:00
Concedo
8386546e08
Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
...
try change order (+3 squashed commit)
Squashed commit:
[457f02507] try newer jimver
[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.
[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commit)
Squashed commit:
[133e81633] try without pwsh
[4d99cefba] try without pwsh
[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Diego Devesa
7f4fbe5183
llama : allow building all tests on windows when not using shared libs ( #13980 )
...
* llama : allow building all tests on windows when not using shared libraries
* add static windows build to ci
* tests : enable debug logs for test-chat
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-09 20:03:09 +02:00
Concedo
28b35ca879
allow wmma flag for rocm
2025-06-10 01:23:48 +08:00
Concedo
deece4be69
missed a build target
2025-06-09 17:05:56 +08:00
Yuanhao Ji
056eb74534
CANN: Enable labeler for Ascend NPU ( #13914 )
2025-06-09 11:20:06 +08:00
Concedo
6c5c8be48d
try to make rocm work for the github ci, requires disabling rocwmma
2025-06-08 21:52:29 +08:00
Concedo
7132d6b15c
test rocm rolling (+1 squashed commits)
...
Squashed commits:
[43c8f7fc6] test rocm rolling (+4 squashed commit)
Squashed commit:
[16a60aa77] test clobber 4
[a6c866450] test clobber 3
[9322f17f6] test clobber 2
[b7a420cbe] testing clobber
2025-06-08 15:33:05 +08:00
吴小白
5787b5da57
ci: add LoongArch cross-compile build ( #13944 )
2025-06-07 10:39:11 -03:00