koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-04-28 03:30:20 +00:00

Author	SHA1	Message	Date
Concedo	7ac0102ed3	hope i didnt break anything	2025-08-14 21:42:24 +08:00
uvos	29c8fbe4e0	HIP: bump requirement to rocm 6.1 (#15296)	2025-08-13 20:44:30 +02:00
Ali Tariq	648ebcdb73	ci : Added CI with RISC-V RVV1.0 Hardware (#14439 ) * Changed the CI file to hw * Changed the CI file to hw * Added to sudoers for apt * Removed the clone command and used checkout * Added libcurl * Added gcc-14 * Checking gcc --version * added gcc-14 symlink * added CC and C++ variables * Added the gguf weight * Changed the weights path * Added system specification * Removed white spaces * ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions. * removed trailing whitespaces --------- Co-authored-by: Akif Ejaz <akifejaz40@gmail.com>	2025-08-13 13:14:44 +03:00
Sigbjørn Skjæret	07aa869a91	ci : add more python requirements to copilot-setup-steps (#15289) * ci : add flake8 and pyright to copilot-setup-steps.yml * add tools/server/tests/requirements.txt	2025-08-13 11:30:45 +02:00
Sigbjørn Skjæret	bc5182272c	ci : add copilot-setup-steps.yml (#15214)	2025-08-13 09:07:13 +02:00
Concedo	57db0ce9cd	allow uploading tagged pinned versions for rocm	2025-08-10 11:04:49 +08:00
Reese Levine	5fd160bbd9	ggml: Add basic SET_ROWS support in WebGPU (#15137) * Begin work on set_rows * Work on set rows * Add error buffers for reporting unsupported SET_ROWS indices * Remove extra comments	2025-08-06 15:14:40 -07:00
Reese Levine	9515c6131a	ggml: WebGPU disable SET_ROWS for now (#15078 ) * Add paramater buffer pool, batching of submissions, refactor command building/submission * Add header for linux builds * Free staged parameter buffers at once * Format with clang-format * Fix thread-safe implementation * Use device implicit synchronization * Update workflow to use custom release * Remove testing branch workflow * Disable set_rows until it's implemented * Fix potential issue around empty queue submission * Try synchronous submission * Try waiting on all futures explicitly * Add debug * Add more debug messages * Work on getting ssh access for debugging * Debug on failure * Disable other tests * Remove extra if * Try more locking * maybe passes? * test * Some cleanups * Restore build file * Remove extra testing branch ci	2025-08-05 16:26:38 -07:00
Reese Levine	587d0118f5	ggml: WebGPU backend host improvements and style fixing (#14978 ) * Add parameter buffer pool, batching of submissions, refactor command building/submission * Add header for linux builds * Free staged parameter buffers at once * Format with clang-format * Fix thread-safe implementation * Use device implicit synchronization * Update workflow to use custom release * Remove testing branch workflow	2025-08-04 08:52:43 -07:00
Sigbjørn Skjæret	2bf3fbf0b5	ci : check that pre-tokenizer hashes are up-to-date (#15032) * torch is not required for convert_hf_to_gguf_update * add --check-missing parameter * check that pre-tokenizer hashes are up-to-date	2025-08-02 14:39:01 +02:00
kallewoof	b7b3e0d2a7	add adapter tests for autoguess (#1654)	2025-07-25 22:14:18 +08:00
kallewoof	ff8f156fa0	AutoGuess tests (#1650 ) * whitespace * AutoGuess remove dot suffix in names * .gitignore update * test: added autoguess test suite * github workflow to run autoguess test when appropriate * git clone unavailable tokenizer configs rather than committing to repo * fix link to included tokenizer configs * skip storing downloaded tokenizer configs * typo * minor fixes * clean-up * limit workflow to trigger from experimental branch --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2025-07-25 19:21:00 +08:00
R0CKSTAR	3f4fc97f1d	musa: upgrade musa sdk to rc4.2.0 (#14498 ) * musa: apply mublas API changes Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: update musa version to 4.2.0 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: restore MUSA graph settings in CMakeLists.txt Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: disable mudnnMemcpyAsync by default Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: switch back to non-mudnn images Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * minor changes Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: restore rc in docker image tag Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-07-24 20:05:37 +01:00
Sigbjørn Skjæret	221c0e0c58	ci : correct label refactor->refactoring (#14832)	2025-07-23 14:27:54 +02:00
Sigbjørn Skjæret	1ba45d4982	ci : disable failing vulkan crossbuilds (#14723)	2025-07-16 20:52:08 -03:00
Reese Levine	21c021745d	ggml: Add initial WebGPU backend (#14521 ) * Minimal setup of webgpu backend with dawn. Just prints out the adapter and segfaults * Initialize webgpu device * Making progress on setting up the backend * Finish more boilerplate/utility functions * Organize file and work on alloc buffer * Add webgpu_context to prepare for actually running some shaders * Work on memset and add shader loading * Work on memset polyfill * Implement set_tensor as webgpu WriteBuffer, remove host_buffer stubs since webgpu doesn't support it * Implement get_tensor and buffer_clear * Finish rest of setup * Start work on compute graph * Basic mat mul working * Work on emscripten build * Basic WebGPU backend instructions * Use EMSCRIPTEN flag * Work on passing ci, implement 4d tensor multiplication * Pass thread safety test * Implement permuting for mul_mat and cpy * minor cleanups * Address feedback * Remove division by type size in cpy op * Fix formatting and add github action workflows for vulkan and metal (m-series) webgpu backends * Fix name * Fix macos dawn prefix path	2025-07-16 18:18:51 +03:00
Concedo	2a59adce0f	stay on macos 14	2025-07-16 15:47:33 +08:00
Concedo	aa3623dcce	remove unwanted workflow	2025-07-13 23:43:56 +08:00
Concedo	8cebec5128	Merge branch 'upstream' into concedo_experimental # Conflicts: # CMakePresets.json # README.md # common/CMakeLists.txt # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp # scripts/sync-ggml.last # tests/test-backend-ops.cpp # tools/run/CMakeLists.txt	2025-07-13 23:39:41 +08:00
Aman Gupta	11ee0fea2a	Docs: script to auto-generate ggml operations docs (#14598 ) * Docs: script to auto-generate ggml operations docs * Review: formatting changes + change github action * Use built-in types instead of typing * docs : add BLAS and Metal ops --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-07-10 23:29:01 +08:00
Jeff Bolz	53903ae6fa	vulkan: increase timeout for CI (#14574)	2025-07-08 09:38:31 +02:00
Georgi Gerganov	d4cdd9c1c3	ggml : remove kompute backend (#14501) ggml-ci	2025-07-03 07:48:32 +03:00
Rotem Dan	f3ed38d793	Set RPATH to "@loader_path" / "$ORIGIN" to ensure executables and dynamic libraries search for dependencies in their origin directory. (#14309)	2025-07-02 18:37:16 +02:00
Sigbjørn Skjæret	611ba4b264	ci : add OpenCL to labeler workflow (#14496)	2025-07-02 09:02:51 +02:00
Eric Zhang	85841e121d	github : add OpenCL backend to issue templates (#14492)	2025-07-02 08:41:35 +03:00
Georgi Gerganov	de56944147	ci : disable fast-math for Metal GHA CI (#14478 ) * ci : disable fast-math for Metal GHA CI ggml-ci * cont : remove -g flag ggml-ci	2025-07-01 18:04:08 +03:00
Sigbjørn Skjæret	6609507a91	ci : fix windows build and release (#14431)	2025-06-28 09:57:07 +02:00
bandoti	ce82bd0117	ci: add workflow for relocatable cmake package (#14346)	2025-06-23 15:30:51 -03:00
Jeff Bolz	bf2a99e3cb	vulkan: update windows SDK in release.yml (#14344)	2025-06-23 15:44:48 +02:00
Jeff Bolz	3a9457df96	vulkan: update windows SDK in CI (#14334)	2025-06-23 10:19:24 +02:00
Concedo	abc1d8ac25	better way of checking for avx2 support	2025-06-22 22:56:50 +08:00
Concedo	52dcfe42d6	try auto selecting correct backend while checking intrinsics	2025-06-22 18:16:02 +08:00
Concedo	ce58d1253f	fixed build and workflow	2025-06-21 00:56:27 +08:00
Diego Devesa	6adc3c3ebc	llama : add thread safety test (#14035 ) * llama : add thread safety test * llamafile : remove global state * llama : better LLAMA_SPLIT_MODE_NONE logic when main_gpu < 0 GPU devices are not used --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-06-16 08:11:43 -07:00
bandoti	0dbcabde8c	cmake: clean up external project logic for vulkan-shaders-gen (#14179 ) * Remove install step for vulkan-shaders-gen * Add install step to normalize msvc with make * Regenerate modified shaders at build-time	2025-06-16 10:32:13 -03:00
Concedo	5cdb2d3fc6	cleanup	2025-06-11 01:35:40 +08:00
Jeff Bolz	652b70e667	vulkan: force device 0 in CI (#14106)	2025-06-10 10:53:47 -05:00
Concedo	8386546e08	Switched VS2019 for revert cu12.1 build, hopefully solves dll issues try change order (+3 squashed commit) Squashed commit: [457f02507] try newer jimver [`64af28862`] windows pyinstaller shim. the final loader will be moved into the packed directory later. [`0272ecf2d`] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm try again (+3 squashed commit) Squashed commit: [133e81633] try without pwsh [4d99cefba] try without pwsh [bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm	2025-06-10 23:08:02 +08:00
Diego Devesa	7f4fbe5183	llama : allow building all tests on windows when not using shared libs (#13980 ) * llama : allow building all tests on windows when not using shared libraries * add static windows build to ci * tests : enable debug logs for test-chat --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-06-09 20:03:09 +02:00
Concedo	28b35ca879	allow wmma flag for rocm	2025-06-10 01:23:48 +08:00
Concedo	deece4be69	missed a build target	2025-06-09 17:05:56 +08:00
Yuanhao Ji	056eb74534	CANN: Enable labeler for Ascend NPU (#13914)	2025-06-09 11:20:06 +08:00
Concedo	6c5c8be48d	try to make rocm work for the github ci, requires disabling rocwmma	2025-06-08 21:52:29 +08:00
Concedo	7132d6b15c	test rocm rolling (+1 squashed commits) Squashed commits: [43c8f7fc6] test rocm rolling (+4 squashed commit) Squashed commit: [16a60aa77] test clobber 4 [a6c866450] test clobber 3 [9322f17f6] test clobber 2 [b7a420cbe] testing clobber	2025-06-08 15:33:05 +08:00
吴小白	5787b5da57	ci: add LoongArch cross-compile build (#13944)	2025-06-07 10:39:11 -03:00
Concedo	abc272d89f	breaking change: standardize ci binary names	2025-06-07 00:40:46 +08:00
Concedo	6effb65cfe	change singleinstance order	2025-06-06 21:20:30 +08:00
Concedo	8b141d8647	stick to cu12.1 for linux for now	2025-06-06 17:38:28 +08:00
Concedo	eec5a8ad16	breaking change: due to cuda12 upgrade, release filenames will change. standardize them to windows naming for the future. (+1 squashed commits) Squashed commits: [75842919a] cuda12.4 test	2025-06-06 14:02:34 +08:00
Concedo	50a27793d3	upgrade windows runners to windows 2022, cu11 still uses vs2019 this should finally work (+21 squashed commit) Squashed commit: [5edac5b59] Revert "quick dbg" This reverts commit fd62a997cc6684bb89242d5e7b0ae2aed83fd27f. [fd62a997c] quick dbg [bcccae7e6] sanity check 2 [568e2eb08] sanity check [2f30d573a] please work 2 [cf8765221] please work [c535e60d9] try a small trick [d4ba79b80] 2022 test [3f146b000] t2 [4a3b9a9b4] revert and test [4bdc9a149] reverted test2 [5081cb4a3] reverted test [ea9a826f3] broken test [3c11ae389] compare 2019 [8ecec4fec] not for cu12 [0be964f3a] added vs2019 for the other runners [5d24641cb] debugging 4 [1dee79207] debugging 3 [ab172f133] more debugging 2 [b1a895e84] more debugging [5d21d8bd0] vs2019 setup	2025-06-06 14:02:34 +08:00

490 commits