Commit graph

655 commits

Author SHA1 Message Date
Concedo
1a4f54dd11 update for cu13 builds (no ci will be provided) 2025-09-26 16:01:43 +08:00
Concedo
326f6f3fad not sure if working on metal 2025-09-21 11:35:02 +08:00
tsite
04498a345a
update makefile to clone llguidance if the directory does not exist (#1743)
also remove llguidance when running 'make clean'
2025-09-21 08:40:55 +08:00
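The clone-if-missing plus `make clean` behavior this commit describes can be sketched in shell as follows (a minimal sketch: the directory name is from the commit, but the function names are illustrative and `mkdir` stands in for the actual `git clone` so the sketch runs offline):

```shell
# Sketch of the pattern described in the commit: fetch a vendored dependency
# only when its directory is missing, and remove it again on "make clean".
cd "$(mktemp -d)"        # work in a scratch directory so the sketch is safe
DEP_DIR="llguidance"

fetch_dep() {
    # Clone only when the directory is absent; mkdir stands in for
    # "git clone <llguidance repo> $DEP_DIR" in this offline sketch.
    if [ ! -d "$DEP_DIR" ]; then
        mkdir -p "$DEP_DIR"
    fi
}

clean_dep() {
    # Mirrors the 'make clean' part of the commit: remove the cloned directory.
    rm -rf "$DEP_DIR"
}

fetch_dep
```

Running `fetch_dep` twice is harmless, which is the point of the existence check.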
Concedo
fddd046f9d metal common 2025-09-15 01:58:32 +08:00
Concedo
a5580a32fb fix cuda and macos compile issues 2025-09-12 20:53:42 +08:00
tsite
27c443f01e
add support for llguidance (#1728)
* add llguidance

remove tab indentation for makefile if statements - these are dangerous
fix broken tool compilation commands
add USE_LLGUIDANCE env var to enable llguidance for faster structured
output generation
add llguidance as an optional submodule

* rm submodule
2025-09-11 16:46:03 +08:00
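The opt-in `USE_LLGUIDANCE` env var mentioned in this commit can be sketched as a shell-level gate (the variable name is from the commit message; the emitted `-DUSE_LLGUIDANCE` compile flag and the function name are assumptions, not the repo's actual build logic):

```shell
# Sketch of an opt-in env-var feature gate like USE_LLGUIDANCE.
# Emits a compile flag only when the user explicitly sets USE_LLGUIDANCE=1.
llguidance_flags() {
    if [ "${USE_LLGUIDANCE:-0}" = "1" ]; then
        printf '%s' "-DUSE_LLGUIDANCE"
    fi
}
```

Defaulting the unset case to "off" keeps the optional dependency out of ordinary builds.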
Concedo
f7fa283bb6 indentation fix for makefile 2025-09-11 16:41:51 +08:00
Concedo
52ff99805c fixed windows 7 compat builds 2025-08-25 10:36:13 +08:00
Concedo
ed5e7a3062 fix for some old android devices 2025-08-24 01:34:54 +08:00
Concedo
8b8396c30c Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	README.md
#	docs/build-s390x.md
#	examples/llama.vim
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/common.h
#	scripts/compare-llama-bench.py
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tools/llama-bench/README.md
#	tools/llama-bench/llama-bench.cpp
#	tools/server/README.md
2025-08-23 11:35:28 +08:00
Concedo
b50f94ae27 this commit removes ggml_cuda_f16 targets. Merge commit '7a6e91ad26' into concedo_experimental
# Conflicts:
#	docs/build.md
#	docs/multimodal/MobileVLM.md
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
2025-08-21 19:25:29 +08:00
Daniel Bevenius
37f10f955f
make : remove make in favor of CMake (#15449)
This commit removes the content from the Makefile and updates the
current deprecation message to state that `make` has been replaced
by CMake.

The message when `make` is invoked will now be the following:
```console
$ make
Makefile:6: *** Build system changed:
 The Makefile build has been replaced by CMake.

 For build instructions see:
 https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md

.  Stop.
```

The motivation for this is that many, if not all, targets now fail to
build after changes to the system, and `make` has also been deprecated
for some time.
2025-08-20 13:31:16 +03:00
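The console output quoted in the commit message would come from a stub along these lines (a minimal sketch, not the repo's actual Makefile; the `Makefile:6:` prefix in the quoted output indicates the real `$(error ...)` sits at line 6 of the file):

```makefile
# Minimal sketch of a Makefile stub that aborts every invocation with a
# migration notice; make prints it as "Makefile:N: *** <message>.  Stop."
$(error Build system changed: the Makefile build has been replaced by CMake; see docs/build.md)
```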
Concedo
35707f4e97 split vulkan into two compilation units for faster build 2025-08-20 12:12:47 +08:00
Concedo
67ef5e6c02 phonemizer fixes, now kokoro works very well 2025-08-18 16:13:16 +08:00
Concedo
52606e9b1d tts cpp model is now loadable in kcpp 2025-08-17 15:47:22 +08:00
Concedo
9935ac093f standardize tts linting and formatting 2025-08-17 14:11:30 +08:00
Concedo
cfc1a0d4ef tts cpp cli builds and runs fine. 2025-08-17 13:53:27 +08:00
Concedo
bc04366a65 builds but crashes 2025-08-17 00:09:03 +08:00
Concedo
67e0072245 fixed clblast repacking 2025-08-09 01:08:02 +08:00
Concedo
d37529c0cd add sanitize flag 2025-08-04 22:19:23 +08:00
Concedo
4db8ba6228 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-sycl/gemm.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/set_rows.cpp
2025-07-14 23:16:44 +08:00
Concedo
dca49de059 fixed qwen2 audio issues, works fine now (+3 squashed commit)
Squashed commit:

[b3053a1ba] updated lite

[5071630d6] fixed mtmd issues, audio works

[06efa5af4] fix mtmd compile
2025-07-12 18:54:41 +08:00
Concedo
e9473305d0 wip2 (+1 squashed commits)
Squashed commits:

[4628777b6] wip
2025-07-12 18:54:40 +08:00
Concedo
f8a49aa8e6 fixed a typo 2025-07-08 11:41:09 +08:00
Concedo
18cd46a6db allow people to manually override gfx12 fa 2025-07-05 11:33:30 +08:00
Concedo
abc1d8ac25 better way of checking for avx2 support 2025-06-22 22:56:50 +08:00
Concedo
45f589b78d test gfx1200 again 2025-06-21 17:56:04 +08:00
Concedo
b59b5dbbd1 Merge commit '456af35eb7' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-sycl/getrows.cpp
#	src/CMakeLists.txt
#	tools/llama-bench/llama-bench.cpp
2025-06-20 23:41:27 +08:00
Concedo
33809c9e82 doing what i must because i can, after the mess that is https://github.com/ggml-org/llama.cpp/pull/13892
there is so much duplicate code in each cpu arch, i expect upstream will prune it eventually
arch detection has no fallback if none of the arches are found, by right we should set GGML_CPU_GENERIC
i should be relaxing, it's the weekend
2025-06-14 01:41:16 +08:00
Concedo
f50c793140 not working - refactoring 2025-06-14 00:03:21 +08:00
Concedo
7a688e07cd remove gfx12 until amd wakes up 2025-06-12 16:52:55 +08:00
Concedo
1970d8c9e8 uvos said it might work 2025-06-12 16:44:46 +08:00
Concedo
8386546e08 Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
try change order (+3 squashed commit)

Squashed commit:

[457f02507] try newer jimver

[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.

[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commit)

Squashed commit:

[133e81633] try without pwsh

[4d99cefba] try without pwsh

[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Concedo
28b35ca879 allow wmma flag for rocm 2025-06-10 01:23:48 +08:00
Concedo
7d8aa31f1f fixed embeddings, added new parameter to limit max embeddings context 2025-06-10 01:11:55 +08:00
xctan
f470bc36be
ggml-cpu : split arch-specific implementations (#13892)
* move ggml-cpu-aarch64 to repack

* split quantize_row_q8_0/1

* split helper functions

* split ggml_vec_dot_q4_0_q8_0

* split ggml_vec_dot_q4_1_q8_1

* split ggml_vec_dot_q5_0_q8_0

* split ggml_vec_dot_q5_1_q8_1

* split ggml_vec_dot_q8_0_q8_0

* split ggml_vec_dot_tq1_0_q8_K

* split ggml_vec_dot_tq2_0_q8_K

* split ggml_vec_dot_q2_K_q8_K

* split ggml_vec_dot_q3_K_q8_K

* split ggml_vec_dot_q4_K_q8_K

* split ggml_vec_dot_q5_K_q8_K

* split ggml_vec_dot_q6_K_q8_K

* split ggml_vec_dot_iq2_xxs_q8_K

* split ggml_vec_dot_iq2_xs_q8_K

* split ggml_vec_dot_iq2_s_q8_K

* split ggml_vec_dot_iq3_xxs_q8_K

* split ggml_vec_dot_iq3_s_q8_K

* split ggml_vec_dot_iq1_s_q8_K

* split ggml_vec_dot_iq1_m_q8_K

* split ggml_vec_dot_iq4_nl_q8_0

* split ggml_vec_dot_iq4_xs_q8_K

* fix typos

* fix missing prototypes

* rename ggml-cpu-quants.c

* rename ggml-cpu-traits

* rename arm folder

* move cpu-feats-x86.cpp

* rename ggml-cpu-hbm

* update arm detection macro in quants.c

* move iq quant tables

* split ggml_quantize_mat_q8_0/K

* split ggml_gemv_*

* split ggml_gemm_*

* rename namespace aarch64 to repack

* use weak aliases to replace test macros

* rename GGML_CPU_AARCH64 to GGML_CPU_REPACK

* rename more aarch64 to repack

* clean up rebase leftover

* fix compilation errors

* remove trailing spaces

* try to fix clang compilation errors

* try to fix clang compilation errors again

* try to fix clang compilation errors, 3rd attempt

* try to fix clang compilation errors, 4th attempt

* try to fix clang compilation errors, 5th attempt

* try to fix clang compilation errors, 6th attempt

* try to fix clang compilation errors, 7th attempt

* try to fix clang compilation errors, 8th attempt

* try to fix clang compilation errors, 9th attempt

* more cleanup

* fix compilation errors

* fix apple targets

* fix a typo in arm version of ggml_vec_dot_q4_K_q8_K

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-06-09 16:47:13 +02:00
Concedo
6c5c8be48d try to make rocm work for the github ci, requires disabling rocwmma 2025-06-08 21:52:29 +08:00
Concedo
a80dfa5c10 various minor fixes 2025-06-08 01:11:42 +08:00
Concedo
301450b1eb attempt to use system glslc first before using bundled glslc 2025-06-07 16:54:25 +08:00
Concedo
d18938fc70 fixed build 2025-06-06 18:05:44 +08:00
Concedo
eec5a8ad16 breaking change: due to cuda12 upgrade, release filenames will change. standardize them to windows naming for the future. (+1 squashed commits)
Squashed commits:

[75842919a] cuda12.4 test
2025-06-06 14:02:34 +08:00
Concedo
b08dca65ed Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/CMakeLists.txt
#	common/arg.cpp
#	common/chat.cpp
#	examples/parallel/README.md
#	examples/parallel/parallel.cpp
#	ggml/cmake/common.cmake
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	models/ggml-vocab-bert-bge.gguf.inp
#	models/ggml-vocab-bert-bge.gguf.out
#	models/ggml-vocab-command-r.gguf.inp
#	models/ggml-vocab-command-r.gguf.out
#	models/ggml-vocab-deepseek-coder.gguf.inp
#	models/ggml-vocab-deepseek-coder.gguf.out
#	models/ggml-vocab-deepseek-llm.gguf.inp
#	models/ggml-vocab-deepseek-llm.gguf.out
#	models/ggml-vocab-falcon.gguf.inp
#	models/ggml-vocab-falcon.gguf.out
#	models/ggml-vocab-gpt-2.gguf.inp
#	models/ggml-vocab-gpt-2.gguf.out
#	models/ggml-vocab-llama-bpe.gguf.inp
#	models/ggml-vocab-llama-bpe.gguf.out
#	models/ggml-vocab-llama-spm.gguf.inp
#	models/ggml-vocab-llama-spm.gguf.out
#	models/ggml-vocab-mpt.gguf.inp
#	models/ggml-vocab-mpt.gguf.out
#	models/ggml-vocab-phi-3.gguf.inp
#	models/ggml-vocab-phi-3.gguf.out
#	models/ggml-vocab-qwen2.gguf.inp
#	models/ggml-vocab-qwen2.gguf.out
#	models/ggml-vocab-refact.gguf.inp
#	models/ggml-vocab-refact.gguf.out
#	models/ggml-vocab-starcoder.gguf.inp
#	models/ggml-vocab-starcoder.gguf.out
#	requirements/requirements-gguf_editor_gui.txt
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-json-schema-to-grammar.cpp
#	tools/mtmd/CMakeLists.txt
#	tools/run/run.cpp
#	tools/server/CMakeLists.txt
2025-05-31 13:04:21 +08:00
Concedo
c987abf9f5 Merge commit '763d06edb7' into concedo_experimental
# Conflicts:
#	.github/workflows/build-linux-cross.yml
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cann/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	tools/mtmd/CMakeLists.txt
#	tools/mtmd/clip.cpp
#	tools/mtmd/mtmd.cpp
#	tools/server/CMakeLists.txt
2025-05-31 12:44:18 +08:00
henk717
b8883e254a
KoboldCpp.sh updates (#1562)
* YR makefile upstream

* Create make_portable_rocm_libs.sh

* update makefile, support llama portable, ditch all unnecessary changes

* Delete make_portable_rocm_libs.sh should not be needed

* koboldcpp.sh updates

* Small rocm fixes

* ROCm is now a cuda version not a command

* Don't commit temp file

* Don't commit temp file

* 1200 has errors, removing it for now

* Only rebuild rocm with rebuild

* Update kcpp-build-release-linux.yaml

* Fix rocm filename

* ROCm Linux CI

* We need more diskspace

* Workaround for lockfile getting stuck

Why do I have to do hacks like this....

* Update kcpp-build-release-linux-rocm.yaml

* Dont apt update rocm

You don't allow us to apt update? Better not break things github!

* Container maybe?

* Turns out we aren't root, so we use sudo

* Cleanup ROCm CI PR

* Build for Runpods GPU

* We also need rocblas

* More cleanup just in case

* Update kcpp-build-release-linux-rocm.yaml

---------

Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2025-05-26 15:24:49 +08:00
Concedo
60268de62c update targets for rocm 2025-05-25 18:41:15 +08:00
Concedo
499283c63a rename define to match upstream 2025-05-23 17:10:12 +08:00
Concedo
dec3cd92b0 fix cuda compile 2025-05-13 02:15:33 +08:00
Concedo
40eb3a54c4 rename some tooltip texts 2025-05-11 22:50:40 +08:00
Concedo
5cf5f35540 added vulkan build target for main.exe 2025-05-11 21:53:08 +08:00
Georgi Gerganov
4773d7a02f
examples : remove infill (#13283)
ggml-ci
2025-05-07 10:28:02 +03:00