koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-01 21:20:29 +00:00

Author	SHA1	Message	Date
Concedo	cd4012c3ed	minor fixes to debug logging, fixed a typo, added a new failsafe mode	2023-05-23 21:31:42 +08:00
Concedo	b9f06a7670	mavx only for windows by default, let them eat march native.	2023-05-22 16:48:55 +08:00
Concedo	169a26d15f	removed unused build targets	2023-05-22 13:53:10 +08:00
Concedo	587308a202	fixed some build errors on linux, changed icon resolution, added more error printing	2023-05-22 12:18:42 +08:00
Concedo	c048bcfec4	remove old filever checks (+7 squashed commit) Squashed commit: [b72627a] new format not working [e568870] old ver works [7053b77] compile errors fixed, fixing linkers [4ae8889] add new ver [ff82dfd] file format checks [25b8aa8] refactoring type names [931063b] still merging	2023-05-21 00:15:39 +08:00
Concedo	f561fe5a4a	switch back to ofast for c	2023-05-17 10:04:54 +08:00
Concedo	504a2aa874	Merge remote-tracking branch 'fixmake/concedo' into concedo_experimental	2023-05-17 10:01:57 +08:00
horenbergerb	f29c25e7a1	hacky fix for linux cublas build	2023-05-16 12:29:04 -04:00
Concedo	196fbba527	Merge branch 'opencl-dev2' into concedo_experimental # Conflicts: # CMakeLists.txt	2023-05-16 17:04:33 +08:00
Concedo	e4e6994353	Not working, don't use. testing a merge	2023-05-16 12:33:24 +08:00
0cc4m	c77966524a	Refactor OpenCL code to work more like the CUDA code, add missing functions	2023-05-14 17:01:46 +02:00
Concedo	e01e373e63	Merge branch 'master' into concedo_experimental # Conflicts: # Makefile # ggml.c # llama.cpp	2023-05-14 11:34:41 +08:00
Georgi Gerganov	bda4d7c215	make : fix PERF build with cuBLAS	2023-05-13 17:25:09 +03:00
Concedo	cee8042793	integrated new version of clblast kernels as a separate file	2023-05-13 12:53:28 +08:00
Concedo	08810d5fee	interim merge. do not use	2023-05-13 00:33:55 +08:00
Concedo	e9caff1cda	Interim merge. Do not use. Merge branch 'master' into concedo_experimental # Conflicts: # README.md # SHA256SUMS # examples/quantize/quantize.cpp # ggml-opencl.c # ggml.c # ggml.h # llama.cpp # llama.h	2023-05-12 23:20:27 +08:00
Concedo	62beded0e7	Merge branch 'master' into concedo_experimental # Conflicts: # .github/workflows/build.yml # Makefile # README.md	2023-05-07 19:10:01 +08:00
DaniAndTheWeb	173d0e6419	makefile: automatic Arch Linux detection (#1332) This commit is a port of a detection method used in koboldcpp's Makefile in order to automatically set the -lcblas option on Arch Linux	2023-05-05 23:57:14 +02:00
Ionoclast Laboratories	2d13786e91	Fix for OpenCL / clbast builds on macOS. (#1329)	2023-05-05 14:18:21 +02:00
Concedo	7c129305f5	derp (+1 squashed commits) Squashed commits: [8fa8af7] suppress the rwkv Wwrite-strings warnings	2023-05-04 12:16:25 +08:00
Concedo	ede8e4edbb	Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt # Makefile # README.md	2023-05-03 23:34:50 +08:00
Concedo	105f818d45	integrated new version of rwkv from upstream	2023-05-03 23:26:39 +08:00
DannyDaemonic	55bc5f0900	Call sh on build-info.sh (#1294)	2023-05-02 17:52:35 -07:00
Concedo	94827172e0	Merge branch 'master' into concedo # Conflicts: # CMakeLists.txt # Makefile # ggml-cuda.cu # ggml-cuda.h	2023-05-02 14:38:31 +08:00
DannyDaemonic	f4cef87edf	Add git-based build information for better issue tracking (#1232) * Add git-based build information for better issue tracking * macOS fix * "build (hash)" and "CMAKE_SOURCE_DIR" changes * Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages * Fix conditional dependency on missing target * Broke out build-info.cmake, added find_package fallback, and added build into to all examples, added dependencies to Makefile * 4 space indenting for cmake, attempt to clean up my mess in Makefile * Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it	2023-05-01 18:23:47 +02:00
Concedo	3de34ee492	Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt # Makefile # ggml-opencl.c	2023-05-01 12:03:46 +08:00
Pavol Rusnak	6f79699286	build: add armv{6,7,8} support to cmake (#1251) - flags copied from Makefile - updated comments in both CMakeLists.txt and Makefile to match reality	2023-04-30 20:48:38 +02:00
Stephan Walter	f0d70f147d	Various fixes to mat_mul benchmark (#1253)	2023-04-30 12:32:37 +00:00
Concedo	b3315459c7	pilled the new dequants for clblast, fixed some ooms	2023-04-30 14:15:44 +08:00
Georgi Gerganov	214b6a3570	ggml : adjust mul_mat_f16 work memory (#1226) * llama : minor - remove explicity int64_t cast * ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS * ggml : add asserts to guard for incorrect wsize	2023-04-29 18:43:28 +03:00
Georgi Gerganov	305eb5afd5	build : fix reference to old llama_util.h	2023-04-29 13:53:12 +03:00
Concedo	bb282a4ecf	reinstated the q4_3 format, for backwards compatibility.	2023-04-29 11:42:04 +08:00
Concedo	0fc1772a8f	Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt # Makefile # README.md # ggml.c	2023-04-29 11:14:05 +08:00
slaren	7fc50c051a	cuBLAS: use host pinned memory and dequantize while copying (#1207) * cuBLAS: dequantize simultaneously while copying memory * cuBLAS: use host pinned memory * cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory * cuBLAS: also pin kv cache * fix rebase	2023-04-29 02:04:18 +02:00
0cc4m	7296c961d9	ggml : add CLBlast support (#1164 ) * Allow use of OpenCL GPU-based BLAS using ClBlast instead of OpenBLAS for context processing * Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers * Finish merge of ClBlast support * Move CLBlast implementation to separate file Add buffer reuse code (adapted from slaren's cuda implementation) * Add q4_2 and q4_3 CLBlast support, improve code * Double CLBlast speed by disabling OpenBLAS thread workaround Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com> Co-authored-by: slaren <2141330+slaren@users.noreply.github.com> * Fix device selection env variable names * Fix cast in opencl kernels * Add CLBlast to CMakeLists.txt * Replace buffer pool with static buffers a, b, qb, c Fix compile warnings * Fix typos, use GGML_TYPE defines, improve code * Improve btype dequant kernel selection code, add error if type is unsupported * Improve code quality * Move internal stuff out of header * Use internal enums instead of CLBlast enums * Remove leftover C++ includes and defines * Make event use easier to read Co-authored-by: Henri Vasserman <henv@hot.ee> * Use c compiler for opencl files * Simplify code, fix include * First check error, then release event * Make globals static, fix indentation * Rename dequant kernels file to conform with other file names * Fix import cl file name --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com> Co-authored-by: slaren <2141330+slaren@users.noreply.github.com> Co-authored-by: Henri Vasserman <henv@hot.ee> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-04-28 17:57:16 +03:00
Johannes Gäßler	92a6e13a31	Add Manjaro CUDA include and lib dirs to Makefile (#1212)	2023-04-28 15:40:32 +02:00
Concedo	032a171867	integrated q5 formats	2023-04-28 12:58:39 +08:00
Concedo	235daf4016	Merge branch 'master' into concedo # Conflicts: # .github/workflows/build.yml # README.md	2023-04-25 20:44:22 +08:00
slaren	e4cf982e0d	Fix cuda compilation (#1128) * Fix: Issue with CUBLAS compilation error due to missing -fPIC flag --------- Co-authored-by: B1gM8c <89020353+B1gM8c@users.noreply.github.com>	2023-04-24 17:29:58 +02:00
Concedo	59fb174678	fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for gpt neox gpt 2 and gptj	2023-04-24 23:20:06 +08:00
Concedo	8e615c8245	Merge branch 'master' into concedo_experimental # Conflicts: # README.md	2023-04-24 12:20:08 +08:00
Georgi Gerganov	e4422e299c	ggml : better PERF prints + support "LLAMA_PERF=1 make"	2023-04-23 18:15:39 +03:00
Concedo	1b7aa2b815	Merge branch 'master' into concedo # Conflicts: # .github/workflows/build.yml # CMakeLists.txt # Makefile	2023-04-22 16:22:08 +08:00
Georgi Gerganov	872c365a91	ggml : fix AVX build + update to new Q8_0 format	2023-04-22 11:08:12 +03:00
Concedo	7b3d04e5d4	Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt	2023-04-22 10:58:16 +08:00
Concedo	4fa3dfe8bc	just doesn't work properly on windows. will leave it as a manual flag for others	2023-04-22 10:57:38 +08:00
slaren	50cb666b8a	Improve cuBLAS performance by using a memory pool (#1094) * Improve cuBLAS performance by using a memory pool * Move cuda specific definitions to ggml-cuda.h/cu * Add CXX flags to nvcc * Change memory pool synchronization mechanism to a spin lock General code cleanup	2023-04-21 21:59:17 +02:00
Concedo	68898046c2	accidentally added the binaries onto repo again.	2023-04-22 00:41:19 +08:00
Concedo	f555db44ec	adding the libraries for cublas first. but i cannot get the kernel to work yet	2023-04-21 23:24:09 +08:00
Concedo	794a38a2e8	Revert "cublas is not feasible at this time. removed for now" This reverts commit `3687db7cf7`.	2023-04-21 21:02:40 +08:00

137 commits