Commit graph

137 commits

Author SHA1 Message Date
Concedo
cd4012c3ed minor fixes to debug logging, fixed a typo, added a new failsafe mode 2023-05-23 21:31:42 +08:00
Concedo
b9f06a7670 mavx only for windows by default, let them eat march native. 2023-05-22 16:48:55 +08:00
Concedo
169a26d15f removed unused build targets 2023-05-22 13:53:10 +08:00
Concedo
587308a202 fixed some build errors on linux, changed icon resolution, added more error printing 2023-05-22 12:18:42 +08:00
Concedo
c048bcfec4 remove old filever checks (+7 squashed commit)
Squashed commit:

[b72627a] new format not working

[e568870] old ver works

[7053b77] compile errors fixed, fixing linkers

[4ae8889] add new ver

[ff82dfd] file format checks

[25b8aa8] refactoring type names

[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
f561fe5a4a switch back to ofast for c 2023-05-17 10:04:54 +08:00
Concedo
504a2aa874 Merge remote-tracking branch 'fixmake/concedo' into concedo_experimental 2023-05-17 10:01:57 +08:00
horenbergerb
f29c25e7a1 hacky fix for linux cublas build 2023-05-16 12:29:04 -04:00
Concedo
196fbba527 Merge branch 'opencl-dev2' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
2023-05-16 17:04:33 +08:00
Concedo
e4e6994353 Not working, don't use. testing a merge 2023-05-16 12:33:24 +08:00
0cc4m
c77966524a Refactor OpenCL code to work more like the CUDA code, add missing functions 2023-05-14 17:01:46 +02:00
Concedo
e01e373e63 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	ggml.c
#	llama.cpp
2023-05-14 11:34:41 +08:00
Georgi Gerganov
bda4d7c215 make : fix PERF build with cuBLAS 2023-05-13 17:25:09 +03:00
Concedo
cee8042793 integrated new version of clblast kernels as a separate file 2023-05-13 12:53:28 +08:00
Concedo
08810d5fee interim merge. do not use 2023-05-13 00:33:55 +08:00
Concedo
e9caff1cda Interim merge. Do not use.
Merge branch 'master' into concedo_experimental

# Conflicts:
#	README.md
#	SHA256SUMS
#	examples/quantize/quantize.cpp
#	ggml-opencl.c
#	ggml.c
#	ggml.h
#	llama.cpp
#	llama.h
2023-05-12 23:20:27 +08:00
Concedo
62beded0e7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	Makefile
#	README.md
2023-05-07 19:10:01 +08:00
DaniAndTheWeb
173d0e6419 makefile: automatic Arch Linux detection (#1332)
This commit is a port of a detection method used in koboldcpp's Makefile in order to automatically set the -lcblas option on Arch Linux
2023-05-05 23:57:14 +02:00
Ionoclast Laboratories
2d13786e91 Fix for OpenCL / CLBlast builds on macOS. (#1329) 2023-05-05 14:18:21 +02:00
Concedo
7c129305f5 derp (+1 squashed commits)
Squashed commits:

[8fa8af7] suppress the rwkv Wwrite-strings warnings
2023-05-04 12:16:25 +08:00
Concedo
ede8e4edbb Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
2023-05-03 23:34:50 +08:00
Concedo
105f818d45 integrated new version of rwkv from upstream 2023-05-03 23:26:39 +08:00
DannyDaemonic
55bc5f0900 Call sh on build-info.sh (#1294) 2023-05-02 17:52:35 -07:00
Concedo
94827172e0 Merge branch 'master' into concedo
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	ggml-cuda.cu
#	ggml-cuda.h
2023-05-02 14:38:31 +08:00
DannyDaemonic
f4cef87edf Add git-based build information for better issue tracking (#1232)
* Add git-based build information for better issue tracking

* macOS fix

* "build (hash)" and "CMAKE_SOURCE_DIR" changes

* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages

* Fix conditional dependency on missing target

* Broke out build-info.cmake, added find_package fallback, and added build info to all examples, added dependencies to Makefile

* 4 space indenting for cmake, attempt to clean up my mess in Makefile

* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
2023-05-01 18:23:47 +02:00
Concedo
3de34ee492 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	ggml-opencl.c
2023-05-01 12:03:46 +08:00
Pavol Rusnak
6f79699286 build: add armv{6,7,8} support to cmake (#1251)
- flags copied from Makefile
- updated comments in both CMakeLists.txt and Makefile to match reality
2023-04-30 20:48:38 +02:00
Stephan Walter
f0d70f147d Various fixes to mat_mul benchmark (#1253) 2023-04-30 12:32:37 +00:00
Concedo
b3315459c7 pulled the new dequants for clblast, fixed some ooms 2023-04-30 14:15:44 +08:00
Georgi Gerganov
214b6a3570 ggml : adjust mul_mat_f16 work memory (#1226)
* llama : minor - remove explicit int64_t cast

* ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS

* ggml : add asserts to guard for incorrect wsize
2023-04-29 18:43:28 +03:00
Georgi Gerganov
305eb5afd5 build : fix reference to old llama_util.h 2023-04-29 13:53:12 +03:00
Concedo
bb282a4ecf reinstated the q4_3 format, for backwards compatibility. 2023-04-29 11:42:04 +08:00
Concedo
0fc1772a8f Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
#	ggml.c
2023-04-29 11:14:05 +08:00
slaren
7fc50c051a cuBLAS: use host pinned memory and dequantize while copying (#1207)
* cuBLAS: dequantize simultaneously while copying memory

* cuBLAS: use host pinned memory

* cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory

* cuBLAS: also pin kv cache

* fix rebase
2023-04-29 02:04:18 +02:00
0cc4m
7296c961d9 ggml : add CLBlast support (#1164)
* Allow use of OpenCL GPU-based BLAS using ClBlast instead of OpenBLAS for context processing

* Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers

* Finish merge of ClBlast support

* Move CLBlast implementation to separate file

Add buffer reuse code (adapted from slaren's cuda implementation)

* Add q4_2 and q4_3 CLBlast support, improve code

* Double CLBlast speed by disabling OpenBLAS thread workaround

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>

* Fix device selection env variable names

* Fix cast in opencl kernels

* Add CLBlast to CMakeLists.txt

* Replace buffer pool with static buffers a, b, qb, c

Fix compile warnings

* Fix typos, use GGML_TYPE defines, improve code

* Improve btype dequant kernel selection code, add error if type is unsupported

* Improve code quality

* Move internal stuff out of header
* Use internal enums instead of CLBlast enums
* Remove leftover C++ includes and defines
* Make event use easier to read

Co-authored-by: Henri Vasserman <henv@hot.ee>

* Use c compiler for opencl files

* Simplify code, fix include

* First check error, then release event

* Make globals static, fix indentation

* Rename dequant kernels file to conform with other file names

* Fix import cl file name

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Henri Vasserman <henv@hot.ee>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-28 17:57:16 +03:00
Johannes Gäßler
92a6e13a31 Add Manjaro CUDA include and lib dirs to Makefile (#1212) 2023-04-28 15:40:32 +02:00
Concedo
032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo
235daf4016 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	README.md
2023-04-25 20:44:22 +08:00
slaren
e4cf982e0d Fix cuda compilation (#1128)
* Fix: Issue with CUBLAS compilation error due to missing -fPIC flag

---------

Co-authored-by: B1gM8c <89020353+B1gM8c@users.noreply.github.com>
2023-04-24 17:29:58 +02:00
Concedo
59fb174678 fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for gpt neox gpt 2 and gptj 2023-04-24 23:20:06 +08:00
Concedo
8e615c8245 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-04-24 12:20:08 +08:00
Georgi Gerganov
e4422e299c ggml : better PERF prints + support "LLAMA_PERF=1 make" 2023-04-23 18:15:39 +03:00
Concedo
1b7aa2b815 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
2023-04-22 16:22:08 +08:00
Georgi Gerganov
872c365a91 ggml : fix AVX build + update to new Q8_0 format 2023-04-22 11:08:12 +03:00
Concedo
7b3d04e5d4 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
2023-04-22 10:58:16 +08:00
Concedo
4fa3dfe8bc just doesn't work properly on windows. will leave it as a manual flag for others 2023-04-22 10:57:38 +08:00
slaren
50cb666b8a Improve cuBLAS performance by using a memory pool (#1094)
* Improve cuBLAS performance by using a memory pool

* Move cuda specific definitions to ggml-cuda.h/cu

* Add CXX flags to nvcc

* Change memory pool synchronization mechanism to a spin lock
General code cleanup
2023-04-21 21:59:17 +02:00
Concedo
68898046c2 accidentally added the binaries onto repo again. 2023-04-22 00:41:19 +08:00
Concedo
f555db44ec adding the libraries for cublas first. but i cannot get the kernel to work yet 2023-04-21 23:24:09 +08:00
Concedo
794a38a2e8 Revert "cublas is not feasible at this time. removed for now"
This reverts commit 3687db7cf7.
2023-04-21 21:02:40 +08:00