Concedo
|
bfeb3471d7
|
fix typos
|
2023-07-03 21:36:42 +08:00 |
|
Concedo
|
3d2907d208
|
make gptneox and gptj work with extended context too
|
2023-07-02 18:28:09 +08:00 |
|
Concedo
|
ef3b8dc0d9
|
GPU accel for rwkv is slow, disable it
|
2023-07-02 00:41:46 +08:00 |
|
Concedo
|
e1a7042943
|
try out the new rwkv but it seems worse, may revert
|
2023-07-02 00:10:56 +08:00 |
|
Concedo
|
86469d15c4
|
fix for yr-rocm, large gpu scratch
|
2023-06-30 12:40:08 +08:00 |
|
Concedo
|
86b061b98c
|
wip on unified cublas integration, add all the small libraries but exclude the large ones
|
2023-06-29 18:35:31 +08:00 |
|
Concedo
|
c2f1ed6556
|
fix compile errors
|
2023-06-29 17:54:12 +08:00 |
|
Concedo
|
b4698abafc
|
Wip, CUDA porting malloc improvements, gpu accel for non-llama, backport old quants
|
2023-06-28 18:20:46 +08:00 |
|
Concedo
|
9527a783ea
|
fix rope inplace
|
2023-06-27 19:44:33 +08:00 |
|
Concedo
|
8342fe81b1
|
revert the wstring tokenization. coherency was affected
|
2023-06-24 12:58:49 +08:00 |
|
Concedo
|
0485fa65a2
|
wstring convert for mpt
|
2023-06-24 11:43:42 +08:00 |
|
Concedo
|
490cf395f8
|
better alloc error
|
2023-06-23 22:51:51 +08:00 |
|
Concedo
|
f39a746089
|
bug fixes for openblas
|
2023-06-23 22:45:22 +08:00 |
|
Concedo
|
43c2891afa
|
option to not use scratch
|
2023-06-23 19:01:36 +08:00 |
|
Concedo
|
d5e4cf7ffe
|
handle ctx manip
|
2023-06-23 19:01:15 +08:00 |
|
Concedo
|
df9135e3a9
|
fixing memory bugs
|
2023-06-23 18:41:23 +08:00 |
|
Concedo
|
e6ddb15c3a
|
cleanup
|
2023-06-22 10:38:27 +08:00 |
|
Concedo
|
1b71752a9f
|
Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX
|
2023-06-22 00:43:25 +08:00 |
|
Concedo
|
dfdd20240c
|
gpt j use scratch buffers
|
2023-06-21 16:10:31 +08:00 |
|
Concedo
|
8e2dc19dc6
|
updated tokenizer, added support for scratch buffers for neox and gpt2
|
2023-06-19 21:29:06 +08:00 |
|
Concedo
|
3ed3e7b7e2
|
reverted sequence mode for rwkv due to multiple issues with speed loss with bigger quantized models
|
2023-06-14 20:03:14 +08:00 |
|
Concedo
|
871009dfab
|
integrated world tokenizer for RWKV
|
2023-06-13 20:06:19 +08:00 |
|
Concedo
|
860fb026df
|
rwkv compile fix (+1 squashed commits)
Squashed commits:
[8b0ebb1] upgraded rwkv + added memory overheads + added state_out bufs
|
2023-06-12 23:04:40 +08:00 |
|
Concedo
|
c44b9c3ecf
|
added the llama_v2 cuda back (+2 squashed commit)
Squashed commit:
[1c97fd4] Revert "fix for cublas"
This reverts commit 994be9a4db.
[fce03c3] Revert "fix for cublas"
This reverts commit 33528f5b1d.
|
2023-06-11 23:23:24 +08:00 |
|
Concedo
|
a6a0fa338a
|
cleanup indentation, fixing cublas build
|
2023-06-08 22:40:53 +08:00 |
|
Concedo
|
c046db5197
|
lite bugfixes, buffer size changes, fixed a topk bug.
|
2023-06-06 22:38:25 +08:00 |
|
Concedo
|
9270056269
|
fixed compile error in cmake VS
|
2023-06-05 11:48:04 +08:00 |
|
Concedo
|
9aa2d8535b
|
hide gpu input box when dropdown not selected, minor memory fix for neox and gptj
|
2023-06-04 21:47:17 +08:00 |
|
Concedo
|
20803c221e
|
cleaning up some old junk
|
2023-06-04 11:05:46 +08:00 |
|
Concedo
|
b62279cb39
|
buf size for starcoder still not good
|
2023-06-04 00:41:08 +08:00 |
|
Concedo
|
c1b293d31a
|
fixed MPT ooms
|
2023-06-03 18:37:13 +08:00 |
|
Concedo
|
6f82e17b7a
|
added MPT support
|
2023-06-03 16:14:08 +08:00 |
|
Concedo
|
234270bd83
|
back to 32 block size, not better
|
2023-06-01 00:14:22 +08:00 |
|
Concedo
|
446e42a8c6
|
change dmmv block size
|
2023-05-31 21:40:12 +08:00 |
|
Concedo
|
6b3373cb81
|
revert bad fix
|
2023-05-29 22:06:12 +08:00 |
|
Concedo
|
ef16d09a51
|
fix for older gcc, updated lite
|
2023-05-29 18:54:15 +08:00 |
|
Concedo
|
97b39f875c
|
fixed fstat64 build error on mac
|
2023-05-29 15:50:07 +08:00 |
|
Concedo
|
55e0fbf024
|
wip integrating new rwkv
|
2023-05-27 22:45:28 +08:00 |
|
Concedo
|
6d7749c98f
|
no difference
|
2023-05-27 12:42:19 +08:00 |
|
Concedo
|
bd4fe936f5
|
cleanup sampling code
|
2023-05-27 11:58:39 +08:00 |
|
Concedo
|
bf482d1786
|
revert klite newline bug, trying to add win7 support
|
2023-05-24 22:21:01 +08:00 |
|
Concedo
|
844f92688a
|
subpattern fix
|
2023-05-24 16:48:39 +08:00 |
|
Concedo
|
abb9ad789c
|
fixed other arch
|
2023-05-24 00:20:43 +08:00 |
|
Concedo
|
981d5ba866
|
Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# Makefile
# README.md
# ggml-opencl.cpp
# llama.cpp
# otherarch/ggml_v2-opencl-legacy.c
|
2023-05-22 16:16:48 +08:00 |
|
Concedo
|
587308a202
|
fixed some build errors on linux, changed icon resolution, added more error printing
|
2023-05-22 12:18:42 +08:00 |
|
Concedo
|
fea84c3cf5
|
fix for stupid msvc compiler
|
2023-05-21 22:41:33 +08:00 |
|
Concedo
|
60e0c67874
|
fix compile errors on cuda
|
2023-05-21 21:13:17 +08:00 |
|
Concedo
|
33528f5b1d
|
fix for cublas
|
2023-05-21 21:03:36 +08:00 |
|
Concedo
|
994be9a4db
|
fix for cublas
|
2023-05-21 21:02:21 +08:00 |
|
Concedo
|
24127ebf98
|
updated lite, fixed some encoding issues
|
2023-05-21 17:29:00 +08:00 |
|