14 commits

Author SHA1 Message Date
Concedo 86469d15c4 fix for yr-rocm, large gpu scratch 2023-06-30 12:40:08 +08:00
Concedo b4698abafc Wip, CUDA porting malloc improvements, gpu accel for non-llama, backport old quants 2023-06-28 18:20:46 +08:00
Concedo 490cf395f8 better alloc error 2023-06-23 22:51:51 +08:00
Concedo f39a746089 bug fixes for openblas 2023-06-23 22:45:22 +08:00
Concedo 43c2891afa option to not use scratch 2023-06-23 19:01:36 +08:00
Concedo d5e4cf7ffe handle ctx manip 2023-06-23 19:01:15 +08:00
Concedo e6ddb15c3a cleanup 2023-06-22 10:38:27 +08:00
Concedo 1b71752a9f Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX 2023-06-22 00:43:25 +08:00
Concedo 8e2dc19dc6 updated tokenizer, added support for scratch buffers for neox and gpt2 2023-06-19 21:29:06 +08:00
Concedo c046db5197 lite bugfixes, buffer size changes, fixed a topk bug. 2023-06-06 22:38:25 +08:00
Concedo 20803c221e cleaning up some old junk 2023-06-04 11:05:46 +08:00
Concedo b62279cb39 buf size for starcoder still not good 2023-06-04 00:41:08 +08:00
Concedo abb9ad789c fixed other arch 2023-05-24 00:20:43 +08:00
Concedo c048bcfec4 remove old filever checks (+7 squashed commit) 2023-05-21 00:15:39 +08:00
    Squashed commit:
    [b72627a] new format not working
    [e568870] old ver works
    [7053b77] compile errors fixed, fixing linkers
    [4ae8889] add new ver
    [ff82dfd] file format checks
    [25b8aa8] refactoring type names
    [931063b] still merging