Concedo
86469d15c4
fix for yr-rocm, large gpu scratch
2023-06-30 12:40:08 +08:00
Concedo
b4698abafc
WIP, CUDA porting malloc improvements, gpu accel for non-llama, backport old quants
2023-06-28 18:20:46 +08:00
Concedo
490cf395f8
better alloc error
2023-06-23 22:51:51 +08:00
Concedo
f39a746089
bug fixes for openblas
2023-06-23 22:45:22 +08:00
Concedo
43c2891afa
option to not use scratch
2023-06-23 19:01:36 +08:00
Concedo
d5e4cf7ffe
handle ctx manip
2023-06-23 19:01:15 +08:00
Concedo
e6ddb15c3a
cleanup
2023-06-22 10:38:27 +08:00
Concedo
1b71752a9f
Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX
2023-06-22 00:43:25 +08:00
Concedo
8e2dc19dc6
updated tokenizer, added support for scratch buffers for neox and gpt2
2023-06-19 21:29:06 +08:00
Concedo
c046db5197
lite bugfixes, buffer size changes, fixed a topk bug.
2023-06-06 22:38:25 +08:00
Concedo
20803c221e
cleaning up some old junk
2023-06-04 11:05:46 +08:00
Concedo
b62279cb39
buf size for starcoder still not good
2023-06-04 00:41:08 +08:00
Concedo
abb9ad789c
fixed other arch
2023-05-24 00:20:43 +08:00
Concedo
c048bcfec4
remove old filever checks (+7 squashed commit)
Squashed commit:
[b72627a] new format not working
[e568870] old ver works
[7053b77] compile errors fixed, fixing linkers
[4ae8889] add new ver
[ff82dfd] file format checks
[25b8aa8] refactoring type names
[931063b] still merging
2023-05-21 00:15:39 +08:00