123 commits

Author SHA1 Message Date
Concedo
ba2040d1df compile fix for ARM NEON 2023-08-03 12:52:06 +08:00
Concedo
34e60be41a compile fix 2023-08-03 10:36:14 +08:00
Concedo
c58ffc92e5 fixed compile error 2023-08-01 18:28:49 +08:00
Concedo
45456fa6ca switch noavx2 to not use openblas, as it has incompatible instructions 2023-07-30 16:47:33 +08:00
Concedo
2807d98fd4 touchup (+2 squashed commit)
Squashed commit:

[8b06458] fixed broken param order

[7eabdc0] very broken, do not use
2023-07-22 22:57:56 +08:00
Concedo
374fffb9c6 Reworking rope WIP 2023-07-19 00:54:41 +08:00
Concedo
523fc3be52 fixed rwkv, standardized new ctx usage 2023-07-10 20:05:53 +08:00
Concedo
2827920044 fix compile errors, rwkv not working 2023-07-10 18:23:25 +08:00
Concedo
27a0907cfa backport MM256_SET_M128I to ggml_v2, updated lite, added support for selecting the GPU for cublas 2023-07-06 22:33:46 +08:00
Concedo
ca9a11697c possibly slower, but cannot use larger batches without modifying ggml library. 2023-07-04 00:35:02 +08:00
Concedo
bfeb3471d7 fix typos 2023-07-03 21:36:42 +08:00
Concedo
3d2907d208 make gptneox and gptj work with extended context too 2023-07-02 18:28:09 +08:00
Concedo
ef3b8dc0d9 GPU accel for rwkv is slow, disable it 2023-07-02 00:41:46 +08:00
Concedo
e1a7042943 try out the new rwkv but it seems worse, may revert 2023-07-02 00:10:56 +08:00
Concedo
86469d15c4 fix for yr-rocm, large gpu scratch 2023-06-30 12:40:08 +08:00
Concedo
86b061b98c wip on unified cublas integration, add all the small libraries but exclude the large ones 2023-06-29 18:35:31 +08:00
Concedo
c2f1ed6556 fix compile errors 2023-06-29 17:54:12 +08:00
Concedo
b4698abafc Wip, CUDA porting malloc improvements, gpu accel for non-llama, backport old quants 2023-06-28 18:20:46 +08:00
Concedo
9527a783ea fix rope inplace 2023-06-27 19:44:33 +08:00
Concedo
8342fe81b1 revert the wstring tokenization. coherency was affected 2023-06-24 12:58:49 +08:00
Concedo
0485fa65a2 wstring convert for mpt 2023-06-24 11:43:42 +08:00
Concedo
490cf395f8 better alloc error 2023-06-23 22:51:51 +08:00
Concedo
f39a746089 bug fixes for openblas 2023-06-23 22:45:22 +08:00
Concedo
43c2891afa option to not use scratch 2023-06-23 19:01:36 +08:00
Concedo
d5e4cf7ffe handle ctx manip 2023-06-23 19:01:15 +08:00
Concedo
df9135e3a9 fixing memory bugs 2023-06-23 18:41:23 +08:00
Concedo
e6ddb15c3a cleanup 2023-06-22 10:38:27 +08:00
Concedo
1b71752a9f Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX 2023-06-22 00:43:25 +08:00
Concedo
dfdd20240c gpt j use scratch buffers 2023-06-21 16:10:31 +08:00
Concedo
8e2dc19dc6 updated tokenizer, added support for scratch buffers for neox and gpt2 2023-06-19 21:29:06 +08:00
Concedo
3ed3e7b7e2 reverted sequence mode for rwkv due to multiple issues with speed loss with bigger quantized models 2023-06-14 20:03:14 +08:00
Concedo
871009dfab integrated world tokenizer for RWKV 2023-06-13 20:06:19 +08:00
Concedo
860fb026df rwkv compile fix (+1 squashed commits)
Squashed commits:

[8b0ebb1] upgraded rwkv + added memory overheads + added state_out bufs
2023-06-12 23:04:40 +08:00
Concedo
c44b9c3ecf added the llama_v2 cuda back (+2 squashed commit)
Squashed commit:

[1c97fd4] Revert "fix for cublas"

This reverts commit 994be9a4db.

[fce03c3] Revert "fix for cublas"

This reverts commit 33528f5b1d.
2023-06-11 23:23:24 +08:00
Concedo
a6a0fa338a cleanup indentation, fixing cublas build 2023-06-08 22:40:53 +08:00
Concedo
c046db5197 lite bugfixes, buffer size changes, fixed a topk bug. 2023-06-06 22:38:25 +08:00
Concedo
9270056269 fixed compile error in cmake VS 2023-06-05 11:48:04 +08:00
Concedo
9aa2d8535b hide gpu input box when dropdown not selected, minor memory fix for neox and gptj 2023-06-04 21:47:17 +08:00
Concedo
20803c221e cleaning up some old junk 2023-06-04 11:05:46 +08:00
Concedo
b62279cb39 buf size for starcoder still not good 2023-06-04 00:41:08 +08:00
Concedo
c1b293d31a fixed MPT ooms 2023-06-03 18:37:13 +08:00
Concedo
6f82e17b7a added MPT support 2023-06-03 16:14:08 +08:00
Concedo
234270bd83 back to 32 block size, not better 2023-06-01 00:14:22 +08:00
Concedo
446e42a8c6 change dmmv block size 2023-05-31 21:40:12 +08:00
Concedo
6b3373cb81 revert bad fix 2023-05-29 22:06:12 +08:00
Concedo
ef16d09a51 fix for older gcc, updated lite 2023-05-29 18:54:15 +08:00
Concedo
97b39f875c fixed fstat64 build error on mac 2023-05-29 15:50:07 +08:00
Concedo
55e0fbf024 wip integrating new rwkv 2023-05-27 22:45:28 +08:00
Concedo
6d7749c98f no difference 2023-05-27 12:42:19 +08:00
Concedo
bd4fe936f5 cleanup sampling code 2023-05-27 11:58:39 +08:00