Commit graph

31 commits

Author SHA1 Message Date
Concedo
c048bcfec4 remove old filever checks (+7 squashed commit)
Squashed commit:

[b72627a] new format not working

[e568870] old ver works

[7053b77] compile errors fixed, fixing linkers

[4ae8889] add new ver

[ff82dfd] file format checks

[25b8aa8] refactoring type names

[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30 still merging in process 2023-05-20 15:58:33 +08:00
Concedo
4e86a07e57 wip cleanup before big merge 2023-05-20 12:48:28 +08:00
Concedo
2c6ac06936 gpu offload not working for other arch. debug in future. 2023-05-17 17:13:01 +08:00
Concedo
57230b5196 upgrade all other formats 2023-05-17 16:28:20 +08:00
Concedo
90fe9096b4 clean and refactoring pass before supporting newer models for different arch 2023-05-17 11:23:29 +08:00
Concedo
94ef3e81cf inc allocation 2023-05-16 23:32:35 +08:00
Concedo
72836d4eac fixing more compile issues 2023-05-15 20:10:54 +08:00
Concedo
6504150fac just testing cublas 2023-05-15 20:01:22 +08:00
Concedo
c81dd58e76 Merge commit 'f954edda93' into archive_lib
# Conflicts:
#	ggml.c
2023-05-14 18:34:56 +08:00
Concedo
b692e4d2a4 wip 2023-05-14 17:21:07 +08:00
Concedo
05cf5f7d6e partially working, but the blas matmul is broken 2023-05-13 11:35:38 +08:00
Concedo
e47f7ade05 updated kobold lite, patch oom errors 2023-05-09 19:16:45 +08:00
Concedo
b3315459c7 pulled the new dequants for clblast, fixed some ooms 2023-04-30 14:15:44 +08:00
Concedo
032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo
59fb174678 fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for gpt neox gpt 2 and gptj 2023-04-24 23:20:06 +08:00
Concedo
432cc91649 still needs to be a bit higher for very small contexts 2023-04-23 15:01:38 +08:00
Concedo
4e1ea2ac61 hopefully fixed the ooms for good 2023-04-23 13:49:50 +08:00
Concedo
d41490c27b just revert back to the working commit 2023-04-23 00:35:42 +08:00
Concedo
b5d6284190 increase initial buffer too 2023-04-23 00:07:33 +08:00
Concedo
d2f14b2b1f add an extra buffer to mem allocations 2023-04-23 00:04:32 +08:00
Concedo
4fa3dfe8bc just doesn't work properly on windows. will leave it as a manual flag for others 2023-04-22 10:57:38 +08:00
Concedo
7ba36c2c6c trying to put out penguin based fires. sorry for inconvenience 2023-04-20 23:15:07 +08:00
Concedo
49697d86d8 adjusted down the buf memory allocation now that realloc seems to work 2023-04-20 17:51:13 +08:00
Concedo
cc407f283a messing around with memory allocation to bandaid the random ooms with various gpt2 and gptj models 2023-04-19 20:18:55 +08:00
Concedo
45ec09d31b fast forwarding for rwkv for unmodified contexts 2023-04-19 15:09:35 +08:00
Concedo
c757fbee1d fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite 2023-04-16 21:54:18 +08:00
Concedo
1bd5992da4 clean and refactor handling of flags 2023-04-12 23:25:31 +08:00
Concedo
69b85f5b61 fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size. 2023-04-11 00:14:57 +08:00
Concedo
18a154715e added version label, improved file type checks 2023-04-10 01:03:09 +08:00
Concedo
d8e37bfe75 new gpt2 format supported 2023-04-08 17:35:36 +08:00