31 commits

Author SHA1 Message Date
Concedo 032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo e8a389f85b updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching 2023-04-28 11:41:03 +08:00
Concedo 5070815dcf fixing discussion #121 and issue #122 2023-04-27 16:10:01 +08:00
Concedo a696b0a16c missed another thing 2023-04-25 23:16:04 +08:00
Concedo 8c9c218609 missed a thing 2023-04-25 23:02:08 +08:00
Concedo 5eec5d6ed9 Added backwards compatibility to an earlier version of NeoX. 2023-04-25 20:34:18 +08:00
Concedo 3962eb39c7 added token unbanning 2023-04-24 21:50:20 +08:00
Concedo 1b9b9068b1 merged q4_2 and q4_3 dequants and FIXED CLBLAST SLOWNESS! 2023-04-24 21:33:01 +08:00
Concedo 9129e937f9 only llama can use batch sizes above 256 to prevent unacceptably high memory usage 2023-04-23 15:57:06 +08:00
Concedo 6e908c1792 added lora support 2023-04-22 12:29:38 +08:00
Concedo c454f8b848 Gpt NeoX / Pythia integration completed 2023-04-22 11:23:25 +08:00
Concedo ef13443047 wip pythia integration 2023-04-22 01:08:23 +08:00
Concedo 68898046c2 accidentally added the binaries onto repo again. 2023-04-22 00:41:19 +08:00
Concedo 5160053e51 merged llama adapter into the rest of the gpt adapters 2023-04-21 17:47:48 +08:00
Concedo 45ec09d31b fast forwarding for rwkv for unmodified contexts 2023-04-19 15:09:35 +08:00
Concedo ea01771dd5 rwkv is done 2023-04-18 20:55:01 +08:00
Concedo c200b674f4 updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter 2023-04-18 17:36:44 +08:00
Concedo 763ad172c0 arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation 2023-04-17 17:31:45 +08:00
Concedo bee6a401fd slight clarity fix 2023-04-16 22:04:19 +08:00
Concedo c757fbee1d fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite 2023-04-16 21:54:18 +08:00
Concedo 6548d3b3fb Added prints for stopping sequences, made makefile 1% friendlier to arch linux users 2023-04-16 20:43:17 +08:00
Concedo 525184930d added a kobold API compatible implementation of stopping sequences 2023-04-16 18:37:49 +08:00
Concedo ad5676810a merge CLBlast improvements - GPU dequant 2023-04-16 01:17:40 +08:00
Concedo 8ad42a1102 read from inputs 2023-04-14 21:30:26 +08:00
Concedo adb4df78d6 Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation. 2023-04-14 21:24:16 +08:00
Concedo 5c22f7e4c4 reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode. 2023-04-13 11:32:05 +08:00
Concedo 69b85f5b61 fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size. 2023-04-11 00:14:57 +08:00
Concedo 18a154715e added version label, improved file type checks 2023-04-10 01:03:09 +08:00
Concedo b91abc3316 increase default blas batch size 2023-04-09 15:27:43 +08:00
Concedo d8e37bfe75 new gpt2 format supported 2023-04-08 17:35:36 +08:00
Concedo 14273fea7a integrated gpt2 support 2023-04-04 23:15:47 +08:00
Renamed from gptj_adapter.cpp