39 commits

Author SHA1 Message Date
Concedo
8a964e76c8 integrated mirostat as a launch parameter, works on all models 2023-05-06 00:47:17 +08:00
Hendrik Langer
8131bc8b56 add new sampling algorithm mirostat 2023-05-05 13:23:47 +02:00
Concedo
4857739ab5 allow specifying a different thread count for GPU blas 2023-05-03 21:19:59 +08:00
Concedo
966cd2ce91 Merge remote-tracking branch 'temp/concedo' into concedo_experimental (conflicts: koboldcpp.py) 2023-05-02 22:43:34 +08:00
Concedo
25201233ca fixed unbantokens not following EOS 2023-05-01 00:02:45 +08:00
Concedo
7afad2b9b5 integrated the new samplers 2023-04-29 19:41:41 +08:00
Concedo
fe0e4de8e8 fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken 2023-04-29 18:25:17 +08:00
Concedo
bb282a4ecf reinstated the q4_3 format, for backwards compatibility. 2023-04-29 11:42:04 +08:00
Concedo
032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo
e8a389f85b updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching 2023-04-28 11:41:03 +08:00
Concedo
5070815dcf fixing discussion #121 and issue #122 2023-04-27 16:10:01 +08:00
Concedo
a696b0a16c missed another thing 2023-04-25 23:16:04 +08:00
Concedo
8c9c218609 missed a thing 2023-04-25 23:02:08 +08:00
Concedo
5eec5d6ed9 Added backwards compatibility to an earlier version of NeoX. 2023-04-25 20:34:18 +08:00
Concedo
3962eb39c7 added token unbanning 2023-04-24 21:50:20 +08:00
Concedo
1b9b9068b1 merged q4_2 and q4_3 dequants and FIXED CLBLAST SLOWNESS! 2023-04-24 21:33:01 +08:00
Concedo
9129e937f9 only llama can use batch sizes above 256 to prevent unacceptably high memory usage 2023-04-23 15:57:06 +08:00
Concedo
6e908c1792 added lora support 2023-04-22 12:29:38 +08:00
Concedo
c454f8b848 Gpt NeoX / Pythia integration completed 2023-04-22 11:23:25 +08:00
Concedo
ef13443047 wip pythia integration 2023-04-22 01:08:23 +08:00
Concedo
68898046c2 accidentally added the binaries onto repo again. 2023-04-22 00:41:19 +08:00
Concedo
5160053e51 merged llama adapter into the rest of the gpt adapters 2023-04-21 17:47:48 +08:00
Concedo
45ec09d31b fast forwarding for rwkv for unmodified contexts 2023-04-19 15:09:35 +08:00
Concedo
ea01771dd5 rwkv is done 2023-04-18 20:55:01 +08:00
Concedo
c200b674f4 updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter 2023-04-18 17:36:44 +08:00
Concedo
763ad172c0 arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation 2023-04-17 17:31:45 +08:00
Concedo
bee6a401fd slight clarity fix 2023-04-16 22:04:19 +08:00
Concedo
c757fbee1d fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite 2023-04-16 21:54:18 +08:00
Concedo
6548d3b3fb Added prints for stopping sequences, made makefile 1% friendlier to arch linux users 2023-04-16 20:43:17 +08:00
Concedo
525184930d added a kobold API compatible implementation of stopping sequences 2023-04-16 18:37:49 +08:00
Concedo
ad5676810a merge CLBlast improvements - GPU dequant 2023-04-16 01:17:40 +08:00
Concedo
8ad42a1102 read from inputs 2023-04-14 21:30:26 +08:00
Concedo
adb4df78d6 Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation. 2023-04-14 21:24:16 +08:00
Concedo
5c22f7e4c4 reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode. 2023-04-13 11:32:05 +08:00
Concedo
69b85f5b61 fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size. 2023-04-11 00:14:57 +08:00
Concedo
18a154715e added version label, improved file type checks 2023-04-10 01:03:09 +08:00
Concedo
b91abc3316 increase default blas batch size 2023-04-09 15:27:43 +08:00
Concedo
d8e37bfe75 new gpt2 format supported 2023-04-08 17:35:36 +08:00
Concedo
14273fea7a integrated gpt2 support 2023-04-04 23:15:47 +08:00
Renamed from gptj_adapter.cpp
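
Commit fe0e4de8e8 above describes a sanity check: evaluate the model on two different inputs and treat it as broken if the logits come out the same. The following is only a minimal sketch of that idea, not the code from the commit. The names `looks_broken`, `eval_logits`, `broken_model`, and `healthy_model`, and the tolerance `tol`, are hypothetical and chosen for illustration.

```python
from typing import Callable, Sequence


def looks_broken(
    eval_logits: Callable[[Sequence[int]], Sequence[float]],
    tokens_a: Sequence[int],
    tokens_b: Sequence[int],
    tol: float = 1e-6,
) -> bool:
    """Return True if two different inputs yield (near-)identical logits.

    eval_logits is a hypothetical callable standing in for one model
    evaluation; it maps a token sequence to a list of logits.
    """
    if list(tokens_a) == list(tokens_b):
        raise ValueError("the two probe inputs must differ")
    logits_a = eval_logits(tokens_a)
    logits_b = eval_logits(tokens_b)
    if len(logits_a) != len(logits_b):
        # Different output sizes already mean the outputs are not identical.
        return False
    # Identical logits for different inputs suggest the model ignores its input.
    return all(abs(x - y) <= tol for x, y in zip(logits_a, logits_b))


# Stub "models" for demonstration only (assumptions, not real loaders).
def broken_model(tokens: Sequence[int]) -> list:
    return [0.0, 1.0, 2.0]  # ignores the input entirely


def healthy_model(tokens: Sequence[int]) -> list:
    return [float(sum(tokens)), 1.0, 2.0]  # output depends on the input
```

With these stubs, `looks_broken(broken_model, [1, 2], [3, 4])` flags the model, while the same call with `healthy_model` does not. A tolerance is used instead of exact equality as a design choice of this sketch; the commit message does not say how the comparison is done.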