Commit graph

57 commits

Author SHA1 Message Date
Concedo
d418146535 fixed a token decoding bug 2023-05-21 00:53:20 +08:00
Concedo
5032e0fd64 trying to fix ggjt v3 2023-05-21 00:29:50 +08:00
Concedo
c048bcfec4 remove old filever checks (+7 squashed commit)
Squashed commit:

[b72627a] new format not working

[e568870] old ver works

[7053b77] compile errors fixed, fixing linkers

[4ae8889] add new ver

[ff82dfd] file format checks

[25b8aa8] refactoring type names

[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30 still merging in process 2023-05-20 15:58:33 +08:00
Concedo
a8958f6b76 merging, do not use 2023-05-20 15:12:31 +08:00
Concedo
010b2753d9 Merge commit '6986c7835a' into concedo_experimental
# Conflicts:
#	README.md
2023-05-20 11:30:51 +08:00
Concedo
487ac226b4 need to set the unshuffle before loading the model 2023-05-17 17:58:21 +08:00
Concedo
2c6ac06936 gpu offload not working for other arch. debug in future. 2023-05-17 17:13:01 +08:00
Concedo
00da2a5f4e neox is updated 2023-05-17 14:56:54 +08:00
Concedo
90fe9096b4 clean and refactoring pass before supporting newer models for different arch 2023-05-17 11:23:29 +08:00
Concedo
466cd21368 test cmakefile for cublas. 2023-05-15 14:50:38 +08:00
Concedo
b692e4d2a4 wip 2023-05-14 17:21:07 +08:00
Concedo
8a5fe628df recognize q8_0 as an older format as the new clblast doesn't work correctly with it 2023-05-14 11:06:23 +08:00
Concedo
e05455f852 fixed wrong sized struct from legacy q8_1, fixed opencl varsize arrays 2023-05-13 23:56:08 +08:00
Concedo
05cf5f7d6e partially working, but the blas matmul is broken 2023-05-13 11:35:38 +08:00
Concedo
54194911ac Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-05-09 16:50:43 +08:00
Concedo
2f2eff6e13 the dark gods have been sated, and redpajama is integrated... but at what cost? 2023-05-08 20:58:00 +08:00
Concedo
62beded0e7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	Makefile
#	README.md
2023-05-07 19:10:01 +08:00
Concedo
8a964e76c8 integrated mirostat as a launch parameter, works on all models 2023-05-06 00:47:17 +08:00
Hendrik Langer
8131bc8b56 add new sampling algorithm mirostat 2023-05-05 13:23:47 +02:00
Concedo
4857739ab5 allow specifying a different thread count for GPU blas 2023-05-03 21:19:59 +08:00
Concedo
966cd2ce91 Merge remote-tracking branch 'temp/concedo' into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-05-02 22:43:34 +08:00
Concedo
25201233ca fixed unbantokens not following EOS 2023-05-01 00:02:45 +08:00
Concedo
7afad2b9b5 integrated the new samplers 2023-04-29 19:41:41 +08:00
Concedo
fe0e4de8e8 fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken 2023-04-29 18:25:17 +08:00
Concedo
bb282a4ecf reinstated the q4_3 format, for backwards compatibility. 2023-04-29 11:42:04 +08:00
Concedo
032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo
e8a389f85b updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching 2023-04-28 11:41:03 +08:00
Concedo
5070815dcf fixing discussion #121 and issue #122 2023-04-27 16:10:01 +08:00
Concedo
a696b0a16c missed another thing 2023-04-25 23:16:04 +08:00
Concedo
8c9c218609 missed a thing 2023-04-25 23:02:08 +08:00
Concedo
5eec5d6ed9 Added backwards compatibility to an earlier version of NeoX. 2023-04-25 20:34:18 +08:00
Concedo
3962eb39c7 added token unbanning 2023-04-24 21:50:20 +08:00
Concedo
1b9b9068b1 merged q4_2 and q4_3 dequants and FIXED CLBLAST SLOWNESS! 2023-04-24 21:33:01 +08:00
Concedo
9129e937f9 only llama can use batch sizes above 256 to prevent unacceptably high memory usage 2023-04-23 15:57:06 +08:00
Concedo
6e908c1792 added lora support 2023-04-22 12:29:38 +08:00
Concedo
c454f8b848 Gpt NeoX / Pythia integration completed 2023-04-22 11:23:25 +08:00
Concedo
ef13443047 wip pythia integration 2023-04-22 01:08:23 +08:00
Concedo
68898046c2 accidentally added the binaries onto repo again. 2023-04-22 00:41:19 +08:00
Concedo
5160053e51 merged llama adapter into the rest of the gpt adapters 2023-04-21 17:47:48 +08:00
Concedo
45ec09d31b fast forwarding for rwkv for unmodified contexts 2023-04-19 15:09:35 +08:00
Concedo
ea01771dd5 rwkv is done 2023-04-18 20:55:01 +08:00
Concedo
c200b674f4 updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter 2023-04-18 17:36:44 +08:00
Concedo
763ad172c0 arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation 2023-04-17 17:31:45 +08:00
Concedo
bee6a401fd slight clarity fix 2023-04-16 22:04:19 +08:00
Concedo
c757fbee1d fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite 2023-04-16 21:54:18 +08:00
Concedo
6548d3b3fb Added prints for stopping sequences, made makefile 1% friendlier to arch linux users 2023-04-16 20:43:17 +08:00
Concedo
525184930d added a kobold API compatible implementation of stopping sequences 2023-04-16 18:37:49 +08:00
Concedo
ad5676810a merge CLBlast improvements - GPU dequant 2023-04-16 01:17:40 +08:00
Concedo
8ad42a1102 read from inputs 2023-04-14 21:30:26 +08:00
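
Commit fe0e4de8e8 above describes a load-time sanity check: evaluate the model on two different inputs and treat it as broken if both produce the same logits. The following is a minimal sketch of that idea only, not the repository's actual code. The names `model_looks_broken`, `eval_fn`, `constant_model`, and `toy_model`, the probe inputs, and the tolerance are all hypothetical and chosen for illustration.

```python
from typing import Callable, Sequence

# Hypothetical evaluation hook: takes token ids, returns next-token logits.
EvalFn = Callable[[Sequence[int]], Sequence[float]]


def model_looks_broken(
    eval_fn: EvalFn,
    input_a: Sequence[int] = (1, 2, 3),
    input_b: Sequence[int] = (4, 5, 6),
    tol: float = 1e-6,
) -> bool:
    """Return True if two different inputs yield (near-)identical logits."""
    logits_a = eval_fn(input_a)
    logits_b = eval_fn(input_b)
    # No logits at all is treated as a broken model in this sketch.
    if len(logits_a) == 0 or len(logits_b) == 0:
        return True
    # Different lengths means the outputs are not identical.
    if len(logits_a) != len(logits_b):
        return False
    # A working model should react to its input; identical outputs are suspicious.
    return all(abs(x - y) <= tol for x, y in zip(logits_a, logits_b))


def constant_model(tokens: Sequence[int]) -> list[float]:
    """Stand-in for a badly loaded model: ignores its input entirely."""
    return [0.1, 0.2, 0.7]


def toy_model(tokens: Sequence[int]) -> list[float]:
    """Stand-in for a working model: output depends on the input tokens."""
    total = sum(tokens)
    return [float(total), float(total % 3), float(len(tokens))]


if __name__ == "__main__":
    print(model_looks_broken(constant_model))
    print(model_looks_broken(toy_model))
```

A check like this only catches models whose output ignores the input; it does not show that a model's logits are correct.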