77 commits

Author SHA1 Message Date
Concedo
9b6c35b651 rwkv speed enhancements (batch processing), fixed a rwkv token processing bug 2023-06-13 16:02:12 +08:00
Concedo
66a3f4e421 added support for lora base 2023-06-10 19:29:45 +08:00
Concedo
43f7e40470 added extra endpoints for abort gen and polled streaming 2023-06-10 18:13:26 +08:00
Concedo
b92f9fe3a2 Merge remote-tracking branch 'sammcheese/sammcheese/tokenstreaming' into concedo_experimental 2023-06-09 20:41:02 +08:00
12Boti
e1ab14c4ab fix format string vulnerability (#223) 2023-06-09 20:16:03 +08:00
SammCheese
e6231c3055 back to http.server, improved implementation 2023-06-09 12:17:55 +02:00
SammCheese
9a8da35ec4 working streaming. TODO: fix lite 2023-06-08 18:34:23 +02:00
SammCheese
97971291e9 draft: token streaming 2023-06-08 18:34:08 +02:00
Concedo
a6a0fa338a cleanup indentation, fixing cublas build 2023-06-08 22:40:53 +08:00
Concedo
6f82e17b7a added MPT support 2023-06-03 16:14:08 +08:00
Concedo
37659d2c4e allow blasbatchsize -1 which disables blas, but keeps benefits like gpu offloads. 2023-06-01 22:33:50 +08:00
Concedo
49272e3c53 adjusted defaults 2023-06-01 20:03:44 +08:00
Concedo
ea336bfa33 rwkv eos 2023-05-29 22:40:27 +08:00
Concedo
28f1196f65 adjust default rep pen range 2023-05-28 19:36:21 +08:00
Concedo
5d9f5b28a6 rwkv integration completed 2023-05-28 00:48:56 +08:00
Concedo
55e0fbf024 wip integrating new rwkv 2023-05-27 22:45:28 +08:00
Concedo
abfdfb702e added top_a sampler 2023-05-27 17:32:37 +08:00
Concedo
bd4fe936f5 cleanup sampling code 2023-05-27 11:58:39 +08:00
Concedo
3c8f404243 integrated token probability viewer in debugmode 2023-05-26 16:40:26 +08:00
Concedo
cd4012c3ed minor fixes to debug logging, fixed a typo, added a new failsafe mode 2023-05-23 21:31:42 +08:00
Concedo
d418146535 fixed a token decoding bug 2023-05-21 00:53:20 +08:00
Concedo
5032e0fd64 trying to fix ggjt v3 2023-05-21 00:29:50 +08:00
Concedo
c048bcfec4 remove old filever checks (+7 squashed commit)
Squashed commit:

[b72627a] new format not working

[e568870] old ver works

[7053b77] compile errors fixed, fixing linkers

[4ae8889] add new ver

[ff82dfd] file format checks

[25b8aa8] refactoring type names

[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30 still merging in process 2023-05-20 15:58:33 +08:00
Concedo
a8958f6b76 merging, do not use 2023-05-20 15:12:31 +08:00
Concedo
010b2753d9 Merge commit '6986c7835a' into concedo_experimental
# Conflicts:
#	README.md
2023-05-20 11:30:51 +08:00
Concedo
487ac226b4 need to set the unshuffle before loading the model 2023-05-17 17:58:21 +08:00
Concedo
2c6ac06936 gpu offload not working for other arch. debug in future. 2023-05-17 17:13:01 +08:00
Concedo
00da2a5f4e neox is updated 2023-05-17 14:56:54 +08:00
Concedo
90fe9096b4 clean and refactoring pass before supporting newer models for different arch 2023-05-17 11:23:29 +08:00
Concedo
466cd21368 test cmakefile for cublas. 2023-05-15 14:50:38 +08:00
Concedo
b692e4d2a4 wip 2023-05-14 17:21:07 +08:00
Concedo
8a5fe628df recognize q8_0 as an older format as the new clblast doesnt work correctly with it 2023-05-14 11:06:23 +08:00
Concedo
e05455f852 fixed wrong sized struct from legacy q8_1, fixed opencl varsize arrays 2023-05-13 23:56:08 +08:00
Concedo
05cf5f7d6e partially working, but the blas matmul is broken 2023-05-13 11:35:38 +08:00
Concedo
54194911ac Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-05-09 16:50:43 +08:00
Concedo
2f2eff6e13 the dark gods have been sated, and redpajama is integrated... but at what cost? 2023-05-08 20:58:00 +08:00
Concedo
62beded0e7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	Makefile
#	README.md
2023-05-07 19:10:01 +08:00
Concedo
8a964e76c8 integrated mirostat as a launch parameter, works on all models 2023-05-06 00:47:17 +08:00
Hendrik Langer
8131bc8b56 add new sampling algorithm mirostat 2023-05-05 13:23:47 +02:00
Concedo
4857739ab5 allow specifying a different thread count for GPU blas 2023-05-03 21:19:59 +08:00
Concedo
966cd2ce91 Merge remote-tracking branch 'temp/concedo' into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-05-02 22:43:34 +08:00
Concedo
25201233ca fixed unbantokens not following EOS 2023-05-01 00:02:45 +08:00
Concedo
7afad2b9b5 integrated the new samplers 2023-04-29 19:41:41 +08:00
Concedo
fe0e4de8e8 fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken 2023-04-29 18:25:17 +08:00
Concedo
bb282a4ecf reinstated the q4_3 format, for backwards compatibility. 2023-04-29 11:42:04 +08:00
Concedo
032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo
e8a389f85b updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching 2023-04-28 11:41:03 +08:00
Concedo
5070815dcf fixing discussion #121 and issue #122 2023-04-27 16:10:01 +08:00
Concedo
a696b0a16c missed another thing 2023-04-25 23:16:04 +08:00