Concedo
9b6c35b651
rwkv speed enhancements (batch processing), fixed a rwkv token processing bug
2023-06-13 16:02:12 +08:00
Concedo
66a3f4e421
added support for lora base
2023-06-10 19:29:45 +08:00
Concedo
43f7e40470
added extra endpoints for abort gen and polled streaming
2023-06-10 18:13:26 +08:00
Concedo
b92f9fe3a2
Merge remote-tracking branch 'sammcheese/sammcheese/tokenstreaming' into concedo_experimental
2023-06-09 20:41:02 +08:00
12Boti
e1ab14c4ab
fix format string vulnerability (#223)
2023-06-09 20:16:03 +08:00
SammCheese
e6231c3055
back to http.server, improved implementation
2023-06-09 12:17:55 +02:00
SammCheese
9a8da35ec4
working streaming. TODO: fix lite
2023-06-08 18:34:23 +02:00
SammCheese
97971291e9
draft: token streaming
2023-06-08 18:34:08 +02:00
Concedo
a6a0fa338a
cleanup indentation, fixing cublas build
2023-06-08 22:40:53 +08:00
Concedo
6f82e17b7a
added MPT support
2023-06-03 16:14:08 +08:00
Concedo
37659d2c4e
allow blasbatchsize -1 which disables blas, but keeps benefits like gpu offloads.
2023-06-01 22:33:50 +08:00
Concedo
49272e3c53
adjusted defaults
2023-06-01 20:03:44 +08:00
Concedo
ea336bfa33
rwkv eos
2023-05-29 22:40:27 +08:00
Concedo
28f1196f65
adjust default rep pen range
2023-05-28 19:36:21 +08:00
Concedo
5d9f5b28a6
rwkv integration completed
2023-05-28 00:48:56 +08:00
Concedo
55e0fbf024
wip integrating new rwkv
2023-05-27 22:45:28 +08:00
Concedo
abfdfb702e
added top_a sampler
2023-05-27 17:32:37 +08:00
Concedo
bd4fe936f5
cleanup sampling code
2023-05-27 11:58:39 +08:00
Concedo
3c8f404243
integrated token probability viewer in debugmode
2023-05-26 16:40:26 +08:00
Concedo
cd4012c3ed
minor fixes to debug logging, fixed a typo, added a new failsafe mode
2023-05-23 21:31:42 +08:00
Concedo
d418146535
fixed a token decoding bug
2023-05-21 00:53:20 +08:00
Concedo
5032e0fd64
trying to fix ggjt v3
2023-05-21 00:29:50 +08:00
Concedo
c048bcfec4
remove old filever checks (+7 squashed commit)
Squashed commit:
[b72627a] new format not working
[e568870] old ver works
[7053b77] compile errors fixed, fixing linkers
[4ae8889] add new ver
[ff82dfd] file format checks
[25b8aa8] refactoring type names
[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30
still merging in process
2023-05-20 15:58:33 +08:00
Concedo
a8958f6b76
merging, do not use
2023-05-20 15:12:31 +08:00
Concedo
010b2753d9
Merge commit '6986c7835a' into concedo_experimental
# Conflicts:
# README.md
2023-05-20 11:30:51 +08:00
Concedo
487ac226b4
need to set the unshuffle before loading the model
2023-05-17 17:58:21 +08:00
Concedo
2c6ac06936
gpu offload not working for other arch. debug in future.
2023-05-17 17:13:01 +08:00
Concedo
00da2a5f4e
neox is updated
2023-05-17 14:56:54 +08:00
Concedo
90fe9096b4
clean and refactoring pass before supporting newer models for different arch
2023-05-17 11:23:29 +08:00
Concedo
466cd21368
test cmakefile for cublas.
2023-05-15 14:50:38 +08:00
Concedo
b692e4d2a4
wip
2023-05-14 17:21:07 +08:00
Concedo
8a5fe628df
recognize q8_0 as an older format as the new clblast doesnt work correctly with it
2023-05-14 11:06:23 +08:00
Concedo
e05455f852
fixed wrong sized struct from legacy q8_1, fixed opencl varsize arrays
2023-05-13 23:56:08 +08:00
Concedo
05cf5f7d6e
partially working, but the blas matmul is broken
2023-05-13 11:35:38 +08:00
Concedo
54194911ac
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
2023-05-09 16:50:43 +08:00
Concedo
2f2eff6e13
the dark gods have been sated, and redpajama is integrated... but at what cost?
2023-05-08 20:58:00 +08:00
Concedo
62beded0e7
Merge branch 'master' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# Makefile
# README.md
2023-05-07 19:10:01 +08:00
Concedo
8a964e76c8
integrated mirostat as a launch parameter, works on all models
2023-05-06 00:47:17 +08:00
Hendrik Langer
8131bc8b56
add new sampling algorithm mirostat
2023-05-05 13:23:47 +02:00
Concedo
4857739ab5
allow specifying a different thread count for GPU blas
2023-05-03 21:19:59 +08:00
Concedo
966cd2ce91
Merge remote-tracking branch 'temp/concedo' into concedo_experimental
# Conflicts:
# koboldcpp.py
2023-05-02 22:43:34 +08:00
Concedo
25201233ca
fixed unbantokens not following EOS
2023-05-01 00:02:45 +08:00
Concedo
7afad2b9b5
integrated the new samplers
2023-04-29 19:41:41 +08:00
Concedo
fe0e4de8e8
fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken
2023-04-29 18:25:17 +08:00
Concedo
bb282a4ecf
reinstated the q4_3 format, for backwards compatibility.
2023-04-29 11:42:04 +08:00
Concedo
032a171867
integrated q5 formats
2023-04-28 12:58:39 +08:00
Concedo
e8a389f85b
updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching
2023-04-28 11:41:03 +08:00
Concedo
5070815dcf
fixing discussion #121 and issue #122
2023-04-27 16:10:01 +08:00
Concedo
a696b0a16c
missed another thing
2023-04-25 23:16:04 +08:00