Concedo
d418146535
fixed a token decoding bug
2023-05-21 00:53:20 +08:00
Concedo
5032e0fd64
trying to fix ggjt v3
2023-05-21 00:29:50 +08:00
Concedo
c048bcfec4
remove old filever checks (+7 squashed commit)
Squashed commit:
[b72627a] new format not working
[e568870] old ver works
[7053b77] compile errors fixed, fixing linkers
[4ae8889] add new ver
[ff82dfd] file format checks
[25b8aa8] refactoring type names
[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30
still merging in process
2023-05-20 15:58:33 +08:00
Concedo
a8958f6b76
merging, do not use
2023-05-20 15:12:31 +08:00
Concedo
010b2753d9
Merge commit '6986c7835a' into concedo_experimental
# Conflicts:
# README.md
2023-05-20 11:30:51 +08:00
Concedo
487ac226b4
need to set the unshuffle before loading the model
2023-05-17 17:58:21 +08:00
Concedo
2c6ac06936
gpu offload not working for other arch. debug in future.
2023-05-17 17:13:01 +08:00
Concedo
00da2a5f4e
neox is updated
2023-05-17 14:56:54 +08:00
Concedo
90fe9096b4
clean and refactoring pass before supporting newer models for different arch
2023-05-17 11:23:29 +08:00
Concedo
466cd21368
test cmakefile for cublas.
2023-05-15 14:50:38 +08:00
Concedo
b692e4d2a4
wip
2023-05-14 17:21:07 +08:00
Concedo
8a5fe628df
recognize q8_0 as an older format as the new clblast doesnt work correctly with it
2023-05-14 11:06:23 +08:00
Concedo
e05455f852
fixed wrong sized struct from legacy q8_1, fixed opencl varsize arrays
2023-05-13 23:56:08 +08:00
Concedo
05cf5f7d6e
partially working, but the blas matmul is broken
2023-05-13 11:35:38 +08:00
Concedo
54194911ac
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
2023-05-09 16:50:43 +08:00
Concedo
2f2eff6e13
the dark gods have been sated, and redpajama is integrated... but at what cost?
2023-05-08 20:58:00 +08:00
Concedo
62beded0e7
Merge branch 'master' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# Makefile
# README.md
2023-05-07 19:10:01 +08:00
Concedo
8a964e76c8
integrated mirostat as a launch parameter, works on all models
2023-05-06 00:47:17 +08:00
Hendrik Langer
8131bc8b56
add new sampling algorithm mirostat
2023-05-05 13:23:47 +02:00
Concedo
4857739ab5
allow specifying a different thread count for GPU blas
2023-05-03 21:19:59 +08:00
Concedo
966cd2ce91
Merge remote-tracking branch 'temp/concedo' into concedo_experimental
# Conflicts:
# koboldcpp.py
2023-05-02 22:43:34 +08:00
Concedo
25201233ca
fixed unbantokens not following EOS
2023-05-01 00:02:45 +08:00
Concedo
7afad2b9b5
integrated the new samplers
2023-04-29 19:41:41 +08:00
Concedo
fe0e4de8e8
fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken
2023-04-29 18:25:17 +08:00
Concedo
bb282a4ecf
reinstated the q4_3 format, for backwards compatibility.
2023-04-29 11:42:04 +08:00
Concedo
032a171867
integrated q5 formats
2023-04-28 12:58:39 +08:00
Concedo
e8a389f85b
updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching
2023-04-28 11:41:03 +08:00
Concedo
5070815dcf
fixing discussion #121 and issue #122
2023-04-27 16:10:01 +08:00
Concedo
a696b0a16c
missed another thing
2023-04-25 23:16:04 +08:00
Concedo
8c9c218609
missed a thing
2023-04-25 23:02:08 +08:00
Concedo
5eec5d6ed9
Added backwards compatibility to an earlier version of NeoX.
2023-04-25 20:34:18 +08:00
Concedo
3962eb39c7
added token unbanning
2023-04-24 21:50:20 +08:00
Concedo
1b9b9068b1
merged q4_2 and q4_3 dequants and FIXED CLBLAST SLOWNESS!
2023-04-24 21:33:01 +08:00
Concedo
9129e937f9
only llama can use batch sizes above 256 to prevent unacceptably high memory usage
2023-04-23 15:57:06 +08:00
Concedo
6e908c1792
added lora support
2023-04-22 12:29:38 +08:00
Concedo
c454f8b848
Gpt NeoX / Pythia integration completed
2023-04-22 11:23:25 +08:00
Concedo
ef13443047
wip pythia integration
2023-04-22 01:08:23 +08:00
Concedo
68898046c2
accidentally added the binaries onto repo again.
2023-04-22 00:41:19 +08:00
Concedo
5160053e51
merged llama adapter into the rest of the gpt adapters
2023-04-21 17:47:48 +08:00
Concedo
45ec09d31b
fast forwarding for rwkv for unmodified contexts
2023-04-19 15:09:35 +08:00
Concedo
ea01771dd5
rwkv is done
2023-04-18 20:55:01 +08:00
Concedo
c200b674f4
updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter
2023-04-18 17:36:44 +08:00
Concedo
763ad172c0
arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation
2023-04-17 17:31:45 +08:00
Concedo
bee6a401fd
slight clarity fix
2023-04-16 22:04:19 +08:00
Concedo
c757fbee1d
fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite
2023-04-16 21:54:18 +08:00
Concedo
6548d3b3fb
Added prints for stopping sequences, made makefile 1% friendlier to arch linux users
2023-04-16 20:43:17 +08:00
Concedo
525184930d
added a kobold API compatible implementation of stopping sequences
2023-04-16 18:37:49 +08:00
Concedo
ad5676810a
merge CLBlast improvements - GPU dequant
2023-04-16 01:17:40 +08:00
Concedo
8ad42a1102
read from inputs
2023-04-14 21:30:26 +08:00
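
Note on commit fe0e4de8e8: its message describes a sanity check in which the model is evaluated twice on different inputs, and the model is treated as broken if both runs return the same logits. The following is only a minimal sketch of that idea, not the project's actual code. The function name `model_looks_broken`, the `eval_logits` callable, and the `tol` tolerance are all hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence


def model_looks_broken(
    eval_logits: Callable[[Sequence[int]], Sequence[float]],
    tokens_a: Sequence[int],
    tokens_b: Sequence[int],
    tol: float = 1e-6,
) -> bool:
    """Return True if two different inputs yield (near-)identical logits.

    `eval_logits` is a hypothetical stand-in for whatever runs one forward
    pass and returns the logits for the last position.
    """
    if list(tokens_a) == list(tokens_b):
        raise ValueError("the two probe inputs must differ")

    logits_a = eval_logits(tokens_a)
    logits_b = eval_logits(tokens_b)

    # Outputs of different length cannot be identical, so they count as differing.
    if len(logits_a) != len(logits_b):
        return False

    # A healthy model should react to its input; identical outputs for
    # different inputs suggest the weights were loaded incorrectly.
    return all(abs(x - y) <= tol for x, y in zip(logits_a, logits_b))


if __name__ == "__main__":
    # Stub "models" for demonstration only.
    broken = lambda toks: [0.5, 0.5, 0.5]
    healthy = lambda toks: [float(sum(toks)), 1.0, 2.0]
    print(model_looks_broken(broken, [1, 2], [3, 4]))
    print(model_looks_broken(healthy, [1, 2], [3, 4]))
```

The tolerance is an assumption of this sketch; the commit message does not say whether the comparison is exact or approximate.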