Concedo
|
487ac226b4
|
need to set the unshuffle before loading the model
|
2023-05-17 17:58:21 +08:00 |
|
Concedo
|
2c6ac06936
|
gpu offload not working for other arch. debug in future.
|
2023-05-17 17:13:01 +08:00 |
|
Concedo
|
00da2a5f4e
|
neox is updated
|
2023-05-17 14:56:54 +08:00 |
|
Concedo
|
90fe9096b4
|
clean and refactoring pass before supporting newer models for different arch
|
2023-05-17 11:23:29 +08:00 |
|
Concedo
|
466cd21368
|
test cmakefile for cublas.
|
2023-05-15 14:50:38 +08:00 |
|
Concedo
|
b692e4d2a4
|
wip
|
2023-05-14 17:21:07 +08:00 |
|
Concedo
|
8a5fe628df
|
recognize q8_0 as an older format, as the new clblast doesn't work correctly with it
|
2023-05-14 11:06:23 +08:00 |
|
Concedo
|
e05455f852
|
fixed wrong sized struct from legacy q8_1, fixed opencl varsize arrays
|
2023-05-13 23:56:08 +08:00 |
|
Concedo
|
05cf5f7d6e
|
partially working, but the blas matmul is broken
|
2023-05-13 11:35:38 +08:00 |
|
Concedo
|
54194911ac
|
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
|
2023-05-09 16:50:43 +08:00 |
|
Concedo
|
2f2eff6e13
|
the dark gods have been sated, and redpajama is integrated... but at what cost?
|
2023-05-08 20:58:00 +08:00 |
|
Concedo
|
62beded0e7
|
Merge branch 'master' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# Makefile
# README.md
|
2023-05-07 19:10:01 +08:00 |
|
Concedo
|
8a964e76c8
|
integrated mirostat as a launch parameter, works on all models
|
2023-05-06 00:47:17 +08:00 |
|
Hendrik Langer
|
8131bc8b56
|
add new sampling algorithm mirostat
|
2023-05-05 13:23:47 +02:00 |
|
Concedo
|
4857739ab5
|
allow specifying a different thread count for GPU blas
|
2023-05-03 21:19:59 +08:00 |
|
Concedo
|
966cd2ce91
|
Merge remote-tracking branch 'temp/concedo' into concedo_experimental
# Conflicts:
# koboldcpp.py
|
2023-05-02 22:43:34 +08:00 |
|
Concedo
|
25201233ca
|
fixed unbantokens not following EOS
|
2023-05-01 00:02:45 +08:00 |
|
Concedo
|
7afad2b9b5
|
integrated the new samplers
|
2023-04-29 19:41:41 +08:00 |
|
Concedo
|
fe0e4de8e8
|
fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if both runs give the same logits for different inputs, the model is broken
|
2023-04-29 18:25:17 +08:00 |
|
Concedo
|
bb282a4ecf
|
reinstated the q4_3 format, for backwards compatibility.
|
2023-04-29 11:42:04 +08:00 |
|
Concedo
|
032a171867
|
integrated q5 formats
|
2023-04-28 12:58:39 +08:00 |
|
Concedo
|
e8a389f85b
|
updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching
|
2023-04-28 11:41:03 +08:00 |
|
Concedo
|
5070815dcf
|
fixing discussion #121 and issue #122
|
2023-04-27 16:10:01 +08:00 |
|
Concedo
|
a696b0a16c
|
missed another thing
|
2023-04-25 23:16:04 +08:00 |
|
Concedo
|
8c9c218609
|
missed a thing
|
2023-04-25 23:02:08 +08:00 |
|
Concedo
|
5eec5d6ed9
|
Added backwards compatibility to an earlier version of NeoX.
|
2023-04-25 20:34:18 +08:00 |
|
Concedo
|
3962eb39c7
|
added token unbanning
|
2023-04-24 21:50:20 +08:00 |
|
Concedo
|
1b9b9068b1
|
merged q4_2 and q4_3 dequants and FIXED CLBLAST SLOWNESS!
|
2023-04-24 21:33:01 +08:00 |
|
Concedo
|
9129e937f9
|
only llama can use batch sizes above 256 to prevent unacceptably high memory usage
|
2023-04-23 15:57:06 +08:00 |
|
Concedo
|
6e908c1792
|
added lora support
|
2023-04-22 12:29:38 +08:00 |
|
Concedo
|
c454f8b848
|
Gpt NeoX / Pythia integration completed
|
2023-04-22 11:23:25 +08:00 |
|
Concedo
|
ef13443047
|
wip pythia integration
|
2023-04-22 01:08:23 +08:00 |
|
Concedo
|
68898046c2
|
accidentally added the binaries to the repo again.
|
2023-04-22 00:41:19 +08:00 |
|
Concedo
|
5160053e51
|
merged llama adapter into the rest of the gpt adapters
|
2023-04-21 17:47:48 +08:00 |
|
Concedo
|
45ec09d31b
|
fast forwarding for rwkv for unmodified contexts
|
2023-04-19 15:09:35 +08:00 |
|
Concedo
|
ea01771dd5
|
rwkv is done
|
2023-04-18 20:55:01 +08:00 |
|
Concedo
|
c200b674f4
|
updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter
|
2023-04-18 17:36:44 +08:00 |
|
Concedo
|
763ad172c0
|
arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation
|
2023-04-17 17:31:45 +08:00 |
|
Concedo
|
bee6a401fd
|
slight clarity fix
|
2023-04-16 22:04:19 +08:00 |
|
Concedo
|
c757fbee1d
|
fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite
|
2023-04-16 21:54:18 +08:00 |
|
Concedo
|
6548d3b3fb
|
Added prints for stopping sequences, made makefile 1% friendlier to arch linux users
|
2023-04-16 20:43:17 +08:00 |
|
Concedo
|
525184930d
|
added a kobold API compatible implementation of stopping sequences
|
2023-04-16 18:37:49 +08:00 |
|
Concedo
|
ad5676810a
|
merge CLBlast improvements - GPU dequant
|
2023-04-16 01:17:40 +08:00 |
|
Concedo
|
8ad42a1102
|
read from inputs
|
2023-04-14 21:30:26 +08:00 |
|
Concedo
|
adb4df78d6
|
Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation.
|
2023-04-14 21:24:16 +08:00 |
|
Concedo
|
5c22f7e4c4
|
reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode.
|
2023-04-13 11:32:05 +08:00 |
|
Concedo
|
69b85f5b61
|
fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size.
|
2023-04-11 00:14:57 +08:00 |
|
Concedo
|
18a154715e
|
added version label, improved file type checks
|
2023-04-10 01:03:09 +08:00 |
|
Concedo
|
b91abc3316
|
increase default blas batch size
|
2023-04-09 15:27:43 +08:00 |
|
Concedo
|
d8e37bfe75
|
new gpt2 format supported
|
2023-04-08 17:35:36 +08:00 |
|
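The broken-model check described in commit fe0e4de8e8 (evaluate twice with different inputs, then compare the logits) can be illustrated with a minimal sketch. This is not the project's actual code: the function name `looks_broken`, the `evaluate` callable, and the `tolerance` parameter are all hypothetical and exist only to show the idea.

```python
from typing import Callable, Sequence

# Hypothetical type: anything that maps a token sequence to a list of logits.
Evaluator = Callable[[Sequence[int]], Sequence[float]]


def looks_broken(
    evaluate: Evaluator,
    input_a: Sequence[int],
    input_b: Sequence[int],
    tolerance: float = 1e-6,  # assumed threshold, not taken from the commit
) -> bool:
    """Return True if two different inputs produce (near-)identical logits."""
    if list(input_a) == list(input_b):
        raise ValueError("the two inputs must differ for the check to mean anything")

    logits_a = evaluate(input_a)
    logits_b = evaluate(input_b)

    if len(logits_a) != len(logits_b):
        raise ValueError("both evaluations should cover the same vocabulary")

    # A healthy model reacts to its input, so at least one logit should move.
    return all(abs(a - b) <= tolerance for a, b in zip(logits_a, logits_b))


if __name__ == "__main__":
    # Stand-in "models" for demonstration only.
    def constant_model(tokens: Sequence[int]) -> Sequence[float]:
        return [0.1, 0.2, 0.3]  # ignores its input, as a broken model might

    def responsive_model(tokens: Sequence[int]) -> Sequence[float]:
        return [float(sum(tokens)), 1.0, 2.0]

    print(looks_broken(constant_model, [1, 2], [3, 4]))
    print(looks_broken(responsive_model, [1, 2], [3, 4]))
```

The point of comparing two runs rather than inspecting one is that a single set of logits can look numerically valid even when the weights were loaded incorrectly; only the lack of any response to a changed input gives the failure away.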