Concedo
b3315459c7
pulled the new dequants for clblast, fixed some OOMs
2023-04-30 14:15:44 +08:00
Concedo
032a171867
integrated q5 formats
2023-04-28 12:58:39 +08:00
Concedo
5070815dcf
fixing discussion #121 and issue #122
2023-04-27 16:10:01 +08:00
Concedo
0aa3d839fb
free old ctx on retry
2023-04-25 23:42:57 +08:00
Concedo
72b2331ad6
edge cases with mem crash? needs verification
2023-04-25 20:42:30 +08:00
Concedo
5eec5d6ed9
Added backwards compatibility to an earlier version of NeoX.
2023-04-25 20:34:18 +08:00
Concedo
59fb174678
fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for GPT-NeoX, GPT-2 and GPT-J
2023-04-24 23:20:06 +08:00
Concedo
432cc91649
still needs to be a bit higher for very small contexts
2023-04-23 15:01:38 +08:00
Concedo
4e1ea2ac61
hopefully fixed the OOMs for good
2023-04-23 13:49:50 +08:00
Concedo
d41490c27b
just revert back to the working commit
2023-04-23 00:35:42 +08:00
Concedo
c60fb5ef4b
fixed rwkv build errors on ARM devices
2023-04-23 00:18:38 +08:00
Concedo
b5d6284190
increase initial buffer too
2023-04-23 00:07:33 +08:00
Concedo
d2f14b2b1f
add an extra buffer to mem allocations
2023-04-23 00:04:32 +08:00
Concedo
c454f8b848
GPT-NeoX / Pythia integration completed
2023-04-22 11:23:25 +08:00
Concedo
4fa3dfe8bc
just doesn't work properly on windows. will leave it as a manual flag for others
2023-04-22 10:57:38 +08:00
Concedo
68898046c2
accidentally added the binaries to the repo again.
2023-04-22 00:41:19 +08:00
Concedo
7ba36c2c6c
trying to put out penguin based fires. sorry for the inconvenience
2023-04-20 23:15:07 +08:00
Concedo
49697d86d8
adjusted down the buf memory allocation now that realloc seems to work
2023-04-20 17:51:13 +08:00
Concedo
cc407f283a
messing around with memory allocation to bandaid the random OOMs with various gpt2 and gptj models
2023-04-19 20:18:55 +08:00
Concedo
45ec09d31b
fast forwarding for rwkv for unmodified contexts
2023-04-19 15:09:35 +08:00
Concedo
ea01771dd5
rwkv is done
2023-04-18 20:55:01 +08:00
Concedo
c200b674f4
updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter
2023-04-18 17:36:44 +08:00
Concedo
763ad172c0
arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation
2023-04-17 17:31:45 +08:00
Concedo
c757fbee1d
fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite
2023-04-16 21:54:18 +08:00
Concedo
1bd5992da4
clean and refactor handling of flags
2023-04-12 23:25:31 +08:00
Concedo
636f8e5a8e
updated the quantize files and makefile
2023-04-12 21:40:25 +08:00
Concedo
69b85f5b61
fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size.
2023-04-11 00:14:57 +08:00
Concedo
18a154715e
added version label, improved file type checks
2023-04-10 01:03:09 +08:00
Concedo
d8e37bfe75
new gpt2 format supported
2023-04-08 17:35:36 +08:00
Concedo
1abcdb2394
should not be static
2023-04-07 20:35:19 +08:00
Concedo
1d48db4f63
don't build quantize
2023-04-07 17:11:26 +08:00
Concedo
4f5faf9612
some users report that this repo is now being flagged as malicious?
no idea why, but I am removing all prebuilt binaries except libopenblas. Windows users can still obtain them from /releases, and OSX and Linux users can rebuild from source code.
2023-04-06 21:49:43 +08:00
Concedo
3d650d0e25
removed dependency on psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold
2023-04-06 11:08:19 +08:00
Concedo
1490cdd71d
change GPT-J and GPT2 KVs to use fp16 instead
2023-04-05 15:53:07 +08:00
Concedo
57e9f929ee
renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags
2023-04-05 15:22:13 +08:00
Concedo
14273fea7a
integrated gpt2 support
2023-04-04 23:15:47 +08:00
Concedo
52de932842
removed main.exe to reduce clutter, added support for rep pen in gptj
2023-04-04 20:43:13 +08:00
Concedo
8dd8ab1659
Various enhancements and integration of pygmalion.cpp
2023-04-03 00:04:43 +08:00
Concedo
9aabb0d9db
massive refactor completed, GPT-J integrated
2023-04-02 17:03:30 +08:00
Concedo
b1f08813e3
added support for gpt4all original format
2023-04-02 00:53:46 +08:00
Concedo
085a9f90a7
still refactoring
2023-04-01 11:56:34 +08:00
Concedo
6b86f5ea22
halfway refactoring, wip adding other model types
2023-04-01 01:13:05 +08:00