koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-12 09:59:41 +00:00

Author	SHA1	Message	Date
Concedo	b3315459c7	pulled the new dequants for clblast, fixed some ooms	2023-04-30 14:15:44 +08:00
Concedo	032a171867	integrated q5 formats	2023-04-28 12:58:39 +08:00
Concedo	5070815dcf	fixing discussion #121 and issue #122	2023-04-27 16:10:01 +08:00
Concedo	0aa3d839fb	free old ctx on retry	2023-04-25 23:42:57 +08:00
Concedo	72b2331ad6	edge cases with mem crash? need verify	2023-04-25 20:42:30 +08:00
Concedo	5eec5d6ed9	Added backwards compatibility to an earlier version of NeoX.	2023-04-25 20:34:18 +08:00
Concedo	59fb174678	fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for gpt neox gpt 2 and gptj	2023-04-24 23:20:06 +08:00
Concedo	432cc91649	still needs to be a bit higher for very small contexts	2023-04-23 15:01:38 +08:00
Concedo	4e1ea2ac61	hopefully fixed the ooms for good	2023-04-23 13:49:50 +08:00
Concedo	d41490c27b	just revert back to the working commit	2023-04-23 00:35:42 +08:00
Concedo	c60fb5ef4b	fixed rwkv build errors on ARM devices	2023-04-23 00:18:38 +08:00
Concedo	b5d6284190	increase initial buffer too	2023-04-23 00:07:33 +08:00
Concedo	d2f14b2b1f	add an extra buffer to mem allocations	2023-04-23 00:04:32 +08:00
Concedo	c454f8b848	Gpt NeoX / Pythia integration completed	2023-04-22 11:23:25 +08:00
Concedo	4fa3dfe8bc	just doesn't work properly on windows. will leave it as a manual flag for others	2023-04-22 10:57:38 +08:00
Concedo	68898046c2	accidentally added the binaries onto repo again.	2023-04-22 00:41:19 +08:00
Concedo	7ba36c2c6c	trying to put out penguin based fires. sorry for inconvenience	2023-04-20 23:15:07 +08:00
Concedo	49697d86d8	adjusted down the buf memory allocation now that realloc seems to work	2023-04-20 17:51:13 +08:00
Concedo	cc407f283a	messing around with memory allocation to bandaid the random ooms with various gpt2 and gptj models	2023-04-19 20:18:55 +08:00
Concedo	45ec09d31b	fast forwarding for rwkv for unmodified contexts	2023-04-19 15:09:35 +08:00
Concedo	ea01771dd5	rwkv is done	2023-04-18 20:55:01 +08:00
Concedo	c200b674f4	updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter	2023-04-18 17:36:44 +08:00
Concedo	763ad172c0	arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation	2023-04-17 17:31:45 +08:00
Concedo	c757fbee1d	fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite	2023-04-16 21:54:18 +08:00
Concedo	1bd5992da4	clean and refactor handling of flags	2023-04-12 23:25:31 +08:00
Concedo	636f8e5a8e	updated the quantize files and makefile	2023-04-12 21:40:25 +08:00
Concedo	69b85f5b61	fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size.	2023-04-11 00:14:57 +08:00
Concedo	18a154715e	added version label, improved file type checks	2023-04-10 01:03:09 +08:00
Concedo	d8e37bfe75	new gpt2 format supported	2023-04-08 17:35:36 +08:00
Concedo	1abcdb2394	should not be static	2023-04-07 20:35:19 +08:00
Concedo	1d48db4f63	dont build quantize	2023-04-07 17:11:26 +08:00
Concedo	4f5faf9612	some users report that this repo is now being flagged as malicious? no idea why, but I am removing all prebuilt binaries except libopenblas. windows users can still obtain it from /releases and osx and linux users can rebuild from source code.	2023-04-06 21:49:43 +08:00
Concedo	3d650d0e25	remove dependency of psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold	2023-04-06 11:08:19 +08:00
Concedo	1490cdd71d	change GPT-J and GPT2 KVs to use fp16 instead	2023-04-05 15:53:07 +08:00
Concedo	57e9f929ee	renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags	2023-04-05 15:22:13 +08:00
Concedo	14273fea7a	integrated gpt2 support	2023-04-04 23:15:47 +08:00
Concedo	52de932842	removed main.exe to reduce clutter, added support for rep pen in gptj	2023-04-04 20:43:13 +08:00
Concedo	8dd8ab1659	Various enhancements and integration of pygmalion.cpp	2023-04-03 00:04:43 +08:00
Concedo	9aabb0d9db	massive refactor completed, GPT-J integrated	2023-04-02 17:03:30 +08:00
Concedo	b1f08813e3	added support for gpt4all original format	2023-04-02 00:53:46 +08:00
Concedo	085a9f90a7	still refactoring	2023-04-01 11:56:34 +08:00
Concedo	6b86f5ea22	halfway refactoring, wip adding other model types	2023-04-01 01:13:05 +08:00

142 commits