koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 17:14:36 +00:00

Author	SHA1	Message	Date
Concedo	2c6ac06936	gpu offload not working for other arch. debug in future.	2023-05-17 17:13:01 +08:00
Concedo	57230b5196	upgrade all other formats	2023-05-17 16:28:20 +08:00
Concedo	90fe9096b4	clean and refactoring pass before supporting newer models for different arch	2023-05-17 11:23:29 +08:00
Concedo	94ef3e81cf	inc allocation	2023-05-16 23:32:35 +08:00
Concedo	72836d4eac	fixing more compile issues	2023-05-15 20:10:54 +08:00
Concedo	6504150fac	just testing cublas	2023-05-15 20:01:22 +08:00
Concedo	c81dd58e76	Merge commit 'f954edda93' into archive_lib # Conflicts: # ggml.c	2023-05-14 18:34:56 +08:00
Concedo	b692e4d2a4	wip	2023-05-14 17:21:07 +08:00
Concedo	e47f7ade05	updated kobold lite, patch oom errors	2023-05-09 19:16:45 +08:00
Concedo	b3315459c7	pilled the new dequants for clblast, fixed some ooms	2023-04-30 14:15:44 +08:00
Concedo	032a171867	integrated q5 formats	2023-04-28 12:58:39 +08:00
Concedo	59fb174678	fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for gpt neox gpt 2 and gptj	2023-04-24 23:20:06 +08:00
Concedo	432cc91649	still needs to be a bit higher for very small contexts	2023-04-23 15:01:38 +08:00
Concedo	4e1ea2ac61	hopefully fixed the ooms for good	2023-04-23 13:49:50 +08:00
Concedo	d41490c27b	just revert back to the working commit	2023-04-23 00:35:42 +08:00
Concedo	b5d6284190	increase initial buffer too	2023-04-23 00:07:33 +08:00
Concedo	d2f14b2b1f	add an extra buffer to mem allocations	2023-04-23 00:04:32 +08:00
Concedo	7ba36c2c6c	trying to put out penguin based fires. sorry for inconvenience	2023-04-20 23:15:07 +08:00
Concedo	49697d86d8	adjusted down the buf memory allocation now that realloc seems to work	2023-04-20 17:51:13 +08:00
Concedo	cc407f283a	messing around with memory allocation to bandaid the random ooms with various gpt2 and gptj models	2023-04-19 20:18:55 +08:00
Concedo	45ec09d31b	fast forwarding for rwkv for unmodified contexts	2023-04-19 15:09:35 +08:00
Concedo	c757fbee1d	fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite	2023-04-16 21:54:18 +08:00
Concedo	1bd5992da4	clean and refactor handling of flags	2023-04-12 23:25:31 +08:00
Concedo	69b85f5b61	fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size.	2023-04-11 00:14:57 +08:00
Concedo	4f5faf9612	some users report that this repo is now being flagged as malicious? no idea why, but I am removing all prebuilt binaries except libopenblas. windows users can still obtain it from /releases and osx and linux users can rebuild from source code.	2023-04-06 21:49:43 +08:00
Concedo	1490cdd71d	change GPT-J and GPT2 KVs to use fp16 instead	2023-04-05 15:53:07 +08:00
Concedo	52de932842	removed main.exe to reduce clutter, added support for rep pen in gptj	2023-04-04 20:43:13 +08:00
Concedo	8dd8ab1659	Various enhancement and integration pygmalion.cpp	2023-04-03 00:04:43 +08:00
Concedo	9aabb0d9db	massive refactor completed, GPT-J integrated	2023-04-02 17:03:30 +08:00

29 commits