koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 17:14:36 +00:00

Author	SHA1	Message	Date
Concedo	c048bcfec4	remove old filever checks (+7 squashed commit) Squashed commit: [b72627a] new format not working [e568870] old ver works [7053b77] compile errors fixed, fixing linkers [4ae8889] add new ver [ff82dfd] file format checks [25b8aa8] refactoring type names [931063b] still merging	2023-05-21 00:15:39 +08:00
Concedo	a0cfed1e30	still merging in process	2023-05-20 15:58:33 +08:00
Concedo	4e86a07e57	wip cleanup before big merge	2023-05-20 12:48:28 +08:00
Concedo	2c6ac06936	gpu offload not working for other arch. debug in future.	2023-05-17 17:13:01 +08:00
Concedo	57230b5196	upgrade all other formats	2023-05-17 16:28:20 +08:00
Concedo	90fe9096b4	clean and refactoring pass before supporting newer models for different arch	2023-05-17 11:23:29 +08:00
Concedo	94ef3e81cf	inc allocation	2023-05-16 23:32:35 +08:00
Concedo	72836d4eac	fixing more compile issues	2023-05-15 20:10:54 +08:00
Concedo	6504150fac	just testing cublas	2023-05-15 20:01:22 +08:00
Concedo	c81dd58e76	Merge commit 'f954edda93' into archive_lib # Conflicts: # ggml.c	2023-05-14 18:34:56 +08:00
Concedo	b692e4d2a4	wip	2023-05-14 17:21:07 +08:00
Concedo	05cf5f7d6e	partially working, but the blas matmul is broken	2023-05-13 11:35:38 +08:00
Concedo	e47f7ade05	updated kobold lite, patch oom errors	2023-05-09 19:16:45 +08:00
Concedo	b3315459c7	pilled the new dequants for clblast, fixed some ooms	2023-04-30 14:15:44 +08:00
Concedo	032a171867	integrated q5 formats	2023-04-28 12:58:39 +08:00
Concedo	59fb174678	fixed compile errors, made mmap automatic when lora is selected, added updated quantizers and quantization handling for gpt neox gpt 2 and gptj	2023-04-24 23:20:06 +08:00
Concedo	432cc91649	still needs to be a bit higher for very small contexts	2023-04-23 15:01:38 +08:00
Concedo	4e1ea2ac61	hopefully fixed the ooms for good	2023-04-23 13:49:50 +08:00
Concedo	d41490c27b	just revert back to the working commit	2023-04-23 00:35:42 +08:00
Concedo	b5d6284190	increase initial buffer too	2023-04-23 00:07:33 +08:00
Concedo	d2f14b2b1f	add an extra buffer to mem allocations	2023-04-23 00:04:32 +08:00
Concedo	4fa3dfe8bc	just doesn't work properly on windows. will leave it as a manual flag for others	2023-04-22 10:57:38 +08:00
Concedo	7ba36c2c6c	trying to put out penguin based fires. sorry for inconvenience	2023-04-20 23:15:07 +08:00
Concedo	49697d86d8	adjusted down the buf memory allocation now that realloc seems to work	2023-04-20 17:51:13 +08:00
Concedo	cc407f283a	messing around with memory allocation to bandaid the random ooms with various gpt2 and gptj models	2023-04-19 20:18:55 +08:00
Concedo	45ec09d31b	fast forwarding for rwkv for unmodified contexts	2023-04-19 15:09:35 +08:00
Concedo	c757fbee1d	fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite	2023-04-16 21:54:18 +08:00
Concedo	1bd5992da4	clean and refactor handling of flags	2023-04-12 23:25:31 +08:00
Concedo	69b85f5b61	fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size.	2023-04-11 00:14:57 +08:00
Concedo	18a154715e	added version label, improved file type checks	2023-04-10 01:03:09 +08:00
Concedo	d8e37bfe75	new gpt2 format supported	2023-04-08 17:35:36 +08:00

31 commits