koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 09:04:36 +00:00

Author	SHA1	Message	Date
Concedo	487ac226b4	need to set the unshuffle before loading the model	2023-05-17 17:58:21 +08:00
Concedo	2c6ac06936	gpu offload not working for other arch. debug in future.	2023-05-17 17:13:01 +08:00
Concedo	00da2a5f4e	neox is updated	2023-05-17 14:56:54 +08:00
Concedo	90fe9096b4	clean and refactoring pass before supporting newer models for different arch	2023-05-17 11:23:29 +08:00
Concedo	466cd21368	test cmakefile for cublas.	2023-05-15 14:50:38 +08:00
Concedo	b692e4d2a4	wip	2023-05-14 17:21:07 +08:00
Concedo	8a5fe628df	recognize q8_0 as an older format as the new clblast doesnt work correctly with it	2023-05-14 11:06:23 +08:00
Concedo	e05455f852	fixed wrong sized struct from legacy q8_1, fixed opencl varsize arrays	2023-05-13 23:56:08 +08:00
Concedo	05cf5f7d6e	partially working, but the blas matmul is broken	2023-05-13 11:35:38 +08:00
Concedo	54194911ac	Merge branch 'master' into concedo_experimental # Conflicts: # README.md	2023-05-09 16:50:43 +08:00
Concedo	2f2eff6e13	the dark gods have been sated, and redpajama is integrated... but at what cost?	2023-05-08 20:58:00 +08:00
Concedo	62beded0e7	Merge branch 'master' into concedo_experimental # Conflicts: # .github/workflows/build.yml # Makefile # README.md	2023-05-07 19:10:01 +08:00
Concedo	8a964e76c8	integrated mirostat as a launch parameter, works on all models	2023-05-06 00:47:17 +08:00
Hendrik Langer	8131bc8b56	add new sampling algorithm mirostat	2023-05-05 13:23:47 +02:00
Concedo	4857739ab5	allow specifying a different thread count for GPU blas	2023-05-03 21:19:59 +08:00
Concedo	966cd2ce91	Merge remote-tracking branch 'temp/concedo' into concedo_experimental # Conflicts: # koboldcpp.py	2023-05-02 22:43:34 +08:00
Concedo	25201233ca	fixed unbantokens not following EOS	2023-05-01 00:02:45 +08:00
Concedo	7afad2b9b5	integrated the new samplers	2023-04-29 19:41:41 +08:00
Concedo	fe0e4de8e8	fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken	2023-04-29 18:25:17 +08:00
Concedo	bb282a4ecf	reinstated the q4_3 format, for backwards compatibility.	2023-04-29 11:42:04 +08:00
Concedo	032a171867	integrated q5 formats	2023-04-28 12:58:39 +08:00
Concedo	e8a389f85b	updated kobold lite, added debug mode, changed streaming mode to now use the same url when launching	2023-04-28 11:41:03 +08:00
Concedo	5070815dcf	fixing discussion #121 and issue #122	2023-04-27 16:10:01 +08:00
Concedo	a696b0a16c	missed another thing	2023-04-25 23:16:04 +08:00
Concedo	8c9c218609	missed a thing	2023-04-25 23:02:08 +08:00
Concedo	5eec5d6ed9	Added backwards compatibility to an earlier version of NeoX.	2023-04-25 20:34:18 +08:00
Concedo	3962eb39c7	added token unbanning	2023-04-24 21:50:20 +08:00
Concedo	1b9b9068b1	merged q4_2 and q4_3 dequants and FIXED CLBLAST SLOWNESS!	2023-04-24 21:33:01 +08:00
Concedo	9129e937f9	only llama can use batch sizes above 256 to prevent unacceptably high memory usage	2023-04-23 15:57:06 +08:00
Concedo	6e908c1792	added lora support	2023-04-22 12:29:38 +08:00
Concedo	c454f8b848	Gpt NeoX / Pythia integration completed	2023-04-22 11:23:25 +08:00
Concedo	ef13443047	wip pythia integration	2023-04-22 01:08:23 +08:00
Concedo	68898046c2	accidentally added the binaries onto repo again.	2023-04-22 00:41:19 +08:00
Concedo	5160053e51	merged llama adapter into the rest of the gpt adapters	2023-04-21 17:47:48 +08:00
Concedo	45ec09d31b	fast forwarding for rwkv for unmodified contexts	2023-04-19 15:09:35 +08:00
Concedo	ea01771dd5	rwkv is done	2023-04-18 20:55:01 +08:00
Concedo	c200b674f4	updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter	2023-04-18 17:36:44 +08:00
Concedo	763ad172c0	arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation	2023-04-17 17:31:45 +08:00
Concedo	bee6a401fd	slight clarity fix	2023-04-16 22:04:19 +08:00
Concedo	c757fbee1d	fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite	2023-04-16 21:54:18 +08:00
Concedo	6548d3b3fb	Added prints for stopping sequences, made makefile 1% friendlier to arch linux users	2023-04-16 20:43:17 +08:00
Concedo	525184930d	added a kobold API compatible implementation of stopping sequences	2023-04-16 18:37:49 +08:00
Concedo	ad5676810a	merge CLBlast improvements - GPU dequant	2023-04-16 01:17:40 +08:00
Concedo	8ad42a1102	read from inputs	2023-04-14 21:30:26 +08:00
Concedo	adb4df78d6	Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation.	2023-04-14 21:24:16 +08:00
Concedo	5c22f7e4c4	reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode.	2023-04-13 11:32:05 +08:00
Concedo	69b85f5b61	fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size.	2023-04-11 00:14:57 +08:00
Concedo	18a154715e	added version label, improved file type checks	2023-04-10 01:03:09 +08:00
Concedo	b91abc3316	increase default blas batch size	2023-04-09 15:27:43 +08:00
Concedo	d8e37bfe75	new gpt2 format supported	2023-04-08 17:35:36 +08:00


51 commits