koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 17:14:36 +00:00

Author	SHA1	Message	Date
Concedo	1d1111e10f	expose timing info in web api	2023-07-11 18:56:06 +08:00
Concedo	7222877069	Merge remote-tracking branch 'ren/concedo' into concedo_experimental	2023-07-11 18:45:36 +08:00
Concedo	5ca204d527	Merge remote-tracking branch 'yellowrose/pr/open/LostRuins/koboldcpp/multigpu-cuda-gui' into concedo_experimental # Conflicts: # koboldcpp.py	2023-07-11 18:22:54 +08:00
Concedo	4be167915a	added linear rope option, added warning for bad samplers	2023-07-11 18:08:19 +08:00
Concedo	9324cb804a	reimplemented save and load	2023-07-10 22:49:27 +08:00
YellowRoseCx	f1014f3cc7	remove unused .re	2023-07-10 00:26:40 -05:00
YellowRoseCx	242f01e983	Add Multi-GPU CuBLAS support in the new GUI	2023-07-09 17:10:14 -05:00
callMeMakerRen	4e46673f80	Merge branch 'LostRuins:concedo' into concedo	2023-07-08 09:33:26 +08:00
Concedo	8edcb337c6	added ability to select "all devices"	2023-07-07 23:37:55 +08:00
Concedo	ddaa4f2a26	fix cuda garbage results and gpu selection issues	2023-07-07 22:14:14 +08:00
Concedo	95eca51bef	add gpu choice for GUI for cuda	2023-07-07 18:39:47 +08:00
Concedo	a689a66068	make it work with pyinstaller	2023-07-07 17:52:34 +08:00
Concedo	9ee9a77f12	warn outdated GUI (+1 squashed commits) Squashed commits: [15aec3d] spelling error	2023-07-07 16:39:17 +08:00
Concedo	32102c2064	Merge branch 'master' into concedo_experimental # Conflicts: # README.md	2023-07-07 14:15:39 +08:00
shutup	894c72819c	Merge branch 'concedo' of https://github.com/callMeMakerRen/koboldcpp into concedo	2023-07-07 11:57:25 +08:00
shutup	1727e652f1	expose some useful info that can be used in statistics of performence	2023-07-07 11:52:58 +08:00
Concedo	8424a35c62	added the ability to ban any substring tokens	2023-07-06 23:24:21 +08:00
Concedo	27a0907cfa	backport MM256_SET_M128I to ggml_v2, updated lite, added support for selecting the GPU for cublas	2023-07-06 22:33:46 +08:00
Concedo	4d1700b172	adjust some ui sizing	2023-07-06 15:17:47 +08:00
Vali-98	1c80002310	New UI using customtkinter (#284) * Initial conversion to customtkinter. * Initial conversion to customtkinter. * Additions to UI, still non-functional * UI now functional, untested * UI now functional, untested * Added saving configs * Saving and loading now functional * Fixed sliders not loading * Cleaned up duplicate arrays * Cleaned up duplicate arrays * Fixed loading bugs * wip fixing all the broken parameters. PLEASE test before you commit * further cleaning * bugfix completed for gui. now evaluating save and load * cleanup prepare to merge --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2023-07-06 15:00:57 +08:00
Concedo	00e35d0bbf	Merge branch 'concedo' into concedo_experimental	2023-07-04 18:46:40 +08:00
Michael Moon	f9108ba401	Make koboldcpp.py executable on Linux (#293)	2023-07-04 18:46:08 +08:00
Concedo	c6c0afdf18	refactor to avoid code duplication	2023-07-04 18:35:54 +08:00
Ycros	309534dcd0	implement sampler order, expose sampler order and mirostat in api	2023-07-02 18:15:34 +00:00
Concedo	632bf27b65	more granular context size selections	2023-07-01 11:02:44 +08:00
Concedo	eda663f15f	update lite and up ver	2023-07-01 00:15:26 +08:00
Concedo	f09debb1ec	remove debug	2023-06-29 20:54:56 +08:00
Concedo	4b3a1282f0	Add flag for lowvram directly into cublas launch param Merge remote-tracking branch 'yellowrose/pr/open/LostRuins/koboldcpp/lowvram' into concedo_experimental # Conflicts: # koboldcpp.py	2023-06-29 17:07:31 +08:00
Concedo	746f5fa9e9	update lite	2023-06-29 16:44:39 +08:00
Concedo	b084f4dc46	option for cublas	2023-06-28 21:16:40 +08:00
YellowRoseCx	8afa800fb6	Expose low_vram for CUDA Enabling --lowvram instructs the program to not allocate a VRAM scratch buffer for holding temporary results. Reduces VRAM usage at the cost of performance, particularly prompt processing speed. Requires CUDA	2023-06-26 16:47:22 -05:00
Concedo	1fdf9d1131	desc	2023-06-26 16:58:59 +08:00
Concedo	d2034ced7b	Merge branch 'master' into concedo_experimental # Conflicts: # README.md # build.zig # flake.nix # tests/test-grad0.c # tests/test-sampling.cpp # tests/test-tokenizer-0.cpp	2023-06-25 17:01:15 +08:00
Concedo	8342fe81b1	revert the wstring tokenization. coherency was affected	2023-06-24 12:58:49 +08:00
Concedo	6da38b0d40	up ver	2023-06-24 12:30:38 +08:00
Concedo	df9135e3a9	fixing memory bugs	2023-06-23 18:41:23 +08:00
Ycros	b1f00fa9cc	Fix hordeconfig max context setting, and add Makefile flags for cuda F16/KQuants per iter. (#252) * Fix hordeconfig maxcontext setting. * cuda: Bring DMMV_F16 and KQUANTS_ITER Makefile flags over from llama.	2023-06-21 23:01:46 +08:00
Concedo	d0d3c4f32b	Merge remote-tracking branch 'origin/master' into concedo_experimental # Conflicts: # README.md	2023-06-18 22:53:10 +08:00
Concedo	b08b371983	allow hordeconfig to set a max ctx length too.	2023-06-18 16:42:32 +08:00
Concedo	8775dd99f4	various debug logging improvements	2023-06-18 15:24:58 +08:00
Concedo	dbd11ddd60	up ver	2023-06-17 23:08:14 +08:00
Concedo	ae88eec40b	updated lite	2023-06-16 16:27:23 +08:00
Concedo	3ed3e7b7e2	reverted sequence mode for rwkv due to multiple issues with speed loss with bigger quantized models	2023-06-14 20:03:14 +08:00
Concedo	443903fa0f	up ver with these minor improvements	2023-06-14 11:50:13 +08:00
Concedo	82cf97ce92	hotfix for rwkv	2023-06-13 23:38:41 +08:00
Concedo	fa64971881	encoding	2023-06-10 21:05:35 +08:00
Concedo	66a3f4e421	added support for lora base	2023-06-10 19:29:45 +08:00
Concedo	a68fcfe738	only start a new thread when using sse	2023-06-10 19:03:41 +08:00
Concedo	43f7e40470	added extra endpoints for abort gen and polled streaming	2023-06-10 18:13:26 +08:00
Concedo	5bd9cef9fa	merging Proper SSE Token Streaming #220 with end connection fix test	2023-06-09 23:22:16 +08:00

304 commits