Commit graph

277 commits

Concedo
f288c6b5e3 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	build.zig
#	scripts/sync-ggml.sh
2023-10-10 00:09:46 +08:00
Matěj Štágl
96e9539f05
OpenAI compat API adapter (#466)
* feat: oai-adapter

* simplify optional adapter for instruct start and end tags

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-10-09 23:24:48 +08:00
Concedo
4e5b6293ab adjust streaming timings 2023-10-08 23:12:45 +08:00
Concedo
a2b8473354 force flush sse 2023-10-08 15:12:07 +08:00
Concedo
07a114de63 force debugmode to be indicated on horde, allow 64k context for gguf 2023-10-07 10:23:33 +08:00
Concedo
120695ddf7 add update link 2023-10-07 01:33:18 +08:00
Concedo
2a36c85558 abort has multiuser support via genkey too 2023-10-06 23:27:00 +08:00
Concedo
1d1232ffbc show horde job count 2023-10-06 18:42:59 +08:00
Concedo
efd0567f10 Merge branch 'concedo' into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-10-06 11:22:01 +08:00
grawity
9d0dd7ab11
avoid leaving a zombie process for --onready (#462)
Popen() needs to be used with 'with', or have .wait() called, or be
destroyed; otherwise a zombie child sticks around until the object is
GC'd.
2023-10-06 11:06:37 +08:00
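A minimal sketch of the pattern this commit describes; the command and surrounding code are illustrative assumptions, not the actual --onready handling in koboldcpp.py, but they show the two ways of reaping the child that the message mentions.

```python
import subprocess

# Hypothetical command to run once the model has finished loading;
# the real --onready argument in koboldcpp.py may be wired differently.
onready_cmd = ["echo", "koboldcpp is ready"]

# Option 1: the context manager waits for the child on exit, so it is reaped.
with subprocess.Popen(onready_cmd) as proc:
    pass  # proc.wait() is invoked implicitly when the block exits

# Option 2: keep the handle and call .wait() explicitly; without it,
# a zombie child lingers until the Popen object is garbage-collected.
proc = subprocess.Popen(onready_cmd)
proc.wait()
```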
Concedo
da8a09ba10 use filename as default model name 2023-10-05 22:24:20 +08:00
Concedo
a0c1ba7747 Merge branch 'concedo_experimental' of https://github.com/LostRuins/llamacpp-for-kobold into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-10-05 21:20:21 +08:00
Concedo
b4b5c35074 add documentation for koboldcpp 2023-10-05 21:17:36 +08:00
teddybear082
f9f4cdf3c0
Implement basic chat/completions openai endpoint (#461)
* Implement basic chat/completions openai endpoint

-Basic support for openai chat/completions endpoint documented at: https://platform.openai.com/docs/api-reference/chat/create

-Tested with example code from openai for chat/completions and chat/completions with stream=True parameter found here: https://cookbook.openai.com/examples/how_to_stream_completions.

-Tested with Mantella, the Skyrim mod that turns all the NPCs into AI-chattable characters, which uses openai's acreate / async completions method: https://github.com/art-from-the-machine/Mantella/blob/main/src/output_manager.py

-Tested default koboldcpp api behavior with streaming and non-streaming generate endpoints and with the GUI running, and it seems to be fine.

-Still TODO / evaluate before merging:

(1) implement the rest of the openai chat/completions parameters to the extent possible, mapping them to koboldcpp parameters

(2) determine if there is a way to use kobold's prompt formats for certain models when translating the openai messages format into a prompt string. (Not sure if possible or where these are in the code)

(3) have chat/completions responses include the actual local model the user is using instead of just koboldcpp (Not sure if this is possible)

Note I am a Python noob, so if there is a more elegant way of doing this, hopefully I have at least done some of the grunt work for you to implement on your own.

* Fix typographical error on deleted streaming argument

-Mistakenly left code relating to streaming argument from main branch in experimental.

* add additional openai chat completions parameters

-support stop parameter mapped to koboldai stop_sequence parameter

-make default max_length / max_tokens parameter consistent with default 80 token length in generate function

-add support for providing name of local model in openai responses

* Revert "add additional openai chat completions parameters"

This reverts commit 443a6f7ff6.

* add additional openai chat completions parameters

-support stop parameter mapped to koboldai stop_sequence parameter

-make default max_length / max_tokens parameter consistent with default 80 token length in generate function

-add support for providing name of local model in openai responses

* add \n after formatting prompts from openaiformat

to conform with the alpaca standard used as default in lite.koboldai.net

* tidy up and simplify code, do not set globals for streaming

* oai endpoints must start with v1

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-10-05 20:13:10 +08:00
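A rough sketch of the kind of translation this PR describes: an OpenAI chat/completions request flattened into a single prompt (with a trailing newline per turn) and its parameters mapped onto koboldcpp-style names (stop to stop_sequence, max_tokens to max_length with the 80-token default). The function name, payload fields, and defaults here are illustrative assumptions, not the actual code in koboldcpp.py.

```python
def oai_to_kobold(oai_body: dict, model_name: str = "koboldcpp") -> dict:
    """Illustrative translation of an OpenAI chat/completions request
    into a koboldcpp-style generate payload (field names assumed)."""
    # Flatten the messages array into one prompt string, ending each
    # turn with a newline, as the PR describes.
    prompt = ""
    for msg in oai_body.get("messages", []):
        prompt += f"{msg.get('role', 'user')}: {msg.get('content', '')}\n"

    return {
        "prompt": prompt,
        # OpenAI max_tokens -> kobold max_length, defaulting to 80 tokens
        # to stay consistent with the generate endpoint's default length.
        "max_length": oai_body.get("max_tokens", 80),
        # OpenAI stop -> kobold stop_sequence.
        "stop_sequence": oai_body.get("stop", []),
        # Assumed default; pass through whatever the client sent otherwise.
        "temperature": oai_body.get("temperature", 0.7),
        # The PR adds the local model name to responses; carried here only
        # so the caller can echo it back.
        "model": model_name,
    }
```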
Concedo
ce065d39d0 allow drag and drop kcpps file and openwith 2023-10-05 11:38:37 +08:00
Concedo
47f7ebb632 adjust horde worker and debugmode 2023-10-04 14:00:07 +08:00
Concedo
ea726fcffa cleanup threaded horde submit 2023-10-04 00:34:26 +08:00
Concedo
0cc740115d updated lite, improve horde worker (+1 squashed commits)
Squashed commits:

[a7c25999] improve horde worker
2023-10-03 23:44:27 +08:00
Concedo
ae8ccdc1be Remove old tkinter gui (+1 squashed commits)
Squashed commits:

[0933c1da] Remove old tkinter gui
2023-10-03 22:05:44 +08:00
Concedo
d10470a1e3 Breaking Change: Remove deprecated commands 2023-10-03 17:16:09 +08:00
Concedo
5d3e142145 use_default_badwordsids defaults to false if the parameter is missing 2023-10-02 19:41:07 +08:00
Concedo
23b9d3af49 force oai endpoints to return json 2023-10-02 12:45:14 +08:00
Concedo
0c47e79537 updated the API routing path and fixed a bug with threads 2023-10-02 11:05:19 +08:00
Concedo
dffc6bee74 deprecate some launcher arguments. 2023-10-01 22:30:48 +08:00
Concedo
b49a5bc546 formatting of text 2023-10-01 18:38:32 +08:00
Concedo
bc841ec302 flag to retain grammar, fix makefile (+2 squashed commit)
Squashed commit:

[d5cd3f28] flag to retain grammar, fix makefile

[b3352963] updated lite to v73
2023-10-01 14:39:56 +08:00
Concedo
191de1e8a3 allow launching with kcpps files 2023-09-30 19:35:03 +08:00
Concedo
ca8b315202 increase context for gguf to 32k, horde worker stats, fixed glitch in horde launcher ui, oai freq penalty, updated lite 2023-09-28 23:50:08 +08:00
Concedo
6a821b268a improved SSE streaming 2023-09-28 17:33:34 +08:00
Concedo
cf31658cbf added a flag to keep console in foreground 2023-09-27 01:53:30 +08:00
Concedo
eb86cd4027 bump token limits 2023-09-27 01:26:00 +08:00
Concedo
8bf6f7f8b0 added simulated OAI endpoint 2023-09-27 00:49:24 +08:00
Concedo
7f112e2cd4 support genkeys in polled streaming 2023-09-26 23:46:07 +08:00
Concedo
6c2134a860 improved makefile, allowing building without k quants 2023-09-25 22:10:47 +08:00
Concedo
17ee719c56 improved remotelink cmd, fixed lib unload, updated class.py 2023-09-25 17:50:00 +08:00
Concedo
8ecf505d5d improved embedded horde worker (+2 squashed commit)
Squashed commit:

[99234379] improved embedded horde worker

[ebcd1968] update lite
2023-09-24 15:16:49 +08:00
Concedo
32cf02487e colab use mmq, update lite and ver 2023-09-23 23:32:00 +08:00
Concedo
bfc696fcc4 update lite, update ver 2023-09-23 12:35:23 +08:00
Concedo
14295922f9 updated ver, updated lite (+1 squashed commits)
Squashed commits:

[891291bc] updated lite to v67
2023-09-21 17:44:01 +08:00
Concedo
b63cf223c9 add queue info 2023-09-20 21:07:21 +08:00
Concedo
8c453d1e4e added grammar sampling 2023-09-18 23:02:00 +08:00
Concedo
951614bfc6 library unloading is working 2023-09-18 15:03:52 +08:00
Concedo
53885de6db added multiuser mode 2023-09-16 11:23:39 +08:00
YellowRoseCx
4218641d97
Separate CuBLAS/hipBLAS (#438) 2023-09-16 10:13:44 +08:00
Concedo
63fcbbb3f1 Change label to avoid confusion - ROCm hipBLAS users should obtain binaries from the yellowrosecx fork. The ROCm support in this repo requires self-compilation 2023-09-16 00:04:11 +08:00
Concedo
4d3a64fbb2 add endpoint to fetch true max context 2023-09-14 23:27:12 +08:00
Concedo
3d50c6fe0b only add dll directory on windows 2023-09-13 18:45:54 +08:00
Concedo
8f8a530b83 add additional paths to look for DLLs inside 2023-09-13 14:30:13 +08:00
Concedo
74384cfbb5 added onready argument to execute a command after load is done 2023-09-12 17:10:52 +08:00
Concedo
6667fdcec8 add option for 4th gpu, also fixed missing case in auto rope scaling 2023-09-11 11:43:54 +08:00