Commit graph

406 commits

Author SHA1 Message Date
Concedo
21588cefd4 tunnel code done (+1 squashed commit)
Squashed commits:

[b4bc7d20] wip integration of trycloudflare
2023-11-01 23:28:23 +08:00
Concedo
3b227fc704 automatic gpu layer detection 2023-11-01 20:55:26 +08:00
Concedo
b395dbf6f5 wip layer calculator 2023-11-01 20:04:10 +08:00
Concedo
ae2cd56de8 kobold integration of min_p sampler (+1 squashed commit)
Squashed commits:

[8ad2e349] kobold integration for min_p sampler
2023-11-01 19:08:45 +08:00
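
The min_p sampler integrated above keeps only tokens whose probability is at least min_p times that of the most likely token, then renormalizes before sampling. A minimal sketch of that scheme (illustrative Python, not koboldcpp's actual sampler code):

    import numpy as np

    def sample_min_p(logits: np.ndarray, min_p: float = 0.1, rng=None) -> int:
        # illustrative sketch of the usual min_p scheme
        if rng is None:
            rng = np.random.default_rng()
        # softmax over the logits
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        # the cutoff scales with the top token's probability, so the filter
        # is strict for peaked distributions and lenient for flat ones
        threshold = min_p * probs.max()
        probs[probs < threshold] = 0.0
        probs /= probs.sum()  # renormalize the survivors
        return int(rng.choice(len(probs), p=probs))
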
Concedo
df7e757d40 windows: added simpleclinfo, which helps determine clblast platform and device on windows 2023-11-01 18:10:35 +08:00
Concedo
f3690ba6d2 shifting enabled by default 2023-10-31 21:41:57 +08:00
Concedo
61c395833d context shifting is still buggy 2023-10-30 16:25:01 +08:00
Concedo
7f5d1b2fc6 slider error 2023-10-30 00:02:38 +08:00
Concedo
7924592a83 context shift feature done 2023-10-29 18:21:39 +08:00
Concedo
09c74ea046 include content-length 2023-10-28 14:24:37 +08:00
Concedo
15f525c580 revamped smart context for llama models 2023-10-28 12:59:08 +08:00
Concedo
c2f675133d support for abort without crash on disconnect 2023-10-27 15:27:17 +08:00
Concedo
aed05e5565 todo: troubleshoot sse with multiuser 2023-10-27 00:21:52 +08:00
Concedo
5db89b90b7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README.md
#	build.zig
#	ggml-opencl.cpp
#	tests/CMakeLists.txt
#	tests/test-double-float.cpp
#	tests/test-sampling.cpp
2023-10-25 23:58:15 +08:00
Concedo
98d1dba256 tighten timings 2023-10-25 20:44:20 +08:00
Concedo
cff75061fe fixed some old models failing due to tokenizer changes, update lite (+1 squashed commit)
Squashed commits:

[9dee81ec] fixed some old models failing due to tokenizer changes, update lite tooltip (+3 squashed commits)

Squashed commits:

[5ab95a79] fixes

[a561d5e2] fixed some old models failing due to tokenizer changes

[95e65daf] lite updates
2023-10-22 11:04:59 +08:00
Concedo
6fa681b692 fixed a race condition with SSE streaming 2023-10-20 22:01:09 +08:00
Concedo
4382e51719 updated lite and default horde ctx amount 2023-10-19 22:49:59 +08:00
Concedo
6f8fe88f10 fix for lite (+5 squashed commits)
Squashed commits:

[f9ce9855] catch more exceptions

[8cdaf149] tweaked horde worker timeouts, updated lite

[619ebef4] fixed abort returning no response when it failed

[a54a66a2] fixed time overflow

[9affdc3e] updated lite
2023-10-17 23:04:32 +08:00
Concedo
643902fbbb fixed tensor split save and load 2023-10-13 10:07:22 +08:00
Concedo
7e2f714c9c tensor split only for cuda 2023-10-12 17:01:52 +08:00
Alexander Abushady
11b8f97c1e
Tensor split UI (#471)
* update .gitignore

Remove the .idea folder created by JetBrains products.

* Front end, and partial back-end

Tensor Split pulled in and shown in the console, but then not respected on model load.

* UI Tweak + Tensor Split Fix

Made the Tensor Split input match similar boxes around it. Also, fixed Tensor Split to populate the correct argument.

* Changed int to float for tensor split

Accidentally set int; it needed to be float when setting the tensor split args
2023-10-12 16:50:21 +08:00
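
The int-to-float fix above matters because tensor split values are fractional ratios across GPUs. A hedged sketch of the relevant argument parsing (the surrounding code is illustrative, not the actual patch):

    import argparse

    parser = argparse.ArgumentParser()
    # illustrative: type=int would reject fractional ratios like 0.6
    # outright, so the split values must be parsed as floats
    parser.add_argument("--tensor_split", nargs="+", type=float, default=None)

    args = parser.parse_args(["--tensor_split", "0.6", "0.4"])
    print(args.tensor_split)  # [0.6, 0.4]
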
Concedo
8be043ee38 more horde optimizations 2023-10-12 16:20:52 +08:00
Concedo
8d1cd512e2 missed a flag 2023-10-12 15:00:51 +08:00
Concedo
c6fe820357 improve cors and header handling 2023-10-12 14:53:39 +08:00
Concedo
f604cffdce multiuser race condition bugfix
Concedo
a003e3c348 horde auto recovery 2023-10-12 00:57:32 +08:00
Concedo
d74eab0e63 actually for this round, do not include deprecated params. I don't want to have to deal with them (+2 squashed commits)
Squashed commits:

[df2691c2] show context limit

[7c74f52a] prevent old scripts from crashing
2023-10-10 19:20:33 +08:00
YellowRoseCx
1b25b21655
Merge pull request #27 from one-lithe-rune/allow-sdk-dll-loading - Allow use of hip SDK (if installed) dlls on windows (#470)
* If the ROCm/HIP SDK is installed on Windows, then include the SDK
as a potential location to load the hipBlas/rocBlas .dlls from. This
allows running koboldcpp.py directly with Python on Windows after
building, without having to build the .exe and run that, or copy
.dlls around.

Co-authored-by: one-lithe-rune <skapusniak@lithe-runes.com>
2023-10-10 17:16:33 +08:00
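
A sketch of the loading approach that pull request describes (the HIP_PATH environment variable and the path layout are assumptions here, not the exact patch): on Windows, if a HIP SDK install can be found, add its bin directory to the DLL search path so the hipBlas/rocBlas libraries resolve when running the script directly.

    import os

    # HIP_PATH is assumed to be set by the AMD HIP SDK installer on Windows
    hip_path = os.environ.get("HIP_PATH")
    if os.name == "nt" and hip_path:
        hip_bin = os.path.join(hip_path, "bin")
        if os.path.isdir(hip_bin):
            # makes hipblas.dll / rocblas.dll discoverable by ctypes
            os.add_dll_directory(hip_bin)
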
Concedo
f288c6b5e3 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	build.zig
#	scripts/sync-ggml.sh
2023-10-10 00:09:46 +08:00
Matěj Štágl
96e9539f05
OpenAI compat API adapter (#466)
* feat: oai-adapter

* simplify optional adapter for instruct start and end tags

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-10-09 23:24:48 +08:00
Concedo
4e5b6293ab adjust streaming timings 2023-10-08 23:12:45 +08:00
Concedo
a2b8473354 force flush sse 2023-10-08 15:12:07 +08:00
Concedo
07a114de63 force debugmode to be indicated on horde, allow 64k context for gguf 2023-10-07 10:23:33 +08:00
Concedo
120695ddf7 add update link 2023-10-07 01:33:18 +08:00
Concedo
2a36c85558 abort has multiuser support via genkey too 2023-10-06 23:27:00 +08:00
Concedo
1d1232ffbc show horde job count 2023-10-06 18:42:59 +08:00
Concedo
efd0567f10 Merge branch 'concedo' into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-10-06 11:22:01 +08:00
grawity
9d0dd7ab11
avoid leaving a zombie process for --onready (#462)
Popen() needs to be used with 'with' or have .wait() called or be
destroyed, otherwise there is a zombie child that sticks around until
the object is GC'd.
2023-10-06 11:06:37 +08:00
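
The fix grawity describes amounts to making sure the --onready child is reaped. A minimal sketch under those notes (the function and thread setup are illustrative, not the exact patch):

    import subprocess
    import threading

    def run_onready(cmd: str) -> None:
        # illustrative sketch of reaping the --onready child process
        def runner():
            proc = subprocess.Popen(cmd, shell=True)
            # .wait() reaps the child; without it the process lingers
            # as a zombie until the Popen object is garbage-collected
            proc.wait()
        # background thread so the server loop is not blocked
        threading.Thread(target=runner, daemon=True).start()
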
Concedo
da8a09ba10 use filename as default model name 2023-10-05 22:24:20 +08:00
Concedo
a0c1ba7747 Merge branch 'concedo_experimental' of https://github.com/LostRuins/llamacpp-for-kobold into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-10-05 21:20:21 +08:00
Concedo
b4b5c35074 add documentation for koboldcpp 2023-10-05 21:17:36 +08:00
teddybear082
f9f4cdf3c0
Implement basic chat/completions openai endpoint (#461)
* Implement basic chat/completions openai endpoint

-Basic support for openai chat/completions endpoint documented at: https://platform.openai.com/docs/api-reference/chat/create

-Tested with example code from openai for chat/completions and chat/completions with stream=True parameter found here: https://cookbook.openai.com/examples/how_to_stream_completions.

-Tested with Mantella, the Skyrim mod that turns all the NPCs into AI-chattable characters, which uses openai's acreate / async completions method: https://github.com/art-from-the-machine/Mantella/blob/main/src/output_manager.py

-Tested default koboldcpp API behavior with the streaming and non-streaming generate endpoints and with the GUI running; all seems to be fine.

-Still TODO / evaluate before merging:

(1) implement rest of openai chat/completion parameters to the extent possible, mapping to koboldcpp parameters

(2) determine if there is a way to use kobold's prompt formats for certain models when translating openai messages format into a prompt string. (Not sure if possible or where these are in the code)

(3) have chat/completions responses include the actual local model the user is using instead of just koboldcpp (Not sure if this is possible)

Note: I am a Python noob, so if there is a more elegant way of doing this, hopefully I have at minimum done some of the grunt work for you to implement on your own.

* Fix typographical error on deleted streaming argument

-Mistakenly left code relating to streaming argument from main branch in experimental.

* add additional openai chat completions parameters

-support stop parameter mapped to koboldai stop_sequence parameter

-make default max_length / max_tokens parameter consistent with default 80 token length in generate function

-add support for providing name of local model in openai responses

* Revert "add additional openai chat completions parameters"

This reverts commit 443a6f7ff6.

* add additional openai chat completions parameters

-support stop parameter mapped to koboldai stop_sequence parameter

-make default max_length / max_tokens parameter consistent with default 80 token length in generate function

-add support for providing name of local model in openai responses

* add \n after formatting prompts from openaiformat

to conform with the Alpaca standard used as default in lite.koboldai.net

* tidy up and simplify code, do not set globals for streaming

* oai endpoints must start with v1

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-10-05 20:13:10 +08:00
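
The core of the endpoint above is a translation layer between the OpenAI messages format and kobold's generate parameters. A hedged sketch of that mapping, built only from the PR notes (the role formatting and helper name are illustrative, not the merged code):

    def oai_to_kobold(payload: dict) -> dict:
        # flatten the chat messages into a single prompt string; per the
        # PR notes, a "\n" is appended after formatting to conform with
        # the Alpaca-style default used by lite.koboldai.net
        prompt = ""
        for msg in payload.get("messages", []):
            prompt += f"{msg.get('role', 'user')}: {msg.get('content', '')}\n"

        # OpenAI 'stop' may be a string or a list; kobold's stop_sequence
        # expects a list (mapping taken from the PR notes)
        stop = payload.get("stop") or []
        if isinstance(stop, str):
            stop = [stop]

        return {
            "prompt": prompt,
            "stop_sequence": stop,
            # 80 matches the generate function's default length per the PR
            "max_length": payload.get("max_tokens", 80),
        }
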
Concedo
ce065d39d0 allow drag and drop kcpps file and openwith 2023-10-05 11:38:37 +08:00
Concedo
47f7ebb632 adjust horde worker and debugmode 2023-10-04 14:00:07 +08:00
Concedo
ea726fcffa cleanup threaded horde submit 2023-10-04 00:34:26 +08:00
Concedo
0cc740115d updated lite, improve horde worker (+1 squashed commit)
Squashed commits:

[a7c25999] improve horde worker
2023-10-03 23:44:27 +08:00
Concedo
ae8ccdc1be Remove old tkinter gui (+1 squashed commit)
Squashed commits:

[0933c1da] Remove old tkinter gui
2023-10-03 22:05:44 +08:00
Concedo
d10470a1e3 Breaking Change: Remove deprecated commands 2023-10-03 17:16:09 +08:00
Concedo
5d3e142145 use_default_badwordsids defaults to false if the parameter is missing 2023-10-02 19:41:07 +08:00