Commit graph

304 commits

Author SHA1 Message Date
Concedo
1d1111e10f expose timing info in web api 2023-07-11 18:56:06 +08:00
Concedo
7222877069 Merge remote-tracking branch 'ren/concedo' into concedo_experimental 2023-07-11 18:45:36 +08:00
Concedo
5ca204d527 Merge remote-tracking branch 'yellowrose/pr/open/LostRuins/koboldcpp/multigpu-cuda-gui' into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-07-11 18:22:54 +08:00
Concedo
4be167915a added linear rope option, added warning for bad samplers 2023-07-11 18:08:19 +08:00
Concedo
9324cb804a reimplemented save and load 2023-07-10 22:49:27 +08:00
YellowRoseCx
f1014f3cc7 remove unused .re 2023-07-10 00:26:40 -05:00
YellowRoseCx
242f01e983 Add Multi-GPU CuBLAS support in the new GUI 2023-07-09 17:10:14 -05:00
callMeMakerRen
4e46673f80 Merge branch 'LostRuins:concedo' into concedo 2023-07-08 09:33:26 +08:00
Concedo
8edcb337c6 added ability to select "all devices" 2023-07-07 23:37:55 +08:00
Concedo
ddaa4f2a26 fix cuda garbage results and gpu selection issues 2023-07-07 22:14:14 +08:00
Concedo
95eca51bef add gpu choice for GUI for cuda 2023-07-07 18:39:47 +08:00
Concedo
a689a66068 make it work with pyinstaller 2023-07-07 17:52:34 +08:00
Concedo
9ee9a77f12 warn outdated GUI (+1 squashed commits)
Squashed commits:

[15aec3d] spelling error
2023-07-07 16:39:17 +08:00
Concedo
32102c2064 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-07-07 14:15:39 +08:00
shutup
894c72819c Merge branch 'concedo' of https://github.com/callMeMakerRen/koboldcpp into concedo 2023-07-07 11:57:25 +08:00
shutup
1727e652f1 expose some useful info that can be used in statistics of performance 2023-07-07 11:52:58 +08:00
Concedo
8424a35c62 added the ability to ban any substring tokens 2023-07-06 23:24:21 +08:00
Concedo
27a0907cfa backport MM256_SET_M128I to ggml_v2, updated lite, added support for selecting the GPU for cublas 2023-07-06 22:33:46 +08:00
Concedo
4d1700b172 adjust some ui sizing 2023-07-06 15:17:47 +08:00
Vali-98
1c80002310 New UI using customtkinter (#284)
* Initial conversion to customtkinter.

* Initial conversion to customtkinter.

* Additions to UI, still non-functional

* UI now functional, untested

* UI now functional, untested

* Added saving configs

* Saving and loading now functional

* Fixed sliders not loading

* Cleaned up duplicate arrays

* Cleaned up duplicate arrays

* Fixed loading bugs

* wip fixing all the broken parameters. PLEASE test before you commit

* further cleaning

* bugfix completed for gui. now evaluating save and load

* cleanup prepare to merge

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-07-06 15:00:57 +08:00
Concedo
00e35d0bbf Merge branch 'concedo' into concedo_experimental 2023-07-04 18:46:40 +08:00
Michael Moon
f9108ba401 Make koboldcpp.py executable on Linux (#293) 2023-07-04 18:46:08 +08:00
Concedo
c6c0afdf18 refactor to avoid code duplication 2023-07-04 18:35:54 +08:00
Ycros
309534dcd0 implement sampler order, expose sampler order and mirostat in api 2023-07-02 18:15:34 +00:00
Concedo
632bf27b65 more granular context size selections 2023-07-01 11:02:44 +08:00
Concedo
eda663f15f update lite and up ver 2023-07-01 00:15:26 +08:00
Concedo
f09debb1ec remove debug 2023-06-29 20:54:56 +08:00
Concedo
4b3a1282f0 Add flag for lowvram directly into cublas launch param
Merge remote-tracking branch 'yellowrose/pr/open/LostRuins/koboldcpp/lowvram' into concedo_experimental

# Conflicts:
#	koboldcpp.py
2023-06-29 17:07:31 +08:00
Concedo
746f5fa9e9 update lite 2023-06-29 16:44:39 +08:00
Concedo
b084f4dc46 option for cublas 2023-06-28 21:16:40 +08:00
YellowRoseCx
8afa800fb6 Expose low_vram for CUDA
Enabling --lowvram instructs the program to not allocate a VRAM scratch buffer for holding temporary results. Reduces VRAM usage at the cost of performance, particularly prompt processing speed. Requires CUDA
2023-06-26 16:47:22 -05:00
Concedo
1fdf9d1131 desc 2023-06-26 16:58:59 +08:00
Concedo
d2034ced7b Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
#	build.zig
#	flake.nix
#	tests/test-grad0.c
#	tests/test-sampling.cpp
#	tests/test-tokenizer-0.cpp
2023-06-25 17:01:15 +08:00
Concedo
8342fe81b1 revert the wstring tokenization. coherency was affected 2023-06-24 12:58:49 +08:00
Concedo
6da38b0d40 up ver 2023-06-24 12:30:38 +08:00
Concedo
df9135e3a9 fixing memory bugs 2023-06-23 18:41:23 +08:00
Ycros
b1f00fa9cc Fix hordeconfig max context setting, and add Makefile flags for cuda F16/KQuants per iter. (#252)
* Fix hordeconfig maxcontext setting.

* cuda: Bring DMMV_F16 and KQUANTS_ITER Makefile flags over from llama.
2023-06-21 23:01:46 +08:00
Concedo
d0d3c4f32b Merge remote-tracking branch 'origin/master' into concedo_experimental
# Conflicts:
#	README.md
2023-06-18 22:53:10 +08:00
Concedo
b08b371983 allow hordeconfig to set a max ctx length too. 2023-06-18 16:42:32 +08:00
Concedo
8775dd99f4 various debug logging improvements 2023-06-18 15:24:58 +08:00
Concedo
dbd11ddd60 up ver 2023-06-17 23:08:14 +08:00
Concedo
ae88eec40b updated lite 2023-06-16 16:27:23 +08:00
Concedo
3ed3e7b7e2 reverted sequence mode for rwkv due to multiple issues with speed loss with bigger quantized models 2023-06-14 20:03:14 +08:00
Concedo
443903fa0f up ver with these minor improvements 2023-06-14 11:50:13 +08:00
Concedo
82cf97ce92 hotfix for rwkv 2023-06-13 23:38:41 +08:00
Concedo
fa64971881 encoding 2023-06-10 21:05:35 +08:00
Concedo
66a3f4e421 added support for lora base 2023-06-10 19:29:45 +08:00
Concedo
a68fcfe738 only start a new thread when using sse 2023-06-10 19:03:41 +08:00
Concedo
43f7e40470 added extra endpoints for abort gen and polled streaming 2023-06-10 18:13:26 +08:00
Concedo
5bd9cef9fa merging Proper SSE Token Streaming #220 with end connection fix test 2023-06-09 23:22:16 +08:00