Concedo
8424a35c62
added the ability to ban any substring tokens
2023-07-06 23:24:21 +08:00
Concedo
27a0907cfa
backport MM256_SET_M128I to ggml_v2, updated lite, added support for selecting the GPU for cublas
2023-07-06 22:33:46 +08:00
Concedo
4d1700b172
adjust some ui sizing
2023-07-06 15:17:47 +08:00
Vali-98
1c80002310
New UI using customtkinter (#284)
* Initial conversion to customtkinter.
* Initial conversion to customtkinter.
* Additions to UI, still non-functional
* UI now functional, untested
* UI now functional, untested
* Added saving configs
* Saving and loading now functional
* Fixed sliders not loading
* Cleaned up duplicate arrays
* Cleaned up duplicate arrays
* Fixed loading bugs
* wip fixing all the broken parameters. PLEASE test before you commit
* further cleaning
* bugfix completed for gui. now evaluating save and load
* cleanup prepare to merge
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-07-06 15:00:57 +08:00
Concedo
00e35d0bbf
Merge branch 'concedo' into concedo_experimental
2023-07-04 18:46:40 +08:00
Michael Moon
f9108ba401
Make koboldcpp.py executable on Linux (#293)
2023-07-04 18:46:08 +08:00
Concedo
c6c0afdf18
refactor to avoid code duplication
2023-07-04 18:35:54 +08:00
Ycros
309534dcd0
implement sampler order, expose sampler order and mirostat in api
2023-07-02 18:15:34 +00:00
Concedo
632bf27b65
more granular context size selections
2023-07-01 11:02:44 +08:00
Concedo
eda663f15f
update lite and up ver
2023-07-01 00:15:26 +08:00
Concedo
f09debb1ec
remove debug
2023-06-29 20:54:56 +08:00
Concedo
4b3a1282f0
Add flag for lowvram directly into cublas launch param
Merge remote-tracking branch 'yellowrose/pr/open/LostRuins/koboldcpp/lowvram' into concedo_experimental
# Conflicts:
# koboldcpp.py
2023-06-29 17:07:31 +08:00
Concedo
746f5fa9e9
update lite
2023-06-29 16:44:39 +08:00
Concedo
b084f4dc46
option for cublas
2023-06-28 21:16:40 +08:00
YellowRoseCx
8afa800fb6
Expose low_vram for CUDA
Enabling --lowvram instructs the program not to allocate a VRAM scratch buffer for holding temporary results. This reduces VRAM usage at the cost of performance, particularly prompt processing speed. Requires CUDA.
2023-06-26 16:47:22 -05:00
Concedo
1fdf9d1131
desc
2023-06-26 16:58:59 +08:00
Concedo
d2034ced7b
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
# build.zig
# flake.nix
# tests/test-grad0.c
# tests/test-sampling.cpp
# tests/test-tokenizer-0.cpp
2023-06-25 17:01:15 +08:00
Concedo
8342fe81b1
revert the wstring tokenization. coherency was affected
2023-06-24 12:58:49 +08:00
Concedo
6da38b0d40
up ver
2023-06-24 12:30:38 +08:00
Concedo
df9135e3a9
fixing memory bugs
2023-06-23 18:41:23 +08:00
Ycros
b1f00fa9cc
Fix hordeconfig max context setting, and add Makefile flags for cuda F16/KQuants per iter. (#252)
* Fix hordeconfig maxcontext setting.
* cuda: Bring DMMV_F16 and KQUANTS_ITER Makefile flags over from llama.
2023-06-21 23:01:46 +08:00
Concedo
d0d3c4f32b
Merge remote-tracking branch 'origin/master' into concedo_experimental
# Conflicts:
# README.md
2023-06-18 22:53:10 +08:00
Concedo
b08b371983
allow hordeconfig to set a max ctx length too.
2023-06-18 16:42:32 +08:00
Concedo
8775dd99f4
various debug logging improvements
2023-06-18 15:24:58 +08:00
Concedo
dbd11ddd60
up ver
2023-06-17 23:08:14 +08:00
Concedo
ae88eec40b
updated lite
2023-06-16 16:27:23 +08:00
Concedo
3ed3e7b7e2
reverted sequence mode for rwkv due to multiple issues with speed loss with bigger quantized models
2023-06-14 20:03:14 +08:00
Concedo
443903fa0f
up ver with these minor improvements
2023-06-14 11:50:13 +08:00
Concedo
82cf97ce92
hotfix for rwkv
2023-06-13 23:38:41 +08:00
Concedo
fa64971881
encoding
2023-06-10 21:05:35 +08:00
Concedo
66a3f4e421
added support for lora base
2023-06-10 19:29:45 +08:00
Concedo
a68fcfe738
only start a new thread when using sse
2023-06-10 19:03:41 +08:00
Concedo
43f7e40470
added extra endpoints for abort gen and polled streaming
2023-06-10 18:13:26 +08:00
Concedo
5bd9cef9fa
merging Proper SSE Token Streaming #220 with end connection fix test
2023-06-09 23:22:16 +08:00
SammCheese
57b0b53b54
fix kobold lite generation
2023-06-09 12:39:35 +02:00
SammCheese
e6231c3055
back to http.server, improved implementation
2023-06-09 12:17:55 +02:00
SammCheese
dee692a63e
compatibility with basic_api, change api path to /extra
2023-06-08 18:34:24 +02:00
SammCheese
b4e9e185d3
fix legacy streaming
2023-06-08 18:34:24 +02:00
SammCheese
9a8da35ec4
working streaming. TODO: fix lite
2023-06-08 18:34:23 +02:00
SammCheese
97971291e9
draft: token streaming
2023-06-08 18:34:08 +02:00
Concedo
a6a0fa338a
cleanup indentation, fixing cublas build
2023-06-08 22:40:53 +08:00
Concedo
c046db5197
lite bugfixes, buffer size changes, fixed a topk bug.
2023-06-06 22:38:25 +08:00
Concedo
79df932d0a
added dropdown for blasbatch. added capability to build avx clblast but not in default build for now
2023-06-05 22:50:21 +08:00
Concedo
9aa2d8535b
hide gpu input box when dropdown not selected, minor memory fix for neox and gptj
2023-06-04 21:47:17 +08:00
Concedo
c3c05fc33b
further cleanup, refactor renamemode to hordeconfig
2023-06-04 11:57:46 +08:00
Concedo
8bd9a3a48b
updated readme, improved simple launcher
2023-06-03 17:17:15 +08:00
Concedo
9839259b63
allow specifying the horde limit as well
2023-06-03 00:55:44 +08:00
Concedo
37659d2c4e
allow blasbatchsize -1 which disables blas, but keeps benefits like gpu offloads.
2023-06-01 22:33:50 +08:00
Concedo
49272e3c53
adjusted defaults
2023-06-01 20:03:44 +08:00
Concedo
32dada5e5f
updated lite
2023-05-31 17:52:09 +08:00