Commit graph

461 commits

Author SHA1 Message Date
Eugene Palmoff
a787ebe7cf Handle broken pipe error (#572) 2023-12-21 17:51:36 +08:00
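Commit a787ebe7cf handles a broken pipe error. The usual Python pattern — likely what a fix like this amounts to, though the function below is a hypothetical sketch and not koboldcpp's actual code — is to treat `BrokenPipeError` (and the related `ConnectionResetError`) as a normal client disconnect rather than letting it crash the serving thread:

```python
import socket

def send_response(conn: socket.socket, payload: bytes) -> str:
    """Send a payload, treating a client disconnect as non-fatal.

    (Illustrative helper; names and return values are assumptions.)
    """
    try:
        conn.sendall(payload)
        return "sent"
    except (BrokenPipeError, ConnectionResetError):
        # The client hung up before the full response was written;
        # log and move on instead of crashing the server thread.
        return "client disconnected"
```

Python ignores SIGPIPE at startup, so a write to a closed peer surfaces as a catchable `BrokenPipeError` rather than killing the process.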
Concedo
3f863eed72 add presence penalty 2023-12-19 23:18:56 +08:00
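Commit 3f863eed72 adds presence penalty. As a technique (this sketch is an assumption about the standard definition, not the project's implementation), a presence penalty subtracts a flat amount from the logit of every token that has already appeared in the output, discouraging repetition; a frequency penalty would instead scale with the repeat count:

```python
def apply_presence_penalty(logits, generated_ids, penalty):
    """Flat-penalize every token id already present in the output.

    logits: list of float scores indexed by token id.
    generated_ids: token ids emitted so far.
    (Hypothetical helper for illustration.)
    """
    seen = set(generated_ids)
    return [score - penalty if tok_id in seen else score
            for tok_id, score in enumerate(logits)]
```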
Concedo
da2db0302c Added support for ssl cert and key 2023-12-19 22:23:19 +08:00
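Commit da2db0302c adds support for an SSL cert and key. With Python's standard library, serving HTTPS from a user-supplied cert/key pair comes down to wrapping the server socket in a TLS context — a minimal sketch (function name, host/port defaults, and file paths are illustrative assumptions):

```python
import ssl
from http.server import BaseHTTPRequestHandler, HTTPServer

def make_tls_server(certfile, keyfile, host="0.0.0.0", port=5001):
    """Bind an HTTPServer and wrap its socket in TLS using the
    given certificate and private-key files (paths illustrative)."""
    httpd = HTTPServer((host, port), BaseHTTPRequestHandler)
    try:
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
        httpd.socket = ctx.wrap_socket(httpd.socket, server_side=True)
    except Exception:
        # Don't leak the listening socket if the cert/key is bad.
        httpd.server_close()
        raise
    return httpd
```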
Concedo
49a5dfc604 Merge branch 'master' into concedo_experimental 2023-12-19 16:07:48 +08:00
    # Conflicts:
    #	Makefile
    #	README.md
Concedo
1f77d2ad73 move multiprocessing import into function scope 2023-12-19 15:56:58 +08:00
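Commit 1f77d2ad73 moves the multiprocessing import into function scope. Deferring a heavy or side-effectful import to the function that uses it keeps module load cheap and avoids import-time interactions (e.g. with frozen/PyInstaller builds). A minimal sketch of the pattern (the function itself is hypothetical):

```python
def count_cpus():
    # Imported here, not at module top level, so loading this module
    # never pays multiprocessing's import cost or side effects; the
    # import only happens on the first call and is cached after that.
    import multiprocessing
    return multiprocessing.cpu_count()
```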
ebolam
6948da5a0d Fix for windows model unloading not releasing memory (#569) 2023-12-19 15:55:41 +08:00
    * Add in model processes as a separate process so it can be killed when unloading to release memory on windows
    * Fix from Henky
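The fix in 6948da5a0d hinges on a general technique: memory held by a child process is returned to the OS wholesale when that process is terminated, which sidesteps allocators that won't give pages back on Windows. A minimal sketch of hosting work in a killable child over a pipe (all names here are illustrative, not the project's code):

```python
import multiprocessing as mp

def _worker(conn):
    """Stand-in for the model process: hold state and answer
    requests until the pipe closes or the process is killed."""
    state = {"loaded": True}  # placeholder for real model memory
    while True:
        try:
            msg = conn.recv()
        except EOFError:
            break
        conn.send(f"echo:{msg}")

def run_in_child():
    parent, child = mp.Pipe()
    proc = mp.Process(target=_worker, args=(child,), daemon=True)
    proc.start()
    parent.send("hi")
    reply = parent.recv()
    # Terminating the child hands all of its memory back to the OS
    # at once -- the point of the unloading fix.
    proc.terminate()
    proc.join()
    return reply

if __name__ == "__main__":
    run_in_child()
```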
Concedo
ec05230703 updated lite, up ver 2023-12-17 14:38:39 +08:00
Concedo
aac7f0b944 Merge branch 'master' into concedo_experimental 2023-12-14 17:24:42 +08:00
    # Conflicts:
    #	ggml.c
Concedo
f0de4953ae fixed length exceeding max ctx 2023-12-14 16:58:41 +08:00
Concedo
0e31f53422 Revert "lowvram var defaults" 2023-12-14 15:14:11 +08:00
    This reverts commit 7a691522a6.
Concedo
8dd975653d removing existing yml files 2023-12-14 14:47:03 +08:00
Concedo
74acc5441d Revert "Hide hipBLAS (ROCm) if CuBLAS exists - vice versa" 2023-12-12 10:53:34 +08:00
    This reverts commit 4b854d46a4.
Concedo
06581f243f perf endpoint lets you monitor if the embedded horde worker has issues 2023-12-11 16:54:42 +08:00
YellowRoseCx
4b854d46a4 Hide hipBLAS (ROCm) if CuBLAS exists - vice versa 2023-12-10 22:49:35 -06:00
Concedo
7a691522a6 lowvram var defaults 2023-12-08 21:06:32 +08:00
Concedo
7418bca910 up ver 2023-12-08 19:20:30 +08:00
Concedo
c47bc28488 slight refactor for noscript ui 2023-12-08 18:35:45 +08:00
Concedo
ec21fa7712 Merge branch 'master' into concedo_experimental 2023-12-08 17:42:26 +08:00
    # Conflicts:
    #	.github/workflows/build.yml
    #	.gitignore
    #	CMakeLists.txt
    #	Makefile
    #	Package.swift
    #	README.md
    #	ggml-cuda.cu
    #	llama.cpp
    #	llama.h
    #	scripts/sync-ggml.sh
    #	tests/CMakeLists.txt
Concedo
930cdfb1ce updated lite, added patch that links to noscript mode 2023-12-08 16:53:30 +08:00
Concedo
c7511526a2 noscript mode is done 2023-12-07 00:52:25 +08:00
Concedo
12002d8ed6 very basic noscript mode 2023-12-06 17:51:08 +08:00
Concedo
b6f952fd8d improved exit logic 2023-12-05 21:08:10 +08:00
Concedo
a5a5839f5c handle accidentally selecting a kcpps file as model instead 2023-12-04 21:10:42 +08:00
Concedo
6570a2005b token count includes ids 2023-12-03 15:44:53 +08:00
Concedo
c142c5634a fixed segfault with clblast by reversing commit in issue https://github.com/ggerganov/llama.cpp/issues/4296 2023-12-03 00:56:00 +08:00
Concedo
a829a1ee56 fix for janitorai 2023-12-02 23:58:41 +08:00
Concedo
1c422f45cb more printouts 2023-12-02 11:48:48 +08:00
Concedo
66ef4a20e2 refined multiuser mode 2023-11-29 14:29:45 +08:00
Concedo
b75152e3e9 added a proper quiet mode 2023-11-28 21:20:51 +08:00
Concedo
ba5c33319b Allocate a small amount of extra context for GGUF to deal with KV fragmentation causing issues in some scenarios. 2023-11-28 20:55:14 +08:00
Concedo
d2ef458b02 show more info about available APIs 2023-11-28 17:17:47 +08:00
Concedo
0e5f16de53 reduce max ctx to fit instead of crashing 2023-11-27 19:08:54 +08:00
Concedo
2f51a6afd5 trigger quiet mode when selecting remotetunnel 2023-11-27 00:16:36 +08:00
Concedo
bffa78116d explore quiet mode 2023-11-26 23:57:27 +08:00
Concedo
eb42c73953 revert auto rope scaling for already-ropetuned models - just use their values 2023-11-24 14:20:36 +08:00
Concedo
dc4078c039 fixed segfault with all non-gguf models 2023-11-20 22:31:56 +08:00
Concedo
22c56f9221 default to multiuser 2023-11-18 12:55:59 +08:00
Concedo
a3f708afce added more fields to the openai compatible completions APIs 2023-11-16 00:58:08 +08:00
Concedo
8b919b5b57 allow customized rope to use model set values 2023-11-15 16:21:52 +08:00
Concedo
f4ee91abbb improved estimation 2023-11-13 15:45:13 +08:00
Concedo
be92cfa125 added preloadstory 2023-11-10 13:05:22 +08:00
Concedo
7ef4ec3b16 added trim_stop flag 2023-11-09 16:55:44 +08:00
Concedo
afa466807d nooby layer selector considers contextsize 2023-11-09 14:05:35 +08:00
Concedo
fb3bcac368 handle memory separately for kcpp 2023-11-07 17:15:14 +08:00
Concedo
ea81eae189 cleanup, up ver (+1 squashed commits) 2023-11-05 22:49:23 +08:00
    Squashed commits:
    [1ea303d6] cleanup , up ver (+1 squashed commits)
    Squashed commits:
    [79f09b22] cleanup
YellowRoseCx
e2e5fe56a8 KCPP Fetches AMD ROCm Memory without a stick, CC_TURING Gets the Boot, koboldcpp_hipblas.dll Talks To The Hand, and hipBLAS Compiler Finds Its Independence! (#517) 2023-11-05 22:23:18 +08:00
    * AMD ROCm memory fetching and max mem setting
    * Update .gitignore with koboldcpp_hipblas.dll
    * Update CMakeLists.txt remove CC_TURING for AMD
    * separate hipBLAS compiler, update MMV_Y, move CXX/CC print
      separate hipBLAS compiler, update MMV_Y value, move the section that prints CXX and CC compiler name
Concedo
5e5be717c3 fix for removing inaccessible backends in gui 2023-11-05 10:12:12 +08:00
Concedo
1e7088a80b autopick cublas in gui if possible, better layer picking logic 2023-11-05 01:35:27 +08:00
Concedo
135001abc4 try to make the tunnel more reliable 2023-11-04 09:18:19 +08:00
Concedo
36f43ae834 syntax correction 2023-11-04 00:03:45 +08:00