Concedo
|
bc24c9334c
|
prevent prompt leakage during usage of check endpoint when genkey is provided in multiuser mode
|
2023-12-24 17:08:43 +08:00 |
|
Concedo
|
8823e8b06d
|
added presence penalty into lite ui
|
2023-12-23 10:39:40 +08:00 |
|
Concedo
|
852ca780c9
|
cherrypicked the hipBLAS fixes from PR #571
|
2023-12-22 21:29:20 +08:00 |
|
Concedo
|
77463e0e9c
|
batch size improvements
|
2023-12-22 15:27:40 +08:00 |
|
Concedo
|
2378a29bde
|
better error handling, try to avoid segfault in sillytavern
|
2023-12-21 22:58:48 +08:00 |
|
Eugene Palmoff
|
a787ebe7cf
|
Handle broken pipe error (#572)
|
2023-12-21 17:51:36 +08:00 |
|
Concedo
|
3f863eed72
|
add presence penalty
|
2023-12-19 23:18:56 +08:00 |
|
Concedo
|
da2db0302c
|
Added support for ssl cert and key
|
2023-12-19 22:23:19 +08:00 |
|
Concedo
|
49a5dfc604
|
Merge branch 'master' into concedo_experimental
# Conflicts:
# Makefile
# README.md
|
2023-12-19 16:07:48 +08:00 |
|
Concedo
|
1f77d2ad73
|
move multiprocessing import into function scope
|
2023-12-19 15:56:58 +08:00 |
|
ebolam
|
6948da5a0d
|
Fix for windows model unloading not releasing memory (#569)
* Add in model processes as a separate process so it can be killed when unloading to release memory on windows
* Fix from Henky
|
2023-12-19 15:55:41 +08:00 |
|
Concedo
|
ec05230703
|
updated lite, up ver
|
2023-12-17 14:38:39 +08:00 |
|
Concedo
|
aac7f0b944
|
Merge branch 'master' into concedo_experimental
# Conflicts:
# ggml.c
|
2023-12-14 17:24:42 +08:00 |
|
Concedo
|
f0de4953ae
|
fixed length exceeding max ctx
|
2023-12-14 16:58:41 +08:00 |
|
Concedo
|
0e31f53422
|
Revert "lowvram var defaults"
This reverts commit 7a691522a6.
|
2023-12-14 15:14:11 +08:00 |
|
Concedo
|
8dd975653d
|
removing existing yml files
|
2023-12-14 14:47:03 +08:00 |
|
Concedo
|
74acc5441d
|
Revert "Hide hipBLAS (ROCm) if CuBLAS exists - vice versa"
This reverts commit 4b854d46a4.
|
2023-12-12 10:53:34 +08:00 |
|
Concedo
|
06581f243f
|
perf endpoint lets you monitor if the embedded horde worker has issues
|
2023-12-11 16:54:42 +08:00 |
|
YellowRoseCx
|
4b854d46a4
|
Hide hipBLAS (ROCm) if CuBLAS exists - vice versa
|
2023-12-10 22:49:35 -06:00 |
|
Concedo
|
7a691522a6
|
lowvram var defaults
|
2023-12-08 21:06:32 +08:00 |
|
Concedo
|
7418bca910
|
up ver
|
2023-12-08 19:20:30 +08:00 |
|
Concedo
|
c47bc28488
|
slight refactor for noscript ui
|
2023-12-08 18:35:45 +08:00 |
|
Concedo
|
ec21fa7712
|
Merge branch 'master' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# .gitignore
# CMakeLists.txt
# Makefile
# Package.swift
# README.md
# ggml-cuda.cu
# llama.cpp
# llama.h
# scripts/sync-ggml.sh
# tests/CMakeLists.txt
|
2023-12-08 17:42:26 +08:00 |
|
Concedo
|
930cdfb1ce
|
updated lite, added patch that links to noscript mode
|
2023-12-08 16:53:30 +08:00 |
|
Concedo
|
c7511526a2
|
noscript mode is done
|
2023-12-07 00:52:25 +08:00 |
|
Concedo
|
12002d8ed6
|
very basic noscript mode
|
2023-12-06 17:51:08 +08:00 |
|
Concedo
|
b6f952fd8d
|
improved exit logic
|
2023-12-05 21:08:10 +08:00 |
|
Concedo
|
a5a5839f5c
|
handle accidentally selecting a kcpps file as model instead
|
2023-12-04 21:10:42 +08:00 |
|
Concedo
|
6570a2005b
|
token count includes ids
|
2023-12-03 15:44:53 +08:00 |
|
Concedo
|
c142c5634a
|
fixed segfault with clblast by reversing commit in issue https://github.com/ggerganov/llama.cpp/issues/4296
|
2023-12-03 00:56:00 +08:00 |
|
Concedo
|
a829a1ee56
|
fix for janitorai
|
2023-12-02 23:58:41 +08:00 |
|
Concedo
|
1c422f45cb
|
more printouts
|
2023-12-02 11:48:48 +08:00 |
|
Concedo
|
66ef4a20e2
|
refined multiuser mode
|
2023-11-29 14:29:45 +08:00 |
|
Concedo
|
b75152e3e9
|
added a proper quiet mode
|
2023-11-28 21:20:51 +08:00 |
|
Concedo
|
ba5c33319b
|
Allocate a small amount of extra context for GGUF to deal with KV fragmentation causing issues in some scenarios.
|
2023-11-28 20:55:14 +08:00 |
|
Concedo
|
d2ef458b02
|
show more info about available APIs
|
2023-11-28 17:17:47 +08:00 |
|
Concedo
|
0e5f16de53
|
reduce max ctx to fit instead of crashing
|
2023-11-27 19:08:54 +08:00 |
|
Concedo
|
2f51a6afd5
|
trigger quiet mode when selecting remotetunnel
|
2023-11-27 00:16:36 +08:00 |
|
Concedo
|
bffa78116d
|
explore quiet mode
|
2023-11-26 23:57:27 +08:00 |
|
Concedo
|
eb42c73953
|
revert auto rope scaling for already-ropetuned models - just use their values
|
2023-11-24 14:20:36 +08:00 |
|
Concedo
|
dc4078c039
|
fixed segfault with all non-gguf models
|
2023-11-20 22:31:56 +08:00 |
|
Concedo
|
22c56f9221
|
default to multiuser
|
2023-11-18 12:55:59 +08:00 |
|
Concedo
|
a3f708afce
|
added more fields to the openai compatible completions APIs
|
2023-11-16 00:58:08 +08:00 |
|
Concedo
|
8b919b5b57
|
allow customized rope to use model set values
|
2023-11-15 16:21:52 +08:00 |
|
Concedo
|
f4ee91abbb
|
improved estimation
|
2023-11-13 15:45:13 +08:00 |
|
Concedo
|
be92cfa125
|
added preloadstory
|
2023-11-10 13:05:22 +08:00 |
|
Concedo
|
7ef4ec3b16
|
added trim_stop flag
|
2023-11-09 16:55:44 +08:00 |
|
Concedo
|
afa466807d
|
nooby layer selector considers contextsize
|
2023-11-09 14:05:35 +08:00 |
|
Concedo
|
fb3bcac368
|
handle memory separately for kcpp
|
2023-11-07 17:15:14 +08:00 |
|
Concedo
|
ea81eae189
|
cleanup, up ver (+1 squashed commits)
Squashed commits:
[1ea303d6] cleanup , up ver (+1 squashed commits)
Squashed commits:
[79f09b22] cleanup
|
2023-11-05 22:49:23 +08:00 |
|