Concedo
|
d75cbd671d
|
alias banned_tokens with banned_strings from ST
increase max bans to 32 for now
|
2024-10-10 21:52:46 +08:00 |
|
Concedo
|
fe5479f286
|
unify antislop and token bans
|
2024-10-10 18:21:07 +08:00 |
|
Concedo
|
a6bf568fda
|
prevent GUI settings from being overridden
|
2024-10-10 11:46:57 +08:00 |
|
Concedo
|
65f3c68399
|
wip antislop
|
2024-10-07 20:19:22 +08:00 |
|
Concedo
|
3e8bb10e2d
|
wip on rewind function
|
2024-10-06 16:21:03 +08:00 |
|
Concedo
|
d9fcb94472
|
do not suppress stdout if debugmode
|
2024-10-04 16:04:29 +08:00 |
|
Concedo
|
a785a91e56
|
every request has timestamp
|
2024-09-27 22:10:41 +08:00 |
|
Concedo
|
ea55f69dc1
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .dockerignore
# .github/workflows/build.yml
# .github/workflows/docker.yml
# Makefile
# README.md
# examples/infill/infill.cpp
# examples/perplexity/perplexity.cpp
# examples/server/README.md
# examples/speculative/speculative.cpp
# flake.lock
# ggml/src/CMakeLists.txt
# scripts/sync-ggml.last
# tests/test-backend-ops.cpp
# tests/test-sampling.cpp
|
2024-09-27 11:21:28 +08:00 |
|
Concedo
|
5a4bc89c8d
|
quiet mode on perf endpoint
|
2024-09-22 13:03:02 +08:00 |
|
Concedo
|
c38d1ecc8d
|
update templates, fix rwkv
|
2024-09-22 01:32:12 +08:00 |
|
Concedo
|
229108f877
|
fixed incorrect auto gpu settings, fixed clblast not working
|
2024-09-21 17:59:52 +08:00 |
|
Concedo
|
004a35b16d
|
add mutually exclusive group
|
2024-09-21 15:46:55 +08:00 |
|
Concedo
|
4b6a12e9c0
|
allow overriding kcpps values with explicit args
|
2024-09-21 11:00:10 +08:00 |
|
Concedo
|
68aee56498
|
updated lite, allow GUI to import launcher args and config files with showgui
|
2024-09-20 17:52:52 +08:00 |
|
Concedo
|
e958b2f78b
|
updated lite
|
2024-09-17 00:16:21 +08:00 |
|
Concedo
|
a4249abe5d
|
alias noblas to usecpu
|
2024-09-15 21:25:48 +08:00 |
|
Concedo
|
53bf0fb32d
|
removed openblas backend, merged into CPU (with llamafile for BLAS). GPU backend is now automatically selected when running from CLI unless noblas is specified.
|
2024-09-15 19:21:52 +08:00 |
|
Concedo
|
5b658ab6d4
|
updated lite
|
2024-09-12 10:47:47 +08:00 |
|
Concedo
|
70cdb55cc9
|
Merge commit '947538acb8' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/docker.yml
# CMakePresets.json
# examples/llama-bench/llama-bench.cpp
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# tests/test-backend-ops.cpp
# tests/test-quantize-fns.cpp
|
2024-09-09 11:26:34 +08:00 |
|
Concedo
|
d777995991
|
able to handle kcpp protected model name endpoints
|
2024-09-04 16:26:28 +08:00 |
|
Concedo
|
5d34de0c08
|
fix basepath
|
2024-09-02 18:09:58 +08:00 |
|
Concedo
|
3c4fa57026
|
allow horde worker to work with password protected instances
|
2024-08-31 21:30:47 +08:00 |
|
Concedo
|
0f9968ef64
|
fixed some incorrect protocol prefix for localhost
|
2024-08-29 10:37:43 +08:00 |
|
Concedo
|
5f360f659c
|
Add 5m timeout for horde worker
|
2024-08-28 23:17:06 +08:00 |
|
Concedo
|
6acbf1d7f4
|
macos default to full offload when using gpulayers auto (-1)
|
2024-08-26 12:12:51 +08:00 |
|
Concedo
|
97aa8648ed
|
allow launching with no models loaded
|
2024-08-25 23:57:32 +08:00 |
|
Concedo
|
0b96097439
|
add version number into help page
|
2024-08-22 00:52:30 +08:00 |
|
Concedo
|
5bf527a6ae
|
added xtc sampler
|
2024-08-21 23:57:15 +08:00 |
|
Concedo
|
cd69ab218e
|
fixed DRY
|
2024-08-21 17:01:28 +08:00 |
|
Concedo
|
2cf6d16c40
|
adjust sleep time
|
2024-08-21 01:06:41 +08:00 |
|
Concedo
|
c1ae350e5b
|
fixed race condition when generating
|
2024-08-20 20:17:55 +08:00 |
|
Concedo
|
7ee359a59b
|
on multigpu setups, pick lowest free mem instead of highest for auto layers
|
2024-08-20 19:02:16 +08:00 |
|
Concedo
|
e9eb6fe51a
|
move chat compl to models tab
|
2024-08-18 14:56:10 +08:00 |
|
Concedo
|
e2e6d892b4
|
fix declaration order
|
2024-08-18 02:15:34 +08:00 |
|
Concedo
|
d71b5477c5
|
update lite, cleanup, fix interrogate format
|
2024-08-18 00:48:53 +08:00 |
|
Concedo
|
2c108ab17e
|
correct phrasing
|
2024-08-14 21:55:53 +08:00 |
|
Concedo
|
f4f24d0e14
|
small text change
|
2024-08-11 21:30:46 +08:00 |
|
Concedo
|
139ab3d198
|
generate passes whole object now
|
2024-08-11 00:08:13 +08:00 |
|
Concedo
|
da8a96199c
|
add a space between the bench prompt to fix an issue with old bpe tokenizer stack overflow (+1 squashed commits)
Squashed commits:
[44a689de] add a space between the bench prompt to fix an issue with old bpe tokenizer stack overflow
|
2024-08-10 19:35:56 +08:00 |
|
Concedo
|
86e687ae8b
|
updated lite, added promptlimit
|
2024-08-10 16:05:24 +08:00 |
|
Concedo
|
03adb90dc6
|
prompt command done
|
2024-08-07 20:52:28 +08:00 |
|
Concedo
|
853d57c53c
|
wip prompt
|
2024-08-06 21:54:08 +08:00 |
|
Concedo
|
6b8b50b350
|
try fix ipv6 (+1 squashed commits)
Squashed commits:
[8d95a639] try fix ipv6
|
2024-08-06 15:36:46 +08:00 |
|
Concedo
|
381b4a1844
|
default multiuser true
|
2024-08-05 20:03:29 +08:00 |
|
Concedo
|
bd4e55eb74
|
add used memory checks, add gpulayers for metal
|
2024-08-05 16:32:05 +08:00 |
|
Concedo
|
23caa63f94
|
up ver
|
2024-08-04 23:42:22 +08:00 |
|
Concedo
|
bfdf4b021f
|
adjust v4-v6 allocation, default back to localhost
|
2024-08-04 11:42:16 +08:00 |
|
Concedo
|
40481abf0c
|
allow ipv6 as well
|
2024-08-04 00:53:19 +08:00 |
|
Concedo
|
9a0976761e
|
use loopback ip instead of localhost
|
2024-08-03 00:41:32 +08:00 |
|
Concedo
|
6bf78967f9
|
more janky nonsense
|
2024-08-02 21:58:28 +08:00 |
|