Concedo
3813f6c517
added new flag nofastforward allowing users to disable fast forwarding
2024-11-13 10:59:01 +08:00
Concedo
df7c2b9923
renamed some labels
2024-11-11 19:40:47 +08:00
Concedo
c9977a5cb5
model downloading for new params
2024-11-07 14:41:25 +08:00
Concedo
ccbd630a42
allow custom t5, clipl and clipg
2024-11-06 19:05:48 +08:00
Concedo
f153a14daf
add common identity provider /.well-known/serviceinfo, updated docs
2024-11-04 21:29:26 +08:00
Concedo
847689e74c
fixed incorrect makefile flags
2024-11-04 20:39:10 +08:00
Concedo
6ac8b2bdb3
tweak ratios
2024-11-02 12:35:04 +08:00
Concedo
2a07f2dc2c
minor fix
2024-11-01 22:42:57 +08:00
Concedo
bbebc76817
fix top picks bug, lower input anti abuse thresholds (+1 squashed commits)
...
Squashed commits:
[a81d9b21] fix top picks bug, lower input anti abuse thresholds
2024-11-01 16:42:13 +08:00
Concedo
6a27003a06
logprobs feature completed
2024-11-01 15:24:07 +08:00
Concedo
aa26a58085
added logprobs api and logprobs viewer
2024-11-01 00:22:15 +08:00
Concedo
6731dd64f1
quick fix for trim stop
2024-10-30 11:24:55 +08:00
Concedo
90f5cd0f67
wip logprobs data
2024-10-30 00:59:34 +08:00
Concedo
bd05efd648
fix trim_stop failing on some edge cases
2024-10-27 21:41:47 +08:00
Concedo
4ec12756b3
multiuser fixes
2024-10-26 09:33:11 +08:00
Concedo
d0a6a52855
hide flash attention in quick launch for vulkan, updated lite
2024-10-24 22:00:09 +08:00
Concedo
6da5a63852
fix for uploaded wav files being incomplete due to fragmentation when converting to b64
2024-10-20 17:47:19 +08:00
Concedo
a9dbcdd3ec
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# docs/build.md
# examples/infill/infill.cpp
# examples/main/README.md
# examples/server/README.md
# flake.lock
# scripts/sync-ggml.last
# src/llama.cpp
# tests/test-json-schema-to-grammar.cpp
# tests/test-sampling.cpp
2024-10-17 16:36:02 +08:00
Maya
8bb220329c
Dynamic sizes for sequences (#1157)
...
* Dynamic sizes for sequences
* cleanup PR - move all dynamic fields to end of payload, ensure correct null handling to match existing behavior, add anti abuse limit of max 512 for dynamic fields
* adjust anti abuse limits
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-10-16 23:55:11 +08:00
JR
8e5ffc5a58
Add header X-Accel-Buffering set to no for SSE stream requests (#1168)
2024-10-16 17:28:05 +08:00
Concedo
21b2f6168e
Merge branch 'concedo_experimental' of https://github.com/LostRuins/koboldcpp into concedo_experimental
2024-10-14 22:09:38 +08:00
Concedo
1d40303050
increase again
2024-10-14 22:09:26 +08:00
YellowRoseCx
f029de6e46
Merge pull request #69 from matoro/main (#1165)
...
Fix gpulayers autodetection for cublas & clblast backends
2024-10-14 20:10:41 +08:00
Concedo
8d81519ca3
direct user to gguf model resources
2024-10-12 18:39:21 +08:00
Concedo
5ad826b82a
updated lite (+2 squashed commit)
...
Squashed commit:
[31a99e1f] bump banned phrase a bit more again
[c999736b] small fix
2024-10-11 11:05:04 +08:00
Maya
3dab63887f
Add custom_token_bans (#1153)
2024-10-10 23:45:07 +08:00
Concedo
a3b104a422
further increase some limits
2024-10-10 22:27:28 +08:00
Concedo
d75cbd671d
alias banned_tokens with banned_strings from ST
...
increase max bans to 32 for now
2024-10-10 21:52:46 +08:00
Concedo
fe5479f286
unify antislop and token bans
2024-10-10 18:21:07 +08:00
Concedo
a6bf568fda
prevent GUI settings from being overridden
2024-10-10 11:46:57 +08:00
Concedo
65f3c68399
wip antislop
2024-10-07 20:19:22 +08:00
Concedo
3e8bb10e2d
wip on rewind function
2024-10-06 16:21:03 +08:00
Concedo
d9fcb94472
do not suppress stdout if debugmode
2024-10-04 16:04:29 +08:00
Concedo
a785a91e56
every request has timestamp
2024-09-27 22:10:41 +08:00
Concedo
ea55f69dc1
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .dockerignore
# .github/workflows/build.yml
# .github/workflows/docker.yml
# Makefile
# README.md
# examples/infill/infill.cpp
# examples/perplexity/perplexity.cpp
# examples/server/README.md
# examples/speculative/speculative.cpp
# flake.lock
# ggml/src/CMakeLists.txt
# scripts/sync-ggml.last
# tests/test-backend-ops.cpp
# tests/test-sampling.cpp
2024-09-27 11:21:28 +08:00
Concedo
5a4bc89c8d
quiet mode on perf endpoint
2024-09-22 13:03:02 +08:00
Concedo
c38d1ecc8d
update templates, fix rwkv
2024-09-22 01:32:12 +08:00
Concedo
229108f877
fixed incorrect auto gpu settings, fixed clblast not working
2024-09-21 17:59:52 +08:00
Concedo
004a35b16d
add mutually exclusive group
2024-09-21 15:46:55 +08:00
Concedo
4b6a12e9c0
allow overriding kcpps values with explicit args
2024-09-21 11:00:10 +08:00
Concedo
68aee56498
updated lite, allow GUI to import launcher args and config files with showgui
2024-09-20 17:52:52 +08:00
Concedo
e958b2f78b
updated lite
2024-09-17 00:16:21 +08:00
Concedo
a4249abe5d
alias noblas to usecpu
2024-09-15 21:25:48 +08:00
Concedo
53bf0fb32d
removed openblas backend, merged into CPU (with llamafile for BLAS). GPU backend is now automatically selected when running from CLI unless noblas is specified.
2024-09-15 19:21:52 +08:00
Concedo
5b658ab6d4
updated lite
2024-09-12 10:47:47 +08:00
Concedo
70cdb55cc9
Merge commit '947538acb8' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/docker.yml
# CMakePresets.json
# examples/llama-bench/llama-bench.cpp
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# tests/test-backend-ops.cpp
# tests/test-quantize-fns.cpp
2024-09-09 11:26:34 +08:00
Concedo
d777995991
able to handle kcpp protected model name endpoints
2024-09-04 16:26:28 +08:00
Concedo
5d34de0c08
fix basepath
2024-09-02 18:09:58 +08:00
Concedo
3c4fa57026
allow horde worker to work with password protected instances
2024-08-31 21:30:47 +08:00
Concedo
0f9968ef64
fixed some incorrect protocol prefix for localhost
2024-08-29 10:37:43 +08:00