Concedo
c0da7e4dcf
multiplayer activity tracking
2024-11-23 19:59:55 +08:00
Concedo
1dd37933e3
fixed grammar not resetting correctly
2024-11-23 09:55:12 +08:00
Concedo
18f227625b
multiplayer fixes
2024-11-22 19:02:31 +08:00
mkarr
ac6a0cde91
Support chunked encoding. ( #1226 )
...
* Support chunked encoding.
The koboldcpp API does not support HTTP chunked encoding. Some HTTP
libraries, notable Go's net/http can automatically choose to use chunked
encoding. This adds support for chunked encoding within the do_POST()
handler.
* refactor slightly to add additional safety checks and follow original format
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-11-21 18:24:04 +08:00
Concedo
c2ca2ec2bc
updated docs, fixed a few issues with multiplayer
2024-11-21 18:16:13 +08:00
Concedo
272828cab0
tweaks to chat template
2024-11-21 11:10:30 +08:00
kallewoof
547ab2aebb
API: add /props route ( #1222 )
...
* API: add an /extra/chat_template route
A lot of manual tweaking is done when swapping between models. We can automate or make better assumptions about some of them by having more information, such as chat template. This PR adds an endpoint /extra/chat_template which returns the model chat template string as is in a 'chat_template' key. The front end can then use this to derive the proper templates or use it as is, or at least warn the user when they are trying to use e.g. a Mistral preset with a Llama 3.1 model.
* switch to pre-established /props endpoint for chat template
* bug-fix (upstream): one-off in string juggling
2024-11-21 10:58:32 +08:00
Concedo
8ab3eb89a8
updated lite
2024-11-21 10:43:48 +08:00
Concedo
a439dcb38e
multiplayer error handling
2024-11-19 23:31:48 +08:00
Concedo
1b663e10c8
first functional multiplayer
2024-11-19 22:49:28 +08:00
Concedo
14cbd07eaa
more wip multiplayer
2024-11-19 18:09:26 +08:00
Concedo
39124828ab
wip multiplayer
2024-11-17 23:29:25 +08:00
Concedo
a8694698fd
accept gguf text encoders for sd
2024-11-16 17:23:02 +08:00
Concedo
70aee82552
attempts a backflip, but does he stick the landing?
2024-11-16 17:05:45 +08:00
Concedo
a5f8e596d3
unset sc if ff off
2024-11-16 10:52:33 +08:00
Concedo
3813f6c517
added new flag nofastforward allowing users to disable fast forwarding
2024-11-13 10:59:01 +08:00
Concedo
df7c2b9923
renamed some labels
2024-11-11 19:40:47 +08:00
Concedo
c9977a5cb5
model downloading for new params
2024-11-07 14:41:25 +08:00
Concedo
ccbd630a42
allow custom t5, clipl and clipg
2024-11-06 19:05:48 +08:00
Concedo
f153a14daf
add common identity provider /.well-known/serviceinfo, updated docs
2024-11-04 21:29:26 +08:00
Concedo
847689e74c
fixed incorrect makefile flags
2024-11-04 20:39:10 +08:00
Concedo
6ac8b2bdb3
tweak ratios
2024-11-02 12:35:04 +08:00
Concedo
2a07f2dc2c
minor fix
2024-11-01 22:42:57 +08:00
Concedo
bbebc76817
fix top picks bug, lower input anti abuse thresholds (+1 squashed commits)
...
Squashed commits:
[a81d9b21] fix top picks bug, lower input anti abuse thresholds
2024-11-01 16:42:13 +08:00
Concedo
6a27003a06
logprobs feature completed
2024-11-01 15:24:07 +08:00
Concedo
aa26a58085
added logprobs api and logprobs viewer
2024-11-01 00:22:15 +08:00
Concedo
6731dd64f1
quick fix for trim stop
2024-10-30 11:24:55 +08:00
Concedo
90f5cd0f67
wip logprobs data
2024-10-30 00:59:34 +08:00
Concedo
bd05efd648
fix trim_stop failing on some edge cases
2024-10-27 21:41:47 +08:00
Concedo
4ec12756b3
multiuser fixes
2024-10-26 09:33:11 +08:00
Concedo
d0a6a52855
hide flash attention in quick launch for vulkan, updated lite
2024-10-24 22:00:09 +08:00
Concedo
6da5a63852
fix for uploaded wav files being incomplete due to fragmentation when converting to b64
2024-10-20 17:47:19 +08:00
Concedo
a9dbcdd3ec
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# docs/build.md
# examples/infill/infill.cpp
# examples/main/README.md
# examples/server/README.md
# flake.lock
# scripts/sync-ggml.last
# src/llama.cpp
# tests/test-json-schema-to-grammar.cpp
# tests/test-sampling.cpp
2024-10-17 16:36:02 +08:00
Maya
8bb220329c
Dynamic sizes for sequences ( #1157 )
...
* Dynamic sizes for sequences
* cleanup PR - move all dynamic fields to end of payload, ensure correct null handling to match existing behavior, add anti abuse limit of max 512 for dynamic fields
* adjust anti abuse limits
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-10-16 23:55:11 +08:00
JR
8e5ffc5a58
Add header X-Accel-Buffering set to no for SSE stream requests ( #1168 )
2024-10-16 17:28:05 +08:00
Concedo
21b2f6168e
Merge branch 'concedo_experimental' of https://github.com/LostRuins/koboldcpp into concedo_experimental
2024-10-14 22:09:38 +08:00
Concedo
1d40303050
increase again
2024-10-14 22:09:26 +08:00
YellowRoseCx
f029de6e46
Merge pull request #69 from matoro/main ( #1165 )
...
Fix gpulayers autodetection for cublas & clblast backends
2024-10-14 20:10:41 +08:00
Concedo
8d81519ca3
direct user to gguf model resources
2024-10-12 18:39:21 +08:00
Concedo
5ad826b82a
updated lite (+2 squashed commit)
...
Squashed commit:
[31a99e1f] bump baned phrase a bit more again
[c999736b] small fix
2024-10-11 11:05:04 +08:00
Maya
3dab63887f
Add custom_token_bans ( #1153 )
2024-10-10 23:45:07 +08:00
Concedo
a3b104a422
further increase some limits
2024-10-10 22:27:28 +08:00
Concedo
d75cbd671d
alias banned_tokens with banned_strings from ST
...
increase max bans to 32 for now
2024-10-10 21:52:46 +08:00
Concedo
fe5479f286
unify antislop and token bans
2024-10-10 18:21:07 +08:00
Concedo
a6bf568fda
prevent GUI settings from being overridden
2024-10-10 11:46:57 +08:00
Concedo
65f3c68399
wip antislop
2024-10-07 20:19:22 +08:00
Concedo
3e8bb10e2d
wip on rewind function
2024-10-06 16:21:03 +08:00
Concedo
d9fcb94472
do not suppress stdout if debugmode
2024-10-04 16:04:29 +08:00
Concedo
a785a91e56
every request has timestamp
2024-09-27 22:10:41 +08:00
Concedo
ea55f69dc1
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .dockerignore
# .github/workflows/build.yml
# .github/workflows/docker.yml
# Makefile
# README.md
# examples/infill/infill.cpp
# examples/perplexity/perplexity.cpp
# examples/server/README.md
# examples/speculative/speculative.cpp
# flake.lock
# ggml/src/CMakeLists.txt
# scripts/sync-ggml.last
# tests/test-backend-ops.cpp
# tests/test-sampling.cpp
2024-09-27 11:21:28 +08:00