Concedo
836c06d91a
minor edit
2024-12-06 00:37:38 +08:00
Concedo
d0d1d922de
handle and fix temp paths to chat completions adapter
2024-12-05 17:22:35 +08:00
Concedo
2787fca6b4
refactored library selection, fixed ollama params
2024-12-05 16:47:52 +08:00
Concedo
52cc908f7f
default trim_stop to true, which trims any tokens after a stop sequence and the stop sequence itself. This is potentially a breaking change.
2024-12-03 22:44:10 +08:00
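A minimal sketch of the trimming behavior described in the commit above, assuming plain string matching on the generated text. The function name trim_at_stop and its signature are hypothetical and are not taken from the koboldcpp source.

```python
def trim_at_stop(text, stop_sequences):
    """Cut text at the earliest stop sequence.

    Hypothetical helper: drops the stop sequence itself and
    everything generated after it.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty stop strings
        idx = text.find(stop)
        if idx != -1 and idx < cut:
            cut = idx  # keep the earliest match across all sequences
    return text[:cut]
```

With this sketch, a caller that previously received the stop sequence at the end of the output would now receive the text without it, which is why the change can break clients that relied on the old behavior.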
Concedo
2ba5949054
updated sdcpp, also set euler as default sampler
2024-12-01 17:00:20 +08:00
Concedo
42228b9746
warning when selecting non gguf models
2024-12-01 13:35:51 +08:00
Concedo
b7cd210cd2
more linting with Ruff (+1 squashed commits)
Squashed commits:
[43802cfe2] Applied default Ruff linting
2024-12-01 01:23:13 +08:00
Concedo
409e393d10
fixed critical bug in image model loader
2024-11-30 23:28:24 +08:00
Concedo
0028e71993
special handling to resolve incomplete utf8 token sequences in qwen
2024-11-30 16:54:01 +08:00
Concedo
32ac3153e4
default speculative set to 8. added more adapter fields
2024-11-30 16:18:27 +08:00
Concedo
e0c59486ee
default to 12 tokens drafted
2024-11-30 11:52:07 +08:00
Concedo
b21d0fe3ac
customizable speculative size
2024-11-30 11:28:19 +08:00
Concedo
f75bbb945f
speculative decoding initial impl completed (+6 squashed commit)
Squashed commit:
[0a6306ca0] draft wip dont use (will be squashed)
[a758a1c9c] wip dont use (will be squashed)
[e1994d3ce] wip dont use
[f59690d68] wip
[77228147d] wip on spec decoding. dont use yet
[2445bca54] wip adding speculative decoding (+1 squashed commits)
Squashed commits:
[50e341bb7] wip adding speculative decoding
2024-11-30 10:41:10 +08:00
kallewoof
fd320f6682
/props endpoint: provide context size through default_generation_settings (#1237)
2024-11-26 16:15:27 +08:00
Concedo
1e0792a3ef
comfyui emulation also done
2024-11-24 15:39:03 +08:00
Concedo
9bd27323e7
emulate comfyui txt2img
2024-11-24 11:28:12 +08:00
Concedo
bf28d956ae
ollama chat api done
2024-11-24 00:10:15 +08:00
Concedo
62dde8cfb2
ollama sync completions mostly working. stupid api.
2024-11-23 23:31:37 +08:00
Concedo
2c1a06a07d
wip ollama emulation, added detokenize endpoint
2024-11-23 22:48:03 +08:00
Concedo
c0da7e4dcf
multiplayer activity tracking
2024-11-23 19:59:55 +08:00
Concedo
1dd37933e3
fixed grammar not resetting correctly
2024-11-23 09:55:12 +08:00
Concedo
18f227625b
multiplayer fixes
2024-11-22 19:02:31 +08:00
mkarr
ac6a0cde91
Support chunked encoding. (#1226)
* Support chunked encoding.
The koboldcpp API does not support HTTP chunked encoding. Some HTTP
libraries, notably Go's net/http, can automatically choose to use chunked
encoding. This adds support for chunked encoding within the do_POST()
handler.
* refactor slightly to add additional safety checks and follow original format
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-11-21 18:24:04 +08:00
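For context on the chunked encoding commit above, here is a rough sketch of how a chunked request body can be decoded from a file-like stream such as the rfile of a Python http.server handler. It follows the HTTP/1.1 chunked format (hex size line, data, CRLF, terminated by a zero-size chunk). The function name read_chunked_body and the max_bytes limit are assumptions for illustration and do not reproduce the actual do_POST() change.

```python
import io


def read_chunked_body(rfile, max_bytes=16 * 1024 * 1024):
    """Decode an HTTP/1.1 chunked body from a binary file-like object.

    Hypothetical helper; max_bytes is an assumed safety limit.
    """
    body = bytearray()
    while True:
        size_line = rfile.readline()
        if not size_line:
            raise ValueError("stream ended while reading chunk size")
        # Chunk extensions after ';' are ignored in this sketch.
        size_text = size_line.split(b";", 1)[0].strip()
        chunk_size = int(size_text.decode("ascii"), 16)
        if chunk_size == 0:
            # Consume optional trailer lines up to the final blank line.
            while rfile.readline() not in (b"\r\n", b"\n", b""):
                pass
            return bytes(body)
        if len(body) + chunk_size > max_bytes:
            raise ValueError("chunked body exceeds size limit")
        chunk = rfile.read(chunk_size)
        if len(chunk) != chunk_size:
            raise ValueError("truncated chunk")
        body += chunk
        rfile.readline()  # consume the CRLF that follows the chunk data


# Usage with an in-memory stream standing in for the request body:
example = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
decoded = read_chunked_body(example)
```

A handler would only take this path when the request carries a Transfer-Encoding: chunked header; otherwise it would read Content-Length bytes as before.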
Concedo
c2ca2ec2bc
updated docs, fixed a few issues with multiplayer
2024-11-21 18:16:13 +08:00
Concedo
272828cab0
tweaks to chat template
2024-11-21 11:10:30 +08:00
kallewoof
547ab2aebb
API: add /props route (#1222)
* API: add an /extra/chat_template route
A lot of manual tweaking is done when swapping between models. We can automate or make better assumptions about some of them by having more information, such as chat template. This PR adds an endpoint /extra/chat_template which returns the model chat template string as is in a 'chat_template' key. The front end can then use this to derive the proper templates or use it as is, or at least warn the user when they are trying to use e.g. a Mistral preset with a Llama 3.1 model.
* switch to pre-established /props endpoint for chat template
* bug-fix (upstream): one-off in string juggling
2024-11-21 10:58:32 +08:00
Concedo
8ab3eb89a8
updated lite
2024-11-21 10:43:48 +08:00
Concedo
a439dcb38e
multiplayer error handling
2024-11-19 23:31:48 +08:00
Concedo
1b663e10c8
first functional multiplayer
2024-11-19 22:49:28 +08:00
Concedo
14cbd07eaa
more wip multiplayer
2024-11-19 18:09:26 +08:00
Concedo
39124828ab
wip multiplayer
2024-11-17 23:29:25 +08:00
Concedo
a8694698fd
accept gguf text encoders for sd
2024-11-16 17:23:02 +08:00
Concedo
70aee82552
attempts a backflip, but does he stick the landing?
2024-11-16 17:05:45 +08:00
Concedo
a5f8e596d3
unset sc if ff off
2024-11-16 10:52:33 +08:00
Concedo
3813f6c517
added new flag nofastforward allowing users to disable fast forwarding
2024-11-13 10:59:01 +08:00
Concedo
df7c2b9923
renamed some labels
2024-11-11 19:40:47 +08:00
Concedo
c9977a5cb5
model downloading for new params
2024-11-07 14:41:25 +08:00
Concedo
ccbd630a42
allow custom t5, clipl and clipg
2024-11-06 19:05:48 +08:00
Concedo
f153a14daf
add common identity provider /.well-known/serviceinfo, updated docs
2024-11-04 21:29:26 +08:00
Concedo
847689e74c
fixed incorrect makefile flags
2024-11-04 20:39:10 +08:00
Concedo
6ac8b2bdb3
tweak ratios
2024-11-02 12:35:04 +08:00
Concedo
2a07f2dc2c
minor fix
2024-11-01 22:42:57 +08:00
Concedo
bbebc76817
fix top picks bug, lower input anti abuse thresholds (+1 squashed commits)
Squashed commits:
[a81d9b21] fix top picks bug, lower input anti abuse thresholds
2024-11-01 16:42:13 +08:00
Concedo
6a27003a06
logprobs feature completed
2024-11-01 15:24:07 +08:00
Concedo
aa26a58085
added logprobs api and logprobs viewer
2024-11-01 00:22:15 +08:00
Concedo
6731dd64f1
quick fix for trim stop
2024-10-30 11:24:55 +08:00
Concedo
90f5cd0f67
wip logprobs data
2024-10-30 00:59:34 +08:00
Concedo
bd05efd648
fix trim_stop failing on some edge cases
2024-10-27 21:41:47 +08:00
Concedo
4ec12756b3
multiuser fixes
2024-10-26 09:33:11 +08:00
Concedo
d0a6a52855
hide flash attention in quick launch for vulkan, updated lite
2024-10-24 22:00:09 +08:00