koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 09:04:36 +00:00

Author	SHA1	Message	Date
Concedo	52cc908f7f	default trim_stop to true, which trims any tokens after a stop sequence and the stop sequence itself. This is potentially a breaking change.	2024-12-03 22:44:10 +08:00
Concedo	2ba5949054	updated sdcpp, also set euler as default sampler	2024-12-01 17:00:20 +08:00
Concedo	42228b9746	warning when selecting non gguf models	2024-12-01 13:35:51 +08:00
Concedo	b7cd210cd2	more linting with Ruff (+1 squashed commits) Squashed commits: [43802cfe2] Applied default Ruff linting	2024-12-01 01:23:13 +08:00
Concedo	409e393d10	fixed critical bug in image model loader	2024-11-30 23:28:24 +08:00
Concedo	0028e71993	special handling to resolve incomplete utf8 token sequences in qwen	2024-11-30 16:54:01 +08:00
Concedo	32ac3153e4	default speculative set to 8. added more adapter fields	2024-11-30 16:18:27 +08:00
Concedo	e0c59486ee	default to 12 tokens drafted	2024-11-30 11:52:07 +08:00
Concedo	b21d0fe3ac	customizable speculative size	2024-11-30 11:28:19 +08:00
Concedo	f75bbb945f	speculative decoding initial impl completed (+6 squashed commit) Squashed commit: [0a6306ca0] draft wip dont use (will be squashed) [a758a1c9c] wip dont use (will be squashed) [e1994d3ce] wip dont use [f59690d68] wip [77228147d] wip on spec decoding. dont use yet [2445bca54] wip adding speculative decoding (+1 squashed commits) Squashed commits: [50e341bb7] wip adding speculative decoding	2024-11-30 10:41:10 +08:00
kallewoof	fd320f6682	/props endpoint: provide context size through default_generation_settings (#1237)	2024-11-26 16:15:27 +08:00
Concedo	1e0792a3ef	comfyui emulation also done	2024-11-24 15:39:03 +08:00
Concedo	9bd27323e7	emulate comfyui txt2img	2024-11-24 11:28:12 +08:00
Concedo	bf28d956ae	ollama chat api done	2024-11-24 00:10:15 +08:00
Concedo	62dde8cfb2	ollama sync completions mostly working. stupid api.	2024-11-23 23:31:37 +08:00
Concedo	2c1a06a07d	wip ollama emulation, added detokenize endpoint	2024-11-23 22:48:03 +08:00
Concedo	c0da7e4dcf	multiplayer activity tracking	2024-11-23 19:59:55 +08:00
Concedo	1dd37933e3	fixed grammar not resetting correctly	2024-11-23 09:55:12 +08:00
Concedo	18f227625b	multiplayer fixes	2024-11-22 19:02:31 +08:00
mkarr	ac6a0cde91	Support chunked encoding. (#1226) * Support chunked encoding. The koboldcpp API does not support HTTP chunked encoding. Some HTTP libraries, notably Go's net/http, can automatically choose to use chunked encoding. This adds support for chunked encoding within the do_POST() handler. * refactor slightly to add additional safety checks and follow original format --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2024-11-21 18:24:04 +08:00
Concedo	c2ca2ec2bc	updated docs, fixed a few issues with multiplayer	2024-11-21 18:16:13 +08:00
Concedo	272828cab0	tweaks to chat template	2024-11-21 11:10:30 +08:00
kallewoof	547ab2aebb	API: add /props route (#1222) * API: add an /extra/chat_template route A lot of manual tweaking is done when swapping between models. We can automate or make better assumptions about some of them by having more information, such as chat template. This PR adds an endpoint /extra/chat_template which returns the model chat template string as is in a 'chat_template' key. The front end can then use this to derive the proper templates or use it as is, or at least warn the user when they are trying to use e.g. a Mistral preset with a Llama 3.1 model. * switch to pre-established /props endpoint for chat template * bug-fix (upstream): one-off in string juggling	2024-11-21 10:58:32 +08:00
Concedo	8ab3eb89a8	updated lite	2024-11-21 10:43:48 +08:00
Concedo	a439dcb38e	multiplayer error handling	2024-11-19 23:31:48 +08:00
Concedo	1b663e10c8	first functional multiplayer	2024-11-19 22:49:28 +08:00
Concedo	14cbd07eaa	more wip multiplayer	2024-11-19 18:09:26 +08:00
Concedo	39124828ab	wip multiplayer	2024-11-17 23:29:25 +08:00
Concedo	a8694698fd	accept gguf text encoders for sd	2024-11-16 17:23:02 +08:00
Concedo	70aee82552	attempts a backflip, but does he stick the landing?	2024-11-16 17:05:45 +08:00
Concedo	a5f8e596d3	unset sc if ff off	2024-11-16 10:52:33 +08:00
Concedo	3813f6c517	added new flag nofastforward allowing users to disable fast forwarding	2024-11-13 10:59:01 +08:00
Concedo	df7c2b9923	renamed some labels	2024-11-11 19:40:47 +08:00
Concedo	c9977a5cb5	model downloading for new params	2024-11-07 14:41:25 +08:00
Concedo	ccbd630a42	allow custom t5, clipl and clipg	2024-11-06 19:05:48 +08:00
Concedo	f153a14daf	add common identity provider /.well-known/serviceinfo, updated docs	2024-11-04 21:29:26 +08:00
Concedo	847689e74c	fixed incorrect makefile flags	2024-11-04 20:39:10 +08:00
Concedo	6ac8b2bdb3	tweak ratios	2024-11-02 12:35:04 +08:00
Concedo	2a07f2dc2c	minor fix	2024-11-01 22:42:57 +08:00
Concedo	bbebc76817	fix top picks bug, lower input anti abuse thresholds (+1 squashed commits) Squashed commits: [a81d9b21] fix top picks bug, lower input anti abuse thresholds	2024-11-01 16:42:13 +08:00
Concedo	6a27003a06	logprobs feature completed	2024-11-01 15:24:07 +08:00
Concedo	aa26a58085	added logprobs api and logprobs viewer	2024-11-01 00:22:15 +08:00
Concedo	6731dd64f1	quick fix for trim stop	2024-10-30 11:24:55 +08:00
Concedo	90f5cd0f67	wip logprobs data	2024-10-30 00:59:34 +08:00
Concedo	bd05efd648	fix trim_stop failing on some edge cases	2024-10-27 21:41:47 +08:00
Concedo	4ec12756b3	multiuser fixes	2024-10-26 09:33:11 +08:00
Concedo	d0a6a52855	hide flash attention in quick launch for vulkan, updated lite	2024-10-24 22:00:09 +08:00
Concedo	6da5a63852	fix for uploaded wav files being incomplete due to fragmentation when converting to b64	2024-10-20 17:47:19 +08:00
Concedo	a9dbcdd3ec	Merge branch 'upstream' into concedo_experimental # Conflicts: # README.md # docs/build.md # examples/infill/infill.cpp # examples/main/README.md # examples/server/README.md # flake.lock # scripts/sync-ggml.last # src/llama.cpp # tests/test-json-schema-to-grammar.cpp # tests/test-sampling.cpp	2024-10-17 16:36:02 +08:00
Maya	8bb220329c	Dynamic sizes for sequences (#1157) * Dynamic sizes for sequences * cleanup PR - move all dynamic fields to end of payload, ensure correct null handling to match existing behavior, add anti abuse limit of max 512 for dynamic fields * adjust anti abuse limits --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2024-10-16 23:55:11 +08:00
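
Note on commit 52cc908f7f (trim_stop): the message describes trimming the stop sequence and everything after it from the output. The sketch below illustrates that behaviour on plain strings. It is not koboldcpp's implementation; the function name `trim_at_stop` and its parameters are hypothetical, and it assumes the earliest match among several stop sequences is the one that applies.

```python
def trim_at_stop(text, stop_sequences):
    """Return text cut at the earliest stop sequence, dropping the sequence itself.

    Hypothetical helper for illustration only; the real logic may operate on
    tokens and handle edge cases this sketch ignores.
    """
    cut = len(text)  # default: nothing to trim
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1 and idx < cut:
            cut = idx  # keep the earliest match across all stop sequences
    return text[:cut]


if __name__ == "__main__":
    print(repr(trim_at_stop("Hello there.\nUser: hi", ["\nUser:"])))
```

Clients that previously relied on the stop sequence appearing at the end of the returned text would see it removed under this behaviour, which is presumably why the commit flags the new default as potentially breaking.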


727 commits
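
Note on commit ac6a0cde91 (chunked encoding): the message says support for HTTP chunked request bodies was added inside the do_POST() handler. As a rough illustration of what decoding such a body involves, here is a minimal sketch that reads a chunked body from a binary stream such as the `rfile` of Python's `http.server` request handler. The function name `read_chunked_body` and the `max_bytes` limit are assumptions made for this example, not taken from the commit.

```python
import io


def read_chunked_body(rfile, max_bytes=16 * 1024 * 1024):
    """Decode an HTTP/1.1 chunked body from a binary file-like object.

    Hypothetical helper for illustration; max_bytes is an assumed safety cap.
    """
    body = bytearray()
    while True:
        size_line = rfile.readline()
        if not size_line:
            raise ValueError("stream ended before the final chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_text = size_line.split(b";", 1)[0].strip().decode("ascii")
        chunk_size = int(size_text, 16)
        if chunk_size < 0:
            raise ValueError("negative chunk size")
        if chunk_size == 0:
            break  # a zero-sized chunk marks the end of the body
        if len(body) + chunk_size > max_bytes:
            raise ValueError("chunked body exceeds max_bytes")
        chunk = rfile.read(chunk_size)
        if len(chunk) != chunk_size:
            raise ValueError("truncated chunk")
        body += chunk
        # Each chunk's data is followed by a line terminator.
        if rfile.readline() not in (b"\r\n", b"\n"):
            raise ValueError("missing line terminator after chunk data")
    # Skip optional trailer fields up to the blank line that ends the message.
    while rfile.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


if __name__ == "__main__":
    raw = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
    print(read_chunked_body(io.BytesIO(raw)))
```

In a handler, a function like this would typically be called only when the request carries a `Transfer-Encoding: chunked` header, falling back to `Content-Length` otherwise.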