koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-06 16:21:49 +00:00

Author	SHA1	Message	Date
Concedo	3326bdc00a	if blank, autoguess template	2025-09-27 12:49:32 +08:00
Concedo	c7a1eec4e4	try to solve ttscpp oom regression	2025-09-24 17:45:28 +08:00
Concedo	d3f9db8d33	fix system32 writability check	2025-09-24 14:43:41 +08:00
Concedo	174d00bb74	fix aria2c with both download cases	2025-09-22 21:47:08 +08:00
Concedo	59b6a09ae1	try to fix kokoro alloc again	2025-09-22 21:22:41 +08:00
Concedo	216b766aee	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build-riscv-native.yml # .github/workflows/build.yml # ci/README.md # ci/run.sh # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/cvt.cl # tests/test-backend-ops.cpp	2025-09-22 13:56:02 +08:00
Concedo	08c0246f24	prioritize cwd for downloads if its writable	2025-09-22 13:47:50 +08:00
Concedo	13bee0d39d	some minor fixes	2025-09-22 13:20:06 +08:00
Concedo	c8686a627e	don't mandate mistral common for other model usage	2025-09-21 21:16:02 +08:00
Concedo	9e7661352c	Revert "FA default on" This reverts commit `19c3efb34a`.	2025-09-21 17:28:49 +08:00
Concedo	19c3efb34a	FA default on	2025-09-14 11:53:18 +08:00
Concedo	bf8fc4659b	updated lite, tweak default rep pens, default fa off (+3 squashed commit) Squashed commit: [be2b10125] default rep pen 1.05 [cb4527b15] better to default fa off [126104fe7] updated lite	2025-09-14 00:11:52 +08:00
Concedo	89feffc0e4	fix aria2c	2025-09-06 09:45:14 +08:00
Concedo	f9ce2a00f0	consistent file download locations	2025-09-06 09:26:45 +08:00
Concedo	979e2113e2	flash attention is now checked by default when using gui launcher	2025-09-03 23:36:43 +08:00
Concedo	5c4ad392ea	added a new parameter `--ratelimit` that will apply per-IP based rate limiting (to help prevent abuse of public instances).	2025-09-01 22:08:13 +08:00
Concedo	53360e2cff	linting	2025-08-30 15:27:31 +08:00
lone-cloud	cb9bd2fc4a	fix automatic VRAM detection for ROCm and Vulkan backends (#1715) * use rocminfo for ROCm VRAM detection * vulkan VRAM detection needs to consider all heaps, don't print that we're unable to detect VRAM until all detection is ran	2025-08-30 15:22:32 +08:00
Concedo	7b396bd917	added v1 voices endpoint, added lcpp aliases for cli, fixed dia wrong voice	2025-08-30 11:20:18 +08:00
Concedo	645b09ea20	renamed promptlimit to genlimit, now applies to API requests as well, can be set in the ui. hide API info display if running in CLI mode.	2025-08-30 00:26:05 +08:00
Concedo	3060dfb99f	Merge branch 'upstream' into concedo_experimental # Conflicts: # examples/model-conversion/Makefile # examples/model-conversion/scripts/causal/convert-model.sh # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/common.h # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-cuda/CMakeLists.txt # scripts/compare-commits.sh	2025-08-28 23:17:29 +08:00
Concedo	3655ecf9b3	minor template and tts ui fixes	2025-08-27 22:30:09 +08:00
Concedo	205a0b8d4c	fix kokoro replacement, add 4096 batch size option	2025-08-25 15:57:13 +08:00
Concedo	b0a8d11584	add tts max length for kokoro (+1 squashed commits) Squashed commits: [c1c6feaf] add tts max length for kokoro	2025-08-24 17:57:29 +08:00
Concedo	a6aa47322b	csv fix	2025-08-23 12:48:11 +08:00
Concedo	80dabbb689	minor adjustments for sdquant: allow backend to do the translation for the type more defensively, adjust the UI dropdown for clarity.	2025-08-22 23:23:32 +08:00
Wagner Bruna	2f8b0ec538	Support q8_0 quantization for image model loading (#1692) * Support q8_0 quantization for image model loading q4_0 may degrade quality significantly, especially for smaller models like SD 1.5 and SDXL. q8_0 provides a middle-ground, giving half the memory savings of q4_0 but loading faster and with less quality loss. * Accept --sdquant with no parameters * Use numerical values for the sdquant option	2025-08-22 22:17:15 +08:00
Concedo	7fef0bc949	fix filename regex for whisper	2025-08-22 22:04:05 +08:00
Concedo	9dd6b4c930	improve whisper transcribe apt regex	2025-08-22 17:13:51 +08:00
liuyunrui123	c13db49d5b	Log output supports utf8 encoding display (#1700)	2025-08-21 16:52:03 +08:00
Concedo	3210b378e8	better tool calls	2025-08-20 22:11:31 +08:00
Concedo	eb33467c8c	fixed text	2025-08-20 12:25:04 +08:00
Wagner Bruna	6003e90e50	Add flash attention and conv2d direct controls for image generation (#1678) * Add separate flash attention config for image generation * Add config option for Conv2D Direct	2025-08-20 12:17:57 +08:00
Concedo	9fb0611115	handle contractions correctly, bump defaults	2025-08-18 22:33:44 +08:00
Concedo	2abe11071b	custom voice handling	2025-08-18 16:57:34 +08:00
Concedo	685129fb5a	add missing title, set max tts length to 1024, updated lite (+2 squashed commit) Squashed commit: [0737a028] add missing title [a42328b0] add max tts length 1024	2025-08-17 21:42:56 +08:00
Concedo	bcaf379509	tts.cpp merged and working in kcpp!	2025-08-17 18:09:28 +08:00
Concedo	52606e9b1d	tts cpp model is now loadable in kcpp	2025-08-17 15:47:22 +08:00
Concedo	5a921a40f9	add overridenativecontext flag, stop nagging me	2025-08-14 22:54:45 +08:00
Concedo	4b2ca1169c	more consistency fixes	2025-08-13 19:28:53 +08:00
Concedo	955cf66bbc	load embedding at current maxctx instead of max trained ctx by default	2025-08-13 18:42:14 +08:00
Concedo	06a3ee4c3b	populate better server identifier headers.	2025-08-13 16:10:30 +08:00
Concedo	30e2f25c05	alias tensorsplit , fixed python error	2025-08-10 22:38:14 +08:00
Concedo	8e6d27f629	handle if assistant_message_gen and assistant_message_gen!=assistant_message_start, replace final output tag with unspaced (gen) version if exists	2025-08-10 16:51:34 +08:00
kallewoof	204739e7f1	Adapter fixes (#1659) * test adapters * add assistant_gen adapter key * add support for chat templates stored as .jinja files * removed mistakenly commited gated-tokenizers link * autoguess: Harmony: add missing newline prefixes to system_end	2025-08-10 16:19:50 +08:00
Concedo	89266ac6b8	autoguess adapter make case insensitive	2025-08-10 00:58:47 +08:00
Concedo	487d509b44	try fix oldpc cuda broken without flash attn since upstream pr14361 between 1.94 and 1.95 (+1 squashed commits) Squashed commits: [940f0c639] try fix oldpc cuda broken without flash attn since upstream pr14361 between 1.94 and 1.95	2025-08-10 00:10:37 +08:00
Concedo	4c1faf61b2	increment version (+1 squashed commits) Squashed commits: [6e5080ad2] increment version	2025-08-09 20:53:26 +08:00
Concedo	ced98823a1	kai api tool calling	2025-08-09 10:51:10 +08:00
Concedo	9e7a940ce4	Merge branch 'upstream' into concedo_experimental # Conflicts: # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/softmax_4_f16.cl # ggml/src/ggml-opencl/kernels/softmax_4_f32.cl # ggml/src/ggml-opencl/kernels/softmax_f16.cl # ggml/src/ggml-opencl/kernels/softmax_f32.cl # ggml/src/ggml-rpc/ggml-rpc.cpp # ggml/src/ggml-sycl/ggml-sycl.cpp	2025-08-09 01:24:52 +08:00

1107 commits