Commit graph

996 commits

Author | SHA1 | Message | Date
Concedo
238be98efa Allow override config for gguf files when reloading in admin mode, updated lite, fixed typo (+1 squashed commit)
Squashed commits:

[fe14845cc] Allow override config for gguf files when reloading in admin mode, updated lite (+2 squashed commits)

Squashed commit:

[9ded66aa5] Allow override config for gguf files when reloading in admin mode

[9597f6a34] update lite
2025-06-14 12:00:20 +08:00
Wagner Bruna
f6d2d1ce5c
configurable resolution limit (#1586)
* refactor image gen configuration screen

* make image size limit configurable

* fix resolution limits and keep dimensions closer to the original ratio

* use 0.0 for the configured default image size limit

This prevents the current default value from being saved into the
config files, in case we later decide to adopt a different value.

* export image model version when loading

* restore model-specific default image size limit

* change the image area restriction to be specified by a square side

* move image resolution limits down to the C++ level

* Revert "export image model version when loading"

This reverts commit fa65b23de3.

* Linting Fixes:
PY:
- Inconsistent var name sd_restrict_square -> sd_restrict_square_var
- GUI swap back to using absolute row numbers for now.
- f-string fix
- size_limit -> side_limit inconsistency
C++:
- roundup_64 standalone function
- refactor sd_fix_resolution variable names for clarity
- always apply the "anti crashing" hard total megapixel limit after the soft total megapixel limit, instead of only when sd_restrict_square is unset

* allow unsafe resolutions if debugmode is on

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2025-06-13 20:05:20 +08:00
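As an aside, the approach described in this PR — scale both dimensions uniformly toward a configurable area budget (a square side), snap to the 64-pixel grid — can be sketched roughly as follows. `roundup_64` is named in the commit; `fix_resolution` and `side_limit` are illustrative stand-ins, not the actual koboldcpp code:

```python
def roundup_64(n: int) -> int:
    """Round n up to the nearest multiple of 64 (common SD dimension granularity)."""
    return (n + 63) // 64 * 64

def fix_resolution(width: int, height: int, side_limit: int) -> tuple:
    """Clamp an image so its area fits within side_limit x side_limit,
    scaling both dimensions uniformly to stay close to the original ratio."""
    max_area = side_limit * side_limit
    area = width * height
    if area > max_area:
        scale = (max_area / area) ** 0.5
        width = int(width * scale)
        height = int(height * scale)
    # snapping up to the 64-pixel grid can slightly exceed the soft area
    # budget, which is one reason a hard cap is still applied afterwards
    return roundup_64(width), roundup_64(height)
```

For example, `fix_resolution(2048, 1024, 1024)` yields `(1472, 768)`: the area is halved and the 2:1 ratio is approximately preserved.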
Concedo
1cbe716e45 allow setting maingpu 2025-06-12 17:53:43 +08:00
henk717
f151648f03 Pyinstaller launcher and dependency updates
This PR adds a new launcher executable to the unpack feature, eliminating the need to have Python and its dependencies in the unpacked version. It also makes a few dependency changes to help future-proof the build.
2025-06-10 23:08:02 +08:00
Concedo
8386546e08 Switched VS2019 for revert cu12.1 build, hopefully solves dll issues
try change order (+3 squashed commits)

Squashed commit:

[457f02507] try newer jimver

[64af28862] windows pyinstaller shim. the final loader will be moved into the packed directory later.

[0272ecf2d] try alternative way of getting cuda toolkit 12.4 since jimver wont work, also fix rocm
try again (+3 squashed commits)

Squashed commit:

[133e81633] try without pwsh

[4d99cefba] try without pwsh

[bdfa91e7d] try alternative way of getting cuda toolkit 12.4, also fix rocm
2025-06-10 23:08:02 +08:00
Concedo
7d8aa31f1f fixed embeddings, added new parameter to limit max embeddings context 2025-06-10 01:11:55 +08:00
Concedo
8780b33c64 consolidate imports 2025-06-09 17:48:54 +08:00
Concedo
82d7c53b85 embeddings handle base64 2025-06-09 00:26:40 +08:00
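"embeddings handle base64" presumably refers to the OpenAI-compatible embeddings endpoint, where `encoding_format="base64"` returns the vector packed as little-endian float32 bytes rather than a JSON array. A minimal round-trip sketch (helper names are mine, not koboldcpp's):

```python
import base64
import struct

def embedding_to_base64(vec):
    """Pack a float vector as little-endian float32 and base64-encode it,
    the compact representation used by OpenAI-style embeddings responses."""
    return base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode("ascii")

def embedding_from_base64(s):
    """Inverse: base64-decode and unpack little-endian float32 values."""
    raw = base64.b64decode(s)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```

Values exactly representable in float32 (e.g. `0.5`, `-1.25`, `2.0`) round-trip without loss; arbitrary floats round-trip only to float32 precision.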
Concedo
7f57846c2f update bundled vcrts 2025-06-08 19:39:42 +08:00
Concedo
dcf88d6e78 Revert "make tts use gpu by default. use --ttscpu to disable"
This reverts commit 669f80265b.
2025-06-08 17:08:04 +08:00
Concedo
669f80265b make tts use gpu by default. use --ttscpu to disable 2025-06-08 17:06:19 +08:00
Concedo
cfcdfd69bd allow embeddings models to use mmap 2025-06-07 10:14:00 +08:00
Concedo
6effb65cfe change singleinstance order 2025-06-06 21:20:30 +08:00
Concedo
740f91e3fd lower aria interval 2025-06-06 17:43:38 +08:00
Concedo
9cf32e5fee step limits over adapter for sd 2025-06-06 14:12:43 +08:00
Concedo
f6bbc350f2 various qol fixes 2025-06-05 10:26:02 +08:00
Concedo
736030bb9f save and load state upgraded to 3 available states 2025-06-04 22:09:40 +08:00
Concedo
06d2bc3404 ollama compat fixes 2025-06-04 19:22:29 +08:00
Concedo
53f1511396 use a static buffer for kv reloads instead. also, added into lite ui 2025-06-03 22:32:46 +08:00
Concedo
4b57108508 Save KV State and Load KV State to memory added. GUI not yet updated 2025-06-03 17:46:29 +08:00
Concedo
6ce85c54d6 not working correctly 2025-06-02 22:12:10 +08:00
Concedo
8e1ebc55b5 dropped support for lora base as upstream no longer uses it. If provided it will be silently ignored 2025-06-02 12:49:53 +08:00
Concedo
51dc1cf920 added scale for text lora 2025-06-02 00:13:42 +08:00
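"added scale for text lora" presumably exposes the standard LoRA blend factor, where the adapter delta is mixed into the base weights as w' = w + scale * d. A toy illustration with flat lists standing in for weight tensors (my own helper name, not the actual implementation):

```python
def apply_lora(weights, delta, scale):
    """Blend a LoRA delta into base weights: w' = w + scale * d.
    scale = 0 leaves the base model untouched; scale = 1 applies the
    adapter at full strength."""
    return [w + scale * d for w, d in zip(weights, delta)]
```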
Concedo
74ef097c4a added ability to set koboldcpp as default handler for gguf and kcpps 2025-06-01 22:36:41 +08:00
Concedo
f3bb947a13 cuda use wmma flash attention for turing (+1 squashed commit)
Squashed commits:

[3c5112398] 117 (+10 squashed commits)

Squashed commit:

[4f01bb2d4] 117 graphs 80v

[7549034ea] 117 graphs

[dabf9cb99] checking if cuda 11.5.2 works

[ba7ccdb7a] another try cu11.7 only

[752cf2ae5] increase aria2c download log rate

[dc4f198fd] test send turing to wmma flash attention

[496a22e83] temp build test cu11.7.0

[ca759c424] temp build test cu11.7

[c46ada17c] test build: enable virtual80 for oldcpu

[3ccfd939a] test build: with cuda graphs for all
2025-06-01 11:41:45 +08:00
Concedo
b08dca65ed Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	common/CMakeLists.txt
#	common/arg.cpp
#	common/chat.cpp
#	examples/parallel/README.md
#	examples/parallel/parallel.cpp
#	ggml/cmake/common.cmake
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/rope.cpp
#	models/ggml-vocab-bert-bge.gguf.inp
#	models/ggml-vocab-bert-bge.gguf.out
#	models/ggml-vocab-command-r.gguf.inp
#	models/ggml-vocab-command-r.gguf.out
#	models/ggml-vocab-deepseek-coder.gguf.inp
#	models/ggml-vocab-deepseek-coder.gguf.out
#	models/ggml-vocab-deepseek-llm.gguf.inp
#	models/ggml-vocab-deepseek-llm.gguf.out
#	models/ggml-vocab-falcon.gguf.inp
#	models/ggml-vocab-falcon.gguf.out
#	models/ggml-vocab-gpt-2.gguf.inp
#	models/ggml-vocab-gpt-2.gguf.out
#	models/ggml-vocab-llama-bpe.gguf.inp
#	models/ggml-vocab-llama-bpe.gguf.out
#	models/ggml-vocab-llama-spm.gguf.inp
#	models/ggml-vocab-llama-spm.gguf.out
#	models/ggml-vocab-mpt.gguf.inp
#	models/ggml-vocab-mpt.gguf.out
#	models/ggml-vocab-phi-3.gguf.inp
#	models/ggml-vocab-phi-3.gguf.out
#	models/ggml-vocab-qwen2.gguf.inp
#	models/ggml-vocab-qwen2.gguf.out
#	models/ggml-vocab-refact.gguf.inp
#	models/ggml-vocab-refact.gguf.out
#	models/ggml-vocab-starcoder.gguf.inp
#	models/ggml-vocab-starcoder.gguf.out
#	requirements/requirements-gguf_editor_gui.txt
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-json-schema-to-grammar.cpp
#	tools/mtmd/CMakeLists.txt
#	tools/run/run.cpp
#	tools/server/CMakeLists.txt
2025-05-31 13:04:21 +08:00
Concedo
c923e9fe46 added option to unload model from admin control 2025-05-31 11:51:09 +08:00
Concedo
08e0745e7e added singleinstance flag and local shutdown api 2025-05-31 11:37:32 +08:00
Concedo
6529326c59 allow temperatures up to 1.0 when function calling 2025-05-30 15:59:18 +08:00
Concedo
c881bb7348 match a few common oai voices 2025-05-29 23:29:17 +08:00
Concedo
26bf5b446d fixed thread count <=0 , fixed clip skip <= 0 2025-05-28 00:38:15 +08:00
Concedo
f97bbdde00 fix to allow all EOGs to trigger a stop, Occam's glm4 fix 2025-05-24 22:55:11 +08:00
Concedo
ec04115ae9 swa options now available 2025-05-24 11:50:37 +08:00
Concedo
748dfcc2e4 massively improved tool calling 2025-05-24 02:26:11 +08:00
Concedo
c4df151298 experimental swa flag 2025-05-23 21:33:26 +08:00
Concedo
499283c63a rename define to match upstream 2025-05-23 17:10:12 +08:00
Concedo
e68a5f448c add ddim sampler 2025-05-22 21:28:01 +08:00
Concedo
f125e724eb fix off-by-one npast during some instances of fast forwarding 2025-05-22 19:51:21 +08:00
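Fast forwarding reuses the longest shared prefix between the cached context and the new prompt, and the boundary count (`n_past`) is a classic source of off-by-one bugs: reusing the *entire* new prompt leaves no token to evaluate for fresh logits. A hypothetical illustration of that class of bug, not the actual fix:

```python
def fastforward_point(cached, new_tokens):
    """Count leading tokens shared by the cached context and the new prompt,
    leaving at least one prompt token to be re-evaluated so that sampling
    has fresh logits to work from."""
    n = 0
    limit = min(len(cached), len(new_tokens))
    while n < limit and cached[n] == new_tokens[n]:
        n += 1
    # off-by-one guard: never fast-forward past the entire new prompt
    if new_tokens and n >= len(new_tokens):
        n = len(new_tokens) - 1
    return n
```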
Concedo
440350327c set random range for seed 2025-05-21 23:47:18 +08:00
Wagner Bruna
5d0cfc9db3
store on the image the actual random seed, for reproducibility (#1549) 2025-05-21 23:40:47 +08:00
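The two commits above fit together: when no seed is given (or a negative one is passed), a concrete seed is drawn from a fixed range, and it is that *resolved* value — not the `-1` sentinel — that gets recorded on the image so the generation can be reproduced. A minimal sketch under those assumptions (helper name and range are illustrative):

```python
import random

def resolve_seed(requested):
    """Turn a seed request into the concrete seed actually used.
    Negative/None means "random": draw one from a fixed range and return it,
    so the caller can record the real value for reproducibility."""
    if requested is None or requested < 0:
        return random.randint(0, 2**31 - 1)
    return requested
```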
Concedo
8b6dfbd1be disabling the gMask prefix for glm-4 completions 2025-05-21 17:29:24 +08:00
Concedo
49305942ab try disabling the gMask prefix for glm-4 completions 2025-05-21 16:47:08 +08:00
Concedo
5f4923bf24 backend tag replacement for endtags. view results with debug mode. 2025-05-19 23:14:43 +08:00
Concedo
710c747b60 minor noscript edit 2025-05-19 17:51:44 +08:00
Concedo
c546cb638e disable showgui if skiplauncher is used 2025-05-18 01:42:14 +08:00
Concedo
ca4274e384 added size info into HF searcher 2025-05-17 00:31:54 +08:00
Concedo
5ccd4b2bf5 horde default max ctx matches main ctx 2025-05-15 10:26:20 +08:00
Concedo
c5ea7fad93 updated lite, only show processed input in debugmode 2025-05-14 17:46:54 +08:00
Concedo
21e31e255b Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/docker.yml
#	README.md
#	build-xcframework.sh
#	common/CMakeLists.txt
#	examples/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-metal/ggml-metal.m
#	ggml/src/ggml-metal/ggml-metal.metal
#	ggml/src/ggml-sycl/CMakeLists.txt
#	ggml/src/ggml-sycl/backend.hpp
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/mmvq.cpp
#	ggml/src/ggml-sycl/vecdotq.hpp
#	scripts/compare-llama-bench.py
#	src/CMakeLists.txt
#	src/llama-model.cpp
#	src/llama.cpp
#	tests/test-backend-ops.cpp
#	tests/test-opt.cpp
#	tools/llama-bench/README.md
#	tools/llama-bench/llama-bench.cpp
#	tools/mtmd/CMakeLists.txt
#	tools/mtmd/README.md
#	tools/mtmd/clip.cpp
#	tools/rpc/rpc-server.cpp
#	tools/server/CMakeLists.txt
#	tools/server/README.md
2025-05-13 00:28:35 +08:00
Concedo
40eb3a54c4 rename some tooltip texts 2025-05-11 22:50:40 +08:00