koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 09:04:36 +00:00

Author	SHA1	Message	Date
Concedo	4b348d0b7e	add 2 more save slots	2025-07-22 21:07:19 +08:00
Concedo	75154a3d91	add ping endpoint	2025-07-22 18:55:35 +08:00
Concedo	30675b0798	Merge branch 'upstream' into concedo_experimental # Conflicts: # CODEOWNERS # docs/build.md # scripts/sync-ggml.last # tests/test-backend-ops.cpp # tools/imatrix/README.md # tools/imatrix/imatrix.cpp	2025-07-20 22:47:31 +08:00
Concedo	b028dd4e84	minor fixes	2025-07-18 13:22:59 +08:00
Concedo	1ca666f9c1	allow handling multipart files up to 999	2025-07-18 01:18:28 +08:00
Concedo	afca31bfbe	handle clean_env for remotetunnel	2025-07-17 18:21:22 +08:00
Concedo	d4a394ff73	label attached media with ids	2025-07-17 10:04:46 +08:00
Concedo	d3d5e36af6	backwards compat for older flags in config load	2025-07-15 22:05:57 +08:00
Concedo	51cac6f30c	add a title to load config	2025-07-15 18:22:44 +08:00
Concedo	8396add5be	removed hunyuan autoguess template, fixed multi file loading up to 999 parts	2025-07-15 17:49:49 +08:00
Concedo	b7f8d0fe2b	handle inconsistent final message content being sent with finish_reason	2025-07-14 22:17:18 +08:00
Concedo	0e8f96414a	add in backwards compatibility for older clients with incorrect json_schema passing	2025-07-14 17:41:08 +08:00
Concedo	e7eb6d3200	increase default ctx size to 8k, rename usecublas to usecuda	2025-07-13 18:27:42 +08:00
Concedo	811463a704	split audio and vision detection separately	2025-07-13 17:47:15 +08:00
Concedo	0938af7c83	fixed noscript image gen	2025-07-13 11:37:52 +08:00
Concedo	6f4f1b7389	allowing resuming incomplete aria2 downloads	2025-07-13 11:14:39 +08:00
Concedo	e9473305d0	wip2 (+1 squashed commits) Squashed commits: [4628777b6] wip	2025-07-12 18:54:40 +08:00
Concedo	57ce374240	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/ISSUE_TEMPLATE/010-bug-compilation.yml # .github/ISSUE_TEMPLATE/011-bug-results.yml # .github/labeler.yml # .github/workflows/build.yml # .github/workflows/release.yml # .gitmodules # CMakeLists.txt # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-opencl/kernels/softmax_4_f16.cl # ggml/src/ggml-opencl/kernels/softmax_4_f32.cl # ggml/src/ggml-opencl/kernels/softmax_f16.cl # ggml/src/ggml-opencl/kernels/softmax_f32.cl # ggml/src/ggml-sycl/element_wise.cpp # ggml/src/ggml-sycl/element_wise.hpp # ggml/src/ggml-sycl/ggml-sycl.cpp # scripts/sync-ggml-am.sh # scripts/sync-ggml.last # scripts/sync-ggml.sh # tests/test-backend-ops.cpp # tests/test-c.c	2025-07-05 12:16:28 +08:00
Concedo	ae0c6b02f8	print system info	2025-07-05 11:58:00 +08:00
Wagner Bruna	d74c16e6e0	enable flash attention for image generation (#1633)	2025-07-05 11:20:51 +08:00
Wagner Bruna	bc3e4c1197	remove obsolete warning about flash attention on Vulkan (#1634)	2025-07-03 16:57:03 +08:00
Concedo	f407aa3b8a	emulated oai image generation	2025-07-02 16:01:56 +08:00
Concedo	cdda9d16e0	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/tools.sh # build-xcframework.sh # ci/run.sh # examples/Miku.sh # examples/chat-13B.sh # examples/chat-persistent.sh # examples/chat-vicuna.sh # examples/chat.sh # examples/jeopardy/jeopardy.sh # examples/reason-act.sh # examples/server-llama2-13B.sh # examples/sycl/build.sh # examples/sycl/run-llama2.sh # examples/sycl/run-llama3.sh # examples/ts-type-to-grammar.sh # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-sycl/element_wise.cpp # ggml/src/ggml-sycl/element_wise.hpp # ggml/src/ggml-sycl/ggml-sycl.cpp # scripts/apple/validate-apps.sh # scripts/apple/validate-ios.sh # scripts/apple/validate-macos.sh # scripts/apple/validate-tvos.sh # scripts/apple/validate-visionos.sh # scripts/check-requirements.sh # scripts/ci-run.sh # scripts/compare-commits.sh # scripts/debug-test.sh # scripts/gen-authors.sh # scripts/get-hellaswag.sh # scripts/get-pg.sh # scripts/get-wikitext-103.sh # scripts/get-wikitext-2.sh # scripts/get-winogrande.sh # scripts/hf.sh # scripts/qnt-all.sh # scripts/run-all-perf.sh # scripts/run-all-ppl.sh # scripts/sync-ggml-am.sh # scripts/sync-ggml.sh # scripts/tool_bench.sh # tests/test-backend-ops.cpp # tests/test-lora-conversion-inference.sh # tests/test-tokenizer-0.sh # tools/server/README.md	2025-06-30 20:38:44 +08:00
Concedo	b6edb79648	filter out empty entries	2025-06-28 20:22:34 +08:00
Concedo	a88c56e70c	Merge remote-tracking branch 'origin/upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .github/workflows/release.yml # examples/eval-callback/eval-callback.cpp # ggml/src/ggml-cann/common.h # ggml/src/ggml-vulkan/CMakeLists.txt # ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt # tests/test-backend-ops.cpp	2025-06-28 17:47:53 +08:00
Concedo	4ec0e0fd21	now accept multiple images for reference images	2025-06-28 17:30:28 +08:00
Concedo	0bd648ffa4	photomaker renamed to extra image to handle future extension	2025-06-28 10:26:06 +08:00
tsite	df47b51bd1	support python 3.13 (#1621)	2025-06-27 00:18:30 +08:00
Concedo	ace537d44e	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .github/workflows/release.yml # CMakeLists.txt # examples/simple-chat/simple-chat.cpp # src/llama-quant.cpp # tools/run/run.cpp # tools/server/README.md	2025-06-24 23:06:16 +08:00
Concedo	2d822d3059	fixed a typo	2025-06-22 23:28:29 +08:00
Concedo	abc1d8ac25	better way of checking for avx2 support	2025-06-22 22:56:50 +08:00
Concedo	52dcfe42d6	try auto selecting correct backend while checking intrinsics	2025-06-22 18:16:02 +08:00
Concedo	72d467c6d5	vision is now working in ollama owui	2025-06-21 23:43:43 +08:00
Concedo	6039791adf	minor bugfixes	2025-06-21 18:41:28 +08:00
Concedo	65ff041827	added more perf stats	2025-06-21 12:12:28 +08:00
Wagner Bruna	08adfb53c9	Configurable VAE threshold limit (#1601) * add backend support for changing the VAE tiling threshold * trigger VAE tiling by image area instead of dimensions I've tested with GGML_VULKAN_MEMORY_DEBUG all resolutions with the same 768x768 area (even extremes like 64x9216), and many below that: all consistently allocate 6656 bytes per image pixel. As tiling is primarily useful to avoid excessive memory usage, it seems reasonable to enable VAE tiling based on area rather than maximum image side. However, as there is currently no user interface option to change it back to a lower value, it's best to maintain the default behavior for now. * replace the notile option with a configurable threshold This allows selecting a lower threshold value, reducing the peak memory usage. The legacy sdnotile parameter gets automatically converted to the new parameter, if it's the only one supplied. * simplify tiling checks, 768 default visible in launcher --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2025-06-21 10:14:57 +08:00
Concedo	2ba7803b95	replace_instruct_placeholders is now default	2025-06-20 22:11:58 +08:00
Concedo	4e40f2aaf4	added photomaker face cloning	2025-06-20 21:33:36 +08:00
Concedo	21881a861d	rename restrict square to sdclampedsoft	2025-06-20 15:39:55 +08:00
Concedo	924dfa7cd3	bump version	2025-06-18 21:37:24 +08:00
Concedo	e0a7694328	try to set cuda pcie order first thing	2025-06-18 20:25:38 +08:00
Concedo	a8d33ebb0d	increase genamt hardlimit from 0.1 to 0.2 ratio	2025-06-18 19:34:12 +08:00
Concedo	40443a98f5	show available RAM, fixed SD vae tiling noise	2025-06-18 18:44:50 +08:00
Concedo	7966bdd1ad	allow embeddings model to use gpu	2025-06-18 00:46:30 +08:00
Concedo	ab29be54c4	comfyui compat - serve temporary upload endpoint for img2img	2025-06-16 23:18:47 +08:00
Concedo	861a2f5275	terminal title	2025-06-15 21:51:44 +08:00
Concedo	238be98efa	Allow override config for gguf files when reloading in admin mode, updated lite, fixed typo (+1 squashed commits) Squashed commits: [fe14845cc] Allow override config for gguf files when reloading in admin mode, updated lite (+2 squashed commit) Squashed commit: [9ded66aa5] Allow override config for gguf files when reloading in admin mode [9597f6a34] update lite	2025-06-14 12:00:20 +08:00
Wagner Bruna	f6d2d1ce5c	configurable resolution limit (#1586) * refactor image gen configuration screen * make image size limit configurable * fix resolution limits and keep dimensions closer to the original ratio * use 0.0 for the configured default image size limit This prevents the current default value from being saved into the config files, in case we later decide to adopt a different value. * export image model version when loading * restore model-specific default image size limit * change the image area restriction to be specified by a square side * move image resolution limits down to the C++ level * Revert "export image model version when loading" This reverts commit `fa65b23de3`. * Linting Fixes: PY: - Inconsistent var name sd_restrict_square -> sd_restrict_square_var - GUI swap back to using absolute row numbers for now. - fstring fix - size_limit -> side_limit inconsistency C++: - roundup_64 standalone function - refactor sd_fix_resolution variable names for clarity - move "anti crashing" hard total megapixel limit always to be applied after soft total megapixel limit instead of conditionally only when sd_restrict_square is unset * allow unsafe resolutions if debugmode is on --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2025-06-13 20:05:20 +08:00
Concedo	1cbe716e45	allow setting maingpu	2025-06-12 17:53:43 +08:00
henk717	f151648f03	Pyinstaller launcher and dependency updates This PR adds a new launcher executable to the unpack feature, eliminating the need to have python and its dependencies in the unpacked version. It also does a few dependency changes to help future proof.	2025-06-10 23:08:02 +08:00

1042 commits