koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-07 00:41:50 +00:00

Author	SHA1	Message	Date
Concedo	fedd529fdc	autofit counts overheads	2025-12-21 14:31:08 +08:00
Concedo	9458e08346	fixed https://github.com/LostRuins/koboldcpp/issues/1892	2025-12-19 22:52:39 +08:00
Concedo	30fecac3a3	small tweak	2025-12-18 15:41:22 +08:00
Concedo	1e083d9c8b	integrate autofit for upstream, removed forceversion	2025-12-17 18:42:47 +08:00
Concedo	9bc724f86c	rearrange some elements in launcher	2025-12-17 17:00:26 +08:00
Concedo	cacfa37611	wip	2025-12-17 16:04:45 +08:00
Concedo	bca0258c2a	bump default gen amount by 128 to 896	2025-12-14 22:17:31 +08:00
Concedo	e46a6a2796	better int parser	2025-12-13 09:28:10 +08:00
Concedo	ab9bc6f2ae	zimage cfg clamp is opt out with remove_limits	2025-12-13 09:20:00 +08:00
Concedo	b714fe19e2	allow easy clamping of max cfg and steps	2025-12-12 15:22:37 +08:00
Concedo	d07d2c1b39	stub loras endpoint for comfy	2025-12-11 22:48:38 +08:00
Concedo	fd0d0cab03	move pipeline parallelism to a --pipelineparallel launch flag	2025-12-11 21:03:41 +08:00
Concedo	b7428048fc	try reduce pipeline parallelism in order to reduce compute buffer sizes	2025-12-11 14:30:38 +08:00
Concedo	8a18e094f5	added smartcaching implementation inspired from Pento95 (+2 squashed commit) Squashed commit: [fcc498688] wip basic smart caching test [b6e8b2577] wip basic smart caching test	2025-12-10 18:00:03 +08:00
Concedo	242ae8b8f3	http get cleanup	2025-12-08 19:51:55 +08:00
Concedo	8c17541cc0	modify llama.cpp branding on lcpp ui (+1 squashed commits) Squashed commits: [067343edf] modify llama.cpp branding on lcpp ui	2025-12-07 12:53:33 +08:00
Concedo	12d11cee5c	added url to docs	2025-12-06 09:21:56 +08:00
Concedo	3550265249	add indent to kcpps files	2025-12-04 21:26:44 +08:00
Concedo	177e0d7515	strip_oaicontent_of_media placeholder (+2 squashed commit) Squashed commit: [7ccd52ef4] placeholder [71fd2d7bb] strip_oaicontent_of_media	2025-12-01 01:29:57 +08:00
Concedo	bf5efcf86d	Merge commit 'd82b7a7c1d' into concedo_experimental # Conflicts: # ci/run.sh # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # ggml/src/ggml-cuda/common.cuh # tests/CMakeLists.txt	2025-11-30 15:43:11 +08:00
Concedo	2985575be4	allow assistant prefills, fixed showgui issue	2025-11-30 12:52:28 +08:00
Concedo	925e7f8f6d	added a secondary terminal mirror for linux	2025-11-29 21:53:51 +08:00
Concedo	9999b8950d	cleaner resizing	2025-11-29 18:01:49 +08:00
Concedo	eda4a312cb	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/vulkan.Dockerfile # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-opencl/CMakeLists.txt # ggml/src/ggml-opencl/ggml-opencl.cpp # ggml/src/ggml-sycl/common.hpp # tests/test-backend-ops.cpp # tools/server/README.md	2025-11-28 13:22:02 +08:00
Concedo	e570478275	limit cuda arches + scale tweaks	2025-11-28 13:05:11 +08:00
Concedo	7527f1eff0	handle media for jinja path (+1 squashed commits) Squashed commits: [29d47d6b7] handle media for jinja path	2025-11-27 11:40:08 +08:00
Concedo	d68f4a5ae5	disable clip fa for now	2025-11-27 10:20:38 +08:00
Concedo	2b00292bfe	display path on 404	2025-11-27 10:07:08 +08:00
Concedo	c12f9e3b7c	bump version	2025-11-27 01:04:09 +08:00
Wagner Bruna	998dfcd1be	sd: add an API endpoint to list the available schedulers (#1856)	2025-11-26 22:49:36 +08:00
Concedo	d9b9c54393	added another alias to jinjatools	2025-11-26 18:52:08 +08:00
Concedo	9b6320cd71	adjust launcher scaling behavior	2025-11-25 21:32:03 +08:00
Concedo	9a7f749f7c	minor tweak for sd	2025-11-24 22:31:03 +08:00
Wagner Bruna	3a7dd1a97f	sd: sync to master-358-347710f Also adapt Koboldcpp LoRA loading function, and add backend support for lora_apply_mode.	2025-11-23 19:28:54 -03:00
Concedo	1cc4403cba	updated llama.cpp web ui (+2 squashed commit) Squashed commit: [9b22ac6e4] more fixes for lcpp web ui,. will be squashed [522b59b4c] henky tries using svelte or something	2025-11-24 00:43:27 +08:00
Rose	eeb7363985	improvements to tool calling logic (merged changes from old PR branch) (#1855) * improvements to tool calling logic (merged changes from old PR branch) * added some tweaks for improved tool calls to reuse old ctx, but needs testing. refer to PR. * fixes to some stuff that concedo's modifications broke * fixed error in reasoning * extremely hacky way to cache tool list please fix * oops forgot to add this * slightly less hacky way to preserve the tool list in context * prevented unintended toolcalls from happening when LLM states something irrelevant to toolcall decision * fixed something that broke koboldlite * fixed bug added by concedo that broke jinja tools * experimental further compression of tools array, needs testing * reverted experimental further compression of tools array * final cleanup * add newline after memory insert * changed tool reasoning to always be in json format to enforce including final decision * used new json format to skip extra llm call when not necessary * more catching of possible bad llm output * further cleanup * got it down to just one llm call! * better json format * even better json format * further refinement to json format * further refinement to json format * fixed broken tool calling * single-call enforced json method now seems to work well. removed fallbacks as they are no longer required. --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2025-11-23 22:41:31 +08:00
Concedo	b281d2554a	add a resume after horde worker pauses	2025-11-18 22:48:18 +08:00
Concedo	8631bbcee3	linting	2025-11-18 18:56:31 +08:00
LostRuins Concedo	281542aa0d	add smoothing curve, not tested	2025-11-17 23:07:35 +08:00
LostRuins Concedo	ea22e04320	filename insensitive search for adapters	2025-11-15 09:48:22 +08:00
LostRuins Concedo	357bef3082	add toggle for jinja tools	2025-11-12 17:29:42 +08:00
LostRuins Concedo	95291a93df	rosie fixes: add format normalization for tools and tool call streaming fixes (#1842)	2025-11-11 23:06:27 +08:00
LostRuins Concedo	cdc18f0945	linting (+1 squashed commits) Squashed commits: [994427d3c] linting	2025-11-10 20:54:44 +08:00
Wagner Bruna	2ae6bff5bd	split memory detection functions and add debug command (#1832)	2025-11-10 18:07:15 +08:00
LostRuins Concedo	60a74bdd89	make tool calling work with jinja. but still need to fix qwen omni first (+1 squashed commits) Squashed commits: [e394da61e] make tool calling work with jinja. but still need to fix qwen omni first	2025-11-09 16:56:14 +08:00
LostRuins Concedo	055fdcef63	update model path jinja tojson	2025-11-08 21:51:50 +08:00
LostRuins Concedo	af94884971	update props	2025-11-08 10:15:13 +08:00
LostRuins Concedo	92b5afc019	flag to show if jinja is enabled	2025-11-08 00:49:50 +08:00
LostRuins Concedo	462a34ed5b	jinja is now working	2025-11-07 23:46:22 +08:00
LostRuins Concedo	cfb22b5c9d	rename a missed BLAS -> batch	2025-11-06 16:11:26 +08:00

1190 commits