Commit graph

7248 commits

Concedo
0cfd8d23cb handle symlinks (+1 squashed commit)
Squashed commits:

[fb8477b9] fixed makefile (+4 squashed commits)

Squashed commits:

[4a245bba] fixed a makefile issue

[d68eba69] alias usehipblas to usecublas

[a9ab0a7c] dynamic rocwmma selection

[fefe17c7] revert rocwmma
2025-03-17 21:03:30 +08:00
Concedo
131107dc91 lite fix admin button display issue when preload story 2025-03-17 00:17:31 +08:00
Concedo
6888f5495d allow quantkv with contextshift 2025-03-16 21:48:42 +08:00
Concedo
e466ce65e2 updated sd metadata 2025-03-16 20:12:43 +08:00
Concedo
8708403ee9 revert clean 2025-03-16 17:53:35 +08:00
Concedo
5ef1722d5f fix for sd 2025-03-16 17:02:42 +08:00
Concedo
0954e9e476 improve model estimation 2025-03-16 16:14:13 +08:00
Concedo
5d7c5e9e33 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	examples/tts/tts.cpp
2025-03-16 15:42:39 +08:00
Concedo
2401502cbd improvement to tool calling, allowing specific tools to be used 2025-03-16 15:20:08 +08:00
Concedo
9f7fd63160 revert unwanted change to tool calling 2025-03-16 01:35:48 +08:00
marcoStocchi
f4c3dd5daa
llama-tts : add '-o' option (#12398)
* added -o option to specify an output file name

* llama-tts returns ENOENT in case of file write error

note : PR #12042 is closed as superseded by this one.
2025-03-15 17:23:11 +01:00
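The llama-tts commit above describes writing output to a user-chosen file and returning ENOENT on a write failure. A minimal sketch of that error-handling shape, assuming a hypothetical `save_wav` helper (name and signature are illustrative, not the PR's actual code):

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>

// Illustrative sketch: write audio bytes to the path given by "-o",
// reporting ENOENT to the caller if the file cannot be created.
static int save_wav(const std::string & path, const std::string & data) {
    FILE * f = std::fopen(path.c_str(), "wb");
    if (!f) {
        return ENOENT; // surface the file-write error instead of crashing
    }
    std::fwrite(data.data(), 1, data.size(), f);
    std::fclose(f);
    return 0;
}
```

Returning an errno value (rather than aborting) lets callers and scripts distinguish a bad output path from a synthesis failure.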
Concedo
98eade358a more rocm include dir 2025-03-15 23:29:00 +08:00
aubreyli
3d35d87b41
SYCL: Delete redundant plus sign and space (#12391) 2025-03-15 15:49:03 +01:00
fairydreaming
b19bd064c0
SYCL : support non-contiguous tensors in binary ops (add, sub, etc) (#12399)
* sycl : support non-contiguous tensors in binary ops

* sycl : silence unused variable warning

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2025-03-15 22:19:30 +08:00
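The SYCL commit above adds support for non-contiguous tensors in binary ops. The core idea is stride-aware indexing instead of assuming elements are packed; a simplified 2-D sketch (illustrative only, real ggml/SYCL kernels use byte strides and four dimensions):

```cpp
#include <cassert>
#include <cstdint>

// 2-D element-wise add where each tensor carries its own row stride,
// so a non-contiguous view (stride > cols) indexes correctly:
// dst[i][j] = a[i][j] + b[i][j]
static void add2d(const float * a, int64_t sa,
                  const float * b, int64_t sb,
                  float * dst, int64_t sd,
                  int64_t rows, int64_t cols) {
    for (int64_t i = 0; i < rows; ++i) {
        for (int64_t j = 0; j < cols; ++j) {
            dst[i*sd + j] = a[i*sa + j] + b[i*sb + j];
        }
    }
}
```

A contiguous tensor is simply the special case where the stride equals the row length.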
Concedo
67851e5415 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	examples/run/run.cpp
#	ggml/src/ggml-cann/aclnn_ops.cpp
2025-03-15 19:54:19 +08:00
Concedo
e84596ec1a add config for default gen tokens and bos toggle 2025-03-15 19:53:06 +08:00
Concedo
bfc30066c9 fixed a clip processing bug 2025-03-15 17:49:49 +08:00
Concedo
7272165e0e verbosity 2025-03-15 12:13:04 +08:00
Concedo
4212f0b8e8 wip on multiple fixes 2025-03-15 10:50:36 +08:00
Chenguang Li
92a391327e
[CANN]MUL_MAT optimization (#12382) 2025-03-15 09:31:08 +08:00
Eric Curtin
9f2250ba72
Add CLI arg to llama-run to adjust the number of threads used (#12370)
We default to 4; sometimes we want to adjust this manually.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-14 16:41:20 +00:00
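The llama-run commit above adds a thread-count flag with a default of 4. A hedged sketch of that pattern (flag names and the parsing helper are illustrative, not the PR's actual code):

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Illustrative: scan argv-style tokens for a thread-count flag,
// falling back to the default of 4 mentioned in the commit message.
static int parse_threads(const std::vector<std::string> & args) {
    int n_threads = 4; // default when the flag is absent
    for (size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == "--threads" || args[i] == "-t") {
            n_threads = std::atoi(args[i + 1].c_str());
        }
    }
    return n_threads;
}
```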
Sigbjørn Skjæret
774973b8f3
main : add -sysf / --system-prompt-file (#12249) (#12250)
* add system_prompt_file

* add -sysf / --system-prompt-file

* remove system_prompt_file
2025-03-14 16:57:05 +01:00
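The `-sysf / --system-prompt-file` commit above loads the system prompt from a file instead of the command line. A minimal sketch of the file-reading step (helper name is hypothetical):

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Illustrative: slurp the whole file into a string to use as the
// system prompt; an empty result signals a missing/unreadable file.
static std::string read_system_prompt(const std::string & path) {
    std::ifstream f(path);
    std::ostringstream ss;
    ss << f.rdbuf();
    return ss.str();
}
```

Reading from a file avoids shell-quoting problems with long or multi-line system prompts.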
Concedo
4a29e216e7 edit readme 2025-03-14 21:06:55 +08:00
fairydreaming
8fcb563613
Load all MoE experts during warmup (#11571)
* llama : introduce llama_set_warmup() API call that controls warmup mode; use all MoE experts during warmup

* common : use new API to enable warmup mode during model warmup

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2025-03-14 13:47:05 +01:00
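The warmup commit above routes through all MoE experts during warmup so every expert's weights get touched (paged in), then returns to normal top-k routing. A toy sketch of that toggle, with made-up numbers (the real `llama_set_warmup()` operates on the llama context, not a struct like this):

```cpp
#include <cassert>

// Illustrative MoE routing config: normally only the top-k experts run
// per token; in warmup mode every expert runs once so all weights load.
struct moe_cfg {
    int  n_expert      = 8;     // total experts in the model
    int  n_expert_used = 2;     // top-k experts used per token normally
    bool warmup        = false; // set during model warmup
};

static int experts_for_pass(const moe_cfg & c) {
    return c.warmup ? c.n_expert : c.n_expert_used;
}
```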
Concedo
d7498e7e8a added model switching to gguf in admin mode (auto guess layers) 2025-03-14 19:45:55 +08:00
Concedo
30cb77a900 rename replace_instruct_placeholders field 2025-03-14 18:37:12 +08:00
Concedo
be3bba67ff Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	src/llama-model.cpp
2025-03-14 18:25:21 +08:00
Victor
add2a3aa5a
server: fix "--grammar-file" parameter (#12285) 2025-03-14 11:21:17 +01:00
Concedo
782e1e193a replaced winclinfo.exe with a simplified simpleclinfo.exe that only provides device names and nothing else (+1 squashed commit)
Squashed commits:

[4a73c8d3] replaced winclinfo.exe with a simplified simpleclinfo.exe that only provides device names and nothing else
2025-03-14 18:18:32 +08:00
Concedo
6a1dd57435 gemma3 template, updated lite, fixed tool calling, reenable ctx shift for gemma3 2025-03-14 17:47:01 +08:00
Georgi Gerganov
c522ce4143
graph : simplify attn input build for unified KV cache (#12381)
ggml-ci
2025-03-14 10:47:44 +02:00
Georgi Gerganov
081bee8c64
hparams : add SWA rope parameters (#12374)
ggml-ci
2025-03-14 09:03:24 +02:00
Concedo
7dc72db9de Merge branch 'upstream' into concedo_experimental 2025-03-14 11:58:53 +08:00
Concedo
0db4ae6237 traded my ink for a pen 2025-03-14 11:58:15 +08:00
Georgi Gerganov
84d5475541
llama : fix Gemma3 SWA KV cache shift (#12373)
* llama : fix Gemma3 SWA KV cache shift

ggml-ci

* hparams : add comment [no ci]
2025-03-13 19:08:07 +02:00
Concedo
52cf1ded0c remove unwanted print 2025-03-14 00:24:28 +08:00
Concedo
bdf2977372 fixed windows ci 2025-03-13 20:45:16 +08:00
Concedo
0460d92cc3 disable context shifting for gemma3 2025-03-13 20:28:26 +08:00
Concedo
ca698f0cbe tweaked sd img metadata 2025-03-13 20:04:29 +08:00
Wagner Bruna
5413be2c1b
sd: add generation parameters to image metadata (#1416)
Straight adaptation from stable-diffusion.cpp main.cpp.
2025-03-13 19:35:06 +08:00
Xuan-Son Nguyen
be7c303410
arg : no n_predict = -2 for examples except for main and infill (#12364) 2025-03-13 12:34:54 +01:00
Concedo
2c9ade61fe test automatic vk shader rebuilding 2025-03-13 19:34:15 +08:00
Georgi Gerganov
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
* llama : refactor llama_context, llama_kv_cache, llm_build_context

ggml-ci

* graph : don't mutate the KV cache during defrag

ggml-ci

* context : reduce virtuals + remove test function

ggml-ci

* context : move interface implementation to source file + factory

ggml-ci

* graph : move KV cache build functions to llama_context impl

ggml-ci

* graph : remove model reference from build_pooling

ggml-ci

* graph : remove llama_model reference

ggml-ci

* kv_cache : provide rope factors

ggml-ci

* graph : rework inputs to use only unique_ptr, remove attn input abstraction

ggml-ci

* context : remove llama_context_i abstraction

ggml-ci

* context : clean-up

ggml-ci

* graph : clean-up

ggml-ci

* llama : remove redundant keywords (struct, enum)

ggml-ci

* model : adapt gemma3

ggml-ci

* graph : restore same attention ops as on master

ggml-ci

* llama : remove TODO + fix indent

ggml-ci
2025-03-13 12:35:44 +02:00
Ishaan Gandhi
2048b5913d
server : fix crash when using verbose output with input tokens that are not in printable range (#12178) (#12338)
* Fix DOS index bug

* Remove new APIs

* remove extra line

* Remove from API

* Add extra newline

* Update examples/server/server.cpp

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-03-13 11:10:05 +01:00
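The server fix above hardens verbose output against tokens whose text contains non-printable bytes. A sketch of the defensive idea, escaping anything outside the printable ASCII range before logging (the escaping scheme is illustrative, not the PR's actual fix):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Illustrative: replace non-printable bytes with \xNN escapes so
// arbitrary token contents cannot corrupt or crash verbose logging.
static std::string escape_nonprintable(const std::string & s) {
    std::string out;
    for (unsigned char c : s) {
        if (c >= 0x20 && c < 0x7f) {
            out += (char) c;
        } else {
            char buf[8];
            std::snprintf(buf, sizeof(buf), "\\x%02x", c);
            out += buf;
        }
    }
    return out;
}
```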
Concedo
e75539e8cb too many issues without BOS (+1 squashed commit)
Squashed commits:

[7138d941] only print bos alert in debug
2025-03-13 16:48:29 +08:00
Concedo
1ef41c2124 streamline output console log (+1 squashed commit)
Squashed commits:

[ca474bdd] streamline output console log
2025-03-13 15:33:49 +08:00
Concedo
16137f4281 gemma3 now works correctly 2025-03-13 14:34:18 +08:00
Concedo
57c9523405 sd lora from url 2025-03-13 10:55:01 +08:00
Oscar Barenys
f08f4b3187
Update build.yml for Windows Vulkan builder to use Vulkan 1.4.304 SDK for VK_NV_cooperative_matrix2 support (#12301) 2025-03-12 20:06:58 +01:00
Concedo
77debb1b1b gemma3 vision works, but is using more tokens than expected - may need resizing 2025-03-13 00:31:16 +08:00