Commit graph

203 commits

Author SHA1 Message Date
Concedo
2b02cd75c7 reformat debug logging 2024-02-01 23:20:51 +08:00
Concedo
340fbbbb04 show warning if genamt >= ctxsize, show t/s values 2024-01-31 18:51:42 +08:00
Concedo
13dcf4b556 print seed 2024-01-31 14:42:47 +08:00
Concedo
21ab727e83 change split mode to rows 2024-01-30 22:30:08 +08:00
Concedo
ed09a854f0 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README.md
#	ci/run.sh
#	ggml-opencl.cpp
#	tests/CMakeLists.txt
2024-01-27 11:45:07 +08:00
Concedo
762eeb6204 triage for opencl 2024-01-27 11:09:43 +08:00
Concedo
d9a7bd577a gpu layer offloading disabled for phi models in clblast 2024-01-25 17:40:05 +08:00
Concedo
08236ccc97 better abort handling, added support for dynatemp exponent 2024-01-23 16:56:12 +08:00
Concedo
5ff53507c4 fixed compile issues for cublas 2024-01-21 14:23:48 +08:00
Concedo
5639c1a520 units (+2 squashed commits)
Squashed commit:

[166979d9] units conversion

[038dd5d4] get rid of all warnings (+1 squashed commits)

Squashed commits:

[6efd1e1b] get rid of all warnings
2024-01-20 23:53:21 +08:00
Concedo
db14de5c32 fossilize ggml library ver 3, to support ggjtv3 2024-01-20 10:49:25 +08:00
kalomaze
123bff9a0f
Full DynaTemp implementation + UI (#600)
* move Dynatemp changes to new branch

* fix float header

* Properly reintroduce variable expert count

Controllable through experts.txt

* first pass at DynaTemp UI

Checkbox partially implemented; Min and Max Temp implemented

* DynaTemp UI Checkbox

Trigger DynaTemp on checkbox

* DynaTemp UI checkbox edition

Hell Yeah! DynaTemp!

* Remove greedy dynatemp

* Fix race condition caused by debug print

* Fixed broken presets and mirostat

Fixes broken presets and mirostat

* Remove debug function + HHI temp

Also removed unnecessary softmax double precision

* Fix whitespace (?) for generate function

* epic upstream renaming scheme fix

* fix stupid indents

* Other cleanup

Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp

* Slight indent fix

* revert batch pyinstaller maker to mainline

and also delete experts.txt since adjustable routing is also being removed for the PR

* compact dynatemp into a single value, dynatemp_range. This is a float representing the allowed deviation above and below the base temperature when using dynatemp. Thus, to get dynatemp_min=0.3 and dynatemp_max=0.5, we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp operates the same, but this simplifies usage by making it a single, easy-to-adjust value (see the sketch after this entry).

---------

Co-authored-by: Alexander Abushady <aabushady214@gmail.com>
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-06 11:13:16 +08:00
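
A minimal sketch of the dynatemp_range scheme described in the entry above (not the actual koboldcpp code; the variable names and printed output are only for illustration): the single range value expands back into the min/max temperature pair the commit message mentions.

```cpp
#include <cstdio>

int main() {
    // Example values from the commit message: temperature = 0.4 with
    // dynatemp_range = 0.1 is equivalent to dynatemp_min = 0.3, dynatemp_max = 0.5.
    float temperature    = 0.4f;
    float dynatemp_range = 0.1f;

    // The range is the allowed deviation above and below the base temperature.
    float dynatemp_min = temperature - dynatemp_range;
    float dynatemp_max = temperature + dynatemp_range;

    printf("dynatemp_min = %.2f, dynatemp_max = %.2f\n", dynatemp_min, dynatemp_max);
    return 0;
}
```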
Concedo
e49d398f73 use same struct size for cuda and non cuda (+1 squashed commits)
Squashed commits:

[6eee8e2f] use same struct size for cuda and non cuda
2024-01-03 16:05:54 +08:00
Concedo
94e68fe474 added field to show recent seed 2024-01-02 15:35:04 +08:00
Concedo
5e59112de8 prevent other calls when uninitialized 2023-12-28 12:04:53 +08:00
Concedo
2d5d82e915 allocate gpt_params on heap instead to avoid rare segfault 2023-12-28 11:48:21 +08:00
DebuggingLife46
e733a9e425
Add logit_bias to the OpenAI api (#577)
* Add logit_bias to the OpenAI api

* Cleanup and refactor, test in swagger.

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-12-27 00:26:19 +08:00
Concedo
8823e8b06d added presence penalty into lite ui 2023-12-23 10:39:40 +08:00
Concedo
77463e0e9c batch size improvements 2023-12-22 15:27:40 +08:00
Concedo
3f863eed72 add presence penalty 2023-12-19 23:18:56 +08:00
Concedo
7469f202ea use lowvram flag for offload qkv 2023-12-08 18:16:14 +08:00
Concedo
ec21fa7712 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	Package.swift
#	README.md
#	ggml-cuda.cu
#	llama.cpp
#	llama.h
#	scripts/sync-ggml.sh
#	tests/CMakeLists.txt
2023-12-08 17:42:26 +08:00
Concedo
c7511526a2 noscript mode is done 2023-12-07 00:52:25 +08:00
Concedo
6570a2005b token count includes ids 2023-12-03 15:44:53 +08:00
Concedo
c142c5634a fixed segfault with clblast by reversing commit in issue https://github.com/ggerganov/llama.cpp/issues/4296 2023-12-03 00:56:00 +08:00
Concedo
12f66eaa1d adjust fragmentation fix 2023-12-02 15:59:08 +08:00
Concedo
a012342a77 updated docs, shifted kv extra space to be subtracted from user's ctx value instead of added on load. 2023-11-30 14:19:40 +08:00
Concedo
ba5c33319b Allocate a small amount of extra context for GGUF to deal with KV fragmentation causing issues in some scenarios. 2023-11-28 20:55:14 +08:00
Concedo
bffa78116d explore quiet mode 2023-11-26 23:57:27 +08:00
Concedo
a6eb9b8010 Fix GPT2 not loading due to graph too small 2023-11-26 23:06:42 +08:00
Concedo
eb42c73953 revert auto rope scaling for already-ropetuned models - just use their values 2023-11-24 14:20:36 +08:00
Concedo
4d7c14be73 fix stop seq escaping newline 2023-11-20 22:35:45 +08:00
Concedo
cf646fa809 try to scale custom roped models 2023-11-19 16:24:13 +08:00
Concedo
8b919b5b57 allow customized rope to use model set values 2023-11-15 16:21:52 +08:00
Concedo
be92cfa125 added preloadstory 2023-11-10 13:05:22 +08:00
Concedo
fb3bcac368 handle memory separately for kcpp 2023-11-07 17:15:14 +08:00
Concedo
1e7088a80b autopick cublas in gui if possible, better layer picking logic 2023-11-05 01:35:27 +08:00
Concedo
ae2cd56de8 kobold integration of min_p sampler (+1 squashed commits)
Squashed commits:

[8ad2e349] kobold integration for min_p sampler
2023-11-01 19:08:45 +08:00
Concedo
cc5b282350 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	build.zig
#	flake.lock
#	flake.nix
#	ggml.c
2023-10-31 20:44:04 +08:00
Concedo
9eba77c6a0 finally got something workable 2023-10-30 23:30:21 +08:00
Concedo
7f050b5d16 tweak numbers 2023-10-29 22:46:19 +08:00
Concedo
7924592a83 context shift feature done 2023-10-29 18:21:39 +08:00
Concedo
338d6c265d fixes to smartcontextpro 2023-10-29 10:42:37 +08:00
Concedo
20ef442c2a fixed for smartcontext 2023-10-28 19:09:22 +08:00
Concedo
15f525c580 revamped smart context for llama models 2023-10-28 12:59:08 +08:00
Concedo
0f46534866 wip 2023-10-26 21:58:51 +08:00
Concedo
5db89b90b7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README.md
#	build.zig
#	ggml-opencl.cpp
#	tests/CMakeLists.txt
#	tests/test-double-float.cpp
#	tests/test-sampling.cpp
2023-10-25 23:58:15 +08:00
Concedo
839fc6dac8 handle freq_base_train 2023-10-24 23:44:22 +08:00
Concedo
cff75061fe fixed some old models failing due to tokenizer changes, update lite (+1 squashed commits)
Squashed commits:

[9dee81ec] fixed some old models failing due to tokenizer changes, update lite tooltip (+3 squashed commit)

Squashed commit:

[5ab95a79] fixes

[a561d5e2] fixed some old models failing due to tokenizer changes

[95e65daf] lite updates
2023-10-22 11:04:59 +08:00
kalomaze
ddce116ec9
Fix for Top K disabling (#480)
* Update gpttype_adapter.cpp

* use n_vocab instead of 32000 when top k is off (see the sketch after this entry)
2023-10-19 23:20:44 +08:00
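
A minimal sketch of the Top-K fix described above, assuming a hypothetical helper name (this is not the actual gpttype_adapter.cpp code): when Top-K is disabled, the candidate count falls back to the model's full vocabulary size rather than a hard-coded 32000.

```cpp
// Hypothetical helper illustrating the fix: treat a disabled or oversized
// top_k as "keep every token" by clamping it to n_vocab.
int effective_top_k(int top_k, int n_vocab) {
    if (top_k <= 0 || top_k > n_vocab) {
        return n_vocab; // Top-K off: consider the whole vocabulary
    }
    return top_k;
}
```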