koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 17:14:36 +00:00

Author	SHA1	Message	Date
Ira Peach	e00e17b3f9	Flush STDOUT when server starts listening. (#651) This works around a Win32 issue when piping output from a PyInstaller context, such as when doing so in a perl script or to an output file. Print statements from a Python context don't properly get output unless flushed. This strategically flushes the print statements so no information is lost, though it may be better to flush all print statements in a Python context via a subroutine wrapper. See also: https://mail.python.org/pipermail/python-bugs-list/2004-August/024923.html https://stackoverflow.com/a/466849 https://stackoverflow.com/q/62693079	2024-01-31 14:40:45 +08:00
Concedo	f81404e33c	updated class py, added imatrix	2024-01-28 22:37:11 +08:00
Concedo	c2e497ccfb	deferred aborting for queued generations	2024-01-28 14:24:15 +08:00
Concedo	61ca3a0d30	show total of 8 backends	2024-01-27 17:05:33 +08:00
Concedo	87d852b85c	get gpu names with vulkaninfo	2024-01-26 12:58:30 +08:00
Concedo	2a4a7241e6	Merge branch 'vulkan_test' into concedo_experimental # Conflicts: # CMakeLists.txt # Makefile # llama.cpp	2024-01-25 23:01:44 +08:00
Concedo	346c1a97de	fixed file select cancel, updated lite	2024-01-24 16:36:53 +08:00
Concedo	0f6fa6be93	try adding other fallback backends for linux	2024-01-23 23:37:56 +08:00
Concedo	a4ed5c6471	added 48k ctx option	2024-01-23 17:27:02 +08:00
Concedo	08236ccc97	better abort handling, added support for dynatemp exponent	2024-01-23 16:56:12 +08:00
Concedo	dc7bc0cb50	Merge commit '`584d674be6`' into concedo_experimental # Conflicts: # .github/workflows/nix-flake-update.yml # Makefile # Package.swift # ggml-cuda.cu # tests/test-quantize-fns.cpp	2024-01-14 16:29:44 +08:00
kalomaze	bd77a48037	Do not default to Repetition Penalty 1.1 (#615) * Do not default to Repetition Penalty * apply all known aliases for repetition penalty when using the OAI endpoint. rep pen defaults to 1, range to 256 --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2024-01-13 22:20:02 +08:00
Concedo	b9ad08af19	improved dynatemp wizard	2024-01-11 11:26:14 +08:00
Concedo	5cc64ebb52	dynatemp wizard	2024-01-09 15:51:32 +08:00
Concedo	550829ed98	dont get stuck if cloudflared failed to download correctly	2024-01-08 21:11:17 +08:00
kalomaze	123bff9a0f	Full DynaTemp implementation + UI (#600) * move Dynatemp changes to new branch * fix float header * Properly reintroduce variable expert count Controllable through experts.txt * first pass at DynaTemp UI Checkbox partial implemented, Min and Max Temp implemented * DynaTemp UI Checkbox Trigger DynaTemp on checkbox * DynaTemp UI checkbox edition Hell Yeah! DynaTemp! * Remove greedy dynatemp * Fix race condition caused by debug print * Fixed broken presets and miro Fixes broken presets and mirostat * Remove debug function + HHI temp Also removed unnecessary softmax double precision * Fix whitespace (?) for generate function * epic upstream renaming scheme fix * fix stupid indents * Other cleanup Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp * Slight indent fix * revert batch pyinstaller maker to mainline and also delete experts.txt since adjustable routing is also being removed for the PR * compact dynatemp into a single value dynatemp_range. This is a float which represents the allowed deviation from the min and max temperature when using dynatemp. Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5, then we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp would operate the same, but it would simplify usage and make it a single easy to adjust value. --------- Co-authored-by: Alexander Abushady <aabushady214@gmail.com> Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2024-01-06 11:13:16 +08:00
Concedo	427ba21e62	add stub values for usage, revert cuda malloc pool implementation (+1 squashed commits) Squashed commits: [fd4cfb44] add stub values for usage, revert cuda malloc pool implementation	2024-01-05 21:58:16 +08:00
Concedo	20261049c9	try to reuse cloudflared file	2024-01-05 18:04:09 +08:00
Concedo	234f79fe9d	Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt # ci/run.sh # llama.cpp	2024-01-03 22:33:38 +08:00
Concedo	94e68fe474	added field to show recent seed	2024-01-02 15:35:04 +08:00
Concedo	eee674045e	use native cl if found	2023-12-31 00:53:22 +08:00
Concedo	6177196052	tweak tooltips	2023-12-30 11:02:30 +08:00
Concedo	7ad92dbf4a	cleaned up the quick tab based on the suggested removals from discord members.	2023-12-30 10:41:46 +08:00
Concedo	63b65efb78	added tooltips for all items in the GUI launcher	2023-12-28 23:08:57 +08:00
Concedo	ec46661a32	wip adding tooltips	2023-12-28 15:54:22 +08:00
DebuggingLife46	e733a9e425	Add logit_bias to the OpenAI api (#577) * Add logit_bias to the OpenAI api * Cleanup and refactor, test in swagger. --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2023-12-27 00:26:19 +08:00
Concedo	c2d87b6545	increase multiuser default	2023-12-25 23:49:45 +08:00
Concedo	78a9d206d3	randomize horde genkey	2023-12-25 22:47:21 +08:00
Concedo	cc64f2cad1	Merge branch 'master' into concedo_experimental # Conflicts: # .github/ISSUE_TEMPLATE/bug.md # Makefile # README.md # ggml-cuda.cu # tests/test-grad0.cpp	2023-12-25 18:47:21 +08:00
Concedo	bd0d9039ec	better approach to multiuser check	2023-12-24 20:03:33 +08:00
Concedo	bc24c9334c	prevent prompt leakage during usage of check endpoint when genkey is provided in multiuser mode	2023-12-24 17:08:43 +08:00
Concedo	8823e8b06d	added presence penalty into lite ui	2023-12-23 10:39:40 +08:00
Concedo	852ca780c9	cherrypicked the Hipblas fixed from PR #571	2023-12-22 21:29:20 +08:00
Concedo	77463e0e9c	batch size improvements	2023-12-22 15:27:40 +08:00
Concedo	2378a29bde	better error handling, try to avoid segfault in sillytavern	2023-12-21 22:58:48 +08:00
Eugene Palmoff	a787ebe7cf	Handle broken pipe error (#572)	2023-12-21 17:51:36 +08:00
Concedo	3f863eed72	add presence penalty	2023-12-19 23:18:56 +08:00
Concedo	da2db0302c	Added support for ssl cert and key	2023-12-19 22:23:19 +08:00
Concedo	49a5dfc604	Merge branch 'master' into concedo_experimental # Conflicts: # Makefile # README.md	2023-12-19 16:07:48 +08:00
Concedo	1f77d2ad73	move multiprocessing import into function scope	2023-12-19 15:56:58 +08:00
ebolam	6948da5a0d	Fix for windows model unloading not releasing memory (#569) * Add in model processes as a separate process so it can be killed when unloading to release memory on windows * Fix from Henky	2023-12-19 15:55:41 +08:00
Concedo	ec05230703	updated lite, up ver	2023-12-17 14:38:39 +08:00
Concedo	aac7f0b944	Merge branch 'master' into concedo_experimental # Conflicts: # ggml.c	2023-12-14 17:24:42 +08:00
Concedo	f0de4953ae	fixed length exceeding max ctx	2023-12-14 16:58:41 +08:00
Concedo	0e31f53422	Revert "lowvram var defaults" This reverts commit `7a691522a6`.	2023-12-14 15:14:11 +08:00
Concedo	8dd975653d	removing existing yml files	2023-12-14 14:47:03 +08:00
Concedo	74acc5441d	Revert "Hide hipBLAS (ROCm) if CuBLAS exists - vice versa" This reverts commit `4b854d46a4`.	2023-12-12 10:53:34 +08:00
Concedo	06581f243f	perf endpoint lets you monitor if the embedded horde worker has issues	2023-12-11 16:54:42 +08:00
YellowRoseCx	4b854d46a4	Hide hipBLAS (ROCm) if CuBLAS exists - vice versa	2023-12-10 22:49:35 -06:00
Concedo	7a691522a6	lowvram var defaults	2023-12-08 21:06:32 +08:00
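
Note on commit 123bff9a0f: its message describes folding the separate minimum and maximum DynaTemp temperatures into a single dynatemp_range value centred on temperature. The arithmetic can be sketched as below. This is only an illustration of the relationship stated in the commit message, not koboldcpp's actual code; the function name dynatemp_bounds and the clamp at zero are assumptions.

```python
from typing import Tuple


def dynatemp_bounds(temperature: float, dynatemp_range: float) -> Tuple[float, float]:
    """Return the (min, max) temperatures implied by a centre value and a range.

    Hypothetical helper: the name and the clamp at 0.0 are assumptions made
    for this sketch, not taken from the koboldcpp sources.
    """
    if dynatemp_range < 0:
        raise ValueError("dynatemp_range must be non-negative")
    # The range is the allowed deviation on either side of the centre temperature.
    min_temp = max(0.0, temperature - dynatemp_range)
    max_temp = temperature + dynatemp_range
    return min_temp, max_temp


if __name__ == "__main__":
    # The example from the commit message: temperature=0.4, dynatemp_range=0.1
    # corresponds to roughly dynatemp_min=0.3 and dynatemp_max=0.5.
    print(dynatemp_bounds(0.4, 0.1))
```

With dynatemp_range set to 0 both bounds equal temperature, which matches the usual fixed-temperature behaviour.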


446 commits
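
Note on commit e00e17b3f9: its message suggests that, rather than flushing selected print statements, it may be better to route all printing through a wrapper that always flushes. A minimal sketch of such a wrapper follows. The name flushed_print is hypothetical and does not come from the koboldcpp sources; the only library feature relied on is the flush keyword of Python's built-in print.

```python
import sys


def flushed_print(*args, **kwargs):
    """Print like the built-in print, but flush the stream by default.

    Hypothetical wrapper for illustration; callers can still pass
    flush=False explicitly to opt out.
    """
    kwargs.setdefault("flush", True)
    print(*args, **kwargs)


if __name__ == "__main__":
    # Output reaches a pipe or redirected file immediately instead of
    # waiting in the stream buffer.
    flushed_print("server is listening", file=sys.stdout)
```

Setting the flag through setdefault keeps the wrapper a drop-in replacement: existing keyword arguments such as file, sep, and end pass through unchanged.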