Ira Peach
e00e17b3f9
Flush STDOUT when server starts listening. (#651)
...
This works around a Win32 issue when piping output from a PyInstaller
context, such as when doing so in a perl script or to an output file.
Print statements from a Python context don't properly get output unless
flushed.
This strategically flushes the print statements so no information is
lost, though it may be better to flush all print statements in a Python
context via a subroutine wrapper.
See also:
https://mail.python.org/pipermail/python-bugs-list/2004-August/024923.html
https://stackoverflow.com/a/466849
https://stackoverflow.com/q/62693079
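The workaround described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `announce_listening` and `flushed_print` are hypothetical names, and the banner text is an assumption. The second helper shows the "subroutine wrapper" idea the message mentions, using the standard `flush=True` argument of `print`.

```python
import functools
import sys

# Hypothetical wrapper (not from the commit): a print variant that always
# flushes, so piped readers never wait on Python's output buffer.
flushed_print = functools.partial(print, flush=True)


def announce_listening(host, port, stream=None):
    """Print a startup banner, then flush the stream explicitly.

    The banner wording is an assumption made for illustration only.
    """
    stream = sys.stdout if stream is None else stream
    print(f"Listening on http://{host}:{port}", file=stream)
    # Explicit flush: when stdout is a pipe or file, output is block
    # buffered and may otherwise not appear until the buffer fills.
    stream.flush()
```

Calling `announce_listening("localhost", 5001)` writes the line and flushes immediately; `flushed_print("ready")` does the same in one call.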
2024-01-31 14:40:45 +08:00
Concedo
f81404e33c
updated class py, added imatrix
2024-01-28 22:37:11 +08:00
Concedo
c2e497ccfb
deferred aborting for queued generations
2024-01-28 14:24:15 +08:00
Concedo
61ca3a0d30
show total of 8 backends
2024-01-27 17:05:33 +08:00
Concedo
87d852b85c
get gpu names with vulkaninfo
2024-01-26 12:58:30 +08:00
Concedo
2a4a7241e6
Merge branch 'vulkan_test' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# llama.cpp
2024-01-25 23:01:44 +08:00
Concedo
346c1a97de
fixed file select cancel, updated lite
2024-01-24 16:36:53 +08:00
Concedo
0f6fa6be93
try adding other fallback backends for linux
2024-01-23 23:37:56 +08:00
Concedo
a4ed5c6471
added 48k ctx option
2024-01-23 17:27:02 +08:00
Concedo
08236ccc97
better abort handling, added support for dynatemp exponent
2024-01-23 16:56:12 +08:00
Concedo
dc7bc0cb50
Merge commit '584d674be6' into concedo_experimental
...
# Conflicts:
# .github/workflows/nix-flake-update.yml
# Makefile
# Package.swift
# ggml-cuda.cu
# tests/test-quantize-fns.cpp
2024-01-14 16:29:44 +08:00
kalomaze
bd77a48037
Do not default to Repetition Penalty 1.1 (#615)
...
* Do not default to Repetition Penalty
* apply all known aliases for repetition penalty when using the OAI endpoint. rep pen defaults to 1, range to 256
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-13 22:20:02 +08:00
Concedo
b9ad08af19
improved dynatemp wizard
2024-01-11 11:26:14 +08:00
Concedo
5cc64ebb52
dynatemp wizard
2024-01-09 15:51:32 +08:00
Concedo
550829ed98
don't get stuck if cloudflared failed to download correctly
2024-01-08 21:11:17 +08:00
kalomaze
123bff9a0f
Full DynaTemp implementation + UI (#600)
...
* move Dynatemp changes to new branch
* fix float header
* Properly reintroduce variable expert count
Controllable through experts.txt
* first pass at DynaTemp UI
Checkbox partial implemented, Min and Max Temp implemented
* DynaTemp UI Checkbox
Trigger DynaTemp on checkbox
* DynaTemp UI checkbox edition
Hell Yeah! DynaTemp!
* Remove greedy dynatemp
* Fix race condition caused by debug print
* Fixed broken presets and miro
Fixes broken presets and mirostat
* Remove debug function + HHI temp
Also removed unnecessary softmax double precision
* Fix whitespace (?) for generate function
* epic upstream renaming scheme fix
* fix stupid indents
* Other cleanup
Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp
* Slight indent fix
* revert batch pyinstaller maker to mainline
and also delete experts.txt since adjustable routing is also being removed for the PR
* compact dynatemp into a single value dynatemp_range. This is a float which represents the allowed deviation from the min and max temperature when using dynatemp. Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5, then we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp would operate the same, but it would simplify usage and make it a single easy to adjust value.
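The arithmetic in the last bullet can be illustrated with a short sketch. `dynatemp_bounds` is a hypothetical helper written for this illustration, not a function from the project, and clamping the lower bound at zero is an added assumption.

```python
from typing import Tuple


def dynatemp_bounds(temperature: float, dynatemp_range: float) -> Tuple[float, float]:
    """Return (min_temp, max_temp) implied by a centre temperature and a range.

    Hypothetical helper: the lower bound is clamped at 0.0 as an assumption,
    since a negative temperature would not be meaningful.
    """
    min_temp = max(0.0, temperature - dynatemp_range)
    max_temp = temperature + dynatemp_range
    return min_temp, max_temp


# Example from the commit message: temperature=0.4, dynatemp_range=0.1
# corresponds to roughly dynatemp_min=0.3 and dynatemp_max=0.5
# (up to floating-point rounding).
low, high = dynatemp_bounds(0.4, 0.1)
```

A single range value keeps the setting symmetric around the chosen temperature, which is the simplification the bullet describes.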
---------
Co-authored-by: Alexander Abushady <aabushady214@gmail.com>
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-06 11:13:16 +08:00
Concedo
427ba21e62
add stub values for usage, revert cuda malloc pool implementation (+1 squashed commits)
...
Squashed commits:
[fd4cfb44] add stub values for usage, revert cuda malloc pool implementation
2024-01-05 21:58:16 +08:00
Concedo
20261049c9
try to reuse cloudflared file
2024-01-05 18:04:09 +08:00
Concedo
234f79fe9d
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# ci/run.sh
# llama.cpp
2024-01-03 22:33:38 +08:00
Concedo
94e68fe474
added field to show recent seed
2024-01-02 15:35:04 +08:00
Concedo
eee674045e
use native cl if found
2023-12-31 00:53:22 +08:00
Concedo
6177196052
tweak tooltips
2023-12-30 11:02:30 +08:00
Concedo
7ad92dbf4a
cleaned up the quick tab based on the suggested removals from discord members.
2023-12-30 10:41:46 +08:00
Concedo
63b65efb78
added tooltips for all items in the GUI launcher
2023-12-28 23:08:57 +08:00
Concedo
ec46661a32
wip adding tooltips
2023-12-28 15:54:22 +08:00
DebuggingLife46
e733a9e425
Add logit_bias to the OpenAI api (#577)
...
* Add logit_bias to the OpenAI api
* Cleanup and refactor, test in swagger.
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-12-27 00:26:19 +08:00
Concedo
c2d87b6545
increase multiuser default
2023-12-25 23:49:45 +08:00
Concedo
78a9d206d3
randomize horde genkey
2023-12-25 22:47:21 +08:00
Concedo
cc64f2cad1
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# .github/ISSUE_TEMPLATE/bug.md
# Makefile
# README.md
# ggml-cuda.cu
# tests/test-grad0.cpp
2023-12-25 18:47:21 +08:00
Concedo
bd0d9039ec
better approach to multiuser check
2023-12-24 20:03:33 +08:00
Concedo
bc24c9334c
prevent prompt leakage during usage of check endpoint when genkey is provided in multiuser mode
2023-12-24 17:08:43 +08:00
Concedo
8823e8b06d
added presence penalty into lite ui
2023-12-23 10:39:40 +08:00
Concedo
852ca780c9
cherrypicked the Hipblas fixes from PR #571
2023-12-22 21:29:20 +08:00
Concedo
77463e0e9c
batch size improvements
2023-12-22 15:27:40 +08:00
Concedo
2378a29bde
better error handling, try to avoid segfault in sillytavern
2023-12-21 22:58:48 +08:00
Eugene Palmoff
a787ebe7cf
Handle broken pipe error (#572)
2023-12-21 17:51:36 +08:00
Concedo
3f863eed72
add presence penalty
2023-12-19 23:18:56 +08:00
Concedo
da2db0302c
Added support for ssl cert and key
2023-12-19 22:23:19 +08:00
Concedo
49a5dfc604
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# Makefile
# README.md
2023-12-19 16:07:48 +08:00
Concedo
1f77d2ad73
move multiprocessing import into function scope
2023-12-19 15:56:58 +08:00
ebolam
6948da5a0d
Fix for windows model unloading not releasing memory (#569)
...
* Add in model processes as a separate process so it can be killed when unloading to release memory on windows
* Fix from Henky
2023-12-19 15:55:41 +08:00
Concedo
ec05230703
updated lite, up ver
2023-12-17 14:38:39 +08:00
Concedo
aac7f0b944
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# ggml.c
2023-12-14 17:24:42 +08:00
Concedo
f0de4953ae
fixed length exceeding max ctx
2023-12-14 16:58:41 +08:00
Concedo
0e31f53422
Revert "lowvram var defaults"
...
This reverts commit 7a691522a6.
2023-12-14 15:14:11 +08:00
Concedo
8dd975653d
removing existing yml files
2023-12-14 14:47:03 +08:00
Concedo
74acc5441d
Revert "Hide hipBLAS (ROCm) if CuBLAS exists - vice versa"
...
This reverts commit 4b854d46a4.
2023-12-12 10:53:34 +08:00
Concedo
06581f243f
perf endpoint lets you monitor if the embedded horde worker has issues
2023-12-11 16:54:42 +08:00
YellowRoseCx
4b854d46a4
Hide hipBLAS (ROCm) if CuBLAS exists - vice versa
2023-12-10 22:49:35 -06:00
Concedo
7a691522a6
lowvram var defaults
2023-12-08 21:06:32 +08:00