Concedo
66134bb36e
ui for loading SD models done
2024-02-29 17:08:22 +08:00
Concedo
524ba12abd
refactor - do not use a copy buffer to store generation outputs, instead return a cpp allocated ptr
2024-02-29 14:02:20 +08:00
Concedo
f75e479db0
WIP on sdcpp integration
2024-02-29 00:40:07 +08:00
Concedo
ad638285de
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# Makefile
# README.md
# flake.lock
# ggml-cuda.cu
# llama.cpp
# tests/test-backend-ops.cpp
# tests/test-quantize-fns.cpp
2024-02-28 13:41:35 +08:00
Concedo
71898cf728
unlock custom contextsize
2024-02-27 18:10:43 +08:00
Concedo
39ae58ef0d
fix tooltip glitch
2024-02-26 11:35:58 +08:00
YellowRoseCx
7b85917827
add additional tooltips (#710)
2024-02-26 11:15:57 +08:00
Concedo
a6ba735b07
up version for 1.59.1 makefile changes
2024-02-26 10:40:12 +08:00
Concedo
a6aff3fba0
fix typo
2024-02-25 19:40:40 +08:00
Concedo
1bcbd2e21b
updated lite
2024-02-24 17:59:44 +08:00
Concedo
f3a0e05d91
added noavx2 vulkan
2024-02-22 16:56:25 +08:00
Concedo
2d71256d21
try to make prints flush
2024-02-21 17:16:49 +08:00
Concedo
6181b46eef
added nocertify mode
2024-02-19 16:05:17 +08:00
Concedo
db0834593b
hide smartcontext toggle when contextshift toggle is on
2024-02-18 14:09:07 +08:00
Concedo
e8e86ecf9f
fixed SSL not working with streaming
2024-02-16 17:04:07 +08:00
Concedo
7eccc5ffa6
change listen count, fix null
2024-02-16 16:01:24 +08:00
Concedo
39f8cbd1f3
send done for textcompletions too
2024-02-13 23:39:00 +08:00
Concedo
99b7cf7d1c
stream switch to LF newline
2024-02-13 23:12:03 +08:00
Concedo
3cec37c2e0
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# .flake8
# .github/workflows/python-lint.yml
# flake.lock
# ggml-cuda.cu
# ggml-quants.c
# llama.cpp
# pocs/vdot/q8dot.cpp
# pocs/vdot/vdot.cpp
# tests/test-quantize-fns.cpp
# tests/test-quantize-perf.cpp
2024-02-13 00:14:22 +08:00
Concedo
603fe941c1
increase cloudflared check size
2024-02-12 17:19:58 +08:00
Concedo
6f3196ad8e
fix benchmark line
2024-02-10 21:49:14 +08:00
Concedo
c3d1a7d123
benchmark coherence fix
2024-02-09 19:03:48 +08:00
Concedo
35111ce01a
row split mode is now a toggle
2024-02-09 18:35:58 +08:00
Concedo
d1aff0e964
benchmark only save under 1mb
2024-02-09 15:40:29 +08:00
Concedo
992eea71d7
fixes for vulkan multigpu
2024-02-09 14:42:27 +08:00
Concedo
fe424a5466
tensor split active text
2024-02-09 12:02:23 +08:00
Concedo
4cd571db89
vulkan multigpu, show uptime
2024-02-08 16:54:38 +08:00
Concedo
de7be2f4e0
benchmarker done
2024-02-07 22:04:53 +08:00
Concedo
5cd9b1d23a
placeholder for benchmark
2024-02-06 21:48:07 +08:00
Concedo
f43667f499
runmode untouched fix
2024-02-05 21:52:33 +08:00
Concedo
330921db15
runmode untouched fix
2024-02-05 20:26:08 +08:00
Alexander Abushady
4cb956c7db
Quadratic Sampling UI (#652)
...
* Quadratic Sampling UI
Kalomaze's Quadratic Sampling now has a UI within KCPP.
* remove debug prints
* cleanup, add smooth sampler to dynatemp
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-02-04 16:26:27 +08:00
Concedo
d229150d28
Checkpoint to test for speed
2024-01-31 22:26:33 +08:00
Concedo
340fbbbb04
show warning if genamt >= ctxsize, show t/s values
2024-01-31 18:51:42 +08:00
Concedo
916780eaf4
fixed a bug with stop seq processing
2024-01-31 15:16:08 +08:00
Ira Peach
e00e17b3f9
Flush STDOUT when server starts listening. (#651)
...
This works around a Win32 issue when piping output from a PyInstaller
context, such as when doing so in a perl script or to an output file.
Print statements from a Python context don't properly get output unless
flushed.
This strategically flushes the print statements so no information is
lost, though it may be better to flush all print statements in a Python
context via a subroutine wrapper.
See also:
https://mail.python.org/pipermail/python-bugs-list/2004-August/024923.html
https://stackoverflow.com/a/466849
https://stackoverflow.com/q/62693079
2024-01-31 14:40:45 +08:00
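The commit message for e00e17b3f9 notes that flushing every print statement through a wrapper might be cleaner than flushing at selected points. A minimal sketch of that idea follows. The helper name `flushed_print` is hypothetical and does not come from the repository; it relies only on the standard `flush` keyword of Python's built-in `print`.

```python
import sys


def flushed_print(*args, **kwargs):
    """Print like the built-in print(), but flush the stream by default.

    Hypothetical helper for illustration only. Callers can still pass
    flush=False explicitly to opt out.
    """
    kwargs.setdefault("flush", True)
    print(*args, **kwargs)


if __name__ == "__main__":
    # When stdout is piped (for example to a file or another process),
    # the message is pushed out immediately instead of sitting in a buffer.
    flushed_print("server is listening")
    flushed_print("diagnostic message", file=sys.stderr)
```

Routing all output through one function like this keeps the flushing policy in a single place, at the cost of replacing each existing `print` call.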
Concedo
f81404e33c
updated class py, added imatrix
2024-01-28 22:37:11 +08:00
Concedo
c2e497ccfb
deferred aborting for queued generations
2024-01-28 14:24:15 +08:00
Concedo
61ca3a0d30
show total of 8 backends
2024-01-27 17:05:33 +08:00
Concedo
87d852b85c
get gpu names with vulkaninfo
2024-01-26 12:58:30 +08:00
Concedo
2a4a7241e6
Merge branch 'vulkan_test' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# llama.cpp
2024-01-25 23:01:44 +08:00
Concedo
346c1a97de
fixed file select cancel, updated lite
2024-01-24 16:36:53 +08:00
Concedo
0f6fa6be93
try adding other fallback backends for linux
2024-01-23 23:37:56 +08:00
Concedo
a4ed5c6471
added 48k ctx option
2024-01-23 17:27:02 +08:00
Concedo
08236ccc97
better abort handling, added support for dynatemp exponent
2024-01-23 16:56:12 +08:00
Concedo
dc7bc0cb50
Merge commit '584d674be6' into concedo_experimental
...
# Conflicts:
# .github/workflows/nix-flake-update.yml
# Makefile
# Package.swift
# ggml-cuda.cu
# tests/test-quantize-fns.cpp
2024-01-14 16:29:44 +08:00
kalomaze
bd77a48037
Do not default to Repetition Penalty 1.1 (#615)
...
* Do not default to Repetition Penalty
* apply all known aliases for repetition penalty when using the OAI endpoint. rep pen defaults to 1, range to 256
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-13 22:20:02 +08:00
Concedo
b9ad08af19
improved dynatemp wizard
2024-01-11 11:26:14 +08:00
Concedo
5cc64ebb52
dynatemp wizard
2024-01-09 15:51:32 +08:00
Concedo
550829ed98
dont get stuck if cloudflared failed to download correctly
2024-01-08 21:11:17 +08:00