Commit graph

538 commits

Author SHA1 Message Date
Concedo
040de7d899 try add tunnels for macos 2024-03-01 17:52:09 +08:00
Concedo
55af5446ad Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
#	ci/run.sh
#	llama.cpp
#	scripts/sync-ggml.last
2024-03-01 17:41:37 +08:00
Concedo
e5861e993d fix benchmark 2024-03-01 16:54:25 +08:00
Concedo
80011ed8aa KCPP SD: add warn and step restriction., updated lite, handle quant mode 2024-03-01 16:41:19 +08:00
Concedo
3463688a0e image generation is fully working over api (+1 squashed commits)
Squashed commits:

[c98ab0b4] single image generation is working now
2024-03-01 14:43:44 +08:00
Concedo
e8f4d7b3da added model and config endpoints for sdcpp, added more samplers. speed is still not good 2024-02-29 22:56:09 +08:00
Concedo
5a44d4de2b refactor and clean identifiers for sd, fix cmake 2024-02-29 18:28:45 +08:00
Concedo
66134bb36e ui for loading SD models done 2024-02-29 17:08:22 +08:00
Concedo
524ba12abd refactor - do not use a copy buffer to store generation outputs, instead return a cpp allocated ptr 2024-02-29 14:02:20 +08:00
Concedo
f75e479db0 WIP on sdcpp integration 2024-02-29 00:40:07 +08:00
Concedo
ad638285de Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	README.md
#	flake.lock
#	ggml-cuda.cu
#	llama.cpp
#	tests/test-backend-ops.cpp
#	tests/test-quantize-fns.cpp
2024-02-28 13:41:35 +08:00
Concedo
71898cf728 unlock custom contextsize 2024-02-27 18:10:43 +08:00
Concedo
39ae58ef0d fix tooltip glitch 2024-02-26 11:35:58 +08:00
YellowRoseCx
7b85917827 add additional tooltips (#710) 2024-02-26 11:15:57 +08:00
Concedo
a6ba735b07 up version for 1.59.1 makefile changes 2024-02-26 10:40:12 +08:00
Concedo
a6aff3fba0 fix typo 2024-02-25 19:40:40 +08:00
Concedo
1bcbd2e21b updated lite 2024-02-24 17:59:44 +08:00
Concedo
f3a0e05d91 added noavx2 vulkan 2024-02-22 16:56:25 +08:00
Concedo
2d71256d21 try to make prints flush 2024-02-21 17:16:49 +08:00
Concedo
6181b46eef added nocertify mode 2024-02-19 16:05:17 +08:00
Concedo
db0834593b hide smartconext toggle when contextshift toggle is on 2024-02-18 14:09:07 +08:00
Concedo
e8e86ecf9f fixed SSL not working with streaming 2024-02-16 17:04:07 +08:00
Concedo
7eccc5ffa6 change listen count, fix null 2024-02-16 16:01:24 +08:00
Concedo
39f8cbd1f3 send done for textcompletions too 2024-02-13 23:39:00 +08:00
Concedo
99b7cf7d1c stream switch to LF newline 2024-02-13 23:12:03 +08:00
Concedo
3cec37c2e0 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.flake8
#	.github/workflows/python-lint.yml
#	flake.lock
#	ggml-cuda.cu
#	ggml-quants.c
#	llama.cpp
#	pocs/vdot/q8dot.cpp
#	pocs/vdot/vdot.cpp
#	tests/test-quantize-fns.cpp
#	tests/test-quantize-perf.cpp
2024-02-13 00:14:22 +08:00
Concedo
603fe941c1 increase cloudflared check size 2024-02-12 17:19:58 +08:00
Concedo
6f3196ad8e fix benchmark line 2024-02-10 21:49:14 +08:00
Concedo
c3d1a7d123 benchmark coherence fix 2024-02-09 19:03:48 +08:00
Concedo
35111ce01a row split mode is now a toggle 2024-02-09 18:35:58 +08:00
Concedo
d1aff0e964 benchmark only save under 1mb 2024-02-09 15:40:29 +08:00
Concedo
992eea71d7 fixes for vulkan multigpu 2024-02-09 14:42:27 +08:00
Concedo
fe424a5466 tensor split active text 2024-02-09 12:02:23 +08:00
Concedo
4cd571db89 vulkan multigpu, show uptime 2024-02-08 16:54:38 +08:00
Concedo
de7be2f4e0 benchmarker done 2024-02-07 22:04:53 +08:00
Concedo
5cd9b1d23a placeholder for benchmark 2024-02-06 21:48:07 +08:00
Concedo
f43667f499 runmode untouched fix 2024-02-05 21:52:33 +08:00
Concedo
330921db15 runmode untouched fix 2024-02-05 20:26:08 +08:00
Alexander Abushady
4cb956c7db Quadratic Sampling UI (#652)
* Quadratic Sampling UI

Kalomaze's Quadratic Sampling, now has a UI within KCPP.

* remove debug prints

* cleanup, add smooth sampler to dynatemp

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-02-04 16:26:27 +08:00
Concedo
d229150d28 Checkpoint to test for speed 2024-01-31 22:26:33 +08:00
Concedo
340fbbbb04 show warning if genamt >= ctxsize, show t/s values 2024-01-31 18:51:42 +08:00
Concedo
916780eaf4 fixed a bug with stop seq processing 2024-01-31 15:16:08 +08:00
Ira Peach
e00e17b3f9 Flush STDOUT when server starts listening. (#651)
This works around a Win32 issue when piping output from a PyInstaller
context, such as when doing so in a perl script or to an output file.
Print statements from a Python context don't properly get output unless
flushed.

This strategically flushes the print statements so no information is
lost, though it may be better to flush all print statements in a Python
context via a subroutine wrapper.

See also:

    https://mail.python.org/pipermail/python-bugs-list/2004-August/024923.html
    https://stackoverflow.com/a/466849
    https://stackoverflow.com/q/62693079
2024-01-31 14:40:45 +08:00
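The "subroutine wrapper" idea raised in the message of e00e17b3f9 could look roughly like the sketch below. This is not the code from that commit: `flush_print` and `announce_listening` are hypothetical names chosen for illustration, and the host and port values are placeholders. The only library behaviour relied on is the `flush` keyword of Python's built-in `print`.

```python
import functools

# Hypothetical helper (not from the commit): a print variant that always
# flushes, so output reaches a pipe or redirected file immediately instead
# of sitting in a buffer.
flush_print = functools.partial(print, flush=True)


def announce_listening(host, port, stream=None):
    """Write a status line and flush it right away.

    `stream` defaults to None, which makes print() fall back to sys.stdout.
    """
    flush_print(f"Listening on {host}:{port}", file=stream)


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    announce_listening("localhost", 5001)
```

Routing status messages through one helper like this keeps the flushing policy in a single place, which is the trade-off the commit message hints at when it compares flushing selected prints with wrapping all of them.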
Concedo
f81404e33c updated class py, added imatrix 2024-01-28 22:37:11 +08:00
Concedo
c2e497ccfb deferred aborting for queued generations 2024-01-28 14:24:15 +08:00
Concedo
61ca3a0d30 show total of 8 backends 2024-01-27 17:05:33 +08:00
Concedo
87d852b85c get gpu names with vulkaninfo 2024-01-26 12:58:30 +08:00
Concedo
2a4a7241e6 Merge branch 'vulkan_test' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	llama.cpp
2024-01-25 23:01:44 +08:00
Concedo
346c1a97de fixed file select cancel, updated lite 2024-01-24 16:36:53 +08:00
Concedo
0f6fa6be93 try adding other fallback backends for linux 2024-01-23 23:37:56 +08:00