Commit graph

565 commits

Author SHA1 Message Date
Concedo
73b99a7266 add premade chat completions adapter 2024-06-27 00:13:06 +08:00
Concedo
e42bc5d677 add negative prompt support to chat completions adapter 2024-06-26 11:12:24 +08:00
Concedo
151ff95a67 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
#	ggml-cuda.cu
#	ggml-cuda/common.cuh
2024-06-25 19:25:14 +08:00
Concedo
13398477a1 fix ubatch, autoselect vulkan dgpu if possible 2024-06-22 00:23:46 +08:00
Nexesenex
153527745b Augmented benchmark stats (#929)
* Augmented benchmark stats v1

* output instead of coherence

* populate bench flags as a flags field instead of multiple lines

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-06-18 21:30:36 +08:00
Concedo
ba9ef4d01b fix to allow clblast to work even after blas backend splitoff 2024-06-17 15:02:55 +08:00
Concedo
623390e4ab allow sdui when img model not loaded, allow sdclamped to provide a custom clamp size (+1 squashed commits)
Squashed commits:

[957c9c9c] allow sdui when img model not loaded, allow sdclamped to provide a custom clamp size
2024-06-14 16:58:50 +08:00
Concedo
e69da9c9d8 strings rename kobold lite to koboldai lite 2024-06-13 20:00:28 +08:00
Concedo
49e4c3fd7b adjust lite default port, disable double BOS warning, whisper and SD go quiet when horde mode is set too 2024-06-13 15:10:35 +08:00
Concedo
02357eadf8 Merge commit '7672adeec7' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	kompute-shaders/op_rope_f16.comp
#	kompute-shaders/op_rope_f32.comp
#	kompute-shaders/rope_common.comp
#	tests/test-backend-ops.cpp
#	tests/test-grad0.cpp
#	tests/test-rope.cpp
2024-06-09 15:35:51 +08:00
Concedo
813cf829b5 allow selecting multigpu on vulkan 2024-06-06 18:36:56 +08:00
Concedo
10b148f4c2 added skip bos for tokenize endpoint 2024-06-05 10:49:11 +08:00
Concedo
a541a3d509 quantkv will not trigger if fa is off or ctx shift is on 2024-06-03 19:14:22 +08:00
Concedo
efee37a708 gui for quantkv 2024-06-03 18:25:57 +08:00
Concedo
10a1d628ad added new binding fields for quant k and quant v 2024-06-03 14:35:59 +08:00
Concedo
267ee78651 change max payload to 32mb 2024-06-02 16:44:19 +08:00
Concedo
b0a7d1aba6 fixed makefile (+1 squashed commits)
Squashed commits:

[ef6ddaf5] try fix makefile
2024-06-02 15:21:48 +08:00
Concedo
9e64f0b5af added whisper file upload mode 2024-06-02 12:04:56 +08:00
Concedo
a65e0800ab update docs, added gui for whisper 2024-06-01 02:01:49 +08:00
Concedo
961c789c91 wav file resampling 2024-05-30 13:41:58 +08:00
Concedo
62ab344b1e transcribe api is functional 2024-05-30 00:07:53 +08:00
Concedo
f24aef8792 initial whisper integration 2024-05-29 23:13:11 +08:00
Concedo
dd59303ae1 Merge branch 'concedo_experimental' of https://github.com/LostRuins/koboldcpp into concedo_experimental 2024-05-28 18:25:13 +08:00
Concedo
38d4d743bb add flash attn and quiet mode to quick launch 2024-05-28 18:25:00 +08:00
jojorne
dc53e30785 Why not search for cuda_path as well? (#865)
Let's add dll directory for cuda on Windows too.
2024-05-27 21:38:17 +08:00
Concedo
27e784a42d up ver 2024-05-25 00:03:22 +08:00
Concedo
fac6373b13 fix tools 2024-05-24 23:50:08 +08:00
Concedo
09adfa70ad limit default threads to max 8 to deal with ecores 2024-05-22 14:47:57 +08:00
Concedo
618e60c279 model download if its a url 2024-05-21 18:56:11 +08:00
Concedo
2cbf39cba2 disable ui resize on macos 2024-05-17 15:56:10 +08:00
Concedo
1db3421c52 multiple minor fixes 2024-05-17 15:47:53 +08:00
Concedo
6d9d846bdd prevent mixing lora and quant 2024-05-16 00:29:03 +08:00
Concedo
08993696c3 try apply lora on load 2024-05-15 22:53:23 +08:00
Concedo
44443edfda rep pen slope works (+1 squashed commits)
Squashed commits:

[535ad566] experiment with rep pen range
2024-05-15 17:20:57 +08:00
Concedo
5ce2fdad24 taesd for sdxl, add lora loading done 2024-05-14 23:02:56 +08:00
Concedo
2ee808a747 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	README.md
#	ci/run.sh
#	llama.cpp
#	models/ggml-vocab-llama-bpe.gguf.inp
#	models/ggml-vocab-llama-bpe.gguf.out
#	requirements.txt
#	scripts/compare-llama-bench.py
#	scripts/sync-ggml.last
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-tokenizer-1-bpe.cpp
2024-05-14 19:28:47 +08:00
Concedo
5d15f8f76a vae test 2024-05-14 19:17:01 +08:00
Concedo
4807b66907 wip sd 2024-05-13 23:23:16 +08:00
Concedo
bd95ee7d9a temporary version for archiving 2024-05-13 21:53:58 +08:00
Concedo
d8a52321da ditched the coherent flag 2024-05-13 20:38:51 +08:00
Concedo
f4746572d9 wildcare sdui url 2024-05-12 11:09:59 +08:00
Concedo
eff01660e4 re-added smart context due to people complaining 2024-05-11 17:25:03 +08:00
Concedo
702be65ed1 don't show embedded sdui if no model 2024-05-11 08:56:56 +08:00
Concedo
1effe16861 fixed horde worker flag 2024-05-11 01:17:04 +08:00
Concedo
7967377ebc fix for sdui showing when sdmodel not loaded, and not showing when remote tunnel is used. 2024-05-10 23:40:20 +08:00
Concedo
69570daf31 tidy argparse 2024-05-10 17:28:08 +08:00
Concedo
dbe72b959e tidy up and refactor code to support old flags 2024-05-10 16:50:53 +08:00
Concedo
eccc2ddca2 better warnings 2024-05-10 11:27:40 +08:00
Concedo
6f23ca24fb deprecated some old flags 2024-05-10 10:57:52 +08:00
Concedo
868446bd1a replace sdconfig and hordeconfig 2024-05-09 22:43:50 +08:00