Commit graph

294 commits

Author SHA1 Message Date
Concedo
6b8b50b350 try fix ipv6 (+1 squashed commits)
Squashed commits:

[8d95a639] try fix ipv6
2024-08-06 15:36:46 +08:00
Concedo
bfdf4b021f adjust v4-v6 allocation, default back to localhost 2024-08-04 11:42:16 +08:00
Concedo
9a0976761e use loopback ip instead of localhost 2024-08-03 00:41:32 +08:00
Concedo
bf35652ef7 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	flake.lock
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
2024-07-30 22:31:49 +08:00
Concedo
ba5babb876 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/apps.nix
#	.devops/tools.sh
#	Makefile
#	README.md
#	docs/backend/SYCL.md
#	docs/build.md
#	examples/CMakeLists.txt
#	ggml/include/ggml.h
#	src/llama-vocab.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-sampling.cpp
2024-07-27 23:15:54 +08:00
Concedo
e28c42d7f7 adjusted layer estimation 2024-07-24 21:54:49 +08:00
Concedo
b7fc8e644a fix broken template, updated lite 2024-07-24 20:47:05 +08:00
Concedo
44ef87f14c update lite, try fix ci 2024-07-24 16:31:34 +08:00
Concedo
0ecf13fc13 updated lite, extra error logging 2024-07-21 17:55:47 +08:00
Concedo
1a23d49c32 serve tags endpoint 2024-07-19 16:08:54 +08:00
Concedo
6080fa38ce updated lite 2024-07-18 15:55:45 +08:00
Concedo
eca7521c13 allowed embedded chat adapters 2024-07-17 18:08:43 +08:00
Concedo
d775a419b2 updated lite with chat inject, added layer detect, added more console logging 2024-07-16 23:10:15 +08:00
Concedo
21179d675b try ci for avx1, up ver (+2 squashed commit)
Squashed commit:

[74150175] up version

[97b6163c] try ci for avx1 linux
2024-07-15 23:07:07 +08:00
Concedo
066e7ac540 minor fixes: colab gpu backend, lite bugs, package python file with embd 2024-07-15 17:36:03 +08:00
Concedo
5caf5f9770 update lite 2024-07-13 23:36:42 +08:00
Concedo
97c7cf66fe fixed typo 2024-07-13 19:03:00 +08:00
Concedo
33c51d3987 updated lite (+1 squashed commits)
Squashed commits:

[5af323f9] updated lite
2024-07-13 16:32:49 +08:00
Concedo
116d5fe58e updated lite 2024-07-09 20:42:51 +08:00
Concedo
82202aebda updated lite, add gemma 2 template 2024-07-02 21:02:52 +08:00
Concedo
e433afb261 updated lite 2024-06-27 15:51:34 +08:00
Concedo
24bfa54f3c updated lite 2024-06-26 18:53:32 +08:00
Concedo
92afdfcae4 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/labeler.yml
#	.github/workflows/server.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README-sycl.md
#	README.md
#	llama.cpp
#	requirements/requirements-convert-hf-to-gguf-update.txt
#	requirements/requirements-convert-hf-to-gguf.txt
#	requirements/requirements-convert-legacy-llama.txt
#	scripts/sync-ggml.last
#	tests/test-tokenizer-random.py
2024-06-22 01:33:44 +08:00
Concedo
a0ecd0d8e6 update build job count, updated lite 2024-06-18 21:12:16 +08:00
Concedo
5c647419a9 updated lite 2024-06-17 21:22:37 +08:00
Concedo
e69da9c9d8 strings rename kobold lite to koboldai lite 2024-06-13 20:00:28 +08:00
Concedo
49e4c3fd7b adjust lite default port, disable double BOS warning, whisper and SD go quiet when horde mode is set too 2024-06-13 15:10:35 +08:00
Concedo
92bbebb357 Check for keyanti before accessing it 2024-06-13 12:54:23 +08:00
Concedo
562d980140 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/full-cuda.Dockerfile
#	.devops/full.Dockerfile
#	.devops/main-cuda.Dockerfile
#	.devops/main-rocm.Dockerfile
#	.devops/main-vulkan.Dockerfile
#	.devops/main.Dockerfile
#	.devops/server-cuda.Dockerfile
#	.devops/server.Dockerfile
#	README.md
#	common/CMakeLists.txt
#	grammars/README.md
#	tests/test-grammar-integration.cpp
#	tests/test-grammar-parser.cpp
#	tests/test-json-schema-to-grammar.cpp
2024-06-09 17:30:05 +08:00
Concedo
813cf829b5 allow selecting multigpu on vulkan 2024-06-06 18:36:56 +08:00
Concedo
3e4c44bace update lite 2024-06-05 11:41:56 +08:00
Concedo
10b148f4c2 added skip bos for tokenize endpoint 2024-06-05 10:49:11 +08:00
Concedo
5789417802 logit bias tokenize feature 2024-06-04 15:47:34 +08:00
Concedo
cb62ab237b updated settings menu 2024-06-03 22:06:38 +08:00
Concedo
0978806a3c fix inverted probability 2024-06-03 16:02:35 +08:00
Concedo
10a1d628ad added new binding fields for quant k and quant v 2024-06-03 14:35:59 +08:00
Concedo
b0a7d1aba6 fixed makefile (+1 squashed commits)
Squashed commits:

[ef6ddaf5] try fix makefile
2024-06-02 15:21:48 +08:00
Concedo
7ef31e541c update lite and readme 2024-06-01 23:21:40 +08:00
Concedo
cb93aa5243 updated lite 2024-06-01 18:02:44 +08:00
Concedo
10cf077848 tweaking audio capture 2024-06-01 10:56:00 +08:00
Concedo
a65e0800ab update docs, added gui for whisper 2024-06-01 02:01:49 +08:00
Concedo
318d5b87fc fixed html replace 2024-05-25 02:15:04 +08:00
Concedo
dd558d8458 impersonate user bug 2024-05-25 00:54:50 +08:00
Concedo
13d1405104 fixed sysprompt and swapped toggles 2024-05-25 00:31:58 +08:00
Concedo
f6ab54aeff updated lite, add api key masking 2024-05-23 20:00:53 +08:00
Concedo
cb872f7d9b change gemini default 2024-05-22 21:58:47 +08:00
Concedo
ff5743100e updated lite 2024-05-21 22:11:52 +08:00
Concedo
6c3fe1a17f big clamp 2024-05-19 22:24:38 +08:00
Concedo
4b664b3409 improved EOT handling 2024-05-19 22:04:51 +08:00
Concedo
d5d5dda02b Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package.nix
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	ggml-cuda.cu
#	tests/test-backend-ops.cpp
2024-05-19 17:55:20 +08:00