Commit graph

333 commits

Author SHA1 Message Date
Concedo
87c8e2131b testing removed an assert 2024-11-08 11:22:50 +08:00
Concedo
859ec03cd0 updated lite 2024-11-06 22:37:35 +08:00
Concedo
3cfc4dc581 avoid euler a for flux (+4 squashed commit)
Squashed commit:

[5a4b72385] fix cuda build

[5f969a645] add vulkan information

[6849e7398] fixed flux

[740e80419] update readme
2024-11-05 22:50:14 +08:00
Concedo
bc30ebd044 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	Makefile
#	README.md
#	examples/CMakeLists.txt
#	examples/main/README.md
#	ggml/src/CMakeLists.txt
#	ggml/src/kompute-shaders/common.comp
#	scripts/sync-ggml.last
#	src/llama.cpp
2024-11-02 21:57:29 +08:00
Concedo
2a07f2dc2c minor fix 2024-11-01 22:42:57 +08:00
Concedo
6a27003a06 logprobs feature completed 2024-11-01 15:24:07 +08:00
Concedo
f7406dfdb1 updated lite 2024-11-01 01:13:15 +08:00
Concedo
aa26a58085 added logprobs api and logprobs viewer 2024-11-01 00:22:15 +08:00
Concedo
d0a6a52855 hide flash attention in quick launch for vulkan, updated lite 2024-10-24 22:00:09 +08:00
Concedo
5ad826b82a updated lite (+2 squashed commit)
Squashed commit:

[31a99e1f] bump baned phrase a bit more again

[c999736b] small fix
2024-10-11 11:05:04 +08:00
Concedo
fe5479f286 unify antislop and token bans 2024-10-10 18:21:07 +08:00
Concedo
2ed4ebc687 updated lite 2024-10-10 01:10:31 +08:00
Concedo
f78f8d3d45 wip anti slop 2024-10-07 23:18:13 +08:00
Concedo
ea55f69dc1 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.dockerignore
#	.github/workflows/build.yml
#	.github/workflows/docker.yml
#	Makefile
#	README.md
#	examples/infill/infill.cpp
#	examples/perplexity/perplexity.cpp
#	examples/server/README.md
#	examples/speculative/speculative.cpp
#	flake.lock
#	ggml/src/CMakeLists.txt
#	scripts/sync-ggml.last
#	tests/test-backend-ops.cpp
#	tests/test-sampling.cpp
2024-09-27 11:21:28 +08:00
Concedo
c38d1ecc8d update templates, fix rwkv 2024-09-22 01:32:12 +08:00
Concedo
cd1a52a29e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	scripts/sync-ggml.last
#	tests/test-backend-ops.cpp
#	tests/test-grad0.cpp
2024-09-21 11:23:54 +08:00
Concedo
68aee56498 updated lite, allow GUI to import launcher args and config files with showgui 2024-09-20 17:52:52 +08:00
Concedo
e958b2f78b updated lite 2024-09-17 00:16:21 +08:00
Concedo
01c7d82185 updated lite 2024-09-14 11:34:16 +08:00
Concedo
5b49c3a878 update lite 2024-09-13 22:05:59 +08:00
Concedo
0fd85c3940 updated lite 2024-09-13 15:51:59 +08:00
Concedo
5b658ab6d4 updated lite 2024-09-12 10:47:47 +08:00
Concedo
12fd16bfd4 Merge commit 'df270ef745' into concedo_experimental
# Conflicts:
#	Makefile
#	common/CMakeLists.txt
#	common/common.h
#	common/sampling.cpp
#	common/sampling.h
#	examples/infill/infill.cpp
#	examples/llama-bench/llama-bench.cpp
#	examples/quantize-stats/quantize-stats.cpp
#	examples/server/server.cpp
#	include/llama.h
#	src/llama-sampling.cpp
#	src/llama-sampling.h
#	src/llama.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-grammar-parser.cpp
#	tests/test-json-schema-to-grammar.cpp
#	tests/test-llama-grammar.cpp
#	tests/test-sampling.cpp
2024-09-09 17:10:08 +08:00
Concedo
2e74bd0327 updated lite, added compile flag fix 2024-09-07 10:27:37 +08:00
Concedo
73dca7e5bc Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/full-cuda.Dockerfile
#	.devops/nix/devshells.nix
#	.devops/nix/nixpkgs-instances.nix
#	.devops/nix/package.nix
#	.devops/nix/scope.nix
#	README.md
#	docs/docker.md
#	examples/llama-bench/llama-bench.cpp
#	flake.lock
#	flake.nix
#	grammars/README.md
#	src/llama.cpp
2024-09-06 01:07:31 +08:00
Concedo
d777995991 able to handle kcpp protected model name endpoints 2024-09-04 16:26:28 +08:00
Concedo
b6f9aaa9ab Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	docs/backend/SYCL.md
2024-08-30 20:25:34 +08:00
Concedo
a22a666f3b allow online apis when in nomodel mode 2024-08-27 15:30:54 +08:00
Concedo
97aa8648ed allow launching with no models loaded 2024-08-25 23:57:32 +08:00
Concedo
5bf527a6ae added xtc sampler 2024-08-21 23:57:15 +08:00
Concedo
cd69ab218e fixed DRY 2024-08-21 17:01:28 +08:00
Concedo
7ee359a59b on multigpu setups, pick lowest free mem instead of highest for auto layers 2024-08-20 19:02:16 +08:00
Concedo
b3b00750b7 update lite 2024-08-18 18:23:21 +08:00
Concedo
e9eb6fe51a move chat compl to models tab 2024-08-18 14:56:10 +08:00
Concedo
98dff80b9c update lite 2024-08-18 12:00:06 +08:00
Concedo
d71b5477c5 update lite, cleanup, fix interrogate format 2024-08-18 00:48:53 +08:00
Concedo
1edf83761a Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/bench.yml.disabled
#	Makefile
#	README.md
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-vulkan.cpp
2024-08-17 16:21:14 +08:00
Concedo
e8de0af3ec Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/bench.yml
#	.github/workflows/build.yml
#	.github/workflows/python-check-requirements.yml
#	README.md
#	docs/backend/SYCL.md
#	flake.lock
#	ggml/CMakeLists.txt
#	ggml/src/kompute-shaders/op_rope_f16.comp
#	ggml/src/kompute-shaders/op_rope_f32.comp
#	ggml/src/kompute-shaders/rope_common.comp
2024-08-14 22:25:43 +08:00
Concedo
86e687ae8b updated lite, added promptlimit 2024-08-10 16:05:24 +08:00
Concedo
6b8b50b350 try fix ipv6 (+1 squashed commits)
Squashed commits:

[8d95a639] try fix ipv6
2024-08-06 15:36:46 +08:00
Concedo
bfdf4b021f adjust v4-v6 allocation, default back to localhost 2024-08-04 11:42:16 +08:00
Concedo
9a0976761e use loopback ip instead of localhost 2024-08-03 00:41:32 +08:00
Concedo
bf35652ef7 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	flake.lock
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
2024-07-30 22:31:49 +08:00
Concedo
ba5babb876 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/apps.nix
#	.devops/tools.sh
#	Makefile
#	README.md
#	docs/backend/SYCL.md
#	docs/build.md
#	examples/CMakeLists.txt
#	ggml/include/ggml.h
#	src/llama-vocab.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-sampling.cpp
2024-07-27 23:15:54 +08:00
Concedo
e28c42d7f7 adjusted layer estimation 2024-07-24 21:54:49 +08:00
Concedo
b7fc8e644a fix broken template, updated lite 2024-07-24 20:47:05 +08:00
Concedo
44ef87f14c update lite, try fix ci 2024-07-24 16:31:34 +08:00
Concedo
0ecf13fc13 updated lite, extra error logging 2024-07-21 17:55:47 +08:00
Concedo
1a23d49c32 serve tags endpoint 2024-07-19 16:08:54 +08:00
Concedo
6080fa38ce updated lite 2024-07-18 15:55:45 +08:00