Concedo
740f91e3fd
lower aria interval
2025-06-06 17:43:38 +08:00
Concedo
9cf32e5fee
step limits over adapter for sd
2025-06-06 14:12:43 +08:00
Concedo
f6bbc350f2
various qol fixes
2025-06-05 10:26:02 +08:00
Concedo
736030bb9f
save and load state upgraded to 3 available states
2025-06-04 22:09:40 +08:00
Concedo
06d2bc3404
ollama compat fixes
2025-06-04 19:22:29 +08:00
Concedo
53f1511396
use a static buffer for kv reloads instead. also, added into lite ui
2025-06-03 22:32:46 +08:00
Concedo
4b57108508
Save KV State and Load KV State to memory added. GUI not yet updated
2025-06-03 17:46:29 +08:00
Concedo
6ce85c54d6
not working correctly
2025-06-02 22:12:10 +08:00
Concedo
8e1ebc55b5
dropped support for lora base as upstream no longer uses it. If provided it will be silently ignored
2025-06-02 12:49:53 +08:00
Concedo
51dc1cf920
added scale for text lora
2025-06-02 00:13:42 +08:00
Concedo
74ef097c4a
added ability to set koboldcpp as default handler for gguf and kcpps
2025-06-01 22:36:41 +08:00
Concedo
f3bb947a13
cuda use wmma flash attention for turing (+1 squashed commits)
Squashed commits:
[3c5112398] 117 (+10 squashed commit)
Squashed commit:
[4f01bb2d4] 117 graphs 80v
[7549034ea] 117 graphs
[dabf9cb99] checking if cuda 11.5.2 works
[ba7ccdb7a] another try cu11.7 only
[752cf2ae5] increase aria2c download log rate
[dc4f198fd] test send turing to wmma flash attention
[496a22e83] temp build test cu11.7.0
[ca759c424] temp build test cu11.7
[c46ada17c] test build: enable virtual80 for oldcpu
[3ccfd939a] test build: with cuda graphs for all
2025-06-01 11:41:45 +08:00
Concedo
b08dca65ed
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# common/CMakeLists.txt
# common/arg.cpp
# common/chat.cpp
# examples/parallel/README.md
# examples/parallel/parallel.cpp
# ggml/cmake/common.cmake
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/rope.cpp
# models/ggml-vocab-bert-bge.gguf.inp
# models/ggml-vocab-bert-bge.gguf.out
# models/ggml-vocab-command-r.gguf.inp
# models/ggml-vocab-command-r.gguf.out
# models/ggml-vocab-deepseek-coder.gguf.inp
# models/ggml-vocab-deepseek-coder.gguf.out
# models/ggml-vocab-deepseek-llm.gguf.inp
# models/ggml-vocab-deepseek-llm.gguf.out
# models/ggml-vocab-falcon.gguf.inp
# models/ggml-vocab-falcon.gguf.out
# models/ggml-vocab-gpt-2.gguf.inp
# models/ggml-vocab-gpt-2.gguf.out
# models/ggml-vocab-llama-bpe.gguf.inp
# models/ggml-vocab-llama-bpe.gguf.out
# models/ggml-vocab-llama-spm.gguf.inp
# models/ggml-vocab-llama-spm.gguf.out
# models/ggml-vocab-mpt.gguf.inp
# models/ggml-vocab-mpt.gguf.out
# models/ggml-vocab-phi-3.gguf.inp
# models/ggml-vocab-phi-3.gguf.out
# models/ggml-vocab-qwen2.gguf.inp
# models/ggml-vocab-qwen2.gguf.out
# models/ggml-vocab-refact.gguf.inp
# models/ggml-vocab-refact.gguf.out
# models/ggml-vocab-starcoder.gguf.inp
# models/ggml-vocab-starcoder.gguf.out
# requirements/requirements-gguf_editor_gui.txt
# tests/CMakeLists.txt
# tests/test-chat.cpp
# tests/test-grammar-integration.cpp
# tests/test-json-schema-to-grammar.cpp
# tools/mtmd/CMakeLists.txt
# tools/run/run.cpp
# tools/server/CMakeLists.txt
2025-05-31 13:04:21 +08:00
Concedo
c923e9fe46
added option to unload model from admin control
2025-05-31 11:51:09 +08:00
Concedo
08e0745e7e
added singleinstance flag and local shutdown api
2025-05-31 11:37:32 +08:00
Concedo
6529326c59
allow temperatures up to 1.0 when function calling
2025-05-30 15:59:18 +08:00
Concedo
c881bb7348
match a few common oai voices
2025-05-29 23:29:17 +08:00
Concedo
26bf5b446d
fixed thread count <= 0, fixed clip skip <= 0
2025-05-28 00:38:15 +08:00
Concedo
f97bbdde00
fix to allow all EOGs to trigger a stop, occam's glm4 fix,
2025-05-24 22:55:11 +08:00
Concedo
ec04115ae9
swa options now available
2025-05-24 11:50:37 +08:00
Concedo
748dfcc2e4
massively improved tool calling
2025-05-24 02:26:11 +08:00
Concedo
c4df151298
experimental swa flag
2025-05-23 21:33:26 +08:00
Concedo
499283c63a
rename define to match upstream
2025-05-23 17:10:12 +08:00
Concedo
e68a5f448c
add ddim sampler
2025-05-22 21:28:01 +08:00
Concedo
f125e724eb
fix off-by-one npast during some instances of fast forwarding
2025-05-22 19:51:21 +08:00
Concedo
440350327c
set random range for seed
2025-05-21 23:47:18 +08:00
Wagner Bruna
5d0cfc9db3
store on the image the actual random seed, for reproducibility (#1549)
2025-05-21 23:40:47 +08:00
Concedo
8b6dfbd1be
disabling the gMask prefix for glm-4 completions
2025-05-21 17:29:24 +08:00
Concedo
49305942ab
try disabling the gMask prefix for glm-4 completions
2025-05-21 16:47:08 +08:00
Concedo
5f4923bf24
backend tag replacement for endtags. view results with debug mode.
2025-05-19 23:14:43 +08:00
Concedo
710c747b60
minor noscript edit
2025-05-19 17:51:44 +08:00
Concedo
c546cb638e
disable showgui if skiplauncher is used
2025-05-18 01:42:14 +08:00
Concedo
ca4274e384
added size info into HF searcher
2025-05-17 00:31:54 +08:00
Concedo
5ccd4b2bf5
horde default max ctx matches main ctx
2025-05-15 10:26:20 +08:00
Concedo
c5ea7fad93
updated lite, only show processed input in debugmode
2025-05-14 17:46:54 +08:00
Concedo
21e31e255b
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/docker.yml
# README.md
# build-xcframework.sh
# common/CMakeLists.txt
# examples/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-metal/ggml-metal.m
# ggml/src/ggml-metal/ggml-metal.metal
# ggml/src/ggml-sycl/CMakeLists.txt
# ggml/src/ggml-sycl/backend.hpp
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/mmvq.cpp
# ggml/src/ggml-sycl/vecdotq.hpp
# scripts/compare-llama-bench.py
# src/CMakeLists.txt
# src/llama-model.cpp
# src/llama.cpp
# tests/test-backend-ops.cpp
# tests/test-opt.cpp
# tools/llama-bench/README.md
# tools/llama-bench/llama-bench.cpp
# tools/mtmd/CMakeLists.txt
# tools/mtmd/README.md
# tools/mtmd/clip.cpp
# tools/rpc/rpc-server.cpp
# tools/server/CMakeLists.txt
# tools/server/README.md
2025-05-13 00:28:35 +08:00
Concedo
40eb3a54c4
rename some tooltip texts
2025-05-11 22:50:40 +08:00
Concedo
1eb6d25010
truncate middle instead of end for long strings
2025-05-11 20:26:17 +08:00
Concedo
48c3682c2c
improve search
2025-05-10 19:25:26 +08:00
Concedo
50e1064ffe
better passthrough handling
2025-05-10 19:11:09 +08:00
Concedo
c4a0b323f0
remove fa restrictions for vulkan
2025-05-09 17:34:14 +08:00
Concedo
b6220669f4
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .github/workflows/docker.yml
# Makefile
# examples/CMakeLists.txt
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/convert.cpp
# ggml/src/ggml-sycl/convert.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# scripts/sync-ggml.last
2025-05-08 23:07:33 +08:00
Concedo
7c5d47f688
multigpu warning only once
2025-05-08 00:55:09 +08:00
Concedo
fa22c1a5a4
fixed cfg scale, but turns out it sucks. embedded aria2c into pyinstaller
2025-05-07 18:30:36 +08:00
Concedo
a5b6f372a3
cfg scale wip
2025-05-07 00:36:00 +08:00
Concedo
0fa435b2a6
Merge commit '9b61acf060' into concedo_experimental
# Conflicts:
# Makefile
# docs/multimodal/MobileVLM.md
# docs/multimodal/glmedge.md
# docs/multimodal/llava.md
# docs/multimodal/minicpmo2.6.md
# docs/multimodal/minicpmv2.5.md
# docs/multimodal/minicpmv2.6.md
# requirements/requirements-all.txt
# tools/mtmd/CMakeLists.txt
# tools/mtmd/README.md
# tools/mtmd/android/adb_run.sh
# tools/mtmd/android/build_64.sh
# tools/mtmd/clip-quantize-cli.cpp
2025-05-06 23:34:21 +08:00
Concedo
38a8778f24
wip cfg scale
2025-05-06 23:06:25 +08:00
Concedo
13cee48740
embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits)
Squashed commits:
[b9b695217] embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits)
Squashed commits:
[90b5d389d] embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits)
Squashed commits:
[fbbaa989f] embed aria2c for windows
2025-05-06 18:56:02 +08:00
Concedo
f59b5eb561
added toggle for guidance
2025-05-05 22:21:46 +08:00
Concedo
1228f91ccb
even better comfyui handling, dynamic node ids
2025-05-03 11:21:22 +08:00