Concedo
|
4b2ca1169c
|
more consistency fixes
|
2025-08-13 19:28:53 +08:00 |
|
Concedo
|
955cf66bbc
|
load embedding at current maxctx instead of max trained ctx by default
|
2025-08-13 18:42:14 +08:00 |
|
Concedo
|
06a3ee4c3b
|
populate better server identifier headers.
|
2025-08-13 16:10:30 +08:00 |
|
Concedo
|
30e2f25c05
|
alias tensorsplit, fixed python error
|
2025-08-10 22:38:14 +08:00 |
|
Concedo
|
8e6d27f629
|
handle if assistant_message_gen and assistant_message_gen!=assistant_message_start, replace final output tag with unspaced (gen) version if exists
|
2025-08-10 16:51:34 +08:00 |
|
kallewoof
|
204739e7f1
|
Adapter fixes (#1659)
* test adapters
* add assistant_gen adapter key
* add support for chat templates stored as .jinja files
* removed mistakenly committed gated-tokenizers link
* autoguess: Harmony: add missing newline prefixes to system_end
|
2025-08-10 16:19:50 +08:00 |
|
Concedo
|
89266ac6b8
|
autoguess adapter make case insensitive
|
2025-08-10 00:58:47 +08:00 |
|
Concedo
|
487d509b44
|
try fix oldpc cuda broken without flash attn since upstream pr14361 between 1.94 and 1.95 (+1 squashed commits)
Squashed commits:
[940f0c639] try fix oldpc cuda broken without flash attn since upstream pr14361 between 1.94 and 1.95
|
2025-08-10 00:10:37 +08:00 |
|
Concedo
|
4c1faf61b2
|
increment version (+1 squashed commits)
Squashed commits:
[6e5080ad2] increment version
|
2025-08-09 20:53:26 +08:00 |
|
Concedo
|
ced98823a1
|
kai api tool calling
|
2025-08-09 10:51:10 +08:00 |
|
Concedo
|
9e7a940ce4
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/softmax_4_f16.cl
# ggml/src/ggml-opencl/kernels/softmax_4_f32.cl
# ggml/src/ggml-opencl/kernels/softmax_f16.cl
# ggml/src/ggml-opencl/kernels/softmax_f32.cl
# ggml/src/ggml-rpc/ggml-rpc.cpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
|
2025-08-09 01:24:52 +08:00 |
|
Concedo
|
8a71eb03c0
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# ggml/cmake/ggml-config.cmake.in
# ggml/src/ggml-cann/CMakeLists.txt
# ggml/src/ggml-cann/common.h
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cuda/fattn.cu
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# requirements/requirements-convert_hf_to_gguf.txt
# scripts/compare-llama-bench.py
# tests/test-chat-template.cpp
# tests/test-chat.cpp
# tools/llama-bench/llama-bench.cpp
|
2025-08-07 21:23:09 +08:00 |
|
Concedo
|
e40d26b9e7
|
allow offloading moe to cpu with --moecpu
|
2025-08-05 23:42:42 +08:00 |
|
Concedo
|
9fbbd9e127
|
half measure for mistral spaced formats
|
2025-08-04 23:48:11 +08:00 |
|
Concedo
|
6cb8f95b5b
|
tool calling params have been ported over to KAI api and can be used, same syntax as OAI endpoint
|
2025-08-03 16:21:57 +08:00 |
|
Concedo
|
fa815f76c9
|
updated model recs (+1 squashed commits)
Squashed commits:
[3e0431ae1] updated model recs
|
2025-08-02 11:41:37 +08:00 |
|
Concedo
|
cd0dc0abec
|
allow tool calls to be triggered by any role
|
2025-08-02 10:00:35 +08:00 |
|
Concedo
|
a87c05f8c1
|
move function call determination to separate method
|
2025-07-31 14:14:38 +08:00 |
|
Concedo
|
cade9f42bc
|
bump defaults
|
2025-07-31 12:05:57 +08:00 |
|
Concedo
|
1976bb3f53
|
fixes for tool calling
|
2025-07-30 19:25:39 +08:00 |
|
Concedo
|
abf527a207
|
clearer multimodal capability display
|
2025-07-28 22:54:49 +08:00 |
|
Concedo
|
ecb2cbf547
|
fix url params parse search
|
2025-07-27 16:41:42 +08:00 |
|
Concedo
|
8192cd6747
|
handle multi tool calls
|
2025-07-25 23:06:23 +08:00 |
|
Concedo
|
f25339c92b
|
handle empty objects returned by tool calls, also remove misinterpretation of the tools calls instruct tag within ChatML autoguess
|
2025-07-25 22:22:27 +08:00 |
|
Concedo
|
0d72c794fa
|
Merge commit 'c8ade30036' into concedo_experimental
# Conflicts:
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-opencl/CMakeLists.txt
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/im2col_f16.cl
# ggml/src/ggml-opencl/kernels/im2col_f32.cl
# ggml/src/ggml-sycl/im2col.cpp
# tools/mtmd/clip.cpp
|
2025-07-25 19:42:45 +08:00 |
|
Concedo
|
8f622cfb50
|
debugmode longer prints
|
2025-07-23 19:28:39 +08:00 |
|
Concedo
|
4b348d0b7e
|
add 2 more save slots
|
2025-07-22 21:07:19 +08:00 |
|
Concedo
|
75154a3d91
|
add ping endpoint
|
2025-07-22 18:55:35 +08:00 |
|
Concedo
|
30675b0798
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# CODEOWNERS
# docs/build.md
# scripts/sync-ggml.last
# tests/test-backend-ops.cpp
# tools/imatrix/README.md
# tools/imatrix/imatrix.cpp
|
2025-07-20 22:47:31 +08:00 |
|
Concedo
|
b028dd4e84
|
minor fixes
|
2025-07-18 13:22:59 +08:00 |
|
Concedo
|
1ca666f9c1
|
allow handling multipart files up to 999
|
2025-07-18 01:18:28 +08:00 |
|
Concedo
|
afca31bfbe
|
handle clean_env for remotetunnel
|
2025-07-17 18:21:22 +08:00 |
|
Concedo
|
d4a394ff73
|
label attached media with ids
|
2025-07-17 10:04:46 +08:00 |
|
Concedo
|
d3d5e36af6
|
backwards compat for older flags in config load
|
2025-07-15 22:05:57 +08:00 |
|
Concedo
|
51cac6f30c
|
add a title to load config
|
2025-07-15 18:22:44 +08:00 |
|
Concedo
|
8396add5be
|
removed hunyuan autoguess template, fixed multi file loading up to 999 parts
|
2025-07-15 17:49:49 +08:00 |
|
Concedo
|
b7f8d0fe2b
|
handle inconsistent final message content being sent with finish_reason
|
2025-07-14 22:17:18 +08:00 |
|
Concedo
|
0e8f96414a
|
add in backwards compatibility for older clients with incorrect json_schema passing
|
2025-07-14 17:41:08 +08:00 |
|
Concedo
|
e7eb6d3200
|
increase default ctx size to 8k, rename usecublas to usecuda
|
2025-07-13 18:27:42 +08:00 |
|
Concedo
|
811463a704
|
split audio and vision detection separately
|
2025-07-13 17:47:15 +08:00 |
|
Concedo
|
0938af7c83
|
fixed noscript image gen
|
2025-07-13 11:37:52 +08:00 |
|
Concedo
|
6f4f1b7389
|
allow resuming incomplete aria2 downloads
|
2025-07-13 11:14:39 +08:00 |
|
Concedo
|
e9473305d0
|
wip2 (+1 squashed commits)
Squashed commits:
[4628777b6] wip
|
2025-07-12 18:54:40 +08:00 |
|
Concedo
|
57ce374240
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .github/ISSUE_TEMPLATE/010-bug-compilation.yml
# .github/ISSUE_TEMPLATE/011-bug-results.yml
# .github/labeler.yml
# .github/workflows/build.yml
# .github/workflows/release.yml
# .gitmodules
# CMakeLists.txt
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cann/aclnn_ops.cpp
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-opencl/ggml-opencl.cpp
# ggml/src/ggml-opencl/kernels/softmax_4_f16.cl
# ggml/src/ggml-opencl/kernels/softmax_4_f32.cl
# ggml/src/ggml-opencl/kernels/softmax_f16.cl
# ggml/src/ggml-opencl/kernels/softmax_f32.cl
# ggml/src/ggml-sycl/element_wise.cpp
# ggml/src/ggml-sycl/element_wise.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# scripts/sync-ggml-am.sh
# scripts/sync-ggml.last
# scripts/sync-ggml.sh
# tests/test-backend-ops.cpp
# tests/test-c.c
|
2025-07-05 12:16:28 +08:00 |
|
Concedo
|
ae0c6b02f8
|
print system info
|
2025-07-05 11:58:00 +08:00 |
|
Wagner Bruna
|
d74c16e6e0
|
enable flash attention for image generation (#1633)
|
2025-07-05 11:20:51 +08:00 |
|
Wagner Bruna
|
bc3e4c1197
|
remove obsolete warning about flash attention on Vulkan (#1634)
|
2025-07-03 16:57:03 +08:00 |
|
Concedo
|
f407aa3b8a
|
emulated oai image generation
|
2025-07-02 16:01:56 +08:00 |
|
Concedo
|
cdda9d16e0
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .devops/tools.sh
# build-xcframework.sh
# ci/run.sh
# examples/Miku.sh
# examples/chat-13B.sh
# examples/chat-persistent.sh
# examples/chat-vicuna.sh
# examples/chat.sh
# examples/jeopardy/jeopardy.sh
# examples/reason-act.sh
# examples/server-llama2-13B.sh
# examples/sycl/build.sh
# examples/sycl/run-llama2.sh
# examples/sycl/run-llama3.sh
# examples/ts-type-to-grammar.sh
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-sycl/element_wise.cpp
# ggml/src/ggml-sycl/element_wise.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# scripts/apple/validate-apps.sh
# scripts/apple/validate-ios.sh
# scripts/apple/validate-macos.sh
# scripts/apple/validate-tvos.sh
# scripts/apple/validate-visionos.sh
# scripts/check-requirements.sh
# scripts/ci-run.sh
# scripts/compare-commits.sh
# scripts/debug-test.sh
# scripts/gen-authors.sh
# scripts/get-hellaswag.sh
# scripts/get-pg.sh
# scripts/get-wikitext-103.sh
# scripts/get-wikitext-2.sh
# scripts/get-winogrande.sh
# scripts/hf.sh
# scripts/qnt-all.sh
# scripts/run-all-perf.sh
# scripts/run-all-ppl.sh
# scripts/sync-ggml-am.sh
# scripts/sync-ggml.sh
# scripts/tool_bench.sh
# tests/test-backend-ops.cpp
# tests/test-lora-conversion-inference.sh
# tests/test-tokenizer-0.sh
# tools/server/README.md
|
2025-06-30 20:38:44 +08:00 |
|
Concedo
|
b6edb79648
|
filter out empty entries
|
2025-06-28 20:22:34 +08:00 |
|