Commit graph

1059 commits

Author SHA1 Message Date
Concedo
ced98823a1 kai api tool calling 2025-08-09 10:51:10 +08:00
Concedo
9e7a940ce4 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/softmax_4_f16.cl
#	ggml/src/ggml-opencl/kernels/softmax_4_f32.cl
#	ggml/src/ggml-opencl/kernels/softmax_f16.cl
#	ggml/src/ggml-opencl/kernels/softmax_f32.cl
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
2025-08-09 01:24:52 +08:00
Concedo
8a71eb03c0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	ggml/cmake/ggml-config.cmake.in
#	ggml/src/ggml-cann/CMakeLists.txt
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/fattn.cu
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	requirements/requirements-convert_hf_to_gguf.txt
#	scripts/compare-llama-bench.py
#	tests/test-chat-template.cpp
#	tests/test-chat.cpp
#	tools/llama-bench/llama-bench.cpp
2025-08-07 21:23:09 +08:00
Concedo
e40d26b9e7 allow offloading moe to cpu with --moecpu 2025-08-05 23:42:42 +08:00
Concedo
9fbbd9e127 half measure for mistral spaced formats 2025-08-04 23:48:11 +08:00
Concedo
6cb8f95b5b tool calling params have been ported over to KAI api and can be used, same syntax as OAI endpoint 2025-08-03 16:21:57 +08:00
Concedo
fa815f76c9 updated model recs (+1 squashed commits)
Squashed commits:

[3e0431ae1] updated model recs
2025-08-02 11:41:37 +08:00
Concedo
cd0dc0abec allow tool calls to be triggered by any role 2025-08-02 10:00:35 +08:00
Concedo
a87c05f8c1 move function call determination to separate method 2025-07-31 14:14:38 +08:00
Concedo
cade9f42bc bump defaults 2025-07-31 12:05:57 +08:00
Concedo
1976bb3f53 fixes for tool calling 2025-07-30 19:25:39 +08:00
Concedo
abf527a207 clearer multimodal capability display 2025-07-28 22:54:49 +08:00
Concedo
ecb2cbf547 fix url params parse search 2025-07-27 16:41:42 +08:00
Concedo
8192cd6747 handle multi tool calls 2025-07-25 23:06:23 +08:00
Concedo
f25339c92b handle empty objects returned by tool calls, also remove misinterpretation of the tools calls instruct tag within ChatML autoguess 2025-07-25 22:22:27 +08:00
Concedo
0d72c794fa Merge commit 'c8ade30036' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/im2col_f16.cl
#	ggml/src/ggml-opencl/kernels/im2col_f32.cl
#	ggml/src/ggml-sycl/im2col.cpp
#	tools/mtmd/clip.cpp
2025-07-25 19:42:45 +08:00
Concedo
8f622cfb50 debugmode longer prints 2025-07-23 19:28:39 +08:00
Concedo
4b348d0b7e add 2 more save slots 2025-07-22 21:07:19 +08:00
Concedo
75154a3d91 add ping endpoint 2025-07-22 18:55:35 +08:00
Concedo
30675b0798 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	CODEOWNERS
#	docs/build.md
#	scripts/sync-ggml.last
#	tests/test-backend-ops.cpp
#	tools/imatrix/README.md
#	tools/imatrix/imatrix.cpp
2025-07-20 22:47:31 +08:00
Concedo
b028dd4e84 minor fixes 2025-07-18 13:22:59 +08:00
Concedo
1ca666f9c1 allow handling multipart files up to 999 2025-07-18 01:18:28 +08:00
Concedo
afca31bfbe handle clean_env for remotetunnel 2025-07-17 18:21:22 +08:00
Concedo
d4a394ff73 label attached media with ids 2025-07-17 10:04:46 +08:00
Concedo
d3d5e36af6 backwards compat for older flags in config load 2025-07-15 22:05:57 +08:00
Concedo
51cac6f30c add a title to load config 2025-07-15 18:22:44 +08:00
Concedo
8396add5be removed hunyuan autoguess template, fixed multi file loading up to 999 parts 2025-07-15 17:49:49 +08:00
Concedo
b7f8d0fe2b handle inconsistent final message content being sent with finish_reason 2025-07-14 22:17:18 +08:00
Concedo
0e8f96414a add in backwards compatibility for older clients with incorrect json_schema passing 2025-07-14 17:41:08 +08:00
Concedo
e7eb6d3200 increase default ctx size to 8k, rename usecublas to usecuda 2025-07-13 18:27:42 +08:00
Concedo
811463a704 split audio and vision detection separately 2025-07-13 17:47:15 +08:00
Concedo
0938af7c83 fixed noscript image gen 2025-07-13 11:37:52 +08:00
Concedo
6f4f1b7389 allowing resuming incomplete aria2 downloads 2025-07-13 11:14:39 +08:00
Concedo
e9473305d0 wip2 (+1 squashed commits)
Squashed commits:

[4628777b6] wip
2025-07-12 18:54:40 +08:00
Concedo
57ce374240 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/ISSUE_TEMPLATE/010-bug-compilation.yml
#	.github/ISSUE_TEMPLATE/011-bug-results.yml
#	.github/labeler.yml
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	.gitmodules
#	CMakeLists.txt
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cann/aclnn_ops.cpp
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-opencl/kernels/softmax_4_f16.cl
#	ggml/src/ggml-opencl/kernels/softmax_4_f32.cl
#	ggml/src/ggml-opencl/kernels/softmax_f16.cl
#	ggml/src/ggml-opencl/kernels/softmax_f32.cl
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/element_wise.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.last
#	scripts/sync-ggml.sh
#	tests/test-backend-ops.cpp
#	tests/test-c.c
2025-07-05 12:16:28 +08:00
Concedo
ae0c6b02f8 print system info 2025-07-05 11:58:00 +08:00
Wagner Bruna
d74c16e6e0 enable flash attention for image generation (#1633) 2025-07-05 11:20:51 +08:00
Wagner Bruna
bc3e4c1197 remove obsolete warning about flash attention on Vulkan (#1634) 2025-07-03 16:57:03 +08:00
Concedo
f407aa3b8a emulated oai image generation 2025-07-02 16:01:56 +08:00
Concedo
cdda9d16e0 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/tools.sh
#	build-xcframework.sh
#	ci/run.sh
#	examples/Miku.sh
#	examples/chat-13B.sh
#	examples/chat-persistent.sh
#	examples/chat-vicuna.sh
#	examples/chat.sh
#	examples/jeopardy/jeopardy.sh
#	examples/reason-act.sh
#	examples/server-llama2-13B.sh
#	examples/sycl/build.sh
#	examples/sycl/run-llama2.sh
#	examples/sycl/run-llama3.sh
#	examples/ts-type-to-grammar.sh
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-sycl/element_wise.cpp
#	ggml/src/ggml-sycl/element_wise.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	scripts/apple/validate-apps.sh
#	scripts/apple/validate-ios.sh
#	scripts/apple/validate-macos.sh
#	scripts/apple/validate-tvos.sh
#	scripts/apple/validate-visionos.sh
#	scripts/check-requirements.sh
#	scripts/ci-run.sh
#	scripts/compare-commits.sh
#	scripts/debug-test.sh
#	scripts/gen-authors.sh
#	scripts/get-hellaswag.sh
#	scripts/get-pg.sh
#	scripts/get-wikitext-103.sh
#	scripts/get-wikitext-2.sh
#	scripts/get-winogrande.sh
#	scripts/hf.sh
#	scripts/qnt-all.sh
#	scripts/run-all-perf.sh
#	scripts/run-all-ppl.sh
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.sh
#	scripts/tool_bench.sh
#	tests/test-backend-ops.cpp
#	tests/test-lora-conversion-inference.sh
#	tests/test-tokenizer-0.sh
#	tools/server/README.md
2025-06-30 20:38:44 +08:00
Concedo
b6edb79648 filter out empty entries 2025-06-28 20:22:34 +08:00
Concedo
a88c56e70c Merge remote-tracking branch 'origin/upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	examples/eval-callback/eval-callback.cpp
#	ggml/src/ggml-cann/common.h
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
#	tests/test-backend-ops.cpp
2025-06-28 17:47:53 +08:00
Concedo
4ec0e0fd21 now accept multiple images for reference images 2025-06-28 17:30:28 +08:00
Concedo
0bd648ffa4 photomaker renamed to extra image to handle future extension 2025-06-28 10:26:06 +08:00
tsite
df47b51bd1 support python 3.13 (#1621) 2025-06-27 00:18:30 +08:00
Concedo
ace537d44e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/release.yml
#	CMakeLists.txt
#	examples/simple-chat/simple-chat.cpp
#	src/llama-quant.cpp
#	tools/run/run.cpp
#	tools/server/README.md
2025-06-24 23:06:16 +08:00
Concedo
2d822d3059 fixed a typo 2025-06-22 23:28:29 +08:00
Concedo
abc1d8ac25 better way of checking for avx2 support 2025-06-22 22:56:50 +08:00
Concedo
52dcfe42d6 try auto selecting correct backend while checking intrinsics 2025-06-22 18:16:02 +08:00
Concedo
72d467c6d5 vision is now working in ollama owui 2025-06-21 23:43:43 +08:00