Concedo
59300dbdf5
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/actions/windows-setup-curl/action.yml
# .github/workflows/build-linux-cross.yml
# README.md
# common/CMakeLists.txt
# examples/parallel/README.md
# examples/parallel/parallel.cpp
# ggml/src/ggml-sycl/element_wise.cpp
# ggml/src/ggml-vulkan/CMakeLists.txt
# tools/server/README.md
2025-05-18 23:27:53 +08:00
Xuan-Son Nguyen
6aa892ec2a
server : do not return error out of context (with ctx shift disabled) ( #13577 )
2025-05-16 21:50:00 +02:00
Concedo
e5d26a2356
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# common/CMakeLists.txt
# docs/backend/SYCL.md
# ggml/CMakeLists.txt
# ggml/src/ggml-sycl/CMakeLists.txt
# ggml/src/ggml-sycl/binbcast.cpp
# ggml/src/ggml-sycl/convert.cpp
# ggml/src/ggml-sycl/dequantize.hpp
# ggml/src/ggml-sycl/dmmv.cpp
# ggml/src/ggml-sycl/gemm.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/mmvq.cpp
# ggml/src/ggml-sycl/vecdotq.hpp
# ggml/src/ggml-vulkan/CMakeLists.txt
# ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
# ggml/src/gguf.cpp
# scripts/compare-llama-bench.py
# tests/CMakeLists.txt
# tests/test-chat.cpp
# tools/llama-bench/llama-bench.cpp
# tools/server/README.md
2025-05-16 15:30:31 +08:00
Olivier Chafik
aa48e373f2
server: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802 )
...
* Inject date_string in llama 3.x + fix for functionary v2
https://github.com/ggml-org/llama.cpp/issues/12729
* move/fix detection of functionary v3.1 before llama 3.x, fix & test their non-tool mode
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* generate more tokens in test_completion_with_required_tool_tiny_fast to avoid truncation
---------
Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-05-15 02:39:51 +01:00
Xuan-Son Nguyen
360a9c98e1
server : fix cache_tokens bug with no cache_prompt ( #13533 )
2025-05-14 13:35:07 +02:00
Concedo
21e31e255b
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/docker.yml
# README.md
# build-xcframework.sh
# common/CMakeLists.txt
# examples/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cuda/CMakeLists.txt
# ggml/src/ggml-metal/ggml-metal.m
# ggml/src/ggml-metal/ggml-metal.metal
# ggml/src/ggml-sycl/CMakeLists.txt
# ggml/src/ggml-sycl/backend.hpp
# ggml/src/ggml-sycl/common.hpp
# ggml/src/ggml-sycl/ggml-sycl.cpp
# ggml/src/ggml-sycl/mmvq.cpp
# ggml/src/ggml-sycl/vecdotq.hpp
# scripts/compare-llama-bench.py
# src/CMakeLists.txt
# src/llama-model.cpp
# src/llama.cpp
# tests/test-backend-ops.cpp
# tests/test-opt.cpp
# tools/llama-bench/README.md
# tools/llama-bench/llama-bench.cpp
# tools/mtmd/CMakeLists.txt
# tools/mtmd/README.md
# tools/mtmd/clip.cpp
# tools/rpc/rpc-server.cpp
# tools/server/CMakeLists.txt
# tools/server/README.md
2025-05-13 00:28:35 +08:00
Xuan-Son Nguyen
33eff40240
server : vision support via libmtmd ( #12898 )
...
* server : (experimental) vision support via libmtmd
* mtmd : add more api around mtmd_image_tokens
* mtmd : add more api around mtmd_image_tokens
* mtmd : ability to calc image hash
* shared_ptr for mtmd_image_tokens
* move hash to user-define ID (fixed)
* abstract out the batch management
* small fix
* refactor logic adding tokens to batch
* implement hashing image
* use FNV hash, now hash bitmap instead of file data
* allow decoding image embedding to be split into batches
* rm whitespace
* disable some features when mtmd is on
* fix --no-mmproj-offload
* mtmd_context_params no timings
* refactor server_inp to server_tokens
* fix the failing test case
* init
* wip
* working version
* add mtmd::bitmaps
* add test target
* rm redundant define
* test: mtmd_input_chunks_free
* rm outdated comment
* fix merging issue
* explicitly create mtmd::input_chunks
* mtmd_input_chunk_copy
* add clone()
* improve server_input struct
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff
* fix detokenize
* add const to various places
* add warning about breaking changes
* add c api
* helper: use mtmd_image_tokens_get_n_pos
* fix ctx_shift
* fix name shadowing
* more strict condition
* support remote image_url
* remote image_url log
* add CI test
* do not log base64
* add "has_multimodal" to /props
* remove dangling image
* speculative: use slot.cache_tokens.insert
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* rm can_be_detokenized
* on prmpt processing done, assert cache_tokens.size
* handle_completions_impl returns void
* adapt the new web ui
* update docs and hot topics
* rm assert
* small fix (2)
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-05-09 19:29:37 +02:00
Concedo
5a2808ffaf
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .flake8
# .github/labeler.yml
# .github/workflows/bench.yml.disabled
# .github/workflows/build-linux-cross.yml
# .github/workflows/build.yml
# .github/workflows/server.yml
# .gitignore
# CMakeLists.txt
# CODEOWNERS
# Makefile
# README.md
# SECURITY.md
# build-xcframework.sh
# ci/run.sh
# docs/development/HOWTO-add-model.md
# docs/multimodal/MobileVLM.md
# docs/multimodal/glmedge.md
# docs/multimodal/llava.md
# docs/multimodal/minicpmo2.6.md
# docs/multimodal/minicpmv2.5.md
# docs/multimodal/minicpmv2.6.md
# examples/CMakeLists.txt
# examples/pydantic_models_to_grammar_examples.py
# grammars/README.md
# pyrightconfig.json
# requirements/requirements-all.txt
# scripts/fetch_server_test_models.py
# scripts/tool_bench.py
# scripts/xxd.cmake
# tests/CMakeLists.txt
# tests/run-json-schema-to-grammar.mjs
# tools/batched-bench/CMakeLists.txt
# tools/batched-bench/README.md
# tools/batched-bench/batched-bench.cpp
# tools/cvector-generator/CMakeLists.txt
# tools/cvector-generator/README.md
# tools/cvector-generator/completions.txt
# tools/cvector-generator/cvector-generator.cpp
# tools/cvector-generator/mean.hpp
# tools/cvector-generator/negative.txt
# tools/cvector-generator/pca.hpp
# tools/cvector-generator/positive.txt
# tools/export-lora/CMakeLists.txt
# tools/export-lora/README.md
# tools/export-lora/export-lora.cpp
# tools/gguf-split/CMakeLists.txt
# tools/gguf-split/README.md
# tools/imatrix/CMakeLists.txt
# tools/imatrix/README.md
# tools/imatrix/imatrix.cpp
# tools/llama-bench/CMakeLists.txt
# tools/llama-bench/README.md
# tools/llama-bench/llama-bench.cpp
# tools/llava/CMakeLists.txt
# tools/llava/README.md
# tools/llava/android/adb_run.sh
# tools/llava/android/build_64.sh
# tools/llava/clip-quantize-cli.cpp
# tools/main/CMakeLists.txt
# tools/main/README.md
# tools/perplexity/CMakeLists.txt
# tools/perplexity/README.md
# tools/perplexity/perplexity.cpp
# tools/quantize/CMakeLists.txt
# tools/rpc/CMakeLists.txt
# tools/rpc/README.md
# tools/rpc/rpc-server.cpp
# tools/run/CMakeLists.txt
# tools/run/README.md
# tools/run/linenoise.cpp/linenoise.cpp
# tools/run/linenoise.cpp/linenoise.h
# tools/run/run.cpp
# tools/server/CMakeLists.txt
# tools/server/README.md
# tools/server/bench/README.md
# tools/server/public_simplechat/readme.md
# tools/server/tests/README.md
# tools/server/themes/README.md
# tools/server/themes/buttons-top/README.md
# tools/server/themes/wild/README.md
# tools/tokenize/CMakeLists.txt
# tools/tokenize/tokenize.cpp
2025-05-03 12:15:36 +08:00
Diego Devesa
1d36b3670b
llama : move end-user examples to tools directory ( #13249 )
...
* llama : move end-user examples to tools directory
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00