Commit graph

16 commits

Author SHA1 Message Date
Concedo
38b3bffcef
Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	CMakePresets.json
#	ggml/src/ggml-cuda/CMakeLists.txt
#	tests/test-sampling.cpp
#	tools/mtmd/clip.cpp
2025-05-07 19:47:44 +08:00
Xuan-Son Nguyen
32916a4907
clip : refactor graph builder (#13321)
* mtmd : refactor graph builder

* fix qwen2vl

* clean up siglip cgraph

* pixtral migrated

* move minicpmv to a dedicated build function

* move max_feature_layer to build_llava

* use build_attn for minicpm resampler

* fix windows build

* add comment for batch_size

* also support tinygemma3 test model

* qwen2vl does not use RMS norm

* fix qwen2vl norm (2)
2025-05-06 22:40:24 +02:00
Concedo
ffe23f0e93
Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	pyproject.toml
2025-05-06 23:39:45 +08:00
Concedo
0fa435b2a6
Merge commit '9b61acf060' into concedo_experimental
# Conflicts:
#	Makefile
#	docs/multimodal/MobileVLM.md
#	docs/multimodal/glmedge.md
#	docs/multimodal/llava.md
#	docs/multimodal/minicpmo2.6.md
#	docs/multimodal/minicpmv2.5.md
#	docs/multimodal/minicpmv2.6.md
#	requirements/requirements-all.txt
#	tools/mtmd/CMakeLists.txt
#	tools/mtmd/README.md
#	tools/mtmd/android/adb_run.sh
#	tools/mtmd/android/build_64.sh
#	tools/mtmd/clip-quantize-cli.cpp
2025-05-06 23:34:21 +08:00
Concedo
1377a93a73
Merge commit '5215b91e93' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	cmake/x64-windows-llvm.cmake
#	ggml/src/ggml-rpc/ggml-rpc.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	tests/CMakeLists.txt
#	tools/imatrix/imatrix.cpp
#	tools/llava/clip.cpp
#	tools/rpc/rpc-server.cpp
2025-05-06 23:15:04 +08:00
oobabooga
233461f812
sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (#13264)
* sampling: add Top-nσ sampler to `llama-server` and sampler ordering

* revert: sampler ordering

* revert: VS' crappy auto-formatting

* revert: VS' crappy auto-formatting pt.2

* revert: my crappy eye sight...

* sampling: add XTC to Top-nσ sampler chain

* sampling: add Dyna. Temp. to Top-nσ sampler chain

* sampling: actually remove Top-nσ from sampler(oops)

* Integrate top_n_sigma into main sampler chain

* Define COMMON_SAMPLER_TYPE_TOP_N_SIGMA

* Formatting

* Lint

* Exit early in the sampler if nsigma < 0

---------

Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-05-05 22:12:19 +02:00
igardev
b34c859146
server : Webui - change setText command from parent window to also send the message. (#13309)
* setText command from parent window for llama-vscode now sends the message automatically.

* Upgrade packages versions to fix vulnerabilities with "npm audit fix" command.

* Fix code formatting.

* Add index.html.gz changes.

* Revert "Upgrade packages versions to fix vulnerabilities with "npm audit fix" command."

This reverts commit 67687b7fda8a293724ba92ea30bb151677406bc8.

* easier approach

* add setTimeout

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-05 16:03:31 +02:00
Xuan-Son Nguyen
9b61acf060
mtmd : rename llava directory to mtmd (#13311)
* mv llava to mtmd

* change ref everywhere
2025-05-05 16:02:55 +02:00
Xuan-Son Nguyen
5215b91e93
clip : fix confused naming ffn_up and ffn_down (#13290)
* clip :  fix confused naming ffn_up and ffn_down

* rm ffn_i/o/g naming

* rename n_embd, n_ff

* small fix

* no check n_ff
2025-05-05 12:54:44 +02:00
Xuan-Son Nguyen
27aa259532
mtmd : add C public API (#13184)
* init

* wip

* working version

* add mtmd::bitmaps

* add test target

* rm redundant define

* test: mtmd_input_chunks_free

* rm outdated comment

* fix merging issue

* explicitly create mtmd::input_chunks

* mtmd_input_chunk_copy

* add clone()

* add const to various places

* add warning about breaking changes

* helper: use mtmd_image_tokens_get_n_pos
2025-05-04 23:43:42 +02:00
Diego Devesa
9fdfcdaedd
rpc : use backend registry, support dl backends (#13304)
2025-05-04 21:25:43 +02:00
Diego Devesa
86bd60d3fe
llava/mtmd : fixes to fully support dl backends (#13303)
2025-05-04 17:05:20 +02:00
Johannes Gäßler
3e959f0976
imatrix: fix oob writes if src1 is not contiguous (#13286)
2025-05-04 00:50:37 +02:00
Xuan-Son Nguyen
36667c8edc
clip : revert the change of BOI/EOI token for GLM-edge (⚠️ breaking change) (#13259)
2025-05-03 20:07:54 +02:00
Concedo
5a2808ffaf
Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.flake8
#	.github/labeler.yml
#	.github/workflows/bench.yml.disabled
#	.github/workflows/build-linux-cross.yml
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	.gitignore
#	CMakeLists.txt
#	CODEOWNERS
#	Makefile
#	README.md
#	SECURITY.md
#	build-xcframework.sh
#	ci/run.sh
#	docs/development/HOWTO-add-model.md
#	docs/multimodal/MobileVLM.md
#	docs/multimodal/glmedge.md
#	docs/multimodal/llava.md
#	docs/multimodal/minicpmo2.6.md
#	docs/multimodal/minicpmv2.5.md
#	docs/multimodal/minicpmv2.6.md
#	examples/CMakeLists.txt
#	examples/pydantic_models_to_grammar_examples.py
#	grammars/README.md
#	pyrightconfig.json
#	requirements/requirements-all.txt
#	scripts/fetch_server_test_models.py
#	scripts/tool_bench.py
#	scripts/xxd.cmake
#	tests/CMakeLists.txt
#	tests/run-json-schema-to-grammar.mjs
#	tools/batched-bench/CMakeLists.txt
#	tools/batched-bench/README.md
#	tools/batched-bench/batched-bench.cpp
#	tools/cvector-generator/CMakeLists.txt
#	tools/cvector-generator/README.md
#	tools/cvector-generator/completions.txt
#	tools/cvector-generator/cvector-generator.cpp
#	tools/cvector-generator/mean.hpp
#	tools/cvector-generator/negative.txt
#	tools/cvector-generator/pca.hpp
#	tools/cvector-generator/positive.txt
#	tools/export-lora/CMakeLists.txt
#	tools/export-lora/README.md
#	tools/export-lora/export-lora.cpp
#	tools/gguf-split/CMakeLists.txt
#	tools/gguf-split/README.md
#	tools/imatrix/CMakeLists.txt
#	tools/imatrix/README.md
#	tools/imatrix/imatrix.cpp
#	tools/llama-bench/CMakeLists.txt
#	tools/llama-bench/README.md
#	tools/llama-bench/llama-bench.cpp
#	tools/llava/CMakeLists.txt
#	tools/llava/README.md
#	tools/llava/android/adb_run.sh
#	tools/llava/android/build_64.sh
#	tools/llava/clip-quantize-cli.cpp
#	tools/main/CMakeLists.txt
#	tools/main/README.md
#	tools/perplexity/CMakeLists.txt
#	tools/perplexity/README.md
#	tools/perplexity/perplexity.cpp
#	tools/quantize/CMakeLists.txt
#	tools/rpc/CMakeLists.txt
#	tools/rpc/README.md
#	tools/rpc/rpc-server.cpp
#	tools/run/CMakeLists.txt
#	tools/run/README.md
#	tools/run/linenoise.cpp/linenoise.cpp
#	tools/run/linenoise.cpp/linenoise.h
#	tools/run/run.cpp
#	tools/server/CMakeLists.txt
#	tools/server/README.md
#	tools/server/bench/README.md
#	tools/server/public_simplechat/readme.md
#	tools/server/tests/README.md
#	tools/server/themes/README.md
#	tools/server/themes/buttons-top/README.md
#	tools/server/themes/wild/README.md
#	tools/tokenize/CMakeLists.txt
#	tools/tokenize/tokenize.cpp
2025-05-03 12:15:36 +08:00
Diego Devesa
1d36b3670b
llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00