koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-11 01:24:36 +00:00

Author	SHA1	Message	Date
Concedo	21e31e255b	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .github/workflows/docker.yml # README.md # build-xcframework.sh # common/CMakeLists.txt # examples/CMakeLists.txt # ggml/src/ggml-cpu/CMakeLists.txt # ggml/src/ggml-cuda/CMakeLists.txt # ggml/src/ggml-metal/ggml-metal.m # ggml/src/ggml-metal/ggml-metal.metal # ggml/src/ggml-sycl/CMakeLists.txt # ggml/src/ggml-sycl/backend.hpp # ggml/src/ggml-sycl/common.hpp # ggml/src/ggml-sycl/ggml-sycl.cpp # ggml/src/ggml-sycl/mmvq.cpp # ggml/src/ggml-sycl/vecdotq.hpp # scripts/compare-llama-bench.py # src/CMakeLists.txt # src/llama-model.cpp # src/llama.cpp # tests/test-backend-ops.cpp # tests/test-opt.cpp # tools/llama-bench/README.md # tools/llama-bench/llama-bench.cpp # tools/mtmd/CMakeLists.txt # tools/mtmd/README.md # tools/mtmd/clip.cpp # tools/rpc/rpc-server.cpp # tools/server/CMakeLists.txt # tools/server/README.md	2025-05-13 00:28:35 +08:00
Xuan-Son Nguyen	de4c07f937	clip : cap max image size 1024 for qwen vl model (#13478)	2025-05-12 15:06:51 +02:00
City	c104023994	mtmd : Use RMS norm for InternVL 3 38B and 78B mmproj (#13459)	2025-05-12 00:39:06 +02:00
David Huang	7f323a589f	Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)	2025-05-11 14:18:39 +02:00
City	3eac209319	mtmd : support InternVL 3 38B and 78B mmproj (#13443) * Support InternVL 3 38B and 78B mmproj * Swap norms in clip.cpp * Group variables together	2025-05-11 11:35:52 +02:00
Concedo	f841b29c41	fixed unicode paths	2025-05-11 14:05:54 +08:00
Xuan-Son Nguyen	15e6125a39	mtmd : add hard limit on image resolution for qwen2vl / qwen2.5vl (#13434) * mtmd : add hard limit on image resolution for qwen2vl / qwen2.5vl * fix typo	2025-05-10 19:57:54 +02:00
Xuan-Son Nguyen	053367d149	mtmd : support InternVL 2.5 and 3 (#13422) * convert : internvl support * InternVL3-1B working * fix regression * rm mobilevlm from test * fix conversion * add test for internvl * add to list of pre-quant * restore boi/eoi check * add clarify comment for norm eps	2025-05-10 16:26:42 +02:00
Diego Devesa	27ebfcacba	llama : do not crash if there is no CPU backend (#13395) * llama : do not crash if there is no CPU backend * add checks to examples	2025-05-09 13:02:07 +02:00
Concedo	2439014a03	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # examples/embedding/embedding.cpp # tools/imatrix/imatrix.cpp # tools/perplexity/perplexity.cpp	2025-05-08 23:41:02 +08:00
welix	0ccc121354	mtmd : fix the calculation of n_tokens for smolvlm (#13381) Co-authored-by: Taichi Nishimura <Taichi.A.Nishimura@sony.com>	2025-05-08 15:03:53 +02:00
Concedo	38b3bffcef	Merge branch 'upstream' into concedo_experimental # Conflicts: # CMakePresets.json # ggml/src/ggml-cuda/CMakeLists.txt # tests/test-sampling.cpp # tools/mtmd/clip.cpp	2025-05-07 19:47:44 +08:00
Xuan-Son Nguyen	32916a4907	clip : refactor graph builder (#13321) * mtmd : refactor graph builder * fix qwen2vl * clean up siglip cgraph * pixtral migrated * move minicpmv to a dedicated build function * move max_feature_layer to build_llava * use build_attn for minicpm resampler * fix windows build * add comment for batch_size * also support tinygemma3 test model * qwen2vl does not use RMS norm * fix qwen2vl norm (2)	2025-05-06 22:40:24 +02:00
Concedo	0fa435b2a6	Merge commit '9b61acf060' into concedo_experimental # Conflicts: # Makefile # docs/multimodal/MobileVLM.md # docs/multimodal/glmedge.md # docs/multimodal/llava.md # docs/multimodal/minicpmo2.6.md # docs/multimodal/minicpmv2.5.md # docs/multimodal/minicpmv2.6.md # requirements/requirements-all.txt # tools/mtmd/CMakeLists.txt # tools/mtmd/README.md # tools/mtmd/android/adb_run.sh # tools/mtmd/android/build_64.sh # tools/mtmd/clip-quantize-cli.cpp	2025-05-06 23:34:21 +08:00
Xuan-Son Nguyen	9b61acf060	mtmd : rename llava directory to mtmd (#13311) * mv llava to mtmd * change ref everywhere	2025-05-05 16:02:55 +02:00

15 commits