koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-11 09:34:37 +00:00

Author	SHA1	Message	Date
Concedo	17360a3b32	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # examples/llava/clip.cpp	2025-04-20 17:59:58 +08:00
Xuan-Son Nguyen	37b9f0d29d	clip : refactor, add `image_manipulation` and `llava_uhd` classes (#13011 ) * clip : refactor, add `image_manipulation` and `llava_uhd` * refactor llava-1.6 preprocessing * simplify logic for llava-1.5 * missing include	2025-04-19 09:15:45 +02:00
Concedo	95d1aaf4d4	Merge branch 'upstream' into concedo_experimental # Conflicts: # examples/rpc/rpc-server.cpp # ggml/src/ggml-rpc/ggml-rpc.cpp # ggml/src/ggml-sycl/backend.hpp # ggml/src/ggml-sycl/common.hpp # ggml/src/ggml-sycl/element_wise.cpp # ggml/src/ggml-sycl/element_wise.hpp # ggml/src/ggml-sycl/ggml-sycl.cpp # requirements/requirements-all.txt	2025-04-19 13:17:13 +08:00
Daniel Tang	6408210082	main : Fix Ctrl+D/newline handling (#12951 ) This restores the behavior from #491. This does not affect Ctrl+D's ability to terminate --multiline-input lines (#1040). This also actually implements #587: "If the user wants the text to end in a newline, this should be accomplished by explicitly adding a newline by using \ followed by return, then returning control by pressing return again." Fixes #12949	2025-04-18 22:02:55 +02:00
Xuan-Son Nguyen	35370ba945	server : use std::move whenever possible (#12936 ) * server : use std::move whenever possible * use r-value ref * Apply suggestions from code review Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * make task creation scoped * restore std::move * fix task_id not set correctly * apply changes from suggestion Co-authored-by: ggerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-04-18 19:58:12 +02:00
Xuan-Son Nguyen	b9154ecff9	mtmd : add methods to access `mtmd_image_tokens` (#12906 ) * mtmd : add more api around mtmd_image_tokens * mtmd : ability to calc image hash * shared_ptr for mtmd_image_tokens * move hash to user-define ID (fixed) * fix prompt_modified * rm redundant data member	2025-04-18 10:04:51 +02:00
Radoslav Gerganov	2db9ba1464	rpc : add RPC_CMD_HELLO (#12955 ) Add RPC_CMD_HELLO for getting the version of the protocol implemented by the server. Follow the semantic versioning rules at https://semver.org Hopefully this brings a better user experience when we make breaking changes at the protocol level and avoids issues like #12465	2025-04-18 10:13:42 +03:00
Concedo	06159939d9	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # Makefile # docs/build.md # examples/rpc/rpc-server.cpp # examples/sycl/build.sh # ggml/CMakeLists.txt # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/ggml-cann.cpp # ggml/src/ggml-hip/CMakeLists.txt # scripts/sync-ggml.last	2025-04-17 00:52:37 +08:00
Russyyds	d6d2c2ab8c	Add performance print for gemma3 in example (#12929 )	2025-04-14 19:18:20 +02:00
Neo Zhang Jianyu	81c7e64fc2	disable curl lib check, this action is missed by commit `bd3f59f812` (#12761 ) (#12937 )	2025-04-14 18:19:07 +08:00
Ed Addario	71e90e8813	quantize: Handle user-defined quantization levels for additional tensors (#12511 ) * Add llama_model_quantize_params parameters * Add new quantize parameters parsing and validation * Update usage * Add new parameters defaults * Add new quantization parameters logic * Add llama_model_quantize_params parameters * Add new quantize parameters parsing and validation * Update usage * Add new parameters defaults * Add new quantization parameters logic * Minor refactoring as per the contributors' coding guidelines * Update descriptions to match existing style * Add llama_model_quantize_params parameters * Add new quantize parameters parsing and validation * Update usage * Add new parameters defaults * Add new quantization parameters logic * Minor refactoring as per the contributors' guidelines * Implement general --tensor-type instead of tensor-specific command option * Fix implied type bug * Restore missing #includes * Add regex capability for tensor selection * Refactor function name and update ALLOWED_TENSOR_TYPE * Add missing #include * Handle edge case when tensor name is cls.output * Minor logging improvement	2025-04-13 21:29:28 +03:00
Prajwal B Mehendarkar	bc091a4dc5	common : Define cache directory on AIX (#12915 )	2025-04-12 17:33:39 +02:00
Concedo	a6149ad0fc	fixed g3 adapter back	2025-04-12 23:17:54 +08:00
Concedo	9f94f62768	fixed segfault	2025-04-12 19:08:27 +08:00
Concedo	7b4254bef9	not working on cpu	2025-04-12 18:55:29 +08:00
Matt Clayton	e59ea539b8	llava: Fix cpu-only clip image encoding segfault (#12907 ) * llava: Fix cpu-only clip image encoding * clip : no smart ptr for ggml_backend_t * Fix for backend_ptr push_back --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-04-12 07:29:03 +02:00
Concedo	7e1289ade8	fixes for sdcpp	2025-04-12 10:08:23 +08:00
Concedo	a0ae187563	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/docker.yml # README.md # build-xcframework.sh # examples/llava/CMakeLists.txt # examples/llava/clip.cpp # examples/rpc/rpc-server.cpp # examples/run/run.cpp # ggml/src/ggml-cann/ggml-cann.cpp # scripts/sync-ggml-am.sh # scripts/sync-ggml.last # tests/test-backend-ops.cpp # tests/test-chat.cpp	2025-04-12 10:06:47 +08:00
Concedo	ea9bd61e47	Merge commit '`64eda5deb9`' into concedo_experimental # Conflicts: # .devops/cuda.Dockerfile # .devops/intel.Dockerfile # .devops/llama-cli-cann.Dockerfile # .devops/musa.Dockerfile # .devops/rocm.Dockerfile # .devops/vulkan.Dockerfile # .github/workflows/build.yml # .github/workflows/docker.yml # README.md # docs/backend/SYCL.md # examples/llava/clip.cpp # examples/server_embd.py # ggml/src/ggml-cann/acl_tensor.cpp # ggml/src/ggml-cann/aclnn_ops.cpp # ggml/src/ggml-cann/aclnn_ops.h # ggml/src/ggml-cann/ggml-cann.cpp # src/CMakeLists.txt # tests/test-chat-template.cpp	2025-04-12 08:31:22 +08:00
Georgi Gerganov	c94085df28	server : add VSCode's Github Copilot Chat support (#12896 ) * server : add VSCode's Github Copilot Chat support * cont : update handler name	2025-04-11 23:37:41 +03:00
yuri@FreeBSD	e8a62631b3	rpc : Set cache directory in rpc-server.cpp on FreeBSD (#12903 )	2025-04-11 22:04:14 +02:00
tastelikefeet	b2034c2b55	contrib: support modelscope community (#12664 ) * support download from modelscope * support login * remove comments * add arguments * fix code * fix win32 * test passed * fix readme * revert readme * change to MODEL_ENDPOINT * revert tail line * fix readme * refactor model endpoint * remove blank line * fix header * fix as comments * update comment * update readme --------- Co-authored-by: tastelikefeet <yuze.zyz@alibaba-inc/com>	2025-04-11 14:01:56 +02:00
Xuan-Son Nguyen	0c50923944	clip : use smart pointer (⚠️ breaking change) (#12869 ) * clip : use smart pointers * fix warmup * add forward declaration * missing include * fix include (2) * composite * simplify batch ptr * fix conflict	2025-04-11 12:09:39 +02:00
Xuan-Son Nguyen	8b9cc7cdd8	llava : introduce libmtmd (#12849 ) * wip llava2 * migrated gemma3 to llava2 * add timings * correct pre/postfix * fix missing include * fix compilation unused var warn * update llava2_tokenize * change name llava2 --> mtmd * improve api * refine helpers * Update examples/llava/mtmd.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-04-10 22:57:16 +02:00
Plamen Minev	381603a775	ci: detach common from the library (#12827 ) * fix: detach common from the library * fix: building chat test template	2025-04-09 10:11:11 +02:00
Xuan-Son Nguyen	65a69e6e1b	clip : do not print ftype (#12832 )	2025-04-09 10:09:53 +02:00
Matt Clayton	b32efad2bc	llava: improve clip_ctx destructor to not memleak load_image_size (#12834 )	2025-04-08 22:01:58 +02:00
Georgi Gerganov	a19b5cef16	llama : fix FA when KV cache is not used (i.e. embeddings) (#12825 ) * ggml : FA supports F32 V * graph : cast KV to F16 when the KV cache is not used ggml-ci * server : add test that exercises embeddings with FA enabled ggml-ci	2025-04-08 19:54:51 +03:00
Xuan-Son Nguyen	78a1ba0a4f	server : fix thread.join() on exit (#12831 )	2025-04-08 18:37:06 +02:00
dm4	2dabf759e7	llava: add more helper functions to check projector types in clip context (#12824 ) Signed-off-by: dm4 <sunrisedm4@gmail.com>	2025-04-08 15:49:13 +02:00
Concedo	ebf924c5d1	Merge branch 'upstream' into concedo_experimental	2025-04-08 21:46:30 +08:00
Concedo	88660dd59d	merged qwen2.5vl again	2025-04-08 21:32:25 +08:00
Concedo	822cf2430e	Merge commit '`f1e3eb4249`' into concedo_experimental # Conflicts: # .github/workflows/build.yml # README.md # docs/backend/SYCL.md # examples/llava/clip.cpp # ggml/src/ggml-sycl/CMakeLists.txt # ggml/src/ggml-vulkan/cmake/host-toolchain.cmake.in	2025-04-08 20:48:53 +08:00
Concedo	c58e9a2be3	revert q2.5vl before merge (+1 squashed commits) Squashed commits: [3197ea95] Revert "add tentative support for qwen2.5vl vision from HimariO fork" This reverts commit `911669087a`.	2025-04-08 20:38:41 +08:00
characharm	8ca6e1c3a4	server : webui : Improve Chat Input with Auto-Sizing Textarea (#12785 ) * Update ChatScreen.tsx * useAutosizeTextarea.ts useAutosizeTextarea to encapsulate the logic. * Implement responsive auto-sizing chat textarea Replaces the manual textarea resizing with an automatic height adjustment based on content. - `useChatTextarea` hook to manage textarea state and auto-sizing logic via refs, preserving the optimization - Textarea now grows vertically up to a maximum height (`lg:max-h-48`) on large screens (lg breakpoint and up). - Disables auto-sizing and enables manual vertical resizing (`resize-vertical`) on smaller screens for better mobile usability. - Aligns the "Send" button to the bottom of the textarea (`items-end`) for consistent positioning during resize. * -update compressed index.html.gz after npm run build -refactor: replace OptimizedTextareaValue with AutosizeTextareaApi in VSCode context hook * chore: normalize line endings to LF refactor: AutosizeTextareaApi -> chatTextareaApi * refactor: Rename interface to PascalCase --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-04-08 11:14:59 +02:00
stduhpf	4ccea213bc	hellaswag: display estimated score confidence interval (#12797 )	2025-04-07 18:47:08 +03:00
HimariO	b28ad7ecca	fix attn weight scaling after rebase	2025-04-07 22:07:56 +08:00
HimariO	223edef897	remove commented-out code blocks	2025-04-07 21:52:37 +08:00
HimariO	dde96b4774	remove not so often use `qwen2vl-cli` debug functions	2025-04-07 21:52:37 +08:00
HimariO	8fcf682b28	ignore transformers Qwen2_5_xxx type check	2025-04-07 21:52:37 +08:00
HimariO	fdae70a832	cleaning up	2025-04-07 21:52:37 +08:00
HimariO	c891300c1e	move position id remap out of ggml to avoid int32 cuda operations	2025-04-07 21:52:37 +08:00
HimariO	e18f6a3238	fix few incorrect tensor memory layout	2025-04-07 21:52:37 +08:00
HimariO	ecd673f0c5	add debug utils	2025-04-07 21:51:18 +08:00
HimariO	9c827814e6	handle window attention inputs	2025-04-07 21:51:18 +08:00
HimariO	9c7cc6de9c	implement vision model architecture, gguf convertor	2025-04-07 21:46:06 +08:00
Concedo	a3f7de7142	fixed outetts docs	2025-04-07 21:31:43 +08:00
Xuan-Son Nguyen	bd3f59f812	cmake : enable curl by default (#12761 ) * cmake : enable curl by default * no curl if no examples * fix build * fix build-linux-cross * add windows-setup-curl * fix * shell * fix path * fix windows-latest-cmake* * run: include_directories * LLAMA_RUN_EXTRA_LIBS * sycl: no llama_curl * no test-arg-parser on windows * clarification * try riscv64 / arm64 * windows: include libcurl inside release binary * add msg * fix mac / ios / android build * will this fix xcode? * try clearing the cache * add bunch of licenses * revert clear cache * fix xcode * fix xcode (2) * fix typo	2025-04-07 13:35:19 +02:00
Concedo	5edbacdd0e	fix tools (+3 squashed commit) Squashed commit: [95a489ee] fix tools build [1d3d3451] add accelerate [`2837705c`] edit a line	2025-04-06 21:30:48 +08:00
Sergey Fedorov	f1e3eb4249	common : fix includes in arg.cpp and gemma3-cli.cpp (#12766 ) * arg.cpp: add a missing include * gemma3-cli.cpp: fix cinttypes include	2025-04-05 17:46:00 +02:00

1703 commits