Asghar Ghorbani
994cfb1acb
readme : update UI list (#9972)
add PocketPal AI app
2024-10-21 21:20:59 +03:00
Loïc Carrère
45f097645e
readme : update bindings list (#9951)
Update the bindings list by adding LM-Kit.NET (C# & VB.NET)
2024-10-20 19:25:41 +03:00
icppWorld
7cab2083c7
readme : update infra list (#9942)
llama_cpp_canister allows you to run llama.cpp as a Smart Contract on the Internet Computer. The smart contract runs as WebAssembly in a so-called 'canister'.
2024-10-20 19:01:34 +03:00
Ma Mingfei
60ce97c9d8
add amx kernel for gemm (#8998)
add intel amx isa detection
add vnni kernel for gemv cases
add vnni and amx kernel support for block_q8_0
code cleanup
fix packing B issue
enable openmp
fine tune amx kernel
switch to aten parallel pattern
add error message for nested parallelism
code cleanup
add f16 support in ggml-amx
add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS
update CMakeList
update README
fix some compilation warnings
fix compiler warning when amx is not enabled
minor change
ggml-ci
move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp
ggml-ci
update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16
ggml-ci
add amx as a ggml-backend
update header file: the old path for immintrin.h has changed to ggml-cpu-impl.h
minor change
update CMakeLists.txt
minor change
apply weight prepacking in set_tensor method in ggml-backend
fix compile error
ggml-ci
minor change
ggml-ci
update CMakeLists.txt
ggml-ci
add march dependency
minor change
ggml-ci
change ggml_backend_buffer_is_host to return false for amx backend
ggml-ci
fix supports_op
use device reg for AMX backend
ggml-ci
minor change
ggml-ci
minor change
fix rebase
set .buffer_from_host_ptr to be false for AMX backend
2024-10-18 13:34:36 +08:00
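The "intel amx isa detection" step in the commit above reduces to a CPUID check plus, on Linux, a kernel permission request before any tile instruction may run. A minimal stand-alone sketch of that technique follows; it is an illustration based on the Intel SDM and Linux UAPI constants, not the actual ggml-amx code:

```cpp
// Sketch of AMX ISA detection (Linux, x86-64, GCC/Clang). Not the real
// ggml-amx code; the CPUID bits are from the Intel SDM, the arch_prctl
// constants from the Linux UAPI headers.
#include <cpuid.h>
#include <cstdio>
#include <sys/syscall.h>
#include <unistd.h>

// CPUID leaf 7, subleaf 0, EDX: bit 22 = AMX-BF16, 24 = AMX-TILE, 25 = AMX-INT8.
static bool cpu_has_amx_tile() {
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return false;
    }
    return (edx >> 24) & 1;
}

// Linux requires the process to request the AMX tile-data XSAVE state
// before executing any AMX instruction.
static bool request_amx_permission() {
    const long ARCH_REQ_XCOMP_PERM = 0x1023;
    const long XFEATURE_XTILEDATA  = 18;
    return syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) == 0;
}

int main() {
    const bool ok = cpu_has_amx_tile() && request_amx_permission();
    printf("AMX %savailable\n", ok ? "" : "not ");
    return 0;
}
```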
Tim Wang
3752217ed5
readme : update bindings list (#9918)
Co-authored-by: Tim Wang <tim.wang@ing.com>
2024-10-17 09:57:14 +03:00
Michał Tuszyński
4c42f93b22
readme : update bindings list (#9889)
2024-10-15 11:20:34 +03:00
R0CKSTAR
943d20b411
musa : update doc (#9856)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-10-12 08:09:53 +03:00
Viet-Anh NGUYEN (Andrew)
71967c2a6d
Add Llama Assistant (#9744)
2024-10-04 20:29:35 +02:00
Paweł Wodnicki
3f1ae2e32c
Update README.md (#9591)
Add Bielik model.
2024-10-01 19:18:46 +02:00
Georgi Gerganov
589b48d41e
contrib : add Resources section (#9675)
2024-09-29 14:38:18 +03:00
Aarni Koskela
43bcdd9703
readme : add tool (#9655)
2024-09-28 15:07:14 +03:00
Georgi Gerganov
b5de3b74a5
readme : update hot topics
2024-09-27 20:57:51 +03:00
Concedo
6342b414ea
update readme
2024-09-24 23:04:23 +08:00
Riceball LEE
1d48e98e4f
readme : add programmable prompt engine language CLI (#9599)
2024-09-23 18:58:17 +03:00
Shane A
0aadac10c7
llama : support OLMoE (#9462)
2024-09-16 09:47:37 +03:00
Concedo
de0c96818e
update readme
2024-09-15 21:36:20 +08:00
Concedo
53bf0fb32d
removed the OpenBLAS backend, merged into CPU (with llamafile for BLAS). The GPU backend is now automatically selected when running from the CLI unless noblas is specified.
2024-09-15 19:21:52 +08:00
OSecret
d6b37c881f
readme : update tools list (#9475)
* Added a link to a proprietary wrapper for Unity3d to README.md
The wrapper has a prebuilt library, was tested on the iOS, Android, WebGL, PC, and Mac platforms, and has online demos like [this](https://d23myu0xfn2ttc.cloudfront.net/rich/index.html) and [that](https://d23myu0xfn2ttc.cloudfront.net/).
* Update README.md
Fixes upon review
2024-09-15 10:36:53 +03:00
Faisal Zaghloul
449ccfb6f5
Add Jais to the list of supported models (#9439)
Co-authored-by: fmz <quic_fzaghlou@quic.com>
2024-09-12 02:29:53 +02:00
Georgi Gerganov
38ca6f644b
readme : update hot topics
2024-09-09 15:51:37 +03:00
Antonis Makropoulos
5ed087573e
readme : add LLMUnity to UI projects (#9381)
* add LLMUnity to UI projects
* add newline to examples/rpc/README.md to fix editorconfig-checker unit test
2024-09-09 14:21:38 +03:00
Concedo
27bbdf7d2a
added link for novita AI, added legacy warning for old GGML models
2024-09-09 11:19:32 +08:00
Georgi Gerganov
b69a480af4
readme : refactor API section + remove old hot topics
2024-09-03 10:00:36 +03:00
Younes Belkada
b40eb84895
llama : support for falcon-mamba architecture (#9074)
* feat: initial support for llama.cpp
* fix: lint
* refactor: better refactor
* Update src/llama.cpp
Co-authored-by: compilade <git@compilade.net>
* Update src/llama.cpp
Co-authored-by: compilade <git@compilade.net>
* fix: address comments
* Update convert_hf_to_gguf.py
Co-authored-by: compilade <git@compilade.net>
* fix: add more cleanup and harmonization
* fix: lint
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <git@compilade.net>
* fix: change name
* Apply suggestions from code review
Co-authored-by: compilade <git@compilade.net>
* add in operator
* fix: add `dt_b_c_rms` in `llm_load_print_meta`
* fix: correct printf format for bool
* fix: correct print format
* Update src/llama.cpp
Co-authored-by: compilade <git@compilade.net>
* llama : quantize more Mamba tensors
* llama : use f16 as the fallback of fallback quant types
---------
Co-authored-by: compilade <git@compilade.net>
2024-08-21 11:06:36 +03:00
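One small fix in the commit above, "correct printf format for bool", is worth a note: C's printf has no conversion for bool, so the usual fixes are either relying on the varargs promotion of bool to int with %d, or mapping the value to a string. A minimal illustration (the flag name is borrowed from the commit; this is not the actual diff):

```cpp
#include <cstdio>

int main() {
    bool dt_b_c_rms = true; // flag name borrowed from the commit above

    // In a varargs call a bool promotes to int, so %d is well-defined:
    printf("dt_b_c_rms = %d\n", dt_b_c_rms);

    // Often more readable: convert to a string explicitly.
    printf("dt_b_c_rms = %s\n", dt_b_c_rms ? "true" : "false");
    return 0;
}
```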
wangshuai09
cfac111e2b
cann: add doc for cann backend (#8867)
Co-authored-by: xuedinge233 <damow890@gmail.com>
Co-authored-by: hipudding <huafengchun@gmail.com>
2024-08-19 16:46:38 +08:00
Concedo
314a620e96
added readme for macos
2024-08-18 13:11:49 +08:00
Minsoo Cheong
c679e0cb5c
llama : add EXAONE model support (#9025)
* add exaone model support
* add chat template
* fix whitespace
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* add ftype
* add exaone pre-tokenizer in `llama-vocab.cpp`
Co-Authored-By: compilade <113953597+compilade@users.noreply.github.com>
* fix lint
Co-Authored-By: compilade <113953597+compilade@users.noreply.github.com>
* add `EXAONE` to supported models in `README.md`
* fix space
Co-authored-by: compilade <git@compilade.net>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: compilade <113953597+compilade@users.noreply.github.com>
Co-authored-by: compilade <git@compilade.net>
2024-08-16 09:35:18 +03:00
Frank Mai
84eb2f4fad
docs: introduce gpustack and gguf-parser (#8873)
* readme: introduce gpustack
GPUStack is an open-source GPU cluster manager for running large
language models; it uses llama.cpp as its backend.
Signed-off-by: thxCode <thxcode0824@gmail.com>
* readme: introduce gguf-parser
GGUF Parser is a tool for reviewing/checking GGUF files and estimating
the memory usage without downloading the whole model.
Signed-off-by: thxCode <thxcode0824@gmail.com>
---------
Signed-off-by: thxCode <thxcode0824@gmail.com>
2024-08-12 14:45:50 +02:00
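GGUF Parser can estimate memory usage without the full model because tensor shapes, types, and metadata all sit at the front of a GGUF file. Below is a minimal sketch of reading the fixed header, based on the published GGUF layout (4-byte magic "GGUF", uint32 version, uint64 tensor count, uint64 metadata KV count); this is an illustration, not gguf-parser's actual code, and it assumes a little-endian host:

```cpp
#include <cstdint>
#include <cstdio>

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }
    FILE * f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    char     magic[4];
    uint32_t version   = 0;
    uint64_t n_tensors = 0;
    uint64_t n_kv      = 0;

    // Fixed GGUF header: magic, version, tensor count, metadata KV count.
    if (fread(magic, 1, 4, f) != 4 ||
        fread(&version,   sizeof version,   1, f) != 1 ||
        fread(&n_tensors, sizeof n_tensors, 1, f) != 1 ||
        fread(&n_kv,      sizeof n_kv,      1, f) != 1) {
        fprintf(stderr, "short read\n");
        fclose(f);
        return 1;
    }
    fclose(f);

    if (magic[0] != 'G' || magic[1] != 'G' || magic[2] != 'U' || magic[3] != 'F') {
        fprintf(stderr, "not a GGUF file\n");
        return 1;
    }
    // The tensor descriptors (name, shape, type, offset) follow the KV
    // section, which is enough to estimate memory use without tensor data.
    printf("GGUF v%u: %llu tensors, %llu metadata KVs\n",
           (unsigned) version,
           (unsigned long long) n_tensors,
           (unsigned long long) n_kv);
    return 0;
}
```

A remote-friendly parser can read the same bytes via HTTP range requests instead of fread, which is presumably how the "without downloading the whole model" part is achieved.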
Eric Curtin
b42978e7e4
readme : add ramalama to the available UIs (#8811)
ramalama is a repo-agnostic, boring CLI tool that supports pulling from
Ollama, Hugging Face, and OCI registries.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2024-08-05 15:45:01 +03:00
BarfingLemurs
400ae6f65f
readme : update model list (#8851)
2024-08-05 08:54:10 +03:00
Concedo
23caa63f94
up ver
2024-08-04 23:42:22 +08:00
R0CKSTAR
e54c35e4fb
feat: Support Moore Threads GPU (#8383)
* Update doc for MUSA
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Add GGML_MUSA in Makefile
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Add GGML_MUSA in CMake
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* CUDA => MUSA
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* MUSA adds support for __vsubss4
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Fix CI build failure
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-07-28 01:41:25 +02:00
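The "__vsubss4" item above refers to a CUDA SIMD intrinsic: saturating subtraction of four signed bytes packed into a 32-bit value. When a CUDA-compatible stack such as MUSA lacks the intrinsic, a fallback with the same semantics can be supplied. A scalar reference sketch (not the actual llama.cpp or MUSA implementation):

```cpp
#include <cstdint>
#include <cstdio>

// Scalar reference for __vsubss4: subtract four packed signed bytes
// with saturation to [-128, 127].
static uint32_t vsubss4_ref(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int i = 0; i < 4; ++i) {
        int va = (int8_t)(a >> (8 * i));
        int vb = (int8_t)(b >> (8 * i));
        int s  = va - vb;
        if (s >  127) s =  127;
        if (s < -128) s = -128;
        r |= (uint32_t)(uint8_t) s << (8 * i);
    }
    return r;
}

int main() {
    // 0x80 = -128 per byte; subtracting 1 saturates at -128 instead of
    // wrapping around to +127.
    printf("0x%08x\n", vsubss4_ref(0x80808080u, 0x01010101u)); // 0x80808080
    return 0;
}
```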
MorganRO8
68504f0970
readme : update games list (#8673)
Added a link to a game I made that depends on llama
2024-07-24 19:48:00 +03:00
Thorsten Sommer
3a7ac5300a
readme : update UI list [no ci] (#8505)
2024-07-24 15:52:30 +03:00
Georgi Gerganov
be0cfb4175
readme : fix server badge
2024-07-19 14:34:55 +03:00
Concedo
24b9616344
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .devops/full-cuda.Dockerfile
# .devops/full-rocm.Dockerfile
# .devops/full.Dockerfile
# .devops/llama-cli-cuda.Dockerfile
# .devops/llama-cli-intel.Dockerfile
# .devops/llama-cli-rocm.Dockerfile
# .devops/llama-cli-vulkan.Dockerfile
# .devops/llama-cli.Dockerfile
# .devops/llama-server-cuda.Dockerfile
# .devops/llama-server-intel.Dockerfile
# .devops/llama-server-rocm.Dockerfile
# .devops/llama-server-vulkan.Dockerfile
# .devops/llama-server.Dockerfile
# CMakeLists.txt
# CONTRIBUTING.md
# Makefile
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# requirements.txt
# src/llama.cpp
# tests/test-backend-ops.cpp
2024-07-19 14:23:33 +08:00
DontEatOreo
eeecaf442a
docs(README.md): add guide for nix and nixos (#980)
Co-authored-by: LostRuins Concedo <39025047+LostRuins@users.noreply.github.com>
2024-07-10 21:35:41 +08:00
Concedo
6d0f9fdd2a
update section on readme for third party stuff
2024-07-10 18:04:05 +08:00
Andy Salerno
fd560fe680
Update README.md to fix broken link to docs (#8399)
Update the "Performance troubleshooting" doc link to be correct: the file was moved into a directory called 'development'
2024-07-09 14:58:44 -04:00
Concedo
116d5fe58e
updated lite
2024-07-09 20:42:51 +08:00
b4b4o
c4dd11d1d3
readme : fix web link error [no ci] (#8347)
2024-07-08 17:19:24 +03:00
toyer
04ce3a8b19
readme : add supported glm models (#8360)
2024-07-08 08:57:19 +03:00
Andy Tai
f1948f1e10
readme : update bindings list (#8222)
* adding guile_llama_cpp to the bindings list
* fix formatting
* fix formatting
2024-07-07 16:21:37 +03:00
Xuan Son Nguyen
60d83a0149
update main readme (#8333)
2024-07-06 19:01:23 +02:00
Concedo
43b3cf08d8
change default rec
2024-07-06 10:11:07 +08:00
Concedo
5b605d03ea
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .github/ISSUE_TEMPLATE/config.yml
# .gitignore
# CMakeLists.txt
# CONTRIBUTING.md
# Makefile
# README.md
# ci/run.sh
# common/common.h
# examples/main-cmake-pkg/CMakeLists.txt
# ggml/src/CMakeLists.txt
# models/ggml-vocab-bert-bge.gguf.inp
# models/ggml-vocab-bert-bge.gguf.out
# models/ggml-vocab-deepseek-coder.gguf.inp
# models/ggml-vocab-deepseek-coder.gguf.out
# models/ggml-vocab-deepseek-llm.gguf.inp
# models/ggml-vocab-deepseek-llm.gguf.out
# models/ggml-vocab-falcon.gguf.inp
# models/ggml-vocab-falcon.gguf.out
# models/ggml-vocab-gpt-2.gguf.inp
# models/ggml-vocab-gpt-2.gguf.out
# models/ggml-vocab-llama-bpe.gguf.inp
# models/ggml-vocab-llama-bpe.gguf.out
# models/ggml-vocab-llama-spm.gguf.inp
# models/ggml-vocab-llama-spm.gguf.out
# models/ggml-vocab-mpt.gguf.inp
# models/ggml-vocab-mpt.gguf.out
# models/ggml-vocab-phi-3.gguf.inp
# models/ggml-vocab-phi-3.gguf.out
# models/ggml-vocab-starcoder.gguf.inp
# models/ggml-vocab-starcoder.gguf.out
# requirements.txt
# requirements/requirements-convert_legacy_llama.txt
# scripts/check-requirements.sh
# scripts/pod-llama.sh
# src/CMakeLists.txt
# src/llama.cpp
# tests/test-rope.cpp
2024-07-06 00:25:10 +08:00
Xuan Son Nguyen
be20e7f49d
Reorganize documentation pages (#8325)
* re-organize docs
* add links among docs
* add link to build docs
* fix style
* de-duplicate sections
2024-07-05 18:08:32 +02:00
Georgi Gerganov
6c05752c50
contributing : update guidelines (#8316)
2024-07-05 09:09:47 +03:00
Georgi Gerganov
e235b267a2
py : switch to snake_case (#8305)
* py : switch to snake_case
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* cont : fix link
* gguf-py : use snake_case in scripts entrypoint export
* py : rename requirements for convert_legacy_llama.py
Needed for scripts/check-requirements.sh
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-05 07:53:33 +03:00
Mateusz Charytoniuk
dae57a1ebc
readme: add Paddler to the list of projects (#8239)
2024-07-01 20:13:22 +03:00