koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-05-07 09:02:04 +00:00

Author	SHA1	Message	Date
Concedo	03adb90dc6	prompt command done	2024-08-07 20:52:28 +08:00
Concedo	c7108742f4	fix typo	2024-08-06 17:24:58 +08:00
henk717	0d534d810f	Mac builds (#1037 ) * OSX attempt 1 * OSX Pyinstaller * Update kcpp-build-release-osx.yaml * Update kcpp-build-release-osx.yaml * Update kcpp-build-release-osx.yaml * Add .metal file * Update kcpp-build-release-osx.yaml * Polish Mac (cherry picked from commit `52cc0daa1b`)	2024-08-06 17:11:19 +08:00
Concedo	a84f7c5d81	revert num old cpu for ci	2024-07-25 13:24:34 +08:00
Concedo	e28c42d7f7	adjusted layer estimation	2024-07-24 21:54:49 +08:00
Concedo	44ef87f14c	update lite, try fix ci	2024-07-24 16:31:34 +08:00
Concedo	8412946b9f	fix oldcpu build avx1	2024-07-15 23:42:22 +08:00
Concedo	21179d675b	try ci for avx1, up ver (+2 squashed commit) Squashed commit: [74150175] up version [97b6163c] try ci for avx1 linux	2024-07-15 23:07:07 +08:00
Concedo	1a6855f597	Merge branch 'concedo_experimental' into concedo	2024-07-15 00:02:50 +08:00
Concedo	2cad736260	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/nix/package.nix # .github/labeler.yml # .gitignore # CMakeLists.txt # Makefile # Package.swift # README.md # ci/run.sh # docs/build.md # examples/CMakeLists.txt # flake.lock # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # grammars/README.md # requirements/requirements-convert_hf_to_gguf.txt # requirements/requirements-convert_hf_to_gguf_update.txt # scripts/check-requirements.sh # scripts/compare-llama-bench.py # scripts/gen-unicode-data.py # scripts/sync-ggml-am.sh # scripts/sync-ggml.last # scripts/sync-ggml.sh # tests/test-backend-ops.cpp # tests/test-chat-template.cpp # tests/test-tokenizer-random.py	2024-07-11 16:36:16 +08:00
LostRuins Concedo	cc133401db	Update issue templates (#986 )	2024-07-10 11:36:00 +08:00
Alberto Cabrera Pérez	a130eccef4	labeler : updated sycl to match docs and code refactor (#8373 )	2024-07-08 22:35:17 +02:00
compilade	3fd62a6b1c	py : type-check all Python scripts with Pyright (#8341 ) * py : type-check all Python scripts with Pyright * server-tests : use trailing slash in openai base_url * server-tests : add more type annotations * server-tests : strip "chat" from base_url in oai_chat_completions * server-tests : model metadata is a dict * ci : disable pip cache in type-check workflow The cache is not shared between branches, and it's 250MB in size, so it would become quite a big part of the 10GB cache limit of the repo. * py : fix new type errors from master branch * tests : fix test-tokenizer-random.py Apparently, gcc applies optimisations even when pre-processing, which confuses pycparser. * ci : only show warnings and errors in python type-check The "information" level otherwise has entries from 'examples/pydantic_models_to_grammar.py', which could be confusing for someone trying to figure out what failed, considering that these messages can safely be ignored even though they look like errors.	2024-07-07 15:04:39 -04:00
Concedo	ecec9fb478	add target for oldcpu cuda (cherry picked from commit `572aba8e9c`)	2024-07-06 00:40:23 +08:00
Concedo	572aba8e9c	add target for oldcpu cuda	2024-07-06 00:37:01 +08:00
Clint Herron	07a3fc0608	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258 )	2024-07-02 12:18:10 -04:00
Olivier Chafik	8748d8ac6f	json: attempt to skip slow tests when running under emulator (#8189 )	2024-06-28 18:02:05 +01:00
loonerin	558f44bf83	CI: fix release build (Ubuntu+Mac) (#8170 ) * CI: fix release build (Ubuntu) PR #8006 changes defaults to build shared libs. However, CI for releases expects static builds. * CI: fix release build (Mac) --------- Co-authored-by: loonerin <loonerin@users.noreply.github.com>	2024-06-27 21:01:23 +02:00
slaren	ae5d0f4b89	ci : publish new docker images only when the files change (#8142 )	2024-06-26 21:59:28 +02:00
Georgi Gerganov	f3f65429c4	llama : reorganize source code + improve CMake (#8006 ) * scripts : update sync [no ci] * files : relocate [no ci] * ci : disable kompute build [no ci] * cmake : fixes [no ci] * server : fix mingw build ggml-ci * cmake : minor [no ci] * cmake : link math library [no ci] * cmake : build normal ggml library (not object library) [no ci] * cmake : fix kompute build ggml-ci * make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE ggml-ci * move public backend headers to the public include directory (#8122) * move public backend headers to the public include directory * nix test * spm : fix metal header --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * scripts : fix sync paths [no ci] * scripts : sync ggml-blas.h [no ci] --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-06-26 18:33:02 +03:00
Concedo	c66371fbb0	cu toolkit ver	2024-06-26 12:41:05 +08:00
slaren	dd047b476c	disable docker CI on pull requests (#8110 )	2024-06-25 19:20:06 +02:00
henk717	fdca385cd9	Give the CI builds a recognizable AVX1 name (#937 )	2024-06-25 19:25:50 +08:00
slaren	8cb508d0d5	disable publishing the full-rocm docker image (#8083 )	2024-06-24 08:36:11 +03:00
slaren	b6b9a8e606	fix CI failures (#8066 ) * test-backend-ops : increase cpy max nmse * server ci : disable thread sanitizer	2024-06-23 13:14:45 +02:00
slaren	9c77ec1d74	ggml : synchronize threads using barriers (#7993 )	2024-06-19 15:04:15 +02:00
Georgi Gerganov	a04a953cab	codecov : remove (#8004 )	2024-06-19 13:04:36 +03:00
Concedo	967c1d8df5	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # CMakeLists.txt # Makefile # README-sycl.md # README.md # flake.lock # tests/test-backend-ops.cpp	2024-06-17 15:14:47 +08:00
Georgi Gerganov	c8a82194a8	github : update pr template	2024-06-16 10:46:51 +03:00
olexiyb	f8ec8877b7	ci : fix macos x86 build (#7940 ) In order to use old `macos-latest` we should use `macos-12` Potentially will fix: https://github.com/ggerganov/llama.cpp/issues/6975	2024-06-14 20:28:34 +03:00
Concedo	a8db72eca0	Merge commit '`ef52d1d16a`' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .github/workflows/server.yml # CMakeLists.txt # README.md # flake.lock # grammars/README.md # grammars/json.gbnf # grammars/json_arr.gbnf # tests/test-json-schema-to-grammar.cpp	2024-06-13 18:26:45 +08:00
Olivier Chafik	1c641e6aac	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809 ) * `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew * server: update refs -> llama-server gitignore llama-server * server: simplify nix package * main: update refs -> llama fix examples/main ref * main/server: fix targets * update more names * Update build.yml * rm accidentally checked in bins * update straggling refs * Update .gitignore * Update server-llm.sh * main: target name -> llama-cli * Prefix all example bins w/ llama- * fix main refs * rename {main->llama}-cmake-pkg binary * prefix more cmake targets w/ llama- * add/fix gbnf-validator subfolder to cmake * sort cmake example subdirs * rm bin files * fix llama-lookup-* Makefile rules * gitignore /llama-* * rename Dockerfiles * rename llama\|main -> llama-cli; consistent RPM bin prefixes * fix some missing -cli suffixes * rename dockerfile w/ llama-cli * rename(make): llama-baby-llama * update dockerfile refs * more llama-cli(.exe) * fix test-eval-callback * rename: llama-cli-cmake-pkg(.exe) * address gbnf-validator unused fread warning (switched to C++ / ifstream) * add two missing llama- prefixes * Updating docs for eval-callback binary to use new `llama-` prefix. * Updating a few lingering doc references for rename of main to llama-cli * Updating `run-with-preset.py` to use new binary names. Updating docs around `perplexity` binary rename. * Updating documentation references for lookup-merge and export-lora * Updating two small `main` references missed earlier in the finetune docs. * Update apps.nix * update grammar/README.md w/ new llama-* names * update llama-rpc-server bin name + doc * Revert "update llama-rpc-server bin name + doc" This reverts commit e474ef1df481fd8936cd7d098e3065d7de378930. * add hot topic notice to README.md * Update README.md * Update README.md * rename gguf-split & quantize bins refs in **/tests.sh --------- Co-authored-by: HanClinto <hanclinto@gmail.com>	2024-06-13 00:41:52 +01:00
Deven Mistry	14f83526cd	fix broken link in pr template (#7880 ) [no ci] * fix broken link in pr template * Update pull_request_template.md [no ci] --------- Co-authored-by: Brian <mofosyne@gmail.com>	2024-06-12 02:18:58 +10:00
Brian	6fe42d073f	github: move PR template to .github/ root (#7868 )	2024-06-11 17:43:41 +03:00
slaren	c2ce6c47e4	fix CUDA CI by using a windows-2019 image (#7861 ) * try to fix CUDA ci with --allow-unsupported-compiler * trigger when build.yml changes * another test * try exllama/bdashore3 method * install vs build tools before cuda toolkit * try win-2019	2024-06-11 08:59:20 +03:00
slaren	fd5ea0f897	ci : try win-2019 on server windows test (#7854 )	2024-06-10 15:18:41 +03:00
Nicolás Pérez	57bf62ce7c	docs: Added initial PR template with directions for doc only changes and squash merges [no ci] (#7700 ) This commit adds pull_request_template.md and CONTRIBUTING.md . It focuses on explaining to contributors the need to rate PR complexity level, when to add [no ci] and how to format PR title and descriptions. Co-authored-by: Brian <mofosyne@gmail.com> Co-authored-by: compilade <git@compilade.net>	2024-06-10 01:24:29 +10:00
Concedo	4fddbab024	rename workflows	2024-06-09 19:09:01 +08:00
Concedo	1487a4bc81	add workflow for noavx2 cuda ad hoc build	2024-06-09 19:03:33 +08:00
Georgi Gerganov	554c247caf	ggml : remove OpenCL (#7735 ) ggml-ci	2024-06-04 21:23:20 +03:00
Masaya, Kato	a5735e4426	ggml : use OpenMP as a thread pool (#7606 ) * ggml: Added OpenMP for multi-threads processing * ggml : Limit the number of threads used to avoid deadlock * update shared state n_threads in parallel region * clear numa affinity for main thread even with openmp * enable openmp by default * fix msvc build * disable openmp on macos * ci : disable openmp with thread sanitizer * Update ggml.c Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: slaren <slarengh@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-06-03 17:14:15 +02:00
Concedo	a2e304ed4d	remove issue templates	2024-06-03 22:52:09 +08:00
Concedo	b0a7d1aba6	fixed makefile (+1 squashed commits) Squashed commits: [ef6ddaf5] try fix makefile	2024-06-02 15:21:48 +08:00
Concedo	a97f7d5f91	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/full-cuda.Dockerfile # .devops/full-rocm.Dockerfile # .devops/full.Dockerfile # .devops/main-cuda.Dockerfile # .devops/main-intel.Dockerfile # .devops/main-rocm.Dockerfile # .devops/main.Dockerfile # .devops/server-cuda.Dockerfile # .devops/server-intel.Dockerfile # .devops/server-rocm.Dockerfile # .devops/server.Dockerfile # .devops/tools.sh # .github/workflows/docker.yml # CMakeLists.txt # Makefile # README-sycl.md # README.md # ci/run.sh # llama.cpp # requirements.txt # requirements/requirements-convert-hf-to-gguf-update.txt # requirements/requirements-convert-hf-to-gguf.txt # requirements/requirements-convert-legacy-llama.txt # requirements/requirements-convert-llama-ggml-to-gguf.txt # scripts/check-requirements.sh # scripts/compare-llama-bench.py # scripts/convert-gg.sh # scripts/pod-llama.sh # scripts/sync-ggml-am.sh # scripts/sync-ggml.last # scripts/sync-ggml.sh # tests/CMakeLists.txt # tests/test-backend-ops.cpp # tests/test-tokenizer-0.sh # tests/test-tokenizer-random.py	2024-06-02 12:28:38 +08:00
Brian	e6157f94c8	github: add contact links to issues and convert question into research [no ci] (#7612 )	2024-05-30 21:55:36 +10:00
Meng, Hengyu	3854c9d07f	[SYCL] fix intel docker (#7630 ) * Update main-intel.Dockerfile * workaround for https://github.com/intel/oneapi-containers/issues/70 * reset intel docker in CI * add missed in server	2024-05-30 16:19:08 +10:00
Concedo	4ed9ba7352	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/docker.yml # CMakeLists.txt # Makefile # README.md # flake.lock # tests/test-backend-ops.cpp	2024-05-28 21:57:19 +08:00
Brian	271ff3fc44	github: add refactor to issue template (#7561 ) * github: add refactor issue template [no ci] * Update 07-refactor.yml	2024-05-28 20:27:27 +10:00
Brian	d6ef0e77dd	github: add self sorted issue ticket forms (#7543 ) * github: add self sorted issue ticket forms [no ci] * github: consolidate BSD in bug issue ticket * github: remove contact from bug ticket template [no ci] * github: remove bios from os dropdown in bug report [no ci]	2024-05-27 10:54:30 +10:00
Brian	3cbd23ed88	labeler: added Apple Metal detector (+Kompute) (#7529 ) * labeler: added Apple Metal detector [no ci] * labeler: add Kompute to detector [no ci]	2024-05-25 19:30:42 +10:00

271 commits