Commit graph

271 commits

Author SHA1 Message Date
Concedo
03adb90dc6 prompt command done 2024-08-07 20:52:28 +08:00
Concedo
c7108742f4 fix typo 2024-08-06 17:24:58 +08:00
henk717
0d534d810f Mac builds (#1037)
* OSX attempt 1

* OSX Pyinstaller

* Update kcpp-build-release-osx.yaml

* Update kcpp-build-release-osx.yaml

* Update kcpp-build-release-osx.yaml

* Add .metal file

* Update kcpp-build-release-osx.yaml

* Polish Mac

(cherry picked from commit 52cc0daa1b)
2024-08-06 17:11:19 +08:00
Concedo
a84f7c5d81 revert num old cpu for ci 2024-07-25 13:24:34 +08:00
Concedo
e28c42d7f7 adjusted layer estimation 2024-07-24 21:54:49 +08:00
Concedo
44ef87f14c update lite, try fix ci 2024-07-24 16:31:34 +08:00
Concedo
8412946b9f fix oldcpu build avx1 2024-07-15 23:42:22 +08:00
Concedo
21179d675b try ci for avx1, up ver (+2 squashed commit)
Squashed commit:

[74150175] up version

[97b6163c] try ci for avx1 linux
2024-07-15 23:07:07 +08:00
Concedo
1a6855f597 Merge branch 'concedo_experimental' into concedo 2024-07-15 00:02:50 +08:00
Concedo
2cad736260 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package.nix
#	.github/labeler.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	Package.swift
#	README.md
#	ci/run.sh
#	docs/build.md
#	examples/CMakeLists.txt
#	flake.lock
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	grammars/README.md
#	requirements/requirements-convert_hf_to_gguf.txt
#	requirements/requirements-convert_hf_to_gguf_update.txt
#	scripts/check-requirements.sh
#	scripts/compare-llama-bench.py
#	scripts/gen-unicode-data.py
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.last
#	scripts/sync-ggml.sh
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-tokenizer-random.py
2024-07-11 16:36:16 +08:00
LostRuins Concedo
cc133401db
Update issue templates (#986) 2024-07-10 11:36:00 +08:00
Alberto Cabrera Pérez
a130eccef4
labeler : updated sycl to match docs and code refactor (#8373) 2024-07-08 22:35:17 +02:00
compilade
3fd62a6b1c
py : type-check all Python scripts with Pyright (#8341)
* py : type-check all Python scripts with Pyright

* server-tests : use trailing slash in openai base_url

* server-tests : add more type annotations

* server-tests : strip "chat" from base_url in oai_chat_completions

* server-tests : model metadata is a dict

* ci : disable pip cache in type-check workflow

The cache is not shared between branches, and it's 250MB in size,
so it would become quite a big part of the 10GB cache limit of the repo.

* py : fix new type errors from master branch

* tests : fix test-tokenizer-random.py

Apparently, gcc applies optimisations even when pre-processing,
which confuses pycparser.

* ci : only show warnings and errors in python type-check

The "information" level otherwise has entries
from 'examples/pydantic_models_to_grammar.py',
which could be confusing for someone trying to figure out what failed,
considering that these messages can safely be ignored
even though they look like errors.
2024-07-07 15:04:39 -04:00
Concedo
ecec9fb478 add target for oldcpu cuda
(cherry picked from commit 572aba8e9c)
2024-07-06 00:40:23 +08:00
Concedo
572aba8e9c add target for oldcpu cuda 2024-07-06 00:37:01 +08:00
Clint Herron
07a3fc0608
Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258) 2024-07-02 12:18:10 -04:00
Olivier Chafik
8748d8ac6f
json: attempt to skip slow tests when running under emulator (#8189) 2024-06-28 18:02:05 +01:00
loonerin
558f44bf83
CI: fix release build (Ubuntu+Mac) (#8170)
* CI: fix release build (Ubuntu)

PR #8006 changes defaults to build shared libs. However, CI for releases
expects static builds.

* CI: fix release build (Mac)

---------

Co-authored-by: loonerin <loonerin@users.noreply.github.com>
2024-06-27 21:01:23 +02:00
slaren
ae5d0f4b89
ci : publish new docker images only when the files change (#8142) 2024-06-26 21:59:28 +02:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
* scripts : update sync [no ci]

* files : relocate [no ci]

* ci : disable kompute build [no ci]

* cmake : fixes [no ci]

* server : fix mingw build

ggml-ci

* cmake : minor [no ci]

* cmake : link math library [no ci]

* cmake : build normal ggml library (not object library) [no ci]

* cmake : fix kompute build

ggml-ci

* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE

ggml-ci

* move public backend headers to the public include directory (#8122)

* move public backend headers to the public include directory

* nix test

* spm : fix metal header

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* scripts : fix sync paths [no ci]

* scripts : sync ggml-blas.h [no ci]

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00
Concedo
c66371fbb0 cu toolkit ver 2024-06-26 12:41:05 +08:00
slaren
dd047b476c
disable docker CI on pull requests (#8110) 2024-06-25 19:20:06 +02:00
henk717
fdca385cd9
Give the CI builds a recognizable AVX1 name (#937) 2024-06-25 19:25:50 +08:00
slaren
8cb508d0d5
disable publishing the full-rocm docker image (#8083) 2024-06-24 08:36:11 +03:00
slaren
b6b9a8e606
fix CI failures (#8066)
* test-backend-ops : increase cpy max nmse

* server ci : disable thread sanitizer
2024-06-23 13:14:45 +02:00
slaren
9c77ec1d74
ggml : synchronize threads using barriers (#7993) 2024-06-19 15:04:15 +02:00
Georgi Gerganov
a04a953cab
codecov : remove (#8004) 2024-06-19 13:04:36 +03:00
Concedo
967c1d8df5 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README-sycl.md
#	README.md
#	flake.lock
#	tests/test-backend-ops.cpp
2024-06-17 15:14:47 +08:00
Georgi Gerganov
c8a82194a8
github : update pr template 2024-06-16 10:46:51 +03:00
olexiyb
f8ec8877b7
ci : fix macos x86 build (#7940)
In order to use old `macos-latest` we should use `macos-12`

Potentially will fix: https://github.com/ggerganov/llama.cpp/issues/6975
2024-06-14 20:28:34 +03:00
Concedo
a8db72eca0 Merge commit 'ef52d1d16a' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	CMakeLists.txt
#	README.md
#	flake.lock
#	grammars/README.md
#	grammars/json.gbnf
#	grammars/json_arr.gbnf
#	tests/test-json-schema-to-grammar.cpp
2024-06-13 18:26:45 +08:00
Olivier Chafik
1c641e6aac
build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew

* server: update refs -> llama-server

gitignore llama-server

* server: simplify nix package

* main: update refs -> llama

fix examples/main ref

* main/server: fix targets

* update more names

* Update build.yml

* rm accidentally checked in bins

* update straggling refs

* Update .gitignore

* Update server-llm.sh

* main: target name -> llama-cli

* Prefix all example bins w/ llama-

* fix main refs

* rename {main->llama}-cmake-pkg binary

* prefix more cmake targets w/ llama-

* add/fix gbnf-validator subfolder to cmake

* sort cmake example subdirs

* rm bin files

* fix llama-lookup-* Makefile rules

* gitignore /llama-*

* rename Dockerfiles

* rename llama|main -> llama-cli; consistent RPM bin prefixes

* fix some missing -cli suffixes

* rename dockerfile w/ llama-cli

* rename(make): llama-baby-llama

* update dockerfile refs

* more llama-cli(.exe)

* fix test-eval-callback

* rename: llama-cli-cmake-pkg(.exe)

* address gbnf-validator unused fread warning (switched to C++ / ifstream)

* add two missing llama- prefixes

* Updating docs for eval-callback binary to use new `llama-` prefix.

* Updating a few lingering doc references for rename of main to llama-cli

* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.

* Updating documentation references for lookup-merge and export-lora

* Updating two small `main` references missed earlier in the finetune docs.

* Update apps.nix

* update grammar/README.md w/ new llama-* names

* update llama-rpc-server bin name + doc

* Revert "update llama-rpc-server bin name + doc"

This reverts commit e474ef1df481fd8936cd7d098e3065d7de378930.

* add hot topic notice to README.md

* Update README.md

* Update README.md

* rename gguf-split & quantize bins refs in **/tests.sh

---------

Co-authored-by: HanClinto <hanclinto@gmail.com>
2024-06-13 00:41:52 +01:00
Deven Mistry
14f83526cd
fix broken link in pr template (#7880) [no ci]
* fix broken link in pr template

* Update pull_request_template.md [no ci]

---------

Co-authored-by: Brian <mofosyne@gmail.com>
2024-06-12 02:18:58 +10:00
Brian
6fe42d073f
github: move PR template to .github/ root (#7868) 2024-06-11 17:43:41 +03:00
slaren
c2ce6c47e4
fix CUDA CI by using a windows-2019 image (#7861)
* try to fix CUDA ci with --allow-unsupported-compiler

* trigger when build.yml changes

* another test

* try exllama/bdashore3 method

* install vs build tools before cuda toolkit

* try win-2019
2024-06-11 08:59:20 +03:00
slaren
fd5ea0f897
ci : try win-2019 on server windows test (#7854) 2024-06-10 15:18:41 +03:00
Nicolás Pérez
57bf62ce7c
docs: Added initial PR template with directions for doc only changes and squash merges [no ci] (#7700)
This commit adds pull_request_template.md and CONTRIBUTING.md . It focuses on explaining to contributors the need to rate PR complexity level, when to add [no ci] and how to format PR title and descriptions.

Co-authored-by: Brian <mofosyne@gmail.com>
Co-authored-by: compilade <git@compilade.net>
2024-06-10 01:24:29 +10:00
Concedo
4fddbab024 rename workflows 2024-06-09 19:09:01 +08:00
Concedo
1487a4bc81 add workflow for noavx2 cuda ad hoc build 2024-06-09 19:03:33 +08:00
Georgi Gerganov
554c247caf
ggml : remove OpenCL (#7735)
ggml-ci
2024-06-04 21:23:20 +03:00
Masaya, Kato
a5735e4426
ggml : use OpenMP as a thread pool (#7606)
* ggml: Added OpenMP for multi-threads processing

* ggml : Limit the number of threads used to avoid deadlock

* update shared state n_threads in parallel region

* clear numa affinity for main thread even with openmp

* enable openmp by default

* fix msvc build

* disable openmp on macos

* ci : disable openmp with thread sanitizer

* Update ggml.c

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-06-03 17:14:15 +02:00
Concedo
a2e304ed4d remove issue templates 2024-06-03 22:52:09 +08:00
Concedo
b0a7d1aba6 fixed makefile (+1 squashed commits)
Squashed commits:

[ef6ddaf5] try fix makefile
2024-06-02 15:21:48 +08:00
Concedo
a97f7d5f91 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/full-cuda.Dockerfile
#	.devops/full-rocm.Dockerfile
#	.devops/full.Dockerfile
#	.devops/main-cuda.Dockerfile
#	.devops/main-intel.Dockerfile
#	.devops/main-rocm.Dockerfile
#	.devops/main.Dockerfile
#	.devops/server-cuda.Dockerfile
#	.devops/server-intel.Dockerfile
#	.devops/server-rocm.Dockerfile
#	.devops/server.Dockerfile
#	.devops/tools.sh
#	.github/workflows/docker.yml
#	CMakeLists.txt
#	Makefile
#	README-sycl.md
#	README.md
#	ci/run.sh
#	llama.cpp
#	requirements.txt
#	requirements/requirements-convert-hf-to-gguf-update.txt
#	requirements/requirements-convert-hf-to-gguf.txt
#	requirements/requirements-convert-legacy-llama.txt
#	requirements/requirements-convert-llama-ggml-to-gguf.txt
#	scripts/check-requirements.sh
#	scripts/compare-llama-bench.py
#	scripts/convert-gg.sh
#	scripts/pod-llama.sh
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.last
#	scripts/sync-ggml.sh
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-tokenizer-0.sh
#	tests/test-tokenizer-random.py
2024-06-02 12:28:38 +08:00
Brian
e6157f94c8
github: add contact links to issues and convert question into research [no ci] (#7612) 2024-05-30 21:55:36 +10:00
Meng, Hengyu
3854c9d07f
[SYCL] fix intel docker (#7630)
* Update main-intel.Dockerfile

* workaround for https://github.com/intel/oneapi-containers/issues/70

* reset intel docker in CI

* add missed in server
2024-05-30 16:19:08 +10:00
Concedo
4ed9ba7352 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/docker.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	flake.lock
#	tests/test-backend-ops.cpp
2024-05-28 21:57:19 +08:00
Brian
271ff3fc44
github: add refactor to issue template (#7561)
* github: add refactor issue template [no ci]

* Update 07-refactor.yml
2024-05-28 20:27:27 +10:00
Brian
d6ef0e77dd
github: add self sorted issue ticket forms (#7543)
* github: add self sorted issue ticket forms [no ci]

* github: consolidate BSD in bug issue ticket

* github: remove contact from bug ticket template [no ci]

* github: remove bios from os dropdown in bug report [no ci]
2024-05-27 10:54:30 +10:00
Brian
3cbd23ed88
labeler: added Apple Metal detector (+Kompute) (#7529)
* labeler: added Apple Metal detector [no ci]

* labeler: add Kompute to detector [no ci]
2024-05-25 19:30:42 +10:00