Concedo
03adb90dc6
prompt command done
2024-08-07 20:52:28 +08:00
Concedo
c7108742f4
fix typo
2024-08-06 17:24:58 +08:00
henk717
0d534d810f
Mac builds (#1037)
...
* OSX attempt 1
* OSX Pyinstaller
* Update kcpp-build-release-osx.yaml
* Update kcpp-build-release-osx.yaml
* Update kcpp-build-release-osx.yaml
* Add .metal file
* Update kcpp-build-release-osx.yaml
* Polish Mac
(cherry picked from commit 52cc0daa1b)
2024-08-06 17:11:19 +08:00
Concedo
a84f7c5d81
revert num old cpu for ci
2024-07-25 13:24:34 +08:00
Concedo
e28c42d7f7
adjusted layer estimation
2024-07-24 21:54:49 +08:00
Concedo
44ef87f14c
update lite, try fix ci
2024-07-24 16:31:34 +08:00
Concedo
8412946b9f
fix oldcpu build avx1
2024-07-15 23:42:22 +08:00
Concedo
21179d675b
try ci for avx1, up ver (+2 squashed commit)
...
Squashed commit:
[74150175] up version
[97b6163c] try ci for avx1 linux
2024-07-15 23:07:07 +08:00
Concedo
1a6855f597
Merge branch 'concedo_experimental' into concedo
2024-07-15 00:02:50 +08:00
Concedo
2cad736260
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/nix/package.nix
# .github/labeler.yml
# .gitignore
# CMakeLists.txt
# Makefile
# Package.swift
# README.md
# ci/run.sh
# docs/build.md
# examples/CMakeLists.txt
# flake.lock
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# grammars/README.md
# requirements/requirements-convert_hf_to_gguf.txt
# requirements/requirements-convert_hf_to_gguf_update.txt
# scripts/check-requirements.sh
# scripts/compare-llama-bench.py
# scripts/gen-unicode-data.py
# scripts/sync-ggml-am.sh
# scripts/sync-ggml.last
# scripts/sync-ggml.sh
# tests/test-backend-ops.cpp
# tests/test-chat-template.cpp
# tests/test-tokenizer-random.py
2024-07-11 16:36:16 +08:00
LostRuins Concedo
cc133401db
Update issue templates (#986)
2024-07-10 11:36:00 +08:00
Alberto Cabrera Pérez
a130eccef4
labeler : updated sycl to match docs and code refactor (#8373)
2024-07-08 22:35:17 +02:00
compilade
3fd62a6b1c
py : type-check all Python scripts with Pyright (#8341)
...
* py : type-check all Python scripts with Pyright
* server-tests : use trailing slash in openai base_url
* server-tests : add more type annotations
* server-tests : strip "chat" from base_url in oai_chat_completions
* server-tests : model metadata is a dict
* ci : disable pip cache in type-check workflow
The cache is not shared between branches, and it's 250MB in size,
so it would become quite a big part of the 10GB cache limit of the repo.
* py : fix new type errors from master branch
* tests : fix test-tokenizer-random.py
Apparently, gcc applies optimisations even when pre-processing,
which confuses pycparser.
* ci : only show warnings and errors in python type-check
The "information" level otherwise has entries
from 'examples/pydantic_models_to_grammar.py',
which could be confusing for someone trying to figure out what failed,
considering that these messages can safely be ignored
even though they look like errors.
2024-07-07 15:04:39 -04:00
Concedo
ecec9fb478
add target for oldcpu cuda
...
(cherry picked from commit 572aba8e9c)
2024-07-06 00:40:23 +08:00
Concedo
572aba8e9c
add target for oldcpu cuda
2024-07-06 00:37:01 +08:00
Clint Herron
07a3fc0608
Removes multiple newlines at the end of files that are breaking the editorconfig step of CI. (#8258)
2024-07-02 12:18:10 -04:00
Olivier Chafik
8748d8ac6f
json: attempt to skip slow tests when running under emulator (#8189)
2024-06-28 18:02:05 +01:00
loonerin
558f44bf83
CI: fix release build (Ubuntu+Mac) (#8170)
...
* CI: fix release build (Ubuntu)
PR #8006 changes defaults to build shared libs. However, CI for releases
expects static builds.
* CI: fix release build (Mac)
---------
Co-authored-by: loonerin <loonerin@users.noreply.github.com>
2024-06-27 21:01:23 +02:00
slaren
ae5d0f4b89
ci : publish new docker images only when the files change (#8142)
2024-06-26 21:59:28 +02:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
...
* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build
ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build
ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
ggml-ci
* move public backend headers to the public include directory (#8122)
* move public backend headers to the public include directory
* nix test
* spm : fix metal header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00
Concedo
c66371fbb0
cu toolkit ver
2024-06-26 12:41:05 +08:00
slaren
dd047b476c
disable docker CI on pull requests (#8110)
2024-06-25 19:20:06 +02:00
henk717
fdca385cd9
Give the CI builds a recognizable AVX1 name (#937)
2024-06-25 19:25:50 +08:00
slaren
8cb508d0d5
disable publishing the full-rocm docker image (#8083)
2024-06-24 08:36:11 +03:00
slaren
b6b9a8e606
fix CI failures (#8066)
...
* test-backend-ops : increase cpy max nmse
* server ci : disable thread sanitizer
2024-06-23 13:14:45 +02:00
slaren
9c77ec1d74
ggml : synchronize threads using barriers (#7993)
2024-06-19 15:04:15 +02:00
Georgi Gerganov
a04a953cab
codecov : remove (#8004)
2024-06-19 13:04:36 +03:00
Concedo
967c1d8df5
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# Makefile
# README-sycl.md
# README.md
# flake.lock
# tests/test-backend-ops.cpp
2024-06-17 15:14:47 +08:00
Georgi Gerganov
c8a82194a8
github : update pr template
2024-06-16 10:46:51 +03:00
olexiyb
f8ec8877b7
ci : fix macos x86 build (#7940)
...
In order to use old `macos-latest` we should use `macos-12`
Potentially will fix: https://github.com/ggerganov/llama.cpp/issues/6975
2024-06-14 20:28:34 +03:00
Concedo
a8db72eca0
Merge commit 'ef52d1d16a' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/server.yml
# CMakeLists.txt
# README.md
# flake.lock
# grammars/README.md
# grammars/json.gbnf
# grammars/json_arr.gbnf
# tests/test-json-schema-to-grammar.cpp
2024-06-13 18:26:45 +08:00
Olivier Chafik
1c641e6aac
build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)
...
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
* server: update refs -> llama-server
gitignore llama-server
* server: simplify nix package
* main: update refs -> llama
fix examples/main ref
* main/server: fix targets
* update more names
* Update build.yml
* rm accidentally checked in bins
* update straggling refs
* Update .gitignore
* Update server-llm.sh
* main: target name -> llama-cli
* Prefix all example bins w/ llama-
* fix main refs
* rename {main->llama}-cmake-pkg binary
* prefix more cmake targets w/ llama-
* add/fix gbnf-validator subfolder to cmake
* sort cmake example subdirs
* rm bin files
* fix llama-lookup-* Makefile rules
* gitignore /llama-*
* rename Dockerfiles
* rename llama|main -> llama-cli; consistent RPM bin prefixes
* fix some missing -cli suffixes
* rename dockerfile w/ llama-cli
* rename(make): llama-baby-llama
* update dockerfile refs
* more llama-cli(.exe)
* fix test-eval-callback
* rename: llama-cli-cmake-pkg(.exe)
* address gbnf-validator unused fread warning (switched to C++ / ifstream)
* add two missing llama- prefixes
* Updating docs for eval-callback binary to use new `llama-` prefix.
* Updating a few lingering doc references for rename of main to llama-cli
* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.
* Updating documentation references for lookup-merge and export-lora
* Updating two small `main` references missed earlier in the finetune docs.
* Update apps.nix
* update grammar/README.md w/ new llama-* names
* update llama-rpc-server bin name + doc
* Revert "update llama-rpc-server bin name + doc"
This reverts commit e474ef1df481fd8936cd7d098e3065d7de378930.
* add hot topic notice to README.md
* Update README.md
* Update README.md
* rename gguf-split & quantize bins refs in **/tests.sh
---------
Co-authored-by: HanClinto <hanclinto@gmail.com>
2024-06-13 00:41:52 +01:00
Deven Mistry
14f83526cd
fix broken link in pr template (#7880) [no ci]
...
* fix broken link in pr template
* Update pull_request_template.md [no ci]
---------
Co-authored-by: Brian <mofosyne@gmail.com>
2024-06-12 02:18:58 +10:00
Brian
6fe42d073f
github: move PR template to .github/ root (#7868)
2024-06-11 17:43:41 +03:00
slaren
c2ce6c47e4
fix CUDA CI by using a windows-2019 image (#7861)
...
* try to fix CUDA ci with --allow-unsupported-compiler
* trigger when build.yml changes
* another test
* try exllama/bdashore3 method
* install vs build tools before cuda toolkit
* try win-2019
2024-06-11 08:59:20 +03:00
slaren
fd5ea0f897
ci : try win-2019 on server windows test (#7854)
2024-06-10 15:18:41 +03:00
Nicolás Pérez
57bf62ce7c
docs: Added initial PR template with directions for doc only changes and squash merges [no ci] (#7700)
...
This commit adds pull_request_template.md and CONTRIBUTING.md. It focuses on explaining to contributors the need to rate PR complexity level, when to add [no ci] and how to format PR title and descriptions.
Co-authored-by: Brian <mofosyne@gmail.com>
Co-authored-by: compilade <git@compilade.net>
2024-06-10 01:24:29 +10:00
Concedo
4fddbab024
rename workflows
2024-06-09 19:09:01 +08:00
Concedo
1487a4bc81
add workflow for noavx2 cuda ad hoc build
2024-06-09 19:03:33 +08:00
Georgi Gerganov
554c247caf
ggml : remove OpenCL (#7735)
...
ggml-ci
2024-06-04 21:23:20 +03:00
Masaya, Kato
a5735e4426
ggml : use OpenMP as a thread pool (#7606)
...
* ggml: Added OpenMP for multi-threads processing
* ggml : Limit the number of threads used to avoid deadlock
* update shared state n_threads in parallel region
* clear numa affinity for main thread even with openmp
* enable openmp by default
* fix msvc build
* disable openmp on macos
* ci : disable openmp with thread sanitizer
* Update ggml.c
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-06-03 17:14:15 +02:00
Concedo
a2e304ed4d
remove issue templates
2024-06-03 22:52:09 +08:00
Concedo
b0a7d1aba6
fixed makefile (+1 squashed commits)
...
Squashed commits:
[ef6ddaf5] try fix makefile
2024-06-02 15:21:48 +08:00
Concedo
a97f7d5f91
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/full-cuda.Dockerfile
# .devops/full-rocm.Dockerfile
# .devops/full.Dockerfile
# .devops/main-cuda.Dockerfile
# .devops/main-intel.Dockerfile
# .devops/main-rocm.Dockerfile
# .devops/main.Dockerfile
# .devops/server-cuda.Dockerfile
# .devops/server-intel.Dockerfile
# .devops/server-rocm.Dockerfile
# .devops/server.Dockerfile
# .devops/tools.sh
# .github/workflows/docker.yml
# CMakeLists.txt
# Makefile
# README-sycl.md
# README.md
# ci/run.sh
# llama.cpp
# requirements.txt
# requirements/requirements-convert-hf-to-gguf-update.txt
# requirements/requirements-convert-hf-to-gguf.txt
# requirements/requirements-convert-legacy-llama.txt
# requirements/requirements-convert-llama-ggml-to-gguf.txt
# scripts/check-requirements.sh
# scripts/compare-llama-bench.py
# scripts/convert-gg.sh
# scripts/pod-llama.sh
# scripts/sync-ggml-am.sh
# scripts/sync-ggml.last
# scripts/sync-ggml.sh
# tests/CMakeLists.txt
# tests/test-backend-ops.cpp
# tests/test-tokenizer-0.sh
# tests/test-tokenizer-random.py
2024-06-02 12:28:38 +08:00
Brian
e6157f94c8
github: add contact links to issues and convert question into research [no ci] (#7612)
2024-05-30 21:55:36 +10:00
Meng, Hengyu
3854c9d07f
[SYCL] fix intel docker (#7630)
...
* Update main-intel.Dockerfile
* workaround for https://github.com/intel/oneapi-containers/issues/70
* reset intel docker in CI
* add missed in server
2024-05-30 16:19:08 +10:00
Concedo
4ed9ba7352
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/docker.yml
# CMakeLists.txt
# Makefile
# README.md
# flake.lock
# tests/test-backend-ops.cpp
2024-05-28 21:57:19 +08:00
Brian
271ff3fc44
github: add refactor to issue template (#7561)
...
* github: add refactor issue template [no ci]
* Update 07-refactor.yml
2024-05-28 20:27:27 +10:00
Brian
d6ef0e77dd
github: add self sorted issue ticket forms (#7543)
...
* github: add self sorted issue ticket forms [no ci]
* github: consolidate BSD in bug issue ticket
* github: remove contact from bug ticket template [no ci]
* github: remove bios from os dropdown in bug report [no ci]
2024-05-27 10:54:30 +10:00
Brian
3cbd23ed88
labeler: added Apple Metal detector (+Kompute) (#7529)
...
* labeler: added Apple Metal detector [no ci]
* labeler: add Kompute to detector [no ci]
2024-05-25 19:30:42 +10:00