Commit graph

16 commits

Author SHA1 Message Date
Concedo
70aee82552 attempts a backflip, but does he stick the landing? 2024-11-16 17:05:45 +08:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries (#10256)
* ggml : build backends as libraries

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00
Concedo
e692a79aab Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/docker.yml
#	CMakeLists.txt
#	CONTRIBUTING.md
#	docs/android.md
#	docs/docker.md
#	examples/embedding/embedding.cpp
#	examples/imatrix/imatrix.cpp
#	examples/infill/infill.cpp
#	examples/llama-bench/llama-bench.cpp
#	examples/main/README.md
#	examples/parallel/parallel.cpp
#	examples/perplexity/perplexity.cpp
#	examples/quantize-stats/quantize-stats.cpp
#	examples/save-load-state/save-load-state.cpp
#	examples/server/README.md
#	examples/simple/CMakeLists.txt
#	examples/speculative/speculative.cpp
#	flake.lock
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-blas.cpp
#	pocs/vdot/q8dot.cpp
#	pocs/vdot/vdot.cpp
#	scripts/debug-test.sh
#	scripts/sync-ggml.last
#	src/llama.cpp
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-quantize-fns.cpp
#	tests/test-quantize-perf.cpp
#	tests/test-tokenizer-0.cpp
#	tests/test-tokenizer-1-bpe.cpp
#	tests/test-tokenizer-1-spm.cpp
2024-10-11 11:59:59 +08:00
Georgi Gerganov
d5ac8cf2f2
ggml : add metal backend registry / device (#9713)
* ggml : add metal backend registry / device

ggml-ci

* metal : fix names [no ci]

* metal : global registry and device instances

ggml-ci

* cont : alternative initialization of global objects

ggml-ci

* llama : adapt to backend changes

ggml-ci

* fixes

* metal : fix indent

* metal : fix build when MTLGPUFamilyApple3 is not available

ggml-ci

* fix merge

* metal : avoid unnecessary singleton accesses

ggml-ci

* metal : minor fix [no ci]

* metal : g_state -> g_ggml_ctx_dev_main [no ci]

* metal : avoid reference of device context in the backend context

ggml-ci

* metal : minor [no ci]

* metal : fix maxTransferRate check

* metal : remove transfer rate stuff

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-10-07 18:27:51 +03:00
Concedo
da6cf261a8 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/close-issue.yml
#	.github/workflows/nix-ci-aarch64.yml
#	.github/workflows/nix-ci.yml
#	README.md
#	ci/run.sh
#	examples/server/README.md
#	ggml/src/ggml-cuda.cu
#	ggml/src/ggml-metal.m
#	scripts/sync-ggml.last
#	tests/test-backend-ops.cpp
2024-10-05 22:24:08 +08:00
Concedo
3e1cbedbae Merge commit 'c83ad6d01e' into concedo_experimental
# Conflicts:
#	.github/workflows/bench.yml.disabled
#	Makefile
#	Package.swift
#	README.md
#	docs/backend/SYCL.md
#	examples/CMakeLists.txt
#	examples/benchmark/benchmark-matmult.cpp
#	ggml/src/CMakeLists.txt
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.sh
#	src/llama.cpp
#	tests/test-backend-ops.cpp
2024-10-05 22:17:33 +08:00
bandoti
d6fe7abf04
ggml: unify backend logging mechanism (#9709)
* Add scaffolding for ggml logging macros

* Metal backend now uses GGML logging

* Cuda backend now uses GGML logging

* Cann backend now uses GGML logging

* Add enum tag to parameters

* Use C memory allocation funcs

* Fix compile error

* Use GGML_LOG instead of GGML_PRINT

* Rename llama_state to llama_logger_state

* Prevent null format string

* Fix whitespace

* Remove log callbacks from ggml backends

* Remove cuda log statement
2024-10-03 17:39:03 +02:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces (#9707)
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
Concedo
ce7f9c9a2c Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/full-rocm.Dockerfile
#	.devops/llama-cli-rocm.Dockerfile
#	.devops/llama-server-rocm.Dockerfile
#	.github/workflows/build.yml
#	.github/workflows/python-type-check.yml
#	CMakeLists.txt
#	CONTRIBUTING.md
#	README.md
#	ci/run.sh
#	examples/embedding/embedding.cpp
#	examples/server/README.md
#	flake.lock
#	ggml/include/ggml.h
#	ggml/src/ggml.c
#	requirements/requirements-convert_legacy_llama.txt
#	scripts/sync-ggml.last
#	src/llama-vocab.cpp
#	src/llama.cpp
#	tests/test-backend-ops.cpp
#	tests/test-grad0.cpp
#	tests/test-tokenizer-0.cpp
2024-10-02 01:00:57 +08:00
Georgi Gerganov
cad341d889
metal : reduce command encoding overhead (#9698)
* metal : reduce command encoding overhead

ggml-ci

* metal : add comments
2024-10-01 16:00:25 +03:00
Concedo
bdfe8526b8 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.gitignore
#	CONTRIBUTING.md
#	Makefile
#	examples/llava/CMakeLists.txt
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.last
#	scripts/sync-ggml.sh
#	src/llama-vocab.cpp
2024-08-10 11:42:32 +08:00
Conrad Kramer
85fca8deb6
metal : add abort callback (ggml/905) 2024-08-08 13:19:30 +03:00
Concedo
5b605d03ea Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/ISSUE_TEMPLATE/config.yml
#	.gitignore
#	CMakeLists.txt
#	CONTRIBUTING.md
#	Makefile
#	README.md
#	ci/run.sh
#	common/common.h
#	examples/main-cmake-pkg/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	models/ggml-vocab-bert-bge.gguf.inp
#	models/ggml-vocab-bert-bge.gguf.out
#	models/ggml-vocab-deepseek-coder.gguf.inp
#	models/ggml-vocab-deepseek-coder.gguf.out
#	models/ggml-vocab-deepseek-llm.gguf.inp
#	models/ggml-vocab-deepseek-llm.gguf.out
#	models/ggml-vocab-falcon.gguf.inp
#	models/ggml-vocab-falcon.gguf.out
#	models/ggml-vocab-gpt-2.gguf.inp
#	models/ggml-vocab-gpt-2.gguf.out
#	models/ggml-vocab-llama-bpe.gguf.inp
#	models/ggml-vocab-llama-bpe.gguf.out
#	models/ggml-vocab-llama-spm.gguf.inp
#	models/ggml-vocab-llama-spm.gguf.out
#	models/ggml-vocab-mpt.gguf.inp
#	models/ggml-vocab-mpt.gguf.out
#	models/ggml-vocab-phi-3.gguf.inp
#	models/ggml-vocab-phi-3.gguf.out
#	models/ggml-vocab-starcoder.gguf.inp
#	models/ggml-vocab-starcoder.gguf.out
#	requirements.txt
#	requirements/requirements-convert_legacy_llama.txt
#	scripts/check-requirements.sh
#	scripts/pod-llama.sh
#	src/CMakeLists.txt
#	src/llama.cpp
#	tests/test-rope.cpp
2024-07-06 00:25:10 +08:00
Clint Herron
07a3fc0608
Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258) 2024-07-02 12:18:10 -04:00
Concedo
9c10486204 merge the file structure refactor, testing 2024-06-29 12:14:38 +08:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
* scripts : update sync [no ci]

* files : relocate [no ci]

* ci : disable kompute build [no ci]

* cmake : fixes [no ci]

* server : fix mingw build

ggml-ci

* cmake : minor [no ci]

* cmake : link math library [no ci]

* cmake : build normal ggml library (not object library) [no ci]

* cmake : fix kompute build

ggml-ci

* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE

ggml-ci

* move public backend headers to the public include directory (#8122)

* move public backend headers to the public include directory

* nix test

* spm : fix metal header

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* scripts : fix sync paths [no ci]

* scripts : sync ggml-blas.h [no ci]

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00
Renamed from ggml-metal.h (Browse further)