Commit graph

102 commits

Author SHA1 Message Date
Concedo
f96f29be7b Merge branch 'master' into concedo_experimental
# Conflicts:
#	.devops/nix/nixpkgs-instances.nix
#	.devops/nix/package.nix
#	.devops/nix/scope.nix
#	.github/workflows/build.yml
#	.github/workflows/nix-ci.yml
#	CMakeLists.txt
#	flake.nix
#	ggml.c
2024-01-22 22:31:22 +08:00
Someone Serge
fe8b3c0d4b workflows: nix-ci: drop the redundant "paths" filter 2024-01-22 12:19:30 +00:00
Someone Serge
f4dd059259 workflows: nix-build-aarch64: rate limit 2024-01-22 12:19:30 +00:00
Someone Serge
f7276f7500 workflows: nix-ci: rebuild on flake.lock updates 2024-01-22 12:19:30 +00:00
bobqianic
57744932c6
ci : fix Windows CI by updating Intel SDE version (#5053) 2024-01-22 10:55:05 +02:00
Concedo
71e9a64171 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/nix-ci.yml
#	CMakeLists.txt
#	Makefile
#	ggml-cuda.cu
#	ggml-opencl.cpp
#	llama.cpp
2024-01-20 23:27:42 +08:00
Neuman Vong
862f5e41ab
android : introduce starter project example (#4926)
* Introduce starter project for Android

Based on examples/llama.swiftui.

* Add github workflow

* Set NDK version

* Only build arm64-v8a in CI

* Sync bench code

* Rename CI prop to skip-armeabi-v7a

* Remove unused tests
2024-01-16 15:47:34 +02:00
Someone
6b48ed0893
workflows: unbreak nix-build-aarch64, and split it out (#4915)
The fix should be just the `sudo apt-get update`
2024-01-13 16:29:16 +00:00
Someone
d8d90aa343
ci: nix-flake-update: new token with pr permissions (#4879)
* ci: nix-flake-update: new token with pr permissions

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-11 17:22:34 +00:00
Concedo
94e68fe474 added field to show recent seed 2024-01-02 15:35:04 +08:00
Concedo
9e0dee769b Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	flake.lock
#	flake.nix
2024-01-01 16:04:17 +08:00
Someone Serge
d836174731 workflows: nix-ci: add a qemu job for jetsons 2023-12-31 13:14:58 -08:00
Someone Serge
06f2a5d190 workflows: nix-flakestry: drop tag filters
...and add a job for flakehub.com
2023-12-31 13:14:58 -08:00
Someone Serge
c5239944ba workflows: weekly nix flake update 2023-12-31 13:14:58 -08:00
Someone Serge
1e9ae54cf2 workflows: nix-ci: add a job for eval 2023-12-31 13:14:58 -08:00
Someone Serge
7adedecbe3 workflows: nix-ci: init; build flake outputs 2023-12-31 13:14:58 -08:00
Concedo
fe7c200610 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.devops/full-cuda.Dockerfile
#	.devops/full-rocm.Dockerfile
#	.devops/full.Dockerfile
#	.devops/main-rocm.Dockerfile
#	README.md
#	flake.lock
#	flake.nix
#	ggml-cuda.cu
#	requirements.txt
#	tests/CMakeLists.txt
2023-12-31 00:42:59 +08:00
crasm
04ac0607e9
python : add check-requirements.sh and GitHub workflow (#4585)
* python: add check-requirements.sh and GitHub workflow

This script and workflow forces package versions to remain compatible
across all convert*.py scripts, while allowing secondary convert scripts
to import dependencies not wanted in convert.py.

* Move requirements into ./requirements

* Fail on "==" being used for package requirements (but can be suppressed)

* Enforce "compatible release" syntax instead of ==

* Update workflow

* Add upper version bound for transformers and protobuf

* improve check-requirements.sh

* small syntax change

* don't remove venvs if nocleanup is passed

* See if this fixes docker workflow

* Move check-requirements.sh into ./scripts/

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2023-12-29 16:50:29 +02:00
Philip Taron
68eccbdc5b
flake.nix : rewrite (#4605)
* flake.lock: update to hotfix CUDA::cuda_driver

Required to support https://github.com/ggerganov/llama.cpp/pull/4606

* flake.nix: rewrite

1. Split into separate files per output.

2. Added overlays, so that this flake can be integrated into others.
   The names in the overlay are `llama-cpp`, `llama-cpp-opencl`,
   `llama-cpp-cuda`, and `llama-cpp-rocm` so that they fit into the
   broader set of Nix packages from [nixpkgs](https://github.com/nixos/nixpkgs).

3. Use [callPackage](https://summer.nixos.org/blog/callpackage-a-tool-for-the-lazy/)
   rather than `with pkgs;` so that there's dependency injection rather
   than dependency lookup.

4. Add a description and meta information for each package.
   The description includes a bit about what's trying to accelerate each one.

5. Use specific CUDA packages instead of cudatoolkit on the advice of SomeoneSerge.

6. Format with `serokell/nixfmt` for a consistent style.

7. Update `flake.lock` with the latest goods.

* flake.nix: use finalPackage instead of passing it manually

* nix: unclutter darwin support

* nix: pass most darwin frameworks unconditionally

...for simplicity

* *.nix: nixfmt

nix shell github:piegamesde/nixfmt/rfc101-style --command \
    nixfmt flake.nix .devops/nix/*.nix

* flake.nix: add maintainers

* nix: move meta down to follow Nixpkgs style more closely

* nix: add missing meta attributes

nix: clarify the interpretation of meta.maintainers

nix: clarify the meaning of "broken" and "badPlatforms"

nix: passthru: expose the use* flags for inspection

E.g.:

```
❯ nix eval .#cuda.useCuda
true
```

* flake.nix: avoid re-evaluating nixpkgs too many times

* flake.nix: use flake-parts

* nix: migrate to pname+version

* flake.nix: overlay: expose both the namespace and the default attribute

* ci: add the (Nix) flakestry workflow

* nix: cmakeFlags: explicit OFF bools

* nix: cuda: reduce runtime closure

* nix: fewer rebuilds

* nix: respect config.cudaCapabilities

* nix: add the impure driver's location to the DT_RUNPATHs

* nix: clean sources more thoroughly

...this way outPaths change less frequently,
and so there are fewer rebuilds

* nix: explicit mpi support

* nix: explicit jetson support

* flake.nix: darwin: only expose the default

---------

Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>
2023-12-29 16:42:26 +02:00
henk717
5006b23099
CUDA 11.4 for Github CI (#582)
* Downgrade CUDA to 11.4

This helps the binary be smaller and adds K80 support, the manual compiles we did already had this.

* Update kcpp-build-release-win-cuda.yaml

* Update kcpp-build-release-win-cuda.yaml

* Update kcpp-build-release-win-cuda.yaml

* Update kcpp-build-release-win-cuda.yaml

* Update kcpp-build-release-win-cuda.yaml

* Update kcpp-build-release-win-cuda.yaml

* Restore concedo_experimental
2023-12-26 11:23:43 +08:00
Samuel Maynard
925e5584a0
ci(docker): fix tags in "Build and push docker image (tagged)" (#4603) 2023-12-23 11:35:55 +02:00
rhuddleston
f31b984898
ci : tag docker image with build number (#4584) 2023-12-22 08:56:34 +02:00
Samuel Maynard
4a5f9d629e
ci : add jlumbroso/free-disk-space to docker workflow (#4150)
* [github][workflows][docker]: removes hardcoded `ggerganov` from `ghcr` repo

* [github][workflows][docker]: adds `jlumbroso/free-disk-space`
2023-12-21 22:36:26 +02:00
Concedo
e1f013bbf8 testing workflow for windows cuda builds 2023-12-21 19:36:52 +08:00
Concedo
7798587990 Workflow Build from experimental branch 2023-12-14 19:17:19 +08:00
Concedo
ae3d829d0c manual workflow for generating builds instead 2023-12-14 19:00:58 +08:00
henk717
146e3bb3aa
Automatically generate Linux Binaries (#564)
* .sh script V1

* koboldcpp.sh polish

* koboldcpp.sh dist generator

* Include html's in dist

* RWKV in Linux Dist

* Lower dependency requirements

* Eliminate wget dependency

* More distinct binary name

I know its technically amd64, but I don't want to cause confusion among nvidia users.

* Use System OpenCL

Unsure how this will behave in the pyinstaller build, but pocl ended up CPU only. With a bit of luck the pyinstaller uses the one from the actual system if compiled in a system without opencl, while conda now includes it for that specific system.

* Add cblas dependency

Missing this causes compile failures on some system's

* ICD workaround

Ideally we find a better solution, but conda forces ICD and needs this for the successful compile. However, pyinstaller then embeds the ICD causing it to be limited to the system it was compiled for. By temporarily removing the ICD pyinstaller can't find it and everything remains functional. Ideally we do this on a pyinstaller level, but I could not find any good options to do so yet.

* Fix & Nocuda

* Automatically build Linux Binary

* Auto build on v tag

* Better on release

* Fix missing jobs:

* More distinct name

* I am to retro...

* Fix release upload

* Another upload attempt

* Another upload attempt

* Also rebuild on release edit

* Placebo commit to maybe fix CI

---------

Co-authored-by: root <root@DESKTOP-DQ1QRAG>
2023-12-14 14:48:22 +08:00
Concedo
8dd975653d removing existing yml files 2023-12-14 14:47:03 +08:00
Georgi Gerganov
fe680e3d10
sync : ggml (new ops, tests, backend, etc.) (#4359)
* sync : ggml (part 1)

* sync : ggml (part 2, CUDA)

* sync : ggml (part 3, Metal)

* ggml : build fixes

ggml-ci

* cuda : restore lost changes

* cuda : restore lost changes (StableLM rope)

* cmake : enable separable compilation for CUDA

ggml-ci

* ggml-cuda : remove device side dequantize

* Revert "cmake : enable separable compilation for CUDA"

This reverts commit 09e35d04b1c4ca67f9685690160b35bc885a89ac.

* cuda : remove assert for rope

* tests : add test-backend-ops

* ggml : fix bug in ggml_concat

* ggml : restore `ggml_get_n_tasks()` logic in `ggml_graph_plan()`

* ci : try to fix macOS

* ggml-backend : remove backend self-registration

* ci : disable Metal for macOS cmake build

ggml-ci

* metal : fix "supports family" call

* metal : fix assert

* metal : print resource path

ggml-ci

---------

Co-authored-by: slaren <slarengh@gmail.com>
2023-12-07 22:26:54 +02:00
Bailey Chittle
bb03290c17
examples : iOS example with swift ui (#4159)
* copy to llama.cpp as subdir

* attempt enabling metal, fails

* ggml metal compiles!

* Update README.md

* initial conversion to new format, utf8 errors?

* bug fixes, but now has an invalid memory access :(

* added O3, now has insufficient memory access

* begin sync with master

* update to match latest code, new errors

* fixed it!

* fix for loop conditionals, increase result size

* fix current workflow errors

* attempt a llama.swiftui workflow

* Update .github/workflows/build.yml

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 16:56:52 +02:00
Concedo
56a5fa7a60 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	tests/test-tokenizer-0-falcon.py
#	tests/test-tokenizer-0-llama.py
2023-11-20 22:37:06 +08:00
Galunid
f23c0359a3
ci : add flake8 to github actions (python linting) (#4129)
Disabled rules:

* E203 Whitespace before ':' - disabled because we often use 'C' Style where values are aligned

* E211 Whitespace before '(' (E211) - disabled because we often use 'C' Style where values are aligned

* E221 Multiple spaces before operator - disabled because we often use 'C' Style where values are aligned

* E225 Missing whitespace around operator - disabled because it's broken so often it seems like a standard

* E231 Missing whitespace after ',', ';', or ':' - disabled because we often use 'C' Style where values are aligned

* E241 Multiple spaces after ',' - disabled because we often use 'C' Style where values are aligned

* E251 Unexpected spaces around keyword / parameter equals - disabled because it's broken so often it seems like a standard

* E261 At least two spaces before inline comment - disabled because it's broken so often it seems like a standard

* E266 Too many leading '#' for block comment - sometimes used as "section" separator

* E501 Line too long - disabled because it's broken so often it seems like a standard

* E701 Multiple statements on one line (colon) - broken only in convert.py when defining abstract methods (we can use# noqa instead)

* E704 Multiple statements on one line - broken only in convert.py when defining abstract methods (we can use# noqa instead)
2023-11-20 11:35:47 +01:00
Eve
a7fac013cf
ci : use intel sde when ci cpu doesn't support avx512 (#3949) 2023-11-05 09:46:44 +02:00
Zane Shannon
24ba3d829e
examples : add batched.swift + improve CI for swift (#3562) 2023-10-11 06:14:05 -05:00
Concedo
e967717385 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	build.zig
2023-10-08 22:55:44 +08:00
Matheus C. França
eee42c670e
ci : add Zig CI/CD and fix build (#2996)
* zig CI/CD and fix build

Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com>

* fix build_compiler

* ci : remove trailing whitespace

---------

Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-08 16:59:20 +03:00
Georgi Gerganov
94e502dfb7
ci : enable on obj-c changes + fix metal build (#3540) 2023-10-08 11:24:50 +03:00
Concedo
f797cba377 Merge branch 'master' into concedo_experimental 2023-10-08 10:43:34 +08:00
M. Yusuf Sarıgöz
4d03833211
gguf.py : fix CI for publishing GGUF package (#3532)
* Fix CI for publishing GGUF package

* Bump version

* fix

* bump version

* bump version

* bump version
2023-10-07 22:14:10 +03:00
Jhen-Jie Hong
04b2f4386e
ci : fix xcodebuild destinations (#3491)
* ci : fix xcodebuild destinations

* ci : add .swift to paths
2023-10-06 13:36:43 +03:00
Jhen-Jie Hong
0745384449
ci : add swift build via xcodebuild (#3482) 2023-10-05 16:56:21 +03:00
Eve
017efe899d
cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273)
* fix LLAMA_NATIVE

* syntax

* alternate implementation

* my eyes must be getting bad...

* set cmake LLAMA_NATIVE=ON by default

* march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc

* revert 8283237 and only allow LLAMA_NATIVE on x86 like the Makefile

* remove -DLLAMA_MPI=ON

---------

Co-authored-by: netrunnereve <netrunnereve@users.noreply.github.com>
2023-10-03 19:53:15 +03:00
Eve
0512d66670
ci : multithreaded builds (#3311)
* mac and linux threads

* windows

* Update build.yml

* Update build.yml

* Update build.yml

* automatically get thread count

* windows syntax

* try to fix freebsd

* Update build.yml

* Update build.yml

* Update build.yml
2023-09-28 22:31:04 +03:00
Georgi Gerganov
2619109ad5
ci : disable freeBSD builds due to lack of VMs (#3381) 2023-09-28 19:36:36 +03:00
Alon
a40f2b656f
CI: FreeBSD fix (#3258)
* - freebsd ci: use qemu
2023-09-20 14:06:36 +02:00
Erik Scholz
7ddf185537
ci : switch cudatoolkit install on windows to networked (#3236) 2023-09-18 02:21:47 +02:00
IsaacDynamo
b541b4f0b1
Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) 2023-09-16 19:35:25 +02:00
Cebtenzzre
69eb67e282
fix build numbers by setting fetch-depth=0 (#3197) 2023-09-15 15:18:15 -04:00
Concedo
8b8eb18567 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/docker.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	flake.nix
#	tests/CMakeLists.txt
2023-09-15 23:51:18 +08:00
Alon
83a53b753a
CI: add FreeBSD & simplify CUDA windows (#3053)
* add freebsd to ci

* bump actions/checkout to v3
* bump cuda 12.1.0 -> 12.2.0
* bump Jimver/cuda-toolkit version

* unify and simplify "Copy and pack Cuda runtime"
* install only necessary cuda sub packages
2023-09-14 19:21:25 +02:00