Commit graph

45 commits

Author SHA1 Message Date
Concedo
1d48db4f63 dont build quantize 2023-04-07 17:11:26 +08:00
Concedo
5c1920df43 why nobody ever told me the makefile doesnt work outside x86 xD 2023-04-05 17:15:42 +08:00
Concedo
57e9f929ee renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags 2023-04-05 15:22:13 +08:00
Concedo
14273fea7a integrated gpt2 support 2023-04-04 23:15:47 +08:00
Concedo
52de932842 removed main.exe to reduce clutter, added support for rep pen in gptj 2023-04-04 20:43:13 +08:00
Concedo
eb5b22dda2 rebrand to koboldcpp 2023-04-03 10:35:18 +08:00
Concedo
8dd8ab1659 Various enhancement and integration pygmalion.cpp 2023-04-03 00:04:43 +08:00
Concedo
bb965cc120 Merge branch 'master' into concedo
# Conflicts:
#	README.md
2023-04-02 17:13:28 +08:00
Concedo
9aabb0d9db massive refactor completed, GPT-J integrated 2023-04-02 17:03:30 +08:00
Fabian
c4f89d8d73 make : use -march=native -mtune=native on x86 (#609) 2023-04-02 10:17:05 +03:00
Concedo
b1f08813e3 added support for gpt4all original format 2023-04-02 00:53:46 +08:00
Concedo
085a9f90a7 still refactoring 2023-04-01 11:56:34 +08:00
Concedo
6e6125ebdb updated pyinstaller to clean temp dir,removed warning flags from makefile because they are just clutter. 2023-04-01 09:25:41 +08:00
Concedo
801b178f2a still refactoring, but need a checkpoint to prepare build for 1.0.7 2023-04-01 08:55:14 +08:00
Concedo
6b86f5ea22 halfway refactoring, wip adding other model types 2023-04-01 01:13:05 +08:00
Concedo
559a1967f7 Backwards compatibility formats all done
Merge branch 'master' into concedo

# Conflicts:
#	CMakeLists.txt
#	README.md
#	llama.cpp
2023-03-31 19:01:33 +08:00
david raistrick
1f0414feec make : fix darwin f16c flags check (#615)
...there was no check.  ported upstream from https://github.com/zanussbaum/gpt4all.cpp/pull/2 (I dont see any clean path for upstream patches)
2023-03-30 20:34:45 +03:00
Concedo
354d4f232f fixed linux openblas build errors 2023-03-30 11:55:35 +08:00
Concedo
664b277c27 integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-blas if libopenblas is removed. 2023-03-30 00:43:52 +08:00
Concedo
49c4c225b5 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
2023-03-29 21:08:03 +08:00
Stephan Walter
436e561931 all : be more strict about converting float to double (#458)
* Be more strict about converting float to double

* Test equivalence of round, SILU implementations

Test module is commented out in CMakeLists.txt because the tests may
take a long time, depending on how much the compiler optimizes.

* Fix softmax in perplexity.cpp

* all : prefer float over double where appropriate

* perplexity : add <cmath>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 19:48:20 +03:00
Concedo
bf30406f50 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/docker.yml
#	Makefile
#	README.md
2023-03-28 17:13:38 +08:00
RJ Adriaansen
4b8efff0e3 Add embedding example to Makefile (#540) 2023-03-28 09:11:09 +03:00
Concedo
57474944d6 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
2023-03-26 14:52:08 +08:00
Georgi Gerganov
a316a425d0 Overhaul the examples structure
- main -> examples
- utils -> examples (renamed to "common")
- quantize -> examples
- separate tools for "perplexity" and "embedding"

Hope I didn't break something !
2023-03-25 20:26:40 +02:00
Concedo
3c78124aac Merge branch 'master' into concedo
# Conflicts:
#	README.md
2023-03-25 11:20:04 +08:00
Concedo
506cd62638 changed some defaults to hopefully increase compatibility 2023-03-25 10:40:11 +08:00
Cameron Kaiser
481044d50c additional optimizations for POWER9 (#454) 2023-03-24 17:19:26 +02:00
Concedo
1166fda943 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
2023-03-23 23:51:07 +08:00
Kerfuffle
a140219e81 Fix Makefile echo escape codes (by removing them). (#418) 2023-03-23 12:41:32 +01:00
Concedo
86c7457e24 Merge branch 'master' into concedo
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	main.cpp
2023-03-22 22:31:45 +08:00
Georgi Gerganov
f5a77a629b Introduce C-style API (#370)
* Major refactoring - introduce C-style API

* Clean up

* Add <cassert>

* Add <iterator>

* Add <algorithm> ....

* Fix timing reporting and accumulation

* Measure eval time only for single-token calls

* Change llama_tokenize return meaning
2023-03-22 07:32:36 +02:00
Alex von Gluck IV
f157088cb7 makefile: Fix CPU feature detection on Haiku (#218) 2023-03-21 18:21:06 +02:00
Kevin Lo
715d292ee0 Add OpenBSD support (#314) 2023-03-21 17:50:09 +02:00
Qingyou Meng
c3b2306b18 Makefile: slightly cleanup for Mac Intel; echo instead of run ./main -h (#335) 2023-03-21 17:44:11 +02:00
Georgi Gerganov
eb34620aec Add tokenizer test + revert to C++11 (#355)
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
2023-03-21 17:29:41 +02:00
Casey Primozic
2e664f1ff4 Add initial AVX512 support for dot product on Linux (#320)
* Update Makefile to detect AVX512 support and add compiler flags if it's available
* Based on existing AVX2 implementation, dot product on one 32-value block of 4-bit quantized ints at a time
* Perform 8 bit -> 16 bit sign extension and multiply+add on 32 values at time instead of 16
* Use built-in AVX512 horizontal reduce add to get sum at the end
* Manual unrolling on inner dot product loop to reduce loop counter overhead
2023-03-21 15:35:42 +01:00
Concedo
8d39365af6 update license, added backwards compatibility with both ggml model formats, fixed context length issues. 2023-03-20 23:43:35 +08:00
Concedo
a2c10e0d2f Merge branch 'master' into concedo
# Conflicts:
#	.devops/full.Dockerfile
#	README.md
#	main.cpp
2023-03-20 20:58:27 +08:00
Mack Straight
074bea2eb1 sentencepiece bpe compatible tokenizer (#252)
* potential out of bounds read

* fix quantize

* style

* Update convert-pth-to-ggml.py

* mild cleanup

* don't need the space-prefixing here rn since main.cpp already does it

* new file magic + version header field

* readme notice

* missing newlines

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
2023-03-20 03:17:23 -07:00
Concedo
f952b7c613 Removed junk, fixed some bugs and support dynamic number of sharded files
Merge remote-tracking branch 'origin/master' into concedo

# Conflicts:
#	README.md
2023-03-19 11:13:00 +08:00
Concedo
2c8f870f53 Created a python bindings for llama.cpp and emulated a simple Kobold HTTP API Endpoint 2023-03-19 00:07:11 +08:00
Thomas Klausner
41be0a3b3d Add NetBSD support. (#90) 2023-03-13 18:40:54 +02:00
Georgi Gerganov
7211862c94 Update Makefile var + add comment 2023-03-11 12:27:02 +02:00
Georgi Gerganov
26c0846629 Initial release 2023-03-10 20:56:40 +02:00