Commit graph

113 commits

Author SHA1 Message Date
Concedo
c9eb2ba1c5 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
#	ggml-opencl.c
2023-05-13 15:51:05 +08:00
Rinne
6456a4eb9f
embedding : remove unused code (#1426) 2023-05-13 10:24:20 +03:00
Georgi Gerganov
fb62f92433
llama : fix --mtest option (close #1414) 2023-05-12 21:44:20 +03:00
Concedo
e9caff1cda Interim merge. Do not use.
Merge branch 'master' into concedo_experimental

# Conflicts:
#	README.md
#	SHA256SUMS
#	examples/quantize/quantize.cpp
#	ggml-opencl.c
#	ggml.c
#	ggml.h
#	llama.cpp
#	llama.h
2023-05-12 23:20:27 +08:00
Johannes Gäßler
773ee249fb
CLI args use - instead of _, backwards compatible (#1416) 2023-05-12 14:34:55 +00:00
Georgi Gerganov
b9fd7eee57
ggml : remove bit shuffling (#1405)
* ggml : remove Q4_0 bit shufling (ARM NEON)

* ggml : remove Q4_1 bit shuffling (ARM NEON + reference)

* ggml : nibbles_from_floats() + bytes_from_nibbles() (ARM NEON)

* ggml : remove Q4_2 bit shuffling (WIP, BROKEN)

* ggml : remove Q5_0 bit shuffling (ARM NEON)

* ggml : 2x faster scalar implementations

* ggml : remove Q5_1 bit shuffling (ARM NEON + scalar)

* ggml : simplify scalar dot

* ggml : remove WASM SIMD bit shuffling + remove vzip for ARM 32-bit

* ggml : fix Q4_1 quantization

* ggml : update cuBLAS + normalize variable names

* ggml : remove Q4_2 mode

* ggml : minor formatting

* ggml : fix Q5_0 quantization

* scripts : add script for measuring the time per token

* AVX implementations (#1370)

* ggml : uniform 5th bit extraction

* llama : produce error upon loading old model files

* llama : fix model magic/version write

* ggml : speed-up Q5_0 + Q5_1 at 4 threads

* ggml : preserve old Q4 and Q5 formats

* ggml : simplify Q8_1 - no need for low / high sums anymore

* ggml : fix Q8_0 and Q8_1 rounding

* Revert "AVX implementations (#1370)"

This reverts commit 948d124837f9d287d8490f41338e0e4cceb0814f.

* ggml : fix AVX2 implementation

* sha : update hashes for 7B and 13B

* readme : update timings + remove warning banner

* llama : update v2 PR number to 1405

* ggml : fix WASM comments

* ggml : back to original bit order

* readme : add note that Q4 and Q5 have been changed

* llama : fix return for unknown version

---------

Co-authored-by: Stephan Walter <stephan@walter.name>
2023-05-12 00:23:08 +03:00
Evan Jones
cf348a60e0
main : add option to save full output to session (#1338)
* main : add option to save full output to session

* split behavior into --session and --prompt-cache

* restore original implementation with new names

* PR comments

* move the check for incompatible parameters to gpt_params_parse

* Fix whitespace

Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>

---------

Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
2023-05-10 11:37:14 -04:00
Concedo
19dbb3b2a5 Merge branch 'master' into concedo_experimental 2023-05-10 18:35:53 +08:00
DannyDaemonic
e6a46b0ed1
Locale fix for Windows (#1379) 2023-05-09 19:53:28 +02:00
Concedo
54194911ac Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-05-09 16:50:43 +08:00
DannyDaemonic
41654efea8
Interface improvements and --multiline-input (previously --author-mode) (#1040)
* Interface improvements
* Multiline input
* Track character width
* Works with all characters and control codes + Windows console fixes
2023-05-08 19:45:48 -07:00
Georgi Gerganov
f9a6364912
llama : require first token to be BOS (#1303)
* llama : require first token to be BOS

* scripts : add ppl-run-all.sh

* perplexity : add BOS for each chunk

* readme : update perplexity values after BOS fix

* perplexity : add clarifying comments
2023-05-08 17:41:54 +03:00
Concedo
1083876a1b Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
2023-05-08 11:12:42 +08:00
Johannes Gäßler
1f48b0abcf
Documented CUDA reproducibility, added warning (#1346) 2023-05-08 02:42:01 +02:00
Concedo
62beded0e7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	Makefile
#	README.md
2023-05-07 19:10:01 +08:00
Jed Fox
3924088512
Remove default arguments from sampling functions (#1343) 2023-05-06 17:01:47 -04:00
Concedo
39f3d1cf48 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	README.md
#	examples/quantize/quantize.cpp
2023-05-05 21:34:33 +08:00
slaren
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301) 2023-05-05 00:58:56 +02:00
44670
2edbdb0f99
main : add --in-suffix option (#1318)
* adding --in-suffix option

* print input suffix before generation
2023-05-04 18:41:12 +03:00
DannyDaemonic
db1080876a
Only escape prompts when used with -e (#1311) 2023-05-04 05:08:25 -07:00
DannyDaemonic
c65a7fbfa9
Update main's README.md with new features (#1296) 2023-05-04 03:02:59 -07:00
Tomas
f647ce040f
fix #1224 reverse prompt and multi line (#1297)
* fix reverse prompt and multi line

* Code Formatting

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-04 03:02:30 -07:00
Concedo
e01dc631f7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-05-04 14:04:41 +08:00
khimaros
6daa09d879
examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
Concedo
ede8e4edbb Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
2023-05-03 23:34:50 +08:00
CRD716
a8a2efdc81
examples : various prompt and example fixes (#1298)
* fix dan.txt

* miku prompt improvements

* use common characters
2023-05-03 18:26:47 +03:00
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts (#1173) 2023-05-02 18:46:20 -07:00
DannyDaemonic
13b0c68ed7
Handle signals properly on Windows (#1123) 2023-05-02 18:01:57 -07:00
slaren
bf4b22ffe4
fix missing parameters in llama_init_from_gpt_params (#1293) 2023-05-03 01:36:45 +02:00
Ron Evans
67c77799e0
examples : add llama_init_from_gpt_params() common function (#1290)
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 23:39:51 +03:00
Georgi Gerganov
0e6cbff1b7
llama : fix compile warnings 2023-05-02 23:09:08 +03:00
Ron Evans
8c9be35ff9
examples : improve vertical alignment of a few variables (#1286)
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 20:53:52 +03:00
Robert Brisita
2bb992f034
llama : allow 0 as a seed number. (#1275) 2023-05-02 19:23:44 +03:00
Ron Evans
e2cd506999
main : switch input_noecho to input_echo to remove negation (#979)
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 19:13:26 +03:00
Concedo
94827172e0 Merge branch 'master' into concedo
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	ggml-cuda.cu
#	ggml-cuda.h
2023-05-02 14:38:31 +08:00
DannyDaemonic
f4cef87edf
Add git-based build information for better issue tracking (#1232)
* Add git-based build information for better issue tracking

* macOS fix

* "build (hash)" and "CMAKE_SOURCE_DIR" changes

* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages

* Fix conditional dependency on missing target

* Broke out build-info.cmake, added find_package fallback, and added build into to all examples, added dependencies to Makefile

* 4 space indenting for cmake, attempt to clean up my mess in Makefile

* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
2023-05-01 18:23:47 +02:00
Georgi Gerganov
70269cae37
llama : fix session load / save (#1263) 2023-05-01 14:54:59 +03:00
Concedo
3de34ee492 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	ggml-opencl.c
2023-05-01 12:03:46 +08:00
jon-chuang
a5d30b1f53
common : better default number of threads (#934)
* commit

* fix

* try-catch

* apply code review

* improve

* improve

* add macos headers

* done

* remove color

* fix windows

* minor

* fix

* Apply suggestions from code review

Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>

* remove

* minor

* minor

---------

Co-authored-by: jon-chuang <jon-chuang@users.noreply.github.com>
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
2023-04-30 21:41:35 +03:00
Stephan Walter
f0d70f147d
Various fixes to mat_mul benchmark (#1253) 2023-04-30 12:32:37 +00:00
Concedo
0061b90ec6 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
2023-04-30 10:35:02 +08:00
Georgi Gerganov
305eb5afd5
build : fix reference to old llama_util.h 2023-04-29 13:53:12 +03:00
Georgi Gerganov
84ca9c2ecf
examples : fix save-load-state + rename llama-util.h 2023-04-29 13:48:11 +03:00
Concedo
da0c34b028 Merge branch 'master' into concedo_experimental 2023-04-29 18:27:06 +08:00
Georgi Gerganov
334637e43e
common : change default parameters to pre-#1126 (#1223) 2023-04-29 09:51:06 +03:00
Ivan Stepanov
dd7eff57d8
llama : new sampling algorithms (#1126)
* Sample interface, new samplers.

New samplers:
- locally typical sampling
- tail free sampling
- frequency and presence penalty
- mirostat

Ignore EOS fix: -inf should be used.

* mirostat

* Added --logit-bias and --no-penalize-nl, removed std::span

* Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and *k)

Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and *k)

* Save and load example adjust

* Tests

* Windows build fix

* Windows test fix
2023-04-29 08:34:41 +03:00
Concedo
bb282a4ecf reinstated the q4_3 format, for backwards compatibility. 2023-04-29 11:42:04 +08:00
Stephan Walter
36d19a603b
Remove Q4_3 which is no better than Q5 (#1218) 2023-04-28 23:10:43 +00:00
CRD716
5fba3c016b
examples : add Jeopardy example (#1168)
* Basic Setup

* Prevent Results.txt from coming up

* Prefixes, Line separators, etc

* editorcheck

* introduction to give more consistent results

* Basic graph thing

* Grading, ready for testing!

* Y'all ready to get funky?

* fix column removal stuff

* missed a few
2023-04-28 19:13:33 +03:00
Evan Jones
1481a9cf25
llama : add session file format and saved sessions in main (#1169) 2023-04-28 18:59:37 +03:00