Concedo
c9eb2ba1c5
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# README.md
# ggml-opencl.c
2023-05-13 15:51:05 +08:00
Rinne
6456a4eb9f
embedding : remove unused code (#1426)
2023-05-13 10:24:20 +03:00
Georgi Gerganov
fb62f92433
llama : fix --mtest option (close #1414)
2023-05-12 21:44:20 +03:00
Concedo
e9caff1cda
Interim merge. Do not use.
...
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
# SHA256SUMS
# examples/quantize/quantize.cpp
# ggml-opencl.c
# ggml.c
# ggml.h
# llama.cpp
# llama.h
2023-05-12 23:20:27 +08:00
Johannes Gäßler
773ee249fb
CLI args use - instead of _, backwards compatible (#1416)
2023-05-12 14:34:55 +00:00
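Note on the entry above: the backwards compatibility works because old `_` spellings can be normalized to the new `-` style before matching. A minimal Python sketch of one way to do this (the actual parser is C++ in the repo; the function name here is hypothetical):

```python
def normalize_flag(arg: str) -> str:
    """Map old underscore-style flags to the new dash style,
    e.g. --in_suffix -> --in-suffix. Non-flag arguments pass through."""
    if arg.startswith("--"):
        return "--" + arg[2:].replace("_", "-")
    return arg
```

With this in front of the option matcher, `--mirostat_lr` and `--mirostat-lr` resolve to the same option.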
Georgi Gerganov
b9fd7eee57
ggml : remove bit shuffling (#1405)
...
* ggml : remove Q4_0 bit shuffling (ARM NEON)
* ggml : remove Q4_1 bit shuffling (ARM NEON + reference)
* ggml : nibbles_from_floats() + bytes_from_nibbles() (ARM NEON)
* ggml : remove Q4_2 bit shuffling (WIP, BROKEN)
* ggml : remove Q5_0 bit shuffling (ARM NEON)
* ggml : 2x faster scalar implementations
* ggml : remove Q5_1 bit shuffling (ARM NEON + scalar)
* ggml : simplify scalar dot
* ggml : remove WASM SIMD bit shuffling + remove vzip for ARM 32-bit
* ggml : fix Q4_1 quantization
* ggml : update cuBLAS + normalize variable names
* ggml : remove Q4_2 mode
* ggml : minor formatting
* ggml : fix Q5_0 quantization
* scripts : add script for measuring the time per token
* AVX implementations (#1370)
* ggml : uniform 5th bit extraction
* llama : produce error upon loading old model files
* llama : fix model magic/version write
* ggml : speed-up Q5_0 + Q5_1 at 4 threads
* ggml : preserve old Q4 and Q5 formats
* ggml : simplify Q8_1 - no need for low / high sums anymore
* ggml : fix Q8_0 and Q8_1 rounding
* Revert "AVX implementations (#1370)"
This reverts commit 948d124837f9d287d8490f41338e0e4cceb0814f.
* ggml : fix AVX2 implementation
* sha : update hashes for 7B and 13B
* readme : update timings + remove warning banner
* llama : update v2 PR number to 1405
* ggml : fix WASM comments
* ggml : back to original bit order
* readme : add note that Q4 and Q5 have been changed
* llama : fix return for unknown version
---------
Co-authored-by: Stephan Walter <stephan@walter.name>
2023-05-12 00:23:08 +03:00
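Note: the packing helpers this entry mentions (`nibbles_from_floats()` / `bytes_from_nibbles()`, "uniform 5th bit extraction") can be sketched in Python to show the layout. This is an illustration only, assuming ggml's usual 32-values-per-block convention; the real code is SIMD C in ggml.c:

```python
# Illustrative sketch of Q4/Q5 nibble packing, assuming 32 quantized
# values per block (the ggml convention). Not the actual C implementation.

def pack_nibbles(quants):
    """Pack 32 4-bit values (0..15) into 16 bytes, low nibble first."""
    assert len(quants) == 32 and all(0 <= q <= 15 for q in quants)
    return bytes(quants[2 * i] | (quants[2 * i + 1] << 4) for i in range(16))

def unpack_nibbles(packed):
    """Inverse of pack_nibbles: 16 bytes -> 32 4-bit values."""
    out = []
    for b in packed:
        out.append(b & 0x0F)
        out.append(b >> 4)
    return out

def pack_q5(quants):
    """Q5-style split: 32 5-bit values (0..31) -> 16 nibble bytes plus
    a 32-bit mask holding each value's 5th bit (one bit per value)."""
    low = [q & 0x0F for q in quants]
    high = 0
    for i, q in enumerate(quants):
        high |= ((q >> 4) & 1) << i
    return pack_nibbles(low), high

def unpack_q5(packed, high):
    """Recombine low nibbles with the extracted 5th bits."""
    return [lo | (((high >> i) & 1) << 4)
            for i, lo in enumerate(unpack_nibbles(packed))]
```

Removing the bit shuffling means the nibbles sit in this plain sequential order, which is what makes the uniform 5th-bit extraction possible.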
Evan Jones
cf348a60e0
main : add option to save full output to session (#1338)
...
* main : add option to save full output to session
* split behavior into --session and --prompt-cache
* restore original implementation with new names
* PR comments
* move the check for incompatible parameters to gpt_params_parse
* Fix whitespace
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
---------
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
2023-05-10 11:37:14 -04:00
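Note: the `--prompt-cache` behavior split out above rests on reusing the evaluated prefix that matches the new prompt. A hedged sketch of that matching step (illustrative only; the real session logic lives in the C++ main example):

```python
def reusable_prefix_len(cached_tokens, prompt_tokens):
    """Number of leading tokens shared by the cached session and the
    new prompt; only this prefix's evaluated state can be reused."""
    n = 0
    for a, b in zip(cached_tokens, prompt_tokens):
        if a != b:
            break
        n += 1
    return n
```

Everything past the shared prefix must be re-evaluated, which is why a cache hit on a long system prompt saves most of the startup cost.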
Concedo
19dbb3b2a5
Merge branch 'master' into concedo_experimental
2023-05-10 18:35:53 +08:00
DannyDaemonic
e6a46b0ed1
Locale fix for Windows (#1379)
2023-05-09 19:53:28 +02:00
Concedo
54194911ac
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# README.md
2023-05-09 16:50:43 +08:00
DannyDaemonic
41654efea8
Interface improvements and --multiline-input (previously --author-mode) (#1040)
...
* Interface improvements
* Multiline input
* Track character width
* Works with all characters and control codes + Windows console fixes
2023-05-08 19:45:48 -07:00
Georgi Gerganov
f9a6364912
llama : require first token to be BOS (#1303)
...
* llama : require first token to be BOS
* scripts : add ppl-run-all.sh
* perplexity : add BOS for each chunk
* readme : update perplexity values after BOS fix
* perplexity : add clarifying comments
2023-05-08 17:41:54 +03:00
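Note: the BOS requirement above amounts to a small guard before evaluation. A Python stand-in (token id 1 is used here as the BOS id by the usual LLaMA convention; an assumption, not taken from this log):

```python
def ensure_bos(tokens, bos_id=1):
    """Prepend the BOS token if the sequence doesn't already start with it."""
    if not tokens or tokens[0] != bos_id:
        return [bos_id] + list(tokens)
    return list(tokens)
```

The perplexity changes in this entry follow from applying the same guard to each evaluated chunk.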
Concedo
1083876a1b
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# README.md
2023-05-08 11:12:42 +08:00
Johannes Gäßler
1f48b0abcf
Documented CUDA reproducibility, added warning (#1346)
2023-05-08 02:42:01 +02:00
Concedo
62beded0e7
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# Makefile
# README.md
2023-05-07 19:10:01 +08:00
Jed Fox
3924088512
Remove default arguments from sampling functions (#1343)
2023-05-06 17:01:47 -04:00
Concedo
39f3d1cf48
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# Makefile
# README.md
# examples/quantize/quantize.cpp
2023-05-05 21:34:33 +08:00
slaren
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301)
2023-05-05 00:58:56 +02:00
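Note: the default-filename behavior above can be sketched as deriving `ggml-model-<ftype>.bin` next to the input model. Purely illustrative; the real logic is in examples/quantize/quantize.cpp:

```python
import os

def default_output_name(input_path, ftype):
    """When no output path is given, place ggml-model-<ftype>.bin
    in the same directory as the input model."""
    return os.path.join(os.path.dirname(input_path),
                        f"ggml-model-{ftype}.bin")
```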
44670
2edbdb0f99
main : add --in-suffix option (#1318)
...
* adding --in-suffix option
* print input suffix before generation
2023-05-04 18:41:12 +03:00
DannyDaemonic
db1080876a
Only escape prompts when used with -e (#1311)
2023-05-04 05:08:25 -07:00
DannyDaemonic
c65a7fbfa9
Update main's README.md with new features (#1296)
2023-05-04 03:02:59 -07:00
Tomas
f647ce040f
fix #1224 reverse prompt and multi line (#1297)
...
* fix reverse prompt and multi line
* Code Formatting
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-04 03:02:30 -07:00
Concedo
e01dc631f7
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# README.md
2023-05-04 14:04:41 +08:00
khimaros
6daa09d879
examples : read chat prompts from a template file ( #1196 )
2023-05-03 20:58:11 +03:00
Concedo
ede8e4edbb
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# README.md
2023-05-03 23:34:50 +08:00
CRD716
a8a2efdc81
examples : various prompt and example fixes (#1298)
...
* fix dan.txt
* miku prompt improvements
* use common characters
2023-05-03 18:26:47 +03:00
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts (#1173)
2023-05-02 18:46:20 -07:00
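Note: the escape processing above (later gated behind `-e`) turns literal `\n`, `\t`, etc. in a prompt string into real characters. A minimal sketch, assuming a small escape table; the real implementation is C++:

```python
# Hypothetical escape table; the actual set of supported escapes
# may differ from this illustration.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", "\"": "\""}

def process_escapes(prompt):
    """Replace known backslash escapes; leave unknown ones untouched."""
    out, i = [], 0
    while i < len(prompt):
        c = prompt[i]
        if c == "\\" and i + 1 < len(prompt) and prompt[i + 1] in ESCAPES:
            out.append(ESCAPES[prompt[i + 1]])
            i += 2
        else:
            out.append(c)
            i += 1
    return "".join(out)
```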
DannyDaemonic
13b0c68ed7
Handle signals properly on Windows (#1123)
2023-05-02 18:01:57 -07:00
slaren
bf4b22ffe4
fix missing parameters in llama_init_from_gpt_params (#1293)
2023-05-03 01:36:45 +02:00
Ron Evans
67c77799e0
examples : add llama_init_from_gpt_params() common function (#1290)
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 23:39:51 +03:00
Georgi Gerganov
0e6cbff1b7
llama : fix compile warnings
2023-05-02 23:09:08 +03:00
Ron Evans
8c9be35ff9
examples : improve vertical alignment of a few variables (#1286)
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 20:53:52 +03:00
Robert Brisita
2bb992f034
llama : allow 0 as a seed number. (#1275)
2023-05-02 19:23:44 +03:00
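Note: after the change above, 0 is a legitimate fixed seed; only a negative value requests a randomized seed. A sketch of that convention (the negative-means-randomize rule is an assumption here, matching the common CLI convention):

```python
import time

def resolve_seed(seed: int) -> int:
    """0 and positive values are used verbatim; a negative value
    means 'pick a seed from the clock' (assumed convention)."""
    return int(time.time()) if seed < 0 else seed
```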
Ron Evans
e2cd506999
main : switch input_noecho to input_echo to remove negation (#979)
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 19:13:26 +03:00
Concedo
94827172e0
Merge branch 'master' into concedo
...
# Conflicts:
# CMakeLists.txt
# Makefile
# ggml-cuda.cu
# ggml-cuda.h
2023-05-02 14:38:31 +08:00
DannyDaemonic
f4cef87edf
Add git-based build information for better issue tracking (#1232)
...
* Add git-based build information for better issue tracking
* macOS fix
* "build (hash)" and "CMAKE_SOURCE_DIR" changes
* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages
* Fix conditional dependency on missing target
* Broke out build-info.cmake, added find_package fallback, added build info to all examples, and added dependencies to Makefile
* 4 space indenting for cmake, attempt to clean up my mess in Makefile
* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
2023-05-01 18:23:47 +02:00
Georgi Gerganov
70269cae37
llama : fix session load / save (#1263)
2023-05-01 14:54:59 +03:00
Concedo
3de34ee492
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# ggml-opencl.c
2023-05-01 12:03:46 +08:00
jon-chuang
a5d30b1f53
common : better default number of threads (#934)
...
* commit
* fix
* try-catch
* apply code review
* improve
* improve
* add macos headers
* done
* remove color
* fix windows
* minor
* fix
* Apply suggestions from code review
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
* remove
* minor
* minor
---------
Co-authored-by: jon-chuang <jon-chuang@users.noreply.github.com>
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
2023-04-30 21:41:35 +03:00
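Note: a hedged sketch of the kind of heuristic a "better default number of threads" change implies: prefer physical cores over logical ones, since SMT siblings rarely help memory-bound matmul. This is a Python stand-in with a crude SMT-2 assumption, not the platform-specific C++ from the PR:

```python
import os

def default_n_threads():
    """Crude default: assume SMT-2 on larger machines, so
    physical cores ~= logical // 2. Illustrative assumption only."""
    logical = os.cpu_count() or 4
    return max(1, logical // 2) if logical > 4 else logical
```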
Stephan Walter
f0d70f147d
Various fixes to mat_mul benchmark (#1253)
2023-04-30 12:32:37 +00:00
Concedo
0061b90ec6
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
2023-04-30 10:35:02 +08:00
Georgi Gerganov
305eb5afd5
build : fix reference to old llama_util.h
2023-04-29 13:53:12 +03:00
Georgi Gerganov
84ca9c2ecf
examples : fix save-load-state + rename llama-util.h
2023-04-29 13:48:11 +03:00
Concedo
da0c34b028
Merge branch 'master' into concedo_experimental
2023-04-29 18:27:06 +08:00
Georgi Gerganov
334637e43e
common : change default parameters to pre-#1126 (#1223)
2023-04-29 09:51:06 +03:00
Ivan Stepanov
dd7eff57d8
llama : new sampling algorithms (#1126)
...
* Sample interface, new samplers.
New samplers:
- locally typical sampling
- tail free sampling
- frequency and presence penalty
- mirostat
Ignore EOS fix: -inf should be used.
* mirostat
* Added --logit-bias and --no-penalize-nl, removed std::span
* Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and *k)
* Save and load example adjust
* Tests
* Windows build fix
* Windows test fix
2023-04-29 08:34:41 +03:00
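Note: as a rough illustration of one sampler from the list above, tail free sampling drops the "tail" of the distribution where the discrete second derivative of the sorted probabilities flattens out. This Python sketch approximates the idea; parameterization and tie-breaking in the actual C implementation may differ:

```python
def tail_free(probs, z):
    """probs: per-token probabilities; z: cumulative threshold in (0, 1].
    Returns the indices of the tokens kept, most probable first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    p = [probs[i] for i in order]
    # discrete second derivative of the sorted probability curve
    d2 = [abs(p[i] - 2 * p[i + 1] + p[i + 2]) for i in range(len(p) - 2)]
    total = sum(d2) or 1.0
    w = [x / total for x in d2]          # normalized |second derivative|
    keep, cum = len(p), 0.0
    for i, x in enumerate(w):
        cum += x
        if cum > z:
            keep = i + 1                 # cut once the curvature mass passes z
            break
    return order[:keep]
```

With z close to 1 nearly everything survives; smaller z trims the flat tail aggressively.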
Concedo
bb282a4ecf
reinstated the q4_3 format for backwards compatibility.
2023-04-29 11:42:04 +08:00
Stephan Walter
36d19a603b
Remove Q4_3 which is no better than Q5 (#1218)
2023-04-28 23:10:43 +00:00
CRD716
5fba3c016b
examples : add Jeopardy example (#1168)
...
* Basic Setup
* Prevent Results.txt from coming up
* Prefixes, Line separators, etc
* editorcheck
* introduction to give more consistent results
* Basic graph thing
* Grading, ready for testing!
* Y'all ready to get funky?
* fix column removal stuff
* missed a few
2023-04-28 19:13:33 +03:00
Evan Jones
1481a9cf25
llama : add session file format and saved sessions in main (#1169)
2023-04-28 18:59:37 +03:00