* Further tidy on Android instructions README.md
Fixed some logic when following readme direction
* Clean up redundant information
A new user arriving will see simple directions on llama.cpp homepage
* corrected punctuation
Period after cmake, colon after termux
* re-word for clarity
"method" seems more correct than "alternative" in this context
* Organized required packages per build type
Building llama.cpp with the NDK on a PC doesn't require installing clang, cmake, git, or wget in Termux.
* README.md
corrected title
* fix trailing whitespace
* Fixed save_imatrix to match old behaviour for MoE
This fix is simple and clear, but unnecessarily doubles the memory overhead.
* Fixed missing idx variable
* Unconditionally increment ncall
Co-authored-by: slaren <slarengh@gmail.com>
* Fixed 2 bugs in save_imatrix()
- Fixed segfault bug because the counts vector needed to be created.
- Fixed pre-existing bug that didn't actually add to the counts for the "--combine" option.
* ncall needs summing too
* Trailing whitespace
---------
Co-authored-by: slaren <slarengh@gmail.com>
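For illustration, the accumulation described by the `--combine` fixes above can be sketched as follows. This is a minimal Python sketch, not the actual C++ code in save_imatrix(); the `Stats` class, its field names, and `combine()` are hypothetical and only mirror the commit text: the counts vector has to exist before it is added to, and values, counts, and ncall all need summing.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Stats:
    """Hypothetical per-tensor accumulator (names are assumptions)."""
    values: list[float] = field(default_factory=list)
    counts: list[int] = field(default_factory=list)
    ncall: int = 0


def combine(dst: Stats, src: Stats) -> None:
    """Add src into dst element-wise, creating dst's vectors if empty."""
    if not dst.values:
        # Create both vectors before indexing into them; forgetting to
        # create counts is the kind of bug that ends in a crash.
        dst.values = [0.0] * len(src.values)
        dst.counts = [0] * len(src.counts)
    if len(dst.values) != len(src.values) or len(dst.counts) != len(src.counts):
        raise ValueError("mismatched sizes")
    for i, v in enumerate(src.values):
        dst.values[i] += v
    for i, c in enumerate(src.counts):
        dst.counts[i] += c  # counts are summed, not just values
    dst.ncall += src.ncall  # ncall needs summing too
```

Combining two inputs into an empty `Stats()` leaves `values`, `counts`, and `ncall` each equal to the sum over both inputs.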
* Update log text (EOS to EOG)
The log text "found EOS" is no longer always correct here, because there is now an is-EOG check that also returns true for EOT.
* Improve log msg. further by using "an" instead of "some".
As suggested, to avoid misunderstanding (no multiple EOG tokens found, just one).
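The reasoning behind the wording change can be shown with a small sketch. The token ids and function names below are hypothetical, chosen only for this illustration; they are not the real llama.cpp API. The point is that one check covers both EOS and EOT, so a message saying "EOS" would be wrong whenever the token was EOT.

```python
# Hypothetical token ids, chosen only for this illustration.
EOS_TOKEN = 2  # end of sequence
EOT_TOKEN = 7  # end of turn


def is_eog(token: int) -> bool:
    """Return True for any end-of-generation token (EOS or EOT here)."""
    return token in (EOS_TOKEN, EOT_TOKEN)


def log_text(token: int) -> str:
    """Pick a log message that stays correct for either kind of EOG token."""
    return "found an EOG token" if is_eog(token) else "continuing"
```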
* Disable benchmark on forked repo
* only check owner on schedule event
* check owner on push also
* more readable as multi-line
* ternary won't work
* style++
* test++
* enable actions debug
* test--
* remove debug
* test++
* do debug where we can get logs
* test--
* this is driving me crazy
* correct github.event usage
* remove test condition
* correct github.event usage
* test++
* test--
* event_name is pull_request_target
* test++
* test--
* update ref checks
This will reproduce the issue in llama13b
{
'prompt': 'Q: hello world \nA: ',
'stop': ['\n'],
'temperature': 0.0,
'n_predict': 10,
'cache_prompt': True,
'n_probs': 10
}
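A minimal sketch of sending the parameters above to a running server, using only the Python standard library. The URL is an assumption (a local server exposing a `/completion` endpoint on port 8080) and is not stated in the commit message; `build_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed address of a locally running server; adjust to your setup.
SERVER_URL = "http://localhost:8080/completion"

PAYLOAD = {
    "prompt": "Q: hello world \nA: ",
    "stop": ["\n"],
    "temperature": 0.0,
    "n_predict": 10,
    "cache_prompt": True,
    "n_probs": 10,
}


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the payload."""
    body = json.dumps(PAYLOAD).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(url: str = SERVER_URL) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(url)) as resp:
        return json.loads(resp.read())
```

Calling `send()` requires a server to be listening at the assumed address; `build_request()` can be inspected without one.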
* convert.py: add python logging instead of print()
* convert.py: verbose flag takes priority over dump flag log suppression
* convert.py: named instance logging
* convert.py: use explicit logger id string
* convert.py: convert extra print() to named logger
* convert.py: sys.stderr.write --> logger.error
* *.py: Convert all python scripts to use logging module
* requirements.txt: remove extra line
* flake8: update flake8 ignore and exclude to match ci settings
* gh-actions: add flake8-no-print to flake8 lint step
* pre-commit: add flake8-no-print to flake8 and also update pre-commit version
* convert-hf-to-gguf.py: print() to logger conversion
* *.py: logging basicConfig refactor to use conditional expression
* *.py: removed commented out logging
* fixup! *.py: logging basicConfig refactor to use conditional expression
* constants.py: logger.error followed by exit should raise an exception instead
* *.py: Convert logger error and sys.exit() into a raise exception (for atypical error)
* gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar
* verify-checksum-models.py: this is the result of the program, so it should be printed to stdout
* compare-llama-bench.py: add blank line for readability during missing repo response
* reader.py: read_gguf_file() use print() over logging
* convert.py: warning goes to stderr and won't hurt the dump output
* gguf-dump.py: dump_metadata() should print to stdout
* convert-hf-to-gguf.py: print --> logger.debug or ValueError()
* verify-checksum-models.py: use print() for printing table
* *.py: refactor logging.basicConfig()
* gguf-py/gguf/*.py: use __name__ as logger name
Since they will be imported and not run directly.
* python-lint.yml: use .flake8 file instead
* constants.py: logger no longer required
* convert-hf-to-gguf.py: add additional logging
* convert-hf-to-gguf.py: print() --> logger
* *.py: fix flake8 warnings
* revert changes to convert-hf-to-gguf.py for get_name()
* convert-hf-to-gguf-update.py: use triple quoted f-string instead
* *.py: accidentally corrected the wrong line
* *.py: add compilade warning suggestions and style fixes
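The logging pattern these commits describe can be sketched as follows. This is a minimal sketch rather than the scripts' actual code: the logger name `"example-script"`, the `--verbose` flag, and the function names are assumptions for illustration. It shows a named logger with an explicit id string, `logging.basicConfig()` driven by a conditional expression, program results going to stdout via `print()`, and an atypical error raised as an exception instead of `logger.error()` followed by `sys.exit()`.

```python
from __future__ import annotations

import argparse
import logging

# Explicit logger id string for a script that is run directly; modules
# that are imported would use logging.getLogger(__name__) instead.
logger = logging.getLogger("example-script")


def configure_logging(verbose: bool) -> int:
    """Set the root log level with a conditional expression and return it."""
    level = logging.DEBUG if verbose else logging.INFO
    # force=True lets this sketch reconfigure logging if called twice.
    logging.basicConfig(level=level, force=True)
    return level


def parse_size(text: str) -> int:
    """Raise on bad input rather than logging an error and exiting."""
    if not text.isdigit():
        raise ValueError(f"invalid size: {text!r}")
    return int(text)


def main(argv: list[str]) -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("size")
    args = parser.parse_args(argv)

    configure_logging(args.verbose)
    logger.debug("parsed arguments: %s", args)  # diagnostics go to stderr
    print(parse_size(args.size))  # the program's result goes to stdout
```

Keeping diagnostics on the logger and results on `print()` means the output can be piped elsewhere without log lines mixed in.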
* llama : rename ctx to user_data in progress_callback
This commit renames the `ctx` parameter to `user_data` in the
`llama_progress_callback` typedef.
The motivation for this is that other callbacks use `user_data` or
`data`, and `ctx` in this case could be confused with `llama_context`.
---------
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
* Allow KCPP_CUDA to specify CUDA version
* CUDA 12 CI Linux
* CUDA 12 CI
* Fix KCPP_CUDA indent
* KCPP_CUDA ENV Fix
StackOverflow is bad for advice sometimes....
* Lowercase cuda in output filename
* Strip . from filename output