Mirror of https://github.com/LostRuins/koboldcpp.git (synced 2025-09-11 17:44:38 +00:00)
Merge commit '751fcfc6c3' into concedo_experimental

# Conflicts:
#	.github/workflows/build.yml
#	CONTRIBUTING.md
#	README.md
#	flake.lock
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
commit c81d1623b4
41 changed files with 102934 additions and 92608 deletions
@@ -444,7 +444,7 @@ node index.js
 
 `n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When 0, no tokens will be generated but the prompt is evaluated into the cache. Default: `-1`, where `-1` is infinity.
 
-`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded.
+`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. The number excludes the BOS token.
 By default, this value is set to `0`, meaning no tokens are kept. Use `-1` to retain all tokens from the prompt.
 
 `stream`: It allows receiving each predicted token in real-time instead of waiting for the completion to finish. To enable this, set to `true`.
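For reference, a minimal sketch of how the parameters documented in this hunk could be sent to the server's completion endpoint. The URL, port, endpoint path, and response field names below are assumptions based on the defaults used elsewhere in the server examples, not part of this diff:

```ts
// Sketch: POST a completion request with n_predict, n_keep, and stream.
// Assumes a local server at http://localhost:8080 exposing /completion
// and returning a JSON body with a `content` field (both assumed here).
async function complete(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt,
      n_predict: 128, // stop after at most ~128 tokens; -1 means no limit
      n_keep: -1,     // on context overflow, retain the entire prompt
      stream: false,  // one JSON response instead of token-by-token events
    }),
  });
  const data = await res.json();
  return data.content; // assumed field holding the generated text
}

complete("Building a website can be done in 10 simple steps:")
  .then(console.log)
  .catch(console.error);
```

Setting `stream: true` instead would deliver each predicted token as it is generated, which requires reading the response incrementally rather than with a single `res.json()` call.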