mirror of https://github.com/LostRuins/koboldcpp.git
synced 2025-09-10 17:14:36 +00:00
Fix the penultimate token sometimes being lost with SSE streaming (#1031)
The token immediately before an eot token was lost when SSE streaming was enabled if that token was contained entirely within a stop sequence.

As an example of when this could happen, consider this prompt:

    Type the phrase 'pleas' once.

In a Llama 3-derived model, 'pleas' tokenizes as 'ple' 'as'. The token 'as' is contained within this instruct mode stop sequence:

    <|eot_id|><|start_header_id|>assistant<|end_header_id|>

due to the word 'assistant'.

Since `string_contains_sequence_substring` returns True for 'as', this token is added to `tokenReserve` instead of being streamed immediately. If the '<|eot_id|>' token was generated next, the text in `tokenReserve` would be discarded.
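
The failure described above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the koboldcpp code: `contains_sequence_substring` is a simplified stand-in for `string_contains_sequence_substring`, `simulate_stream` is a hypothetical reduction of the streaming loop, and the eot token is modelled as an empty final piece. Stop-sequence trimming and the actual SSE output are left out.

```python
def contains_sequence_substring(text, sequences):
    """Simplified stand-in (assumption): True if text overlaps a stop sequence."""
    text = text.strip()
    if text == "":
        return False
    for seq in sequences:
        seq = seq.strip()
        if seq == "":
            continue
        if seq in text or text in seq:
            return True
    return False


def simulate_stream(pieces, stop_sequences, patched):
    """Hypothetical reduction of the streaming loop.

    pieces: decoded token strings; the last one stands for the eot token
            and is assumed to decode to an empty string.
    patched: False uses the old flush condition, True uses the new one.
    Returns the text that would have been streamed to the client.
    """
    sent = []
    token_reserve = ""
    for index, token_str in enumerate(pieces):
        stream_done = index == len(pieces) - 1
        if not stream_done and contains_sequence_substring(token_str, stop_sequences):
            # A stop sequence could still match, so hold the text back.
            token_reserve += token_str
        else:
            if patched:
                should_flush = token_str != "" or token_reserve != ""
            else:
                should_flush = token_str != ""
            if should_flush:
                sent.append(token_reserve + token_str)
                token_reserve = ""
    return "".join(sent)


STOP = ["<|eot_id|><|start_header_id|>assistant<|end_header_id|>"]
PIECES = ["ple", "as", ""]  # 'pleas' split as in the example, then eot

print(simulate_stream(PIECES, STOP, patched=False))  # ple
print(simulate_stream(PIECES, STOP, patched=True))   # pleas
```

With the old condition the held-back 'as' is never flushed because the final piece is empty. Checking `tokenReserve` as well lets the reserve be sent even when the last token contributes no text.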
parent 948646ff7a
commit 26f1df5e5f
1 changed file with 1 addition and 1 deletion
@@ -1447,7 +1447,7 @@ class ServerRequestHandler(http.server.SimpleHTTPRequestHandler):
         tokenReserve += tokenStr
         await asyncio.sleep(async_sleep_short) #if a stop sequence could trigger soon, do not send output
     else:
-        if tokenStr!="":
+        if tokenStr!="" or tokenReserve!="":
             tokenStr = tokenReserve + tokenStr
             tokenReserve = ""