common/parser: add proper reasoning tag prefill reading (#20424)

* Implement proper prefill extraction

* Refactor cli parameters, update docs, move reasoning budget sampler part to common/reasoning-budget.cpp

* Update tools/server/server-task.cpp

* refactor: move grammars to variant, remove grammar_external, handle exception internally

* Make code less C++y

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Piotr Wilkin (ilintar) 2026-03-19 16:58:21 +01:00 committed by GitHub
parent c1258830b2
commit 5e54d51b19
33 changed files with 651 additions and 454 deletions


@@ -210,6 +210,7 @@ def test_completion_with_response_format(response_format: dict, n_predicted: int
def test_completion_with_json_schema(jinja: bool, json_schema: dict, n_predicted: int, re_content: str):
    global server
    server.jinja = jinja
    server.debug = True
    server.start()
    res = server.make_request("POST", "/chat/completions", data={
        "max_tokens": n_predicted,