docs : update speculative decoding parameters after refactor (#22397) (#22539)

* docs : update speculative decoding parameters after refactor (#22397)

  Update docs/speculative.md to reflect the new parameter naming scheme
  introduced in PR #22397:

  - Replace --draft-max/--draft-min with --spec-draft-n-max/--spec-draft-n-min
  - Replace --spec-ngram-size-n/m with per-implementation variants
  - Add documentation for all new --spec-ngram-*- parameters
  - Update all example commands

  Assisted-by: llama.cpp:local pi

* pi : add rule to use gh CLI for GitHub resources

  Assisted-by: llama.cpp:local pi

* docs : run llama-gen-docs

* arg : fix typo
2026-05-19 08:00:25 +00:00 · 2026-05-04 08:52:07 +03:00 · 2026-05-04 08:52:07 +03:00 · 846262d787
commit 846262d787
parent 6dcd824fce
6 changed files with 209 additions and 69 deletions
--- a/common/arg.cpp
+++ b/common/arg.cpp
@@ -3380,7 +3380,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
    ).set_spec().set_examples({LLAMA_EXAMPLE_SPECULATIVE, LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_CLI}));
    add_opt(common_arg(
        {"--spec-draft-poll", "--poll-draft"}, "<0|1>",
-        "Use polling to wait for draft model work (default: same as --poll])",
+        "Use polling to wait for draft model work (default: same as --poll)",
        [](common_params & params, int value) {
            params.speculative.draft.cpuparams.poll = value;
        }