docs : update speculative decoding parameters after refactor (#22397) (#22539)

* docs : update speculative decoding parameters after refactor (#22397)

Update docs/speculative.md to reflect the new parameter naming scheme
introduced in PR #22397:

- Replace --draft-max/--draft-min with --spec-draft-n-max/--spec-draft-n-min
- Replace --spec-ngram-size-n/m with per-implementation variants
- Add documentation for all new --spec-ngram-* parameters
- Update all example commands

Assisted-by: llama.cpp:local pi

* pi : add rule to use gh CLI for GitHub resources

Assisted-by: llama.cpp:local pi

* docs : run llama-gen-docs

* arg : fix typo
Georgi Gerganov 2026-05-04 08:52:07 +03:00 committed by GitHub
parent 6dcd824fce
commit 846262d787
6 changed files with 209 additions and 69 deletions

@@ -3380,7 +3380,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
     ).set_spec().set_examples({LLAMA_EXAMPLE_SPECULATIVE, LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_CLI}));
     add_opt(common_arg(
         {"--spec-draft-poll", "--poll-draft"}, "<0|1>",
-        "Use polling to wait for draft model work (default: same as --poll])",
+        "Use polling to wait for draft model work (default: same as --poll)",
         [](common_params & params, int value) {
             params.speculative.draft.cpuparams.poll = value;
         }