mirror of
https://github.com/LostRuins/koboldcpp.git
synced 2026-05-26 15:53:38 +00:00
docs: Update documentation with Granite 4.0/4.1 (#23404)
This commit is contained in:
parent
99d4026b11
commit
95feeab52e
2 changed files with 2 additions and 0 deletions
|
|
@ -489,6 +489,7 @@ The following templates have active tests in `tests/test-chat.cpp`:
|
|||
| Qwen-QwQ-32B | Reasoning | Forced-open thinking |
|
||||
| NousResearch Hermes 2 Pro | JSON_NATIVE | `<tool_call>` wrapper |
|
||||
| IBM Granite 3.3 | JSON_NATIVE | `<think></think>` + `<response></response>` |
|
||||
| IBM Granite 4.0 | JSON_NATIVE | `<tool_call>` wrapper (same template used by 4.1) |
|
||||
| ByteDance Seed-OSS | TAG_WITH_TAGGED | Custom `<seed:think>` and `<seed:tool_call>` tags |
|
||||
| Qwen3-Coder | TAG_WITH_TAGGED | XML-style tool format |
|
||||
| DeepSeek V3.1 | JSON_NATIVE | Forced thinking mode |
|
||||
|
|
|
|||
|
|
@ -291,6 +291,7 @@ Here are some models known to work (w/ chat template override when needed):
|
|||
llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
|
||||
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q6_K_L
|
||||
llama-server --jinja -fa -hf bartowski/Llama-3.3-70B-Instruct-GGUF:Q4_K_M
|
||||
llama-server --jinja -fa -hf ibm-granite/granite-4.1-3b-GGUF:Q4_K_M
|
||||
|
||||
# Native support for DeepSeek R1 works best w/ our template override (official template is buggy, although we do work around it)
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue