docs: document usage object in server timings response (#23110)

* docs: document `usage` object in server timings response

Co-Authored-By: julien-agent <Agents+cyolo@huggingface.co>

* Apply suggestion from @julien-c

---------

Co-authored-by: julien-agent <Agents+cyolo@huggingface.co>
This commit is contained in:
Julien Chaumond 2026-05-15 19:33:12 +02:00 committed by GitHub
parent 72e60f500d
commit 6831fe470c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -1322,6 +1322,22 @@ This provides information on the performance of the server. It also allows calcu
The total number of tokens in context is equal to `prompt_n + cache_n + predicted_n`
The response also includes a standard `usage` object:
```js
{
// ...
"usage": {
"completion_tokens": 48,
"prompt_tokens": 44,
"total_tokens": 92,
"prompt_tokens_details": {
"cached_tokens": 0
}
}
}
```
*Reasoning support*
The server supports parsing and returning reasoning via the `reasoning_content` field, similar to Deepseek API.