mirror of
https://github.com/LostRuins/koboldcpp.git
synced 2026-05-07 09:02:04 +00:00
* minor : code style * server : fix prompt similarity calculation * server : initial host-memory prompt caching * cont * server : refactor * cont * cont : make the server task of the slot const * cont : minor [no ci] * server : cache prompts and checkpoints only for completion tasks * server : improve prompt caching logic * cont : fix check for number of cached prompts [no ci] * server : improve caching logic, add -cram CLI arg * server : print prompt mismatch info * cont : better naming [no ci] * server : improve prompt cache loading logic * server : add option to debug the slot contents (#16482) * server : add option to debug the slot contents * Update tools/server/server.cpp --------- Co-authored-by: Xuan-Son Nguyen <son@huggingface.co> * server : add option to disable prompt cache --------- Co-authored-by: Xuan-Son Nguyen <son@huggingface.co> |
||
|---|---|---|
| .. | ||
| test_basic.py | ||
| test_chat_completion.py | ||
| test_completion.py | ||
| test_ctx_shift.py | ||
| test_embedding.py | ||
| test_infill.py | ||
| test_lora.py | ||
| test_rerank.py | ||
| test_security.py | ||
| test_slot_save.py | ||
| test_speculative.py | ||
| test_template.py | ||
| test_tokenize.py | ||
| test_tool_call.py | ||
| test_vision_api.py | ||