mirror of https://github.com/LostRuins/koboldcpp.git, synced 2026-05-05 15:31:04 +00:00
common : add standard Hugging Face cache support (#20775)
* common : add standard Hugging Face cache support

- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Check with the quant tag

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Cleanup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Improve error handling and report API errors

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Restore common_cached_model_info and align mmproj filtering

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Prefer main when getting cached ref

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Use cached files when HF API fails

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Use final_path..

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Check all inputs

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
parent e852eb4901
commit 8c7957ca33
8 changed files with 1061 additions and 330 deletions
@@ -103,8 +103,8 @@ def test_router_models_max_evicts_lru():
     candidate_models = [
         "ggml-org/tinygemma3-GGUF:Q8_0",
-        "ggml-org/test-model-stories260K",
-        "ggml-org/test-model-stories260K-infill",
+        "ggml-org/test-model-stories260K:F32",
+        "ggml-org/test-model-stories260K-infill:F32",
     ]

     # Load only the first 2 models to fill the cache
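
The hunk above switches the test models to `repo:TAG` specs, and the commit message mentions preferring `main` when looking up a cached ref. As a rough illustration only, the Python sketch below splits such a spec and resolves a snapshot directory in the standard Hugging Face hub cache layout (`models--<org>--<name>/refs` and `snapshots/<commit>`). It is not the C++ code from this commit: all function names here are hypothetical, the cache root lookup is simplified (it ignores `XDG_CACHE_HOME`), and whether the commit resolves refs in exactly this order is an assumption.

```python
import os
from pathlib import Path
from typing import Optional, Tuple


def split_model_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'org/name:TAG' into ('org/name', 'TAG'); the tag is optional."""
    repo, sep, tag = spec.partition(":")
    return repo, (tag if sep else None)


def hub_cache_root() -> Path:
    """Simplified cache root lookup: HF_HUB_CACHE, else HF_HOME/hub."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    default_home = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    return Path(os.environ.get("HF_HOME", default_home)) / "hub"


def repo_cache_dir(repo: str, root: Path) -> Path:
    """Map 'org/name' to the hub cache folder 'models--org--name'."""
    return root / ("models--" + repo.replace("/", "--"))


def cached_snapshot(repo: str, root: Path) -> Optional[Path]:
    """Return the snapshot dir for refs/main if cached, else for another ref."""
    repo_dir = repo_cache_dir(repo, root)
    refs = repo_dir / "refs"
    if not refs.is_dir():
        return None
    # Only top-level ref files are considered; nested refs are skipped here.
    candidates = sorted(p for p in refs.iterdir() if p.is_file())
    candidates.sort(key=lambda p: p.name != "main")  # stable sort: 'main' first
    for ref in candidates:
        commit = ref.read_text().strip()  # a ref file holds a commit hash
        snapshot = repo_dir / "snapshots" / commit
        if snapshot.is_dir():
            return snapshot
    return None
```

For example, `split_model_spec("ggml-org/test-model-stories260K:F32")` yields the repo id and the `F32` tag, and `cached_snapshot(repo, hub_cache_root())` returns `None` when nothing has been downloaded yet.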
|