kvcache-ai-ktransformers/ktransformers/server
Last updated: 2025-03-01 11:28:25 +00:00
api/               Merge branch 'main' into temperature_top_p_from_request     2025-02-27 18:08:55 +08:00
backend/           support chunk prefill, support 139K context for 24G VRAM    2025-03-01 11:28:25 +00:00
config/            support chunk prefill, support 139K context for 24G VRAM    2025-03-01 11:28:25 +00:00
crud/              Initial commit                                              2024-07-27 16:06:58 +08:00
models/            Initial commit                                              2024-07-27 16:06:58 +08:00
schemas/           Merge branch 'main' into temperature_top_p_from_request     2025-02-27 18:08:55 +08:00
utils/             Initial commit                                              2024-07-27 16:06:58 +08:00
__init__.py        Initial commit                                              2024-07-27 16:06:58 +08:00
args.py            support chunk prefill, support 139K context for 24G VRAM    2025-03-01 11:28:25 +00:00
exceptions.py      Initial commit                                              2024-07-27 16:06:58 +08:00
main.py            fix: fix server for triton kernel                           2025-02-17 18:08:45 +08:00
requirements.txt   Initial commit                                              2024-07-27 16:06:58 +08:00