kvcache-ai-ktransformers/ktransformers/server
2025-03-07 14:52:16 +00:00
api               [update] support openai chat completion api               2025-03-07 08:51:09 +00:00
backend           [fix] thread context bug                                  2025-03-07 14:52:16 +00:00
config            support chunk prefill, support 139K context for 24G VRAM  2025-03-01 11:28:25 +00:00
crud              Initial commit                                            2024-07-27 16:06:58 +08:00
models            Initial commit                                            2024-07-27 16:06:58 +08:00
schemas           [update] support openai chat completion api               2025-03-07 08:51:09 +00:00
utils             Initial commit                                            2024-07-27 16:06:58 +08:00
__init__.py       Initial commit                                            2024-07-27 16:06:58 +08:00
args.py           support chunk prefill, support 139K context for 24G VRAM  2025-03-01 11:28:25 +00:00
exceptions.py     Initial commit                                            2024-07-27 16:06:58 +08:00
main.py           fix: fix server for triton kernel                         2025-02-17 18:08:45 +08:00
requirements.txt  [update] support openai chat completion api               2025-03-07 08:51:09 +00:00