kvcache-ai-ktransformers/ktransformers/server
2025-03-10 09:10:28 +08:00
Name              Last commit                                                       Date
api               Merge branch 'kvcache-ai:main' into main                          2025-03-10 09:10:28 +08:00
backend           Merge pull request #842 from BITcyman/fix-openai_chat_completion  2025-03-07 22:56:19 +08:00
config            support chunk prefill, support 139K context for 24G VRAM          2025-03-01 11:28:25 +00:00
crud              Initial commit                                                    2024-07-27 16:06:58 +08:00
models            Initial commit                                                    2024-07-27 16:06:58 +08:00
schemas           [update] support openai chat completion api                       2025-03-07 08:51:09 +00:00
utils             Initial commit                                                    2024-07-27 16:06:58 +08:00
__init__.py       Initial commit                                                    2024-07-27 16:06:58 +08:00
args.py           support chunk prefill, support 139K context for 24G VRAM          2025-03-01 11:28:25 +00:00
exceptions.py     Initial commit                                                    2024-07-27 16:06:58 +08:00
main.py           fix: fix server for triton kernel                                 2025-02-17 18:08:45 +08:00
requirements.txt  [update] support openai chat completion api                       2025-03-07 08:51:09 +00:00