kvcache-ai-ktransformers/ktransformers/server (last commit: 2025-02-18 11:15:17 +08:00)
Name              Latest commit message                                   Date
api               feat: use model name in openai endpoint                 2025-02-17 00:27:32 +08:00
backend           fix: use 'cuda:0' by default if torch_device is 'cuda'  2025-02-18 11:15:17 +08:00
config            update force_think                                      2025-02-12 11:42:55 +08:00
crud              Initial commit                                          2024-07-27 16:06:58 +08:00
models            Initial commit                                          2024-07-27 16:06:58 +08:00
schemas           Initial commit                                          2024-07-27 16:06:58 +08:00
utils             Initial commit                                          2024-07-27 16:06:58 +08:00
__init__.py       Initial commit                                          2024-07-27 16:06:58 +08:00
args.py           feat: add prefix cache for server                       2025-02-17 00:10:55 +08:00
exceptions.py     Initial commit                                          2024-07-27 16:06:58 +08:00
main.py           fix: fix server for triton kernel                       2025-02-17 18:08:45 +08:00
requirements.txt  Initial commit                                          2024-07-27 16:06:58 +08:00