kvcache-ai-ktransformers/ktransformers/server
2025-05-15 04:09:34 +00:00
Name              Last commit message                                     Date
api               fix load default max_new_tokens                         2025-04-25 04:20:12 +00:00
backend           fix flashinfer float_workspace_buffer small             2025-05-14 09:33:52 +00:00
balance_serve     fix deduplicate_and_sort cudagraphs                     2025-05-15 04:09:34 +00:00
config            support qwen3, dont speak human language                2025-04-28 08:44:47 +00:00
crud              Initial commit                                          2024-07-27 16:06:58 +08:00
models            Initial commit                                          2024-07-27 16:06:58 +08:00
schemas           fix load default max_new_tokens                         2025-04-25 04:20:12 +00:00
utils             add balance-serve, support concurrence                  2025-03-31 22:55:32 +08:00
__init__.py       Initial commit                                          2024-07-27 16:06:58 +08:00
args.py           support safetensor load, delete architectures argument  2025-05-09 10:38:29 +00:00
exceptions.py     Initial commit                                          2024-07-27 16:06:58 +08:00
main.py           Move KV cache creation to balance_serve                 2025-04-18 10:10:07 +00:00
requirements.txt  support qwen3, dont speak human language                2025-04-28 08:44:47 +00:00