kvcache-ai-ktransformers/ktransformers/server/balance_serve/inference
2025-05-15 04:09:34 +00:00
distributed       add balance-serve, support concurrence                  2025-03-31 22:55:32 +08:00
sampling          add balance-serve, support concurrence                  2025-03-31 22:55:32 +08:00
__init__.py       add balance-serve, support concurrence                  2025-03-31 22:55:32 +08:00
config.py         add balance-serve, support concurrence                  2025-03-31 22:55:32 +08:00
forward_batch.py  support safetensor load, delete architectures argument  2025-05-09 10:38:29 +00:00
model_runner.py   fix deduplicate_and_sort cudagraphs                     2025-05-15 04:09:34 +00:00
query_manager.py  remove hard code max_length                             2025-04-18 12:11:18 +08:00