kvcache-ai-ktransformers/ktransformers/operators
2024-08-08 09:04:36 +00:00
__init__.py Initial commit 2024-07-27 16:06:58 +08:00
attention.py Initial commit 2024-07-27 16:06:58 +08:00
base_operator.py Initial commit 2024-07-27 16:06:58 +08:00
experts.py 1) Linear and MLP operators support qlen>1; 2) All operators now share a single memory buffer; 3) Refactor CPUInfer submit/sync logic. 2024-08-08 09:04:36 +00:00
layer_wise_prefill.py Initial commit 2024-07-27 16:06:58 +08:00
linear.py Initial commit 2024-07-27 16:06:58 +08:00
RoPE.py Initial commit 2024-07-27 16:06:58 +08:00