mirror of
https://github.com/kvcache-ai/ktransformers.git
synced 2025-09-11 07:44:35 +00:00
The Marlin quantized linear operator only supports GPU devices. When changing generate_op
to "KLinearMarlin", generate_device must be changed to "cuda" accordingly.
Fixes:
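The constraint in this commit message can be checked mechanically before a rule is applied. The following is a minimal sketch, assuming a rule's keyword arguments are available as a plain dictionary with `generate_op` and `generate_device` keys; the helper `check_marlin_device` is hypothetical and is not part of ktransformers.

```python
# Hypothetical helper (not part of ktransformers): checks that a rule which
# selects the Marlin operator for generation also targets a CUDA device.
MARLIN_OP = "KLinearMarlin"


def check_marlin_device(kwargs: dict) -> list:
    """Return a list of problems found in one rule's kwargs (empty if consistent)."""
    problems = []
    op = kwargs.get("generate_op")
    # A missing device is treated as inconsistent, since Marlin needs a GPU.
    device = str(kwargs.get("generate_device", ""))
    if op == MARLIN_OP and not device.startswith("cuda"):
        problems.append(
            f'generate_op is "{MARLIN_OP}" but generate_device is "{device}"; '
            'expected "cuda"'
        )
    return problems


# Assumed example inputs, shaped like the kwargs named in the commit message.
inconsistent = {"generate_op": "KLinearMarlin", "generate_device": "cpu"}
consistent = {"generate_op": "KLinearMarlin", "generate_device": "cuda"}

for rule_kwargs in (inconsistent, consistent):
    for problem in check_marlin_device(rule_kwargs):
        print(problem)
```

Only the first dictionary produces a message. A device string such as `"cuda:0"` also passes, because the check only requires the `cuda` prefix.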
| | | |
|---|---|---|
| .. | | |
| optimize_rules | | |
| optimize.py | | |