Mirror of https://github.com/kvcache-ai/ktransformers.git
(synced 2025-09-15 01:29:42 +00:00)
Marlin quantized linear only supports GPU devices. When changing generate_op
to "KLinearMarlin", generate_device must be changed to "cuda" accordingly.
Fixes:
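The constraint in the commit message above can be sketched as a small consistency check on the keyword arguments of a linear-operator rule. This is only an illustration: `check_linear_kwargs` and `GPU_ONLY_OPS` are hypothetical names that are not part of ktransformers, and accepting indexed devices such as "cuda:0" is an assumption beyond what the message states.

```python
# Hypothetical helper, not part of ktransformers: it only encodes the rule
# stated in the commit message (KLinearMarlin requires a CUDA generate_device).
GPU_ONLY_OPS = {"KLinearMarlin"}


def check_linear_kwargs(kwargs: dict) -> None:
    """Raise ValueError if a GPU-only generate_op is paired with a non-CUDA device."""
    op = kwargs.get("generate_op")
    device = str(kwargs.get("generate_device", ""))
    # Assumption: "cuda" and indexed forms like "cuda:0" both count as GPU devices.
    on_gpu = device == "cuda" or device.startswith("cuda:")
    if op in GPU_ONLY_OPS and not on_gpu:
        raise ValueError(
            f'generate_op "{op}" only supports GPU devices; '
            f'set generate_device to "cuda" (got "{device}")'
        )


# A consistent pair passes silently.
check_linear_kwargs({"generate_op": "KLinearMarlin", "generate_device": "cuda"})
```

Running a check like this before loading a model would surface the mismatch as a clear error message instead of a failure later inside the operator.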
| Name |
|---|
| configs |
| ktransformers_ext |
| models |
| operators |
| optimize |
| server |
| tests |
| util |
| website |
| __init__.py |
| local_chat.py |
| local_chat_test.py |