kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2025-09-11 07:44:35 +00:00

Aubrey Li a12e8ab46e	2025-03-21 23:58:20 +08:00
yaml: fix Marlin AssertionError

Marlin quantized linear only supports GPU devices. When generate_op is changed to "KLinearMarlin", generate_device must be changed to "cuda" accordingly.

Fixes: `e5b001d76f` ("Update readme; Format code; Add example yaml.")

optimize_rules	yaml: fix Marlin AssertionError	2025-03-21 23:58:20 +08:00
optimize.py	optimize gguf dequant, save mem, support Q2_K	2025-02-22 06:13:01 +00:00
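
The constraint described in the latest commit message (generate_op set to "KLinearMarlin" requires generate_device to be "cuda") can be illustrated with a small check. This is only a sketch: the helper `marlin_device_ok` and the flat `kwargs` dictionary are hypothetical and are not part of ktransformers. Only the key names `generate_op` and `generate_device` and the value "KLinearMarlin" come from the commit message. Accepting device strings such as "cuda:0" is an additional assumption.

```python
# Hypothetical helper, not part of ktransformers: checks the rule stated in
# the commit message for a single set of operator kwargs.
def marlin_device_ok(kwargs: dict) -> bool:
    """Return True unless generate_op is KLinearMarlin on a non-CUDA device."""
    if kwargs.get("generate_op") != "KLinearMarlin":
        # The constraint only concerns the Marlin operator.
        return True
    # Assumption: any device string starting with "cuda" (e.g. "cuda:0") is a GPU.
    return str(kwargs.get("generate_device", "")).startswith("cuda")


if __name__ == "__main__":
    broken = {"generate_op": "KLinearMarlin", "generate_device": "cpu"}
    fixed = {"generate_op": "KLinearMarlin", "generate_device": "cuda"}
    print(marlin_device_ok(broken))
    print(marlin_device_ok(fixed))
```

A check like this could run over the kwargs loaded from a rule file before the model is built, so that a mismatch is reported up front rather than as an AssertionError at load time.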