Commit graph

9 commits

Author  SHA1  Message  Date
Azure-Tang  ed8437413b  merge main; Add torch q8 linear  2025-03-14 05:52:07 -04:00
Atream  f4c198bd42  support absorb for prefill long context  2025-02-25 08:52:02 +00:00
Atream  006e8c6abc  remove causal mask  2025-02-23 07:40:47 +00:00
Azure  907251c743  done support deepseekv3  2025-02-04 15:53:38 +00:00
Azure  f748cd29f0  fix rope; update moegate  2025-02-01 18:05:45 +00:00
Azure  476b1d8dc6  support deepseekv3; runable but have precition problem  2025-01-31 08:27:24 +00:00
TangJingqi  6735beb5b6  Fix cannot offload whole layer in cpu  2024-08-29 19:10:14 +08:00
chenxl  4d1d561d28  [feature] release 0.1.3  2024-08-28 16:11:43 +00:00
TangJingqi  67043b4b5c  [fix] format classes and files name  2024-08-15 10:44:59 +08:00
Renamed from ktransformers/operators/layer_wise_prefill.py