| Author | Commit | Message | Date |
|---|---|---|---|
| Azure-Tang | 203b853c75 | rm KMoEGateDeepSeekV3, fall back to KMoEGate | 2025-04-01 07:13:05 +00:00 |
| Atream | a889288fc1 | use compile for gate, slight performance improvement | 2025-03-14 12:43:28 +00:00 |
| Atream | 5ec33d046d | optimize gguf dequant, save mem, support Q2_K; use marlin for lm_head, lm_head only calc last token for prefill; extend context window to 19K for DeepSeek-V3/R1 within 24GB VRAM | 2025-02-22 06:13:01 +00:00 |
| liam | 83401dbb3b | ⚡ ready to publish | 2025-02-10 12:29:23 +08:00 |
| Azure | 907251c743 | done support deepseekv3 | 2025-02-04 15:53:38 +00:00 |
| Azure | f748cd29f0 | fix rope; update moegate | 2025-02-01 18:05:45 +00:00 |
| Azure | f873558a89 | update rope calculation; update modeling.py; update gate for moe | 2025-02-01 07:32:21 +00:00 |
| Azure | 476b1d8dc6 | support deepseekv3; runnable but has precision problem | 2025-01-31 08:27:24 +00:00 |