liam | 098602b08f | ⚡ v0.2 ongoing | 2025-02-09 22:41:14 +08:00
liam | c18ecd7b7f | ⚡ add flush print in local_chat output and change default optimize yaml of deepseekv3 to single gpu | 2025-02-08 13:15:52 +08:00
liam | 0262f954c7 | Merge branch 'feat-DeepSeekV3' of github.com:kvcache-ai/ktransformers into feat-DeepSeekV3 | 2025-02-06 22:41:25 +08:00
liam | 3dca28d23b | ⚡ fix moe.cpp int overflow problem | 2025-02-06 22:39:16 +08:00
Azure | 027b11266c | modify moeinfer param | 2025-02-06 14:07:38 +00:00
Azure | ee24a27001 | update v3 single gpu rule yaml; | 2025-02-04 16:14:35 +00:00
Azure | f873558a89 | update rope calculation; update modeling.py; update gate for moe | 2025-02-01 07:32:21 +00:00
Azure | 5a50b34627 | fix hard coding caused by rope dim calculation, load from config now | 2025-01-31 15:25:50 +00:00
Azure | 476b1d8dc6 | support deepseekv3; runable but have precition problem | 2025-01-31 08:27:24 +00:00
anyanqilin | a72dc6ed15 | wjh change | 2024-11-04 14:02:19 +08:00
liam | 7c94df4bcf | 🚑️: back transformer.py bugs version, and fix typo error in local_chat.py | 2024-11-04 14:02:19 +08:00
liam | dd1d8667f3 | ✨: refactor local_chat and fix message slice bug in server | 2024-11-04 14:02:19 +08:00
TangJingqi | 6735beb5b6 | Fix cannot offload whole layer in cpu | 2024-08-29 19:10:14 +08:00
chenxl | 4d1d561d28 | [feature] release 0.1.3 | 2024-08-28 16:11:43 +00:00
chenxl | f5f79f5c0e | [ADD] support multi-gpu qlen>1 q5_k | 2024-08-12 11:41:26 +00:00
chenxl | 18c42e67df | Initial commit | 2024-07-27 16:06:58 +08:00