10 commits

Author SHA1 Message Date
Atream b0318fc01c fix-hopper-flashinfer 2025-04-29 11:06:34 +08:00
Azure-Tang 31677181c3 Fix ktransformers-server flashinfer wrapper position arg issue; Fix db position issue 2025-04-01 07:30:23 +00:00
Atream 25cee5810e add balance-serve, support concurrence 2025-03-31 22:55:32 +08:00
Atream d453c320f1 fix flashinfer precision 2025-03-07 14:07:00 +00:00
Atream f35e8d41d8 support chunk prefill, support 139K context for 24G VRAM 2025-03-01 11:28:25 +00:00
Atream e645d84794 use generation config from json file in official repo 2025-02-27 11:48:34 +00:00
Atream 477ac28a9c fix-update-flashinfer_wrapper_local_chat 2025-02-25 12:47:31 +00:00
Atream f4c198bd42 support absorb for prefill long context 2025-02-25 08:52:02 +00:00
Atream a529518346 clean PR code and disable flashinfer 2025-02-19 04:42:47 +00:00
Atream 038bc30888 fix precision bug imported by position_ids in 0.2.0 2025-02-17 09:23:14 +00:00