Commit graph

4 commits

Author SHA1 Message Date
wangkuigang-yewu-cmss
4538bdae97 prevent rpc process from crashing on long prompt
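When the prompt exceeds cache_len, the rpc process crashes, making the whole service unavailable.
This adds a check so that overly long prompts are filtered out early in the request.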
2025-04-13 16:13:16 +08:00
dongjw
ec03bcbd7f fix temperature=0, flashinfer sample error 2025-04-07 12:30:47 +08:00
dongjw
5c7ed7b579 fix top_p = 0 bug 2025-04-01 20:38:33 +08:00
Atream
25cee5810e add balance-serve, support concurrence 2025-03-31 22:55:32 +08:00
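
The guard described in commit 4538bdae97 can be illustrated with a minimal sketch. This is not the code from that commit: the function name, the exception type, and the idea of passing a token count are assumptions made for illustration; only the cache_len limit and the early rejection come from the commit message.

```python
class PromptTooLongError(ValueError):
    """Hypothetical error raised when a prompt cannot fit within cache_len."""


def check_prompt_length(prompt_token_count: int, cache_len: int) -> None:
    """Reject a request before it reaches the rpc process.

    Assumption: the prompt length is measured in tokens and compared
    directly against cache_len, as the commit message suggests.
    """
    if prompt_token_count > cache_len:
        raise PromptTooLongError(
            f"prompt has {prompt_token_count} tokens, "
            f"but cache_len is {cache_len}"
        )


# Usage: validate at the start of request handling, so an oversized prompt
# becomes a client-facing error instead of a crash in the worker.
try:
    check_prompt_length(prompt_token_count=5000, cache_len=4096)
except PromptTooLongError as err:
    print(f"request rejected: {err}")
```

Raising at the request boundary keeps the failure local to the offending request, which matches the stated goal of filtering long prompts out early.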