mirror of https://github.com/kvcache-ai/ktransformers.git
synced 2025-09-15 17:49:42 +00:00
commit 3a3021502e
1 changed file with 1 addition and 1 deletion
@@ -86,7 +86,7 @@ Memory: standard DDR5-4800 server DRAM (1 TB), each socket with 8×DDR5-4800
 #### Change Log
 
 - Longer Context (from 4K to 8K for 24GB VRAM) and Slightly Faster Speed (+15%):<br>
 Integrated the highly efficient Triton MLA Kernel from the fantastic sglang project, enable much longer context length and slightly faster prefill/decode speed
-- We suspect that some of the improvements come from the change of hardwre platform (4090D->4090)
+- We suspect that some of the improvements come from the change of hardware platform (4090D->4090)
 
 #### Benchmark Results