kvcache-ai-ktransformers/ktransformers
2025-02-20 15:01:35 +08:00
Name | Last commit message | Last commit date
configs | update rope calculation; update modeling.py; update gate for moe | 2025-02-01 07:32:21 +00:00
ktransformers_ext | toy support for experts on GPU, no CUDA Graph | 2025-02-15 15:16:00 +00:00
models | Merge branch 'fix_precision_MLA' of https://github.com/kvcache-ai/ktransformers into server-prefix-cache | 2025-02-17 18:08:04 +08:00
operators | clean PR code and disable flashinfer | 2025-02-19 04:42:47 +00:00
optimize | toy support for experts on GPU, no CUDA Graph | 2025-02-15 15:16:00 +00:00
server | fix: fix SSE formatting | 2025-02-20 15:01:35 +08:00
tests | add mmlu_pro test | 2025-02-18 14:43:38 +08:00
util | fix precision bug imported by position_ids in 0.2.0 | 2025-02-17 09:23:14 +00:00
website | : refactor local_chat and fix message slice bug in server | 2024-11-04 14:02:19 +08:00
__init__.py | 🔖 release v0.2.1.post1 | 2025-02-18 20:45:48 +08:00
local_chat.py | fix precision bug imported by position_ids in 0.2.0 | 2025-02-17 09:23:14 +00:00