kvcache-ai-ktransformers/ktransformers
2025-03-07 22:56:19 +08:00
Name               Last commit message                                               Last commit date
configs            update rope calculation; update modeling.py; update gate for moe  2025-02-01 07:32:21 +00:00
ktransformers_ext  update compile option for avx512vpopcntdq                         2025-03-06 12:18:04 +08:00
models             optimize gguf dequant, save mem, support Q2_K                     2025-02-22 06:13:01 +00:00
operators          fix flashinfer precision                                          2025-03-07 14:07:00 +00:00
optimize           Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml                     2025-02-26 21:53:50 +08:00
server             Merge pull request #842 from BITcyman/fix-openai_chat_completion  2025-03-07 22:56:19 +08:00
tests              fix flashinfer precision                                          2025-03-07 14:07:00 +00:00
util               fix flashinfer precision                                          2025-03-07 14:07:00 +00:00
website            : refactor local_chat and fix message slice bug in server         2024-11-04 14:02:19 +08:00
__init__.py        Update __init__.py                                                2025-03-07 22:08:48 +08:00
local_chat.py      Update local_chat.py                                              2025-03-01 21:52:48 +08:00