kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2026-05-05 23:50:14 +00:00

BITcyman 08a8b553d6 [fix] thread context bug	2025-03-07 14:52:16 +00:00
configs	update rope calculation; update modeling.py; update gate for moe	2025-02-01 07:32:21 +00:00
ktransformers_ext	⚡ update compile option for avx512vpopcntdq	2025-03-06 12:18:04 +08:00
models	optimize gguf dequant, save mem, support Q2_K	2025-02-22 06:13:01 +00:00
operators	fix: wrong shape in KLinearMarlin.	2025-03-03 17:34:45 +08:00
optimize	Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml	2025-02-26 21:53:50 +08:00
server	[fix] thread context bug	2025-03-07 14:52:16 +00:00
tests	⚡ update compile option for avx512vpopcntdq	2025-03-06 12:18:04 +08:00
util	support chunk prefill, support 139K context for 24G VRAM	2025-03-01 11:28:25 +00:00
website	✨: refactor local_chat and fix message slice bug in server	2024-11-04 14:02:19 +08:00
__init__.py	⚡ release v0.2.3	2025-03-05 20:21:04 +08:00
local_chat.py	Update local_chat.py	2025-03-01 21:52:48 +08:00