kvcache-ai-ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2025-09-06 12:40:02 +00:00

Azure ff6b265e53 Mock triton mla due to precision issue		2025-02-16 06:03:12 +00:00
configs	update rope calculation; update modeling.py; update gate for moe	2025-02-01 07:32:21 +00:00
ktransformers_ext	toy support for experts on GPU, no CUDA Graph	2025-02-15 15:16:00 +00:00
models	Fix NoneType object has no attribute zero_	2025-02-15 22:04:45 +08:00
operators	Mock triton mla due to precision issue	2025-02-16 06:03:12 +00:00
optimize	toy support for experts on GPU, no CUDA Graph	2025-02-15 15:16:00 +00:00
server	Add a lock to server inference()	2025-02-13 10:05:22 +00:00
tests	[fix] format classes and files name	2024-08-15 10:44:59 +08:00
util	Merge pull request #333 from kvcache-ai/feat_experts_gpu	2025-02-15 23:30:24 +08:00
website	✨: refactor local_chat and fix message slice bug in server	2024-11-04 14:02:19 +08:00
__init__.py	[feature] update docker image and entrypoint	2025-02-15 07:55:33 +00:00
local_chat.py	⚡ support force thinking	2025-02-12 12:43:53 +08:00