| models | Fix NoneType object has no attribute zero_ | 2025-02-15 22:04:45 +08:00 |
| operators | Mock triton mla due to precision issue | 2025-02-16 06:03:12 +00:00 |
| optimize | toy support for experts on GPU, no CUDA Graph | 2025-02-15 15:16:00 +00:00 |
| server | Add a lock to server inference() | 2025-02-13 10:05:22 +00:00 |
| tests | [fix] format classes and files name | 2024-08-15 10:44:59 +08:00 |
| util | Merge pull request #333 from kvcache-ai/feat_experts_gpu | 2025-02-15 23:30:24 +08:00 |
| __init__.py | [feature] update docker image and entrypoint | 2025-02-15 07:55:33 +00:00 |
| local_chat.py | ⚡ support force thinking | 2025-02-12 12:43:53 +08:00 |