| models | optimize gguf dequant, save mem, support Q2_K | 2025-02-22 06:13:01 +00:00 |
| operators | support Moonlight | 2025-02-23 14:21:18 +00:00 |
| optimize | fix KExpertsMarlin on GPU with out CUDA Graph | 2025-02-24 09:30:54 +00:00 |
| server | Merge branch 'main' into feat-more-context | 2025-02-22 06:17:39 +00:00 |
| tests | ⚡ fix .so bug | 2025-02-20 21:24:46 +08:00 |
| util | fix KExpertsMarlin on GPU with out CUDA Graph | 2025-02-24 09:30:54 +00:00 |
| __init__.py | 🔖 release v0.2.1.post1 | 2025-02-18 20:45:48 +08:00 |
| local_chat.py | support Moonlight | 2025-02-23 14:21:18 +00:00 |