__init__.py | Initial commit | 2024-07-27 16:06:58 +08:00 |
configuration_deepseek.py | Initial commit | 2024-07-27 16:06:58 +08:00 |
configuration_deepseek_v3.py | add balance-serve, support concurrence | 2025-03-31 22:55:32 +08:00 |
configuration_llama.py | [feature] release 0.1.3 | 2024-08-28 16:11:43 +00:00 |
configuration_qwen2_moe.py | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |
configuration_qwen3_moe.py | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |
custom_cache.py | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |
custom_modeling_deepseek_v2.py | add balance-serve, support concurrence | 2025-03-31 22:55:32 +08:00 |
custom_modeling_deepseek_v3.py | Fix bug with non-base-multiple chunk_size, update test examples, and resolve issue with writing model_config. Hugging Face URL input is still unsupported. | 2025-04-04 15:41:07 +08:00 |
custom_modeling_qwen2_moe.py | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |
custom_modeling_qwen3_moe.py | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |
modeling_deepseek.py | optimize gguf dequant, save mem, support Q2_K | 2025-02-22 06:13:01 +00:00 |
modeling_deepseek_v3.py | Update modeling_deepseek_v3.py | 2025-04-03 17:13:06 +08:00 |
modeling_llama.py | [feature] release 0.1.3 | 2024-08-28 16:11:43 +00:00 |
modeling_mixtral.py | [ADD] support multi-gpu qlen>1 q5_k | 2024-08-12 11:41:26 +00:00 |
modeling_qwen2_moe.py | Initial commit | 2024-07-27 16:06:58 +08:00 |
modeling_qwen3_moe.py | support qwen3, dont speak human language | 2025-04-28 08:44:47 +00:00 |