kvcache-ai-ktransformers/ktransformers
chenmz00 b2ba795cfd
fix: list models API
Fix the list models API to match the corresponding OpenAI API format.
2025-03-05 21:49:27 +08:00
Name               Last commit                                                       Date
configs            update rope calculation; update modeling.py; update gate for moe  2025-02-01 07:32:21 +00:00
ktransformers_ext  Merge pull request #622 from akemimadoka/fix-msvc                 2025-02-27 17:42:00 +08:00
models             optimize gguf dequant, save mem, support Q2_K                     2025-02-22 06:13:01 +00:00
operators          fix: wrong shape in KLinearMarlin.                                2025-03-03 17:34:45 +08:00
optimize           Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml                     2025-02-26 21:53:50 +08:00
server             fix: list models API                                              2025-03-05 21:49:27 +08:00
tests              release v0.2.3                                                    2025-03-05 20:21:04 +08:00
util               support chunk prefill, support 139K context for 24G VRAM          2025-03-01 11:28:25 +00:00
website            : refactor local_chat and fix message slice bug in server         2024-11-04 14:02:19 +08:00
__init__.py        release v0.2.3                                                    2025-03-05 20:21:04 +08:00
local_chat.py      Update local_chat.py                                              2025-03-01 21:52:48 +08:00