Mirror of https://github.com/LostRuins/koboldcpp.git, synced 2026-04-26 10:41:25 +00:00
CUDA: reduce MMQ stream-k overhead (#22298)
update-ops-docs.yml #310 - Commit 9725a313be, pushed by vrr
parser: fix structured output bug (#22302)
python-type-check.yml #309 - Commit 0adede866d, pushed by vrr
parser: fix structured output bug (#22302)
python-check-requirements.yml #308 - Commit 0adede866d, pushed by vrr
parser: fix structured output bug (#22302)
pre-tokenizer-hashes.yml #307 - Commit 0adede866d, pushed by vrr
chat: fix parallel_tool_calls default setting based on model capabilities, add tests for parallel tool calls and structured outputs (#22217)
python-type-check.yml #306 - Commit 8bccdbbff9, pushed by vrr
chat: fix parallel_tool_calls default setting based on model capabilities, add tests for parallel tool calls and structured outputs (#22217)
python-check-requirements.yml #305 - Commit 8bccdbbff9, pushed by vrr
chat: fix parallel_tool_calls default setting based on model capabilities, add tests for parallel tool calls and structured outputs (#22217)
pre-tokenizer-hashes.yml #304 - Commit 8bccdbbff9, pushed by vrr
fix: GLM-DSA crash in llama-tokenize when using vocab_only (#22102)
python-type-check.yml #303 - Commit 81df3f7cfa, pushed by vrr
fix: GLM-DSA crash in llama-tokenize when using vocab_only (#22102)
python-check-requirements.yml #302 - Commit 81df3f7cfa, pushed by vrr
fix: GLM-DSA crash in llama-tokenize when using vocab_only (#22102)
pre-tokenizer-hashes.yml #301 - Commit 81df3f7cfa, pushed by vrr
android : libcommon -> libllama-common (#22076)
python-type-check.yml #300 - Commit 23b8cc4991, pushed by vrr
android : libcommon -> libllama-common (#22076)
python-check-requirements.yml #299 - Commit 23b8cc4991, pushed by vrr
android : libcommon -> libllama-common (#22076)
pre-tokenizer-hashes.yml #298 - Commit 23b8cc4991, pushed by vrr
ci : add android arm64 build and release (#21647)
update-ops-docs.yml #297 - Commit a279d0f0f4, pushed by vrr
ci : add android arm64 build and release (#21647)
python-type-check.yml #296 - Commit a279d0f0f4, pushed by vrr
ci : add android arm64 build and release (#21647)
python-check-requirements.yml #295 - Commit a279d0f0f4, pushed by vrr
ci : add android arm64 build and release (#21647)
pre-tokenizer-hashes.yml #294 - Commit a279d0f0f4, pushed by vrr
vulkan: Flash Attention DP4A shader for quantized KV cache (#20797)
python-type-check.yml #293 - Commit 75f3bc94e6, pushed by vrr
vulkan: Flash Attention DP4A shader for quantized KV cache (#20797)
python-check-requirements.yml #292 - Commit 75f3bc94e6, pushed by vrr
vulkan: Flash Attention DP4A shader for quantized KV cache (#20797)
pre-tokenizer-hashes.yml #291 - Commit 75f3bc94e6, pushed by vrr
CUDA: skip compilation of superfluous FA kernels (#21768)
python-type-check.yml #290 - Commit ff5ef82786, pushed by vrr
CUDA: skip compilation of superfluous FA kernels (#21768)
python-check-requirements.yml #289 - Commit ff5ef82786, pushed by vrr
CUDA: skip compilation of superfluous FA kernels (#21768)
pre-tokenizer-hashes.yml #288 - Commit ff5ef82786, pushed by vrr
fix: Fix broken structured output when using $refs in json_schema (#21699)
python-type-check.yml #287 - Commit b136b62cf9, pushed by vrr
fix: Fix broken structured output when using $refs in json_schema (#21699)
python-check-requirements.yml #286 - Commit b136b62cf9, pushed by vrr
fix: Fix broken structured output when using $refs in json_schema (#21699)
pre-tokenizer-hashes.yml #285 - Commit b136b62cf9, pushed by vrr
CUDA: also store `node->src->data` ptrs for equality check (#21635)
python-type-check.yml #284 - Commit d12cc3d1ca, pushed by vrr
CUDA: also store `node->src->data` ptrs for equality check (#21635)
python-check-requirements.yml #283 - Commit d12cc3d1ca, pushed by vrr
CUDA: also store `node->src->data` ptrs for equality check (#21635)
pre-tokenizer-hashes.yml #282 - Commit d12cc3d1ca, pushed by vrr
docs: fix typo in build.md (emdawbwebgpu -> emdawnwebgpu) (#21518)
update-ops-docs.yml #281 - Commit 0033f53a07, pushed by vrr