Concedo
|
9a46faa1c3
|
fix for override tensors not passing correctly
|
2025-11-28 13:03:40 +08:00 |
|
Concedo
|
782ec5bffe
|
bad identifier name
|
2025-11-27 11:07:13 +08:00 |
|
Concedo
|
d68f4a5ae5
|
disable clip fa for now
|
2025-11-27 10:20:38 +08:00 |
|
Concedo
|
6770767d8a
|
allow FA for clip but with wmma disabled for turing on bad sizes
|
2025-11-27 01:03:29 +08:00 |
|
Concedo
|
e6ad29341b
|
disable FA for clip test
|
2025-11-27 01:02:19 +08:00 |
|
CasualAutopsy
|
7703bed260
|
Temp: Fix Needlessly Iterating on Candidates During Greedy Sampling (#1854)
|
2025-11-22 16:06:50 +08:00 |
|
Concedo
|
8631bbcee3
|
linting
|
2025-11-18 18:56:31 +08:00 |
|
LostRuins Concedo
|
7aea1d7c02
|
clean up unused llava functions, fix qwen3vl loading
|
2025-11-18 10:34:55 +08:00 |
|
LostRuins Concedo
|
281542aa0d
|
add smoothing curve, not tested
|
2025-11-17 23:07:35 +08:00 |
|
LostRuins Concedo
|
3fe0e39b62
|
Merge commit '4dca015b7e' into concedo_experimental
# Conflicts:
# .github/copilot-instructions.md
# README.md
# docs/ops.md
# docs/ops/CPU.csv
# docs/ops/CUDA.csv
# docs/ops/Vulkan.csv
# ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
# src/CMakeLists.txt
# tests/test-backend-ops.cpp
|
2025-11-16 18:33:58 +08:00 |
|
LostRuins Concedo
|
86f907272a
|
relocated shader compile warning
|
2025-11-15 23:17:47 +08:00 |
|
LostRuins Concedo
|
d6a2ad8455
|
still not really working right
|
2025-11-09 01:57:48 +08:00 |
|
LostRuins Concedo
|
cfb22b5c9d
|
rename a missed BLAS -> batch
|
2025-11-06 16:11:26 +08:00 |
|
Concedo
|
0891b0752d
|
qwen3vl fixed (+2 squashed commit)
Squashed commit:
[89f65ed0c] wip fixing q3vl
[6fa34cff2] wip fixing q3vl
|
2025-10-31 17:52:33 +08:00 |
|
Concedo
|
57e1d9c822
|
rename blasbatchsize to batchsize
|
2025-10-24 18:16:54 +08:00 |
|
Concedo
|
68c9d955d2
|
support multiple override kv
|
2025-10-24 17:28:54 +08:00 |
|
Concedo
|
e92f9fd422
|
cursed hack for RNN models
|
2025-10-11 23:14:55 +08:00 |
|
Concedo
|
3b30f12ca7
|
future proof handling of rnn models
|
2025-10-07 19:12:47 +08:00 |
|
Concedo
|
5d89a48a50
|
add more rnn models supported
|
2025-09-24 18:14:59 +08:00 |
|
Concedo
|
7e35954695
|
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# docs/build.md
# docs/function-calling.md
# examples/eval-callback/eval-callback.cpp
# ggml/CMakeLists.txt
# ggml/src/ggml-cann/ggml-cann.cpp
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-cpu/kleidiai/kernels.cpp
# ggml/src/ggml-cpu/kleidiai/kernels.h
# ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
# scripts/compare-llama-bench.py
# scripts/server-bench.py
# scripts/tool_bench.py
# tests/test-chat.cpp
# tools/batched-bench/batched-bench.cpp
# tools/llama-bench/llama-bench.cpp
# tools/server/README.md
|
2025-08-31 23:33:36 +08:00 |
|
Concedo
|
3210b378e8
|
better tool calls
|
2025-08-20 22:11:31 +08:00 |
|
Concedo
|
5a921a40f9
|
add overridenativecontext flag, stop nagging me
|
2025-08-14 22:54:45 +08:00 |
|
Concedo
|
4c1faf61b2
|
increment version (+1 squashed commits)
Squashed commits:
[6e5080ad2] increment version
|
2025-08-09 20:53:26 +08:00 |
|
Concedo
|
338b1fe97e
|
readjusted mistral and oai template, fixed compile issue on termux, updated lite, show generated token ids in debug mode
|
2025-08-07 21:14:48 +08:00 |
|
Concedo
|
34487d3c02
|
gpt oss harmony template
|
2025-08-06 11:39:40 +08:00 |
|
Concedo
|
e40d26b9e7
|
allow offloading moe to cpu with --moecpu
|
2025-08-05 23:42:42 +08:00 |
|
Concedo
|
428a07416a
|
cleanup some debug
|
2025-08-05 00:07:22 +08:00 |
|
Concedo
|
3284757b56
|
voxtral mini is really bad
|
2025-07-29 21:22:17 +08:00 |
|
Concedo
|
abf527a207
|
clearer multimodal capability display
|
2025-07-28 22:54:49 +08:00 |
|
Concedo
|
12a6088a65
|
added voxtral support, however without the magic token it hears audio as text
|
2025-07-28 22:35:59 +08:00 |
|
Concedo
|
b87864144b
|
no ctx shift for all mrope
|
2025-07-25 13:53:20 +08:00 |
|
Concedo
|
9f4d0f6ccf
|
fixed swa pp bug by retrying smaller batches
|
2025-07-21 23:34:22 +08:00 |
|
Concedo
|
6d50def409
|
default kv_unified to true, handle LLAMA_SET_ROWS.
|
2025-07-21 16:13:20 +08:00 |
|
Concedo
|
b028dd4e84
|
minor fixes
|
2025-07-18 13:22:59 +08:00 |
|
Concedo
|
f0564f9caf
|
updated lite, added better separators for multimodal chunks (universal)
|
2025-07-17 00:11:08 +08:00 |
|
Concedo
|
bc2877d2fe
|
test without g3n fix
|
2025-07-13 23:42:59 +08:00 |
|
Concedo
|
811463a704
|
split audio and vision detection separately
|
2025-07-13 17:47:15 +08:00 |
|
Concedo
|
dca49de059
|
fixed qwen2 audio issues, works fine now (+3 squashed commit)
Squashed commit:
[b3053a1ba] updated lite
[5071630d6] fixed mtmd issues, audio works
[06efa5af4] fix mtmd compile
|
2025-07-12 18:54:41 +08:00 |
|
Concedo
|
5a3b2e3921
|
fix for jamba models - they have recurrent layers like rwkv, so context shifting and forwarding won't work on them.
|
2025-07-12 18:54:40 +08:00 |
|
Concedo
|
e9473305d0
|
wip2 (+1 squashed commits)
Squashed commits:
[4628777b6] wip
|
2025-07-12 18:54:40 +08:00 |
|
Concedo
|
c45b8dc56f
|
fix for gemma3n
|
2025-07-10 17:39:08 +08:00 |
|
Reithan
|
0097de5c57
|
improve performance by actually applying nsigma's masking (#1602)
merging, please report any issues.
|
2025-07-07 15:41:46 +08:00 |
|
Concedo
|
2e14338455
|
additional padding for the swa kv cache itself
|
2025-06-28 15:52:48 +08:00 |
|
Concedo
|
815d2056d9
|
gentoken reservations
|
2025-06-28 09:16:20 +08:00 |
|
Concedo
|
39b0699c71
|
fixed savestates with drafting
|
2025-06-27 20:35:38 +08:00 |
|
Reithan
|
54dde5e565
|
Add memoized cache to llama_grammar_reject_candidates_for_stack (#1615)
* Add memoized cache to llama_grammar_reject_candidates_for_stack
* make size cutoff more aggressive and move to outer branch
* update comment
* add cache reset whenever grammar is reloaded
* remove explicit reference types for compiler transportability
|
2025-06-25 19:22:19 +08:00 |
|
Concedo
|
65ff041827
|
added more perf stats
|
2025-06-21 12:12:28 +08:00 |
|
Reithan
|
f07434f4c1
|
streamline grammar sampler to speed up generation while using heavy grammar (#1606)
|
2025-06-17 23:04:59 +08:00 |
|
Concedo
|
c494525b33
|
update deprecated apis
|
2025-06-13 22:21:15 +08:00 |
|
Reithan
|
f1c9db4174
|
fix-loss-of-destroyed-tokens-in-grammar-pre-pass (#1600)
|
2025-06-13 18:46:38 +08:00 |
|