Commit graph

494 commits

Author SHA1 Message Date
Concedo
349c461453 add stop reason for error 2026-02-04 20:23:18 +08:00
Concedo
226c79338f handle glm4.7 flash template 2026-01-28 23:29:08 +08:00
Concedo
5c6cc02985 remove clblast, part 2 2026-01-23 14:09:46 +08:00
Concedo
3816391a74 increase logprobs returned to 10 2026-01-18 11:13:42 +08:00
Concedo
62bea5ef4f allow overriding the devices directly 2026-01-17 19:08:06 +08:00
Concedo
983baac46b Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/vulkan.Dockerfile
#	.github/workflows/build.yml
#	ci/run.sh
#	examples/model-conversion/Makefile
#	examples/model-conversion/README.md
#	examples/model-conversion/scripts/causal/compare-logits.py
#	examples/model-conversion/scripts/embedding/run-converted-model.sh
#	examples/model-conversion/scripts/utils/common.py
#	examples/model-conversion/scripts/utils/semantic_check.py
#	ggml/src/ggml-cann/ggml-cann.cpp
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-opencl/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp
#	scripts/pr2wt.sh
#	scripts/sync_vendor.py
#	tests/test-arg-parser.cpp
2026-01-09 01:23:10 +08:00
Concedo
d8942cde14 smartcache allow custom number of slots 2026-01-02 17:19:40 +08:00
Concedo
bfa2ae7744 fixed smartcache bug when used with images 2026-01-02 00:35:05 +08:00
Concedo
51edb6ae61 allow clip fa for anything besides cuda on gpu 2026-01-01 21:09:51 +08:00
Concedo
54e419f587 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/docker.yml
#	docs/ops.md
#	docs/ops/Metal.csv
#	ggml/CMakeLists.txt
#	ggml/src/ggml-sycl/CMakeLists.txt
#	grammars/README.md
#	models/templates/llama-cpp-deepseek-r1.jinja
#	scripts/sync-ggml.last
#	tests/test-chat.cpp
2026-01-01 15:34:10 +08:00
Concedo
76ef726ec8 adaptive p sharpness to 10.0f 2025-12-31 17:28:30 +08:00
Concedo
0e26e4d354 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/ISSUE_TEMPLATE/010-bug-compilation.yml
#	.github/ISSUE_TEMPLATE/011-bug-results.yml
#	.github/ISSUE_TEMPLATE/019-bug-misc.yml
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cuda/CMakeLists.txt
#	ggml/src/ggml-opencl/ggml-opencl.cpp
#	ggml/src/ggml-rpc/ggml-rpc.cpp
2025-12-28 23:47:55 +08:00
Concedo
21d801f6d5 init total weight for adaptive p 2025-12-28 15:33:06 +08:00
Concedo
27261bfc26 adaptive decay as an overridable param (+1 squashed commits)
Squashed commits:

[d94df7843] adaptive decay as an overridable param
2025-12-28 13:34:20 +08:00
Concedo
6548645aaa rename power law sampler to adaptive p 2025-12-27 17:50:58 +08:00
Concedo
9bb362cce9 revised power law sampling 2025-12-27 10:59:46 +08:00
Concedo
91d8863f18 power law sampler added 2025-12-27 09:46:06 +08:00
Concedo
cf4201e213 wip power law sampling 2025-12-25 22:01:16 +08:00
Concedo
4b899b19dc fixed save state a bit better 2025-12-21 22:24:13 +08:00
Concedo
80a0269dbe improve snapshotting for rnn 2025-12-21 21:07:31 +08:00
Concedo
fedd529fdc autofit counts overheads 2025-12-21 14:31:08 +08:00
Concedo
2e57e5ead4 rename eval function 2025-12-19 17:54:23 +08:00
Concedo
e9ae0cb2dd added support for RNN models in smartcache 2025-12-19 16:36:25 +08:00
Concedo
cde4791e36 fix tools building 2025-12-19 12:08:29 +08:00
Concedo
fb31059f9c fixed a bug in vision with mrope, mrope is refactored to match upstream, should be more accurate now 2025-12-19 01:23:52 +08:00
Concedo
cefb32df19 track clip img patch nx and ny 2025-12-18 22:58:10 +08:00
Concedo
fae2ff6d2d fix override tensors string matching issue (+2 squashed commit)
Squashed commit:

[850340501] will be deleted later, quick test

[eb4f569a7] debug buffer types
2025-12-18 21:22:49 +08:00
Concedo
04122d0a24 projector tags 2025-12-17 23:12:25 +08:00
Concedo
1daeed5d4d Merge commit '9963b81f63' into concedo_experimental
# Conflicts:
#	.github/workflows/server.yml
#	SECURITY.md
#	docs/backend/SYCL.md
#	examples/model-conversion/README.md
#	examples/model-conversion/scripts/embedding/compare-embeddings-logits.sh
#	ggml/src/ggml-hexagon/ggml-hexagon.cpp
#	ggml/src/ggml-hexagon/htp/matmul-ops.c
#	tests/CMakeLists.txt
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-12-17 20:30:34 +08:00
Concedo
1e083d9c8b integrate autofit for upstream, removed forceversion 2025-12-17 18:42:47 +08:00
Concedo
cacfa37611 wip 2025-12-17 16:04:45 +08:00
Concedo
fd0d0cab03 move pipeline parallelism to a --pipelineparallel launch flag 2025-12-11 21:03:41 +08:00
Concedo
34634aef1b tweak to smartcache for contextshifting 2025-12-10 20:08:11 +08:00
Concedo
8a18e094f5 added smartcaching implementation inspired from Pento95 (+2 squashed commit)
Squashed commit:

[fcc498688] wip basic smart caching test

[b6e8b2577] wip basic smart caching test
2025-12-10 18:00:03 +08:00
Concedo
b867b67e7e added mechanics for a full clear if fast forward is not used, this should help recover from bad states 2025-12-05 16:43:37 +08:00
Concedo
65a3b75dac rnn warning fix 2025-11-30 12:55:43 +08:00
Ruben Garcia
06d39dff73 Fix warnings (#1864) 2025-11-29 20:18:38 +08:00
Concedo
9a46faa1c3 fix for override tensors not passing correctly 2025-11-28 13:03:40 +08:00
Concedo
782ec5bffe bad identifier name 2025-11-27 11:07:13 +08:00
Concedo
d68f4a5ae5 disable clip fa for now 2025-11-27 10:20:38 +08:00
Concedo
6770767d8a allow FA for clip but with wmma disabled for turing on bad sizes 2025-11-27 01:03:29 +08:00
Concedo
e6ad29341b disable FA for clip test 2025-11-27 01:02:19 +08:00
CasualAutopsy
7703bed260 Temp: Fix Needlessly Iterating on Candidates During Greedy Sampling (#1854) 2025-11-22 16:06:50 +08:00
Concedo
8631bbcee3 linting 2025-11-18 18:56:31 +08:00
LostRuins Concedo
7aea1d7c02 clean up unused llava functions, fix qwen3vl loading 2025-11-18 10:34:55 +08:00
LostRuins Concedo
281542aa0d add smoothing curve, not tested 2025-11-17 23:07:35 +08:00
LostRuins Concedo
3fe0e39b62 Merge commit '4dca015b7e' into concedo_experimental
# Conflicts:
#	.github/copilot-instructions.md
#	README.md
#	docs/ops.md
#	docs/ops/CPU.csv
#	docs/ops/CUDA.csv
#	docs/ops/Vulkan.csv
#	ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
2025-11-16 18:33:58 +08:00
LostRuins Concedo
86f907272a relocated shader compile warning 2025-11-15 23:17:47 +08:00
LostRuins Concedo
d6a2ad8455 still not really working right 2025-11-09 01:57:48 +08:00
LostRuins Concedo
cfb22b5c9d rename a missed BLAS -> batch 2025-11-06 16:11:26 +08:00