Commit graph

63 commits

Author SHA1 Message Date
Concedo
52606e9b1d tts cpp model is now loadable in kcpp 2025-08-17 15:47:22 +08:00
Concedo
7b5cf7143f handle gguf already containing renamed diffusion tensors prefix 2025-08-12 22:42:29 +08:00
Concedo
3468c2834d fixed adv mode 2025-08-08 22:26:36 +08:00
Concedo
61c19fea56 fixed glm4 sop, lower regex max stacks (+2 squashed commit)
Squashed commit:

[47e39ae5d] lower regex max stack again

[0a32ca232] lower regex max stack again
2025-08-06 17:10:57 +08:00
Concedo
5a3b2e3921 fix for jamba models - they have recurrent layers like rwkv, so context shifting and forwarding won't work on them. 2025-07-12 18:54:40 +08:00
Concedo
c45b8dc56f fix for gemma3n 2025-07-10 17:39:08 +08:00
Concedo
f125e724eb fix off-by-one npast during some instances of fast forwarding 2025-05-22 19:51:21 +08:00
Concedo
f841b29c41 fixed unicode paths 2025-05-11 14:05:54 +08:00
Concedo
c2802af9e8 fix qwen3, fixed sd, fixed glm4 2025-04-29 20:50:46 +08:00
Concedo
4decd6bea1 GLM4 batch clamp 2025-04-26 09:42:17 +08:00
Concedo
35dc8387e9 fixed rwkv7 handling 2025-04-26 02:13:06 +08:00
Concedo
0460d92cc3 disable context shifting for gemma3 2025-03-13 20:28:26 +08:00
Concedo
b162c25a5e fixed moe experts to use detected arch for key 2025-02-10 17:46:08 +08:00
Concedo
e788b8289a You'll never take us alive
We swore that death will do us part
They'll call our crimes a work of art
2025-01-09 11:27:06 +08:00
Concedo
00d154b32b wip on qwen2vl integration, updated msvc runtimes 2024-12-15 23:58:02 +08:00
Concedo
bb13925f39 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	CMakePresets.json
#	Makefile
#	Package.swift
#	ci/run.sh
#	common/CMakeLists.txt
#	examples/CMakeLists.txt
#	flake.lock
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-backend.cpp
#	ggml/src/ggml.c
#	pocs/vdot/q8dot.cpp
#	pocs/vdot/vdot.cpp
#	tests/test-backend-ops.cpp
#	tests/test-grad0.cpp
#	tests/test-quantize-fns.cpp
#	tests/test-quantize-perf.cpp
#	tests/test-rope.cpp
2024-11-04 16:54:53 +08:00
Concedo
fc7fe2e7a0 allow rwkv6 to run although its broken 2024-09-09 20:50:58 +08:00
Concedo
0dd3907940 qwen2 warning FA 2024-07-09 20:53:25 +08:00
Nexesenex
cb2336f5d9
Gradient rope formula with offsets (#938)
* Gradient rope formula with offsets

Positive for Solar models
Negative for Llama 1 and 2 models

* Update gpttype_adapter.cpp

Remove L1/L2

* cleanup PR, skip llama models, keep prints behind debug mode

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-06-25 20:46:34 +08:00
askmyteapot
1e72b65c38
GradientAI Auto ROPE Base calculation (#910)
* GradientAI Auto ROPE Base calculation

https://gradient.ai/blog/scaling-rotational-embeddings-for-long-context-language-models
has a formula that better fits the ideal rope scaling.

Tested with Llama3; checked the calculation is correct for Llama2. Retains logic for not scaling rope if under trained CTX.

* add in solar scaling logic

Solar-based models require the context values to be multiplied by 8. This is (I'm guessing) because the positions are based on a 32k context but with a sliding window of 4k.

* Update model_adapter.h

Adding in tensor count to identify Solar models based on a tensor count of 435.

* Update model_adapter.cpp

add in n_tensor count for solar identification

* refactor and cleanup GradientAI rope scaling

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-06-13 18:12:00 +08:00
Concedo
47c42fd45c fix for mamba processing 2024-03-13 13:27:46 +08:00
Concedo
f75e479db0 WIP on sdcpp integration 2024-02-29 00:40:07 +08:00
Concedo
762eeb6204 triage for opencl 2024-01-27 11:09:43 +08:00
Concedo
d9a7bd577a gpu layer offloading disabled for phi models in clblast 2024-01-25 17:40:05 +08:00
Concedo
375003b458 always show reported arch 2023-12-22 11:15:07 +08:00
Concedo
8b919b5b57 allow customized rope to use model set values 2023-11-15 16:21:52 +08:00
Concedo
5db89b90b7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README.md
#	build.zig
#	ggml-opencl.cpp
#	tests/CMakeLists.txt
#	tests/test-double-float.cpp
#	tests/test-sampling.cpp
2023-10-25 23:58:15 +08:00
Concedo
839fc6dac8 handle freq_base_train 2023-10-24 23:44:22 +08:00
Concedo
c1ca1de2ac fixed support for old falcon models 2023-10-18 17:20:44 +08:00
Concedo
7fb809b94b fixed auto rope scaling (+1 squashed commits)
Squashed commits:

[b1767874] wip
2023-09-07 14:45:08 +08:00
Concedo
d4c22a8b02 updated lite, added autorope config based on trained ctxlen, hotfix for falcon gpu broken 2023-08-30 16:50:55 +08:00
Concedo
4b00916ac7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.dockerignore
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	flake.lock
#	flake.nix
#	tests/CMakeLists.txt
2023-08-28 14:19:05 +08:00
Concedo
bfdc596d58 gguf reader in file format detection 2023-08-23 19:19:52 +08:00
Concedo
39cc83e8c9 incomplete merge, compiles but generates rubbish 2023-08-22 23:12:47 +08:00
Concedo
3a7853d259 handle stablecode-completion-alpha-3b 2023-08-09 21:07:57 +08:00
Concedo
df9135e3a9 fixing memory bugs 2023-06-23 18:41:23 +08:00
Concedo
9b6c35b651 rwkv speed enhancements (batch processing), fixed a rwkv token processing bug 2023-06-13 16:02:12 +08:00
Concedo
6f82e17b7a added MPT support 2023-06-03 16:14:08 +08:00
Concedo
5d9f5b28a6 rwkv integration completed 2023-05-28 00:48:56 +08:00
Concedo
01a0f206df added support for starcoder, which is basically gpt2 2023-05-27 13:35:40 +08:00
Concedo
c048bcfec4 remove old filever checks (+7 squashed commit)
Squashed commit:

[b72627a] new format not working

[e568870] old ver works

[7053b77] compile errors fixed, fixing linkers

[4ae8889] add new ver

[ff82dfd] file format checks

[25b8aa8] refactoring type names

[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
1225fab2ec fix f16 format detection in neox 2023-05-20 11:26:50 +08:00
Concedo
f65bae760a Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental
# Conflicts:
#	ggml-opencl.cpp
2023-05-18 15:52:35 +08:00
Concedo
00da2a5f4e neox is updated 2023-05-17 14:56:54 +08:00
Concedo
b692e4d2a4 wip 2023-05-14 17:21:07 +08:00
Concedo
8a5fe628df recognize q8_0 as an older format as the new clblast doesn't work correctly with it 2023-05-14 11:06:23 +08:00
Concedo
05cf5f7d6e partially working, but the blas matmul is broken 2023-05-13 11:35:38 +08:00
Concedo
5eec5d6ed9 Added backwards compatibility to an earlier version of NeoX. 2023-04-25 20:34:18 +08:00
Concedo
c454f8b848 Gpt NeoX / Pythia integration completed 2023-04-22 11:23:25 +08:00
Concedo
ef13443047 wip pythia integration 2023-04-22 01:08:23 +08:00