Commit graph

10 commits

Author SHA1 Message Date
Xuan-Son Nguyen
994118a183
model: move load_hparams and load_tensors to per-model definition (#22004)
* git-friendly migration

* add build_graph

* nits

* exclude old code from build

* wip

* add llm_arch_model_i

* prepare downstream functions

* nits

* nits

* wip

* wip

* add back create_tensor_qkv

* fix files missing include

* enforce one llm_build per arch

* cmake: use glob

* missing model params

* nits

* wip

* wip (2)

* wip (3)

* test-llama-archs is happy

* improve switch case

* move more stuff into llm_arch_model_i

* fix downstream code

* nits

* nits (2)

* fix order

* llama_model_base

* LLAMA_LOAD_LOCALS

* small fix

* fix build errors

* auto

* rm migration script and ifdef
2026-05-04 12:36:59 +02:00
Sigbjørn Skjæret
4f02d47339
model : refactor bias tensor variable names (#22079)
* refactor bias tensor variable names

* use create_tensor_qkv for jina-bert-v2
2026-04-18 20:12:00 +02:00
PikaPikachu
9db77a020c
model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)
* model : refactor QKV into common build_qkv and create_tensor_qkv helpers

* model : extend build_qkv to bert/mpt/dbrx/olmo/lfm2/nemotron-h/granite-hybrid/gemma3n-iswa/t5-dec and fix wqkv_s
2026-04-16 17:41:34 +02:00
Sigbjørn Skjæret
f772f6e434
model : support NVFP4 tensors for Gemma4 (#21971)
* support nvfp4 tensors for Gemma4

* add wo_s to build_attn

* add wo_s to build_attn

* fix glm4
2026-04-16 16:51:47 +02:00
Michael Grau
6729d4920c
model : add control vector support where missing (#20653)
* Add control vector functions to qwen3.5 and qwen-next models

* Add missing cvec compatibility to the rest of the models

* Adjust comments and formatting

* cleanup

* whitespace

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-18 23:25:12 +01:00
Xuan-Son Nguyen
59db9a357d
llama: dynamic head_dim and n_rot for SWA (#20301)
* llama: dynamic head_dim and n_rot for SWA

* also add gguf_writer wrappers

* fix build

* build_rope_shift arg reorder
2026-03-09 22:22:39 +01:00
Sigbjørn Skjæret
35bee031e1
graph : remove redundant scale_w parameter (#20235)
2026-03-08 18:58:28 +01:00
Sigbjørn Skjæret
eadc4184ca
llama : refactor rope_freq_base/scale_swa conversion and init (#18553)
* refactor rope_freq_base/scale_swa conversion and init

* safe defaults for unknowns

* update relevant models

* grammar

* add get_rope_freq_scale to modern-bert

* const

* const

* log swa info
2026-01-05 09:14:04 +01:00
Sigbjørn Skjæret
b164259bba
chore : fix models indent after refactor (#16992)
2025-11-04 12:29:15 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp (#16252)
* Squashed: llama-model.cpp refactoring

* Fix formatting of attn / ffn / ffn_moe calls

* Fix import regression / unify spacing in models.h

* totally DID NOT miss those!

* Add missing qwen3vl(moe) models

* Add missing new .cpp files to build

* Remove extra semicolons

* Editor checker

* Update src/models/models.h

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00