Xuan-Son Nguyen
994118a183
model: move load_hparams and load_tensors to per-model definition (#22004)
* git-friendly migration
* add build_graph
* nits
* exclude old code from build
* wip
* add llm_arch_model_i
* prepare downstream functions
* nits
* nits
* wip
* wip
* add back create_tensor_qkv
* fix files missing include
* enforce one llm_build per arch
* cmake: use glob
* missing model params
* nits
* wip
* wip (2)
* wip (3)
* test-llama-archs is happy
* improve switch case
* move more stuff into llm_arch_model_i
* fix downstream code
* nits
* nits (2)
* fix order
* llama_model_base
* LLAMA_LOAD_LOCALS
* small fix
* fix build errors
* auto
* rm migration script and ifdef
2026-05-04 12:36:59 +02:00
Sigbjørn Skjæret
4f02d47339
model : refactor bias tensor variable names (#22079)
* refactor bias tensor variable names
* use create_tensor_qkv for jina-bert-v2
2026-04-18 20:12:00 +02:00
PikaPikachu
9db77a020c
model : refactor QKV into common build_qkv and create_tensor_qkv helpers (#21245)
* model : refactor QKV into common build_qkv and create_tensor_qkv helpers
* model : extend build_qkv to bert/mpt/dbrx/olmo/lfm2/nemotron-h/granite-hybrid/gemma3n-iswa/t5-dec and fix wqkv_s
2026-04-16 17:41:34 +02:00
Sigbjørn Skjæret
f772f6e434
model : support NVFP4 tensors for Gemma4 (#21971)
* support nvfp4 tensors for Gemma4
* add wo_s to build_attn
* add wo_s to build_attn
* fix glm4
2026-04-16 16:51:47 +02:00
Xuan-Son Nguyen
59db9a357d
llama: dynamic head_dim and n_rot for SWA (#20301)
* llama: dynamic head_dim and n_rot for SWA
* also add gguf_writer wrappers
* fix build
* build_rope_shift arg reorder
2026-03-09 22:22:39 +01:00
megemini
237958db33
model: Add PaddleOCR-VL model support (#18825)
* support PaddleOCR-VL
* clip: update PaddleOCR model loader parameters to prevent OOM during warmup
* [update] add paddleocr vl text model instead of ernie4.5
* [update] restore change of minicpmv
* [update] format
* [update] format
* [update] positions and patch merge permute
* [update] mtmd_decode_use_mrope for paddleocr
* [update] image min/max pixels
* [update] remove set_limit_image_tokens
* update: preprocess without padding
* clean up
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-02-19 17:05:25 +01:00