vocab : refactor tokenizer to reduce init overhead (#9449)

* refactor tokenizer

* llama : make llm_tokenizer more private

ggml-ci


* remove unused files

* remove unused fields to avoid unused-field build error

* avoid linker symbol error

* Update src/llama.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Zhenwei Jin authored on 2024-09-28 20:10:58 +08:00, committed by GitHub
parent 9a913110cf
commit 6102037bbb
5 changed files with 238 additions and 141 deletions

src/llama.cpp

@@ -6464,6 +6464,8 @@ static void llm_load_vocab(
     }
     GGML_ASSERT(vocab.id_to_token.size() == vocab.token_to_id.size());
 
+    vocab.init_tokenizer();
+
     // determine the newline token: LLaMA "<0x0A>" == 10 == '\n', Falcon 193 == '\n'
     if (vocab.type == LLAMA_VOCAB_TYPE_SPM) {
         // For Fill-In-the-Middle (FIM)/infill models which were converted
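The hunk above shows the core of the change: the tokenizer is now constructed once, when the vocabulary is loaded, rather than on every tokenize call. Below is a minimal C++ sketch of that init-once pattern; only `vocab.init_tokenizer()` comes from the diff, while `llm_tokenizer`, `llm_tokenizer_spm`, the vocab fields, and the `tokenize()` helper are illustrative assumptions, not the actual llama.cpp definitions.

```cpp
// Sketch of the init-once tokenizer pattern (assumed names, not llama.cpp's).
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct llm_tokenizer {                  // state that depends only on the vocab
    virtual ~llm_tokenizer() = default;
};

struct llm_tokenizer_spm : llm_tokenizer {
    explicit llm_tokenizer_spm(const std::unordered_map<std::string, int> & token_to_id) {
        // expensive, vocab-dependent setup (merge tables, tries, ...) done once
        (void) token_to_id;
    }
};

struct llama_vocab_sketch {
    std::unordered_map<std::string, int> token_to_id;
    std::vector<std::string>             id_to_token;
    std::unique_ptr<llm_tokenizer>       tokenizer;   // built once, reused per call

    void init_tokenizer() {
        // called once after the token tables are filled (cf. llm_load_vocab)
        tokenizer = std::make_unique<llm_tokenizer_spm>(token_to_id);
    }
};

// Per-call tokenization borrows the prebuilt tokenizer instead of constructing
// a fresh one on every invocation, which is where the init overhead used to go.
static std::vector<int> tokenize(const llama_vocab_sketch & vocab, const std::string & text) {
    assert(vocab.tokenizer && "init_tokenizer() must run during model load");
    (void) text;
    return {};                           // real tokenization omitted in this sketch
}

int main() {
    llama_vocab_sketch vocab;
    vocab.token_to_id = {{"<s>", 1}};
    vocab.id_to_token = {"<unk>", "<s>"};
    vocab.init_tokenizer();              // pay the setup cost once, at load time
    tokenize(vocab, "hello world");
}
```

This only illustrates the pattern; the real per-tokenizer types and the member that `init_tokenizer()` fills live in the changed files of this commit.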