prima.cpp/models
jaime-m-p 3b38d48609
Per token attributes (#7685)
* Add per token attributes enum
* Using phi-3 for testing 'rstrip'
* Using jina-v2 for testing 'lstrip'
* Brute force test for 'lstrip' and 'rstrip'
* Implement 'rstrip' and 'lstrip'
* Update phi-3 GGUF file (obsolete since 917dc8c)
* Replace llama_token_type with llama_token_attribs
2024-06-04 09:17:17 +02:00
..
.editorconfig
ggml-vocab-aquila.gguf
ggml-vocab-baichuan.gguf
ggml-vocab-bert-bge.gguf
ggml-vocab-bert-bge.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-bert-bge.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-command-r.gguf command-r : add BPE pre-tokenization (#7063) 2024-05-05 08:19:30 +03:00
ggml-vocab-command-r.gguf.inp command-r : add BPE pre-tokenization (#7063) 2024-05-05 08:19:30 +03:00
ggml-vocab-command-r.gguf.out command-r : add BPE pre-tokenization (#7063) 2024-05-05 08:19:30 +03:00
ggml-vocab-deepseek-coder.gguf
ggml-vocab-deepseek-coder.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-deepseek-coder.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-deepseek-llm.gguf
ggml-vocab-deepseek-llm.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-deepseek-llm.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-falcon.gguf
ggml-vocab-falcon.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-falcon.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-gpt-2.gguf
ggml-vocab-gpt-2.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-gpt-2.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-gpt-neox.gguf
ggml-vocab-gpt2.gguf
ggml-vocab-llama-bpe.gguf
ggml-vocab-llama-bpe.gguf.inp llama : lookup word in vocab before doing BPE merges (#7193) 2024-05-11 11:12:06 +03:00
ggml-vocab-llama-bpe.gguf.out llama : lookup word in vocab before doing BPE merges (#7193) 2024-05-11 11:12:06 +03:00
ggml-vocab-llama-spm.gguf
ggml-vocab-llama-spm.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-llama-spm.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-mpt.gguf
ggml-vocab-mpt.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-mpt.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-phi-3.gguf Per token attributes (#7685) 2024-06-04 09:17:17 +02:00
ggml-vocab-phi-3.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-phi-3.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-qwen2.gguf llama : add BPE pre-tokenization for Qwen2 (#7114) 2024-05-08 15:06:43 +03:00
ggml-vocab-qwen2.gguf.inp llama : add BPE pre-tokenization for Qwen2 (#7114) 2024-05-08 15:06:43 +03:00
ggml-vocab-qwen2.gguf.out llama : add BPE pre-tokenization for Qwen2 (#7114) 2024-05-08 15:06:43 +03:00
ggml-vocab-refact.gguf tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-refact.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-refact.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-stablelm.gguf
ggml-vocab-starcoder.gguf
ggml-vocab-starcoder.gguf.inp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
ggml-vocab-starcoder.gguf.out tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00