Commit graph

22 commits

Concedo
27b9358baf Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	examples/run/run.cpp
#	scripts/sync-ggml.last
2025-02-08 01:31:49 +08:00
Christian Fillion
2d219b389e
vocab : ignore invalid UTF-8 input in the BPE tokenizer (#11729)
Silently insert U+FFFD (the Unicode replacement character) instead, until the
next valid codepoint can be found.

This fixes `llama_tokenize` throwing an exception across the C API boundary
or libllama's module boundary (the caller's runtime might be incompatible!).

Returning a proper error code might be desirable; however, the signature
of `llama_tokenize` doesn't allow it, as all return values already have
an existing meaning.
2025-02-07 15:55:47 +02:00
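
For illustration, a minimal sketch of the substitution strategy this commit describes — not the actual llama.cpp implementation, and it skips the overlong/surrogate rejection a production decoder also needs:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decode UTF-8, emitting U+FFFD for each invalid byte instead of throwing,
// and resuming at the next byte so valid codepoints after the error survive.
static std::vector<uint32_t> decode_utf8_lossy(const std::string & s) {
    const uint32_t REPLACEMENT = 0xFFFD;
    std::vector<uint32_t> out;
    size_t i = 0;
    while (i < s.size()) {
        const uint8_t b = (uint8_t) s[i];
        const int len = b < 0x80            ? 1
                      : (b & 0xE0) == 0xC0 ? 2
                      : (b & 0xF0) == 0xE0 ? 3
                      : (b & 0xF8) == 0xF0 ? 4 : 0;
        if (len == 0 || i + len > s.size()) { out.push_back(REPLACEMENT); ++i; continue; }
        uint32_t cp = len == 1 ? b : (uint32_t) (b & (0x7F >> len));
        bool ok = true;
        for (int k = 1; k < len; ++k) {
            const uint8_t c = (uint8_t) s[i + k];
            if ((c & 0xC0) != 0x80) { ok = false; break; } // bad continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(REPLACEMENT); ++i; continue; } // resync one byte later
        out.push_back(cp);
        i += len;
    }
    return out;
}
```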
Concedo
5329df2bdf Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	CMakeLists.txt
#	cmake/build-info.cmake
#	examples/run/CMakeLists.txt
#	examples/run/run.cpp
#	examples/simple-chat/simple-chat.cpp
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-sampling.cpp
2025-01-21 00:25:07 +08:00
Georgi Gerganov
4dd34ff831
cmake : add sanitizer flags for llama.cpp (#11279)
* cmake : add sanitizer flags for llama.cpp

ggml-ci

* tests : fix compile warnings

ggml-ci

* cmake : move sanitizer flags to llama_add_compile_flags

ggml-ci

* cmake : move llama.cpp compile flags to top level lists

ggml-ci

* cmake : apply only sanitizer flags at top level

ggml-ci

* tests : fix gguf context use in same_tensor_data

* gguf-test: tensor data comparison

* dummy : trigger ggml-ci

* unicode : silence gcc warnings

ggml-ci

* ci : use sanitizer builds only in Debug mode

ggml-ci

* cmake : add status messages [no ci]

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-01-18 16:18:15 +02:00
Concedo
dcfa1eca4e Merge commit '017cc5f446' into concedo_experimental
# Conflicts:
#	.github/ISSUE_TEMPLATE/010-bug-compilation.yml
#	.github/ISSUE_TEMPLATE/019-bug-misc.yml
#	CODEOWNERS
#	examples/batched-bench/batched-bench.cpp
#	examples/batched/batched.cpp
#	examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp
#	examples/gritlm/gritlm.cpp
#	examples/llama-bench/llama-bench.cpp
#	examples/passkey/passkey.cpp
#	examples/quantize-stats/quantize-stats.cpp
#	examples/run/run.cpp
#	examples/simple-chat/simple-chat.cpp
#	examples/simple/simple.cpp
#	examples/tokenize/tokenize.cpp
#	ggml/CMakeLists.txt
#	ggml/src/ggml-metal/CMakeLists.txt
#	ggml/src/ggml-vulkan/CMakeLists.txt
#	scripts/sync-ggml.last
#	src/llama.cpp
#	tests/test-autorelease.cpp
#	tests/test-model-load-cancel.cpp
#	tests/test-tokenizer-0.cpp
#	tests/test-tokenizer-1-bpe.cpp
#	tests/test-tokenizer-1-spm.cpp
2025-01-08 23:15:21 +08:00
fairydreaming
9394bbd484
llama : Add support for DeepSeek V3 (#11049)
* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type

* vocab : add DeepSeek V3 pre-tokenizer regexes

* unicode : handle ACCENT_MARK and SYMBOL categories in regex

* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2025-01-04 21:06:11 +01:00
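
A rough sketch of the routing behavior the new parameters describe, based on the DeepSeek V3 paper rather than the llama.cpp code (group-limited routing omitted; all names here are illustrative):

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

// DeepSeek V3 scores experts with a sigmoid instead of a softmax
// (EXPERT_GATING_FUNC), adds a per-expert bias (the FFN_EXP_PROBS_B tensor)
// that affects top-k selection but not the final weights, and renormalizes
// the selected weights (EXPERT_WEIGHTS_NORM).
static std::vector<std::pair<int, float>> route_top_k(
        const std::vector<float> & logits,  // router output, one per expert
        const std::vector<float> & bias,    // per-expert selection bias
        int k) {                            // number of experts to activate
    const int n = (int) logits.size();
    std::vector<float> score(n), sel(n);
    for (int e = 0; e < n; ++e) {
        score[e] = 1.0f / (1.0f + std::exp(-logits[e])); // sigmoid gating
        sel[e]   = score[e] + bias[e];                   // bias: selection only
    }
    std::vector<int> idx(n);
    std::iota(idx.begin(), idx.end(), 0);
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&](int a, int b) { return sel[a] > sel[b]; });
    float sum = 0.0f;
    for (int j = 0; j < k; ++j) sum += score[idx[j]];
    std::vector<std::pair<int, float>> out;
    for (int j = 0; j < k; ++j) {
        out.emplace_back(idx[j], score[idx[j]] / sum); // normalized weight
    }
    return out;
}
```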
Concedo
ee486bad3e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	examples/CMakeLists.txt
#	examples/batched/batched.cpp
#	examples/gritlm/gritlm.cpp
#	examples/llama.android/llama/build.gradle.kts
#	examples/main/README.md
#	examples/retrieval/retrieval.cpp
#	examples/server/CMakeLists.txt
#	examples/server/README.md
#	ggml/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml.c
#	scripts/compare-commits.sh
#	scripts/sync-ggml.last
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-sampling.cpp
2024-12-19 11:57:43 +08:00
Georgi Gerganov
08ea539df2
unicode : improve naming style (#10838)
* unicode : improve naming style

ggml-ci

* cont [no ci]
2024-12-16 12:31:45 +02:00
Concedo
557bcaf86e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.clang-tidy
#	.github/workflows/build.yml
#	Makefile
#	Package.swift
#	common/CMakeLists.txt
#	examples/batched-bench/CMakeLists.txt
#	examples/batched/CMakeLists.txt
#	examples/convert-llama2c-to-ggml/CMakeLists.txt
#	examples/cvector-generator/CMakeLists.txt
#	examples/embedding/CMakeLists.txt
#	examples/eval-callback/CMakeLists.txt
#	examples/export-lora/CMakeLists.txt
#	examples/gbnf-validator/CMakeLists.txt
#	examples/gguf-split/CMakeLists.txt
#	examples/gguf/CMakeLists.txt
#	examples/gritlm/CMakeLists.txt
#	examples/imatrix/CMakeLists.txt
#	examples/infill/CMakeLists.txt
#	examples/llama-bench/CMakeLists.txt
#	examples/llava/CMakeLists.txt
#	examples/lookahead/CMakeLists.txt
#	examples/lookup/CMakeLists.txt
#	examples/main-cmake-pkg/CMakeLists.txt
#	examples/main/CMakeLists.txt
#	examples/parallel/CMakeLists.txt
#	examples/passkey/CMakeLists.txt
#	examples/perplexity/CMakeLists.txt
#	examples/quantize-stats/CMakeLists.txt
#	examples/quantize/CMakeLists.txt
#	examples/retrieval/CMakeLists.txt
#	examples/run/CMakeLists.txt
#	examples/save-load-state/CMakeLists.txt
#	examples/server/CMakeLists.txt
#	examples/simple-chat/CMakeLists.txt
#	examples/simple/CMakeLists.txt
#	examples/speculative-simple/CMakeLists.txt
#	examples/speculative/CMakeLists.txt
#	examples/tokenize/CMakeLists.txt
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-backend.cpp
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
#	pocs/vdot/CMakeLists.txt
#	src/CMakeLists.txt
#	src/unicode.cpp
#	tests/test-sampling.cpp
2024-11-30 12:24:51 +08:00
Diego Devesa
7cc2d2c889
ggml : move AMX to the CPU backend (#10570)
* ggml : move AMX to the CPU backend

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-29 21:54:58 +01:00
Concedo
3e1cbedbae Merge commit 'c83ad6d01e' into concedo_experimental
# Conflicts:
#	.github/workflows/bench.yml.disabled
#	Makefile
#	Package.swift
#	README.md
#	docs/backend/SYCL.md
#	examples/CMakeLists.txt
#	examples/benchmark/benchmark-matmult.cpp
#	ggml/src/CMakeLists.txt
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.sh
#	src/llama.cpp
#	tests/test-backend-ops.cpp
2024-10-05 22:17:33 +08:00
Xuan Son Nguyen
a39ab216aa
llama : reduce compile time and binary size (#9712)
* llama : speed up compile time

* fix build

* fix build (2)
2024-10-02 15:49:55 +02:00
Concedo
29625c3d2e Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	ci/run.sh
#	common/CMakeLists.txt
#	common/common.cpp
#	docs/backend/SYCL.md
#	examples/embedding/embedding.cpp
#	examples/imatrix/imatrix.cpp
#	examples/infill/infill.cpp
#	examples/llama-bench/llama-bench.cpp
#	examples/main/README.md
#	examples/parallel/parallel.cpp
#	examples/perplexity/perplexity.cpp
#	examples/server/CMakeLists.txt
#	examples/server/README.md
#	examples/server/bench/README.md
#	examples/server/tests/README.md
#	examples/speculative/speculative.cpp
#	flake.lock
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	grammars/README.md
#	scripts/compare-commits.sh
#	scripts/compare-llama-bench.py
#	tests/CMakeLists.txt
2024-09-19 14:53:57 +08:00
Yuri Khrustalev
503147a9f9
unicode : add <algorithm> (#9508) 2024-09-17 09:51:15 +03:00
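
A one-line include fix like this usually repairs code that only compiled through a transitive include; a hypothetical reproduction of the failure mode:

```cpp
#include <vector>
#include <algorithm> // without this, std::find may compile on one standard
                     // library (pulled in transitively) and fail on another

static bool contains(const std::vector<int> & v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}
```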
Concedo
eb5b4d0186 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	Makefile
#	Package.swift
#	src/CMakeLists.txt
#	src/llama.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-llama-grammar.cpp
2024-07-23 23:20:32 +08:00
Georgi Gerganov
938943cdbf
llama : move vocab, grammar and sampling into separate files (#8508)
* llama : move sampling code into llama-sampling

ggml-ci

* llama : move grammar code into llama-grammar

ggml-ci

* cont

ggml-ci

* cont : pre-fetch rules

* cont

ggml-ci

* llama : deprecate llama_sample_grammar

* llama : move tokenizers into llama-vocab

ggml-ci

* make : update llama.cpp deps [no ci]

* llama : redirect external API to internal APIs

ggml-ci

* llama : suffix the internal APIs with "_impl"

ggml-ci

* llama : clean-up
2024-07-23 13:10:17 +03:00
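
The "redirect external API to internal APIs" step follows a common pattern; a hypothetical sketch with invented names (the real functions and types differ):

```cpp
#include <cstdint>
#include <cstring>

// Internal function, suffixed "_impl": owns the logic and can be moved
// between translation units (e.g. into llama-vocab) freely.
static int32_t demo_strlen_impl(const char * s) {
    return s ? (int32_t) std::strlen(s) : 0;
}

// Stable public C entry point: a thin forwarder, so reorganizing the
// internals never changes the exported ABI.
extern "C" int32_t demo_strlen(const char * s) {
    return demo_strlen_impl(s);
}
```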
Concedo
2cad736260 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/nix/package.nix
#	.github/labeler.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	Package.swift
#	README.md
#	ci/run.sh
#	docs/build.md
#	examples/CMakeLists.txt
#	flake.lock
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	grammars/README.md
#	requirements/requirements-convert_hf_to_gguf.txt
#	requirements/requirements-convert_hf_to_gguf_update.txt
#	scripts/check-requirements.sh
#	scripts/compare-llama-bench.py
#	scripts/gen-unicode-data.py
#	scripts/sync-ggml-am.sh
#	scripts/sync-ggml.last
#	scripts/sync-ggml.sh
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
#	tests/test-tokenizer-random.py
2024-07-11 16:36:16 +08:00
Borislav Stanimirov
7a80710d93
msvc : silence codecvt c++17 deprecation warnings (#8395) 2024-07-10 14:40:53 +03:00
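
For context, `<codecvt>` is deprecated since C++17 and MSVC's STL warns on inclusion; a common way to silence it looks like the following (whether #8395 used exactly this macro is an assumption):

```cpp
// MSVC's standard library gates the <codecvt> deprecation warning behind
// this macro; it must be defined before the header is included.
#if defined(_MSC_VER)
#define _SILENCE_CXX17_CODECVT_HEADER_DEPRECATION_WARNING
#endif
#include <codecvt>
#include <locale>
```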
Concedo
8e5fd6f509 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.gitignore
#	README.md
#	docs/backend/BLIS.md
#	docs/backend/SYCL.md
#	docs/development/llama-star/idea-arch.key
#	docs/development/llama-star/idea-arch.pdf
#	docs/development/token_generation_performance_tips.md
#	src/llama.cpp
#	tests/test-tokenizer-0.cpp
#	tests/test-tokenizer-1-bpe.cpp
#	tests/test-tokenizer-1-spm.cpp
#	tests/test-tokenizer-random.py
2024-07-06 19:39:24 +08:00
jaime-m-p
213701b51a
Detokenizer fixes (#8039)
* Add llama_detokenize():
  - Update header files location
  - UNKNOWN and CONTROL are 'special pieces'
  - Remove space after UNKNOWN and CONTROL
  - Refactor llama_token_to_piece()
  - Add flag: clean_up_tokenization_spaces
  - Symmetric params for llama_tokenize() and llama_detokenize()

* Update and fix tokenizer tests:
  - Using llama_detokenize()
  - Treat an unexpected vocab type as a test failure instead of an error
    - Useful when automating tests:
    - If you don't know the vocab type in advance
    - Differentiate it from other loading errors
  - Skip Unicode surrogates and undefined codepoints
  - Gracefully exit threads
    - Using exit() was throwing random exceptions
  - Clean old known problematic codepoints
  - Minor: confusing hexadecimal codepoint

* Update bruteforce random tests
  - Add detokenizer checks
  - New generator: ascii_lr_strip
  - New generator: apostrophe
  - Add more vocab files
  - Detokenize special tokens.
  - Replace errors with '\uFFFD' when detokenizing to 'utf-8'
  - More edge cases
  - Better detokenization results check

* Fix add_space_prefix, set false by default
* Better leading space removal
* Do not remove space when decoding special tokens
* Bugfix: custom regexes split undefined Unicode codepoints
* 'viking' detokenizer clean spaces
2024-07-05 19:01:35 +02:00
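
A hypothetical simplification of what a clean_up_tokenization_spaces flag typically does (modeled on the Hugging Face behavior; the actual llama.cpp logic is more involved):

```cpp
#include <string>

// Drop the space that tokenizers emit before punctuation and contractions,
// e.g. "Hello , world !" -> "Hello, world!".
static std::string clean_up_spaces(const std::string & text) {
    std::string out;
    out.reserve(text.size());
    for (size_t i = 0; i < text.size(); ++i) {
        if (text[i] == ' ' && i + 1 < text.size()) {
            const char n = text[i + 1];
            if (n == '.' || n == ',' || n == '!' || n == '?' || n == '\'') {
                continue; // skip the space, keep the punctuation
            }
        }
        out += text[i];
    }
    return out;
}
```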
Concedo
388a2aff00 workaround for deepseek not working 2024-07-05 22:39:53 +08:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
* scripts : update sync [no ci]

* files : relocate [no ci]

* ci : disable kompute build [no ci]

* cmake : fixes [no ci]

* server : fix mingw build

ggml-ci

* cmake : minor [no ci]

* cmake : link math library [no ci]

* cmake : build normal ggml library (not object library) [no ci]

* cmake : fix kompute build

ggml-ci

* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE

ggml-ci

* move public backend headers to the public include directory (#8122)

* move public backend headers to the public include directory

* nix test

* spm : fix metal header

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* scripts : fix sync paths [no ci]

* scripts : sync ggml-blas.h [no ci]

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00
Renamed from unicode.cpp