koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2026-06-01 14:29:33 +00:00

Latest commit: e11bd856d5 by Johannes Gäßler, 2024-08-24 21:34:59 +02:00
CPU/CUDA: Gemma 2 FlashAttention support (#8542)
* CPU/CUDA: Gemma 2 FlashAttention support
* apply logit_softcap to scale in kernel
* disable logit softcapping tests on Metal
* remove metal check
File	Last commit	Date
CMakeLists.txt	llama : move vocab, grammar and sampling into separate files (#8508)	2024-07-23 13:10:17 +03:00
llama-grammar.cpp	ggml : reduce hash table reset cost (#8698)	2024-07-27 04:41:55 +02:00
llama-grammar.h	llama : fix build + fix fabs compile warnings (#8683)	2024-07-25 19:57:31 +03:00
llama-impl.h	llama : better replace_all (cont) (#8926)	2024-08-09 18:23:52 +03:00
llama-sampling.cpp	Fix a spelling mistake (#9001)	2024-08-12 11:46:03 +02:00
llama-sampling.h	llama : move vocab, grammar and sampling into separate files (#8508)	2024-07-23 13:10:17 +03:00
llama-vocab.cpp	llama : std::move llm_bigram_bpe from work_queue (#9062)	2024-08-21 10:32:58 +03:00
llama-vocab.h	common : remove duplicate function llama_should_add_bos_token (#8778)	2024-08-15 10:23:23 +03:00
llama.cpp	CPU/CUDA: Gemma 2 FlashAttention support (#8542)	2024-08-24 21:34:59 +02:00
unicode-data.cpp	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)	2024-07-02 12:18:10 -04:00
unicode-data.h	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
unicode.cpp	llama : move vocab, grammar and sampling into separate files (#8508)	2024-07-23 13:10:17 +03:00
unicode.h	llama : move vocab, grammar and sampling into separate files (#8508)	2024-07-23 13:10:17 +03:00